ResearchTrend.AI
Accounting for Variance in Machine Learning Benchmarks


1 March 2021
Xavier Bouthillier
Pierre Delaunay
Mirko Bronzi
Assya Trofimov
Brennan Nichyporuk
Justin Szeto
Naz Sepah
Edward Raff
Kanika Madan
Vikram S. Voleti
Samira Ebrahimi Kahou
Vincent Michalski
Dmitriy Serdyuk
Tal Arbel
C. Pal
Gaël Varoquaux
Pascal Vincent

Papers citing "Accounting for Variance in Machine Learning Benchmarks"

39 / 89 papers shown
Mixture Manifold Networks: A Computationally Efficient Baseline for Inverse Modeling
Gregory P. Spell
Simiao Ren
L. Collins
Jordan M. Malof
19
1
0
25 Nov 2022
Lempel-Ziv Networks
Rebecca Saul
Mohammad Mahmudul Alam
John Hurwitz
Edward Raff
Tim Oates
James Holt
26
2
0
23 Nov 2022
BiasBed -- Rigorous Texture Bias Evaluation
Nikolai Kalischek
Rodrigo Caye Daudt
T. Peters
Reinhard Furrer
Jan Dirk Wegner
Konrad Schindler
21
2
0
23 Nov 2022
Policy Gradient With Serial Markov Chain Reasoning
Edoardo Cetin
Oya Celiktutan
BDL
LRM
24
2
0
13 Oct 2022
Deep Reinforcement Learning for Cryptocurrency Trading: Practical Approach to Address Backtest Overfitting
Berend Gort
Xiao-Yang Liu
Xinghang Sun
Jiechao Gao
Shuai Chen
Chris Wang
32
13
0
12 Sep 2022
Reproducibility in machine learning for medical imaging
O. Colliot
Elina Thibeau-Sutre
Ninon Burgos
OOD
30
8
0
12 Sep 2022
Autism spectrum disorder classification based on interpersonal neural synchrony: Can classification be improved by dyadic neural biomarkers using unsupervised graph representation learning?
C. Gerloff
K. Konrad
Jana A. Kruppa
M. Schulte-Rüther
Vanessa Reindl
27
4
0
17 Aug 2022
Why do tree-based models still outperform deep learning on tabular data?
Léo Grinsztajn
Edouard Oyallon
Gaël Varoquaux
LMTD
35
357
0
18 Jul 2022
Robustness Evaluation of Deep Unsupervised Learning Algorithms for Intrusion Detection Systems
D'Jeff K. Nkashama
Ariana Soltani
Jean-Charles Verdier
Marc Frappier
Pierre-Martin Tardif
F. Kabanza
OOD
AAML
29
5
0
25 Jun 2022
Do we need Label Regularization to Fine-tune Pre-trained Language Models?
I. Kobyzev
A. Jafari
Mehdi Rezagholizadeh
Tianda Li
Alan Do-Omri
Peng Lu
Pascal Poupart
A. Ghodsi
30
2
0
25 May 2022
Integrating Reward Maximization and Population Estimation: Sequential Decision-Making for Internal Revenue Service Audit Selection
Peter Henderson
Ben Chugg
Brandon R. Anderson
Kristen M. Altenburger
Alex Turk
J. Guyton
Jacob Goldin
Daniel E. Ho
OffRL
20
9
0
25 Apr 2022
A Revealing Large-Scale Evaluation of Unsupervised Anomaly Detection Algorithms
Maxime Alvarez
Jean-Charles Verdier
D'Jeff K. Nkashama
Marc Frappier
Pierre Martin Tardif
F. Kabanza
26
17
0
21 Apr 2022
Sources of Irreproducibility in Machine Learning: A Review
Odd Erik Gundersen
Kevin Coakley
Christine R. Kirkpatrick
Yolanda Gil
SyDa
27
33
0
15 Apr 2022
deep-significance - Easy and Meaningful Statistical Significance Testing in the Age of Neural Networks
Dennis Ulmer
Christian Hardmeier
J. Frellsen
48
42
0
14 Apr 2022
Experimental Standards for Deep Learning in Natural Language Processing Research
Dennis Ulmer
Elisa Bassignana
Max Müller-Eberstein
Daniel Varab
Mike Zhang
Rob van der Goot
Christian Hardmeier
Barbara Plank
19
10
0
13 Apr 2022
Machine Learning State-of-the-Art with Uncertainties
Peter Steinbach
Felicita Gernhardt
Mahnoor Tanveer
Steve Schmerler
Sebastian Starke
UQCV
OOD
22
3
0
11 Apr 2022
A Siren Song of Open Source Reproducibility
Edward Raff
Andrew L. Farris
16
9
0
09 Apr 2022
Does the Market of Citations Reward Reproducible Work?
Edward Raff
HAI
CML
20
12
0
08 Apr 2022
The worst of both worlds: A comparative analysis of errors in learning from data in psychology and machine learning
Jessica Hullman
Sayash Kapoor
Priyanka Nanayakkara
Andrew Gelman
Arvind Narayanan
33
39
0
12 Mar 2022
Spatial State-Action Features for General Games
Dennis J. N. J. Soemers
Éric Piette
Matthew Stephenson
C. Browne
63
4
0
17 Jan 2022
Empirical Evaluation of Deep Learning Models for Knowledge Tracing: Of Hyperparameters and Metrics on Performance and Replicability
Sami Sarsa
Juho Leinonen
Arto Hellas
19
13
0
30 Dec 2021
Ten years of image analysis and machine learning competitions in dementia
Esther E. Bron
S. Klein
Annika Reinke
J. Papma
Lena Maier-Hein
Daniel C. Alexander
N. Oxtoby
OOD
27
15
0
15 Dec 2021
DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization
Aviral Kumar
Rishabh Agarwal
Tengyu Ma
Aaron Courville
George Tucker
Sergey Levine
OffRL
31
65
0
09 Dec 2021
Evaluating deep transfer learning for whole-brain cognitive decoding
A. Thomas
U. Lindenberger
Wojciech Samek
K. Müller
AI4CE
27
12
0
01 Nov 2021
Learning Pessimism for Robust and Efficient Off-Policy Reinforcement Learning
Edoardo Cetin
Oya Celiktutan
OffRL
42
17
0
07 Oct 2021
A Framework for Cluster and Classifier Evaluation in the Absence of Reference Labels
R. Joyce
Edward Raff
Charles K. Nicholas
48
16
0
23 Sep 2021
Data Augmentation Through Monte Carlo Arithmetic Leads to More Generalizable Classification in Connectomics
Greg Kiar
Yohan Chatelain
A. Salari
Alan C. Evans
Tristan Glatard
OOD
13
3
0
20 Sep 2021
Torch.manual_seed(3407) is all you need: On the influence of random seeds in deep learning architectures for computer vision
David Picard
3DV
VLM
17
88
0
16 Sep 2021
HPOBench: A Collection of Reproducible Multi-Fidelity Benchmark Problems for HPO
Katharina Eggensperger
Philipp Müller
Neeratyoy Mallik
Matthias Feurer
René Sass
Aaron Klein
Noor H. Awad
Marius Lindauer
Frank Hutter
46
100
0
14 Sep 2021
Learning with Holographic Reduced Representations
Ashwinkumar Ganesan
Hang Gao
S. Gandhi
Edward Raff
Tim Oates
James Holt
Mark McLean
13
23
0
05 Sep 2021
Relating the Partial Dependence Plot and Permutation Feature Importance to the Data Generating Process
Christoph Molnar
Timo Freiesleben
Gunnar König
Giuseppe Casalicchio
Marvin N. Wright
B. Bischl
4
60
0
03 Sep 2021
When are Deep Networks really better than Decision Forests at small sample sizes, and how?
Haoyin Xu
K. A. Kinfu
Will LeVine
Sambit Panda
Jayanta Dey
...
M. Kusmanov
F. Engert
Christopher M. White
Joshua T. Vogelstein
Carey E. Priebe
25
23
0
31 Aug 2021
Deep Reinforcement Learning at the Edge of the Statistical Precipice
Rishabh Agarwal
Max Schwarzer
Pablo Samuel Castro
Aaron Courville
Marc G. Bellemare
OffRL
59
639
0
30 Aug 2021
Challenges for cognitive decoding using deep learning methods
A. Thomas
Christopher Ré
R. Poldrack
AI4CE
24
6
0
16 Aug 2021
The Benchmark Lottery
Mostafa Dehghani
Yi Tay
A. Gritsenko
Zhe Zhao
N. Houlsby
Fernando Diaz
Donald Metzler
Oriol Vinyals
42
89
0
14 Jul 2021
Randomness In Neural Network Training: Characterizing The Impact of Tooling
Donglin Zhuang
Xingyao Zhang
Shuaiwen Leon Song
Sara Hooker
25
75
0
22 Jun 2021
GPU Semiring Primitives for Sparse Neighborhood Methods
Corey J. Nolet
Divye Gala
Edward Raff
Joe Eaton
Brad Rees
John Zedlewski
Tim Oates
27
4
0
13 Apr 2021
What is the State of Neural Network Pruning?
Davis W. Blalock
Jose Javier Gonzalez Ortiz
Jonathan Frankle
John Guttag
191
1,032
0
06 Mar 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,996
0
20 Apr 2018