Evaluation of Neural Architectures Trained with Square Loss vs Cross-Entropy in Classification Tasks

12 June 2020
Like Hui
M. Belkin
    UQCV
    AAML
    VLM
arXiv:2006.07322
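For context on the comparison named in the title: it amounts to training the same architecture under two objectives, cross-entropy on the class logits, or square (MSE) loss applied to the logits against one-hot encodings of the labels. Below is a minimal, hypothetical PyTorch sketch of that setup; the linear model, data, and hyperparameters are placeholder assumptions and are not taken from the paper.

```python
# Minimal sketch (assumptions, not the paper's code): the same classifier
# trained with either cross-entropy or square loss on one-hot targets.
import torch
import torch.nn as nn
import torch.nn.functional as F

def classification_loss(logits, targets, kind, num_classes):
    if kind == "cross_entropy":
        # Standard cross-entropy on raw logits.
        return F.cross_entropy(logits, targets)
    # Square loss: regress the logits onto one-hot encodings of the labels.
    one_hot = F.one_hot(targets, num_classes).float()
    return F.mse_loss(logits, one_hot)

# Placeholder model and data; only the loss function differs between runs.
model = nn.Linear(784, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(32, 784)
y = torch.randint(0, 10, (32,))

for kind in ("cross_entropy", "square"):
    logits = model(x)
    loss = classification_loss(logits, y, kind, num_classes=10)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

At prediction time both variants classify via argmax over the logits, so only the training objective changes between the two runs.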

Papers citing "Evaluation of Neural Architectures Trained with Square Loss vs Cross-Entropy in Classification Tasks"

45 / 45 papers shown
Super-fast rates of convergence for Neural Networks Classifiers under the Hard Margin Condition
Nathanael Tepakbong
Ding-Xuan Zhou
Xiang Zhou
33
0
0
13 May 2025
Continuous Visual Autoregressive Generation via Score Maximization
Chenze Shao
Fandong Meng
Jie Zhou
DiffM
26
0
0
12 May 2025
The Silent Majority: Demystifying Memorization Effect in the Presence of Spurious Correlations
Chenyu You
Haocheng Dai
Yifei Min
Jasjeet Sekhon
S. Joshi
James S. Duncan
60
2
0
01 Jan 2025
Does SGD really happen in tiny subspaces?
Minhak Song
Kwangjun Ahn
Chulhee Yun
66
4
1
25 May 2024
Towards Certification of Uncertainty Calibration under Adversarial Attacks
Cornelius Emde
Francesco Pinto
Thomas Lukasiewicz
Philip H. S. Torr
Adel Bibi
AAML
42
0
0
22 May 2024
The Interpolating Information Criterion for Overparameterized Models
Liam Hodgkinson
Christopher van der Heide
Roberto Salomone
Fred Roosta
Michael W. Mahoney
20
7
0
15 Jul 2023
A Neural Collapse Perspective on Feature Evolution in Graph Neural Networks
Vignesh Kothapalli
Tom Tirer
Joan Bruna
34
10
0
04 Jul 2023
Dual Focal Loss for Calibration
Linwei Tao
Minjing Dong
Chang Xu
UQCV
37
26
0
23 May 2023
Fairness Uncertainty Quantification: How certain are you that the model is fair?
Abhishek Roy
P. Mohapatra
14
5
0
27 Apr 2023
Automatic Gradient Descent: Deep Learning without Hyperparameters
Jeremy Bernstein
Chris Mingard
Kevin Huang
Navid Azizan
Yisong Yue
ODL
16
17
0
11 Apr 2023
General Loss Functions Lead to (Approximate) Interpolation in High Dimensions
Kuo-Wei Lai
Vidya Muthukumar
18
5
0
13 Mar 2023
Generalizing and Decoupling Neural Collapse via Hyperspherical Uniformity Gap
Weiyang Liu
L. Yu
Adrian Weller
Bernhard Schölkopf
32
17
0
11 Mar 2023
Calibrating a Deep Neural Network with Its Predecessors
Linwei Tao
Minjing Dong
Daochang Liu
Changming Sun
Chang Xu
BDL
UQCV
6
5
0
13 Feb 2023
Cut your Losses with Squentropy
Like Hui
M. Belkin
S. Wright
UQCV
13
8
0
08 Feb 2023
Pathologies of Predictive Diversity in Deep Ensembles
Taiga Abe
E. Kelly Buchanan
Geoff Pleiss
John P. Cunningham
UQCV
38
13
0
01 Feb 2023
Supervision Complexity and its Role in Knowledge Distillation
Hrayr Harutyunyan
A. S. Rawat
A. Menon
Seungyeon Kim
Surinder Kumar
22
12
0
28 Jan 2023
Annealing Double-Head: An Architecture for Online Calibration of Deep Neural Networks
Erdong Guo
D. Draper
Maria de Iorio
35
0
0
27 Dec 2022
Principled and Efficient Transfer Learning of Deep Models via Neural Collapse
Xiao Li
Sheng Liu
Jin-li Zhou
Xin Lu
C. Fernandez‐Granda
Zhihui Zhu
Q. Qu
AAML
23
18
0
23 Dec 2022
Perturbation Analysis of Neural Collapse
Tom Tirer
Haoxiang Huang
Jonathan Niles-Weed
AAML
30
23
0
29 Oct 2022
The Fisher-Rao Loss for Learning under Label Noise
Henrique K. Miyamoto
Fábio C. C. Meneghetti
Sueli I. R. Costa
NoLa
18
5
0
28 Oct 2022
Towards Understanding GD with Hard and Conjugate Pseudo-labels for Test-Time Adaptation
Jun-Kun Wang
Andre Wibisono
29
7
0
18 Oct 2022
Are All Losses Created Equal: A Neural Collapse Perspective
Jinxin Zhou
Chong You
Xiao Li
Kangning Liu
Sheng Liu
Qing Qu
Zhihui Zhu
30
58
0
04 Oct 2022
Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition
Jianhao Ma
Li-Zhen Guo
S. Fattahi
38
4
0
01 Oct 2022
Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel
Sungyub Kim
Si-hun Park
Kyungsu Kim
Eunho Yang
BDL
26
4
0
30 Sep 2022
Blessing of Nonconvexity in Deep Linear Models: Depth Flattens the Optimization Landscape Around the True Solution
Jianhao Ma
S. Fattahi
40
5
0
15 Jul 2022
Making Look-Ahead Active Learning Strategies Feasible with Neural Tangent Kernels
Mohamad Amin Mohamadi
Wonho Bae
Danica J. Sutherland
28
20
0
25 Jun 2022
Neural Collapse: A Review on Modelling Principles and Generalization
Vignesh Kothapalli
21
71
0
08 Jun 2022
Standalone Neural ODEs with Sensitivity Analysis
Rym Jaroudi
Lukáš Malý
Gabriel Eilertsen
B. Johansson
Jonas Unger
George Baravdish
21
0
0
27 May 2022
Investigating Generalization by Controlling Normalized Margin
Alexander R. Farhang
Jeremy Bernstein
Kushal Tirumala
Yang Liu
Yisong Yue
25
6
0
08 May 2022
The Effects of Regularization and Data Augmentation are Class Dependent
Randall Balestriero
Léon Bottou
Yann LeCun
28
94
0
07 Apr 2022
Generalization Through The Lens Of Leave-One-Out Error
Gregor Bachmann
Thomas Hofmann
Aurélien Lucchi
44
7
0
07 Mar 2022
On the Optimization Landscape of Neural Collapse under MSE Loss: Global Optimality with Unconstrained Features
Jinxin Zhou
Xiao Li
Tian Ding
Chong You
Qing Qu
Zhihui Zhu
22
97
0
02 Mar 2022
On the Regularization of Autoencoders
Harald Steck
Dario Garcia-Garcia
SSL
AI4CE
27
4
0
21 Oct 2021
slimTrain -- A Stochastic Approximation Method for Training Separable Deep Neural Networks
Elizabeth Newman
Julianne Chung
Matthias Chung
Lars Ruthotto
39
6
0
28 Sep 2021
A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning
Yehuda Dar
Vidya Muthukumar
Richard G. Baraniuk
29
71
0
06 Sep 2021
Memorization in Deep Neural Networks: Does the Loss Function matter?
Deep Patel
P. Sastry
TDI
13
8
0
21 Jul 2021
Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation
Haoxiang Wang
Han Zhao
Bo-wen Li
29
88
0
16 Jun 2021
RATT: Leveraging Unlabeled Data to Guarantee Generalization
Saurabh Garg
Sivaraman Balakrishnan
J. Zico Kolter
Zachary Chase Lipton
28
29
0
01 May 2021
Understanding and Mitigating Accuracy Disparity in Regression
Jianfeng Chi
Yuan Tian
Geoffrey J. Gordon
Han Zhao
16
25
0
24 Feb 2021
Modeling Dynamic User Interests: A Neural Matrix Factorization Approach
Paramveer S. Dhillon
Sinan Aral
AI4TS
17
19
0
12 Feb 2021
Estimating informativeness of samples with Smooth Unique Information
Hrayr Harutyunyan
Alessandro Achille
Giovanni Paolini
Orchid Majumder
Avinash Ravichandran
Rahul Bhotika
Stefano Soatto
16
24
0
17 Jan 2021
Gradient Starvation: A Learning Proclivity in Neural Networks
Mohammad Pezeshki
Sekouba Kaba
Yoshua Bengio
Aaron Courville
Doina Precup
Guillaume Lajoie
MLT
45
257
0
18 Nov 2020
Classification vs regression in overparameterized regimes: Does the loss function matter?
Vidya Muthukumar
Adhyyan Narang
Vignesh Subramanian
M. Belkin
Daniel J. Hsu
A. Sahai
36
148
0
16 May 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,956
0
20 Apr 2018
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
Y. Gal
Zoubin Ghahramani
UQCV
BDL
285
9,136
0
06 Jun 2015