Getting aligned on representational alignment

18 October 2023
Ilia Sucholutsky
Lukas Muttenthaler
Adrian Weller
Andi Peng
Andreea Bobu
Been Kim
Bradley C. Love
Erin Grant
Iris Groen
Jascha Achterberg
Joshua B. Tenenbaum
Katherine M. Collins
Katherine L. Hermann
Kerem Oktar
Klaus Greff
M. Hebart
Nori Jacoby
Qiuyi Zhang
Raja Marjieh
Robert Geirhos
Sherol Chen
Simon Kornblith
Sunayana Rane
Talia Konkle
Thomas P. O'Connell
Thomas Unterthiner
Andrew Kyle Lampinen
Klaus-Robert Müller
M. Toneva
Thomas L. Griffiths

Papers citing "Getting aligned on representational alignment"

26 / 26 papers shown
A Mathematical Philosophy of Explanations in Mechanistic Interpretability -- The Strange Science Part I.i
Kola Ayonrinde
Louis Jaburi
MILM
86
1
0
01 May 2025
ReSi: A Comprehensive Benchmark for Representational Similarity Measures
Max Klabunde
Tassilo Wald
Tobias Schumacher
Klaus H. Maier-Hein
Markus Strohmaier
Adriana Iamnitchi
AI4TS
VLM
76
5
0
13 Mar 2025
Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment
Harrish Thasarathan
Julian Forsyth
Thomas Fel
M. Kowal
Konstantinos G. Derpanis
111
7
0
06 Feb 2025
We're Different, We're the Same: Creative Homogeneity Across LLMs
Emily Wenger
Yoed Kenett
91
3
0
31 Jan 2025
Dimensions underlying the representational alignment of deep neural networks with humans
F. Mahner
Lukas Muttenthaler
Umut Güçlü
M. Hebart
48
4
0
28 Jan 2025
Measuring Error Alignment for Decision-Making Systems
Binxia Xu
Antonis Bikakis
Daniel Onah
A. Vlachidis
Luke Dickens
41
0
0
03 Jan 2025
Differentiable Optimization of Similarity Scores Between Models and Brains
Nathan Cloos
Moufan Li
Markus Siegel
S. Brincat
Earl K. Miller
Guangyu Robert Yang
Christopher J. Cueva
45
6
0
31 Dec 2024
Quantifying Knowledge Distillation Using Partial Information Decomposition
Pasan Dissanayake
Faisal Hamman
Barproda Halder
Ilia Sucholutsky
Qiuyi Zhang
Sanghamitra Dutta
36
0
0
12 Nov 2024
Sparse Autoencoders Reveal Universal Feature Spaces Across Large Language Models
Michael Lan
Philip H. S. Torr
Austin Meek
Ashkan Khakzar
David M. Krueger
Fazl Barez
43
10
0
09 Oct 2024
Emergence of a High-Dimensional Abstraction Phase in Language Transformers
Emily Cheng
Diego Doimo
Corentin Kervadec
Iuri Macocco
Jade Yu
A. Laio
Marco Baroni
112
11
0
24 May 2024
Learned feature representations are biased by complexity, learning order, position, and more
Andrew Kyle Lampinen
Stephanie C. Y. Chan
Katherine Hermann
AI4CE
FaML
SSL
OOD
34
6
0
09 May 2024
Learning with Language-Guided State Abstractions
Andi Peng
Ilia Sucholutsky
Belinda Z. Li
T. Sumers
Thomas L. Griffiths
Jacob Andreas
Julie A. Shah
LM&Ro
49
13
0
28 Feb 2024
Similarity of Neural Network Models: A Survey of Functional and Representational Measures
Max Klabunde
Tobias Schumacher
M. Strohmaier
Florian Lemmerich
52
64
0
10 May 2023
Human Uncertainty in Concept-Based AI Systems
Katherine M. Collins
Matthew Barker
M. Zarlenga
Naveen Raman
Umang Bhatt
M. Jamnik
Ilia Sucholutsky
Adrian Weller
Krishnamurthy Dvijotham
66
39
0
22 Mar 2023
Analyzing Diffusion as Serial Reproduction
Raja Marjieh
Ilia Sucholutsky
Thomas A. Langlois
Nori Jacoby
Thomas L. Griffiths
DiffM
33
4
0
29 Sep 2022
Improving alignment of dialogue agents via targeted human judgements
Amelia Glaese
Nat McAleese
Maja Trębacz
John Aslanides
Vlad Firoiu
...
John F. J. Mellor
Demis Hassabis
Koray Kavukcuoglu
Lisa Anne Hendricks
G. Irving
ALM
AAML
227
502
0
28 Sep 2022
Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off
M. Zarlenga
Pietro Barbiero
Gabriele Ciravegna
G. Marra
Francesco Giannini
...
F. Precioso
S. Melacci
Adrian Weller
Pietro Liò
M. Jamnik
79
52
0
19 Sep 2022
The developmental trajectory of object recognition robustness: children are like small adults but unlike big deep neural networks
Lukas Huber
Robert Geirhos
Felix Wichmann
54
16
0
20 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
313
11,953
0
04 Mar 2022
Passive Attention in Artificial Neural Networks Predicts Human Visual Selectivity
Thomas A. Langlois
H. C. Zhao
Erin Grant
Ishita Dasgupta
Thomas L. Griffiths
Nori Jacoby
47
15
0
14 Jul 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
255
4,781
0
24 Feb 2021
On the surprising similarities between supervised and self-supervised models
Robert Geirhos
Kantharaju Narayanappa
Benjamin Mitzkus
Matthias Bethge
Felix Wichmann
Wieland Brendel
OOD
SSL
DRL
74
46
0
16 Oct 2020
On Completeness-aware Concept-Based Explanations in Deep Neural Networks
Chih-Kuan Yeh
Been Kim
Sercan Ö. Arik
Chun-Liang Li
Tomas Pfister
Pradeep Ravikumar
FAtt
122
297
0
17 Oct 2019
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
280
1,595
0
18 Sep 2019
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn
Pieter Abbeel
Sergey Levine
OOD
338
11,684
0
09 Mar 2017
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV
BDL
276
5,661
0
05 Dec 2016