ResearchTrend.AI

Closed-Form Feedback-Free Learning with Forward Projection
arXiv:2501.16476 · 27 January 2025
Robert O'Shea, Bipin Rajendran

Papers citing "Closed-Form Feedback-Free Learning with Forward Projection"

17 / 17 papers shown
1. Sparsification and Reconstruction from the Perspective of Representation Geometry — Wenjie Sun, Bingzhe Wu, Zhile Yang, Chengke Wu (28 May 2025)
2. Zero-Shot Vision Encoder Grafting via LLM Surrogates — Kaiyu Yue, Vasu Singla, Menglin Jia, John Kirchenbauer, Rifaa Qadri, Zikui Cai, A. Bhatele, Furong Huang, Tom Goldstein [VLM] (28 May 2025)
3. Factual Self-Awareness in Language Models: Representation, Robustness, and Scaling — Hovhannes Tamoyan, Subhabrata Dutta, Iryna Gurevych [HILM, KELM] (27 May 2025)
4. Towards Interpretability Without Sacrifice: Faithful Dense Layer Decomposition with Mixture of Decoders — James Oldfield, Shawn Im, Yixuan Li, M. Nicolaou, Ioannis Patras, Grigorios G. Chrysos [MoE] (27 May 2025)
5. Beyond Prompt Engineering: Robust Behavior Control in LLMs via Steering Target Atoms — Mengru Wang, Ziwen Xu, Shengyu Mao, Shumin Deng, Zhaopeng Tu, Ningyu Zhang, N. Zhang [LLMSV] (23 May 2025)
6. Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models — Patrick Leask, Neel Nanda, Noura Al Moubayed (23 May 2025)
7. ProxySPEX: Inference-Efficient Interpretability via Sparse Feature Interactions in LLMs — Landon Butler, Abhineet Agarwal, Justin Singh Kang, Yigit Efe Erginbas, Bin Yu, Kannan Ramchandran (23 May 2025)
8. Interpretability Illusions with Sparse Autoencoders: Evaluating Robustness of Concept Representations — Aaron Jiaxun Li, Suraj Srinivas, Usha Bhalla, Himabindu Lakkaraju [AAML] (21 May 2025)
9. On the creation of narrow AI: hierarchy and nonlocality of neural network skills — Eric J. Michaud, Asher Parker-Sartori, Max Tegmark (21 May 2025)
10. Textual Steering Vectors Can Improve Visual Understanding in Multimodal Large Language Models — Woody Haosheng Gan, Deqing Fu, Julian Asilis, Ollie Liu, Dani Yogatama, Vatsal Sharan, Robin Jia, Willie Neiswanger [LLMSV] (20 May 2025)
11. Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis — Akarsh Kumar, Jeff Clune, Joel Lehman, Kenneth O. Stanley [OOD] (16 May 2025)
12. Are Sparse Autoencoders Useful for Java Function Bug Detection? — Rui Melo, Claudia Mamede, Andre Catarino, Rui Abreu, Henrique Lopes Cardoso (15 May 2025)
13. MIB: A Mechanistic Interpretability Benchmark — Aaron Mueller, Atticus Geiger, Sarah Wiegreffe, Dana Arad, Iván Arcuschin, ..., Alessandro Stolfo, Martin Tutek, Amir Zur, David Bau, Yonatan Belinkov (17 Apr 2025)
14. On Language Models' Sensitivity to Suspicious Coincidences — Sriram Padmanabhan, Kanishka Misra, Kyle Mahowald, Eunsol Choi [ReLM, LRM] (13 Apr 2025)
15. Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning — Julian Minder, Clement Dumas, Caden Juang, Bilal Chugtai, Neel Nanda (03 Apr 2025)
16. How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training — Yixin Ou, Yunzhi Yao, N. Zhang, Hui Jin, Jiacheng Sun, Shumin Deng, Zechao Li, Ningyu Zhang [KELM, CLL] (16 Feb 2025)
17. The Geometry of Concepts: Sparse Autoencoder Feature Structure — Yuxiao Li, Eric J. Michaud, David D. Baek, Joshua Engels, Xiaoqing Sun, Max Tegmark (10 Oct 2024)