2407.04620
Cited By
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
5 July 2024
Yu Sun
Xinhao Li
Karan Dalal
Jiarui Xu
Arjun Vikram
Genghan Zhang
Yann Dubois
Xinlei Chen
Xiaolong Wang
Sanmi Koyejo
Tatsunori Hashimoto
Carlos Guestrin
Papers citing
"Learning to (Learn at Test Time): RNNs with Expressive Hidden States"
50 / 66 papers shown
Title
Putting It All into Context: Simplifying Agents with LCLMs
Mingjian Jiang
Yangjun Ruan
Luis A. Lastras
Pavan Kapanipathi
Tatsunori Hashimoto
LLMAG
31
0
0
12 May 2025
Overflow Prevention Enhances Long-Context Recurrent LLMs
Assaf Ben-Kish
Itamar Zimerman
M. Jehanzeb Mirza
James R. Glass
Leonid Karlinsky
Raja Giryes
LRM
29
0
0
12 May 2025
Focus on the Likely: Test-time Instance-based Uncertainty Removal
Johannes Schneider
35
0
0
02 May 2025
TTTFusion: A Test-Time Training-Based Strategy for Multimodal Medical Image Fusion in Surgical Robots
Qinhua Xie
Hao Tang
50
0
0
29 Apr 2025
TTRL: Test-Time Reinforcement Learning
Yuxin Zuo
Kaiyan Zhang
Shang Qu
Li Sheng
Xuekai Zhu
Biqing Qi
Youbang Sun
Ganqu Cui
Ning Ding
Bowen Zhou
OffRL
144
1
0
22 Apr 2025
It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization
Ali Behrouz
Meisam Razaviyayn
Peilin Zhong
Vahab Mirrokni
38
0
0
17 Apr 2025
Rethinking Temporal Fusion with a Unified Gradient Descent View for 3D Semantic Occupancy Prediction
Dubing Chen
Huan Zheng
Jin Fang
Xingping Dong
Xianfei Li
Wenlong Liao
Tao He
Pai Peng
Jianbing Shen
42
0
0
17 Apr 2025
Open Problems and a Hypothetical Path Forward in LLM Knowledge Paradigms
Xiaotian Ye
M. Zhang
Shu Wu
KELM
ELM
39
0
0
09 Apr 2025
State Tuning: State-based Test-Time Scaling on RWKV-7
Liu Xiao
Li Zhiyuan
Lin Yueyu
33
0
0
07 Apr 2025
One-Minute Video Generation with Test-Time Training
Karan Dalal
Daniel Koceja
Gashon Hussein
Jiarui Xu
Yue Zhao
...
Tatsunori Hashimoto
Sanmi Koyejo
Yejin Choi
Yu Sun
Xiaolong Wang
ViT
91
3
0
07 Apr 2025
Learning from Streaming Video with Orthogonal Gradients
Tengda Han
Dilara Gokay
Joseph Heyward
Chuhan Zhang
Daniel Zoran
Viorica Patraucean
João Carreira
Dima Damen
Andrew Zisserman
48
0
0
02 Apr 2025
AU-TTT: Vision Test-Time Training model for Facial Action Unit Detection
Bohao Xing
Kaishen Yuan
Zitong Yu
X. Liu
Heikki Kälviäinen
ViT
39
0
0
30 Mar 2025
Resona: Improving Context Copying in Linear Recurrence Models with Retrieval
Xinyu Wang
Linrui Ma
Jerry Huang
Peng Lu
Prasanna Parthasarathi
Xiao-Wen Chang
Boxing Chen
Yufei Cui
KELM
45
1
0
28 Mar 2025
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
Xiaoye Qu
Yafu Li
Zhaochen Su
Weigao Sun
Jianhao Yan
...
Chaochao Lu
Yue Zhang
Xian-Sheng Hua
Bowen Zhou
Yu Cheng
ReLM
OffRL
LRM
85
14
0
27 Mar 2025
Scaled Supervision is an Implicit Lipschitz Regularizer
Z. Ouyang
Chunhui Zhang
Yaning Jia
Soroush Vosoughi
BDL
OffRL
77
0
0
19 Mar 2025
MambaIC: State Space Models for High-Performance Learned Image Compression
Fanhu Zeng
Hao Tang
Yihua Shao
Siyu Chen
Ling Shao
Yan Wang
Mamba
149
0
0
16 Mar 2025
Test-Time Training Provably Improves Transformers as In-context Learners
Halil Alperen Gozeten
M. E. Ildiz
Xuechen Zhang
Mahdi Soltanolkotabi
Marco Mondelli
Samet Oymak
46
1
0
14 Mar 2025
Centaur: Robust End-to-End Autonomous Driving with Test-Time Training
Chonghao Sima
Kashyap Chitta
Zhiding Yu
Shiyi Lan
Ping Luo
Andreas Geiger
Hao Li
Jose M. Alvarez
56
2
0
14 Mar 2025
Transformers without Normalization
Jiachen Zhu
Xinlei Chen
Kaiming He
Yann LeCun
Zhuang Liu
ViT
OffRL
65
7
0
13 Mar 2025
BlackGoose Rimer: Harnessing RWKV-7 as a Simple yet Superior Replacement for Transformers in Large-Scale Time Series Modeling
Li Weile
Liu Xiao
57
1
0
08 Mar 2025
Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts
Weigao Sun
Disen Lan
Tong Zhu
Xiaoye Qu
Yu-Xi Cheng
MoE
103
2
0
07 Mar 2025
L²M: Mutual Information Scaling Law for Long-Context Language Modeling
Zhuo Chen
Oriol Mayné i Comas
Zhuotao Jin
Di Luo
Marin Soljacic
67
0
0
06 Mar 2025
Delta-WKV: A Novel Meta-in-Context Learner for MRI Super-Resolution
Rongchang Lu
Bingcheng Liao
Haowen Hou
Jiahang Lv
Xin Hai
44
0
0
28 Feb 2025
Sliding Window Attention Training for Efficient Large Language Models
Zichuan Fu
Wentao Song
Yixuan Wang
X. Wu
Yefeng Zheng
Yingying Zhang
Derong Xu
Xuetao Wei
Tong Xu
Xiangyu Zhao
78
1
0
26 Feb 2025
MoM: Linear Sequence Modeling with Mixture-of-Memories
Jusen Du
Weigao Sun
Disen Lan
Jiaxi Hu
Yu-Xi Cheng
KELM
75
3
0
19 Feb 2025
Uni-Retrieval: A Multi-Style Retrieval Framework for STEM's Education
Yanhao Jia
Xinyi Wu
Hao Li
Qinglin Zhang
Yuxiao Hu
Shuai Zhao
Wenqi Fan
48
2
0
09 Feb 2025
Adaptive Self-improvement LLM Agentic System for ML Library Development
Genghan Zhang
Weixin Liang
Olivia Hsu
K. Olukotun
152
0
0
04 Feb 2025
Explaining Context Length Scaling and Bounds for Language Models
Jingzhe Shi
Qinwei Ma
Hongyi Liu
Hang Zhao
Jenq-Neng Hwang
Serge Belongie
LRM
79
2
0
03 Feb 2025
Video Latent Flow Matching: Optimal Polynomial Projections for Video Interpolation and Extrapolation
Yang Cao
Zhao-quan Song
Chiwun Yang
VGen
46
2
0
01 Feb 2025
LIFT: Improving Long Context Understanding Through Long Input Fine-Tuning
Yansheng Mao
Jiaqi Li
Fanxu Meng
Jing Xiong
Zilong Zheng
Muhan Zhang
LLMAG
RALM
96
1
0
18 Dec 2024
MAL: Cluster-Masked and Multi-Task Pretraining for Enhanced xLSTM Vision Performance
Wenjun Huang
Jianguo Hu
84
0
0
14 Dec 2024
Marconi: Prefix Caching for the Era of Hybrid LLMs
Rui Pan
Zhuang Wang
Zhen Jia
Can Karakus
Luca Zancato
Tri Dao
Ravi Netravali
Yida Wang
95
4
0
28 Nov 2024
MobileMamba: Lightweight Multi-Receptive Visual Mamba Network
Haoyang He
Jingyang Zhang
Yuxuan Cai
Hongxu Chen
Xiaobin Hu
Zhenye Gan
Yishuo Wang
Chengjie Wang
Yunsheng Wu
Lei Xie
Mamba
88
3
0
24 Nov 2024
Best of Both Worlds: Advantages of Hybrid Graph Sequence Models
Ali Behrouz
Ali Parviz
Mahdi Karami
Clayton Sanford
Bryan Perozzi
Vahab Mirrokni
81
2
0
23 Nov 2024
DATTA: Domain-Adversarial Test-Time Adaptation for Cross-Domain WiFi-Based Human Activity Recognition
Julian Strohmayer
Rafael Sterzinger
Matthias Wödlinger
Martin Kampel
TTA
92
0
0
20 Nov 2024
Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
Riccardo Grazzi
Julien N. Siems
Jörg Franke
Arber Zela
Frank Hutter
Massimiliano Pontil
92
11
0
19 Nov 2024
StreamAdapter: Efficient Test Time Adaptation from Contextual Streams
Dilxat Muhtar
Yelong Shen
Yuqing Yang
Xiaodong Liu
Yadong Lu
...
Feng Sun
Xueliang Zhang
Jianfeng Gao
Weizhu Chen
Qi Zhang
TTA
64
0
0
14 Nov 2024
Clustering Algorithms and RAG Enhancing Semi-Supervised Text Classification with Large LLMs
Shan Zhong
Jiahao Zeng
Yongxin Yu
Bohong Lin
36
1
0
09 Nov 2024
Wave Network: An Ultra-Small Language Model
Xin Zhang
Victor S. Sheng
41
1
0
04 Nov 2024
Human-inspired Perspectives: A Survey on AI Long-term Memory
Zihong He
Weizhe Lin
Hao Zheng
Fan Zhang
Matt Jones
Laurence Aitchison
X. Xu
Miao Liu
Per Ola Kristensson
Junxiao Shen
77
2
0
01 Nov 2024
Brain-like variational inference
Hadi Vafaii
Dekel Galor
Jacob L. Yates
DRL
49
0
0
25 Oct 2024
Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination
Jerry Huang
Prasanna Parthasarathi
Mehdi Rezagholizadeh
Boxing Chen
Sarath Chandar
53
0
0
22 Oct 2024
scFusionTTT: Single-cell transcriptomics and proteomics fusion with Test-Time Training layers
Dian Meng
Bohao Xing
Xinlei Huang
Yanran Liu
Yijun Zhou
Yongjun Xiao
Zitong Yu
Xubin Zheng
28
1
0
17 Oct 2024
How much do contextualized representations encode long-range context?
Simeng Sun
Cheng-Ping Hsieh
41
0
0
16 Oct 2024
State-space models can learn in-context by gradient descent
Neeraj Mohan Sushma
Yudou Tian
Harshvardhan Mestha
Nicolo Colombo
David Kappel
Anand Subramoney
41
3
0
15 Oct 2024
MatMamba: A Matryoshka State Space Model
Abhinav Shukla
Sai H. Vemprala
Aditya Kusupati
Ashish Kapoor
Mamba
28
0
0
09 Oct 2024
On Efficient Variants of Segment Anything Model: A Survey
Xiaorui Sun
Xiaozhong Liu
H. Shen
Xiaofeng Zhu
Ping Hu
VLM
48
4
0
07 Oct 2024
Med-TTT: Vision Test-Time Training model for Medical Image Segmentation
Jiashu Xu
ViT
LM&MA
MedIm
28
0
0
03 Oct 2024
The Role of Deductive and Inductive Reasoning in Large Language Models
Chengkun Cai
Xu Zhao
Haoliang Liu
Zhongyu Jiang
Tianfang Zhang
Zongkai Wu
Jenq-Neng Hwang
Serge Belongie
Lei Li
LRM
37
2
0
03 Oct 2024
A Survey for Deep Reinforcement Learning Based Network Intrusion Detection
Wanrong Yang
Alberto Acuto
Yihang Zhou
Dominik Wojtczak
OffRL
36
2
0
25 Sep 2024