A Reasoning-Focused Legal Retrieval Benchmark Lucia Zheng Neel Guha Javokhir Arifov Sarah Zhang Michal Skreta Christopher D. Manning Peter Henderson Daniel E. Ho AILaw RALM ELM 94 3 0 06 May 2025
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining Jeffrey Li Mohammadreza Armandpour Iman Mirzadeh Sachin Mehta Vaishaal Shankar ... Samy Bengio Oncel Tuzel Mehrdad Farajtabar Hadi Pouransari Fartash Faghri CLL KELM 61 0 0 02 Apr 2025
Measuring temporal effects of agent knowledge by date-controlled tool use R. Xian Qiming Cui Stefan Bauer Reza Abbasi-Asl KELM 65 0 0 06 Mar 2025
Reinforced Lifelong Editing for Language Models Zherui Li Houcheng Jiang Hao Chen Baolong Bi Zhenhong Zhou Fei Sun Fan Zhang Qing Guo KELM 56 5 0 09 Feb 2025
Evolution and The Knightian Blindspot of Machine Learning Joel Lehman Elliot Meyerson Tarek El-Gaaly Kenneth O. Stanley Tarin Ziyaee 86 1 0 22 Jan 2025
Gradient Localization Improves Lifelong Pretraining of Language Models Jared Fernandez Yonatan Bisk Emma Strubell KELM 39 1 0 07 Nov 2024
Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models Sitao Cheng Liangming Pan Xunjian Yin Xinyi Wang William Yang Wang KELM 39 4 0 10 Oct 2024
Towards understanding evolution of science through language model series Junjie Dong Zhuoqi Lyu Qing Ke AI4TS 35 0 0 15 Sep 2024
AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge Han Wang Archiki Prasad Elias Stengel-Eskin Joey Tianyi Zhou 82 5 0 11 Sep 2024
CHEW: A Dataset of CHanging Events in Wikipedia Hsuvas Borkakoty Luis Espinosa-Anke 48 1 0 27 Jun 2024
Teaching LLMs to Abstain across Languages via Multilingual Feedback Shangbin Feng Weijia Shi Yike Wang Wenxuan Ding Orevaoghene Ahia Shuyue Stella Li Vidhisha Balachandran Sunayana Sitaram Yulia Tsvetkov 72 4 0 22 Jun 2024
HelloFresh: LLM Evaluations on Streams of Real-World Human Editorial Actions across X Community Notes and Wikipedia edits Tim Franzmeyer Aleksandar Shtedritski Samuel Albanie Philip H. S. Torr João F. Henriques Jakob N. Foerster 27 1 0 05 Jun 2024
SAVA: Scalable Learning-Agnostic Data Valuation Samuel Kessler Tam Le Vu Nguyen TDI 59 0 0 03 Jun 2024
Stable Neural Stochastic Differential Equations in Analyzing Irregular Time Series Data YongKyung Oh Dongyoung Lim Sungil Kim AI4TS 43 12 0 22 Feb 2024
Temporal Blind Spots in Large Language Models Jonas Wallat Adam Jatowt Avishek Anand 38 3 0 22 Jan 2024
Time is Encoded in the Weights of Finetuned Language Models Kai Nylund Suchin Gururangan Noah A. Smith AI4TS 28 17 0 20 Dec 2023
Faithful Persona-based Conversational Dataset Generation with Large Language Models Pegah Jandaghi XiangHai Sheng Xinyi Bai Jay Pujara Hakim Sidahmed 29 21 0 15 Dec 2023
Continual Learning: Applications and the Road Forward Eli Verwimp Rahaf Aljundi Shai Ben-David Matthias Bethge Andrea Cossu ... J. Weijer Bing Liu Vincenzo Lomonaco Tinne Tuytelaars Gido M. van de Ven CLL 43 44 0 20 Nov 2023
Geometric Data Augmentations to Mitigate Distribution Shifts in Pollen Classification from Microscopic Images Nam Cao O. Saukh 29 2 0 18 Nov 2023
Latent Feature-based Data Splits to Improve Generalisation Evaluation: A Hate Speech Detection Case Study Maike Züfle Verna Dankers Ivan Titov 42 0 0 16 Nov 2023
Benchmarking Multilabel Topic Classification in the Kyrgyz Language Anton M. Alekseev Sergey I. Nikolenko Gulnara Kabaeva 27 3 0 30 Aug 2023
Evaluating the Ripple Effects of Knowledge Editing in Language Models Roi Cohen Eden Biran Ori Yoran Amir Globerson Mor Geva KELM 42 155 0 24 Jul 2023
Empowering Cross-lingual Behavioral Testing of NLP Models with Typological Features Ester Hlavnova Sebastian Ruder 32 5 0 11 Jul 2023
Out-of-Distribution Generalization in Text Classification: Past, Present, and Future Linyi Yang Yangqiu Song Xuan Ren Chenyang Lyu Yidong Wang Lingqiao Liu Jindong Wang Jennifer Foster Yue Zhang OOD 37 2 0 23 May 2023
On the Limitations of Simulating Active Learning Katerina Margatina Nikolaos Aletras 31 11 0 21 May 2023
Revisiting Entropy Rate Constancy in Text Vivek Verma Nicholas Tomlin Dan Klein 18 4 0 20 May 2023
Unsupervised Semantic Variation Prediction using the Distribution of Sibling Embeddings Taichi Aida Danushka Bollegala 20 8 0 15 May 2023
SwissBERT: The Multilingual Language Model for Switzerland Jannis Vamvas Johannes Graen Rico Sennrich 38 6 0 23 Mar 2023
Improving Transformer Performance for French Clinical Notes Classification Using Mixture of Experts on a Limited Dataset Thanh-Dung Le P. Jouvet R. Noumeir MoE MedIm 72 5 0 22 Mar 2023
An Overview on Language Models: Recent Developments and Outlook Chengwei Wei Yun Cheng Wang Bin Wang C.-C. Jay Kuo 25 42 0 10 Mar 2023
Diagnosing Model Performance Under Distribution Shift Tiffany Cai Hongseok Namkoong Steve Yadlowsky 37 27 0 03 Mar 2023
Dynamic Benchmarking of Masked Language Models on Temporal Concept Drift with Multiple Views Katerina Margatina Shuai Wang Yogarshi Vyas Neha Ann John Yassine Benajiba Miguel Ballesteros 17 15 0 23 Feb 2023
Two Losses Are Better Than One: Faster Optimization Using a Cheaper Proxy Blake E. Woodworth Konstantin Mishchenko Francis R. Bach 42 6 0 07 Feb 2023
TempEL: Linking Dynamically Evolving and Newly Emerging Entities Klim Zaporojets Lucie-Aimée Kaffee Johannes Deleu Thomas Demeester Chris Develder Isabelle Augenstein KELM 36 15 0 05 Feb 2023
Addressing Distribution Shift at Test Time in Pre-trained Language Models Ayush Singh J. Ortega VLM 24 4 0 05 Dec 2022
Time-Aware Datasets are Adaptive Knowledgebases for the New Normal Abhijit Suprem Sanjyot Vaidya J. Ferreira C. Pu 29 2 0 22 Nov 2022
Large Language Models with Controllable Working Memory Daliang Li A. S. Rawat Manzil Zaheer Xin Wang Michal Lukasik Andreas Veit Felix X. Yu Surinder Kumar KELM 55 152 0 09 Nov 2022
Time-aware Prompting for Text Generation Shuyang Cao Lu Wang 26 11 0 03 Nov 2022
Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change Zhao-yu Su Zecheng Tang Xinyan Guan Juntao Li Lijun Wu M. Zhang CLL AI4CE 32 22 0 31 Oct 2022
Can Language Representation Models Think in Bets? Zhi-Bin Tang Mayank Kejriwal 15 6 0 14 Oct 2022
Mass-Editing Memory in a Transformer Kevin Meng Arnab Sen Sharma A. Andonian Yonatan Belinkov David Bau KELM VLM 35 525 0 13 Oct 2022
Named Entity Recognition in Twitter: A Dataset and Analysis on Short-Term Temporal Shifts Asahi Ushio Leonardo Neves Vítor Silva Francesco Barbieri Jose Camacho-Collados 25 26 0 07 Oct 2022
Env-Aware Anomaly Detection: Ignore Style Changes, Stay True to Content! Stefan Smeu Elena Burceanu Andrei Liviu Nicolicioiu Emanuela Haller 35 4 0 06 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review Dieuwke Hupkes Mario Giulianelli Verna Dankers Mikel Artetxe Yanai Elazar ... Leila Khalatbari Maria Ryskina Rita Frieske Ryan Cotterell Zhijing Jin 116 93 0 06 Oct 2022
Twitter Topic Classification Dimosthenis Antypas Asahi Ushio Jose Camacho-Collados Leonardo Neves Vítor Silva Francesco Barbieri 35 31 0 20 Sep 2022
RealTime QA: What's the Answer Right Now? Jungo Kasai Keisuke Sakaguchi Yoichi Takahashi Ronan Le Bras Akari Asai Xinyan Velocity Yu Dragomir R. Radev Noah A. Smith Yejin Choi Kentaro Inui KELM 45 165 0 27 Jul 2022
Link the World: Improving Open-domain Conversation with Dynamic Spatiotemporal-aware Knowledge Han Zhou Xinchao Xu Wenquan Wu Zheng-Yu Niu Hua-Hong Wu Siqi Bao Fan Wang Haifeng Wang KELM 27 7 0 28 Jun 2022
Memory-Based Model Editing at Scale E. Mitchell Charles Lin Antoine Bosselut Christopher D. Manning Chelsea Finn KELM 35 318 0 13 Jun 2022
Building for Tomorrow: Assessing the Temporal Persistence of Text Classifiers Rabab Alkhalifa E. Kochkina A. Zubiaga 19 25 0 11 May 2022
Entity Cloze By Date: What LMs Know About Unseen Entities Yasumasa Onoe Michael J.Q. Zhang Eunsol Choi Greg Durrett KELM 21 49 0 05 May 2022