1810.04805
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
Papers citing
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
50 / 23,491 papers shown
Title
Applying Large Language Models to Issue Classification: Revisiting with Extended Data and New Models
Gabriel Aracena
Kyle Luster
Fabio Santos
Igor Steinmacher
M. Gerosa
31
0
0
30 May 2025
Structure-Aware Fill-in-the-Middle Pretraining for Code
Linyuan Gong
Alvin Cheung
Mostafa Elhoushi
Sida Wang
CLL
AI4CE
26
0
0
30 May 2025
Knowing Before Saying: LLM Representations Encode Information About Chain-of-Thought Success Before Completion
Anum Afzal
Florian Matthes
Gal Chechik
Yftah Ziser
LRM
45
0
0
30 May 2025
CoRet: Improved Retriever for Code Editing
Fabio Fehr
Prabhu Teja Sivaprasad
Luca Franceschi
Giovanni Zappella
37
0
0
30 May 2025
Hush! Protecting Secrets During Model Training: An Indistinguishability Approach
Arun Ganesh
Brendan McMahan
Milad Nasr
Thomas Steinke
Abhradeep Thakurta
22
0
0
30 May 2025
MIR: Methodology Inspiration Retrieval for Scientific Research Problems
Aniketh Garikaparthi
Manasi Patwardhan
Aditya Sanjiv Kanade
Aman Hassan
Lovekesh Vig
Arman Cohan
OffRL
RALM
LRM
28
0
0
30 May 2025
On the Scaling of Robustness and Effectiveness in Dense Retrieval
Yu-an Liu
Ruqing Zhang
Jiafeng Guo
Maarten de Rijke
Yixing Fan
Xueqi Cheng
32
0
0
30 May 2025
Interpretable phenotyping of Heart Failure patients with Dutch discharge letters
Vittorio Torri
Machteld J. Boonstra
Marielle C. van de Veerdonk
Deborah N. Kalkman
Alicia Uijl
Francesca Ieva
Ameen Abu-Hanna
Folkert W. Asselbergs
Iacer Calixto
32
0
0
30 May 2025
MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning
Yiqing Liang
Jielin Qiu
Wenhao Ding
Zuxin Liu
James Tompkin
Mengdi Xu
Mengzhou Xia
Zhengzhong Tu
Laixi Shi
Jiacheng Zhu
OffRL
128
0
0
30 May 2025
Dynamic Context-Aware Streaming Pretrained Language Model For Inverse Text Normalization
Luong Ho
Khanh Le
Vinh Pham
Bao Nguyen
Tan Tran
Duc Thanh Chau
32
0
0
30 May 2025
Is BERTopic Better than PLSA for Extracting Key Topics in Aviation Safety Reports?
Aziida Nanyonga
Joiner Keith
Turhan Ugur
Wild Graham
17
0
0
30 May 2025
Multilinguality Does not Make Sense: Investigating Factors Behind Zero-Shot Transfer in Sense-Aware Tasks
Roksana Goworek
Haim Dubossarsky
LRM
30
0
0
30 May 2025
Multi-Domain ABSA Conversation Dataset Generation via LLMs for Real-World Evaluation and Model Comparison
Tejul Pandit
Meet Raval
Dhvani Upadhyay
52
0
0
30 May 2025
GPR: Empowering Generation with Graph-Pretrained Retriever
Xiaochen Wang
Zongyu Wu
Yuan Zhong
Xiang Zhang
Suhang Wang
Fenglong Ma
29
0
0
30 May 2025
Benchmarking Foundation Models for Zero-Shot Biometric Tasks
Redwan Sony
Parisa Farmanifard
Hamzeh Alzwairy
Nitish Shukla
Arun Ross
CVBM
VLM
58
0
0
30 May 2025
Multimodal Foundation Model for Cross-Modal Retrieval and Activity Recognition Tasks
Koki Matsuishi
Kosuke Ukita
Tsuyoshi Okita
27
0
0
29 May 2025
Noise-Robustness Through Noise: Asymmetric LoRA Adaption with Poisoning Expert
Zhaokun Wang
Jinyu Guo
Jingwen Pu
Lingfeng Chen
Hongli Pu
Jie Ou
Libo Qin
Wenhong Tian
AAML
37
0
0
29 May 2025
Navigating the Accuracy-Size Trade-Off with Flexible Model Merging
Akash Dhasade
Divyansh Jhunjhunwala
Milos Vujasinovic
Gauri Joshi
Anne-Marie Kermarrec
MoMe
71
0
0
29 May 2025
Cross-Domain Bilingual Lexicon Induction via Pretrained Language Models
Qiuyu Ding
Zhiqiang Cao
Hailong Cao
Tiejun Zhao
52
0
0
29 May 2025
Augment or Not? A Comparative Study of Pure and Augmented Large Language Model Recommenders
Wei-Hsiang Huang
Chen-Wei Ke
Wei-Ning Chiu
Yu-Xuan Su
Chun-Chun Yang
Chieh-Yuan Cheng
Yun-Nung Chen
Pu-Jen Cheng
82
0
0
29 May 2025
EmoBench-UA: A Benchmark Dataset for Emotion Detection in Ukrainian
Daryna Dementieva
N. Babakov
Alexander Fraser
54
0
0
29 May 2025
SGD as Free Energy Minimization: A Thermodynamic View on Neural Network Training
Ildus Sadrtdinov
Ivan Klimov
E. Lobacheva
Dmitry Vetrov
35
0
0
29 May 2025
Fortune: Formula-Driven Reinforcement Learning for Symbolic Table Reasoning in Language Models
Lang Cao
Jingxian Xu
Hanbing Liu
Jinyu Wang
Mengyu Zhou
Haoyu Dong
Shi Han
Dongmei Zhang
LRM
OffRL
LMTD
ReLM
63
0
0
29 May 2025
Accelerating AllReduce with a Persistent Straggler
Arjun Devraj
Eric Ding
Abhishek Vijaya Kumar
Robert Kleinberg
Rachee Singh
56
0
0
29 May 2025
Multi-RAG: A Multimodal Retrieval-Augmented Generation System for Adaptive Video Understanding
Mingyang Mao
Mariela M. Perez-Cabarcas
Utteja Kallakuri
Nicholas R. Waytowich
Xiaomin Lin
T. Mohsenin
32
1
0
29 May 2025
Probing Politico-Economic Bias in Multilingual Large Language Models: A Cultural Analysis of Low-Resource Pakistani Languages
Afrozah Nadeem
Mark Dras
Usman Naseem
42
0
0
29 May 2025
VModA: An Effective Framework for Adaptive NSFW Image Moderation
Han Bao
Qinying Wang
Zhi Chen
Qingming Li
Xuhong Zhang
Changjiang Li
Zonghui Wang
Shouling Ji
Wenzhi Chen
45
0
0
29 May 2025
DINO-R1: Incentivizing Reasoning Capability in Vision Foundation Models
Chenbin Pan
Wenbin He
Zhengzhong Tu
Liu Ren
LRM
VLM
77
0
0
29 May 2025
The Rich and the Simple: On the Implicit Bias of Adam and SGD
Bhavya Vasudeva
Jung Whan Lee
Vatsal Sharan
Mahdi Soltanolkotabi
36
0
0
29 May 2025
Semantics-Aware Human Motion Generation from Audio Instructions
Zi-An Wang
Shihao Zou
Shiyao Yu
Mingyuan Zhang
Chao Dong
VGen
39
0
0
29 May 2025
KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction
Jang-Hyun Kim
Jinuk Kim
S. Kwon
Jae W. Lee
Sangdoo Yun
Hyun Oh Song
MQ
VLM
63
0
0
29 May 2025
Exploring Scaling Laws for EHR Foundation Models
Sheng Zhang
Qin Liu
Naoto Usuyama
Cliff Wong
Tristan Naumann
Hoifung Poon
42
0
0
29 May 2025
A New Deep-learning-Based Approach For mRNA Optimization: High Fidelity, Computation Efficiency, and Multiple Optimization Factors
Zheng Gong
Ziyi Jiang
Weihao Gao
Deng Zhuo
Lan Ma
33
0
0
29 May 2025
MAP: Revisiting Weight Decomposition for Low-Rank Adaptation
Chongjie Si
Zhiyi Shi
Yadao Wang
Xiaokang Yang
Susanto Rahardja
Wei Shen
64
0
0
29 May 2025
A Survey of Generative Categories and Techniques in Multimodal Large Language Models
Longzhen Han
Awes Mubarak
Almas Baimagambetov
Nikolaos Polatidis
Thar Baker
LRM
67
0
0
29 May 2025
Stairway to Success: Zero-Shot Floor-Aware Object-Goal Navigation via LLM-Driven Coarse-to-Fine Exploration
Zeying Gong
Rong Li
Tianshuai Hu
Ronghe Qiu
Lingdong Kong
Lingfeng Zhang
Yiyi Ding
Leying Zhang
Junwei Liang
64
0
0
29 May 2025
TCM-Ladder: A Benchmark for Multimodal Question Answering on Traditional Chinese Medicine
Jiacheng Xie
Yang Yu
Ziyang Zhang
Shuai Zeng
Jiaxuan He
...
Congyu Guo
Lening Zhao
Congcong Jing
Guanghui An
Dong Xu
LM&MA
ELM
24
0
0
29 May 2025
DATD3: Depthwise Attention Twin Delayed Deep Deterministic Policy Gradient For Model Free Reinforcement Learning Under Output Feedback Control
Wuhao Wang
Zhiyong Chen
OffRL
24
0
0
29 May 2025
Large Language Model Meets Constraint Propagation
Alexandre Bonlarron
Florian Régin
Elisabetta De Maria
Jean-Charles Régin
37
0
0
29 May 2025
SG-Blend: Learning an Interpolation Between Improved Swish and GELU for Robust Neural Representations
Gaurav Sarkar
Jay Gala
Subarna Tripathi
28
0
0
29 May 2025
What About Emotions? Guiding Fine-Grained Emotion Extraction from Mobile App Reviews
Quim Motger
Marc Oriol
Max Tiessler
Xavier Franch
Jordi Marco
19
0
0
29 May 2025
Wav2Sem: Plug-and-Play Audio Semantic Decoupling for 3D Speech-Driven Facial Animation
Hao Li
Ju Dai
Xin Zhao
Feng Zhou
Junjun Pan
Lei Li
19
0
0
29 May 2025
Generalized Category Discovery in Event-Centric Contexts: Latent Pattern Mining with LLMs
Yi Luo
Qiwen Wang
Junqi Yang
Luyao Tang
Zhenghao Lin
ZhenZhe Ying
Weiqiang Wang
Chen Lin
56
0
0
29 May 2025
TailorSQL: An NL2SQL System Tailored to Your Query Workload
Kapil Vaidya
Jialin Ding
Sebastian Kosak
David Kernert
Chuan Lei
Xiao Qin
Abhinav Tripathy
Ramesh Balan
Balakrishnan Narayanaswamy
Tim Kraska
12
0
0
29 May 2025
BiBLDR: Bidirectional Behavior Learning for Drug Repositioning
Renye Zhang
Mengyun Yang
Qichang Zhao
Jianxin Wang
OOD
54
0
0
29 May 2025
Improving QA Efficiency with DistilBERT: Fine-Tuning and Inference on mobile Intel CPUs
Ngeyen Yinkfu
14
0
0
28 May 2025
ICH-Qwen: A Large Language Model Towards Chinese Intangible Cultural Heritage
Wenhao Ye
Tiansheng Zheng
Yue Qi
Wenhua Zhao
Xiyu Wang
Xue Zhao
Jiacheng He
Yaya Zheng
Dongbo Wang
22
0
0
28 May 2025
ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning
Zhendong Mi
Zhenglun Kong
Geng Yuan
Shaoyi Huang
56
0
0
28 May 2025
EFIM: Efficient Serving of LLMs for Infilling Tasks with Improved KV Cache Reuse
Tianyu Guo
Hande Dong
Yichong Leng
Feng Liu
Cheater Lin
Nong Xiao
X. Zhang
RALM
31
0
0
28 May 2025
From Large AI Models to Agentic AI: A Tutorial on Future Intelligent Communications
Feibo Jiang
Cunhua Pan
Li Dong
Kezhi Wang
O. Dobre
Mérouane Debbah
LLMAG
AI4TS
179
1
0
28 May 2025