arXiv:1810.04805
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
Papers citing
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
50 / 1,211 papers shown
Title
Identifying Sub-networks in Neural Networks via Functionally Similar Representations
Tian Gao
Amit Dhurandhar
Karthikeyan N. Ramamurthy
Dennis L. Wei
68
0
0
21 Oct 2024
Do Audio-Language Models Understand Linguistic Variations?
Ramaneswaran Selvakumar
Sonal Kumar
Hemant Kumar Giri
Nishit Anand
Ashish Seth
Sreyan Ghosh
Dinesh Manocha
AuLLM
VLM
104
1
0
21 Oct 2024
CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts
Zhenpeng Su
Xing Wu
Zijia Lin
Yizhe Xiong
Minxuan Lv
Guangyuan Ma
Hui Chen
Songlin Hu
Guiguang Ding
MoE
49
4
0
21 Oct 2024
Grammatical Error Correction for Low-Resource Languages: The Case of Zarma
Mamadou K. Keita
Christopher Homan
Sofiane Abdoulaye Hamani
Adwoa Bremang
Marcos Zampieri
Habibatou Abdoulaye Alfari
Elysabhete Amadou Ibrahim
69
0
0
20 Oct 2024
A Novel Characterization of the Population Area Under the Risk Coverage Curve (AURC) and Rates of Finite Sample Estimators
Han Zhou
Jordy Van Landeghem
Teodora Popordanoska
Matthew B. Blaschko
70
2
0
20 Oct 2024
Contextual Augmented Multi-Model Programming (CAMP): A Hybrid Local-Cloud Copilot Framework
Yuchen Wang
Shangxin Guo
C. Tan
51
0
0
20 Oct 2024
Medical-GAT: Cancer Document Classification Leveraging Graph-Based Residual Network for Scenarios with Limited Data
Elias Hossain
Tasfia Nuzhat
Shamsul Masum
Shahram Rahimi
Sudip Mittal
63
0
0
19 Oct 2024
CAST: Corpus-Aware Self-similarity Enhanced Topic modelling
Yanan Ma
Chenghao Xiao
Chenhan Yuan
Sabine N van der Veer
Lamiece Hassan
Chenghua Lin
Goran Nenadic
62
0
0
19 Oct 2024
Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning
Jiacheng Ye
Jiahui Gao
Shansan Gong
Lin Zheng
Xin Jiang
Zhiyu Li
Dianbo Sui
DiffM
LRM
109
20
0
18 Oct 2024
DiscoGraMS: Enhancing Movie Screen-Play Summarization using Movie Character-Aware Discourse Graph
Maitreya Prafulla Chitale
Uday Bindal
Rajakrishnan Rajkumar
Rahul Mishra
72
1
0
18 Oct 2024
Coherence-Driven Multimodal Safety Dialogue with Active Learning for Embodied Agents
Sabit Hassan
Hye-Young Chung
Xiang Zhi Tan
Malihe Alikhani
123
0
0
18 Oct 2024
Rationale Behind Essay Scores: Enhancing S-LLM's Multi-Trait Essay Scoring with Rationale Generated by LLMs
SeongYeub Chu
JongWoo Kim
Bryan Wong
MunYong Yi
LRM
51
3
0
18 Oct 2024
Transformer Guided Coevolution: Improved Team Selection in Multiagent Adversarial Team Games
Pranav Rajbhandari
Prithviraj Dasgupta
D. Sofge
33
0
0
17 Oct 2024
Linguistically Grounded Analysis of Language Models using Shapley Head Values
Marcell Richard Fekete
Johannes Bjerva
85
0
0
17 Oct 2024
Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors
Georgios Chochlakis
Alexandros Potamianos
Kristina Lerman
Shrikanth Narayanan
81
2
0
17 Oct 2024
The Mystery of the Pathological Path-star Task for Language Models
Arvid Frydenlund
LRM
43
4
0
17 Oct 2024
The Geometry of Numerical Reasoning: Language Models Compare Numeric Properties in Linear Subspaces
Ahmed Oumar El-Shangiti
Tatsuya Hiraoka
Hilal AlQuabeh
Benjamin Heinzerling
Kentaro Inui
85
1
0
17 Oct 2024
On the Learn-to-Optimize Capabilities of Transformers in In-Context Sparse Recovery
Renpu Liu
Ruida Zhou
Cong Shen
Jing Yang
80
0
0
17 Oct 2024
Preference Diffusion for Recommendation
Shuo Liu
An Zhang
Guoqing Hu
Hong Qian
Tat-Seng Chua
83
1
0
17 Oct 2024
Enhancing LLM Agents for Code Generation with Possibility and Pass-rate Prioritized Experience Replay
Yuyang Chen
Kaiyan Zhao
Yiming Wang
Ming Yang
Jian Zhang
Yan Li
115
1
0
16 Oct 2024
StyleDistance: Stronger Content-Independent Style Embeddings with Synthetic Parallel Examples
Ajay Patel
Jiacheng Zhu
Justin Qiu
Zachary Horvitz
Marianna Apidianaki
Kathleen McKeown
Chris Callison-Burch
98
4
0
16 Oct 2024
KcMF: A Knowledge-compliant Framework for Schema and Entity Matching with Fine-tuning-free LLMs
Yongqin Xu
Huan Li
Ke Chen
Lidan Shou
84
4
0
16 Oct 2024
On the Risk of Evidence Pollution for Malicious Social Text Detection in the Era of LLMs
Herun Wan
Minnan Luo
Zhixiong Su
Guang Dai
Xiang Zhao
DeLMO
76
0
0
16 Oct 2024
TopoLM: brain-like spatio-functional organization in a topographic language model
Neil Rathi
Johannes Mehrer
Badr AlKhamissi
Taha Binhuraib
Nicholas M. Blauch
Martin Schrimpf
71
3
0
15 Oct 2024
MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from Microwatts to Megawatts for Sustainable AI
Arya Tschand
Arun Tejusve Raghunath Rajan
S. Idgunji
Anirban Ghosh
J. Holleman
...
Rowan Taubitz
Sean Zhan
Scott Wasson
David Kanter
Vijay Janapa Reddi
96
3
0
15 Oct 2024
ControlMM: Controllable Masked Motion Generation
Ekkasit Pinyoanuntapong
Muhammad Usama Saleem
Korrawe Karunratanakul
Pu Wang
Hongfei Xue
Chong Chen
Chuan Guo
Junli Cao
J. Ren
Sergey Tulyakov
VGen
53
7
0
14 Oct 2024
Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach
Rory Young
Nicolas Pugeault
AAML
82
3
0
14 Oct 2024
An Annotated Dataset of Errors in Premodern Greek and Baselines for Detecting Them
Creston Brooks
J. Haubold
Charlie Cowen-Breen
Jay White
Desmond DeVaul
Frederick Riemenschneider
Karthik Narasimhan
B. Graziosi
80
0
0
14 Oct 2024
HSR-Enhanced Sparse Attention Acceleration
Bo Chen
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao Song
144
20
0
14 Oct 2024
The Epochal Sawtooth Effect: Unveiling Training Loss Oscillations in Adam and Other Optimizers
Qi Liu
Wanjing Ma
46
0
0
14 Oct 2024
GraphCLIP: Enhancing Transferability in Graph Foundation Models for Text-Attributed Graphs
Yun Zhu
Haizhou Shi
Xiaotang Wang
Yongchao Liu
Yaoke Wang
Boci Peng
Chuntao Hong
Siliang Tang
VLM
106
10
0
14 Oct 2024
What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis
Weronika Ormaniec
Felix Dangel
Sidak Pal Singh
87
7
0
14 Oct 2024
Lambda-Skip Connections: the architectural component that prevents Rank Collapse
Federico Arangath Joseph
Jerome Sieber
Melanie Zeilinger
Carmen Amo Alonso
122
0
0
14 Oct 2024
DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models
Wenlong Deng
Yize Zhao
V. Vakilian
Minghui Chen
Xiaoxiao Li
Christos Thrampoulidis
125
6
0
12 Oct 2024
Zero-Shot Pupil Segmentation with SAM 2: A Case Study of Over 14 Million Images
Virmarie Maquiling
Sean Anthony Byrne
D. Niehorster
Marco Carminati
Enkelejda Kasneci
VLM
62
2
0
11 Oct 2024
Cross-Modal Bidirectional Interaction Model for Referring Remote Sensing Image Segmentation
Zhe Dong
Yuzhe Sun
Tianzhu Liu
Wangmeng Zuo
Yanfeng Gu
47
5
0
11 Oct 2024
SmartPretrain: Model-Agnostic and Dataset-Agnostic Representation Learning for Motion Prediction
Yang Zhou
Hao Shao
Letian Wang
Steven Waslander
Hongsheng Li
Yu Liu
58
2
0
11 Oct 2024
Improving Semantic Understanding in Speech Language Models via Brain-tuning
Omer Moussa
Dietrich Klakow
Mariya Toneva
66
5
0
11 Oct 2024
Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data
Binghui Li
Yuanzhi Li
OOD
56
2
0
11 Oct 2024
DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models
Xiaoxiao He
Ligong Han
Quan Dao
Song Wen
Minhao Bai
...
Hongdong Li
Junzhou Huang
Faez Ahmed
Akash Srivastava
Dimitris Metaxas
DiffM
SyDa
82
5
0
10 Oct 2024
Accurate and Regret-aware Numerical Problem Solver for Tabular Question Answering
Yuxiang Wang
Jianzhong Qi
Junhao Gan
LMTD
125
3
0
10 Oct 2024
COMPL-AI Framework: A Technical Interpretation and LLM Benchmarking Suite for the EU Artificial Intelligence Act
Philipp Guldimann
Alexander Spiridonov
Robin Staab
Nikola Jovanović
Mark Vero
...
Mislav Balunović
Nikola Konstantinov
Pavol Bielik
Petar Tsankov
Martin Vechev
ELM
67
6
0
10 Oct 2024
Do Current Language Models Support Code Intelligence for R Programming Language?
ZiXiao Zhao
Fatemeh H. Fard
ELM
66
0
0
10 Oct 2024
Masked Generative Priors Improve World Models Sequence Modelling Capabilities
Cristian Meo
Mircea Lica
Zarif Ikram
Akihiro Nakano
Vedant Shah
Aniket Didolkar
Dianbo Liu
Anirudh Goyal
Justin Dauwels
OffRL
94
0
0
10 Oct 2024
Efficient Distribution Matching of Representations via Noise-Injected Deep InfoMax
I. Butakov
Alexander Sememenko
Alexander Tolmachev
Andrey Gladkov
Marina Munkhoeva
Alexey Frolov
102
1
0
09 Oct 2024
Steering Large Language Models using Conceptors: Improving Addition-Based Activation Engineering
Joris Postmus
Steven Abreu
LLMSV
252
2
0
09 Oct 2024
Let's Ask GNN: Empowering Large Language Model for Graph In-Context Learning
Zhengyu Hu
Yichuan Li
Zhengyu Chen
Jiadong Wang
Han Liu
Kyumin Lee
Kaize Ding
GNN
397
1
0
09 Oct 2024
MentalArena: Self-play Training of Language Models for Diagnosis and Treatment of Mental Health Disorders
Cheng-rong Li
May Fung
Qingyun Wang
Chi Han
Manling Li
Jindong Wang
Heng Ji
AI4MH
358
0
0
09 Oct 2024
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
Mutian He
Philip N. Garner
124
0
0
09 Oct 2024
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models
Muhammad Jehanzeb Mirza
Mengjie Zhao
Zhuoyuan Mao
Sivan Doveh
Wei Lin
...
Yuki Mitsufuji
Horst Possegger
Rogerio Feris
Leonid Karlinsky
James Glass
VLM
126
1
0
08 Oct 2024