Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1810.04805
Cited By
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
Re-assign community
ArXiv
PDF
HTML
Papers citing
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
50 / 1,211 papers shown
Title
Whistle: Data-Efficient Multilingual and Crosslingual Speech Recognition via Weakly Phonetic Supervision
Saierdaer Yusuyin
Te Ma
Hao Huang
Wenbo Zhao
Zhijian Ou
71
3
0
04 Jun 2024
Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models
Marianna Nezhurina
Lucia Cipolina-Kun
Mehdi Cherti
J. Jitsev
LLMAG
LRM
ELM
ReLM
85
30
0
04 Jun 2024
Towards Effective Time-Aware Language Representation: Exploring Enhanced Temporal Understanding in Language Models
Jiexin Wang
Adam Jatowt
Yi Cai
AI4CE
54
1
0
04 Jun 2024
Re-ReST: Reflection-Reinforced Self-Training for Language Agents
Zi-Yi Dou
Cheng-Fu Yang
Xueqing Wu
Kai-Wei Chang
Nanyun Peng
LRM
97
9
0
03 Jun 2024
PrivacyRestore: Privacy-Preserving Inference in Large Language Models via Privacy Removal and Restoration
Huiping Zhuang
Jianwei Wang
Zhengdong Lu
Huiping Zhuang
Haoran Li
Huiping Zhuang
Cen Chen
RALM
KELM
71
8
0
03 Jun 2024
Deciphering Oracle Bone Language with Diffusion Models
Haisu Guan
Huanxin Yang
Xinyu Wang
Shengwei Han
Yongge Liu
Lianwen Jin
Xiang Bai
Yunxing Liu
AAML
AI4CE
114
8
0
02 Jun 2024
RoBERTa-BiLSTM: A Context-Aware Hybrid Model for Sentiment Analysis
Md. Mostafizer Rahman
Ariful Islam Shiplu
Yutaka Watanobe
Md. Ashad Alam
53
11
0
01 Jun 2024
OR-Bench: An Over-Refusal Benchmark for Large Language Models
Justin Cui
Wei-Lin Chiang
Ion Stoica
Cho-Jui Hsieh
ALM
68
45
0
31 May 2024
Locking Machine Learning Models into Hardware
Eleanor Clifford
Adhithya Saravanan
Harry Langford
Cheng Zhang
Yiren Zhao
Robert D. Mullins
Ilia Shumailov
Jamie Hayes
66
0
0
31 May 2024
Scaling White-Box Transformers for Vision
Jinrui Yang
Xianhang Li
Druv Pai
Yuyin Zhou
Yi-An Ma
Yaodong Yu
Cihang Xie
ViT
60
9
0
30 May 2024
SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems
Patrick Emami
Zhaonan Li
Saumya Sinha
Truc Nguyen
101
1
0
30 May 2024
Are queries and keys always relevant? A case study on Transformer wave functions
Riccardo Rende
Luciano Loris Viteritti
41
6
0
29 May 2024
FAIIR: Building Toward A Conversational AI Agent Assistant for Youth Mental Health Service Provision
Stephen Obadinma
Alia Lachana
M. Norman
Jocelyn Rankin
Joanna Yu
Xiaodan Zhu
Darren Mastropaolo
D. Pandya
Roxana Sultan
Elham Dolatabadi
AI4MH
65
1
0
28 May 2024
An Empirical Analysis of Forgetting in Pre-trained Models with Incremental Low-Rank Updates
Albin Soutif--Cormerais
Simone Magistri
Joost van de Weijer
Andew D. Bagdanov
57
1
0
28 May 2024
Dual Process Learning: Controlling Use of In-Context vs. In-Weights Strategies with Weight Forgetting
Suraj Anand
Michael A. Lepori
Jack Merullo
Ellie Pavlick
CLL
68
8
0
28 May 2024
Glauber Generative Model: Discrete Diffusion Models via Binary Classification
Harshit Varma
Dheeraj M. Nagaraj
Karthikeyan Shanmugam
VLM
96
3
0
27 May 2024
ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation
Houxing Ren
Mingjie Zhan
Zhongyuan Wu
Aojun Zhou
Junting Pan
Hongsheng Li
SyDa
62
7
0
27 May 2024
SoK: Leveraging Transformers for Malware Analysis
Pradip Kunwar
Kshitiz Aryal
Maanak Gupta
Mahmoud Abdelsalam
Elisa Bertino
103
0
0
27 May 2024
Tokenization Matters! Degrading Large Language Models through Challenging Their Tokenization
Dixuan Wang
Yanda Li
Junyuan Jiang
Zepeng Ding
Ziqin Luo
Guochao Jiang
Jiaqing Liang
Deqing Yang
58
13
0
27 May 2024
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
Chankyu Lee
Rajarshi Roy
Mengyao Xu
Jonathan Raiman
Mohammad Shoeybi
Bryan Catanzaro
Ming-Yu Liu
RALM
163
170
0
27 May 2024
Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling
Cristian Rodriguez-Opazo
Ehsan Abbasnejad
Damien Teney
Edison Marrese-Taylor
Hamed Damirchi
Anton Van Den Hengel
VLM
80
1
0
27 May 2024
LARM: Large Auto-Regressive Model for Long-Horizon Embodied Intelligence
Zhuoling Li
Xiaogang Xu
Zhenhua Xu
Sernam Lim
Hengshuang Zhao
LM&Ro
87
2
0
27 May 2024
Categorical Flow Matching on Statistical Manifolds
Chaoran Cheng
Jiahan Li
Jian-wei Peng
Ge Liu
81
10
0
26 May 2024
Unsupervised Meta-Learning via In-Context Learning
Anna Vettoruzzo
Lorenzo Braccaioli
Joaquin Vanschoren
M. Nowaczyk
SSL
87
0
0
25 May 2024
Bring Adaptive Binding Prototypes to Generalized Referring Expression Segmentation
Weize Li
Zhicheng Zhao
Haochen Bai
Fei Su
66
0
0
24 May 2024
A Structure-Aware Framework for Learning Device Placements on Computation Graphs
Shukai Duan
Heng Ping
Nikos Kanakaris
Xiongye Xiao
Panagiotis Kyriakis
...
Guixiang Ma
Mihai Capota
Shahin Nazarian
Theodore L. Willke
Paul Bogdan
79
3
0
23 May 2024
Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Yongxin Guo
Zhenglin Cheng
Xiaoying Tang
Tao R. Lin
Tao Lin
MoE
88
8
0
23 May 2024
Explaining Black-box Model Predictions via Two-level Nested Feature Attributions with Consistency Property
Yuya Yoshikawa
Masanari Kimura
Ryotaro Shimizu
Yuki Saito
FAtt
58
0
0
23 May 2024
LoRA-Ensemble: Efficient Uncertainty Modelling for Self-Attention Networks
Michelle Halbheer
Dominik J. Mühlematter
Alexander Becker
Dominik Narnhofer
Helge Aasen
Konrad Schindler
Mehmet Özgür Türkoglu
UQCV
70
2
0
23 May 2024
Text-to-Model: Text-Conditioned Neural Network Diffusion for Train-Once-for-All Personalization
Zexi Li
Lingzhi Gao
Chao Wu
AI4CE
DiffM
77
3
0
23 May 2024
Harmony: A Joint Self-Supervised and Weakly-Supervised Framework for Learning General Purpose Visual Representations
Mohammed Baharoon
Jonathan Klein
D. L. Michels
SSL
VLM
102
0
0
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
188
49
0
23 May 2024
Scientific Hypothesis Generation by a Large Language Model: Laboratory Validation in Breast Cancer Treatment
A. Abdel-Rehim
Hector Zenil
Oghenejokpeme I. Orhobor
Marie Fisher
Ross J. Collins
...
Gareth W. Fearnley
Emma Tate
Holly X. Smith
Larisa B. Soldatova
Ross D. King
LM&MA
90
7
0
20 May 2024
Asymptotic theory of in-context learning by linear attention
Yue M. Lu
Mary I. Letey
Jacob A. Zavatone-Veth
Anindita Maiti
Cengiz Pehlevan
65
12
0
20 May 2024
GeoMask3D: Geometrically Informed Mask Selection for Self-Supervised Point Cloud Learning in 3D
Ali Bahri
Moslem Yazdanpanah
Mehrdad Noori
Milad Cheraghalikhani
G. A. V. Hakim
David Osowiechi
Farzad Beizaee
Ismail Ben Ayed
Christian Desrosiers
3DPC
101
2
0
20 May 2024
The Power of Active Multi-Task Learning in Reinforcement Learning from Human Feedback
Ruitao Chen
Liwei Wang
91
1
0
18 May 2024
Lean Attention: Hardware-Aware Scalable Attention Mechanism for the Decode-Phase of Transformers
Rya Sanovar
Srikant Bharadwaj
Renée St. Amant
Victor Rühle
Saravan Rajmohan
93
7
0
17 May 2024
ActiveLLM: Large Language Model-based Active Learning for Textual Few-Shot Scenarios
Markus Bayer
Justin Lutz
Christian A. Reuter
99
7
0
17 May 2024
Positional encoding is not the same as context: A study on positional encoding for sequential recommendation
Alejo López-Ávila
Jinhua Du
Abbas Shimary
Ze Li
62
2
0
16 May 2024
Networking Systems for Video Anomaly Detection: A Tutorial and Survey
Jing Liu
Yang Liu
Jieyu Lin
Jielin Li
Peng Sun
Bo Hu
Liang Song
Azzedine Boukerche
Victor C.M. Leung
Victor C.M. Leung
110
10
0
16 May 2024
Learning 3-Manifold Triangulations
Francesco Costantino
Yang-Hui He
Elli Heyes
Edward Hirst
53
0
0
15 May 2024
DEPTH: Discourse Education through Pre-Training Hierarchically
Zachary Bamberger
Ofek Glick
Chaim Baskin
Yonatan Belinkov
94
0
0
13 May 2024
Word-specific tonal realizations in Mandarin
Yu-Ying Chuang
Melanie J. Bell
Yu-Hsiang Tseng
R. Baayen
81
5
0
11 May 2024
DP-DyLoRA: Fine-Tuning Transformer-Based Models On-Device under Differentially Private Federated Learning using Dynamic Low-Rank Adaptation
Jie Xu
Karthikeyan P. Saravanan
Rogier van Dalen
Haaris Mehmood
David Tuckey
Mete Ozay
103
6
0
10 May 2024
Whole Genome Transformer for Gene Interaction Effects in Microbiome Habitat Specificity
Zhufeng Li
S. S. Cranganore
Nicholas D. Youngblut
Niki Kilbertus
72
2
0
09 May 2024
Natural Language Processing RELIES on Linguistics
Juri Opitz
Shira Wein
Nathan Schneider
AI4CE
73
7
0
09 May 2024
Review-based Recommender Systems: A Survey of Approaches, Challenges and Future Perspectives
Emrul Hasan
Mizanur Rahman
Chen Ding
Jimmy Xiangji Huang
Shaina Raza
51
5
0
09 May 2024
XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples
Peiqin Lin
André F. T. Martins
Hinrich Schütze
RALM
83
3
0
08 May 2024
Large Language Models for Cyber Security: A Systematic Literature Review
HanXiang Xu
Shenao Wang
Ningke Li
Kaidi Wang
Yanjie Zhao
Kai Chen
Ting Yu
Yang Liu
Haoyu Wang
76
33
0
08 May 2024
Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore
Junchao Wu
Runzhe Zhan
Derek F. Wong
Shu Yang
Xuebo Liu
Lidia S. Chao
Min Zhang
DeLMO
85
4
0
07 May 2024
Previous
1
2
3
...
17
18
19
...
23
24
25
Next