BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
    VLM
    SSL
    SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 17,997 papers shown
Title
Analog Foundation Models
Julian Büchel
Iason Chalas
Giovanni Acampa
An Chen
Omobayode Fagbohungbe
Sidney Tsai
Kaoutar El Maghraoui
Manuel Le Gallo
Abbas Rahimi
Abu Sebastian
MQ
35
0
0
14 May 2025
A Multi-Task Foundation Model for Wireless Channel Representation Using Contrastive and Masked Autoencoder Learning
Berkay Guler
Giovanni Geraci
Hamid Jafarkhani
38
0
0
14 May 2025
AdaFortiTran: An Adaptive Transformer Model for Robust OFDM Channel Estimation
Berkay Guler
Hamid Jafarkhani
37
1
0
14 May 2025
Variational Prefix Tuning for Diverse and Accurate Code Summarization Using Pre-trained Language Models
Junda Zhao
Yuliang Song
Eldan Cohen
26
0
0
14 May 2025
Contrastive Cross-Course Knowledge Tracing via Concept Graph Guided Knowledge Transfer
Wenkang Han
Wang Lin
Liya Hu
Zhenlong Dai
Yiyun Zhou
Mengze Li
Zemin Liu
Chang Yao
Jingyuan Chen
AI4Ed
20
0
0
14 May 2025
LiDDA: Data Driven Attribution at LinkedIn
John Bencina
Erkut Aykutlug
Yue Chen
Zerui Zhang
Stephanie Sorenson
Shao Tang
Changshuai Wei
19
0
0
14 May 2025
A Comprehensive Analysis of Large Language Model Outputs: Similarity, Diversity, and Bias
Brandon Smith
Mohamed Reda Bouadjenek
Tahsin Alamgir Kheya
Phillip Dawson
S. Aryal
ALM
ELM
39
0
0
14 May 2025
Adversarial Suffix Filtering: a Defense Pipeline for LLMs
David Khachaturov
Robert D. Mullins
AAML
36
0
0
14 May 2025
ELIS: Efficient LLM Iterative Scheduling System with Response Length Predictor
Seungbeom Choi
Jeonghoe Goo
Eunjoo Jeon
Mingyu Yang
Minsung Jang
21
0
0
14 May 2025
Dyadic Mamba: Long-term Dyadic Human Motion Synthesis
Julian Tanke
Takashi Shibuya
Kengo Uchida
Koichi Saito
Yuki Mitsufuji
Mamba
47
0
0
14 May 2025
Learning Advanced Self-Attention for Linear Transformers in the Singular Value Domain
Hyowon Wi
Jeongwhan Choi
Noseong Park
33
0
0
13 May 2025
Guiding LLM-based Smart Contract Generation with Finite State Machine
Hao Luo
Yuhao Lin
Xiao Yan
Xintong Hu
Yunhong Wang
Qiming Zeng
Hao Wang
Jiawei Jiang
31
0
0
13 May 2025
Probability Consistency in Large Language Models: Theoretical Foundations Meet Empirical Discrepancies
Xiaoliang Luo
Xinyi Xu
Michael Ramscar
Bradley C. Love
32
0
0
13 May 2025
Small but Significant: On the Promise of Small Language Models for Accessible AIED
Yumou Wei
Paulo Carvalho
John Stamper
SyDa
45
0
0
13 May 2025
An Analytical Characterization of Sloppiness in Neural Networks: Insights from Linear Models
Jialin Mao
Itay Griniasty
Yan Sun
Mark K. Transtrum
James P. Sethna
Pratik Chaudhari
29
0
0
13 May 2025
The Truth Becomes Clearer Through Debate! Multi-Agent Systems with Large Language Models Unmask Fake News
Yuhan Liu
Yong-Jin Liu
Xiaoqing Zhang
Xiuying Chen
Rui Yan
LLMAG
49
0
0
13 May 2025
RepCali: High Efficient Fine-tuning Via Representation Calibration in Latent Space for Pre-trained Language Models
Fujun Zhang
Xiangdong Su
36
0
0
13 May 2025
LM-Scout: Analyzing the Security of Language Model Integration in Android Apps
Muhammad Ibrahim
Güliz Seray Tuncay
Z. Berkay Celik
Aravind Machiry
Antonio Bianchi
38
0
0
13 May 2025
Automatic Task Detection and Heterogeneous LLM Speculative Decoding
Danying Ge
Jianhua Gao
Qizhi Jiang
Yifei Feng
Weixing Ji
44
0
0
13 May 2025
LCES: Zero-shot Automated Essay Scoring via Pairwise Comparisons Using Large Language Models
Takumi Shibata
Yuichi Miyamura
39
0
0
13 May 2025
Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments
Ibne Farabi Shihab
Sanjeda Akter
Anuj Sharma
Mamba
54
0
0
13 May 2025
Exploiting Text Semantics for Few and Zero Shot Node Classification on Text-attributed Graph
Yuxiang Wang
Xiao Yan
Shiyu Jin
Quanqing Xu
Chuang Hu
Yuanyuan Zhu
Bo Du
Jia Wu
Wentao Zhang
36
0
0
13 May 2025
Reinforcement Learning meets Masked Video Modeling : Trajectory-Guided Adaptive Token Selection
Ayush K. Rai
Kyle Min
Tarun Krishna
Feiyan Hu
Alan F. Smeaton
Noel E. O'Connor
VGen
31
0
0
13 May 2025
Next Word Suggestion using Graph Neural Network
Abisha Thapa Magar
Anup Shakya
GNN
32
0
0
13 May 2025
Ultrasound Report Generation with Multimodal Large Language Models for Standardized Texts
Peixuan Ge
Tongkun Su
Faqin Lv
Baoliang Zhao
Peng Zhang
...
Liang Yao
Yu Sun
Zenan Wang
Pak Kin Wong
Ying Hu
MedIm
29
0
0
13 May 2025
Comet: Accelerating Private Inference for Large Language Model by Predicting Activation Sparsity
Guang Yan
Yuhui Zhang
Zimu Guo
Lutan Zhao
Xiaojun Chen
Chen Wang
Wenhao Wang
Dan Meng
Rui Hou
33
0
0
12 May 2025
HAMLET: Healthcare-focused Adaptive Multilingual Learning Embedding-based Topic Modeling
Hajar Sakai
Sarah Lam
34
0
0
12 May 2025
Must Read: A Systematic Survey of Computational Persuasion
Nimet Beyza Bozdag
Shuhaib Mehri
Xiaocheng Yang
Hyeonjeong Ha
Zirui Cheng
Esin Durmus
Jiaxuan You
Heng Ji
Gokhan Tur
Dilek Hakkani-Tur
54
0
0
12 May 2025
Chronocept: Instilling a Sense of Time in Machines
Krish Goel
Sanskar Pandey
KS Mahadevan
Harsh Kumar
Vishesh Khadaria
28
0
0
12 May 2025
Efficient and Reproducible Biomedical Question Answering using Retrieval Augmented Generation
Linus Stuhlmann
Michael Alexander Saxer
Jonathan Fürst
RALM
31
0
0
12 May 2025
Multimodal Assessment of Classroom Discourse Quality: A Text-Centered Attention-Based Multi-Task Learning Approach
Ruikun Hou
B. Bühler
Tim Fütterer
Efe Bozkir
Peter Gerjets
Ulrich Trautwein
Enkelejda Kasneci
31
0
0
12 May 2025
DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation
Jiashuo Sun
Xianrui Zhong
Sizhe Zhou
Jiawei Han
RALM
31
0
0
12 May 2025
Synthetic Code Surgery: Repairing Bugs and Vulnerabilities with LLMs and Synthetic Data
David de-Fitero-Dominguez
Antonio Garcia-Cabot
Eva García-López
SyDa
71
0
0
12 May 2025
A Reproduction Study: The Kernel PCA Interpretation of Self-Attention Fails Under Scrutiny
Karahan Sarıtaş
Çağatay Yıldız
34
0
0
12 May 2025
Using Information Theory to Characterize Prosodic Typology: The Case of Tone, Pitch-Accent and Stress-Accent
E. Wilcox
Cui Ding
Giovanni Acampa
Tiago Pimentel
Alex Warstadt
Tamar I. Regev
36
0
0
12 May 2025
Domain Regeneration: How well do LLMs match syntactic properties of text domains?
Da Ju
Hagen Blix
Adina Williams
DeLMO
43
0
0
12 May 2025
Benchmarking Graph Neural Networks for Document Layout Analysis in Public Affairs
Miguel Lopez-Duran
Julian Fierrez
Aythami Morales
Ruben Tolosana
Oscar Delgado-Mohatar
Alvaro Ortigosa
2
0
0
12 May 2025
TiSpell: A Semi-Masked Methodology for Tibetan Spelling Correction covering Multi-Level Error with Data Augmentation
Yutong Liu
Feng Xiao
Ziyue Zhang
Yongbin Yu
Cheng Huang
...
Thupten Tsering
Cheng Huang
Gadeng Luosang
Renzeng Duojie
Nyima Tashi
31
0
0
12 May 2025
MilChat: Introducing Chain of Thought Reasoning and GRPO to a Multimodal Small Language Model for Remote Sensing
Aybora Koksal
A. Aydin Alatan
LRM
29
1
0
12 May 2025
LECTOR: Summarizing E-book Reading Content for Personalized Student Support
Erwin Daniel López Zapata
Cheng Tang
Valdemar Švábenský
Fumiya Okubo
Atsushi Shimada
24
0
0
12 May 2025
Pre-training vs. Fine-tuning: A Reproducibility Study on Dense Retrieval Knowledge Acquisition
Zheng Yao
Shuai Wang
Guido Zuccon
21
0
0
12 May 2025
KDH-MLTC: Knowledge Distillation for Healthcare Multi-Label Text Classification
Hajar Sakai
Sarah Lam
VLM
44
0
0
12 May 2025
Self-Supervised Transformer-based Contrastive Learning for Intrusion Detection Systems
Ippokratis Koukoulis
Ilias Syrigos
Thanasis Korakis
31
0
0
12 May 2025
Tagging fully hadronic exotic decays of the vectorlike $\mathbf{B}$ quark using a graph neural network
Jai Bardhan
Tanumoy Mandal
Subhadip Mitra
Cyrin Neeraj
Mihir Rawat
28
0
0
12 May 2025
AI-Enabled Accurate Non-Invasive Assessment of Pulmonary Hypertension Progression via Multi-Modal Echocardiography
Jiewen Yang
Taoran Huang
Shangwei Ding
Xiaowei Xu
Qinhua Zhao
...
Bin Pu
Jiexuan Zheng
Caojin Zhang
Hongwen Fei
Xuelong Li
21
0
0
12 May 2025
Multimodal Survival Modeling in the Age of Foundation Models
Steven Song
Morgan Borjigin-Wang
Irene Madejski
Robert L. Grossman
28
0
0
12 May 2025
Joint Low-level and High-level Textual Representation Learning with Multiple Masking Strategies
Zhengmi Tang
Yuto Mitsui
Tomo Miyazaki
S. Omachi
36
0
0
11 May 2025
NewsNet-SDF: Stochastic Discount Factor Estimation with Pretrained Language Model News Embeddings via Adversarial Networks
Shunyao Wang
Ming Cheng
Christina Dan Wang
AIFin
30
0
0
11 May 2025
Scaling Laws and Representation Learning in Simple Hierarchical Languages: Transformers vs. Convolutional Architectures
Francesco Cagnetta
Alessandro Favero
Antonio Sclocchi
M. Wyart
30
0
0
11 May 2025
Evaluating Reasoning LLMs for Suicide Screening with the Columbia-Suicide Severity Rating Scale
Avinash Patil
Siru Tao
Amardeep Gedhu
AI4MH
LRM
ELM
7
0
0
11 May 2025