ResearchTrend.AI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM, SSL, SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 23,491 papers shown
GORACS: Group-level Optimal Transport-guided Coreset Selection for LLM-based Recommender Systems
Tiehua Mei
Hengrui Chen
Peng Yu
Jiaqing Liang
Deqing Yang
94
0
0
04 Jun 2025
Evaluating Apple Intelligence's Writing Tools for Privacy Against Large Language Model-Based Inference Attacks: Insights from Early Datasets
Mohd. Farhan Israk Soumik
Syed Mhamudul Hasan
Abdur R. Shahid
101
0
0
04 Jun 2025
Analyzing Transformer Models and Knowledge Distillation Approaches for Image Captioning on Edge AI
Wing Man Casca Kwok
Yip Chiu Tung
Kunal Bhagchandani
VLM
61
0
0
04 Jun 2025
Relationship Detection on Tabular Data Using Statistical Analysis and Large Language Models
Panagiotis Koletsis
Christos Panagiotopoulos
Georgios Th. Papadopoulos
Vasilis Efthymiou
LMTD
13
0
0
04 Jun 2025
ComRoPE: Scalable and Robust Rotary Position Embedding Parameterized by Trainable Commuting Angle Matrices
Hao Yu
Tangyu Jiang
Shuning Jia
Shannan Yan
Shunning Liu
Haolong Qian
Guanghao Li
Shuting Dong
Huaisong Zhang
Chun Yuan
102
0
0
04 Jun 2025
Multi-domain anomaly detection in a 5G network
Thomas Hoger
Philippe Owezarski
12
0
0
04 Jun 2025
Building a Few-Shot Cross-Domain Multilingual NLU Model for Customer Care
Saurabh Kumar
Sourav Bansal
Neeraj Agrawal
Priyanka Bhatt
29
0
0
04 Jun 2025
Measuring Human Involvement in AI-Generated Text: A Case Study on Academic Writing
Yuchen Guo
Zhicheng Dou
H. Nguyen
Ching-Chun Chang
Saku Sugawara
Isao Echizen
DeLMO
112
0
0
04 Jun 2025
Algorithms for estimating linear function in data mining
Thomas Hoang
23
0
0
04 Jun 2025
Delta-KNN: Improving Demonstration Selection in In-Context Learning for Alzheimer's Disease Detection
Chuyuan Li
Raymond Li
Thalia S. Field
Giuseppe Carenini
138
0
0
04 Jun 2025
Object-level Self-Distillation for Vision Pretraining
Çağlar Hızlı
Çağatay Yıldız
Pekka Marttinen
OCL, VLM
52
0
0
04 Jun 2025
A Large-Scale Referring Remote Sensing Image Segmentation Dataset and Benchmark
Zhigang Yang
Huiguang Yao
Linmao Tian
Xuezhi Zhao
Qiang Li
Qi Wang
100
0
0
04 Jun 2025
When Does Closeness in Distribution Imply Representational Similarity? An Identifiability Perspective
Beatrix M. G. Nielsen
Emanuele Marconato
Andrea Dittadi
Luigi Gresele
62
0
0
04 Jun 2025
Attention-Only Transformers via Unrolled Subspace Denoising
Peng Wang
Yifu Lu
Yaodong Yu
Druv Pai
Qing Qu
Yi Ma
ViT
128
1
0
04 Jun 2025
Efficient Data Selection for Domain Adaptation of ASR Using Pseudo-Labels and Multi-Stage Filtering
Pradeep Rangappa
Andres Carofilis
Jeena Prakash
Shashi Kumar
Sergio Burdisso
...
P. Motlícek
Kadri Hacioğlu
Shankar Venkatesan
Saurabh Vyas
Andreas Stolcke
49
0
0
04 Jun 2025
Scaling Transformers for Discriminative Recommendation via Generative Pretraining
Chunqi Wang
Bingchao Wu
Z. Chen
Lei Shen
Bing Wang
Xiaoyi Zeng
60
0
0
04 Jun 2025
Training-free AI for Earth Observation Change Detection using Physics Aware Neuromorphic Networks
Stephen Smith
Cormac Purcell
Zdenka Kuncic
34
0
0
04 Jun 2025
Comprehensive Attribute Encoding and Dynamic LSTM HyperModels for Outcome Oriented Predictive Business Process Monitoring
Fang Wang
Paolo Ceravolo
Ernesto Damiani
AI4TS
87
0
0
04 Jun 2025
Prosodic Structure Beyond Lexical Content: A Study of Self-Supervised Learning
Sarenne Wallbridge
Christoph Minixhofer
Catherine Lai
P. Bell
SSL
62
0
0
03 Jun 2025
Entity-Augmented Neuroscience Knowledge Retrieval Using Ontology and Semantic Understanding Capability of LLM
Pralaypati Ta
Sriram Venkatesaperumal
Keerthi Ram
M. Sivaprakasam
47
0
0
03 Jun 2025
Advancing Decoding Strategies: Enhancements in Locally Typical Sampling for LLMs
Jaydip Sen
Saptarshi Sengupta
S. Dasgupta
52
0
0
03 Jun 2025
DGMO: Training-Free Audio Source Separation through Diffusion-Guided Mask Optimization
Geonyoung Lee
Geonhee Han
Paul Hongsuck Seo
DiffM
80
0
0
03 Jun 2025
CoT is Not True Reasoning, It Is Just a Tight Constraint to Imitate: A Theory Perspective
Jintian Shao
YiMing Cheng
LRM
73
0
0
03 Jun 2025
Token and Span Classification for Entity Recognition in French Historical Encyclopedias
Ludovic Moncla
Hédi Zeghidi
61
0
0
03 Jun 2025
EvidenceOutcomes: a Dataset of Clinical Trial Publications with Clinically Meaningful Outcomes
Yiliang Zhou
Abigail M. Newbury
Gongbo Zhang
B. Idnay
Hao Liu
Chunhua Weng
Yifan Peng
30
0
0
03 Jun 2025
Asymptotics of SGD in Sequence-Single Index Models and Single-Layer Attention Networks
Luca Arnaboldi
Bruno Loureiro
Ludovic Stephan
Florent Krzakala
Lenka Zdeborová
65
0
0
03 Jun 2025
Natural Language Processing to Enhance Deliberation in Political Online Discussions: A Survey
Maike Behrendt
Stefan Sylvius Wagner
Carina Weinmann
Marike Bormann
Mira Warne
Stefan Harmeling
60
0
0
03 Jun 2025
HACo-Det: A Study Towards Fine-Grained Machine-Generated Text Detection under Human-AI Coauthoring
Zhixiong Su
Yichen Wang
Herun Wan
Zhaohan Zhang
Minnan Luo
DeLMO
66
0
0
03 Jun 2025
Rethinking the effects of data contamination in Code Intelligence
Zhen Yang
Hongyi Lin
Yifan He
Jie Xu
Zeyu Sun
Shuo Liu
P. Wang
Zhongxing Yu
Qingyuan Liang
60
0
0
03 Jun 2025
PoLAR: Polar-Decomposed Low-Rank Adapter Representation
Kai Lion
Liang Zhang
Bingcong Li
Niao He
62
0
0
03 Jun 2025
IMPARA-GED: Grammatical Error Detection is Boosting Reference-free Grammatical Error Quality Estimator
Yusuke Sakai
Takumi Goto
Taro Watanabe
54
1
0
03 Jun 2025
From Transformers to Large Language Models: A systematic review of AI applications in the energy sector towards Agentic Digital Twins
Gabriel Antonesi
T. Cioara
I. Anghel
Vasilis Michalakopoulos
Elissaios Sarmas
Liana Toderean
LLMAG, MedIm, AI4CE
20
0
0
03 Jun 2025
OpenFace 3.0: A Lightweight Multitask System for Comprehensive Facial Behavior Analysis
Jiewen Hu
Leena Mathur
Paul Pu Liang
Louis-Philippe Morency
CVBM
59
0
0
03 Jun 2025
Exploiting LLMs for Automatic Hypothesis Assessment via a Logit-Based Calibrated Prior
Yue Gong
Raul Castro Fernandez
19
0
0
03 Jun 2025
Enriching Location Representation with Detailed Semantic Information
Junyuan Liu
Xinglei Wang
Tao Cheng
70
1
0
03 Jun 2025
Universal Reusability in Recommender Systems: The Case for Dataset- and Task-Independent Frameworks
Tri Kurniawan Wijaya
Xinyang Shao
Gonzalo Fiz Pontiveros
Edoardo D'Amico
29
0
0
03 Jun 2025
Hopscotch: Discovering and Skipping Redundancies in Language Models
Mustafa Eyceoz
Nikhil Shivakumar Nayak
Hao Wang
Ligong Han
Akash Srivastava
56
0
0
03 Jun 2025
Efficient Tactile Perception with Soft Electrical Impedance Tomography and Pre-trained Transformer
Huazhi Dong
Ronald B. Liu
Sihao Teng
Delin Hu
Peisan
F. G. Serchi
Yunjie Yang
46
0
0
03 Jun 2025
Leveraging Information Retrieval to Enhance Spoken Language Understanding Prompts in Few-Shot Learning
Pierre Lepagnol
Sahar Ghannay
Thomas Gerald
Christophe Servan
S. Rosset
54
0
0
03 Jun 2025
Comparative Analysis of AI Agent Architectures for Entity Relationship Classification
Maryam Berijanian
Kuldeep Singh
Amin Sehati
63
0
0
03 Jun 2025
ss-Mamba: Semantic-Spline Selective State-Space Model
Zuochen Ye
Mamba
33
0
0
03 Jun 2025
ANT: Adaptive Neural Temporal-Aware Text-to-Motion Model
Wenshuo Chen
Kuimou Yu
Haozhe Jia
Kaishen Yuan
Bowen Tian
Songning Lai
Hongru Xiao
Erhang Zhang
Lei Wang
Yutao Yue
DiffM, VGen
77
0
0
03 Jun 2025
Something Just Like TRuST : Toxicity Recognition of Span and Target
Berk Atil
Namrata Sureddy
R. Passonneau
29
0
0
02 Jun 2025
FreqPolicy: Frequency Autoregressive Visuomotor Policy with Continuous Tokens
Yiming Zhong
Yumeng Liu
Chuyang Xiao
Zemin Yang
Youzhuo Wang
Yufei Zhu
Ye-ling Shi
Yujing Sun
X. Zhu
Yuexin Ma
61
0
0
02 Jun 2025
Memory Access Characterization of Large Language Models in CPU Environment and its Potential Impacts
Spencer Banasik
45
0
0
02 Jun 2025
Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences
Hyojin Bahng
Caroline Chan
F. Durand
Phillip Isola
EGVM
38
0
0
02 Jun 2025
Domain Lexical Knowledge-based Word Embedding Learning for Text Classification under Small Data
Zixiao Zhu
Kezhi Mao
54
0
0
02 Jun 2025
Human-Centric Evaluation for Foundation Models
Yijin Guo
Kaiyuan Ji
Xiaorong Zhu
Junying Wang
Farong Wen
Chunyi Li
Zicheng Zhang
Guangtao Zhai
ALM, ELM
63
0
0
02 Jun 2025
Absorb and Converge: Provable Convergence Guarantee for Absorbing Discrete Diffusion Models
Yuchen Liang
Renxiang Huang
Lifeng Lai
Ness B. Shroff
Yingbin Liang
40
0
0
02 Jun 2025
Unraveling Spatio-Temporal Foundation Models via the Pipeline Lens: A Comprehensive Review
Yuchen Fang
Hao Miao
Yuxuan Liang
Liwei Deng
Yue Cui
...
Yan Zhao
T. Pedersen
Christian S. Jensen
Xiaofang Zhou
Kai Zheng
AI4TS, AI4CE
79
0
0
02 Jun 2025