A Primer in BERTology: What we know about how BERT works


27 February 2020
Anna Rogers
Olga Kovaleva
Anna Rumshisky
    OffRL

Papers citing "A Primer in BERTology: What we know about how BERT works"

50 / 224 papers shown
Jekyll-and-Hyde Tipping Point in an AI's Behavior
Neil F. Johnson
Frank Yingjie Huo
46
0
0
29 Apr 2025
Deep Learning with Pretrained 'Internal World' Layers: A Gemma 3-Based Modular Architecture for Wildfire Prediction
Ayoub Jadouli
Chaker El Amrani
KELM
AI4TS
81
0
0
20 Apr 2025
Statistical Deficiency for Task Inclusion Estimation
Loïc Fosse
Frédéric Béchet
Benoit Favre
Géraldine Damnati
Gwénolé Lecorvé
Maxime Darrin
Philippe Formont
Pablo Piantanida
136
0
0
07 Mar 2025
A Survey of Model Architectures in Information Retrieval
Zhichao Xu
Fengran Mo
Zhiqi Huang
Crystina Zhang
Puxuan Yu
Bei Wang
Jimmy J. Lin
Vivek Srikumar
KELM
3DV
56
2
0
21 Feb 2025
Integrating Language Models for Enhanced Network State Monitoring in DRL-Based SFC Provisioning
Parisa Fard Moshiri
Murat Arda Onsu
Poonam Lohan
Burak Kantarci
Emil Janulewicz
39
0
0
16 Feb 2025
The Geometry of Tokens in Internal Representations of Large Language Models
Karthik Viswanathan
Yuri Gardinazzi
Giada Panerai
Alberto Cazzaniga
Matteo Biagetti
AIFin
94
4
0
17 Jan 2025
Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models
Michael Toker
Ido Galil
Hadas Orgad
Rinon Gal
Yoad Tewel
Gal Chechik
Yonatan Belinkov
DiffM
54
2
0
12 Jan 2025
Pixology: Probing the Linguistic and Visual Capabilities of Pixel-based Language Models
Kushal Tatariya
Vladimir Araujo
Thomas Bauwens
Miryam de Lhoneux
VLM
33
0
0
15 Oct 2024
Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
Tongtian Yue
Longteng Guo
Jie Cheng
Xuange Gao
J. Liu
MoE
36
0
0
14 Oct 2024
What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages
Nadav Borenstein
Anej Svete
R. Chan
Josef Valvoda
Franz Nowak
Isabelle Augenstein
Eleanor Chodroff
Ryan Cotterell
42
11
0
06 Jun 2024
Exploring Multilingual Large Language Models for Enhanced TNM classification of Radiology Report in lung cancer staging
Hidetoshi Matsuo
Mizuho Nishio
Takaaki Matsunaga
Koji Fujimoto
Takamichi Murakami
LM&MA
42
5
0
05 Jun 2024
Standards for Belief Representations in LLMs
Daniel A. Herrmann
B. Levinstein
39
7
0
31 May 2024
Are queries and keys always relevant? A case study on Transformer wave functions
Riccardo Rende
Luciano Loris Viteritti
24
5
0
29 May 2024
PhilHumans: Benchmarking Machine Learning for Personal Health
Vadim Liventsev
Vivek Kumar
Allmin Pradhap Singh Susaiyah
Zixiu "Alex" Wu
Ivan Rodin
...
Milan Petkovic
Diego Reforgiato Recupero
Ehud Reiter
Daniele Riboni
Raymond Sterling
AI4MH
LM&MA
34
0
0
04 May 2024
ViTHSD: Exploiting Hatred by Targets for Hate Speech Detection on Vietnamese Social Media Texts
Cuong Nhat Vo
Khanh Bao Huynh
Son T. Luu
Trong-Hop Do
45
1
0
30 Apr 2024
Large language models and linguistic intentionality
J. Grindrod
38
5
0
15 Apr 2024
Transformers for molecular property prediction: Lessons learned from the past five years
Afnan Sultan
Jochen Sieg
M. Mathea
Andrea Volkamer
AI4CE
29
10
0
05 Apr 2024
CSEPrompts: A Benchmark of Introductory Computer Science Prompts
Md. Nishat Raihan
Dhiman Goswami
Sadiya Sayara Chowdhury Puspo
Christian D. Newman
Tharindu Ranasinghe
Marcos Zampieri
ELM
41
2
0
03 Apr 2024
Toward Informal Language Processing: Knowledge of Slang in Large Language Models
Zhewei Sun
Qian Hu
Rahul Gupta
Richard Zemel
Yang Xu
38
1
0
02 Apr 2024
A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia
Giovanni Monea
Maxime Peyrard
Martin Josifoski
Vishrav Chaudhary
Jason Eisner
Emre Kiciman
Hamid Palangi
Barun Patra
Robert West
KELM
51
12
0
04 Dec 2023
Uncovering Intermediate Variables in Transformers using Circuit Probing
Michael A. Lepori
Thomas Serre
Ellie Pavlick
75
7
0
07 Nov 2023
Setting the Trap: Capturing and Defeating Backdoors in Pretrained Language Models through Honeypots
Ruixiang Tang
Jiayi Yuan
Yiming Li
Zirui Liu
Rui Chen
Xia Hu
AAML
36
13
0
28 Oct 2023
Codebook Features: Sparse and Discrete Interpretability for Neural Networks
Alex Tamkin
Mohammad Taufeeque
Noah D. Goodman
32
27
0
26 Oct 2023
Kiki or Bouba? Sound Symbolism in Vision-and-Language Models
Morris Alper
Hadar Averbuch-Elor
33
10
0
25 Oct 2023
Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models
Yifan Hou
Jiaoda Li
Yu Fei
Alessandro Stolfo
Wangchunshu Zhou
Guangtao Zeng
Antoine Bosselut
Mrinmaya Sachan
LRM
30
40
0
23 Oct 2023
Bridging Information-Theoretic and Geometric Compression in Language Models
Emily Cheng
Corentin Kervadec
Marco Baroni
34
16
0
20 Oct 2023
The Temporal Structure of Language Processing in the Human Brain Corresponds to The Layered Hierarchy of Deep Language Models
Ariel Goldstein
Eric Ham
Mariano Schain
Samuel A. Nastase
Zaid Zada
...
Avinatan Hassidim
O. Devinsky
A. Flinker
Omer Levy
Uri Hasson
AI4CE
15
10
0
11 Oct 2023
Why bother with geometry? On the relevance of linear decompositions of Transformer embeddings
Timothee Mickus
Raúl Vázquez
20
2
0
10 Oct 2023
Recurrent Neural Language Models as Probabilistic Finite-state Automata
Anej Svete
Ryan Cotterell
32
2
0
08 Oct 2023
Language Models Represent Space and Time
Wes Gurnee
Max Tegmark
33
141
0
03 Oct 2023
BenLLMEval: A Comprehensive Evaluation into the Potentials and Pitfalls of Large Language Models on Bengali NLP
M. Kabir
Mohammed Saidul Islam
Md Tahmid Rahman Laskar
Mir Tafseer Nayeem
M Saiful Bari
Enamul Hoque
LM&MA
24
15
0
22 Sep 2023
Feature Engineering in Learning-to-Rank for Community Question Answering Task
Nafis Sajid
Md Rashidul Hasan
Muhammad Ibrahim
21
3
0
14 Sep 2023
A Comparative Analysis of Pretrained Language Models for Text-to-Speech
M. G. Moya
Panagiota Karanasou
S. Karlapati
Bastian Schnell
Nicole Peinelt
Alexis Moinet
Thomas Drugman
37
3
0
04 Sep 2023
A User-Centered Evaluation of Spanish Text Simplification
Adrian de Wynter
Anthony Hevia
Si-Qing Chen
28
0
0
15 Aug 2023
Intelligent Assistant Language Understanding On Device
Cecilia Aas
Hisham Abdelsalam
Irina Belousova
Shruti Bhargava
Jianpeng Cheng
...
John Torr
Marco Del Vecchio
Jay Wacker
Jason D. Williams
Hong-ye Yu
13
2
0
07 Aug 2023
Generative Models as a Complex Systems Science: How can we make sense of large language model behavior?
Ari Holtzman
Peter West
Luke Zettlemoyer
AI4CE
30
14
0
31 Jul 2023
Exploring Anisotropy and Outliers in Multilingual Language Models for Cross-Lingual Semantic Sentence Similarity
Katharina Hämmerl
Alina Fastowski
Jindrich Libovický
Alexander M. Fraser
20
6
0
01 Jun 2023
A Method for Studying Semantic Construal in Grammatical Constructions with Interpretable Contextual Embedding Spaces
Gabriella Chronis
Kyle Mahowald
K. Erk
18
8
0
29 May 2023
Plug-and-Play Document Modules for Pre-trained Models
Chaojun Xiao
Zhengyan Zhang
Xu Han
Chi-Min Chan
Yankai Lin
Zhiyuan Liu
Xiangyang Li
Zhonghua Li
Zhao Cao
Maosong Sun
KELM
22
5
0
28 May 2023
Structural Ambiguity and its Disambiguation in Language Model Based Parsers: the Case of Dutch Clause Relativization
G. Wijnholds
M. Moortgat
10
3
0
24 May 2023
Automatic Readability Assessment for Closely Related Languages
Joseph Marvin Imperial
E. Kochmar
22
8
0
22 May 2023
Token-wise Decomposition of Autoregressive Language Model Hidden States for Analyzing Model Predictions
Byung-Doh Oh
William Schuler
29
2
0
17 May 2023
Explaining black box text modules in natural language with language models
Chandan Singh
Aliyah R. Hsu
Richard Antonello
Shailee Jain
Alexander G. Huth
Bin-Xia Yu
Jianfeng Gao
MILM
26
46
0
17 May 2023
Idioms, Probing and Dangerous Things: Towards Structural Probing for Idiomaticity in Vector Space
Filip Klubicka
Vasudevan Nedumpozhimana
John D. Kelleher
33
4
0
27 Apr 2023
Integrating Image Features with Convolutional Sequence-to-sequence Network for Multilingual Visual Question Answering
T. M. Thai
Son T. Luu
37
0
0
22 Mar 2023
An Overview on Language Models: Recent Developments and Outlook
Chengwei Wei
Yun Cheng Wang
Bin Wang
C.-C. Jay Kuo
25
42
0
10 Mar 2023
STA: Self-controlled Text Augmentation for Improving Text Classifications
Congcong Wang
Gonzalo Fiz Pontiveros
Steven Derby
Tri Kurniawan Wijaya
40
3
0
24 Feb 2023
A Scalable Space-efficient In-database Interpretability Framework for Embedding-based Semantic SQL Queries
P. Kudva
R. Bordawekar
Apoorva Nitsure
12
0
0
23 Feb 2023
Mask-guided BERT for Few Shot Text Classification
Wenxiong Liao
Zheng Liu
Haixing Dai
Zihao Wu
Yiyang Zhang
...
Dajiang Zhu
Tianming Liu
Sheng R. Li
Xiang Li
Hongmin Cai
VLM
47
39
0
21 Feb 2023
Dynamic Named Entity Recognition
Tristan Luiggi
Laure Soulier
Vincent Guigue
Siwar Jendoubi
Aurélien Baelde
28
0
0
16 Feb 2023