ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1906.02715
  4. Cited By
Visualizing and Measuring the Geometry of BERT

Visualizing and Measuring the Geometry of BERT

6 June 2019
Andy Coenen
Emily Reif
Ann Yuan
Been Kim
Adam Pearce
F. Viégas
Martin Wattenberg
    MILM
ArXivPDFHTML

Papers citing "Visualizing and Measuring the Geometry of BERT"

21 / 71 papers shown
Title
Gender Bias in Multilingual Neural Machine Translation: The Architecture
  Matters
Gender Bias in Multilingual Neural Machine Translation: The Architecture Matters
Marta R. Costa-jussá
Carlos Escolano
Christine Basta
Javier Ferrando
Roser Batlle-Roca
Ksenia Kharitonova
16
18
0
24 Dec 2020
TabTransformer: Tabular Data Modeling Using Contextual Embeddings
TabTransformer: Tabular Data Modeling Using Contextual Embeddings
Xin Huang
A. Khetan
Milan Cvitkovic
Zohar Karnin
ViT
LMTD
157
417
0
11 Dec 2020
Positional Artefacts Propagate Through Masked Language Model Embeddings
Positional Artefacts Propagate Through Masked Language Model Embeddings
Ziyang Luo
Artur Kulmizev
Xiaoxi Mao
29
41
0
09 Nov 2020
Dynamic Contextualized Word Embeddings
Dynamic Contextualized Word Embeddings
Valentin Hofmann
J. Pierrehumbert
Hinrich Schütze
39
51
0
23 Oct 2020
Probing Pretrained Language Models for Lexical Semantics
Probing Pretrained Language Models for Lexical Semantics
Ivan Vulić
E. Ponti
Robert Litschko
Goran Glavas
Anna Korhonen
KELM
28
232
0
12 Oct 2020
LUKE: Deep Contextualized Entity Representations with Entity-aware
  Self-attention
LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention
Ikuya Yamada
Akari Asai
Hiroyuki Shindo
Hideaki Takeda
Yuji Matsumoto
22
662
0
02 Oct 2020
Attention Flows: Analyzing and Comparing Attention Mechanisms in
  Language Models
Attention Flows: Analyzing and Comparing Attention Mechanisms in Language Models
Joseph F DeRose
Jiayao Wang
M. Berger
17
83
0
03 Sep 2020
Analysis and Evaluation of Language Models for Word Sense Disambiguation
Analysis and Evaluation of Language Models for Word Sense Disambiguation
Daniel Loureiro
Kiamehr Rezaee
Mohammad Taher Pilehvar
Jose Camacho-Collados
18
13
0
26 Aug 2020
BERTology Meets Biology: Interpreting Attention in Protein Language
  Models
BERTology Meets Biology: Interpreting Attention in Protein Language Models
Jesse Vig
Ali Madani
L. Varshney
Caiming Xiong
R. Socher
Nazneen Rajani
29
288
0
26 Jun 2020
Finding Universal Grammatical Relations in Multilingual BERT
Finding Universal Grammatical Relations in Multilingual BERT
Ethan A. Chi
John Hewitt
Christopher D. Manning
18
151
0
09 May 2020
Moving Down the Long Tail of Word Sense Disambiguation with
  Gloss-Informed Biencoders
Moving Down the Long Tail of Word Sense Disambiguation with Gloss-Informed Biencoders
Terra Blevins
Luke Zettlemoyer
24
162
0
06 May 2020
Quantifying Attention Flow in Transformers
Quantifying Attention Flow in Transformers
Samira Abnar
Willem H. Zuidema
15
776
0
02 May 2020
Pre-trained Models for Natural Language Processing: A Survey
Pre-trained Models for Natural Language Processing: A Survey
Xipeng Qiu
Tianxiang Sun
Yige Xu
Yunfan Shao
Ning Dai
Xuanjing Huang
LM&MA
VLM
243
1,452
0
18 Mar 2020
A Survey on Contextual Embeddings
A Survey on Contextual Embeddings
Qi Liu
Matt J. Kusner
Phil Blunsom
225
146
0
16 Mar 2020
Capturing Evolution in Word Usage: Just Add More Clusters?
Capturing Evolution in Word Usage: Just Add More Clusters?
Matej Martinc
Syrielle Montariol
Elaine Zosa
Lidia Pivovarova
43
47
0
18 Jan 2020
oLMpics -- On what Language Model Pre-training Captures
oLMpics -- On what Language Model Pre-training Captures
Alon Talmor
Yanai Elazar
Yoav Goldberg
Jonathan Berant
LRM
22
300
0
31 Dec 2019
Are Transformers universal approximators of sequence-to-sequence
  functions?
Are Transformers universal approximators of sequence-to-sequence functions?
Chulhee Yun
Srinadh Bhojanapalli
A. S. Rawat
Sashank J. Reddi
Sanjiv Kumar
8
335
0
20 Dec 2019
What do you mean, BERT? Assessing BERT as a Distributional Semantics
  Model
What do you mean, BERT? Assessing BERT as a Distributional Semantics Model
Timothee Mickus
Denis Paperno
Mathieu Constant
Kees van Deemter
23
45
0
13 Nov 2019
HUBERT Untangles BERT to Improve Transfer across NLP Tasks
HUBERT Untangles BERT to Improve Transfer across NLP Tasks
M. Moradshahi
Hamid Palangi
M. Lam
P. Smolensky
Jianfeng Gao
23
16
0
25 Oct 2019
On Identifiability in Transformers
On Identifiability in Transformers
Gino Brunner
Yang Liu
Damian Pascual
Oliver Richter
Massimiliano Ciaramita
Roger Wattenhofer
ViT
30
186
0
12 Aug 2019
What you can cram into a single vector: Probing sentence embeddings for
  linguistic properties
What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Alexis Conneau
Germán Kruszewski
Guillaume Lample
Loïc Barrault
Marco Baroni
201
882
0
03 May 2018
Previous
12