ResearchTrend.AI
Deep contextualized word representations

15 February 2018
Matthew E. Peters
Mark Neumann
Mohit Iyyer
Matt Gardner
Christopher Clark
Kenton Lee
Luke Zettlemoyer
    NAI

Papers citing "Deep contextualized word representations"

50 / 4,508 papers shown
Title
Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation
Marius Mosbach
Tiago Pimentel
Shauli Ravfogel
Dietrich Klakow
Yanai Elazar
108
135
0
26 May 2023
Backpack Language Models
John Hewitt
John Thickstun
Christopher D. Manning
Percy Liang
KELM
101
16
0
26 May 2023
Parameter-Efficient Fine-Tuning without Introducing New Latency
Baohao Liao
Yan Meng
Christof Monz
59
56
0
26 May 2023
Language Models Implement Simple Word2Vec-style Vector Arithmetic
Jack Merullo
Carsten Eickhoff
Ellie Pavlick
KELM
95
66
0
25 May 2023
Extracting Text Representations for Terms and Phrases in Technical Domains
Francesco Fusco
Diego Antognini
62
0
0
25 May 2023
Dynamic Masking Rate Schedules for MLM Pretraining
Zachary Ankner
Naomi Saphra
Davis W. Blalock
Jonathan Frankle
Matthew L. Leavitt
101
8
0
24 May 2023
A Human-in-the-Loop Approach for Information Extraction from Privacy Policies under Data Scarcity
M. Gebauer
Faraz Maschhur
Nicola Leschke
Elias Grünewald
Frank Pallas
61
7
0
24 May 2023
Modeling rapid language learning by distilling Bayesian priors into artificial neural networks
R. Thomas McCoy
Thomas Griffiths
BDL
89
17
0
24 May 2023
Complex Mathematical Symbol Definition Structures: A Dataset and Model for Coordination Resolution in Definition Extraction
Anna Martin-Boyle
Andrew Head
Kyle Lo
Risham Sidhu
Marti A. Hearst
Dongyeop Kang
55
1
0
24 May 2023
From Characters to Words: Hierarchical Pre-trained Language Model for Open-vocabulary Language Understanding
Li Sun
F. Luisier
Kayhan Batmanghelich
D. Florêncio
Changrong Zhang
VLM
44
6
0
23 May 2023
Masked Path Modeling for Vision-and-Language Navigation
Zi-Yi Dou
Feng Gao
Nanyun Peng
LM&Ro
83
3
0
23 May 2023
Acquiring Frame Element Knowledge with Deep Metric Learning for Semantic Frame Induction
Kosuke Yamada
Ryohei Sasano
Koichi Takeda
FedML
38
1
0
23 May 2023
A Trip Towards Fairness: Bias and De-Biasing in Large Language Models
Leonardo Ranaldi
Elena Sofia Ruzzetti
Davide Venditti
Dario Onorati
Fabio Massimo Zanzotto
93
37
0
23 May 2023
Knowledge of Knowledge: Exploring Known-Unknowns Uncertainty with Large Language Models
Alfonso Amayuelas
Kyle Wong
Liangming Pan
Wenhu Chen
Wenjie Wang
103
29
0
23 May 2023
A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity
Shayne Longpre
Gregory Yauney
Emily Reif
Katherine Lee
Adam Roberts
...
Denny Zhou
Jason W. Wei
Kevin Robinson
David M. Mimno
Daphne Ippolito
117
168
0
22 May 2023
Textually Pretrained Speech Language Models
Michael Hassid
Tal Remez
Tu Nguyen
Itai Gat
Alexis Conneau
...
Alexandre Défossez
Gabriel Synnaeve
Emmanuel Dupoux
Roy Schwartz
Yossi Adi
VLM, SyDa
131
61
0
22 May 2023
EnCore: Fine-Grained Entity Typing by Pre-Training Entity Encoders on Coreference Chains
Frank Mtumbuka
Steven Schockaert
73
0
0
22 May 2023
Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model
Xiao Wang
Wei Zhou
Qi Zhang
Jie Zhou
Songyang Gao
Junzhe Wang
Menghan Zhang
Xiang Gao
Yunwen Chen
Tao Gui
129
10
0
22 May 2023
A Frustratingly Simple Decoding Method for Neural Text Generation
Haoran Yang
Deng Cai
Huayang Li
Wei Bi
Wai Lam
Shuming Shi
84
11
0
22 May 2023
Data-efficient Active Learning for Structured Prediction with Partial Annotation and Self-Training
Zhisong Zhang
Emma Strubell
Eduard H. Hovy
77
1
0
22 May 2023
Patton: Language Model Pretraining on Text-Rich Networks
Bowen Jin
Wentao Zhang
Yu Zhang
Yu Meng
Xinyang Zhang
Qi Zhu
Jiawei Han
VLM
112
46
0
20 May 2023
Deep Learning Approaches to Lexical Simplification: A Survey
Kai North
Tharindu Ranasinghe
Matthew Shardlow
Marcos Zampieri
50
15
0
19 May 2023
SeeGULL: A Stereotype Benchmark with Broad Geo-Cultural Coverage Leveraging Generative Models
Akshita Jha
Aida Mostafazadeh Davani
Chandan K. Reddy
Shachi Dave
Vinodkumar Prabhakaran
Sunipa Dev
87
50
0
19 May 2023
DMDD: A Large-Scale Dataset for Dataset Mentions Detection
Huitong Pan
Qi Zhang
Eduard Constantin Dragut
Cornelia Caragea
Longin Jan Latecki
42
11
0
19 May 2023
Decouple knowledge from parameters for plug-and-play language modeling
Xin Cheng
Yankai Lin
Preslav Nakov
Dongyan Zhao
Rui Yan
KELM
86
2
0
19 May 2023
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Peng Wang
Shijie Wang
Junyang Lin
Shuai Bai
Xiaohuan Zhou
Jingren Zhou
Xinggang Wang
Chang Zhou
VLM, MLLM, ObjD
151
122
0
18 May 2023
OSDP: Optimal Sharded Data Parallel for Distributed Deep Learning
Youhe Jiang
Fangcheng Fu
Xupeng Miao
Xiaonan Nie
Tengjiao Wang
73
11
0
17 May 2023
Distinguish Before Answer: Generating Contrastive Explanation as Knowledge for Commonsense Question Answering
Qianglong Chen
Guohai Xu
Mingshi Yan
Ji Zhang
Fei Huang
Luo Si
Yin Zhang
104
10
0
14 May 2023
Efficient Asynchronize Stochastic Gradient Algorithm with Structured Data
Zhao Song
Mingquan Ye
72
4
0
13 May 2023
Constructing Holistic Measures for Social Biases in Masked Language Models
Yang Liu
Yuexian Hou
25
0
0
12 May 2023
EAML: Ensemble Self-Attention-based Mutual Learning Network for Document Image Classification
Souhail Bakkali
Zuheng Ming
Mickael Coustaty
Marçal Rusiñol
65
6
0
11 May 2023
SPSQL: Step-by-step Parsing Based Framework for Text-to-SQL Generation
Ran Shen
Gang Sun
Hao Shen
Yiling Li
Liangfeng Jin
Han Jiang
55
5
0
10 May 2023
Best-Effort Adaptation
Pranjal Awasthi
Corinna Cortes
M. Mohri
91
8
0
10 May 2023
RLocator: Reinforcement Learning for Bug Localization
Partha Chakraborty
Mahmoud Alfadel
M. Nagappan
79
9
0
09 May 2023
A Frustratingly Easy Improvement for Position Embeddings via Random Padding
Mingxu Tao
Yansong Feng
Dongyan Zhao
77
6
0
08 May 2023
PreCog: Exploring the Relation between Memorization and Performance in Pre-trained Language Models
Leonardo Ranaldi
Elena Sofia Ruzzetti
Fabio Massimo Zanzotto
69
6
0
08 May 2023
Diffusion Theory as a Scalpel: Detecting and Purifying Poisonous Dimensions in Pre-trained Language Models Caused by Backdoor or Bias
Zhiyuan Zhang
Deli Chen
Hao Zhou
Fandong Meng
Jie Zhou
Xu Sun
73
5
0
08 May 2023
Harnessing the Power of BERT in the Turkish Clinical Domain: Pretraining Approaches for Limited Data Scenarios
Hazal Türkmen
Oğuz Dikenelli
C. Eraslan
Mehmet Cem Çalli
S. Özbek
89
3
0
05 May 2023
Context-Aware Semantic Similarity Measurement for Unsupervised Word Sense Disambiguation
J. Martinez-Gil
56
3
0
05 May 2023
Towards Applying Powerful Large AI Models in Classroom Teaching: Opportunities, Challenges and Prospects
Kehui Tan
Tianqi Pang
Chenyou Fan
Song Yu
68
16
0
05 May 2023
LLM-RM at SemEval-2023 Task 2: Multilingual Complex NER using XLM-RoBERTa
Rahul Mehta
Vasudeva Varma
63
13
0
05 May 2023
From Statistical Methods to Deep Learning, Automatic Keyphrase Prediction: A Survey
Binbin Xie
Jianwei Song
Liangying Shao
Suhang Wu
Xiangpeng Wei
Baosong Yang
Huan Lin
Jun Xie
Jinsong Su
79
25
0
04 May 2023
A Novel Plagiarism Detection Approach Combining BERT-based Word Embedding, Attention-based LSTMs and an Improved Differential Evolution Algorithm
Seyed Vahid Moravvej
Seyed Jalaleddin Mousavirad
Diego Oliva
F. Mohammadi
26
19
0
03 May 2023
Evaluating the Efficacy of Length-Controllable Machine Translation
Hao Cheng
Meng Zhang
Weixuan Wang
Liangyou Li
Qun Liu
Zhihua Zhang
92
0
0
03 May 2023
Exploring Linguistic Properties of Monolingual BERTs with Typological Classification among Languages
Elena Sofia Ruzzetti
Federico Ranaldi
F. Logozzo
Michele Mastromattei
Leonardo Ranaldi
Fabio Massimo Zanzotto
69
9
0
03 May 2023
Improving Cancer Hallmark Classification with BERT-based Deep Learning Approach
Sultan Zavrak
Seyhmus Yilmaz
46
0
0
02 May 2023
ArK: Augmented Reality with Knowledge Interactive Emergent Ability
Qiuyuan Huang
Jinho Park
Abhinav Gupta
Paul N. Bennett
Ran Gong
...
Baolin Peng
O. Mohammed
C. Pal
Yejin Choi
Jianfeng Gao
122
6
0
01 May 2023
Reliable Gradient-free and Likelihood-free Prompt Tuning
Maohao Shen
S. Ghosh
P. Sattigeri
Subhro Das
Yuheng Bu
G. Wornell
VLM
115
12
0
30 Apr 2023
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model
Michael Hanna
Ollie Liu
Alexandre Variengien
LRM
311
132
0
30 Apr 2023
When Deep Learning Meets Polyhedral Theory: A Survey
Joey Huchette
Gonzalo Muñoz
Thiago Serra
Calvin Tsay
AI4CE
160
37
0
29 Apr 2023