1805.12471
Neural Network Acceptability Judgments
31 May 2018
Alex Warstadt
Amanpreet Singh
Samuel R. Bowman
Papers citing
"Neural Network Acceptability Judgments"
50 / 880 papers shown
SAS: Self-Augmentation Strategy for Language Model Pre-training
Yifei Xu
Jingqiao Zhang
Ru He
Liangzhu Ge
Chao Yang
Cheng Yang
Ying Wu
34
1
0
14 Jun 2021
Why Can You Lay Off Heads? Investigating How BERT Heads Transfer
Ting-Rui Chiang
Yun-Nung Chen
28
0
0
14 Jun 2021
RefBERT: Compressing BERT by Referencing to Pre-computed Representations
Xinyi Wang
Haiqing Yang
Liang Zhao
Yang Mo
Jianping Shen
MQ
20
3
0
11 Jun 2021
Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models
Tyler A. Chang
Yifan Xu
Weijian Xu
Z. Tu
ViT
29
15
0
10 Jun 2021
Bayesian Attention Belief Networks
Shujian Zhang
Xinjie Fan
Bo Chen
Mingyuan Zhou
BDL
24
30
0
09 Jun 2021
Compacter: Efficient Low-Rank Hypercomplex Adapter Layers
Rabeeh Karimi Mahabadi
James Henderson
Sebastian Ruder
MoE
67
468
0
08 Jun 2021
BERT Learns to Teach: Knowledge Distillation with Meta Learning
Wangchunshu Zhou
Canwen Xu
Julian McAuley
28
87
0
08 Jun 2021
Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks
Rabeeh Karimi Mahabadi
Sebastian Ruder
Mostafa Dehghani
James Henderson
MoE
22
294
0
08 Jun 2021
Refiner: Refining Self-attention for Vision Transformers
Daquan Zhou
Yujun Shi
Bingyi Kang
Weihao Yu
Zihang Jiang
Yuan Li
Xiaojie Jin
Qibin Hou
Jiashi Feng
ViT
29
59
0
07 Jun 2021
Learning Slice-Aware Representations with Mixture of Attentions
Cheng Wang
Sungjin Lee
Sunghyun Park
Han Li
Young-Bum Kim
R. Sarikaya
29
2
0
04 Jun 2021
Generate, Prune, Select: A Pipeline for Counterspeech Generation against Online Hate Speech
Wanzheng Zhu
S. Bhat
12
54
0
03 Jun 2021
A Cluster-based Approach for Improving Isotropy in Contextual Embedding Space
S. Rajaee
Mohammad Taher Pilehvar
16
41
0
02 Jun 2021
Using Integrated Gradients and Constituency Parse Trees to explain Linguistic Acceptability learnt by BERT
Anmol Nayak
Hariprasad Timmapathini
32
4
0
01 Jun 2021
Training ELECTRA Augmented with Multi-word Selection
Jiaming Shen
Jialu Liu
Tianqi Liu
Cong Yu
Jiawei Han
29
9
0
31 May 2021
Greedy-layer Pruning: Speeding up Transformer Models for Natural Language Processing
David Peer
Sebastian Stabinger
Stefan Engl
A. Rodríguez-Sánchez
16
27
0
31 May 2021
A Compression-Compilation Framework for On-mobile Real-time BERT Applications
Wei Niu
Zhenglun Kong
Geng Yuan
Weiwen Jiang
Jiexiong Guan
Caiwen Ding
Pu Zhao
Sijia Liu
Bin Ren
Yanzhi Wang
MQ
17
4
0
30 May 2021
Pre-training Universal Language Representation
Yian Li
Hai Zhao
SSL
27
8
0
30 May 2021
NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search
Jin Xu
Xu Tan
Renqian Luo
Kaitao Song
Jian Li
Tao Qin
Tie-Yan Liu
MQ
15
78
0
30 May 2021
Early Exiting with Ensemble Internal Classifiers
Tianxiang Sun
Yunhua Zhou
Xiangyang Liu
Xinyu Zhang
Hao Jiang
Bo Zhao
Xuanjing Huang
Xipeng Qiu
32
30
0
28 May 2021
Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization
Chen Liang
Simiao Zuo
Minshuo Chen
Haoming Jiang
Xiaodong Liu
Pengcheng He
T. Zhao
Weizhu Chen
20
68
0
25 May 2021
AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly
Yuchen Jin
Dinesh Manocha
Liangyu Zhao
Yibo Zhu
Chuanxiong Guo
Marco Canini
Arvind Krishnamurthy
37
18
0
22 May 2021
KLUE: Korean Language Understanding Evaluation
Sungjoon Park
Jihyung Moon
Sungdong Kim
Won Ik Cho
Jiyoon Han
...
Seonghyun Kim
Lucy Park
Alice H. Oh
Jung-Woo Ha
Kyunghyun Cho
ELM
VLM
29
191
0
20 May 2021
Long Text Generation by Modeling Sentence-Level and Discourse-Level Coherence
Jian Guan
Xiaoxi Mao
Changjie Fan
Zitao Liu
Wenbiao Ding
Minlie Huang
AuLLM
29
78
0
19 May 2021
OpenMEVA: A Benchmark for Evaluating Open-ended Story Generation Metrics
Jian Guan
Zhexin Zhang
Zhuoer Feng
Zitao Liu
Wenbiao Ding
Xiaoxi Mao
Changjie Fan
Minlie Huang
14
60
0
19 May 2021
How is BERT surprised? Layerwise detection of linguistic anomalies
Bai Li
Zining Zhu
Guillaume Thomas
Yang Xu
Frank Rudzicz
27
31
0
16 May 2021
DaLAJ - a dataset for linguistic acceptability judgments for Swedish: Format, baseline, sharing
Elena Volodina
Yousuf Ali Mohammed
Julia Klezl
11
21
0
14 May 2021
The Summary Loop: Learning to Write Abstractive Summaries Without Examples
Philippe Laban
Andrew Hsi
John F. Canny
Marti A. Hearst
22
56
0
11 May 2021
Benchmarking down-scaled (not so large) pre-trained language models
Matthias Aßenmacher
P. Schulze
C. Heumann
6
1
0
11 May 2021
Is Incoherence Surprising? Targeted Evaluation of Coherence Prediction from Language Models
Anne Beyer
Sharid Loáiciga
David Schlangen
27
15
0
07 May 2021
Regression Bugs Are In Your Model! Measuring, Reducing and Analyzing Regressions In NLP Model Updates
Yuqing Xie
Yi-An Lai
Yuanjun Xiong
Yi Zhang
Stefano Soatto
UQCV
19
16
0
07 May 2021
Entailment as Few-Shot Learner
Sinong Wang
Han Fang
Madian Khabsa
Hanzi Mao
Hao Ma
35
183
0
29 Apr 2021
Morph Call: Probing Morphosyntactic Content of Multilingual Transformers
Vladislav Mikhailov
O. Serikov
Ekaterina Artemova
12
9
0
26 Apr 2021
On Geodesic Distances and Contextual Embedding Compression for Text Classification
Rishi Jha
Kai Mihata
17
6
0
22 Apr 2021
SoT: Delving Deeper into Classification Head for Transformer
Jiangtao Xie
Rui Zeng
Qilong Wang
Ziqi Zhou
P. Li
ViT
34
12
0
22 Apr 2021
Sensitivity as a Complexity Measure for Sequence Classification Tasks
Michael Hahn
Dan Jurafsky
Richard Futrell
150
22
0
21 Apr 2021
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP
Qinyuan Ye
Bill Yuchen Lin
Xiang Ren
223
180
0
18 Apr 2021
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
Kang Min Yoo
Dongju Park
Jaewook Kang
Sang-Woo Lee
Woomyeong Park
36
235
0
18 Apr 2021
Contrastive Out-of-Distribution Detection for Pretrained Transformers
Wenxuan Zhou
Fangyu Liu
Muhao Chen
13
98
0
18 Apr 2021
Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning
Xisen Jin
Bill Yuchen Lin
Mohammad Rostami
Xiang Ren
BDL
CLL
22
42
0
18 Apr 2021
Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus
Jesse Dodge
Maarten Sap
Ana Marasović
William Agnew
Gabriel Ilharco
Dirk Groeneveld
Margaret Mitchell
Matt Gardner
AILaw
37
425
0
18 Apr 2021
What to Pre-Train on? Efficient Intermediate Task Selection
Clifton A. Poth
Jonas Pfeiffer
Andreas Rucklé
Iryna Gurevych
19
94
0
16 Apr 2021
Effect of Visual Extensions on Natural Language Understanding in Vision-and-Language Models
Taichi Iki
Akiko Aizawa
VLM
25
20
0
16 Apr 2021
Probing Across Time: What Does RoBERTa Know and When?
Leo Z. Liu
Yizhong Wang
Jungo Kasai
Hannaneh Hajishirzi
Noah A. Smith
KELM
13
80
0
16 Apr 2021
How to Train BERT with an Academic Budget
Peter Izsak
Moshe Berchansky
Omer Levy
23
113
0
15 Apr 2021
Annealing Knowledge Distillation
A. Jafari
Mehdi Rezagholizadeh
Pranav Sharma
A. Ghodsi
15
77
0
14 Apr 2021
Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little
Koustuv Sinha
Robin Jia
Dieuwke Hupkes
J. Pineau
Adina Williams
Douwe Kiela
45
243
0
14 Apr 2021
On the Use of Linguistic Features for the Evaluation of Generative Dialogue Systems
Ian Berlot-Attwell
Frank Rudzicz
6
2
0
13 Apr 2021
Understanding Transformers for Bot Detection in Twitter
Andrés García-Silva
Cristian Berrío
José Manuél Gómez-Pérez
33
4
0
13 Apr 2021
Targeted Adversarial Training for Natural Language Understanding
L. Pereira
Xiaodong Liu
Hao Cheng
Hoifung Poon
Jianfeng Gao
Ichiro Kobayashi
19
12
0
12 Apr 2021
FUDGE: Controlled Text Generation With Future Discriminators
Kevin Kaichuang Yang
Dan Klein
33
313
0
12 Apr 2021