ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1805.12471
  4. Cited By
Neural Network Acceptability Judgments

Neural Network Acceptability Judgments

31 May 2018
Alex Warstadt
Amanpreet Singh
Samuel R. Bowman
ArXivPDFHTML

Papers citing "Neural Network Acceptability Judgments"

50 / 880 papers shown
Title
SAS: Self-Augmentation Strategy for Language Model Pre-training
SAS: Self-Augmentation Strategy for Language Model Pre-training
Yifei Xu
Jingqiao Zhang
Ru He
Liangzhu Ge
Chao Yang
Cheng Yang
Ying Wu
34
1
0
14 Jun 2021
Why Can You Lay Off Heads? Investigating How BERT Heads Transfer
Why Can You Lay Off Heads? Investigating How BERT Heads Transfer
Ting-Rui Chiang
Yun-Nung Chen
28
0
0
14 Jun 2021
RefBERT: Compressing BERT by Referencing to Pre-computed Representations
RefBERT: Compressing BERT by Referencing to Pre-computed Representations
Xinyi Wang
Haiqing Yang
Liang Zhao
Yang Mo
Jianping Shen
MQ
20
3
0
11 Jun 2021
Convolutions and Self-Attention: Re-interpreting Relative Positions in
  Pre-trained Language Models
Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models
Tyler A. Chang
Yifan Xu
Weijian Xu
Z. Tu
ViT
29
15
0
10 Jun 2021
Bayesian Attention Belief Networks
Bayesian Attention Belief Networks
Shujian Zhang
Xinjie Fan
Bo Chen
Mingyuan Zhou
BDL
24
30
0
09 Jun 2021
Compacter: Efficient Low-Rank Hypercomplex Adapter Layers
Compacter: Efficient Low-Rank Hypercomplex Adapter Layers
Rabeeh Karimi Mahabadi
James Henderson
Sebastian Ruder
MoE
67
468
0
08 Jun 2021
BERT Learns to Teach: Knowledge Distillation with Meta Learning
BERT Learns to Teach: Knowledge Distillation with Meta Learning
Wangchunshu Zhou
Canwen Xu
Julian McAuley
28
87
0
08 Jun 2021
Parameter-efficient Multi-task Fine-tuning for Transformers via Shared
  Hypernetworks
Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks
Rabeeh Karimi Mahabadi
Sebastian Ruder
Mostafa Dehghani
James Henderson
MoE
22
294
0
08 Jun 2021
Refiner: Refining Self-attention for Vision Transformers
Refiner: Refining Self-attention for Vision Transformers
Daquan Zhou
Yujun Shi
Bingyi Kang
Weihao Yu
Zihang Jiang
Yuan Li
Xiaojie Jin
Qibin Hou
Jiashi Feng
ViT
29
59
0
07 Jun 2021
Learning Slice-Aware Representations with Mixture of Attentions
Learning Slice-Aware Representations with Mixture of Attentions
Cheng Wang
Sungjin Lee
Sunghyun Park
Han Li
Young-Bum Kim
R. Sarikaya
29
2
0
04 Jun 2021
Generate, Prune, Select: A Pipeline for Counterspeech Generation against
  Online Hate Speech
Generate, Prune, Select: A Pipeline for Counterspeech Generation against Online Hate Speech
Wanzheng Zhu
S. Bhat
12
54
0
03 Jun 2021
A Cluster-based Approach for Improving Isotropy in Contextual Embedding
  Space
A Cluster-based Approach for Improving Isotropy in Contextual Embedding Space
S. Rajaee
Mohammad Taher Pilehvar
16
41
0
02 Jun 2021
Using Integrated Gradients and Constituency Parse Trees to explain
  Linguistic Acceptability learnt by BERT
Using Integrated Gradients and Constituency Parse Trees to explain Linguistic Acceptability learnt by BERT
Anmol Nayak
Hariprasad Timmapathini
32
4
0
01 Jun 2021
Training ELECTRA Augmented with Multi-word Selection
Training ELECTRA Augmented with Multi-word Selection
Jiaming Shen
Jialu Liu
Tianqi Liu
Cong Yu
Jiawei Han
29
9
0
31 May 2021
Greedy-layer Pruning: Speeding up Transformer Models for Natural
  Language Processing
Greedy-layer Pruning: Speeding up Transformer Models for Natural Language Processing
David Peer
Sebastian Stabinger
Stefan Engl
A. Rodríguez-Sánchez
16
27
0
31 May 2021
A Compression-Compilation Framework for On-mobile Real-time BERT
  Applications
A Compression-Compilation Framework for On-mobile Real-time BERT Applications
Wei Niu
Zhenglun Kong
Geng Yuan
Weiwen Jiang
Jiexiong Guan
Caiwen Ding
Pu Zhao
Sijia Liu
Bin Ren
Yanzhi Wang
MQ
17
4
0
30 May 2021
Pre-training Universal Language Representation
Pre-training Universal Language Representation
Yian Li
Hai Zhao
SSL
27
8
0
30 May 2021
NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural
  Architecture Search
NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search
Jin Xu
Xu Tan
Renqian Luo
Kaitao Song
Jian Li
Tao Qin
Tie-Yan Liu
MQ
15
78
0
30 May 2021
Early Exiting with Ensemble Internal Classifiers
Early Exiting with Ensemble Internal Classifiers
Tianxiang Sun
Yunhua Zhou
Xiangyang Liu
Xinyu Zhang
Hao Jiang
Bo Zhao
Xuanjing Huang
Xipeng Qiu
32
30
0
28 May 2021
Super Tickets in Pre-Trained Language Models: From Model Compression to
  Improving Generalization
Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization
Chen Liang
Simiao Zuo
Minshuo Chen
Haoming Jiang
Xiaodong Liu
Pengcheng He
T. Zhao
Weizhu Chen
20
68
0
25 May 2021
AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on
  the Fly
AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly
Yuchen Jin
Dinesh Manocha
Liangyu Zhao
Yibo Zhu
Chuanxiong Guo
Marco Canini
Arvind Krishnamurthy
37
18
0
22 May 2021
KLUE: Korean Language Understanding Evaluation
KLUE: Korean Language Understanding Evaluation
Sungjoon Park
Jihyung Moon
Sungdong Kim
Won Ik Cho
Jiyoon Han
...
Seonghyun Kim
Lucy Park
Alice H. Oh
Jung-Woo Ha
Kyunghyun Cho
ELM
VLM
29
191
0
20 May 2021
Long Text Generation by Modeling Sentence-Level and Discourse-Level
  Coherence
Long Text Generation by Modeling Sentence-Level and Discourse-Level Coherence
Jian Guan
Xiaoxi Mao
Changjie Fan
Zitao Liu
Wenbiao Ding
Minlie Huang
AuLLM
29
78
0
19 May 2021
OpenMEVA: A Benchmark for Evaluating Open-ended Story Generation Metrics
OpenMEVA: A Benchmark for Evaluating Open-ended Story Generation Metrics
Jian Guan
Zhexin Zhang
Zhuoer Feng
Zitao Liu
Wenbiao Ding
Xiaoxi Mao
Changjie Fan
Minlie Huang
14
60
0
19 May 2021
How is BERT surprised? Layerwise detection of linguistic anomalies
How is BERT surprised? Layerwise detection of linguistic anomalies
Bai Li
Zining Zhu
Guillaume Thomas
Yang Xu
Frank Rudzicz
27
31
0
16 May 2021
DaLAJ - a dataset for linguistic acceptability judgments for Swedish:
  Format, baseline, sharing
DaLAJ - a dataset for linguistic acceptability judgments for Swedish: Format, baseline, sharing
Elena Volodina
Yousuf Ali Mohammed
Julia Klezl
11
21
0
14 May 2021
The Summary Loop: Learning to Write Abstractive Summaries Without
  Examples
The Summary Loop: Learning to Write Abstractive Summaries Without Examples
Philippe Laban
Andrew Hsi Bloomberg
John F. Canny
Marti A. Hearst
22
56
0
11 May 2021
Benchmarking down-scaled (not so large) pre-trained language models
Benchmarking down-scaled (not so large) pre-trained language models
Matthias Aßenmacher
P. Schulze
C. Heumann
6
1
0
11 May 2021
Is Incoherence Surprising? Targeted Evaluation of Coherence Prediction
  from Language Models
Is Incoherence Surprising? Targeted Evaluation of Coherence Prediction from Language Models
Anne Beyer
Sharid Loáiciga
David Schlangen
27
15
0
07 May 2021
Regression Bugs Are In Your Model! Measuring, Reducing and Analyzing
  Regressions In NLP Model Updates
Regression Bugs Are In Your Model! Measuring, Reducing and Analyzing Regressions In NLP Model Updates
Yuqing Xie
Yi-An Lai
Yuanjun Xiong
Yi Zhang
Stefano Soatto
UQCV
19
16
0
07 May 2021
Entailment as Few-Shot Learner
Entailment as Few-Shot Learner
Sinong Wang
Han Fang
Madian Khabsa
Hanzi Mao
Hao Ma
35
183
0
29 Apr 2021
Morph Call: Probing Morphosyntactic Content of Multilingual Transformers
Morph Call: Probing Morphosyntactic Content of Multilingual Transformers
Vladislav Mikhailov
O. Serikov
Ekaterina Artemova
12
9
0
26 Apr 2021
On Geodesic Distances and Contextual Embedding Compression for Text
  Classification
On Geodesic Distances and Contextual Embedding Compression for Text Classification
Rishi Jha
Kai Mihata
17
6
0
22 Apr 2021
SoT: Delving Deeper into Classification Head for Transformer
SoT: Delving Deeper into Classification Head for Transformer
Jiangtao Xie
Rui Zeng
Qilong Wang
Ziqi Zhou
P. Li
ViT
34
12
0
22 Apr 2021
Sensitivity as a Complexity Measure for Sequence Classification Tasks
Sensitivity as a Complexity Measure for Sequence Classification Tasks
Michael Hahn
Dan Jurafsky
Richard Futrell
150
22
0
21 Apr 2021
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in
  NLP
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP
Qinyuan Ye
Bill Yuchen Lin
Xiang Ren
223
180
0
18 Apr 2021
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
Kang Min Yoo
Dongju Park
Jaewook Kang
Sang-Woo Lee
Woomyeong Park
36
235
0
18 Apr 2021
Contrastive Out-of-Distribution Detection for Pretrained Transformers
Contrastive Out-of-Distribution Detection for Pretrained Transformers
Wenxuan Zhou
Fangyu Liu
Muhao Chen
13
98
0
18 Apr 2021
Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation
  for Few-shot Learning
Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning
Xisen Jin
Bill Yuchen Lin
Mohammad Rostami
Xiang Ren
BDL
CLL
22
42
0
18 Apr 2021
Documenting Large Webtext Corpora: A Case Study on the Colossal Clean
  Crawled Corpus
Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus
Jesse Dodge
Maarten Sap
Ana Marasović
William Agnew
Gabriel Ilharco
Dirk Groeneveld
Margaret Mitchell
Matt Gardner
AILaw
37
425
0
18 Apr 2021
What to Pre-Train on? Efficient Intermediate Task Selection
What to Pre-Train on? Efficient Intermediate Task Selection
Clifton A. Poth
Jonas Pfeiffer
Andreas Rucklé
Iryna Gurevych
19
94
0
16 Apr 2021
Effect of Visual Extensions on Natural Language Understanding in
  Vision-and-Language Models
Effect of Visual Extensions on Natural Language Understanding in Vision-and-Language Models
Taichi Iki
Akiko Aizawa
VLM
25
20
0
16 Apr 2021
Probing Across Time: What Does RoBERTa Know and When?
Probing Across Time: What Does RoBERTa Know and When?
Leo Z. Liu
Yizhong Wang
Jungo Kasai
Hannaneh Hajishirzi
Noah A. Smith
KELM
13
80
0
16 Apr 2021
How to Train BERT with an Academic Budget
How to Train BERT with an Academic Budget
Peter Izsak
Moshe Berchansky
Omer Levy
23
113
0
15 Apr 2021
Annealing Knowledge Distillation
Annealing Knowledge Distillation
A. Jafari
Mehdi Rezagholizadeh
Pranav Sharma
A. Ghodsi
15
77
0
14 Apr 2021
Masked Language Modeling and the Distributional Hypothesis: Order Word
  Matters Pre-training for Little
Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little
Koustuv Sinha
Robin Jia
Dieuwke Hupkes
J. Pineau
Adina Williams
Douwe Kiela
45
243
0
14 Apr 2021
On the Use of Linguistic Features for the Evaluation of Generative
  Dialogue Systems
On the Use of Linguistic Features for the Evaluation of Generative Dialogue Systems
Ian Berlot-Attwell
Frank Rudzicz
6
2
0
13 Apr 2021
Understanding Transformers for Bot Detection in Twitter
Understanding Transformers for Bot Detection in Twitter
Andrés García-Silva
Cristian Berrío
José Manuél Gómez-Pérez
33
4
0
13 Apr 2021
Targeted Adversarial Training for Natural Language Understanding
Targeted Adversarial Training for Natural Language Understanding
L. Pereira
Xiaodong Liu
Hao Cheng
Hoifung Poon
Jianfeng Gao
Ichiro Kobayashi
19
12
0
12 Apr 2021
FUDGE: Controlled Text Generation With Future Discriminators
FUDGE: Controlled Text Generation With Future Discriminators
Kevin Kaichuang Yang
Dan Klein
33
313
0
12 Apr 2021
Previous
123...131415161718
Next