XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,522 papers shown

Low Anisotropy Sense Retrofitting (LASeR) : Towards Isotropic and Sense Enriched Representations Geetanjali Bihani Julia Taylor Rayz 68 13 0 22 Apr 2021
A Short Survey of Pre-trained Language Models for Conversational AI-A NewAge in NLP Munazza Zaib Quan Z. Sheng W. Zhang 77 72 0 22 Apr 2021
Should we Stop Training More Monolingual Models, and Simply Use Machine Translation Instead? T. Isbister F. Carlsson Magnus Sahlgren 95 25 0 21 Apr 2021
Sattiy at SemEval-2021 Task 9: An Ensemble Solution for Statement Verification and Evidence Finding with Tables Xiaoyi Ruan Meizhi Jin Jian Ma Haiqing Yang Lian-Xin Jiang Yang Mo Mengyuan Zhou LMTD 65 2 0 21 Apr 2021
Sensitivity as a Complexity Measure for Sequence Classification Tasks Michael Hahn Dan Jurafsky Richard Futrell 197 22 0 21 Apr 2021
Identify, Align, and Integrate: Matching Knowledge Graphs to Commonsense Reasoning Tasks Lisa Bauer Mohit Bansal 48 19 0 20 Apr 2021
Enhancing Cognitive Models of Emotions with Representation Learning Yuting Guo Jinho Choi 48 5 0 20 Apr 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding Jianlin Su Yu Lu Shengfeng Pan Ahmed Murtadha Bo Wen Yunfeng Liu 382 2,555 0 20 Apr 2021
Efficient pre-training objectives for Transformers Luca Di Liello Matteo Gabburo Alessandro Moschitti 42 15 0 20 Apr 2021
Training Value-Aligned Reinforcement Learning Agents Using a Normative Prior Md Sultan al Nahian Spencer Frazier Brent Harrison Mark O. Riedl 97 19 0 19 Apr 2021
Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training Chenyi Lei Shixian Luo Yong Liu Wanggui He Jiamang Wang Guoxin Wang Haihong Tang Chunyan Miao Houqiang Li 60 42 0 19 Apr 2021
TREC Deep Learning Track: Reusable Test Collections in the Large Data Regime Nick Craswell Bhaskar Mitra Emine Yilmaz Daniel Fernando Campos E. Voorhees I. Soboroff 76 52 0 19 Apr 2021
Improving Transformer-Kernel Ranking Model Using Conformer and Query Term Independence Bhaskar Mitra Sebastian Hofstätter Hamed Zamani Nick Craswell 81 8 0 19 Apr 2021
A novel time-frequency Transformer based on self-attention mechanism and its application in fault diagnosis of rolling bearings Yifei Ding M. Jia Qiuhua Miao Yudong Cao 59 290 0 19 Apr 2021
On the Use of Context for Predicting Citation Worthiness of Sentences in Scholarly Articles Rakesh Gosangi Ravneet Arora Mohsen Gheisarieha Debanjan Mahata Haimin Zhang 45 10 0 18 Apr 2021
Emotion-Regularized Conditional Variational Autoencoder for Emotional Response Generation Yu-Ping Ruan Zhenhua Ling DRL 80 16 0 18 Apr 2021
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity Yao Lu Max Bartolo Alastair Moore Sebastian Riedel Pontus Stenetorp AILaw LRM 461 1,200 0 18 Apr 2021
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation Tianyu Liu Yizhe Zhang Chris Brockett Yi Mao Zhifang Sui Weizhu Chen W. Dolan HILM 297 149 0 18 Apr 2021
A Simple and Effective Positional Encoding for Transformers Pu-Chin Chen Henry Tsai Srinadh Bhojanapalli Hyung Won Chung Yin-Wen Chang Chun-Sung Ferng 120 66 0 18 Apr 2021
Linguistic Dependencies and Statistical Dependence Jacob Louis Hoover Alessandro Sordoni Wenyu Du Timothy J. O'Donnell 75 15 0 18 Apr 2021
"Average" Approximates "First Principal Component"? An Empirical Analysis on Representations from Neural Language Models Zihan Wang Chengyu Dong Jingbo Shang FAtt 140 4 0 18 Apr 2021
Characterizing Idioms: Conventionality and Contingency Michaela Socolof Jackie C.K. Cheung Michael Wagner Timothy J. O'Donnell 25 6 0 17 Apr 2021
Identifying the Limits of Cross-Domain Knowledge Transfer for Pretrained Models Zhengxuan Wu Nelson F. Liu Christopher Potts 45 3 0 17 Apr 2021
AMMU : A Survey of Transformer-based Biomedical Pretrained Language Models Katikapalli Subramanyam Kalyan A. Rajasekharan S. Sangeetha LM&MA MedIm 117 170 0 16 Apr 2021
Condenser: a Pre-training Architecture for Dense Retrieval Luyu Gao Jamie Callan AI4CE 69 269 0 16 Apr 2021
DEUX: An Attribute-Guided Framework for Sociable Recommendation Dialog Systems Yu Li Shirley Anugrah Hayati Weiyan Shi Zhou Yu 64 5 0 16 Apr 2021
Write-a-speaker: Text-based Emotional and Rhythmic Talking-head Generation Lilin Cheng Suzhe Wang Zhimeng Zhang Yu-qiong Ding Yixing Zheng Xin Yu Changjie Fan VGen 45 71 0 16 Apr 2021
Time-Stamped Language Model: Teaching Language Models to Understand the Flow of Events Hossein Rajaby Faghihi Parisa Kordjamshidi 65 25 0 15 Apr 2021
Gradient-based Adversarial Attacks against Text Transformers Chuan Guo Alexandre Sablayrolles Hervé Jégou Douwe Kiela SILM 165 248 0 15 Apr 2021
Syntactic Perturbations Reveal Representational Correlates of Hierarchical Phrase Structure in Pretrained Language Models Matteo Alleman J. Mamou Miguel Rio Hanlin Tang Yoon Kim SueYeon Chung NAI 100 17 0 15 Apr 2021
Hierarchical Learning for Generation with Long Source Sequences T. Rohde Xiaoxia Wu Yinhan Liu BDL VLM 76 56 0 15 Apr 2021
Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models Karolina Stańczak Sagnik Ray Choudhury Tiago Pimentel Ryan Cotterell Isabelle Augenstein 84 24 0 15 Apr 2021
Natural Language Understanding with Privacy-Preserving BERT Chen Qu Weize Kong Liu Yang Mingyang Zhang Michael Bendersky Marc Najork 103 76 0 15 Apr 2021
Effect of Post-processing on Contextualized Word Representations Hassan Sajjad Firoj Alam Fahim Dalvi Nadir Durrani 61 9 0 15 Apr 2021
Pseudo Zero Pronoun Resolution Improves Zero Anaphora Resolution Ryuto Konno Shun Kiyono Yuichiroh Matsubayashi Hiroki Ouchi Kentaro Inui 19 10 0 15 Apr 2021
Consistency Training with Virtual Adversarial Discrete Perturbation Jungsoo Park Gyuwan Kim Jaewoo Kang 76 15 0 15 Apr 2021
Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese Pre-trained Language Models Yuxuan Lai Yijia Liu Yansong Feng Songfang Huang Dongyan Zhao VLM AI4CE 77 38 0 15 Apr 2021
COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List Luyu Gao Zhuyun Dai Jamie Callan 87 220 0 15 Apr 2021
Disentangling Representations of Text by Masking Transformers Xiongyi Zhang Jan-Willem van de Meent Byron C. Wallace DRL 64 21 0 14 Apr 2021
UDALM: Unsupervised Domain Adaptation through Language Modeling Constantinos F. Karouzos Georgios Paraskevopoulos Alexandros Potamianos 72 57 0 14 Apr 2021
K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce Song Xu Haoran Li Peng Yuan Yujia Wang Youzheng Wu Xiaodong He Ying Liu Bowen Zhou KELM 91 24 0 14 Apr 2021
Enhancing Interpretable Clauses Semantically using Pretrained Word Representation Rohan Kumar Yadav Lei Jiao Ole-Christoffer Granmo Morten Goodwin NAI 67 16 0 14 Apr 2021
I Wish I Would Have Loved This One, But I Didn't -- A Multilingual Dataset for Counterfactual Detection in Product Reviews James O'Neill Polina Rozenshtein Ryuichi Kiryo Motoko Kubota Danushka Bollegala 76 31 0 14 Apr 2021
Knowledge-driven Answer Generation for Conversational Search Mariana Leite Rafael Ferreira David Semedo João Magalhães RALM KELM 78 1 0 14 Apr 2021
AR-LSAT: Investigating Analytical Reasoning of Text Wanjun Zhong Siyuan Wang Duyu Tang Zenan Xu Daya Guo Jiahai Wang Jian Yin Ming Zhou Nan Duan ELM 137 44 0 14 Apr 2021
Demystifying BERT: Implications for Accelerator Design Suchita Pati Shaizeen Aga Nuwan Jayasena Matthew D. Sinclair LLMAG 88 17 0 14 Apr 2021
Developing a Conversational Recommendation System for Navigating Limited Options Victor S. Bursztyn Jennifer Healey Eunyee Koh Nedim Lipka Larry Birnbaum 25 7 0 13 Apr 2021
BERT Embeddings Can Track Context in Conversational Search Rafael Ferreira David Semedo João Magalhães AI4TS 55 0 0 13 Apr 2021
The Future is not One-dimensional: Complex Event Schema Induction by Graph Modeling for Event Prediction Manling Li Sha Li Zhenhailong Wang Lifu Huang Kyunghyun Cho Heng Ji Jiawei Han Clare R. Voss 114 58 0 13 Apr 2021
Reducing Discontinuous to Continuous Parsing with Pointer Network Reordering Daniel Fernández-González Carlos Gómez-Rodríguez 35 11 0 13 Apr 2021