OPT: Open Pre-trained Transformer Language Models

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,454 papers shown
Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness
Shuaichen Chang
Jun Wang
Mingwen Dong
Lin Pan
Henghui Zhu
...
William Yang Wang
Zhiguo Wang
Vittorio Castelli
Patrick Ng
Bing Xiang
OOD
49
34
0
21 Jan 2023
Prompting Large Language Model for Machine Translation: A Case Study
Biao Zhang
Barry Haddow
Alexandra Birch
LRM
32
278
0
17 Jan 2023
RILS: Masked Visual Reconstruction in Language Semantic Space
Shusheng Yang
Yixiao Ge
Kun Yi
Dian Li
Ying Shan
Xiaohu Qie
Xinggang Wang
CLIP
43
11
0
17 Jan 2023
TikTalk: A Video-Based Dialogue Dataset for Multi-Modal Chitchat in Real World
Hongpeng Lin
Ludan Ruan
Wenke Xia
Peiyu Liu
Jing Wen
...
Di Hu
Ruihua Song
Wayne Xin Zhao
Qin Jin
Zhiwu Lu
VGen
33
9
0
14 Jan 2023
Leveraging Large Language Models to Power Chatbots for Collecting User Self-Reported Data
Jing Wei
Sungdong Kim
Hyunhoon Jung
Young-Ho Kim
32
82
0
14 Jan 2023
See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning
Zhenfang Chen
Qinhong Zhou
Songlin Yang
Yining Hong
Hao Zhang
Chuang Gan
LRM
VLM
42
36
0
12 Jan 2023
The Role of Interactive Visualization in Explaining (Large) NLP Models: from Data to Inference
R. Brath
Daniel A. Keim
Johannes Knittel
Shimei Pan
Pia Sommerauer
Hendrik Strobelt
19
11
0
11 Jan 2023
Does compressing activations help model parallel training?
S. Bian
Dacheng Li
Hongyi Wang
Eric P. Xing
Shivaram Venkataraman
35
5
0
06 Jan 2023
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
Elias Frantar
Dan Alistarh
VLM
35
643
0
02 Jan 2023
Rethinking with Retrieval: Faithful Large Language Model Inference
Hangfeng He
Hongming Zhang
Dan Roth
KELM
LRM
149
161
0
31 Dec 2022
Targeted Phishing Campaigns using Large Scale Language Models
Rabimba Karanjai
18
39
0
30 Dec 2022
Hungry Hungry Hippos: Towards Language Modeling with State Space Models
Daniel Y. Fu
Tri Dao
Khaled Kamal Saab
A. Thomas
Atri Rudra
Christopher Ré
78
372
0
28 Dec 2022
OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization
Srinivasan Iyer
Xi Lin
Ramakanth Pasunuru
Todor Mihaylov
Daniel Simig
...
Jeff Wang
Christopher Dewan
Asli Celikyilmaz
Luke Zettlemoyer
Veselin Stoyanov
ALM
44
261
0
22 Dec 2022
JASMINE: Arabic GPT Models for Few-Shot Learning
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
AbdelRahim Elmadany
Alcides Alcoba Inciarte
Md. Tawkat Islam Khondaker
33
7
0
21 Dec 2022
When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
Alex Troy Mallen
Akari Asai
Victor Zhong
Rajarshi Das
Daniel Khashabi
Hannaneh Hajishirzi
RALM
HILM
KELM
64
529
0
20 Dec 2022
Go-tuning: Improving Zero-shot Learning Abilities of Smaller Language Models
Jingjing Xu
Qingxiu Dong
Hongyi Liu
Lei Li
ALM
LRM
33
1
0
20 Dec 2022
Is GPT-3 a Good Data Annotator?
Bosheng Ding
Chengwei Qin
Linlin Liu
Yew Ken Chia
Chenyu You
Boyang Albert Li
Lidong Bing
37
236
0
20 Dec 2022
Geographic and Geopolitical Biases of Language Models
Fahim Faisal
Antonios Anastasopoulos
30
20
0
20 Dec 2022
Towards Reasoning in Large Language Models: A Survey
Jie Huang
Kevin Chen-Chuan Chang
LM&MA
ELM
LRM
32
586
0
20 Dec 2022
HINT: Hypernetwork Instruction Tuning for Efficient Zero- & Few-Shot Generalisation
Hamish Ivison
Akshita Bhagia
Yizhong Wang
Hannaneh Hajishirzi
Matthew E. Peters
53
16
0
20 Dec 2022
On the Blind Spots of Model-Based Evaluation Metrics for Text Generation
Tianxing He
Jingyu Zhang
Tianle Wang
Sachin Kumar
Kyunghyun Cho
James R. Glass
Yulia Tsvetkov
45
44
0
20 Dec 2022
Inducing Character-level Structure in Subword-based Language Models with Type-level Interchange Intervention Training
Jing-ling Huang
Zhengxuan Wu
Kyle Mahowald
Christopher Potts
29
13
0
19 Dec 2022
Training Trajectories of Language Models Across Scales
Mengzhou Xia
Mikel Artetxe
Chunting Zhou
Xi Lin
Ramakanth Pasunuru
Danqi Chen
Luke Zettlemoyer
Ves Stoyanov
AIFin
LRM
39
55
0
19 Dec 2022
The case for 4-bit precision: k-bit Inference Scaling Laws
Tim Dettmers
Luke Zettlemoyer
MQ
27
218
0
19 Dec 2022
Explanation Regeneration via Information Bottleneck
Qintong Li
Zhiyong Wu
Lingpeng Kong
Wei Bi
30
3
0
19 Dec 2022
BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting
Zheng-Xin Yong
Hailey Schoelkopf
Niklas Muennighoff
Alham Fikri Aji
David Ifeoluwa Adelani
...
Genta Indra Winata
Stella Biderman
Edward Raff
Dragomir R. Radev
Vassilina Nikoulina
CLL
VLM
AI4CE
LRM
35
81
0
19 Dec 2022
I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation
Chandra Bhagavatula
Jena D. Hwang
Doug Downey
Ronan Le Bras
Ximing Lu
Lianhui Qin
Keisuke Sakaguchi
Swabha Swayamdipta
Peter West
Yejin Choi
28
34
0
19 Dec 2022
Low-Resource Authorship Style Transfer: Can Non-Famous Authors Be Imitated?
Ajay Patel
Matthew Wiesner
Chris Callison-Burch
38
7
0
18 Dec 2022
Language model acceptability judgements are not always robust to context
Koustuv Sinha
Jon Gauthier
Aaron Mueller
Kanishka Misra
Keren Fuentes
R. Levy
Adina Williams
23
18
0
18 Dec 2022
MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text Generation
Swarnadeep Saha
Xinyan Velocity Yu
Joey Tianyi Zhou
Ramakanth Pasunuru
Asli Celikyilmaz
ReLM
LRM
30
10
0
16 Dec 2022
Controllable Text Generation via Probability Density Estimation in the Latent Space
Yuxuan Gu
Xiaocheng Feng
Sicheng Ma
Lingyuan Zhang
Heng Gong
Weihong Zhong
Bing Qin
29
18
0
16 Dec 2022
Improving Chess Commentaries by Combining Language Models with Symbolic Reasoning Engines
Andrew Lee
David Wu
Emily Dinan
M. Lewis
LRM
33
7
0
15 Dec 2022
On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning
Omar Shaikh
Hongxin Zhang
William B. Held
Michael S. Bernstein
Diyi Yang
ReLM
LRM
35
186
0
15 Dec 2022
Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models
Bernd Bohnet
Vinh Q. Tran
Pat Verga
Roee Aharoni
D. Andor
...
Michael Collins
Dipanjan Das
Donald Metzler
Slav Petrov
Kellie Webster
43
60
0
15 Dec 2022
Prompting Is Programming: A Query Language for Large Language Models
Luca Beurer-Kellner
Marc Fischer
Martin Vechev
LRM
50
95
0
12 Dec 2022
DC-MBR: Distributional Cooling for Minimum Bayesian Risk Decoding
Jianhao Yan
Jin Xu
Fandong Meng
Jie Zhou
Yue Zhang
24
3
0
08 Dec 2022
Demystifying Prompts in Language Models via Perplexity Estimation
Hila Gonen
Srini Iyer
Terra Blevins
Noah A. Smith
Luke Zettlemoyer
LRM
46
196
0
08 Dec 2022
The problem with AI consciousness: A neurogenetic case against synthetic sentience
Yoshija Walter
L. Zbinden
16
1
0
07 Dec 2022
I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification
Muhammad Ferjad Naeem
Muhammad Gul Zain Ali Khan
Yongqin Xian
Muhammad Zeshan Afzal
D. Stricker
Luc Van Gool
F. Tombari
VLM
35
52
0
05 Dec 2022
Momentum Decoding: Open-ended Text Generation As Graph Exploration
Tian Lan
Yixuan Su
Shuhang Liu
Heyan Huang
Xian-Ling Mao
47
5
0
05 Dec 2022
Understanding How Model Size Affects Few-shot Instruction Prompting
Ayrton San Joaquin
Ardy Haroen
29
0
0
04 Dec 2022
Nonparametric Masked Language Modeling
Sewon Min
Weijia Shi
M. Lewis
Xilun Chen
Wen-tau Yih
Hannaneh Hajishirzi
Luke Zettlemoyer
RALM
50
48
0
02 Dec 2022
Extensible Prompts for Language Models on Zero-shot Language Style Customization
Tao Ge
Jing Hu
Li Dong
Shaoguang Mao
Yanqiu Xia
Xun Wang
Si-Qing Chen
Furu Wei
VLM
51
6
0
01 Dec 2022
Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models
Peter Henderson
E. Mitchell
Christopher D. Manning
Dan Jurafsky
Chelsea Finn
27
47
0
27 Nov 2022
Retrieval-Augmented Multimodal Language Modeling
Michihiro Yasunaga
Armen Aghajanyan
Weijia Shi
Rich James
J. Leskovec
Percy Liang
M. Lewis
Luke Zettlemoyer
Wen-tau Yih
RALM
22
95
0
22 Nov 2022
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Guangxuan Xiao
Ji Lin
Mickael Seznec
Hao Wu
Julien Demouth
Song Han
MQ
101
749
0
18 Nov 2022
Ignore Previous Prompt: Attack Techniques For Language Models
Fábio Perez
Ian Ribeiro
SILM
51
403
0
17 Nov 2022
Galactica: A Large Language Model for Science
Ross Taylor
Marcin Kardas
Guillem Cucurull
Thomas Scialom
Anthony Hartshorn
Elvis Saravia
Andrew Poulton
Viktor Kerkez
Robert Stojnic
ELM
ReLM
46
740
0
16 Nov 2022
GAMMT: Generative Ambiguity Modeling Using Multiple Transformers
Xingcheng Xu
30
0
0
16 Nov 2022
On the Compositional Generalization Gap of In-Context Learning
Arian Hosseini
Ankit Vani
Dzmitry Bahdanau
Alessandro Sordoni
Rameswar Panda
32
24
0
15 Nov 2022