ResearchTrend.AI
arXiv: 2204.06745
GPT-NeoX-20B: An Open-Source Autoregressive Language Model

14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach

Papers citing "GPT-NeoX-20B: An Open-Source Autoregressive Language Model"

50 / 557 papers shown
Title
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
Haojun Xia
Zhen Zheng
Yuchao Li
Donglin Zhuang
Zhongzhu Zhou
Xiafei Qiu
Yong Li
Wei Lin
Shuaiwen Leon Song
67
11
0
19 Sep 2023
Generative modeling, design and analysis of spider silk protein sequences for enhanced mechanical properties
Wei Lu
David L. Kaplan
Markus J. Buehler
21
31
0
18 Sep 2023
Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?
Xiangru Tang
Yiming Zong
Jason Phang
Yilun Zhao
Wangchunshu Zhou
Arman Cohan
Mark B. Gerstein
LMTD
ELM
ALM
44
8
0
16 Sep 2023
CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending
Shiyi Zhu
Jingting Ye
Wei Jiang
Siqiao Xue
Qi Zhang
Yifan Wu
Jianguo Li
32
4
0
15 Sep 2023
CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain Performance and Calibration
Rachneet Sachdeva
Martin Tutek
Iryna Gurevych
OODD
32
11
0
14 Sep 2023
EarthPT: a time series foundation model for Earth Observation
Michael J. Smith
Luke Fleming
James E. Geach
AI4TS
22
7
0
13 Sep 2023
From Base to Conversational: Japanese Instruction Dataset and Tuning Large Language Models
Masahiro Suzuki
Masanori Hirano
Hiroki Sakaji
39
6
0
07 Sep 2023
Data-Juicer: A One-Stop Data Processing System for Large Language Models
Daoyuan Chen
Yilun Huang
Zhijian Ma
Hesen Chen
Xuchen Pan
...
Zhaoyang Liu
Jinyang Gao
Yaliang Li
Bolin Ding
Jingren Zhou
SyDa
VLM
31
30
0
05 Sep 2023
RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model
Fengxiang Bie
Yibo Yang
Zhongzhu Zhou
Adam Ghanem
Minjia Zhang
...
Pareesa Ameneh Golnari
David A. Clifton
Yuxiong He
Dacheng Tao
Shuaiwen Leon Song
EGVM
33
20
0
02 Sep 2023
YaRN: Efficient Context Window Extension of Large Language Models
Bowen Peng
Jeffrey Quesnelle
Honglu Fan
Enrico Shippole
OSLM
20
226
0
31 Aug 2023
Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models
Kaiyuan Gao
Su He
Zhenyu He
Jiacheng Lin
Qizhi Pei
Jie Shao
Wei Zhang
LM&MA
SyDa
38
4
0
27 Aug 2023
Code Llama: Open Foundation Models for Code
Baptiste Rozière
Jonas Gehring
Fabian Gloeckle
Sten Sootla
Itai Gat
...
Hugo Touvron
Louis Martin
Nicolas Usunier
Thomas Scialom
Gabriel Synnaeve
ELM
ALM
63
1,911
0
24 Aug 2023
Anonymity at Risk? Assessing Re-Identification Capabilities of Large Language Models
Alex Nyffenegger
Matthias Sturmer
Joel Niklaus
34
6
0
22 Aug 2023
Instruction Tuning for Large Language Models: A Survey
Shengyu Zhang
Linfeng Dong
Xiaoya Li
Sen Zhang
Xiaofei Sun
...
Jiwei Li
Runyi Hu
Tianwei Zhang
Fei Wu
Guoyin Wang
LM&MA
24
549
0
21 Aug 2023
Large Language Models for Software Engineering: A Systematic Literature Review
Xinying Hou
Yanjie Zhao
Yue Liu
Zhou Yang
Kailong Wang
Li Li
Xiapu Luo
David Lo
John C. Grundy
Haoyu Wang
39
332
0
21 Aug 2023
PMET: Precise Model Editing in a Transformer
Xiaopeng Li
Shasha Li
Shezheng Song
Jing Yang
Jun Ma
Jie Yu
KELM
34
119
0
17 Aug 2023
AudioFormer: Audio Transformer learns audio feature representations from discrete acoustic codes
Zhaohui Li
Haitao Wang
Xinghua Jiang
40
1
0
14 Aug 2023
OctoPack: Instruction Tuning Code Large Language Models
Niklas Muennighoff
Qian Liu
A. Zebaze
Qinkai Zheng
Binyuan Hui
Terry Yue Zhuo
Swayam Singh
Xiangru Tang
Leandro von Werra
Shayne Longpre
VLM
ALM
71
120
0
14 Aug 2023
Large Language Models for Information Retrieval: A Survey
Yutao Zhu
Huaying Yuan
Shuting Wang
Jiongnan Liu
Wenhan Liu
Chenlong Deng
Haonan Chen
Zhicheng Dou
Ji-Rong Wen
KELM
57
288
0
14 Aug 2023
Three Ways of Using Large Language Models to Evaluate Chat
Ondřej Plátek
Vojtěch Hudeček
Patrícia Schmidtová
Mateusz Lango
Ondrej Dusek
ALM
19
6
0
12 Aug 2023
Bringing order into the realm of Transformer-based language models for artificial intelligence and law
C. M. Greco
Andrea Tagarelli
AILaw
30
19
0
10 Aug 2023
SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore
Sewon Min
Suchin Gururangan
Eric Wallace
Hannaneh Hajishirzi
Noah A. Smith
Luke Zettlemoyer
AILaw
26
63
0
08 Aug 2023
Large Language Model Prompt Chaining for Long Legal Document Classification
Dietrich Trautmann
ELM
AILaw
29
10
0
08 Aug 2023
Continual Pre-Training of Large Language Models: How to (re)warm your model?
Kshitij Gupta
Benjamin Thérien
Adam Ibrahim
Mats L. Richter
Quentin G. Anthony
Eugene Belilovsky
Irina Rish
Timothée Lesort
KELM
35
99
0
08 Aug 2023
Evaluating and Explaining Large Language Models for Code Using Syntactic Structures
David Nader-Palacio
Alejandro Velasco
Daniel Rodríguez-Cárdenas
Kevin Moran
Denys Poshyvanyk
34
8
0
07 Aug 2023
RecycleGPT: An Autoregressive Language Model with Recyclable Module
Yu Jiang
Qiaozhi He
Xiaomin Zhuang
Zhihua Wu
Kunpeng Wang
Wenlai Zhao
Guangwen Yang
KELM
28
3
0
07 Aug 2023
Learning to Paraphrase Sentences to Different Complexity Levels
Alison Chi
Li-Kuang Chen
Yi-Chen Chang
Shu-Hui Lee
Jason J. S. Chang
24
10
0
04 Aug 2023
TransNormerLLM: A Faster and Better Large Language Model with Improved TransNormer
Zhen Qin
Dong Li
Weigao Sun
Weixuan Sun
Xuyang Shen
...
Yunshen Wei
Baohong Lv
Xiao Luo
Yu Qiao
Yiran Zhong
43
15
0
27 Jul 2023
Exploiting the Potential of Seq2Seq Models as Robust Few-Shot Learners
Jihyeon Janel Lee
Dain Kim
Doohae Jung
Boseop Kim
Kyoung-Woon On
28
0
0
27 Jul 2023
Evaluating the Ripple Effects of Knowledge Editing in Language Models
Roi Cohen
Eden Biran
Ori Yoran
Amir Globerson
Mor Geva
KELM
42
157
0
24 Jul 2023
A Zero-shot and Few-shot Study of Instruction-Finetuned Large Language Models Applied to Clinical and Biomedical Tasks
Yanis Labrak
Mickael Rouvier
Richard Dufour
LM&MA
29
25
0
22 Jul 2023
FinPT: Financial Risk Prediction with Profile Tuning on Pretrained Foundation Models
Yuwei Yin
Yazheng Yang
Jian Yang
Qi Liu
23
12
0
22 Jul 2023
FinGPT: Democratizing Internet-scale Data for Financial Large Language Models
Xiao-Yang Liu
Guoxuan Wang
Hongyang Yang
Daochen Zha
AIFin
44
43
0
19 Jul 2023
Overthinking the Truth: Understanding how Language Models Process False Demonstrations
Danny Halawi
Jean-Stanislas Denain
Jacob Steinhardt
33
54
0
18 Jul 2023
On the application of Large Language Models for language teaching and assessment technology
Andrew Caines
Luca Benedetto
Shiva Taslimipoor
Christopher Davis
Yuan Gao
...
Marek Rei
H. Yannakoudakis
Andrew Mullooly
D. Nicholls
P. Buttery
ELM
24
43
0
17 Jul 2023
Generating Benchmarks for Factuality Evaluation of Language Models
Dor Muhlgay
Ori Ram
Inbal Magar
Yoav Levine
Nir Ratner
Yonatan Belinkov
Omri Abend
Kevin Leyton-Brown
Amnon Shashua
Y. Shoham
HILM
33
91
0
13 Jul 2023
A Comprehensive Overview of Large Language Models
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Ajmal Mian
OffRL
70
538
0
12 Jul 2023
QIGen: Generating Efficient Kernels for Quantized Inference on Large Language Models
Tommaso Pegolotti
Elias Frantar
Dan Alistarh
Markus Püschel
MQ
24
3
0
07 Jul 2023
Evaluating Biased Attitude Associations of Language Models in an Intersectional Context
Shiva Omrani Sabbaghi
Robert Wolfe
Aylin Caliskan
26
22
0
07 Jul 2023
Several categories of Large Language Models (LLMs): A Short Survey
Saurabh Pahune
Manoj Chandrasekharan
AILaw
25
14
0
05 Jul 2023
Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review
M. Wong
Shangxin Guo
Ching Nam Hang
Siu-Wai Ho
C. Tan
42
78
0
04 Jul 2023
InstructEval: Systematic Evaluation of Instruction Selection Methods
Anirudh Ajith
Chris Pan
Mengzhou Xia
Ameet Deshpande
Karthik R. Narasimhan
ELM
33
16
0
01 Jul 2023
Mirage: Towards Low-interruption Services on Batch GPU Clusters with Reinforcement Learning
Qi-Dong Ding
Pengfei Zheng
Shreyas Kudari
Shivaram Venkataraman
Zhao-jie Zhang
VLM
OffRL
16
3
0
25 Jun 2023
H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Zhenyu Zhang
Ying Sheng
Dinesh Manocha
Tianlong Chen
Lianmin Zheng
...
Yuandong Tian
Christopher Ré
Clark W. Barrett
Zhangyang Wang
Beidi Chen
VLM
66
261
0
24 Jun 2023
Long-range Language Modeling with Self-retrieval
Ohad Rubin
Jonathan Berant
RALM
KELM
33
18
0
23 Jun 2023
Textbooks Are All You Need
Suriya Gunasekar
Yi Zhang
J. Aneja
C. C. T. Mendes
Allison Del Giorno
...
Sébastien Bubeck
Ronen Eldan
Adam Tauman Kalai
Y. Lee
Yuan-Fang Li
AI4CE
ALM
SyDa
40
392
0
20 Jun 2023
Guiding Language Models of Code with Global Context using Monitors
Lakshya A Agrawal
Aditya Kanade
Navin Goyal
Shuvendu K. Lahiri
S. Rajamani
40
23
0
19 Jun 2023
ZeRO++: Extremely Efficient Collective Communication for Giant Model Training
Guanhua Wang
Heyang Qin
S. A. Jacobs
Connor Holmes
Samyam Rajbhandari
Olatunji Ruwase
Feng Yan
Lei Yang
Yuxiong He
VLM
65
58
0
16 Jun 2023
You Don't Need Robust Machine Learning to Manage Adversarial Attack Risks
Edward Raff
M. Benaroch
Andrew L. Farris
AAML
27
2
0
16 Jun 2023
KoLA: Carefully Benchmarking World Knowledge of Large Language Models
Jifan Yu
Xiaozhi Wang
Shangqing Tu
S. Cao
Daniel Zhang-Li
...
Lei Hou
Zhiyuan Liu
Bin Xu
Jie Tang
Juanzi Li
ELM
ALM
41
66
0
15 Jun 2023