ResearchTrend.AI
GPT-NeoX-20B: An Open-Source Autoregressive Language Model

14 April 2022
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
Laurence Golding
Horace He
Connor Leahy
Kyle McDonell
Jason Phang
Michael Pieler
USVSN Sai Prashanth
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach

Papers citing "GPT-NeoX-20B: An Open-Source Autoregressive Language Model"

50 / 554 papers shown
LA4SR: illuminating the dark proteome with generative AI
David R. Nelson
Ashish Kumar Jaiswal
Noha Ismail
Alexandra Mystikou
Kourosh Salehi-Ashtiani
22
0
0
11 Nov 2024
Towards Low-Resource Harmful Meme Detection with LMM Agents
Jianzhao Huang
Hongzhan Lin
Ziyan Liu
Ziyang Luo
Guang Chen
Jing Ma
33
2
0
08 Nov 2024
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
Siming Huang
Tianhao Cheng
J.K. Liu
Jiaran Hao
L. Song
...
Ge Zhang
Zili Wang
Yuan Qi
Yinghui Xu
Wei Chu
ALM
77
17
0
07 Nov 2024
Photon: Federated LLM Pre-Training
Lorenzo Sani
Alex Iacob
Zeyu Cao
Royson Lee
Bill Marino
...
Dongqi Cai
Zexi Li
Wanru Zhao
Xinchi Qiu
Nicholas D. Lane
AI4CE
33
7
0
05 Nov 2024
Normalization Layer Per-Example Gradients are Sufficient to Predict Gradient Noise Scale in Transformers
Gavia Gray
Aman Tiwari
Shane Bergsma
Joel Hestness
27
1
0
01 Nov 2024
GigaCheck: Detecting LLM-generated Content
Irina Tolstykh
Aleksandra Tsybina
Sergey Yakubson
Aleksandr Gordeev
Vladimir Dokholyan
Maksim Kuprashevich
DeLMO
42
1
0
31 Oct 2024
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
Haiyang Wang
Yue Fan
Muhammad Ferjad Naeem
Yongqin Xian
J. E. Lenssen
Liwei Wang
F. Tombari
Bernt Schiele
46
2
0
30 Oct 2024
SVIP: Towards Verifiable Inference of Open-source Large Language Models
Yifan Sun
Yuhang Li
Yue Zhang
Yuchen Jin
Huan Zhang
28
2
0
29 Oct 2024
M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation
Jiaheng Liu
Ken Deng
Congnan Liu
Jian Yang
Shukai Liu
...
Zekun Wang
Guoan Zhang
Bangyu Xiang
Wenbo Su
Jian Xu
69
4
0
28 Oct 2024
DeTeCtive: Detecting AI-generated Text via Multi-Level Contrastive Learning
Xun Guo
Shan Zhang
Yongxin He
Ting Zhang
Wanquan Feng
Haibin Huang
Chongyang Ma
DeLMO
47
5
0
28 Oct 2024
Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading
Avinash Maurya
Jie Ye
M. Rafique
Franck Cappello
Bogdan Nicolae
29
1
0
26 Oct 2024
Reinforcement Learning for Aligning Large Language Models Agents with Interactive Environments: Quantifying and Mitigating Prompt Overfitting
Mohamed Salim Aissi
Clément Romac
Thomas Carta
Sylvain Lamprier
Pierre-Yves Oudeyer
Olivier Sigaud
Laure Soulier
Nicolas Thome
24
2
0
25 Oct 2024
Self-Explained Keywords Empower Large Language Models for Code Generation
Lishui Fan
Mouxiang Chen
Zhongxin Liu
40
1
0
21 Oct 2024
Scalable Data Ablation Approximations for Language Models through Modular Training and Merging
Clara Na
Ian H. Magnusson
A. Jha
Tom Sherborne
Emma Strubell
Jesse Dodge
Pradeep Dasigi
MoMe
36
5
0
21 Oct 2024
Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws
Yiding Jiang
Allan Zhou
Zhili Feng
Sadhika Malladi
J. Zico Kolter
39
15
0
15 Oct 2024
TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models
Mu Cai
Reuben Tan
Jianrui Zhang
Bocheng Zou
Kai Zhang
...
Yao Dou
J. Park
Jianfeng Gao
Yong Jae Lee
Jianwei Yang
44
12
0
14 Oct 2024
Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
Tongtian Yue
Longteng Guo
Jie Cheng
Xuange Gao
Jiaheng Liu
MoE
36
0
0
14 Oct 2024
LLM-SmartAudit: Advanced Smart Contract Vulnerability Detection
Zhiyuan Wei
Jing Sun
Zijiang Zhang
Xianhao Zhang
Meng Li
Zhe Hou
36
4
0
12 Oct 2024
Enterprise Benchmarks for Large Language Model Evaluation
Bing Zhang
Mikio Takeuchi
Ryo Kawahara
Shubhi Asthana
Md. Maruf Hossain
Guang-Jie Ren
Kate Soule
Yada Zhu
ELM
42
2
0
11 Oct 2024
PLaMo-100B: A Ground-Up Language Model Designed for Japanese Proficiency
Preferred Elements:
Kenshin Abe
Kaizaburo Chubachi
Yasuhiro Fujita
...
Yoshihiko Ozaki
Shotaro Sano
Shuji Suzuki
Tianqi Xu
Toshihiko Yanase
36
0
0
10 Oct 2024
LecPrompt: A Prompt-based Approach for Logical Error Correction with CodeBERT
Zhenyu Xu
Victor S. Sheng
KELM
18
0
0
10 Oct 2024
Detecting Training Data of Large Language Models via Expectation Maximization
Gyuwan Kim
Yang Li
Evangelia Spiliopoulou
Jie Ma
Miguel Ballesteros
William Yang Wang
MIALM
95
4
2
10 Oct 2024
Which Programming Language and What Features at Pre-training Stage Affect Downstream Logical Inference Performance?
Fumiya Uchiyama
Takeshi Kojima
Andrew Gambardella
Qi Cao
Yusuke Iwasawa
Yutaka Matsuo
LRM
ReLM
28
3
0
09 Oct 2024
FreqMark: Frequency-Based Watermark for Sentence-Level Detection of LLM-Generated Text
Zhenyu Xu
Anton van den Hengel
Victor S. Sheng
WaLM
46
2
0
09 Oct 2024
Fine-tuning can Help Detect Pretraining Data from Large Language Models
H. Zhang
Songxin Zhang
Bingyi Jing
Hongxin Wei
43
0
0
09 Oct 2024
Round and Round We Go! What makes Rotary Positional Encodings useful?
Federico Barbero
Alex Vitvitskyi
Christos Perivolaropoulos
Razvan Pascanu
Petar Velickovic
75
16
0
08 Oct 2024
DEPT: Decoupled Embeddings for Pre-training Language Models
Alex Iacob
Lorenzo Sani
Meghdad Kurmanji
William F. Shen
Xinchi Qiu
Dongqi Cai
Yan Gao
Nicholas D. Lane
VLM
145
0
0
07 Oct 2024
LLM-Pilot: Characterize and Optimize Performance of your LLM Inference Services
Małgorzata Łazuka
Andreea Anghel
Thomas Parnell
27
10
0
03 Oct 2024
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
Ulyana Piterbarg
Lerrel Pinto
Rob Fergus
SyDa
37
2
0
03 Oct 2024
Creative and Context-Aware Translation of East Asian Idioms with GPT-4
Kenan Tang
Peiyang Song
Yao Qin
Xifeng Yan
28
1
0
01 Oct 2024
Zero-Shot Detection of LLM-Generated Text using Token Cohesiveness
Shixuan Ma
Quan Wang
40
2
0
25 Sep 2024
Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method
Weichao Zhang
Ruqing Zhang
Jiafeng Guo
Maarten de Rijke
Yixing Fan
Xueqi Cheng
35
8
0
23 Sep 2024
Expanding Expressivity in Transformer Models with MöbiusAttention
Anna-Maria Halacheva
M. Nayyeri
Steffen Staab
25
1
0
08 Sep 2024
Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding
Cheng Wang
Yiwei Wang
Bryan Hooi
Yujun Cai
Nanyun Peng
Kai-Wei Chang
42
2
0
05 Sep 2024
The AdEMAMix Optimizer: Better, Faster, Older
Matteo Pagliardini
Pierre Ablin
David Grangier
ODL
28
8
0
05 Sep 2024
Comparing Discrete and Continuous Space LLMs for Speech Recognition
Yaoxun Xu
Shi-Xiong Zhang
Jianwei Yu
Zhiyong Wu
Dong Yu
AuLLM
17
3
0
01 Sep 2024
A Survey of Large Language Models for European Languages
Wazir Ali
S. Pyysalo
39
2
0
27 Aug 2024
Internal and External Knowledge Interactive Refinement Framework for Knowledge-Intensive Question Answering
Haowei Du
Dongyan Zhao
KELM
30
0
0
23 Aug 2024
ONSEP: A Novel Online Neural-Symbolic Framework for Event Prediction Based on Large Language Model
Xuanqing Yu
Wangtao Sun
Jingwei Li
Kang Liu
Chengbao Liu
Jie Tan
OffRL
AI4TS
38
3
0
14 Aug 2024
Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data?
J. Hayase
Alisa Liu
Yejin Choi
Sewoong Oh
Noah A. Smith
37
10
0
23 Jul 2024
Consent in Crisis: The Rapid Decline of the AI Data Commons
Shayne Longpre
Robert Mahari
Ariel N. Lee
Campbell Lund
Hamidah Oderinwale
...
Hanlin Li
Daphne Ippolito
Sara Hooker
Jad Kabbara
Sandy Pentland
69
36
0
20 Jul 2024
A Survey on Symbolic Knowledge Distillation of Large Language Models
Kamal Acharya
Alvaro Velasquez
H. Song
SyDa
41
5
0
12 Jul 2024
AutoBencher: Towards Declarative Benchmark Construction
Xiang Lisa Li
E. Liu
Percy Liang
Tatsunori Hashimoto
48
2
0
11 Jul 2024
A Review of the Challenges with Massive Web-mined Corpora Used in Large Language Models Pre-Training
Michał Perełkiewicz
Rafał Poświata
40
1
0
10 Jul 2024
Who is better at math, Jenny or Jingzhen? Uncovering Stereotypes in Large Language Models
Zara Siddique
Liam D. Turner
Luis Espinosa-Anke
34
0
0
09 Jul 2024
Prompting Techniques for Secure Code Generation: A Systematic Investigation
Catherine Tony
Nicolás E. Díaz Ferreyra
Markus Mutas
Salem Dhiff
Riccardo Scandariato
SILM
76
9
0
09 Jul 2024
LLMBox: A Comprehensive Library for Large Language Models
Tianyi Tang
Yiwen Hu
Bingqian Li
Wenyang Luo
Zijing Qin
...
Chunxuan Xia
Junyi Li
Kun Zhou
Wayne Xin Zhao
Ji-Rong Wen
31
1
0
08 Jul 2024
Looking into Black Box Code Language Models
Muhammad Umair Haider
Umar Farooq
A. B. Siddique
Mark Marron
39
2
0
05 Jul 2024
Leveraging Graph Structures to Detect Hallucinations in Large Language Models
Noa Nonkes
Sergei Agaronian
Evangelos Kanoulas
Roxana Petcu
24
1
0
05 Jul 2024
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Yu Sun
Xinhao Li
Karan Dalal
Jiarui Xu
Arjun Vikram
...
Xinlei Chen
Xiaolong Wang
Sanmi Koyejo
Tatsunori Hashimoto
Carlos Guestrin
60
92
0
05 Jul 2024