ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2205.01068
  4. Cited By
OPT: Open Pre-trained Transformer Language Models

OPT: Open Pre-trained Transformer Language Models

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE
ArXivPDFHTML

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,456 papers shown
Title
Contextual Feature Extraction Hierarchies Converge in Large Language
  Models and the Brain
Contextual Feature Extraction Hierarchies Converge in Large Language Models and the Brain
Gavin Mischler
Yinghao Aaron Li
Stephan Bickel
A. Mehta
N. Mesgarani
32
23
0
31 Jan 2024
When Large Language Models Meet Vector Databases: A Survey
When Large Language Models Meet Vector Databases: A Survey
Zhi Jing
Yongye Su
Yikun Han
Bo Yuan
Haiyun Xu
Chunjiang Liu
Kehai Chen
Min Zhang
61
36
0
30 Jan 2024
EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor
  Image Comprehension in Remote Sensing Domain
EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain
Wei Zhang
Miaoxin Cai
Tong Zhang
Zhuang Yin
Xuerui Mao
44
92
0
30 Jan 2024
H2O-Danube-1.8B Technical Report
H2O-Danube-1.8B Technical Report
Philipp Singer
Pascal Pfeiffer
Yauhen Babakhin
Maximilian Jeblick
Nischay Dhankhar
Gabor Fodor
SriSatish Ambati
VLM
29
8
0
30 Jan 2024
Security and Privacy Challenges of Large Language Models: A Survey
Security and Privacy Challenges of Large Language Models: A Survey
B. Das
M. H. Amini
Yanzhao Wu
PILM
ELM
26
108
0
30 Jan 2024
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on
  E-Branchformer
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer
Yifan Peng
Jinchuan Tian
William Chen
Siddhant Arora
Brian Yan
...
Kwanghee Choi
Jiatong Shi
Xuankai Chang
Jee-weon Jung
Shinji Watanabe
VLM
OSLM
39
40
0
30 Jan 2024
TeenyTinyLlama: open-source tiny language models trained in Brazilian
  Portuguese
TeenyTinyLlama: open-source tiny language models trained in Brazilian Portuguese
N. Corrêa
Sophia Falk
Shiza Fatimah
Aniket Sen
N. D. Oliveira
32
9
0
30 Jan 2024
InternLM-XComposer2: Mastering Free-form Text-Image Composition and
  Comprehension in Vision-Language Large Model
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model
Xiao-wen Dong
Pan Zhang
Yuhang Zang
Yuhang Cao
Bin Wang
...
Conghui He
Xingcheng Zhang
Yu Qiao
Dahua Lin
Jiaqi Wang
VLM
MLLM
89
245
0
29 Jan 2024
Defining and Extracting generalizable interaction primitives from DNNs
Defining and Extracting generalizable interaction primitives from DNNs
Lu Chen
Siyu Lou
Benhao Huang
Quanshi Zhang
42
9
0
29 Jan 2024
VIALM: A Survey and Benchmark of Visually Impaired Assistance with Large
  Models
VIALM: A Survey and Benchmark of Visually Impaired Assistance with Large Models
Yi Zhao
Yilin Zhang
Rong Xiang
Jing Li
Hillming Li
48
16
0
29 Jan 2024
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
Bin Lin
Zhenyu Tang
Yang Ye
Jiaxi Cui
Bin Zhu
...
Jinfa Huang
Junwu Zhang
Yatian Pang
Munan Ning
Li-ming Yuan
VLM
MLLM
MoE
48
154
0
29 Jan 2024
Contextualization Distillation from Large Language Model for Knowledge
  Graph Completion
Contextualization Distillation from Large Language Model for Knowledge Graph Completion
Dawei Li
Zhen Tan
Tianlong Chen
Huan Liu
KELM
35
12
0
28 Jan 2024
Evaluating Gender Bias in Large Language Models via Chain-of-Thought
  Prompting
Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting
Masahiro Kaneko
Danushka Bollegala
Naoaki Okazaki
Timothy Baldwin
LRM
37
27
0
28 Jan 2024
To Burst or Not to Burst: Generating and Quantifying Improbable Text
To Burst or Not to Burst: Generating and Quantifying Improbable Text
Kuleen Sasse
Samuel Barham
Efsun Sarioglu Kayi
Edward W. Staley
DeLMO
27
1
0
27 Jan 2024
A Comprehensive Survey of Compression Algorithms for Language Models
A Comprehensive Survey of Compression Algorithms for Language Models
Seungcheol Park
Jaehyeon Choi
Sojin Lee
U. Kang
MQ
36
12
0
27 Jan 2024
Unlearning Traces the Influential Training Data of Language Models
Unlearning Traces the Influential Training Data of Language Models
Masaru Isonuma
Ivan Titov
MU
34
7
0
26 Jan 2024
HiFT: A Hierarchical Full Parameter Fine-Tuning Strategy
HiFT: A Hierarchical Full Parameter Fine-Tuning Strategy
Yongkang Liu
Yiqun Zhang
Qian Li
Tong Liu
Shi Feng
Daling Wang
Yifei Zhang
Hinrich Schütze
40
6
0
26 Jan 2024
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Saleh Ashkboos
Maximilian L. Croci
Marcelo Gennari do Nascimento
Torsten Hoefler
James Hensman
VLM
132
148
0
26 Jan 2024
Large Language Model Adaptation for Financial Sentiment Analysis
Large Language Model Adaptation for Financial Sentiment Analysis
Pau Rodriguez Inserte
Mariam Nakhlé
Raheel Qader
Gaëtan Caillaut
Jingshu Liu
33
13
0
26 Jan 2024
Looking Right is Sometimes Right: Investigating the Capabilities of
  Decoder-only LLMs for Sequence Labeling
Looking Right is Sometimes Right: Investigating the Capabilities of Decoder-only LLMs for Sequence Labeling
David Dukić
Jan Šnajder
33
13
0
25 Jan 2024
The Case for Co-Designing Model Architectures with Hardware
The Case for Co-Designing Model Architectures with Hardware
Quentin G. Anthony
Jacob Hatef
Deepak Narayanan
Stella Biderman
Stas Bekman
Junqi Yin
Hari Subramoni
Hari Subramoni
Dhabaleswar Panda
3DV
27
4
0
25 Jan 2024
RomanSetu: Efficiently unlocking multilingual capabilities of Large
  Language Models via Romanization
RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization
Jaavid Aktar Husain
Raj Dabre
Aswanth Kumar
Jay Gala
Thanmay Jayakumar
Ratish Puduppully
Anoop Kunchukuttan
43
12
0
25 Jan 2024
Adaptive Text Watermark for Large Language Models
Adaptive Text Watermark for Large Language Models
Yepeng Liu
Yuheng Bu
WaLM
20
19
0
25 Jan 2024
Automated Root Causing of Cloud Incidents using In-Context Learning with
  GPT-4
Automated Root Causing of Cloud Incidents using In-Context Learning with GPT-4
Xuchao Zhang
Supriyo Ghosh
Chetan Bansal
Rujia Wang
Ming-Jie Ma
Yu Kang
Saravan Rajmohan
46
23
0
24 Jan 2024
MambaByte: Token-free Selective State Space Model
MambaByte: Token-free Selective State Space Model
Junxiong Wang
Tushaar Gangavarapu
Jing Nathan Yan
Alexander M. Rush
Mamba
44
37
0
24 Jan 2024
MM-LLMs: Recent Advances in MultiModal Large Language Models
MM-LLMs: Recent Advances in MultiModal Large Language Models
Duzhen Zhang
Yahan Yu
Jiahua Dong
Chenxing Li
Dan Su
Chenhui Chu
Dong Yu
OffRL
LRM
56
183
0
24 Jan 2024
ChatterBox: Multi-round Multimodal Referring and Grounding
ChatterBox: Multi-round Multimodal Referring and Grounding
Yunjie Tian
Tianren Ma
Lingxi Xie
Jihao Qiu
Xi Tang
Yuan Zhang
Jianbin Jiao
Qi Tian
Qixiang Ye
33
14
0
24 Jan 2024
ARGS: Alignment as Reward-Guided Search
ARGS: Alignment as Reward-Guided Search
Maxim Khanov
Jirayu Burapacheep
Yixuan Li
40
48
0
23 Jan 2024
Raidar: geneRative AI Detection viA Rewriting
Raidar: geneRative AI Detection viA Rewriting
Chengzhi Mao
Carl Vondrick
Hao Wang
Junfeng Yang
DeLMO
31
25
0
23 Jan 2024
BiTA: Bi-Directional Tuning for Lossless Acceleration in Large Language
  Models
BiTA: Bi-Directional Tuning for Lossless Acceleration in Large Language Models
Feng-Huei Lin
Hanling Yi
Hongbin Li
Yifan Yang
Xiaotian Yu
Guangming Lu
Rong Xiao
43
3
0
23 Jan 2024
Small Language Model Meets with Reinforced Vision Vocabulary
Small Language Model Meets with Reinforced Vision Vocabulary
Haoran Wei
Lingyu Kong
Jinyue Chen
Liang Zhao
Zheng Ge
En Yu
Jian‐Yuan Sun
Chunrui Han
Xiangyu Zhang
VLM
57
40
0
23 Jan 2024
Enhancing In-context Learning via Linear Probe Calibration
Enhancing In-context Learning via Linear Probe Calibration
Momin Abbas
Yi Zhou
Parikshit Ram
Nathalie Baracaldo
Horst Samulowitz
Theodoros Salonidis
Tianyi Chen
76
11
0
22 Jan 2024
Universal Neurons in GPT2 Language Models
Universal Neurons in GPT2 Language Models
Wes Gurnee
Theo Horsley
Zifan Carl Guo
Tara Rezaei Kheirkhah
Qinyi Sun
Will Hathaway
Neel Nanda
Dimitris Bertsimas
MILM
105
40
0
22 Jan 2024
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and
  Generating with Multimodal LLMs
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
Ling Yang
Zhaochen Yu
Chenlin Meng
Minkai Xu
Stefano Ermon
Tengjiao Wang
CoGe
DiffM
52
116
0
22 Jan 2024
SMUTF: Schema Matching Using Generative Tags and Hybrid Features
SMUTF: Schema Matching Using Generative Tags and Hybrid Features
Yu Zhang
Mei Di
Haozheng Luo
Chenwei Xu
Richard Tzong-Han Tsai
65
0
0
22 Jan 2024
CaraServe: CPU-Assisted and Rank-Aware LoRA Serving for Generative LLM
  Inference
CaraServe: CPU-Assisted and Rank-Aware LoRA Serving for Generative LLM Inference
Suyi Li
Hanfeng Lu
Tianyuan Wu
Minchen Yu
Qizhen Weng
Xusheng Chen
Yizhou Shan
Binhang Yuan
Wei Wang
56
12
0
20 Jan 2024
Inference without Interference: Disaggregate LLM Inference for Mixed
  Downstream Workloads
Inference without Interference: Disaggregate LLM Inference for Mixed Downstream Workloads
Cunchen Hu
Heyang Huang
Liangliang Xu
Xusheng Chen
Jiang Xu
...
Chenxi Wang
Sa Wang
Yungang Bao
Ninghui Sun
Yizhou Shan
DRL
41
63
0
20 Jan 2024
Medusa: Simple LLM Inference Acceleration Framework with Multiple
  Decoding Heads
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
Tianle Cai
Yuhong Li
Zhengyang Geng
Hongwu Peng
Jason D. Lee
De-huai Chen
Tri Dao
60
257
0
19 Jan 2024
Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs
Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs
Haritz Puerto
Martin Tutek
Somak Aditya
Xiaodan Zhu
Iryna Gurevych
ReCod
ReLM
LRM
58
11
0
18 Jan 2024
Veagle: Advancements in Multimodal Representation Learning
Veagle: Advancements in Multimodal Representation Learning
Rajat Chawla
Arkajit Datta
Tushar Verma
Adarsh Jha
Anmol Gautam
Ayush Vatsal
Sukrit Chaterjee
NS Mukunda
Ishaan Bhola
VLM
21
4
0
18 Jan 2024
Computing in the Era of Large Generative Models: From Cloud-Native to
  AI-Native
Computing in the Era of Large Generative Models: From Cloud-Native to AI-Native
Yao Lu
Song Bian
Lequn Chen
Yongjun He
Yulong Hui
...
Huanchen Zhang
Minjia Zhang
Qizhen Zhang
Tianyi Zhou
Danyang Zhuo
39
7
0
17 Jan 2024
MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in
  3D World
MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World
Yining Hong
Zishuo Zheng
Peihao Chen
Yian Wang
Junyan Li
Chuang Gan
26
33
0
16 Jan 2024
EmoLLMs: A Series of Emotional Large Language Models and Annotation
  Tools for Comprehensive Affective Analysis
EmoLLMs: A Series of Emotional Large Language Models and Annotation Tools for Comprehensive Affective Analysis
Zhiwei Liu
Kailai Yang
Tianlin Zhang
Qianqian Xie
Sophia Ananiadou
49
40
0
16 Jan 2024
Generative Multi-Modal Knowledge Retrieval with Large Language Models
Generative Multi-Modal Knowledge Retrieval with Large Language Models
Xinwei Long
Jiali Zeng
Fandong Meng
Zhiyuan Ma
Kaiyan Zhang
Bowen Zhou
Jie Zhou
47
15
0
16 Jan 2024
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)
Zongxin Yang
Guikun Chen
Xiaodi Li
Wenguan Wang
Yi Yang
LM&Ro
LLMAG
69
35
0
16 Jan 2024
Learned Best-Effort LLM Serving
Learned Best-Effort LLM Serving
Siddharth Jha
Coleman Hooper
Xiaoxuan Liu
Sehoon Kim
Kurt Keutzer
26
2
0
15 Jan 2024
The What, Why, and How of Context Length Extension Techniques in Large
  Language Models -- A Detailed Survey
The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey
Saurav Pawar
S.M. Towhidul Islam Tonmoy
S. M. M. Zaman
Vinija Jain
Aman Chadha
Amitava Das
42
28
0
15 Jan 2024
Unlocking Efficiency in Large Language Model Inference: A Comprehensive
  Survey of Speculative Decoding
Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding
Heming Xia
Zhe Yang
Qingxiu Dong
Peiyi Wang
Yongqi Li
Tao Ge
Tianyu Liu
Wenjie Li
Zhifang Sui
LRM
40
105
0
15 Jan 2024
Developing ChatGPT for Biology and Medicine: A Complete Review of
  Biomedical Question Answering
Developing ChatGPT for Biology and Medicine: A Complete Review of Biomedical Question Answering
Qing Li
Lei Li
Yu Li
LM&MA
AI4MH
48
6
0
15 Jan 2024
Only Send What You Need: Learning to Communicate Efficiently in Federated Multilingual Machine Translation
Only Send What You Need: Learning to Communicate Efficiently in Federated Multilingual Machine Translation
Yun-Wei Chu
Dong-Jun Han
Christopher G. Brinton
39
4
0
15 Jan 2024
Previous
123...232425...484950
Next