ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2205.01068
  4. Cited By
OPT: Open Pre-trained Transformer Language Models

OPT: Open Pre-trained Transformer Language Models

2 May 2022
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
Shuohui Chen
Christopher Dewan
Mona T. Diab
Xian Li
Xi Lin
Todor Mihaylov
Myle Ott
Sam Shleifer
Kurt Shuster
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
    VLM
    OSLM
    AI4CE
ArXivPDFHTML

Papers citing "OPT: Open Pre-trained Transformer Language Models"

50 / 2,447 papers shown
Title
Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both
Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both
Abhijnan Nath
Changsoo Jung
Ethan Seefried
Nikhil Krishnaswamy
203
1
0
11 Oct 2024
The Large Language Model GreekLegalRoBERTa
The Large Language Model GreekLegalRoBERTa
Vasileios Saketos
D. Pantazi
Manolis Koubarakis
AILaw
34
0
0
10 Oct 2024
A Target-Aware Analysis of Data Augmentation for Hate Speech Detection
A Target-Aware Analysis of Data Augmentation for Hate Speech Detection
Camilla Casula
Sara Tonelli
31
0
0
10 Oct 2024
StablePrompt: Automatic Prompt Tuning using Reinforcement Learning for
  Large Language Models
StablePrompt: Automatic Prompt Tuning using Reinforcement Learning for Large Language Models
Minchan Kwon
Gaeun Kim
Jongsuk Kim
Haeil Lee
Junmo Kim
OffRL
LRM
LLMAG
26
2
0
10 Oct 2024
No Free Lunch: Retrieval-Augmented Generation Undermines Fairness in
  LLMs, Even for Vigilant Users
No Free Lunch: Retrieval-Augmented Generation Undermines Fairness in LLMs, Even for Vigilant Users
Mengxuan Hu
Hongyi Wu
Zihan Guan
Ronghang Zhu
Dongliang Guo
Daiqing Qi
Sheng Li
SILM
41
3
0
10 Oct 2024
Detecting Training Data of Large Language Models via Expectation Maximization
Detecting Training Data of Large Language Models via Expectation Maximization
Gyuwan Kim
Yang Li
Evangelia Spiliopoulou
Jie Ma
Miguel Ballesteros
William Yang Wang
MIALM
98
4
2
10 Oct 2024
QuAILoRA: Quantization-Aware Initialization for LoRA
QuAILoRA: Quantization-Aware Initialization for LoRA
Neal Lawton
Aishwarya Padmakumar
Judith Gaspers
Jack FitzGerald
Anoop Kumar
Greg Ver Steeg
Aram Galstyan
MQ
36
0
0
09 Oct 2024
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation
  Experts
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
Peng Jin
Bo Zhu
Li Yuan
Shuicheng Yan
MoE
32
4
0
09 Oct 2024
Generative Model for Less-Resourced Language with 1 billion parameters
Generative Model for Less-Resourced Language with 1 billion parameters
Domen Vreš
Martin Božič
Aljaž Potočnik
Tomaž Martinčič
Marko Robnik-Šikonja
26
1
0
09 Oct 2024
Break the Visual Perception: Adversarial Attacks Targeting Encoded
  Visual Tokens of Large Vision-Language Models
Break the Visual Perception: Adversarial Attacks Targeting Encoded Visual Tokens of Large Vision-Language Models
Yubo Wang
Chaohu Liu
Yanqiu Qu
Haoyu Cao
Deqiang Jiang
Linli Xu
MLLM
AAML
32
3
0
09 Oct 2024
Signal Watermark on Large Language Models
Signal Watermark on Large Language Models
Zhenyu Xu
Victor S. Sheng
WaLM
25
0
0
09 Oct 2024
Do great minds think alike? Investigating Human-AI Complementarity in
  Question Answering with CAIMIRA
Do great minds think alike? Investigating Human-AI Complementarity in Question Answering with CAIMIRA
Maharshi Gor
Hal Daumé III
Dinesh Manocha
Jordan Boyd-Graber
ELM
AI4MH
LRM
28
1
0
09 Oct 2024
Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and
  Performance of SGD for Fine-Tuning Language Models
Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models
Zeman Li
Xinwei Zhang
Peilin Zhong
Yuan Deng
Meisam Razaviyayn
Vahab Mirrokni
25
2
0
09 Oct 2024
Fine-tuning can Help Detect Pretraining Data from Large Language Models
Fine-tuning can Help Detect Pretraining Data from Large Language Models
Han Zhang
Songxin Zhang
Bingyi Jing
Hongxin Wei
43
0
0
09 Oct 2024
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
Zhihao He
Hang Yu
Zi Gong
Shizhan Liu
J. Li
Weiyao Lin
VLM
38
1
0
09 Oct 2024
An Eye for an Ear: Zero-shot Audio Description Leveraging an Image
  Captioner using Audiovisual Distribution Alignment
An Eye for an Ear: Zero-shot Audio Description Leveraging an Image Captioner using Audiovisual Distribution Alignment
Hugo Malard
Michel Olvera
Stéphane Lathuilière
S. Essid
VLM
44
0
0
08 Oct 2024
Give me a hint: Can LLMs take a hint to solve math problems?
Give me a hint: Can LLMs take a hint to solve math problems?
Vansh Agrawal
Pratham Singla
Amitoj Singh Miglani
Shivank Garg
Ayush Mangal
ReLM
LRM
28
5
0
08 Oct 2024
Enhancing Temporal Modeling of Video LLMs via Time Gating
Enhancing Temporal Modeling of Video LLMs via Time Gating
Zi-Yuan Hu
Yiwu Zhong
Shijia Huang
M. Lyu
Liwei Wang
VLM
33
0
0
08 Oct 2024
Attribute Controlled Fine-tuning for Large Language Models: A Case Study
  on Detoxification
Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification
Tao Meng
Ninareh Mehrabi
Palash Goyal
Anil Ramakrishna
Aram Galstyan
Richard Zemel
Kai-Wei Chang
Rahul Gupta
Charith Peris
27
1
0
07 Oct 2024
Initialization of Large Language Models via Reparameterization to
  Mitigate Loss Spikes
Initialization of Large Language Models via Reparameterization to Mitigate Loss Spikes
Kosuke Nishida
Kyosuke Nishida
Kuniko Saito
36
1
0
07 Oct 2024
DEPT: Decoupled Embeddings for Pre-training Language Models
DEPT: Decoupled Embeddings for Pre-training Language Models
Alex Iacob
Lorenzo Sani
Meghdad Kurmanji
William F. Shen
Xinchi Qiu
Dongqi Cai
Yan Gao
Nicholas D. Lane
VLM
218
0
0
07 Oct 2024
Realizing Video Summarization from the Path of Language-based Semantic
  Understanding
Realizing Video Summarization from the Path of Language-based Semantic Understanding
Kuan-Chen Mu
Zhi-Yi Chin
Wei-Chen Chiu
28
0
0
06 Oct 2024
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Jinhao Li
Jiaming Xu
Shan Huang
Yonghua Chen
Wen Li
...
Jiayi Pan
Li Ding
Hao Zhou
Yu Wang
Guohao Dai
62
17
0
06 Oct 2024
SyllableLM: Learning Coarse Semantic Units for Speech Language Models
SyllableLM: Learning Coarse Semantic Units for Speech Language Models
Alan Baade
Puyuan Peng
David Harwath
58
3
0
05 Oct 2024
RAFT: Realistic Attacks to Fool Text Detectors
RAFT: Realistic Attacks to Fool Text Detectors
James Wang
Ran Li
Junfeng Yang
Chengzhi Mao
AAML
DeLMO
18
3
0
04 Oct 2024
Can Watermarked LLMs be Identified by Users via Crafted Prompts?
Can Watermarked LLMs be Identified by Users via Crafted Prompts?
Aiwei Liu
Sheng Guan
Ye Liu
L. Pan
Yifei Zhang
Liancheng Fang
Lijie Wen
Philip S. Yu
Xuming Hu
WaLM
188
2
0
04 Oct 2024
ARB-LLM: Alternating Refined Binarizations for Large Language Models
ARB-LLM: Alternating Refined Binarizations for Large Language Models
Zhiteng Li
Xinyu Yan
Tianao Zhang
Haotong Qin
Dong Xie
Jiang Tian
Zhongchao Shi
Linghe Kong
Yulun Zhang
Xiaokang Yang
MQ
37
2
0
04 Oct 2024
X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale
X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale
Haoran Xu
Kenton W. Murray
Philipp Koehn
Hieu T. Hoang
Akiko Eriguchi
Huda Khayrallah
44
8
0
04 Oct 2024
Salient Information Prompting to Steer Content in Prompt-based
  Abstractive Summarization
Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization
Lei Xu
Mohammed Asad Karim
Saket Dingliwal
Aparna Elangovan
29
0
0
03 Oct 2024
Learning the Latent Rules of a Game from Data: A Chess Story
Learning the Latent Rules of a Game from Data: A Chess Story
Ben Fauber
26
1
0
03 Oct 2024
Agent-Oriented Planning in Multi-Agent Systems
Agent-Oriented Planning in Multi-Agent Systems
Ao Li
Yuexiang Xie
Songze Li
Fugee Tsung
Bolin Ding
Yaliang Li
AIFin
161
6
0
03 Oct 2024
Undesirable Memorization in Large Language Models: A Survey
Undesirable Memorization in Large Language Models: A Survey
Ali Satvaty
Suzan Verberne
Fatih Turkmen
ELM
PILM
86
7
0
03 Oct 2024
Financial Sentiment Analysis on News and Reports Using Large Language
  Models and FinBERT
Financial Sentiment Analysis on News and Reports Using Large Language Models and FinBERT
Yanxin Shen
Pulin Kirin Zhang
AIFin
36
11
0
02 Oct 2024
Getting Free Bits Back from Rotational Symmetries in LLMs
Getting Free Bits Back from Rotational Symmetries in LLMs
Jiajun He
Gergely Flamich
José Miguel Hernández-Lobato
MQ
23
0
0
02 Oct 2024
U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models
U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models
Tung-Yu Wu
Pei-Yu Lo
ReLM
LRM
46
2
0
02 Oct 2024
Can visual language models resolve textual ambiguity with visual cues?
  Let visual puns tell you!
Can visual language models resolve textual ambiguity with visual cues? Let visual puns tell you!
Jiwan Chung
Seungwon Lim
Jaehyun Jeon
Seungbeen Lee
Youngjae Yu
37
0
0
01 Oct 2024
Causal Representation Learning with Generative Artificial Intelligence:
  Application to Texts as Treatments
Causal Representation Learning with Generative Artificial Intelligence: Application to Texts as Treatments
Kosuke Imai
Kentaro Nakamura
CML
28
4
0
01 Oct 2024
AMR-Evol: Adaptive Modular Response Evolution Elicits Better Knowledge
  Distillation for Large Language Models in Code Generation
AMR-Evol: Adaptive Modular Response Evolution Elicits Better Knowledge Distillation for Large Language Models in Code Generation
Ziyang Luo
Xin Li
Hongzhan Lin
Jing Ma
Lidong Bing
VLM
32
0
0
01 Oct 2024
Exploring the Learning Capabilities of Language Models using LEVERWORLDS
Exploring the Learning Capabilities of Language Models using LEVERWORLDS
Eitan Wagner
Amir Feder
Omri Abend
16
0
0
01 Oct 2024
A Hitchhikers Guide to Fine-Grained Face Forgery Detection Using Common
  Sense Reasoning
A Hitchhikers Guide to Fine-Grained Face Forgery Detection Using Common Sense Reasoning
Niki Maria Foteinopoulou
Enjie Ghorbel
Djamila Aouada
41
2
0
01 Oct 2024
Recent Advances in Speech Language Models: A Survey
Recent Advances in Speech Language Models: A Survey
Wenqian Cui
Dianzhi Yu
Xiaoqi Jiao
Ziqiao Meng
Guangyan Zhang
Qichao Wang
Yiwen Guo
Irwin King
AuLLM
61
17
0
01 Oct 2024
Teuken-7B-Base & Teuken-7B-Instruct: Towards European LLMs
Teuken-7B-Base & Teuken-7B-Instruct: Towards European LLMs
Mehdi Ali
Michael Fromm
Klaudia Thellmann
Jan Ebert
Alexander Arno Weber
...
René Jäkel
Georg Rehm
Stefan Kesselheim
Joachim Köhler
Nicolas Flores-Herr
72
6
0
30 Sep 2024
Aggressive Post-Training Compression on Extremely Large Language Models
Aggressive Post-Training Compression on Extremely Large Language Models
Zining Zhang
Yao Chen
Bingsheng He
Zhenjie Zhang
28
0
0
30 Sep 2024
PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances
  Retrieval-Augmented Generation with Zero Inference Overhead
PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead
Tao Tan
Yining Qian
Ang Lv
Hongzhan Lin
Songhao Wu
Yongbo Wang
Feng Wang
Jingtong Wu
Xin Lu
Rui Yan
27
1
0
29 Sep 2024
A Certified Robust Watermark For Large Language Models
A Certified Robust Watermark For Large Language Models
Xianheng Feng
Jian-wei Liu
Kui Ren
Chun Chen
AAML
WaLM
52
0
0
29 Sep 2024
CoTKR: Chain-of-Thought Enhanced Knowledge Rewriting for Complex Knowledge Graph Question Answering
CoTKR: Chain-of-Thought Enhanced Knowledge Rewriting for Complex Knowledge Graph Question Answering
Yike Wu
Yi Huang
Nan Hu
Yuncheng Hua
Guilin Qi
Jiaoyan Chen
Jeff Z. Pan
44
7
0
29 Sep 2024
DENEB: A Hallucination-Robust Automatic Evaluation Metric for Image
  Captioning
DENEB: A Hallucination-Robust Automatic Evaluation Metric for Image Captioning
Kazuki Matsuda
Yuiga Wada
Komei Sugiura
31
1
0
28 Sep 2024
Show and Guide: Instructional-Plan Grounded Vision and Language Model
Show and Guide: Instructional-Plan Grounded Vision and Language Model
Diogo Glória-Silva
David Semedo
João Magalhães
26
0
0
27 Sep 2024
Harmful Fine-tuning Attacks and Defenses for Large Language Models: A
  Survey
Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey
Tiansheng Huang
Sihao Hu
Fatih Ilhan
Selim Furkan Tekin
Ling Liu
AAML
50
24
0
26 Sep 2024
EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation
EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation
Jiaxiang Tang
Zhaoshuo Li
Jinwei Gu
Xian Liu
Gang Zeng
Ming-Yu Liu
Qinsheng Zhang
47
24
0
26 Sep 2024
Previous
123...789...474849
Next