ResearchTrend.AI
LLaMA: Open and Efficient Foundation Language Models

27 February 2023
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
Timothée Lacroix
Baptiste Rozière
Naman Goyal
Eric Hambro
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
    ALM
    PILM

Papers citing "LLaMA: Open and Efficient Foundation Language Models"

41 / 7,091 papers shown
Does compressing activations help model parallel training?
S. Bian
Dacheng Li
Hongyi Wang
Eric P. Xing
Shivaram Venkataraman
40
5
0
06 Jan 2023
Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation
Yue Han
Jiangning Zhang
Zhucun Xue
Chao Xu
Xintian Shen
Yabiao Wang
Chengjie Wang
Yong Liu
Xiangtai Li
62
17
0
03 Jan 2023
Principled and Efficient Transfer Learning of Deep Models via Neural Collapse
Xiao Li
Sheng Liu
Jin-li Zhou
Xin Lu
C. Fernandez‐Granda
Zhihui Zhu
Q. Qu
AAML
35
19
0
23 Dec 2022
Language Models as Inductive Reasoners
Zonglin Yang
Li Dong
Xinya Du
Hao Cheng
Min Zhang
Xiaodong Liu
Jianfeng Gao
Furu Wei
ReLM
LRM
35
34
0
21 Dec 2022
JASMINE: Arabic GPT Models for Few-Shot Learning
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
AbdelRahim Elmadany
Alcides Alcoba Inciarte
Md. Tawkat Islam Khondaker
38
7
0
21 Dec 2022
Does CLIP Bind Concepts? Probing Compositionality in Large Image Models
Martha Lewis
Nihal V. Nayak
Peilin Yu
Qinan Yu
Jack Merullo
Stephen H. Bach
Ellie Pavlick
VLM
OCL
CoGe
46
59
0
20 Dec 2022
Is GPT-3 a Good Data Annotator?
Bosheng Ding
Chengwei Qin
Linlin Liu
Yew Ken Chia
Shafiq Joty
Boyang Albert Li
Lidong Bing
42
237
0
20 Dec 2022
Large Language Models Are Reasoning Teachers
Namgyu Ho
Laura Schmid
Se-Young Yun
ReLM
ELM
LRM
42
324
0
20 Dec 2022
When Federated Learning Meets Pre-trained Language Models' Parameter-Efficient Tuning Methods
Zhuo Zhang
Yuanhang Yang
Yong Dai
Lizhen Qu
Zenglin Xu
FedML
74
67
0
20 Dec 2022
On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning
Omar Shaikh
Hongxin Zhang
William B. Held
Michael S. Bernstein
Diyi Yang
ReLM
LRM
64
186
0
15 Dec 2022
A Comprehensive Survey on Enterprise Financial Risk Analysis from Big Data Perspective
Yu Zhao
Huaming Du
Qing Li
Fuzhen Zhuang
Ji Liu
Gang Kou
57
1
0
28 Nov 2022
Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models
Peter Henderson
E. Mitchell
Christopher D. Manning
Dan Jurafsky
Chelsea Finn
44
47
0
27 Nov 2022
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Guangxuan Xiao
Ji Lin
Mickael Seznec
Hao Wu
Julien Demouth
Song Han
MQ
103
762
0
18 Nov 2022
Deep Emotion Recognition in Textual Conversations: A Survey
Patrícia Pereira
Helena Moniz
Joao Paulo Carvalho
52
15
0
16 Nov 2022
GAMMT: Generative Ambiguity Modeling Using Multiple Transformers
Xingcheng Xu
35
0
0
16 Nov 2022
Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy
Michael J. Smith
James E. Geach
40
33
0
07 Nov 2022
Two-stage LLM Fine-tuning with Less Specialization and More Generalization
Yihan Wang
Si Si
Daliang Li
Michal Lukasik
Felix X. Yu
Cho-Jui Hsieh
Inderjit S Dhillon
Sanjiv Kumar
67
30
0
01 Nov 2022
Taxonomy of Abstractive Dialogue Summarization: Scenarios, Approaches and Future Directions
Qi Jia
Yizhu Liu
Siyu Ren
Kenny Q. Zhu
34
7
0
18 Oct 2022
Differentially Private Optimization on Large Model at Small Cost
Zhiqi Bu
Yu Wang
Sheng Zha
George Karypis
45
52
0
30 Sep 2022
YATO: Yet Another deep learning based Text analysis Open toolkit
Zeqiang Wang
Yile Wang
Jiageng Wu
Zhiyang Teng
Jie Yang
68
3
0
28 Sep 2022
Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored to Political Identity
Gabriel Simmons
110
58
0
24 Sep 2022
Diffusion Models: A Comprehensive Survey of Methods and Applications
Ling Yang
Zhilong Zhang
Yingxia Shao
Shenda Hong
Runsheng Xu
Yue Zhao
Wentao Zhang
Tengjiao Wang
Ming-Hsuan Yang
DiffM
MedIm
226
1,328
0
02 Sep 2022
Prompt Tuning with Soft Context Sharing for Vision-Language Models
Kun Ding
Ying Wang
Pengzhang Liu
Qiang Yu
Hao Zhang
Shiming Xiang
Chunhong Pan
VPVLM
VLM
36
14
0
29 Aug 2022
CP-PINNs: Data-Driven Changepoints Detection in PDEs Using Online Optimized Physics-Informed Neural Networks
Zhi-Ling Dong
Pawel Polak
PINN
31
1
0
18 Aug 2022
AI Augmented Edge and Fog Computing: Trends and Challenges
Shreshth Tuli
Fatemeh Mirhakimi
Samodha Pallewatta
Syed Zawad
G. Casale
B. Javadi
Feng Yan
Rajkumar Buyya
N. Jennings
34
56
0
01 Aug 2022
Innovations in Neural Data-to-text Generation: A Survey
Mandar Sharma
Ajay K. Gogineni
Naren Ramakrishnan
49
10
0
25 Jul 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
365
3,349
0
21 Mar 2022
Geographic Adaptation of Pretrained Language Models
Valentin Hofmann
Goran Glavaš
Nikola Ljubešić
J. Pierrehumbert
Hinrich Schütze
VLM
32
16
0
16 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
454
12,178
0
04 Mar 2022
DeepNet: Scaling Transformers to 1,000 Layers
Hongyu Wang
Shuming Ma
Li Dong
Shaohan Huang
Dongdong Zhang
Furu Wei
MoE
AI4CE
50
157
0
01 Mar 2022
A Data-scalable Transformer for Medical Image Segmentation: Architecture, Model Efficiency, and Benchmark
Yunhe Gao
Mu Zhou
Ding Liu
Zhennan Yan
Shaoting Zhang
Dimitris N. Metaxas
ViT
MedIm
43
68
0
28 Feb 2022
Unveiling Project-Specific Bias in Neural Code Models
Zhiming Li
Yanzhou Li
Tianlin Li
Mengnan Du
Bozhi Wu
Yushi Cao
Yi Li
Yang Liu
47
5
0
19 Jan 2022
A Survey of Controllable Text Generation using Transformer-based Pre-trained Language Models
Hanqing Zhang
Haolin Song
Shaoyu Li
Ming Zhou
Dawei Song
57
215
0
14 Jan 2022
Speech-to-SQL: Towards Speech-driven SQL Query Generation From Natural Language Question
Yuanfeng Song
Raymond Chi-Wing Wong
Xuefang Zhao
Di Jiang
44
13
0
04 Jan 2022
Show, Write, and Retrieve: Entity-aware Article Generation and Retrieval
Zhongping Zhang
Yiwen Gu
Bryan A. Plummer
48
2
0
11 Dec 2021
Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey
Bonan Min
Hayley L Ross
Elior Sulem
Amir Pouran Ben Veyseh
Thien Huu Nguyen
Oscar Sainz
Eneko Agirre
Ilana Heinz
Dan Roth
LM&MA
VLM
AI4CE
88
1,044
0
01 Nov 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
284
2,022
0
31 Dec 2020
iCORPP: Interleaved Commonsense Reasoning and Probabilistic Planning on Robots
Shiqi Zhang
Piyush Khandelwal
Peter Stone
LRM
36
2
0
18 Apr 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
268
4,576
0
23 Jan 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
247
1,838
0
17 Sep 2019
The Woman Worked as a Babysitter: On Biases in Language Generation
Emily Sheng
Kai-Wei Chang
Premkumar Natarajan
Nanyun Peng
225
623
0
03 Sep 2019