ResearchTrend.AI
arXiv:2005.14165
Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 12,272 papers shown
Title
SANSKRITI: A Comprehensive Benchmark for Evaluating Language Models' Knowledge of Indian Culture
Arijit Maji
Raghvendra Kumar
Akash Ghosh
Anushka
Sriparna Saha
ELM
14
0
0
18 Jun 2025
Finance Language Model Evaluation (FLaME)
Glenn Matlin
Mika Okamoto
Huzaifa Pardawala
Yang Yang
Sudheer Chava
AIFin, LRM
23
0
0
18 Jun 2025
Uncovering Intention through LLM-Driven Code Snippet Description Generation
Yusuf Sulistyo Nugroho
Farah Danisha Salam
Brittany Reid
R. Kula
Kazumasa Shimari
Kenichi Matsumoto
12
0
0
18 Jun 2025
Gender Inclusivity Fairness Index (GIFI): A Multilevel Framework for Evaluating Gender Diversity in Large Language Models
Zhengyang Shan
Emily Ruth Diana
Jiawei Zhou
26
0
0
18 Jun 2025
COSMMIC: Comment-Sensitive Multimodal Multilingual Indian Corpus for Summarization and Headline Generation
Raghvendra Kumar
S. A. Mohammed Salman
Aryan Sahu
Tridib Nandi
Pragathi Y. P.
S. Saha
Jose G. Moreno
12
0
0
18 Jun 2025
NetRoller: Interfacing General and Specialized Models for End-to-End Autonomous Driving
Ren Xin
Hongji Liu
Xiaodong Mei
Wenru Liu
Maosheng Ye
Zhili Chen
Jun Ma
19
0
0
17 Jun 2025
GRAM: A Generative Foundation Reward Model for Reward Generalization
Chenglong Wang
Yang Gan
Yifu Huo
Yongyu Mu
Qiaozhi He
...
Bei Li
Tong Xiao
Chunliang Zhang
Tongran Liu
Jingbo Zhu
ALM, OffRL, LRM
39
0
0
17 Jun 2025
AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs
Di He
Ajay Jaiswal
Songjun Tu
Li Shen
Ganzhao Yuan
Shiwei Liu
L. Yin
24
0
0
17 Jun 2025
FedOne: Query-Efficient Federated Learning for Black-box Discrete Prompt Learning
Ganyu Wang
Jinjie Fang
Maxwell J. Ying
Bin Gu
Xi Chen
Boyu Wang
Charles Ling
FedML
12
0
0
17 Jun 2025
Efficient Serving of LLM Applications with Probabilistic Demand Modeling
Yifei Liu
Zuo Gan
Zhenghao Gan
Weiye Wang
Chen Chen
...
Xusheng Chen
Zhenhua Han
Yifei Zhu
Shixuan Sun
Minyi Guo
12
0
0
17 Jun 2025
Capacity Matters: a Proof-of-Concept for Transformer Memorization on Real-World Data
Anton Changalidis
Aki Härmä
15
0
0
17 Jun 2025
ASCD: Attention-Steerable Contrastive Decoding for Reducing Hallucination in MLLM
Yujun Wang
Jinhe Bi
Yunpu Ma
Soeren Pirk
MLLM
36
0
0
17 Jun 2025
From Bytes to Ideas: Language Modeling with Autoregressive U-Nets
Mathurin Videau
Badr Youbi Idrissi
Alessandro Leite
Marc Schoenauer
O. Teytaud
David Lopez-Paz
20
0
0
17 Jun 2025
Foundation Artificial Intelligence Models for Health Recognition Using Face Photographs (FAHR-Face)
Fridolin Haugg
Grace Lee
John He
Leonard Nürnberg
Dennis Bontempi
...
Christian Guthier
Benjamin H. Kann
Vadim N. Gladyshev
Hugo J. W. L. Aerts
Raymond H. Mak
12
0
0
17 Jun 2025
FLUX.1 Kontext: Flow Matching for In-Context Image Generation and Editing in Latent Space
Black Forest Labs
Stephen Batifol
A. Blattmann
Frederic Boesel
Saksham Consul
...
Dustin Podell
Robin Rombach
Harry Saini
Axel Sauer
Luke Smith
DiffM
15
0
0
17 Jun 2025
Revisiting Chain-of-Thought Prompting: Zero-shot Can Be Stronger than Few-shot
Xiang Cheng
Chengyan Pan
Minjun Zhao
Deyang Li
Fangchao Liu
Xinyu Zhang
Xiao Zhang
Yong Liu
ReLM, LRM
39
0
0
17 Jun 2025
Unified Representation Space for 3D Visual Grounding
Yinuo Zheng
Lipeng Gu
Honghua Chen
Liangliang Nan
Mingqiang Wei
14
0
0
17 Jun 2025
Doppelganger Method: Breaking Role Consistency in LLM Agent via Prompt-based Transferable Adversarial Attack
Daewon Kang
YeongHwan Shin
Doyeon Kim
Kyu-Hwan Jung
Meong Hi Son
AAML, SILM
40
0
0
17 Jun 2025
Treasure Hunt: Real-time Targeting of the Long Tail using Training-Time Markers
Daniel D'souza
Julia Kreutzer
Adrien Morisot
Ahmet Üstün
Sara Hooker
14
0
0
17 Jun 2025
Don't throw the baby out with the bathwater: How and why deep learning for ARC
Jack Cole
Mohamed Osman
LRM
35
0
0
17 Jun 2025
Explainable Detection of Implicit Influential Patterns in Conversations via Data Augmentation
Sina Abdidizaji
Md. Kowsher
Niloofar Yousefi
Ivan I. Garibay
17
0
0
17 Jun 2025
Re-Initialization Token Learning for Tool-Augmented Large Language Models
Chenghao Li
Liu Liu
B. Yu
Jiayan Qiu
Yibing Zhan
LLMAG, CLL, KELM
35
0
0
17 Jun 2025
Deep Diffusion Models and Unsupervised Hyperspectral Unmixing for Realistic Abundance Map Synthesis
Martina Pastorino
Michael Alibani
Nicola Acito
Gabriele Moser
9
0
0
16 Jun 2025
PictSure: Pretraining Embeddings Matters for In-Context Learning Image Classifiers
Lukas Schiesser
Cornelius Wolff
Sophie Haas
Simon Pukrop
VLM
15
0
0
16 Jun 2025
FinLMM-R1: Enhancing Financial Reasoning in LMM through Scalable Data and Reward Design
Kai Lan
Jiayong Zhu
Jiangtong Li
Dawei Cheng
Guang-Sheng Chen
Changjun Jiang
LRM
9
0
0
16 Jun 2025
Dynamic Context-oriented Decomposition for Task-aware Low-rank Adaptation with Less Forgetting and Faster Convergence
Yibo Yang
Sihao Liu
Chuan Rao
Bang An
Tiancheng Shen
Philip Torr
Ming-Hsuan Yang
Bernard Ghanem
9
0
0
16 Jun 2025
Detecting Hard-Coded Credentials in Software Repositories via LLMs
Chidera Biringa
Gökhan Kul
15
0
0
16 Jun 2025
Assessing the Limits of In-Context Learning beyond Functions using Partially Ordered Relation
Debanjan Dutta
Faizanuddin Ansari
Swagatam Das
15
0
0
16 Jun 2025
Unveiling the Learning Mind of Language Models: A Cognitive Framework and Empirical Study
Zhengyu Hu
Jianxun Lian
Zheyuan Xiao
Seraphina Zhang
Tianfu Wang
Nicholas Jing Yuan
Xing Xie
Hui Xiong
ELM, LRM
14
0
0
16 Jun 2025
Decompositional Reasoning for Graph Retrieval with Large Language Models
Valentin Six
Evan Dufraisse
Gaël de Chalendar
ReLM, LRM
12
0
0
16 Jun 2025
BOW: Bottlenecked Next Word Exploration
Ming Shen
Zhikun Xu
Xiao Ye
Jacob Dineen
Ben Zhou
OffRL, LRM
17
0
0
16 Jun 2025
Distinct Computations Emerge From Compositional Curricula in In-Context Learning
Jin Hwa Lee
Andrew Kyle Lampinen
Aaditya K. Singh
Andrew Saxe
23
0
0
16 Jun 2025
Rethinking Test-Time Scaling for Medical AI: Model and Task-Aware Strategies for LLMs and VLMs
Gyutaek Oh
Seoyeon Kim
Sangjoon Park
Byung-Hoon Kim
LM&MA, LRM
17
0
0
16 Jun 2025
CFBenchmark-MM: Chinese Financial Assistant Benchmark for Multimodal Large Language Model
Jiangtong Li
Yiyun Zhu
Dawei Cheng
Zhijun Ding
Changjun Jiang
14
0
0
16 Jun 2025
AI-Facilitated Analysis of Abstracts and Conclusions: Flagging Unsubstantiated Claims and Ambiguous Pronouns
Evgeny Markhasin
13
0
0
16 Jun 2025
Load Balancing Mixture of Experts with Similarity Preserving Routers
Nabil Omi
S. Sen
Ali Farhadi
MoE
33
0
0
16 Jun 2025
A Survey on World Models Grounded in Acoustic Physical Information
Xiaoliang Chen
Le Chang
Xin Yu
Yunhe Huang
Xianling Tu
SyDa, AI4CE
42
0
0
16 Jun 2025
Rectifying Privacy and Efficacy Measurements in Machine Unlearning: A New Inference Attack Perspective
Nima Naderloui
Shenao Yan
Binghui Wang
Jie Fu
Wendy Hui Wang
Weiran Liu
Yuan Hong
AAML
15
0
0
16 Jun 2025
Scaling Algorithm Distillation for Continuous Control with Mamba
Samuel Beaussant
Mehdi Mounsif
17
0
0
16 Jun 2025
Mitigating Safety Fallback in Editing-based Backdoor Injection on LLMs
Houcheng Jiang
Zetong Zhao
Junfeng Fang
Haokai Ma
Ruipeng Wang
Yang Deng
Xiang Wang
Xiangnan He
KELM, AAML
14
0
0
16 Jun 2025
Understand the Implication: Learning to Think for Pragmatic Understanding
S. Sravanthi
Kishan Maharaj
Sravani Gunnu
Abhijit Mishra
Pushpak Bhattacharyya
ReLM, LRM
17
0
0
16 Jun 2025
TensorSLM: Energy-efficient Embedding Compression of Sub-billion Parameter Language Models on Low-end Devices
Mingxue Xu
Y. Xu
Danilo Mandic
26
0
0
16 Jun 2025
Prefix-Tuning+: Modernizing Prefix-Tuning by Decoupling the Prefix from Attention
Haonan Wang
Brian K Chen
Siquan Li
Xinhe Liang
Hwee Kuan Lee
Kenji Kawaguchi
Tianyang Hu
16
0
0
16 Jun 2025
Humans, Machine Learning, and Language Models in Union: A Cognitive Study on Table Unionability
Sreeram Marimuthu
Nina Klimenkova
Roee Shraga
13
0
0
15 Jun 2025
SoK: The Privacy Paradox of Large Language Models: Advancements, Privacy Risks, and Mitigation
Yashothara Shanmugarasa
Ming Ding
M. Chamikara
Thierry Rakotoarivelo
PILM, AILaw
60
0
0
15 Jun 2025
MaskPro: Linear-Space Probabilistic Learning for Strict (N:M)-Sparsity on Large Language Models
Yan Sun
Qixin Zhang
Zhiyuan Yu
Xikun Zhang
Li Shen
Dacheng Tao
11
0
0
15 Jun 2025
ComplexBench-Edit: Benchmarking Complex Instruction-Driven Image Editing via Compositional Dependencies
Chenglin Wang
Yucheng Zhou
Qianning Wang
Zhe Wang
Kai Zhang
CoGe
12
0
0
15 Jun 2025
Unleashing Diffusion and State Space Models for Medical Image Segmentation
Rong Wu
Ziqi Chen
Liming Zhong
Heng Li
Hai Shu
MedIm
11
0
0
15 Jun 2025
Assessing the Role of Data Quality in Training Bilingual Language Models
Skyler Seto
Maartje ter Hoeve
Maureen de Seyssel
David Grangier
7
0
0
15 Jun 2025
SheetMind: An End-to-End LLM-Powered Multi-Agent Framework for Spreadsheet Automation
Ruiyan Zhu
Xi Cheng
Ke Liu
Brian Zhu
Daniel Jin
Neeraj Parihar
Zhoutian Xu
Oliver Gao
LMTD
7
0
0
14 Jun 2025