ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2206.04615
  4. Cited By
Beyond the Imitation Game: Quantifying and extrapolating the
  capabilities of language models

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

9 June 2022
Aarohi Srivastava
Abhinav Rastogi
Abhishek Rao
Abu Awal Md Shoeb
Abubakar Abid
Adam Fisch
Adam R. Brown
Adam Santoro
Aditya Gupta
Adrià Garriga-Alonso
Agnieszka Kluska
Aitor Lewkowycz
Akshat Agarwal
Alethea Power
Alex Ray
Alex Warstadt
Alexander W. Kocurek
Ali Safaya
Ali Tazarv
Alice Xiang
Alicia Parrish
Allen Nie
Aman Hussain
Amanda Askell
Amanda Dsouza
Ambrose Slone
Ameet Rahane
Anantharaman S. Iyer
Anders Andreassen
Andrea Madotto
Andrea Santilli
Andreas Stuhlmuller
Andrew M. Dai
Andrew La
Andrew Kyle Lampinen
Andy Zou
Angela Jiang
Angelica Chen
Anh Vuong
Animesh Gupta
Anna Gottardi
Antonio Norelli
Anu Venkatesh
Arash Gholamidavoodi
Arfa Tabassum
Arul Menezes
Arun Kirubarajan
A. Mullokandov
Ashish Sabharwal
Austin Herrick
Avia Efrat
Aykut Erdem
Ayla Karakacs
B. R. Roberts
B. S. Loe
Barret Zoph
Bartlomiej Bojanowski
Batuhan Ozyurt
Behnam Hedayatnia
Behnam Neyshabur
Benjamin Inden
Benno Stein
Berk Ekmekci
Bill Yuchen Lin
B. Howald
Bryan Orinion
Cameron Diao
Cameron Dour
Catherine Stinson
Cedrick Argueta
César Ferri Ramírez
Chandan Singh
Charles Rathkopf
Chenlin Meng
Chitta Baral
Chiyu Wu
Chris Callison-Burch
Chris Waites
Christian Voigt
Christopher D. Manning
Christopher Potts
Cindy Ramirez
Clara E. Rivera
Clemencia Siro
Colin Raffel
Courtney Ashcraft
Cristina Garbacea
Damien Sileo
Daniel H Garrette
Dan Hendrycks
D. Kilman
Dan Roth
Daniel Freeman
Daniel Khashabi
Daniel Levy
D. González
Danielle R. Perszyk
Danny Hernandez
Danqi Chen
Daphne Ippolito
D. Gilboa
David Dohan
D. Drakard
David Jurgens
Debajyoti Datta
Deep Ganguli
Denis Emelin
Denis Kleyko
Deniz Yuret
Derek Chen
Derek Tam
Dieuwke Hupkes
Diganta Misra
Dilyar Buzan
Dimitri Coelho Mollo
Diyi Yang
Dong-Ho Lee
Dylan Schrader
Ekaterina Shutova
E. D. Cubuk
Elad Segal
Eleanor Hagerman
Elizabeth Barnes
E. Donoway
Ellie Pavlick
Emanuele Rodolà
Emma Lam
Eric Chu
Eric Tang
Erkut Erdem
Ernie Chang
Ethan A. Chi
Ethan Dyer
E. Jerzak
Ethan Kim
Eunice Engefu Manyasi
Evgenii Zheltonozhskii
Fanyue Xia
F. Siar
Fernando Martínez-Plumed
Francesca Happé
François Chollet
Frieda Rong
Gaurav Mishra
Genta Indra Winata
Gerard de Melo
Germán Kruszewski
Giambattista Parascandolo
Giorgio Mariani
Gloria Xinyue Wang
Gonzalo Jaimovitch-López
Gregor Betz
Guy Gur-Ari
Hana Galijasevic
Hannah Kim
Hannah Rashkin
Hannaneh Hajishirzi
Harsh Mehta
H. Bogar
Henry Shevlin
Hinrich Schütze
Hiromu Yakura
Hongming Zhang
Hugh Mee Wong
Ian Ng
Isaac Noble
Jaap Jumelet
Jack Geissinger
John Kernion
Jacob Hilton
Jaehoon Lee
J. F. Fisac
James B. Simon
James Koppel
James Zheng
James Zou
Jan Kocoñ
Jana Thompson
Janelle Wingfield
Jared Kaplan
Jarema Radom
Jascha Narain Sohl-Dickstein
Jason Phang
Jason W. Wei
J. Yosinski
Jekaterina Novikova
Jelle Bosscher
Jennifer Marsh
Jeremy Kim
Jeroen Taal
Jesse Engel
Jesujoba Oluwadara Alabi
Jiacheng Xu
Jiaming Song
Jillian Tang
Jane W Waweru
John Burden
John Miller
John U. Balis
Jonathan Batchelder
Jonathan Berant
Jorg Frohberg
Jos Rozen
Jose Hernandez-Orallo
Joseph Boudeman
Joseph Guerr
Joseph Jones
Joshua B. Tenenbaum
Joshua S. Rule
Joyce Chua
Kamil Kanclerz
Karen Livescu
K. Krauth
Karthik Gopalakrishnan
Katerina Ignatyeva
K. Markert
Kaustubh D. Dhole
Kevin Gimpel
Kevin Omondi
Kory W. Mathewson
Kristen Chiafullo
Ksenia Shkaruta
Kumar Shridhar
Kyle McDonell
Kyle Richardson
Laria Reynolds
Leo Gao
Li Zhang
Liam Dugan
Lianhui Qin
Lidia Contreras Ochando
Louis-Philippe Morency
Luca Moschella
Luca Lam
Lucy Noble
Ludwig Schmidt
Luheng He
Luis Oliveros Colón
Luke Metz
Lutfi Kerem cSenel
Maarten Bosma
Maarten Sap
Maartje ter Hoeve
Maheen Farooqi
Manaal Faruqui
Mantas Mazeika
Marco Baturan
Marco Marelli
Marco Maru
Maria Jose Ram’irez Quintana
M. Tolkiehn
Mario Giulianelli
Martha Lewis
Martin Potthast
Matthew L. Leavitt
Matthias Hagen
M. Schubert
Medina Baitemirova
Melody Arnaud
M. McElrath
Michael A. Yee
Michael Cohen
Michael Gu
Michael Ivanitskiy
Michael Starritt
Michael Strube
Michal Swkedrowski
Michele Bevilacqua
Michihiro Yasunaga
Mihir Kale
Mike Cain
Mimee Xu
Mirac Suzgun
Mitch Walker
Monica Tiwari
Mohit Bansal
Moin Aminnaseri
Mor Geva
Mozhdeh Gheini
T. MukundVarma
Nanyun Peng
Nathan A. Chi
Nayeon Lee
Neta Gur-Ari Krakover
Nicholas Cameron
Nicholas Roberts
Nick Doiron
Nicole Martinez
Nikita Nangia
Niklas Deckers
Niklas Muennighoff
N. Keskar
Niveditha Iyer
Noah Constant
Noah Fiedel
Nuan Wen
Oliver Zhang
Omar Agha
Omar Elbaghdadi
Omer Levy
Owain Evans
Pablo Antonio Moreno Casares
P. Doshi
Pascale Fung
Paul Pu Liang
Paul Vicol
Pegah Alipoormolabashi
Peiyuan Liao
Percy Liang
Peter Chang
P. Eckersley
Phu Mon Htut
P. Hwang
P. Milkowski
P. Patil
Pouya Pezeshkpour
Priti Oli
Qiaozhu Mei
Qing Lyu
Qinlang Chen
Rabin Banjade
Rachel Etta Rudolph
Raefer Gabriel
Rahel Habacker
Ramon Risco
Raphael Milliere
Rhythm Garg
Richard Barnes
Rif A. Saurous
Riku Arakawa
Robbe Raymaekers
Robert Frank
Rohan Sikand
Roman Novak
Roman Sitelew
Ronan Le Bras
Rosanne Liu
Rowan Jacobs
Rui Zhang
Ruslan Salakhutdinov
Ryan A. Chi
Ryan Lee
Ryan Stovall
Ryan Teehan
Rylan Yang
Sahib Singh
Saif M. Mohammad
Sajant Anand
Sam Dillavou
Sam Shleifer
Sam Wiseman
Samuel Gruetter
Samuel R. Bowman
S. Schoenholz
Sanghyun Han
Sanjeev Kwatra
Sarah A. Rous
Sarik Ghazarian
Sayan Ghosh
Sean Casey
Sebastian Bischoff
Sebastian Gehrmann
Sebastian Schuster
Sepideh Sadeghi
Shadi S. Hamdan
Sharon Zhou
Shashank Srivastava
Sherry Shi
Shikhar Singh
Shima Asaadi
S. Gu
Shubh Pachchigar
Shubham Toshniwal
Shyam Upadhyay
Shyamolima Debnath
Debnath
Siamak Shakeri
Simon Thormeyer
Simone Melzi
Siva Reddy
S. Makini
Soo-hwan Lee
Spencer Bradley Torene
Sriharsha Hatwar
S. Dehaene
Stefan Divic
Stefano Ermon
Stella Biderman
Stephanie Lin
Stephen Prasad
Steven T Piantadosi
Stuart M. Shieber
Summer Misherghi
S. Kiritchenko
Swaroop Mishra
Tal Linzen
Tal Schuster
Tao Li
Tao Yu
Tariq Ali
Tatsunori Hashimoto
Te-Lin Wu
T. Desbordes
Theodore Rothschild
Thomas Phan
Tianle Wang
Tiberius Nkinyili
Timo Schick
T. Kornev
T. Tunduny
Tobias Gerstenberg
T. Chang
Trishala Neeraj
Tushar Khot
Tyler Shultz
Uri Shaham
Vedant Misra
Vera Demberg
Victoria Nyamai
Vikas Raunak
V. Ramasesh
Vinay Uday Prabhu
Vishakh Padmakumar
Vivek Srikumar
W. Fedus
William Saunders
William Zhang
Wout Vossen
Xiang Ren
Xiaoyu Tong
Xinran Zhao
Xinyi Wu
Xudong Shen
Yadollah Yaghoobzadeh
Yair Lakretz
Yangqiu Song
Yasaman Bahri
Yejin Choi
Yichi Yang
Yiding Hao
Yifu Chen
Yonatan Belinkov
Yu Hou
Yufang Hou
Yuntao Bai
Zachary Seid
Zhuoye Zhao
Zijian Wang
Zijie J. Wang
Zirui Wang
Ziyi Wu
    ELM
ArXivPDFHTML

Papers citing "Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models"

50 / 111 papers shown
Title
MDCure: A Scalable Pipeline for Multi-Document Instruction-Following
MDCure: A Scalable Pipeline for Multi-Document Instruction-Following
Gabrielle Kaili-May Liu
Bowen Shi
Avi Caciularu
Idan Szpektor
Arman Cohan
115
4
0
30 Oct 2024
Does Data Contamination Detection Work (Well) for LLMs? A Survey and Evaluation on Detection Assumptions
Does Data Contamination Detection Work (Well) for LLMs? A Survey and Evaluation on Detection Assumptions
Yujuan Fu
Özlem Uzuner
Meliha Yetisgen
Fei Xia
76
5
0
24 Oct 2024
In Context Learning and Reasoning for Symbolic Regression with Large Language Models
In Context Learning and Reasoning for Symbolic Regression with Large Language Models
Samiha Sharlin
Tyler R. Josephson
ReLM
LLMAG
LRM
70
2
0
22 Oct 2024
Compute-Constrained Data Selection
Compute-Constrained Data Selection
Junjie Oscar Yin
Alexander M. Rush
89
1
0
21 Oct 2024
Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors
Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors
Georgios Chochlakis
Alexandros Potamianos
Kristina Lerman
Shrikanth Narayanan
79
2
0
17 Oct 2024
Reversal of Thought: Enhancing Large Language Models with Preference-Guided Reverse Reasoning Warm-up
Reversal of Thought: Enhancing Large Language Models with Preference-Guided Reverse Reasoning Warm-up
Jiahao Yuan
Dehui Du
Hao Zhang
Zixiang Di
Usman Naseem
LRM
51
5
0
16 Oct 2024
FLARE: Faithful Logic-Aided Reasoning and Exploration
FLARE: Faithful Logic-Aided Reasoning and Exploration
Erik Arakelyan
Pasquale Minervini
Pat Verga
Patrick Lewis
Isabelle Augenstein
ReLM
LRM
99
2
0
14 Oct 2024
Taming Overconfidence in LLMs: Reward Calibration in RLHF
Taming Overconfidence in LLMs: Reward Calibration in RLHF
Jixuan Leng
Chengsong Huang
Banghua Zhu
Jiaxin Huang
58
11
0
13 Oct 2024
SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models
SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models
H. Xia
Zhengbang Yang
Junbo Zou
Rhys Tracy
Yuqing Wang
...
Xun Shao
Zhuoqing Xie
Yuan-fang Wang
Weining Shen
Hanjie Chen
ReLM
LRM
ELM
60
3
0
11 Oct 2024
Autonomous Evaluation of LLMs for Truth Maintenance and Reasoning Tasks
Autonomous Evaluation of LLMs for Truth Maintenance and Reasoning Tasks
Rushang Karia
Daniel Bramblett
D. Dobhal
Siddharth Srivastava
ELM
LRM
49
0
0
11 Oct 2024
COMPL-AI Framework: A Technical Interpretation and LLM Benchmarking Suite for the EU Artificial Intelligence Act
COMPL-AI Framework: A Technical Interpretation and LLM Benchmarking Suite for the EU Artificial Intelligence Act
Philipp Guldimann
Alexander Spiridonov
Robin Staab
Nikola Jovanović
Mark Vero
...
Mislav Balunović
Nikola Konstantinov
Pavol Bielik
Petar Tsankov
Martin Vechev
ELM
67
6
0
10 Oct 2024
Divide and Translate: Compositional First-Order Logic Translation and Verification for Complex Logical Reasoning
Divide and Translate: Compositional First-Order Logic Translation and Verification for Complex Logical Reasoning
Hyun Ryu
Gyeongman Kim
Hyemin S. Lee
Eunho Yang
LRM
73
6
0
10 Oct 2024
Human and LLM Biases in Hate Speech Annotations: A Socio-Demographic Analysis of Annotators and Targets
Human and LLM Biases in Hate Speech Annotations: A Socio-Demographic Analysis of Annotators and Targets
Tommaso Giorgi
Lorenzo Cima
T. Fagni
Marco Avvenuti
S. Cresci
93
10
0
10 Oct 2024
Undesirable Memorization in Large Language Models: A Survey
Undesirable Memorization in Large Language Models: A Survey
Ali Satvaty
Suzan Verberne
Fatih Turkmen
ELM
PILM
124
7
0
03 Oct 2024
DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life
DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life
Yu Ying Chiu
Liwei Jiang
Yejin Choi
79
5
0
03 Oct 2024
DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models
DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models
Yuxuan Zhang
Ruizhe Li
MoMe
107
0
0
02 Oct 2024
Neural-Symbolic Collaborative Distillation: Advancing Small Language Models for Complex Reasoning Tasks
Neural-Symbolic Collaborative Distillation: Advancing Small Language Models for Complex Reasoning Tasks
Huanxuan Liao
Shizhu He
Yao Xu
Yuanzhe Zhang
Kang Liu
Jun Zhao
LRM
77
3
0
20 Sep 2024
Bilingual Evaluation of Language Models on General Knowledge in University Entrance Exams with Minimal Contamination
Bilingual Evaluation of Language Models on General Knowledge in University Entrance Exams with Minimal Contamination
Eva Sánchez Salido
Roser Morante
Julio Gonzalo
Guillermo Marco
Jorge Carrillo-de-Albornoz
...
Enrique Amigó
Andrés Fernández
Alejandro Benito-Santos
Adrián Ghajari Espinosa
Victor Fresno
ELM
73
0
0
19 Sep 2024
LogicPro: Improving Complex Logical Reasoning via Program-Guided Learning
LogicPro: Improving Complex Logical Reasoning via Program-Guided Learning
Jin Jiang
Yuchen Yan
Yang Liu
Yonggang Jin
Shuai Peng
Hao Fei
Xunliang Cai
Yixin Cao
Liangcai Gao
Zhi Tang
LRM
85
5
0
19 Sep 2024
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Zayne Sprague
Fangcong Yin
Juan Diego Rodriguez
Dongwei Jiang
Manya Wadhwa
Prasann Singhal
Xinyu Zhao
Xi Ye
Kyle Mahowald
Greg Durrett
ReLM
LRM
155
101
0
18 Sep 2024
StressPrompt: Does Stress Impact Large Language Models and Human Performance Similarly?
StressPrompt: Does Stress Impact Large Language Models and Human Performance Similarly?
Guobin Shen
Dongcheng Zhao
Aorigele Bao
Xiang He
Yiting Dong
Yi Zeng
49
1
0
14 Sep 2024
Assessing SPARQL capabilities of Large Language Models
Assessing SPARQL capabilities of Large Language Models
Lars-Peter Meyer
Johannes Frey
Felix Brei
Natanael Arndt
53
7
0
09 Sep 2024
Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models
Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models
Wenxuan Zhang
Philip Torr
Mohamed Elhoseiny
Adel Bibi
123
10
0
27 Aug 2024
CogLM: Tracking Cognitive Development of Large Language Models
CogLM: Tracking Cognitive Development of Large Language Models
Xinglin Wang
Peiwen Yuan
Shaoxiong Feng
Yiwei Li
Boyuan Pan
Heda Wang
Yao Hu
Kan Li
ELM
76
0
0
17 Aug 2024
Show, Don't Tell: Evaluating Large Language Models Beyond Textual Understanding with ChildPlay
Show, Don't Tell: Evaluating Large Language Models Beyond Textual Understanding with ChildPlay
Gonçalo Hora de Carvalho
Oscar Knap
R. Pollice
ReLM
ELM
LRM
50
1
0
12 Jul 2024
Are Large Language Models Really Bias-Free? Jailbreak Prompts for Assessing Adversarial Robustness to Bias Elicitation
Are Large Language Models Really Bias-Free? Jailbreak Prompts for Assessing Adversarial Robustness to Bias Elicitation
Riccardo Cantini
Giada Cosenza
A. Orsino
Domenico Talia
AAML
70
6
0
11 Jul 2024
Training on the Test Task Confounds Evaluation and Emergence
Training on the Test Task Confounds Evaluation and Emergence
Ricardo Dominguez-Olmedo
Florian E. Dorner
Moritz Hardt
ELM
84
7
1
10 Jul 2024
Prompting Techniques for Secure Code Generation: A Systematic Investigation
Prompting Techniques for Secure Code Generation: A Systematic Investigation
Catherine Tony
Nicolás E. Díaz Ferreyra
Markus Mutas
Salem Dhiff
Riccardo Scandariato
SILM
102
10
0
09 Jul 2024
Generalizability of experimental studies
Generalizability of experimental studies
Federico Matteucci
Vadim Arzamasov
Jose Cribeiro-Ramallo
Marco Heyden
Konstantin Ntounas
Klemens Bohm
71
0
0
25 Jun 2024
Raising the Bar: Investigating the Values of Large Language Models via Generative Evolving Testing
Raising the Bar: Investigating the Values of Large Language Models via Generative Evolving Testing
Han Jiang
Xiaoyuan Yi
Zhihua Wei
Ziang Xiao
Shu Wang
Xing Xie
ELM
ALM
88
8
0
20 Jun 2024
DCA-Bench: A Benchmark for Dataset Curation Agents
DCA-Bench: A Benchmark for Dataset Curation Agents
Benhao Huang
Yingzhuo Yu
Jin Huang
Xingjian Zhang
Jiaqi Ma
52
1
0
11 Jun 2024
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models
Seungone Kim
Juyoung Suk
Ji Yong Cho
Shayne Longpre
Chaeeun Kim
...
Sean Welleck
Graham Neubig
Moontae Lee
Kyungjae Lee
Minjoon Seo
ELM
ALM
LM&MA
130
36
0
09 Jun 2024
Is On-Device AI Broken and Exploitable? Assessing the Trust and Ethics in Small Language Models
Is On-Device AI Broken and Exploitable? Assessing the Trust and Ethics in Small Language Models
Kalyan Nakka
Jimmy Dani
Nitesh Saxena
108
1
0
08 Jun 2024
When does compositional structure yield compositional generalization? A kernel theory
When does compositional structure yield compositional generalization? A kernel theory
Samuel Lippl
Kim Stachenfeld
NAI
CoGe
150
8
0
26 May 2024
Asymptotic theory of in-context learning by linear attention
Asymptotic theory of in-context learning by linear attention
Yue M. Lu
Mary I. Letey
Jacob A. Zavatone-Veth
Anindita Maiti
Cengiz Pehlevan
65
12
0
20 May 2024
LLMs can Find Mathematical Reasoning Mistakes by Pedagogical Chain-of-Thought
LLMs can Find Mathematical Reasoning Mistakes by Pedagogical Chain-of-Thought
Zhuoxuan Jiang
Haoyuan Peng
Shanshan Feng
Fan Li
Dongsheng Li
KELM
LRM
64
14
0
09 May 2024
Chain of Thoughtlessness? An Analysis of CoT in Planning
Chain of Thoughtlessness? An Analysis of CoT in Planning
Kaya Stechly
Karthik Valmeekam
Subbarao Kambhampati
LRM
LM&Ro
102
47
0
08 May 2024
Folded Context Condensation in Path Integral Formalism for Infinite Context Transformers
Folded Context Condensation in Path Integral Formalism for Infinite Context Transformers
Won-Gi Paeng
Daesuk Kwon
Kyungwon Jeong
Honggyo Suh
90
0
0
07 May 2024
From Matching to Generation: A Survey on Generative Information Retrieval
From Matching to Generation: A Survey on Generative Information Retrieval
Xiaoxi Li
Jiajie Jin
Yujia Zhou
Yuyao Zhang
Peitian Zhang
Yutao Zhu
Zhicheng Dou
3DV
123
50
0
23 Apr 2024
Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Better Solvers for Math Word Problems
Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Better Solvers for Math Word Problems
Qihuang Zhong
Kang Wang
Ziyang Xu
Juhua Liu
Liang Ding
Bo Du
LRM
AIMat
73
4
0
23 Apr 2024
Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks
Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks
Amir Saeidi
Shivanshu Verma
Chitta Baral
Chitta Baral
ALM
61
23
0
23 Apr 2024
Sample-Efficient Human Evaluation of Large Language Models via Maximum Discrepancy Competition
Sample-Efficient Human Evaluation of Large Language Models via Maximum Discrepancy Competition
Kehua Feng
Keyan Ding
Hongzhi Tan
Kede Ma
Zhihua Wang
...
Yuzhou Cheng
Ge Sun
Guozhou Zheng
Qiang Zhang
H. Chen
60
12
0
10 Apr 2024
SafetyPrompts: a Systematic Review of Open Datasets for Evaluating and Improving Large Language Model Safety
SafetyPrompts: a Systematic Review of Open Datasets for Evaluating and Improving Large Language Model Safety
Paul Röttger
Fabio Pernisi
Bertie Vidgen
Dirk Hovy
ELM
KELM
89
37
0
08 Apr 2024
Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art
Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art
Neeloy Chakraborty
Melkior Ornik
Katherine Driggs-Campbell
LRM
115
11
0
25 Mar 2024
Understanding Emergent Abilities of Language Models from the Loss Perspective
Understanding Emergent Abilities of Language Models from the Loss Perspective
Zhengxiao Du
Aohan Zeng
Yuxiao Dong
Jie Tang
UQCV
LRM
99
51
0
23 Mar 2024
Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency
Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency
Akila Wickramasekara
Frank Breitinger
Mark Scanlon
83
9
0
29 Feb 2024
On the Challenges and Opportunities in Generative AI
On the Challenges and Opportunities in Generative AI
Laura Manduchi
Kushagra Pandey
Robert Bamler
Ryan Cotterell
Sina Daubener
...
F. Wenzel
Frank Wood
Stephan Mandt
Vincent Fortuin
Vincent Fortuin
141
18
0
28 Feb 2024
Investigating Continual Pretraining in Large Language Models: Insights and Implications
Investigating Continual Pretraining in Large Language Models: Insights and Implications
cCaugatay Yildiz
Nishaanth Kanna Ravichandran
Prishruit Punia
Matthias Bethge
Beyza Ermis
CLL
KELM
LRM
84
26
0
27 Feb 2024
MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning
MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning
Pengjie Ren
Chengshun Shi
Shiguang Wu
Mengqi Zhang
Zhaochun Ren
Maarten de Rijke
Zhumin Chen
Jiahuan Pei
MoE
100
14
0
27 Feb 2024
Bias in Language Models: Beyond Trick Tests and Toward RUTEd Evaluation
Bias in Language Models: Beyond Trick Tests and Toward RUTEd Evaluation
Kristian Lum
Jacy Reese Anthis
Chirag Nagpal
Alex DÁmour
Alexander D’Amour
64
16
0
20 Feb 2024
Previous
123
Next