ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2408.00118
  4. Cited By
Gemma 2: Improving Open Language Models at a Practical Size

Gemma 2: Improving Open Language Models at a Practical Size

31 July 2024
Gemma Team
Gemma Team Morgane Riviere
Shreya Pathak
Pier Giuseppe Sessa
Cassidy Hardin
Surya Bhupatiraju
Léonard Hussenot
Thomas Mesnard
Bobak Shahriari
Alexandre Ramé
Johan Ferret
Peter J. Liu
P. Tafti
Abe Friesen
Michelle Casbon
Sabela Ramos
Ravin Kumar
Charline Le Lan
Sammy Jerome
Anton Tsitsulin
Nino Vieillard
Piotr Stańczyk
Sertan Girgin
Nikola Momchev
Matt Hoffman
S. Thakoor
Jean-Bastien Grill
Behnam Neyshabur
Olivier Bachem
Alanna Walton
Aliaksei Severyn
Alicia Parrish
Aliya Ahmad
Allen Hutchison
Alvin Abdagic
Amanda Carl
Amy Shen
Andy Brock
Andy Coenen
Anthony Laforge
Antonia Paterson
Ben Bastian
Bilal Piot
Boxi Wu
Brandon Royal
Charlie Chen
Chintu Kumar
Chris Perry
Christoper A. Welty
Christopher A. Choquette-Choo
Danila Sinopalnikov
David Weinberger
Dimple Vijaykumar
Dominika Rogoziñska
D. Herbison
Elisa Bandy
Emma Wang
Eric Noland
Erica Moreira
Evan Senter
Evgenii Eltyshev
Francesco Visin
Gabriel Rasskin
Gary Wei
Glenn Cameron
Gus Martins
Hadi Hashemi
Hanna Klimczak-Pluciñska
Harleen Batra
H. Dhand
Ivan Nardini
Jacinda Mein
Jack Zhou
James Svensson
Jeff Stanway
Jetha Chan
Jin Zhou
Joana Carrasqueira
Joana Iljazi
Jocelyn Becker
Joe Fernandez
Joost R. van Amersfoort
Josh Gordon
Josh Lipschultz
Joshua Newlan
Junsong Ji
Kareem Mohamed
Kartikeya Badola
Kat Black
Katie Millican
Keelin McDonell
Kelvin Nguyen
Kiranbir Sodhia
Kish Greene
Lars Lowe Sjoesund
Lauren Usui
Laurent Sifre
L. Heuermann
Leticia Lago
Lilly McNealus
Livio Baldini Soares
Logan Kilpatrick
Lucas Dixon
Luciano Martins
Machel Reid
Manvinder Singh
Mark Iverson
Martin Gorner
Mat Velloso
Mateo Wirth
Matt Davidow
Matt Miller
Matthew Rahtz
Matthew Watson
Meg Risdal
Mehran Kazemi
Michael Moynihan
Ming Zhang
Minsuk Kahng
Minwoo Park
Mofi Rahman
Mohit Khatwani
Natalie Dao
Nenshad Bardoliwalla
Nesh Devanathan
Neta Dumai
Nilay Chauhan
O. Wahltinez
Pankil Botarda
Parker Barnes
P. Barham
Paul Michel
Pengchong Jin
Petko Georgiev
Phil Culliton
Pradeep Kuppala
Ramona Comanescu
Ramona Merhej
Reena Jana
R. Rokni
Rishabh Agarwal
Ryan Mullins
Samaneh Saadat
Sara Mc Carthy
Sarah Perrin
Sébastien Arnold
Sebastian Krause
Shengyang Dai
S. Garg
Shruti Sheth
S. Ronstrom
Susan Chan
Timothy Jordan
Ting-To Yu
Tom Eccles
Tom Hennigan
Tomás Kociský
Tulsee Doshi
Vihan Jain
Vikas Yadav
Vilobh Meshram
Vishal Dharmadhikari
Warren Barkley
Wei Wei
Wenming Ye
Woohyun Han
Woosuk Kwon
Xiang Xu
Zhe Shen
Zhitao Gong
Zichuan Wei
Victor Cotruta
Phoebe Kirk
Anand Rao
Minh Giang
Ludovic Peran
T. Warkentin
Eli Collins
Joelle Barral
Zoubin Ghahramani
R. Hadsell
D. Sculley
Jeanine Banks
Anca Dragan
Slav Petrov
Oriol Vinyals
Jeffrey Dean
Demis Hassabis
Koray Kavukcuoglu
Clement Farabet
Elena Buchatskaya
Sebastian Borgeaud
Noah Fiedel
Armand Joulin
Kathleen Kenealy
Robert Dadashi
Alek Andreev
    VLM
    MoE
    OSLM
ArXivPDFHTML

Papers citing "Gemma 2: Improving Open Language Models at a Practical Size"

50 / 210 papers shown
Title
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
Yu Zhao
Alessio Devoto
Giwon Hong
Xiaotang Du
Aryo Pradipta Gema
Hongru Wang
Xuanli He
Kam-Fai Wong
Pasquale Minervini
KELM
LLMSV
50
20
0
21 Oct 2024
AUTALIC: A Dataset for Anti-AUTistic Ableist Language In Context
AUTALIC: A Dataset for Anti-AUTistic Ableist Language In Context
Naba Rizvi
Harper Strickland
Daniel Gitelman
Tristan Cooper
Alexis Morales-Flores
...
Haaset Owens
Saleha Ahmedi
Isha Khirwadkar
Imani Munyaka
Nedjma Ousidhoum
39
0
0
21 Oct 2024
Montessori-Instruct: Generate Influential Training Data Tailored for
  Student Learning
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
Xiaochuan Li
Zichun Yu
Chenyan Xiong
SyDa
47
1
0
18 Oct 2024
Electrocardiogram-Language Model for Few-Shot Question Answering with Meta Learning
Electrocardiogram-Language Model for Few-Shot Question Answering with Meta Learning
Jialu Tang
Tong Xia
Yuan Lu
Cecilia Mascolo
Aaqib Saeed
AI4MH
59
2
0
18 Oct 2024
Decomposing The Dark Matter of Sparse Autoencoders
Decomposing The Dark Matter of Sparse Autoencoders
Joshua Engels
Logan Riggs
Max Tegmark
LLMSV
65
10
0
18 Oct 2024
Adapting Multilingual LLMs to Low-Resource Languages using Continued Pre-training and Synthetic Corpus
Adapting Multilingual LLMs to Low-Resource Languages using Continued Pre-training and Synthetic Corpus
Raviraj Joshi
Kanishk Singla
Anusha Kamath
Raunak Kalani
Rakesh Paul
Utkarsh Vaidya
Sanjay Singh Chauhan
Niranjan Wartikar
Eileen Long
SyDa
CLL
40
2
0
18 Oct 2024
SLM-Mod: Small Language Models Surpass LLMs at Content Moderation
SLM-Mod: Small Language Models Surpass LLMs at Content Moderation
Xianyang Zhan
Agam Goyal
Yilun Chen
Eshwar Chandrasekharan
Koustuv Saha
AI4MH
252
2
0
17 Oct 2024
Interpreting token compositionality in LLMs: A robustness analysis
Interpreting token compositionality in LLMs: A robustness analysis
Nura Aljaafari
Danilo S. Carvalho
André Freitas
40
1
0
16 Oct 2024
MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation
MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation
S. Gorti
Ilan Gofman
Zhaoyan Liu
Jiapeng Wu
Noël Vouitsis
Guangwei Yu
Jesse C. Cresswell
Rasa Hosseinzadeh
SyDa
60
7
0
16 Oct 2024
In-context KV-Cache Eviction for LLMs via Attention-Gate
In-context KV-Cache Eviction for LLMs via Attention-Gate
Zihao Zeng
Bokai Lin
Tianqi Hou
Hao Zhang
Zhijie Deng
38
1
0
15 Oct 2024
Bias Similarity Across Large Language Models
Bias Similarity Across Large Language Models
Hyejun Jeong
Shiqing Ma
Amir Houmansadr
59
0
0
15 Oct 2024
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion
  Transformers
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers
Enze Xie
Junsong Chen
Junyu Chen
Han Cai
Haotian Tang
...
Zhekai Zhang
Zhekai Zhang
Ligeng Zhu
Yaojie Lu
Song Han
VLM
57
54
0
14 Oct 2024
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts
Guorui Zheng
Xidong Wang
Juhao Liang
Nuo Chen
Yuping Zheng
Benyou Wang
MoE
40
5
0
14 Oct 2024
Locality Alignment Improves Vision-Language Models
Locality Alignment Improves Vision-Language Models
Ian Covert
Tony Sun
James Zou
Tatsunori Hashimoto
VLM
77
4
0
14 Oct 2024
ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains
ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains
Yein Park
Chanwoong Yoon
Jungwoo Park
Donghyeon Lee
Minbyul Jeong
Jaewoo Kang
KELM
70
1
0
13 Oct 2024
FB-Bench: A Fine-Grained Multi-Task Benchmark for Evaluating LLMs' Responsiveness to Human Feedback
FB-Bench: A Fine-Grained Multi-Task Benchmark for Evaluating LLMs' Responsiveness to Human Feedback
Heng Chang
Miao Zheng
Fan Yang
Guosheng Dong
Bin Cui
Xin Wu
Zenan Zhou
Wentao Zhang
ALM
53
6
0
12 Oct 2024
The Geometry of Concepts: Sparse Autoencoder Feature Structure
The Geometry of Concepts: Sparse Autoencoder Feature Structure
Yuxiao Li
Eric J. Michaud
David D. Baek
Joshua Engels
Xiaoqing Sun
Max Tegmark
58
9
0
10 Oct 2024
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
Shenao Zhang
Zhihan Liu
Boyi Liu
Yanzhe Zhang
Yingxiang Yang
Yunxing Liu
Liyu Chen
Tao Sun
Ziyi Wang
101
3
0
10 Oct 2024
Extracting and Transferring Abilities For Building Multi-lingual Ability-enhanced Large Language Models
Extracting and Transferring Abilities For Building Multi-lingual Ability-enhanced Large Language Models
Zhipeng Chen
Liang Song
K. Zhou
Wayne Xin Zhao
Binghai Wang
Weipeng Chen
Ji-Rong Wen
68
0
0
10 Oct 2024
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
Zhihao He
Hang Yu
Zi Gong
Shizhan Liu
Jia-Nan Li
Weiyao Lin
VLM
40
1
0
09 Oct 2024
PortLLM: Personalizing Evolving Large Language Models with Training-Free and Portable Model Patches
PortLLM: Personalizing Evolving Large Language Models with Training-Free and Portable Model Patches
Rana Muhammad Shahroz Khan
Pingzhi Li
Sukwon Yun
Zhenyu Wang
S. Nirjon
Chau-Wai Wong
Tianlong Chen
KELM
45
2
0
08 Oct 2024
SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe
SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe
Yuxin Xiao
Shujian Zhang
Wenxuan Zhou
Marzyeh Ghassemi
Sanqiang Zhao
213
0
0
07 Oct 2024
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Jinhao Li
Jiaming Xu
Shan Huang
Yonghua Chen
Wen Li
...
Jiayi Pan
Li Ding
Hao Zhou
Yu Wang
Guohao Dai
62
17
0
06 Oct 2024
RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization
RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization
Hanyang Zhao
Genta Indra Winata
Anirban Das
Shi-Xiong Zhang
D. Yao
Wenpin Tang
Sambit Sahu
67
7
0
05 Oct 2024
Audio-Agent: Leveraging LLMs For Audio Generation, Editing and Composition
Audio-Agent: Leveraging LLMs For Audio Generation, Editing and Composition
Zixuan Wang
Chi-Keung Tang
Chi-Keung Tang
DiffM
VGen
LLMAG
54
4
0
04 Oct 2024
How Much Can We Forget about Data Contamination?
How Much Can We Forget about Data Contamination?
Sebastian Bordt
Suraj Srinivas
Valentyn Boreiko
U. V. Luxburg
54
1
0
04 Oct 2024
X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale
X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale
Haoran Xu
Kenton W. Murray
Philipp Koehn
Hieu T. Hoang
Akiko Eriguchi
Huda Khayrallah
55
8
0
04 Oct 2024
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
Hanrong Zhang
Jingyuan Huang
Kai Mei
Yifei Yao
Zhenting Wang
Chenlu Zhan
Hongwei Wang
Yongfeng Zhang
AAML
LLMAG
ELM
59
22
0
03 Oct 2024
TPP-LLM: Modeling Temporal Point Processes by Efficiently Fine-Tuning
  Large Language Models
TPP-LLM: Modeling Temporal Point Processes by Efficiently Fine-Tuning Large Language Models
Zefang Liu
Yinzhu Quan
46
0
0
02 Oct 2024
Integrative Decoding: Improve Factuality via Implicit Self-consistency
Integrative Decoding: Improve Factuality via Implicit Self-consistency
Yi Cheng
Xiao Liang
Yeyun Gong
Wen Xiao
Song Wang
...
Wenjie Li
Jian Jiao
Qi Chen
Peng Cheng
Wayne Xiong
HILM
65
1
0
02 Oct 2024
DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models
DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models
Yuxuan Zhang
Ruizhe Li
MoMe
58
0
0
02 Oct 2024
House of Cards: Massive Weights in LLMs
House of Cards: Massive Weights in LLMs
Jaehoon Oh
Seungjun Shin
Dokwan Oh
48
1
0
02 Oct 2024
Enhancing elusive clues in knowledge learning by contrasting attention of language models
Enhancing elusive clues in knowledge learning by contrasting attention of language models
Jian Gao
Xiao Zhang
Ji Wu
Miao Li
50
0
0
26 Sep 2024
EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models
EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models
Shaoxiong Ji
Zihao Li
Indraneil Paul
Jaakko Paavola
Peiqin Lin
...
Dayyán O'Brien
Hengyu Luo
Hinrich Schütze
Jörg Tiedemann
Barry Haddow
CLL
47
3
0
26 Sep 2024
HelloBench: Evaluating Long Text Generation Capabilities of Large
  Language Models
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models
Haoran Que
Feiyu Duan
Liqun He
Yutao Mou
Wangchunshu Zhou
...
Ge Zhang
Junran Peng
Zhaoxiang Zhang
Songyang Zhang
Kai Chen
LM&MA
ELM
VLM
56
12
0
24 Sep 2024
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Zayne Sprague
Fangcong Yin
Juan Diego Rodriguez
Dongwei Jiang
Manya Wadhwa
Prasann Singhal
Xinyu Zhao
Xi Ye
Kyle Mahowald
Greg Durrett
ReLM
LRM
125
90
0
18 Sep 2024
AI Suggestions Homogenize Writing Toward Western Styles and Diminish Cultural Nuances
AI Suggestions Homogenize Writing Toward Western Styles and Diminish Cultural Nuances
Dhruv Agarwal
Mor Naaman
Aditya Vashistha
43
16
0
17 Sep 2024
Large Language Models are Good Multi-lingual Learners : When LLMs Meet Cross-lingual Prompts
Large Language Models are Good Multi-lingual Learners : When LLMs Meet Cross-lingual Prompts
Teng Wang
Zhenqi He
Wing-Yin Yu
Xiaojin Fu
Xiongwei Han
LRM
66
5
0
17 Sep 2024
Flash STU: Fast Spectral Transform Units
Flash STU: Fast Spectral Transform Units
Y. Isabel Liu
Windsor Nguyen
Yagiz Devre
Evan Dogariu
Anirudha Majumdar
Elad Hazan
AI4TS
74
1
0
16 Sep 2024
Cracking the Code: Multi-domain LLM Evaluation on Real-World Professional Exams in Indonesia
Cracking the Code: Multi-domain LLM Evaluation on Real-World Professional Exams in Indonesia
Fajri Koto
ELM
63
2
0
13 Sep 2024
What is the Role of Small Models in the LLM Era: A Survey
What is the Role of Small Models in the LLM Era: A Survey
Lihu Chen
Gaël Varoquaux
ALM
66
23
0
10 Sep 2024
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation
Ilya Gusev
LLMAG
58
3
0
10 Sep 2024
Residual Stream Analysis with Multi-Layer SAEs
Residual Stream Analysis with Multi-Layer SAEs
Tim Lawson
Lucy Farnik
Conor Houghton
Laurence Aitchison
39
3
0
06 Sep 2024
Large Language Models-Enabled Digital Twins for Precision Medicine in
  Rare Gynecological Tumors
Large Language Models-Enabled Digital Twins for Precision Medicine in Rare Gynecological Tumors
Jacqueline Lammert
Nicole Pfarr
Leonid Kuligin
Sonja Mathes
Tobias Dreyer
...
Martin Boeker
Marion Kiechle
Ulrich A. Schatz
Holger Bronger
Maximilian Tschochohei
LM&MA
AI4CE
26
0
0
31 Aug 2024
W-RAG: Weakly Supervised Dense Retrieval in RAG for Open-domain Question Answering
W-RAG: Weakly Supervised Dense Retrieval in RAG for Open-domain Question Answering
Jinming Nian
Zhiyuan Peng
Qifan Wang
Yi Fang
RALM
78
2
0
15 Aug 2024
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting
Zilong Wang
Zifeng Wang
Long Le
Huaixiu Steven Zheng
Swaroop Mishra
...
Anush Mattapalli
Ankur Taly
Jingbo Shang
Chen-Yu Lee
Tomas Pfister
RALM
85
34
0
11 Jul 2024
Teola: Towards End-to-End Optimization of LLM-based Applications
Teola: Towards End-to-End Optimization of LLM-based Applications
Xin Tan
Yimin Jiang
Yitao Yang
Hong-Yu Xu
75
5
0
29 Jun 2024
Evaluating Copyright Takedown Methods for Language Models
Evaluating Copyright Takedown Methods for Language Models
Boyi Wei
Weijia Shi
Yangsibo Huang
Noah A. Smith
Chiyuan Zhang
Luke Zettlemoyer
Kai Li
Peter Henderson
56
20
0
26 Jun 2024
Know the Unknown: An Uncertainty-Sensitive Method for LLM Instruction
  Tuning
Know the Unknown: An Uncertainty-Sensitive Method for LLM Instruction Tuning
Jiaqi Li
Yixuan Tang
Yi Yang
51
6
0
14 Jun 2024
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs
  with Nothing
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
Zhangchen Xu
Fengqing Jiang
Luyao Niu
Yuntian Deng
Radha Poovendran
Yejin Choi
Bill Yuchen Lin
SyDa
59
129
0
12 Jun 2024
Previous
12345
Next