ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2408.00118
  4. Cited By
Gemma 2: Improving Open Language Models at a Practical Size

Gemma 2: Improving Open Language Models at a Practical Size

31 July 2024
Gemma Team
Gemma Team Morgane Riviere
Shreya Pathak
Pier Giuseppe Sessa
Cassidy Hardin
Surya Bhupatiraju
Léonard Hussenot
Thomas Mesnard
Bobak Shahriari
Alexandre Ramé
Johan Ferret
Peter J. Liu
P. Tafti
Abe Friesen
Michelle Casbon
Sabela Ramos
Ravin Kumar
Charline Le Lan
Sammy Jerome
Anton Tsitsulin
Nino Vieillard
Piotr Stańczyk
Sertan Girgin
Nikola Momchev
Matt Hoffman
S. Thakoor
Jean-Bastien Grill
Behnam Neyshabur
Olivier Bachem
Alanna Walton
Aliaksei Severyn
Alicia Parrish
Aliya Ahmad
Allen Hutchison
Alvin Abdagic
Amanda Carl
Amy Shen
Andy Brock
Andy Coenen
Anthony Laforge
Antonia Paterson
Ben Bastian
Bilal Piot
Boxi Wu
Brandon Royal
Charlie Chen
Chintu Kumar
Chris Perry
Christoper A. Welty
Christopher A. Choquette-Choo
Danila Sinopalnikov
David Weinberger
Dimple Vijaykumar
Dominika Rogoziñska
D. Herbison
Elisa Bandy
Emma Wang
Eric Noland
Erica Moreira
Evan Senter
Evgenii Eltyshev
Francesco Visin
Gabriel Rasskin
Gary Wei
Glenn Cameron
Gus Martins
Hadi Hashemi
Hanna Klimczak-Pluciñska
Harleen Batra
H. Dhand
Ivan Nardini
Jacinda Mein
Jack Zhou
James Svensson
Jeff Stanway
Jetha Chan
Jin Zhou
Joana Carrasqueira
Joana Iljazi
Jocelyn Becker
Joe Fernandez
Joost R. van Amersfoort
Josh Gordon
Josh Lipschultz
Joshua Newlan
Junsong Ji
Kareem Mohamed
Kartikeya Badola
Kat Black
Katie Millican
Keelin McDonell
Kelvin Nguyen
Kiranbir Sodhia
Kish Greene
Lars Lowe Sjoesund
Lauren Usui
Laurent Sifre
L. Heuermann
Leticia Lago
Lilly McNealus
Livio Baldini Soares
Logan Kilpatrick
Lucas Dixon
Luciano Martins
Machel Reid
Manvinder Singh
Mark Iverson
Martin Gorner
Mat Velloso
Mateo Wirth
Matt Davidow
Matt Miller
Matthew Rahtz
Matthew Watson
Meg Risdal
Mehran Kazemi
Michael Moynihan
Ming Zhang
Minsuk Kahng
Minwoo Park
Mofi Rahman
Mohit Khatwani
Natalie Dao
Nenshad Bardoliwalla
Nesh Devanathan
Neta Dumai
Nilay Chauhan
O. Wahltinez
Pankil Botarda
Parker Barnes
P. Barham
Paul Michel
Pengchong Jin
Petko Georgiev
Phil Culliton
Pradeep Kuppala
Ramona Comanescu
Ramona Merhej
Reena Jana
R. Rokni
Rishabh Agarwal
Ryan Mullins
Samaneh Saadat
Sara Mc Carthy
Sarah Perrin
Sébastien Arnold
Sebastian Krause
Shengyang Dai
S. Garg
Shruti Sheth
S. Ronstrom
Susan Chan
Timothy Jordan
Ting-To Yu
Tom Eccles
Tom Hennigan
Tomás Kociský
Tulsee Doshi
Vihan Jain
Vikas Yadav
Vilobh Meshram
Vishal Dharmadhikari
Warren Barkley
Wei Wei
Wenming Ye
Woohyun Han
Woosuk Kwon
Xiang Xu
Zhe Shen
Zhitao Gong
Zichuan Wei
Victor Cotruta
Phoebe Kirk
Anand Rao
Minh Giang
Ludovic Peran
T. Warkentin
Eli Collins
Joelle Barral
Zoubin Ghahramani
R. Hadsell
D. Sculley
Jeanine Banks
Anca Dragan
Slav Petrov
Oriol Vinyals
Jeffrey Dean
Demis Hassabis
Koray Kavukcuoglu
Clement Farabet
Elena Buchatskaya
Sebastian Borgeaud
Noah Fiedel
Armand Joulin
Kathleen Kenealy
Robert Dadashi
Alek Andreev
    VLM
    MoE
    OSLM
ArXivPDFHTML

Papers citing "Gemma 2: Improving Open Language Models at a Practical Size"

50 / 209 papers shown
Title
CONGRAD:Conflicting Gradient Filtering for Multilingual Preference Alignment
CONGRAD:Conflicting Gradient Filtering for Multilingual Preference Alignment
Jiangnan Li
Thuy-Trang Vu
Christian Herold
Amirhossein Tebbifakhr
Shahram Khadivi
Gholamreza Haffari
42
0
0
31 Mar 2025
Contextualize-then-Aggregate: Circuits for In-Context Learning in Gemma-2 2B
Contextualize-then-Aggregate: Circuits for In-Context Learning in Gemma-2 2B
Aleksandra Bakalova
Yana Veitsman
Xinting Huang
Michael Hahn
41
0
0
31 Mar 2025
Benchmarking Systematic Relational Reasoning with Large Language and Reasoning Models
Benchmarking Systematic Relational Reasoning with Large Language and Reasoning Models
Irtaza Khalid
Amir Masoud Nourollah
Steven Schockaert
LRM
57
0
0
30 Mar 2025
HRET: A Self-Evolving LLM Evaluation Toolkit for Korean
HRET: A Self-Evolving LLM Evaluation Toolkit for Korean
Hanwool Albert Lee
Soo Yong Kim
Dasol Choi
Sangwon Baek
Seunghyeok Hong
Ilgyun Jeong
Inseon Hwang
Naeun Lee
Guijin Son
VLM
51
0
0
29 Mar 2025
Outlier dimensions favor frequent tokens in language models
Outlier dimensions favor frequent tokens in language models
Iuri Macocco
Nora Graichen
Gemma Boleda
Marco Baroni
60
0
0
27 Mar 2025
Shared Global and Local Geometry of Language Model Embeddings
Shared Global and Local Geometry of Language Model Embeddings
Andrew Lee
Melanie Weber
F. Viégas
Martin Wattenberg
FedML
79
3
0
27 Mar 2025
ImF: Implicit Fingerprint for Large Language Models
ImF: Implicit Fingerprint for Large Language Models
Wu jiaxuan
Peng Wanli
Fu hang
Xue Yiming
Wen juan
41
0
0
25 Mar 2025
AfroXLMR-Social: Adapting Pre-trained Language Models for African Languages Social Media Text
AfroXLMR-Social: Adapting Pre-trained Language Models for African Languages Social Media Text
Tadesse Destaw Belay
Israel Abebe Azime
Ibrahim Said Ahmad
Idris Abdulmumin
Idris Abdulmumin
Abinew Ali Ayele
Shamsuddeen Hassan Muhammad
Seid Muhie Yimam
43
0
0
24 Mar 2025
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models
Jinjin Zhang
Qiuyu Huang
Junjie Liu
Xiefan Guo
Di Huang
62
2
0
24 Mar 2025
Investigating Retrieval-Augmented Generation in Quranic Studies: A Study of 13 Open-Source Large Language Models
Investigating Retrieval-Augmented Generation in Quranic Studies: A Study of 13 Open-Source Large Language Models
Zahra Khalila
Arbi Haza Nasution
Winda Monika
Aytug Onan
Yohei Murakami
Yasir Bin Ismail Radi
Noor Mohammad Osmani
RALM
81
0
0
20 Mar 2025
Don't lie to your friends: Learning what you know from collaborative self-play
Don't lie to your friends: Learning what you know from collaborative self-play
Jacob Eisenstein
Reza Aghajani
Adam Fisch
Dheeru Dua
Fantine Huot
Mirella Lapata
Vicky Zayats
Jonathan Berant
72
0
0
18 Mar 2025
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels
M. Beck
Korbinian Poppel
Phillip Lippe
Sepp Hochreiter
74
1
0
18 Mar 2025
Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control
Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control
Nvidia
Hassan Abu Alhaija
Jose M. Alvarez
Maciej Bala
Tiffany Cai
...
Yuchong Ye
Xiaodong Yang
Boxin Wang
Fangyin Wei
Yu Zeng
VGen
95
2
0
18 Mar 2025
PLM: Efficient Peripheral Language Models Hardware-Co-Designed for Ubiquitous Computing
PLM: Efficient Peripheral Language Models Hardware-Co-Designed for Ubiquitous Computing
Cheng Deng
Luoyang Sun
Jiwen Jiang
Yongcheng Zeng
Xinjian Wu
...
Haoyang Li
Lei Chen
Lionel M. Ni
Jun Wang
Jun Wang
222
0
0
15 Mar 2025
TigerLLM -- A Family of Bangla Large Language Models
TigerLLM -- A Family of Bangla Large Language Models
Nishat Raihan
Marcos Zampieri
50
0
0
14 Mar 2025
High-Dimensional Interlingual Representations of Large Language Models
High-Dimensional Interlingual Representations of Large Language Models
Bryan Wilie
Samuel Cahyawijaya
Junxian He
Pascale Fung
62
0
0
14 Mar 2025
Florenz: Scaling Laws for Systematic Generalization in Vision-Language Models
Julian Spravil
Sebastian Houben
Sven Behnke
VLM
83
0
0
12 Mar 2025
SemEval-2025 Task 11: Bridging the Gap in Text-Based Emotion Detection
SemEval-2025 Task 11: Bridging the Gap in Text-Based Emotion Detection
Shamsuddeen Hassan Muhammad
N. Ousidhoum
Idris Abdulmumin
Seid Muhie Yimam
Jan Philip Wahle
...
Abinew Ali Ayele
Oana Ignat
Alexander Panchenko
Yi Zhou
Saif M. Mohammad
54
3
0
10 Mar 2025
WildIFEval: Instruction Following in the Wild
Gili Lior
Asaf Yehudai
Ariel Gera
L. Ein-Dor
74
0
0
09 Mar 2025
Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models
Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models
Xubin Wang
Zhiqing Tang
Jianxiong Guo
Tianhui Meng
Chenhao Wang
Tian-sheng Wang
Weijia Jia
65
1
0
08 Mar 2025
Using Mechanistic Interpretability to Craft Adversarial Attacks against Large Language Models
Using Mechanistic Interpretability to Craft Adversarial Attacks against Large Language Models
Thomas Winninger
Boussad Addad
Katarzyna Kapusta
AAML
68
0
0
08 Mar 2025
Fine-Tuning Florence2 for Enhanced Object Detection in Un-constructed Environments: Vision-Language Model Approach
Fine-Tuning Florence2 for Enhanced Object Detection in Un-constructed Environments: Vision-Language Model Approach
Soumyadeep Ro
Sanapala Satwika
Pamarthi Yasoda Gayathri
Mohmmad Ghaith Balsha
Aysegul Ucar
VLM
ObjD
61
0
0
06 Mar 2025
HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization
HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization
Zhijian Zhuo
Yutao Zeng
Ya Wang
Sijun Zhang
Jian Yang
Xiaoqing Li
Xun Zhou
Jinwen Ma
51
0
0
06 Mar 2025
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs
Abdelrahman Abouelenin
Atabak Ashfaq
Adam Atkinson
Hany Awadalla
Nguyen Bach
...
Ishmam Zabir
Yunan Zhang
Li Zhang
Wenjie Qu
Xiren Zhou
MoE
SyDa
78
32
0
03 Mar 2025
ALinFiK: Learning to Approximate Linearized Future Influence Kernel for Scalable Third-Party LLM Data Valuation
ALinFiK: Learning to Approximate Linearized Future Influence Kernel for Scalable Third-Party LLM Data Valuation
Yanzhou Pan
Huawei Lin
Yide Ran
Jiamin Chen
Xiaodong Yu
Weijie Zhao
Denghui Zhang
Zhaozhuo Xu
42
1
0
02 Mar 2025
Fuzzy Speculative Decoding for a Tunable Accuracy-Runtime Tradeoff
Fuzzy Speculative Decoding for a Tunable Accuracy-Runtime Tradeoff
Maximilian Holsman
Yukun Huang
Bhuwan Dhingra
51
0
0
28 Feb 2025
Plan2Align: Predictive Planning Based Test-Time Preference Alignment in Paragraph-Level Machine Translation
Plan2Align: Predictive Planning Based Test-Time Preference Alignment in Paragraph-Level Machine Translation
Kuang-Da Wang
Teng-Ruei Chen
Yu-Heng Hung
Shuoyang Ding
Yueh-Hua Wu
Yu-Chun Wang
Chao-Han Huck Yang
Wen-Chih Peng
Ping-Chun Hsieh
79
0
0
28 Feb 2025
A Pilot Empirical Study on When and How to Use Knowledge Graphs as Retrieval Augmented Generation
A Pilot Empirical Study on When and How to Use Knowledge Graphs as Retrieval Augmented Generation
Xujie Yuan
Yongxu Liu
Shimin Di
Shiwen Wu
Libin Zheng
Rui Meng
Lei Chen
Xiaofang Zhou
Jian Yin
41
0
0
28 Feb 2025
Self-Training Elicits Concise Reasoning in Large Language Models
Self-Training Elicits Concise Reasoning in Large Language Models
Tergel Munkhbat
Namgyu Ho
S. Kim
Yongjin Yang
Yujin Kim
Se-Young Yun
ReLM
LRM
71
16
0
27 Feb 2025
LLM as a Broken Telephone: Iterative Generation Distorts Information
LLM as a Broken Telephone: Iterative Generation Distorts Information
Amr Mohamed
Mingmeng Geng
Michalis Vazirgiannis
Guokan Shang
83
1
0
27 Feb 2025
BIG-Bench Extra Hard
BIG-Bench Extra Hard
Mehran Kazemi
Bahare Fatemi
Hritik Bansal
John Palowitch
Chrysovalantis Anastasiou
...
Kate Olszewska
Yi Tay
Vinh Q. Tran
Quoc V. Le
Orhan Firat
ELM
LRM
122
8
0
26 Feb 2025
Towards Label-Only Membership Inference Attack against Pre-trained Large Language Models
Towards Label-Only Membership Inference Attack against Pre-trained Large Language Models
Yu He
Boheng Li
Lu Liu
Zhongjie Ba
Wei Dong
Yiming Li
Zengchang Qin
Kui Ren
Chong Chen
MIALM
79
0
0
26 Feb 2025
A City of Millions: Mapping Literary Social Networks At Scale
A City of Millions: Mapping Literary Social Networks At Scale
Sil Hamilton
Rebecca M. M. Hicke
David M. Mimno
Matthew Wilkens
GNN
264
1
0
26 Feb 2025
FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models
FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models
Hongzhan Lin
Yang Deng
Yuxuan Gu
Wenxuan Zhang
Jing Ma
See-Kiong Ng
Tat-Seng Chua
LLMAG
KELM
HILM
73
0
0
25 Feb 2025
SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models
SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models
Yuxuan Zhang
CLL
ALM
73
1
0
25 Feb 2025
Fed-SB: A Silver Bullet for Extreme Communication Efficiency and Performance in (Private) Federated LoRA Fine-Tuning
Fed-SB: A Silver Bullet for Extreme Communication Efficiency and Performance in (Private) Federated LoRA Fine-Tuning
Raghav Singhal
Kaustubh Ponkshe
Rohit Vartak
Lav R. Varshney
Praneeth Vepakomma
FedML
84
1
0
24 Feb 2025
Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training
Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training
Jaydeep Borkar
Matthew Jagielski
Katherine Lee
Niloofar Mireshghallah
David A. Smith
Christopher A. Choquette-Choo
PILM
90
1
0
24 Feb 2025
IPO: Your Language Model is Secretly a Preference Classifier
IPO: Your Language Model is Secretly a Preference Classifier
Shivank Garg
Ayush Singh
Shweta Singh
Paras Chopra
238
1
0
22 Feb 2025
ExpliCa: Evaluating Explicit Causal Reasoning in Large Language Models
ExpliCa: Evaluating Explicit Causal Reasoning in Large Language Models
Martina Miliani
S. Auriemma
Alessandro Bondielli
Emmanuele Chersoni
Lucia Passaro
Irene Sucameli
Alessandro Lenci
LRM
ELM
57
0
0
21 Feb 2025
Multilingual Language Model Pretraining using Machine-translated Data
Multilingual Language Model Pretraining using Machine-translated Data
Jiayi Wang
Yao Lu
Maurice Weber
Max Ryabinin
David Ifeoluwa Adelani
Yihong Chen
Raphael Tang
Pontus Stenetorp
LRM
88
3
0
20 Feb 2025
Drift: Decoding-time Personalized Alignments with Implicit User Preferences
Drift: Decoding-time Personalized Alignments with Implicit User Preferences
Minbeom Kim
Kang-il Lee
Seongho Joo
Hwaran Lee
Thibaut Thonet
Kyomin Jung
AI4TS
121
1
0
20 Feb 2025
Baichuan-M1: Pushing the Medical Capability of Large Language Models
Binghui Wang
Haizhou Zhao
Huozhi Zhou
Liang Song
Mingyu Xu
...
Yan Zhang
Yifei Duan
Yuyan Zhou
Zhi-Ming Ma
Zhikai Wu
LM&MA
ELM
AI4MH
47
4
0
18 Feb 2025
Do we still need Human Annotators? Prompting Large Language Models for Aspect Sentiment Quad Prediction
Nils Constantin Hellwig
Jakob Fehle
Udo Kruschwitz
Christian Wolff
AI4MH
51
0
0
18 Feb 2025
None of the Others: a General Technique to Distinguish Reasoning from Memorization in Multiple-Choice LLM Evaluation Benchmarks
None of the Others: a General Technique to Distinguish Reasoning from Memorization in Multiple-Choice LLM Evaluation Benchmarks
Eva Sánchez Salido
Julio Gonzalo
Guillermo Marco
ELM
65
3
0
18 Feb 2025
Can Your Uncertainty Scores Detect Hallucinated Entity?
Can Your Uncertainty Scores Detect Hallucinated Entity?
Min-Hsuan Yeh
Max Kamachee
Seongheon Park
Yixuan Li
HILM
55
2
0
17 Feb 2025
Competing LLM Agents in a Non-Cooperative Game of Opinion Polarisation
Competing LLM Agents in a Non-Cooperative Game of Opinion Polarisation
Amin Qasmi
Usman Naseem
Mehwish Nasim
46
0
0
17 Feb 2025
Smoothing Out Hallucinations: Mitigating LLM Hallucination with Smoothed Knowledge Distillation
Smoothing Out Hallucinations: Mitigating LLM Hallucination with Smoothed Knowledge Distillation
Hieu Nguyen
Zihao He
Shoumik Atul Gandre
Ujjwal Pasupulety
Sharanya Kumari Shivakumar
Kristina Lerman
HILM
59
1
0
16 Feb 2025
Mind the Confidence Gap: Overconfidence, Calibration, and Distractor Effects in Large Language Models
Mind the Confidence Gap: Overconfidence, Calibration, and Distractor Effects in Large Language Models
Prateek Chhikara
54
1
0
16 Feb 2025
ReLearn: Unlearning via Learning for Large Language Models
ReLearn: Unlearning via Learning for Large Language Models
Haoming Xu
Ningyuan Zhao
Liming Yang
Sendong Zhao
Shumin Deng
Mengru Wang
Bryan Hooi
Nay Oo
Huajun Chen
N. Zhang
KELM
CLL
MU
270
0
0
16 Feb 2025
Fast Proxies for LLM Robustness Evaluation
Fast Proxies for LLM Robustness Evaluation
Tim Beyer
Jan Schuchardt
Leo Schwinn
Stephan Günnemann
AAML
51
0
0
14 Feb 2025
Previous
12345
Next