2403.08295
Cited By
Gemma: Open Models Based on Gemini Research and Technology
13 March 2024
Gemma Team
Thomas Mesnard
Cassidy Hardin
Robert Dadashi
Surya Bhupatiraju
Shreya Pathak
Laurent Sifre
Morgane Riviere
Mihir Kale
J Christopher Love
P. Tafti
Léonard Hussenot
Pier Giuseppe Sessa
Aakanksha Chowdhery
Adam Roberts
Aditya Barua
Alex Botev
Alex Castro-Ros
Ambrose Slone
Amélie Héliou
Andrea Tacchetti
Anna Bulanova
Antonia Paterson
Beth Tsai
Bobak Shahriari
Charline Le Lan
Christopher A. Choquette-Choo
Clément Crepy
Daniel Cer
Daphne Ippolito
David Reid
Elena Buchatskaya
Eric Ni
Eric Noland
Geng Yan
George Tucker
George-Christian Muraru
Grigory Rozhdestvenskiy
Henryk Michalewski
Ian Tenney
Ivan Grishchenko
Jacob Austin
James Keeling
Jane Labanowski
Jean-Baptiste Lespiau
Jeff Stanway
Jenny Brennan
Jeremy Chen
Johan Ferret
Justin T Chiu
J. Mao-Jones
Katherine Lee
Kathy Yu
Katie Millican
Lars Lowe Sjoesund
Lisa Lee
Lucas Dixon
Machel Reid
Maciej Mikula
Mateo Wirth
Michael Sharman
Nikolai Chinaev
Nithum Thain
Olivier Bachem
Oscar Chang
O. Wahltinez
Paige Bailey
Paul Michel
Petko Yotov
Rahma Chaabouni
Ramona Comanescu
Reena Jana
Rohan Anil
Ross McIlroy
Ruibo Liu
Ryan Mullins
Samuel L. Smith
Sebastian Borgeaud
Sertan Girgin
Sholto Douglas
Shree Pandya
Siamak Shakeri
Soham De
Ted Klimenko
Tom Hennigan
Vladimir Feinberg
Wojciech Stokowiec
Yu-hui Chen
Zafarali Ahmed
Zhitao Gong
T. Warkentin
Ludovic Peran
Minh Giang
Clement Farabet
Oriol Vinyals
Jeffrey Dean
Koray Kavukcuoglu
Demis Hassabis
Zoubin Ghahramani
Douglas Eck
Joelle Barral
Fernando Pereira
Eli Collins
Armand Joulin
Noah Fiedel
Evan Senter
Alek Andreev
Kathleen Kenealy
VLM
LLMAG
Papers citing
"Gemma: Open Models Based on Gemini Research and Technology"
50 / 78 papers shown
ActiveDPO: Active Direct Preference Optimization for Sample-Efficient Alignment
Xiaoqiang Lin
Arun Verma
Zhongxiang Dai
Daniela Rus
See-Kiong Ng
Bryan Kian Hsiang Low
263
0
0
25 May 2025
VISTA: Vision-Language Inference for Training-Free Stock Time-Series Analysis
Tina Khezresmaeilzadeh
Parsa Razmara
Seyedarmin Azizi
Mohammad Erfan Sadeghi
Erfan Baghaei Portaghloo
AI4TS
256
0
0
24 May 2025
From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning
Chen Shani
Dan Jurafsky
Yann LeCun
Ravid Shwartz-Ziv
188
0
0
21 May 2025
Text Generation Beyond Discrete Token Sampling
Yufan Zhuang
Liyuan Liu
Chandan Singh
Jingbo Shang
Jianfeng Gao
OOD
157
1
0
20 May 2025
Adversarial Attacks in Multimodal Systems: A Practitioner's Survey
Shashank Kapoor
Sanjay Surendranath Girija
Lakshit Arora
Dipen Pradhan
Ankit Shetgaonkar
Aman Raj
AAML
152
0
0
06 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Wei Wei
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
290
0
0
05 May 2025
Steering the CensorShip: Uncovering Representation Vectors for LLM "Thought" Control
Hannah Cyberey
David Evans
LLMSV
152
3
0
23 Apr 2025
MLLM-Selector: Necessity and Diversity-driven High-Value Data Selection for Enhanced Visual Instruction Tuning
Yiwei Ma
Guohai Xu
Xiaoshuai Sun
Jiayi Ji
Jie Lou
Debing Zhang
Rongrong Ji
194
2
0
26 Mar 2025
AfroXLMR-Social: Adapting Pre-trained Language Models for African Languages Social Media Text
Tadesse Destaw Belay
Israel Abebe Azime
Ibrahim Said Ahmad
David Ifeoluwa Adelani
Idris Abdulmumin
Abinew Ali Ayele
Shamsuddeen Hassan Muhammad
Seid Muhie Yimam
104
0
0
24 Mar 2025
Green Prompting
Marta Adamska
Daria Smirnova
Hamid Nasiri
Zhengxin Yu
Peter Garraghan
517
1
0
09 Mar 2025
LEWIS (LayEr WIse Sparsity) -- A Training Free Guided Model Merging Approach
Hetarth Chopra
Vidhi Rambhia
Vikram Adve
MoMe
118
0
0
05 Mar 2025
MLaKE: Multilingual Knowledge Editing Benchmark for Large Language Models
Zihao Wei
Jingcheng Deng
Liang Pang
Hanxing Ding
Huawei Shen
Xueqi Cheng
KELM
135
7
0
20 Feb 2025
Out-of-distribution generalization via composition: a lens through induction heads in Transformers
Jiajun Song
Zhuoyan Xu
Yiqiao Zhong
148
10
0
31 Dec 2024
Large-scale moral machine experiment on large language models
Muhammad Shahrul Zaim bin Ahmad
Kazuhiro Takemoto
ELM
AILaw
103
1
1
31 Dec 2024
ConTrans: Weak-to-Strong Alignment Engineering via Concept Transplantation
Weilong Dong
Xinwei Wu
Renren Jin
Shaoyang Xu
Deyi Xiong
131
9
0
31 Dec 2024
ComparisonQA: Evaluating Factuality Robustness of LLMs Through Knowledge Frequency Control and Uncertainty
Qing Zong
Zhaoxiang Wang
Tianshi Zheng
Xiyu Ren
Yangqiu Song
147
3
0
28 Dec 2024
Neptune: The Long Orbit to Benchmarking Long Video Understanding
Arsha Nagrani
Ruotong Wang
Ramin Mehran
Rachel Hornung
N. B. Gundavarapu
...
Boqing Gong
Cordelia Schmid
Mikhail Sirotenko
Yukun Zhu
Tobias Weyand
153
8
0
12 Dec 2024
ReWind: Understanding Long Videos with Instructed Learnable Memory
Anxhelo Diko
Tinghuai Wang
Wassim Swaileh
Shiyan Sun
Ioannis Patras
KELM
VLM
129
1
0
23 Nov 2024
Bag of Design Choices for Inference of High-Resolution Masked Generative Transformer
Shitong Shao
Zikai Zhou
Tian Ye
Lichen Bai
Zhiqiang Xu
Zeke Xie
DiffM
101
0
0
16 Nov 2024
LSHBloom: Memory-efficient, Extreme-scale Document Deduplication
A. Khan
Robert Underwood
Carlo Siebenschuh
Y. Babuji
Aswathy Ajith
Kyle Hippe
Ozan Gokdemir
Alexander Brace
Kyle Chard
Ian Foster
67
0
0
06 Nov 2024
Data Extraction Attacks in Retrieval-Augmented Generation via Backdoors
Yuefeng Peng
Junda Wang
Hong-ye Yu
Amir Houmansadr
SILM
99
3
0
03 Nov 2024
Generative AI-Powered Plugin for Robust Federated Learning in Heterogeneous IoT Networks
Youngjoon Lee
J. Gong
Joonhyuk Kang
87
0
0
31 Oct 2024
Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA
Sangmin Bae
Adam Fisch
Hrayr Harutyunyan
Ziwei Ji
Seungyeon Kim
Tal Schuster
KELM
125
7
0
28 Oct 2024
Think Carefully and Check Again! Meta-Generation Unlocking LLMs for Low-Resource Cross-Lingual Summarization
Zhecheng Li
Yijiao Wang
Bryan Hooi
Yujun Cai
Naifan Cheung
Nanyun Peng
Kai-Wei Chang
179
1
0
26 Oct 2024
VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning
Yifan Peng
Krishna Puvvada
Zhehuai Chen
Piotr Żelasko
He Huang
Kunal Dhawan
Ke Hu
Shinji Watanabe
Jagadeesh Balam
Boris Ginsburg
132
5
0
23 Oct 2024
Science Out of Its Ivory Tower: Improving Accessibility with Reinforcement Learning
Haining Wang
Jason Clark
Hannah McKelvey
Leila Sterman
Zheng Gao
Zuoyu Tian
Sandra Kübler
Xiaozhong Liu
98
1
0
22 Oct 2024
Magnetic Preference Optimization: Achieving Last-iterate Convergence for Language Model Alignment
Mingzhi Wang
Chengdong Ma
Qizhi Chen
Linjian Meng
Yang Han
Jiancong Xiao
Zhaowei Zhang
Jing Huo
Weijie Su
Yaodong Yang
126
9
0
22 Oct 2024
MiniPLM: Knowledge Distillation for Pre-Training Language Models
Yuxian Gu
Hao Zhou
Fandong Meng
Jie Zhou
Minlie Huang
148
7
0
22 Oct 2024
Bias Similarity Across Large Language Models
Hyejun Jeong
Shiqing Ma
Amir Houmansadr
93
0
0
15 Oct 2024
ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time
Yi Ding
Bolian Li
Ruqi Zhang
MLLM
112
14
0
09 Oct 2024
Round and Round We Go! What makes Rotary Positional Encodings useful?
Federico Barbero
Alex Vitvitskyi
Christos Perivolaropoulos
Razvan Pascanu
Petar Velickovic
121
29
0
08 Oct 2024
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
Ulyana Piterbarg
Lerrel Pinto
Rob Fergus
SyDa
82
2
0
03 Oct 2024
SPINE: Online Semantic Planning for Missions with Incomplete Natural Language Specifications in Unstructured Environments
Zachary Ravichandran
Varun Murali
Mariliza Tzes
George J. Pappas
Vijay Kumar
LRM
101
10
0
03 Oct 2024
U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models
Tung-Yu Wu
Pei-Yu Lo
ReLM
LRM
113
2
0
02 Oct 2024
A Watermark for Black-Box Language Models
Dara Bahri
John Wieting
WaLM
118
5
0
02 Oct 2024
Scheherazade: Evaluating Chain-of-Thought Math Reasoning in LLMs with Chain-of-Problems
Stephen Miner
Yoshiki Takashima
Simeng Han
Ferhat Erata
Timos Antonopoulos
R. Piskac
Scott J. Shapiro
LRM
142
4
0
30 Sep 2024
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
Weifeng Lin
Xinyu Wei
Renrui Zhang
Le Zhuo
Shitian Zhao
...
Junlin Xie
Yu Qiao
Peng Gao
Hongsheng Li
MLLM
DiffM
158
15
0
23 Sep 2024
Bilingual Evaluation of Language Models on General Knowledge in University Entrance Exams with Minimal Contamination
Eva Sánchez Salido
Roser Morante
Julio Gonzalo
Guillermo Marco
Jorge Carrillo-de-Albornoz
...
Enrique Amigó
Andrés Fernández
Alejandro Benito-Santos
Adrián Ghajari Espinosa
Victor Fresno
ELM
91
0
0
19 Sep 2024
Small Language Models can Outperform Humans in Short Creative Writing: A Study Comparing SLMs with Humans and LLMs
Guillermo Marco
Luz Rello
Julio Gonzalo
LM&MA
ALM
95
7
0
17 Sep 2024
Your Weak LLM is Secretly a Strong Teacher for Alignment
Leitian Tao
Yixuan Li
123
9
0
13 Sep 2024
LLMs generate structurally realistic social networks but overestimate political homophily
Serina Chang
Alicja Chaszczewicz
Emma Wang
Maya Josifovska
Emma Pierson
J. Leskovec
112
8
0
29 Aug 2024
Personality Alignment of Large Language Models
Minjun Zhu
Linyi Yang
Yue Zhang
ALM
117
8
0
21 Aug 2024
Task-level Distributionally Robust Optimization for Large Language Model-based Dense Retrieval
Guangyuan Ma
Yongliang Ma
Xing Wu
Zhenpeng Su
Ming Zhou
Songlin Hu
OOD
154
3
0
20 Aug 2024
Bridging the Language Gap: Enhancing Multilingual Prompt-Based Code Generation in LLMs via Zero-Shot Cross-Lingual Transfer
Mingda Li
Abhijit Mishra
Utkarsh Mujumdar
92
0
0
19 Aug 2024
MoDeGPT: Modular Decomposition for Large Language Model Compression
Chi-Heng Lin
Shangqian Gao
James Seale Smith
Abhishek Patel
Shikhar Tuli
Yilin Shen
Hongxia Jin
Yen-Chang Hsu
142
13
0
19 Aug 2024
MIDAS: Multi-level Intent, Domain, And Slot Knowledge Distillation for Multi-turn NLU
Yan Li
So-Eon Kim
Seong-Bae Park
S. Han
80
1
0
15 Aug 2024
Boosting Reward Model with Preference-Conditional Multi-Aspect Synthetic Data Generation
Jiaming Shen
Ran Xu
Yennie Jun
Zhen Qin
Tianqi Liu
Carl Yang
Yi Liang
Simon Baumgartner
Michael Bendersky
SyDa
118
5
0
22 Jul 2024
Evaluating the Reliability of Self-Explanations in Large Language Models
Korbinian Randl
John Pavlopoulos
Aron Henriksson
Tony Lindgren
LRM
108
1
0
19 Jul 2024
Training on the Test Task Confounds Evaluation and Emergence
Ricardo Dominguez-Olmedo
Florian E. Dorner
Moritz Hardt
ELM
135
9
1
10 Jul 2024
MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models
Shengkang Wang
Hongzhan Lin
Ziyang Luo
Zhen Ye
Guang Chen
Jing Ma
123
4
0
17 Jun 2024