ResearchTrend.AI
arXiv:2404.07965
Rho-1: Not All Tokens Are What You Need

11 April 2024
Zhenghao Lin
Zhibin Gou
Yeyun Gong
Xiao Liu
Yelong Shen
Ruochen Xu
Chen Lin
Yujiu Yang
Jian Jiao
Nan Duan
Weizhu Chen
    CLL

Papers citing "Rho-1: Not All Tokens Are What You Need"

21 / 121 papers shown
Deep Double Descent: Where Bigger Models and More Data Hurt
Preetum Nakkiran
Gal Kaplun
Yamini Bansal
Tristan Yang
Boaz Barak
Ilya Sutskever
119
935
0
04 Dec 2019
PIQA: Reasoning about Physical Commonsense in Natural Language
Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
OOD
LRM
116
1,776
0
26 Nov 2019
Carpe Diem, Seize the Samples Uncertain "At the Moment" for Adaptive Batch Selection
Hwanjun Song
Minseok Kim
Sundong Kim
Jae-Gil Lee
39
16
0
19 Nov 2019
CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
Guillaume Wenzek
Marie-Anne Lachaux
Alexis Conneau
Vishrav Chaudhary
Francisco Guzmán
Armand Joulin
Edouard Grave
81
654
0
01 Nov 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
367
20,053
0
23 Oct 2019
Aleatoric and Epistemic Uncertainty in Machine Learning: An Introduction to Concepts and Methods
Eyke Hüllermeier
Willem Waegeman
PER
UD
198
1,405
0
21 Oct 2019
Accelerating Deep Learning by Focusing on the Biggest Losers
Angela H. Jiang
Daniel L.-K. Wong
Giulio Zhou
D. Andersen
J. Dean
...
Gauri Joshi
M. Kaminsky
M. Kozuch
Zachary Chase Lipton
Padmanabhan Pillai
54
121
0
02 Oct 2019
Distributionally Robust Language Modeling
Yonatan Oren
Shiori Sagawa
Tatsunori B. Hashimoto
Percy Liang
OOD
68
172
0
04 Sep 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
514
24,351
0
26 Jul 2019
Selection via Proxy: Efficient Data Selection for Deep Learning
Cody Coleman
Christopher Yeh
Stephen Mussmann
Baharan Mirzasoleiman
Peter Bailis
Percy Liang
J. Leskovec
Matei A. Zaharia
73
345
0
26 Jun 2019
MathQA: Towards Interpretable Math Word Problem Solving with Operation-Based Formalisms
Aida Amini
Saadia Gabriel
Shanchuan Lin
Rik Koncel-Kedziorski
Yejin Choi
Hannaneh Hajishirzi
AIMat
ReLM
AI4CE
100
565
0
30 May 2019
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
Christopher Clark
Kenton Lee
Ming-Wei Chang
Tom Kwiatkowski
Michael Collins
Kristina Toutanova
205
1,511
0
24 May 2019
HellaSwag: Can a Machine Really Finish Your Sentence?
Rowan Zellers
Ari Holtzman
Yonatan Bisk
Ali Farhadi
Yejin Choi
145
2,446
0
19 May 2019
Understanding Learning Dynamics Of Language Models with SVCCA
Naomi Saphra
Adam Lopez
55
94
0
01 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.4K
94,511
0
11 Oct 2018
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering
Todor Mihaylov
Peter Clark
Tushar Khot
Ashish Sabharwal
94
1,516
0
08 Sep 2018
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
Peter Clark
Isaac Cowhey
Oren Etzioni
Tushar Khot
Ashish Sabharwal
Carissa Schoenick
Oyvind Tafjord
ELM
RALM
LRM
134
2,567
0
14 Mar 2018
Not All Samples Are Created Equal: Deep Learning with Importance Sampling
Angelos Katharopoulos
François Fleuret
72
517
0
02 Mar 2018
Active Bias: Training More Accurate Neural Networks by Emphasizing High Variance Samples
Haw-Shiuan Chang
Erik Learned-Miller
Andrew McCallum
73
352
0
24 Apr 2017
Online Batch Selection for Faster Training of Neural Networks
I. Loshchilov
Frank Hutter
ODL
82
300
0
19 Nov 2015
Prioritized Experience Replay
Tom Schaul
John Quan
Ioannis Antonoglou
David Silver
OffRL
210
3,786
0
18 Nov 2015