Nemotron-4 340B Technical Report

17 June 2024
Nvidia:
Bo Adler
Niket Agarwal
Ashwath Aithal
Dong H. Anh
Pallab Bhattacharya
Annika Brundyn
Jared Casper
Bryan Catanzaro
Sharon Clay
Jonathan Cohen
Sirshak Das
Ayush Dattagupta
Olivier Delalleau
Leon Derczynski
Yi Dong
Daniel Egert
Ellie Evans
Aleksander Ficek
Denys Fridman
Shaona Ghosh
Boris Ginsburg
Igor Gitman
Tomasz Grzegorzek
R. Hero
Jining Huang
Vibhu Jawa
Joseph Jennings
Aastha Jhunjhunwala
John Kamalu
Sadaf Khan
Oleksii Kuchaiev
P. LeGresley
Hui Li
Jiwei Liu
Zihan Liu
E. Long
Ameya Mahabaleshwarkar
Somshubra Majumdar
James Maki
Miguel Martinez
Maer Rodrigues de Melo
Ivan Moshkov
Deepak Narayanan
Sean Narenthiran
J. Navarro
Phong Nguyen
Osvald Nitski
Vahid Noroozi
Guruprasad Nutheti
Christopher Parisien
Jupinder Parmar
M. Patwary
Krzysztof Pawelec
Ming-Yu Liu
Shrimai Prabhumoye
Rajarshi Roy
Trisha Saar
Vasanth Rao Naik Sabavat
S. Satheesh
Jane Polak Scowcroft
J. Sewall
Pavel Shamis
Gerald Shen
M. Shoeybi
Dave Sizer
Alice Luo
Felipe Soares
Makesh Narsimhan Sreedhar
Dan Su
Sandeep Subramanian
Shengyang Sun
Shubham Toshniwal
Hao Wang
Zhilin Wang
Jiaxuan You
Jiaqi Zeng
Jimmy Zhang
Jing Zhang
Vivienne Zhang
Yian Zhang
Chen Zhu

Papers citing "Nemotron-4 340B Technical Report"

44 papers shown
RM-R1: Reward Modeling as Reasoning
Xiusi Chen
Gaotang Li
Zehua Wang
Bowen Jin
Cheng Qian
...
Y. Zhang
D. Zhang
Tong Zhang
Hanghang Tong
Heng Ji
ReLM, OffRL, LRM · 165 · 1 · 0 · 05 May 2025
FineScope: Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation
Chaitali Bhattacharyya
Yeseong Kim
45 · 0 · 0 · 01 May 2025
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
Shizhe Diao
Yu Yang
Y. Fu
Xin Dong
Dan Su
...
Hongxu Yin
M. Patwary
Yingyan (Celine) Lin
Jan Kautz
Pavlo Molchanov
38 · 0 · 0 · 17 Apr 2025
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
Jialun Zhong
Wei Shen
Yanzeng Li
Songyang Gao
Hua Lu
Yicheng Chen
Yang Zhang
Wei Zhou
Jinjie Gu
Lei Zou
LRM · 45 · 2 · 0 · 12 Apr 2025
Adversarial Training of Reward Models
Alexander Bukharin
Haifeng Qian
Shengyang Sun
Adithya Renduchintala
Soumye Singhal
Zhilin Wang
Oleksii Kuchaiev
Olivier Delalleau
T. Zhao
AAML · 32 · 0 · 0 · 08 Apr 2025
F5R-TTS: Improving Flow-Matching based Text-to-Speech with Group Relative Policy Optimization
Xiaohui Sun
Ruitong Xiao
Jianye Mo
Bowen Wu
Qun Yu
Baoxun Wang
51 · 1 · 0 · 03 Apr 2025
Entropy-Based Adaptive Weighting for Self-Training
Xiaoxuan Wang
Yihe Deng
Mingyu Derek Ma
Wei Wang
LRM · 52 · 0 · 0 · 31 Mar 2025
Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning
Huajie Tan
Yuheng Ji
Xiaoshuai Hao
Minglan Lin
Pengwei Wang
Zhongyuan Wang
Shanghang Zhang
ReLM, OffRL, LRM · 94 · 7 · 0 · 26 Mar 2025
Accelerating Transformer Inference and Training with 2:4 Activation Sparsity
Daniel Haziza
Timothy Chou
Dhruv Choudhary
Luca Wehrstedt
Francisco Massa
Jiecao Yu
Geonhwa Jeong
Supriya Rao
Patrick Labatut
Jesse Cai
42 · 0 · 0 · 20 Mar 2025
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning
Nvidia
A. Azzolini
Junjie Bai
Prithvijit Chattopadhyay
Huayu Chen
...
Xiaodong Yang
Zhuolin Yang
Jingyang Zhang
Xiaohui Zeng
Zhe Zhang
AI4CE, LM&Ro, LRM · 54 · 5 · 0 · 18 Mar 2025
Mitigating Tail Narrowing in LLM Self-Improvement via Socratic-Guided Sampling
Yiwen Ding
Zhiheng Xi
Wei He
Zhuoyuan Li
Yitao Zhai
Xiaowei Shi
Xunliang Cai
Tao Gui
Qi Zhang
Xuanjing Huang
LRM · 75 · 3 · 0 · 24 Feb 2025
C-3DPO: Constrained Controlled Classification for Direct Preference Optimization
Kavosh Asadi
Julien Han
Xingzi Xu
Dominique Perrault-Joncas
Shoham Sabach
Karim Bouyarmane
Mohammad Ghavamzadeh
34 · 0 · 0 · 22 Feb 2025
Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance
Shehzeen Samarah Hussain
Paarth Neekhara
Xuesong Yang
Edresson Casanova
Subhankar Ghosh
Mikyas T. Desta
Roy Fejgin
Rafael Valle
Jason Chun Lok Li
61 · 2 · 0 · 07 Feb 2025
Scaling Embedding Layers in Language Models
Da Yu
Edith Cohen
Badih Ghazi
Yangsibo Huang
Pritish Kamath
Ravi Kumar
Daogao Liu
Chiyuan Zhang
79 · 0 · 0 · 03 Feb 2025
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
Tianyu Zheng
Ge Zhang
Tianhao Shen
Xueling Liu
Bill Yuchen Lin
Jie Fu
Wenhu Chen
Xiang Yue
SyDa · 91 · 102 · 0 · 08 Jan 2025
MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data
Hanwen Jiang
Zexiang Xu
Desai Xie
Z. Chen
Haian Jin
...
Xin Sun
Jiuxiang Gu
Qixing Huang
Georgios Pavlakos
Hao Tan
159 · 1 · 0 · 18 Dec 2024
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs
Akhiad Bercovich
Tomer Ronen
Talor Abramovich
Nir Ailon
Nave Assaf
...
Ido Shahaf
Oren Tropp
Omer Ullman Argov
Ran Zilberstein
Ran El-Yaniv
77 · 1 · 0 · 28 Nov 2024
Self-Generated Critiques Boost Reward Modeling for Language Models
Yue Yu
Zhengxing Chen
Aston Zhang
L Tan
Chenguang Zhu
...
Suchin Gururangan
Chao-Yue Zhang
Melanie Kambadur
Dhruv Mahajan
Rui Hou
LRM, ALM · 96 · 16 · 0 · 25 Nov 2024
Large Language Models for Constructing and Optimizing Machine Learning Workflows: A Survey
Yang Gu
Hengyu You
Jian Cao
Muran Yu
Haoran Fan
Shiyou Qian
LM&MA, AI4CE · 46 · 3 · 0 · 11 Nov 2024
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent
Xingchen Sun
Yanfeng Chen
Yanwen Huang
Ruobing Xie
Jiaqi Zhu
...
Zhanhui Kang
Yong Yang
Yuhong Liu
Di Wang
Jie Jiang
MoE, ALM, ELM · 73 · 25 · 0 · 04 Nov 2024
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training
Haocheng Xi
Han Cai
Ligeng Zhu
Yaojie Lu
Kurt Keutzer
Jianfei Chen
Song Han
MQ · 75 · 9 · 0 · 25 Oct 2024
Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs
Chris Liu
Liang Zeng
Jiaheng Liu
Rui Yan
Jujie He
Chaojie Wang
Shuicheng Yan
Yang Liu
Yahui Zhou
AI4TS · 48 · 63 · 0 · 24 Oct 2024
RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
Yantao Liu
Zijun Yao
Rui Min
Yixin Cao
Lei Hou
Juanzi Li
OffRL, ALM · 20 · 29 · 0 · 21 Oct 2024
γ-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models
Yaxin Luo
Gen Luo
Jiayi Ji
Yiyi Zhou
Xiaoshuai Sun
Zhiqiang Shen
Rongrong Ji
VLM, MoE · 39 · 1 · 0 · 17 Oct 2024
Preference Optimization with Multi-Sample Comparisons
Chaoqi Wang
Zhuokai Zhao
Chen Zhu
Karthik Abinav Sankararaman
Michal Valko
...
Zhaorun Chen
Madian Khabsa
Yuxin Chen
Hao Ma
Sinong Wang
66 · 10 · 0 · 16 Oct 2024
Upcycling Large Language Models into Mixture of Experts
Ethan He
Abhinav Khattar
R. Prenger
V. Korthikanti
Zijie Yan
Tong Liu
Shiqing Fan
Ashwath Aithal
M. Shoeybi
Bryan Catanzaro
MoE · 39 · 9 · 0 · 10 Oct 2024
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
Shenao Zhang
Zhihan Liu
Boyi Liu
Yuhang Zhang
Yingxiang Yang
Y. Liu
Liyu Chen
Tao Sun
Ziyi Wang
98 · 3 · 0 · 10 Oct 2024
As Simple as Fine-tuning: LLM Alignment via Bidirectional Negative Feedback Loss
Xin Mao
Feng-Lin Li
Huimin Xu
Wei Zhang
Wang Chen
A. Luu
29 · 1 · 0 · 07 Oct 2024
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
Ulyana Piterbarg
Lerrel Pinto
Rob Fergus
SyDa · 37 · 2 · 0 · 03 Oct 2024
Uncertainty-aware Reward Model: Teaching Reward Models to Know What is Unknown
Xingzhou Lou
Dong Yan
Wei Shen
Yuzi Yan
Jian Xie
Junge Zhang
47 · 22 · 0 · 01 Oct 2024
Direct Judgement Preference Optimization
Peifeng Wang
Austin Xu
Yilun Zhou
Caiming Xiong
Shafiq Joty
ELM · 39 · 12 · 0 · 23 Sep 2024
The Central Role of the Loss Function in Reinforcement Learning
Kaiwen Wang
Nathan Kallus
Wen Sun
OffRL · 56 · 7 · 0 · 19 Sep 2024
Leveraging Unstructured Text Data for Federated Instruction Tuning of Large Language Models
Rui Ye
Rui Ge
Yuchi Fengting
Jingyi Chai
Yanfeng Wang
Siheng Chen
FedML · 40 · 1 · 0 · 11 Sep 2024
Self-Directed Synthetic Dialogues and Revisions Technical Report
Nathan Lambert
Hailey Schoelkopf
Aaron Gokaslan
Luca Soldaini
Valentina Pyatkin
Louis Castricato
SyDa · 45 · 3 · 0 · 25 Jul 2024
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
Peng-Tao Xu
Ming-Yu Liu
Xianchao Wu
Zihan Liu
Mohammad Shoeybi
Bryan Catanzaro
RALM · 52 · 14 · 0 · 19 Jul 2024
LASSI: An LLM-based Automated Self-Correcting Pipeline for Translating Parallel Scientific Codes
M. Dearing
Yiheng Tao
Xingfu Wu
Z. Lan
V. Taylor
40 · 3 · 0 · 30 Jun 2024
Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models
Jie Chen
Yupeng Zhang
Bingning Wang
Wayne Xin Zhao
Ji-Rong Wen
Weipeng Chen
SyDa · 39 · 4 · 0 · 18 Jun 2024
NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment
Gerald Shen
Zhilin Wang
Olivier Delalleau
Jiaqi Zeng
Yi Dong
...
Sahil Jain
Ali Taghibakhshi
Markel Sanz Ausin
Ashwath Aithal
Oleksii Kuchaiev
40 · 13 · 0 · 02 May 2024
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
Corby Rosset
Ching-An Cheng
Arindam Mitra
Michael Santacroce
Ahmed Hassan Awadallah
Tengyang Xie
152 · 114 · 0 · 04 Apr 2024
RewardBench: Evaluating Reward Models for Language Modeling
Nathan Lambert
Valentina Pyatkin
Jacob Morrison
Lester James Validad Miranda
Bill Yuchen Lin
...
Sachin Kumar
Tom Zick
Yejin Choi
Noah A. Smith
Hanna Hajishirzi
ALM · 82 · 214 · 0 · 20 Mar 2024
BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback
Gaurav Pandey
Yatin Nandwani
Tahira Naseem
Mayank Mishra
Guangxuan Xu
Dinesh Raghu
Sachindra Joshi
Asim Munawar
Ramón Fernández Astudillo
BDL · 44 · 3 · 0 · 04 Feb 2024
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation
Jiawei Liu
Chun Xia
Yuyao Wang
Lingming Zhang
ELM, ALM · 183 · 799 · 0 · 02 May 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM, ALM · 313 · 11,953 · 0 · 04 Mar 2022
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE · 245 · 1,821 · 0 · 17 Sep 2019