Nemotron-4 15B Technical Report

26 February 2024
Jupinder Parmar, Shrimai Prabhumoye, Joseph Jennings, Mostofa Patwary, Sandeep Subramanian, Dan Su, Chen Zhu, Deepak Narayanan, Aastha Jhunjhunwala, Ayush Dattagupta, Vibhu Jawa, Jiwei Liu, Ameya Mahabaleshwarkar, Osvald Nitski, Annika Brundyn, James Maki, Miguel Martinez, Jiaxuan You, John Kamalu, Patrick LeGresley, Denys Fridman, Jared Casper, Ashwath Aithal, Oleksii Kuchaiev, Mohammad Shoeybi, Jonathan Cohen, Bryan Catanzaro
arXiv:2402.16819
Abstract

We introduce Nemotron-4 15B, a 15-billion-parameter large multilingual language model trained on 8 trillion text tokens. Nemotron-4 15B demonstrates strong performance when assessed on English, multilingual, and coding tasks: it outperforms all existing similarly-sized open models on 4 out of 7 downstream evaluation areas and achieves performance competitive with the leading open models in the remaining ones. Specifically, Nemotron-4 15B exhibits the best multilingual capabilities of all similarly-sized models, even outperforming models over four times larger and those explicitly specialized for multilingual tasks.
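For a sense of scale, the figures quoted in the abstract imply an unusually high data-to-model ratio of roughly 533 training tokens per parameter. The back-of-the-envelope check below is our own illustrative sketch, not part of the report; only the two input numbers come from the abstract.

```python
# Back-of-the-envelope check of the data-to-parameter ratio implied by
# the abstract (15 billion parameters, 8 trillion training tokens).
# Illustrative only; the two constants are from the abstract, the
# script itself is not from the report.
params = 15e9   # 15 billion parameters
tokens = 8e12   # 8 trillion training tokens

ratio = tokens / params
print(f"Tokens per parameter: {ratio:.0f}")  # -> Tokens per parameter: 533
```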
