XGen-7B Technical Report

7 September 2023
Erik Nijkamp
Tian Xie
Hiroaki Hayashi
Bo Pang
Congying Xia
Chen Xing
Jesse Vig
Semih Yavuz
Philippe Laban
Ben Krause
Senthil Purushwalkam
Tong Niu
Wojciech Kryściński
Lidiya Murakhovs'ka
Prafulla Kumar Choubey
Alexander R. Fabbri
Ye Liu
Rui Meng
Lifu Tu
Meghana Moorthy Bhat
Chien-Sheng Wu
Silvio Savarese
Yingbo Zhou
Shafiq R. Joty
Caiming Xiong
Abstract

Large Language Models (LLMs) have become ubiquitous across various domains, transforming the way we interact with information and conduct research. However, most high-performing LLMs remain confined behind proprietary walls, hindering scientific progress. Most open-source LLMs, on the other hand, are limited in their ability to support longer sequence lengths, which is a key requirement for many tasks that require inference over a long input context. To address this, we have trained XGen, a series of 7B-parameter models, on sequence lengths of up to 8K tokens for up to 1.5T tokens. We have also finetuned the XGen models on public-domain instructional data, creating their instruction-tuned counterparts (XGen-Inst). We open-source our models for both research advancement and commercial applications. Our evaluation on standard benchmarks shows that XGen models achieve comparable or better results than state-of-the-art open-source LLMs. Our targeted evaluation on long sequence modeling tasks shows the benefits of our 8K-sequence models over 2K-sequence open-source LLMs.
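
For readers who want to try the released checkpoints, the sketch below shows one way to load an XGen-7B model and generate text with the Hugging Face transformers library. The Hub model ID (Salesforce/xgen-7b-8k-base), the need for trust_remote_code=True for the custom tokenizer, and the generation settings are illustrative assumptions rather than details stated in this abstract; the instruction-tuned variant can be swapped in the same way.

    # Minimal sketch: load an open-sourced XGen-7B checkpoint and generate a short
    # continuation with Hugging Face transformers. The Hub ID below is assumed;
    # the XGen tokenizer is custom, so trust_remote_code=True is assumed to be needed.
    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    model_id = "Salesforce/xgen-7b-8k-base"  # assumed Hub ID; use the -inst variant for the instruction-tuned model

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

    prompt = "Large Language Models (LLMs) have become ubiquitous"
    inputs = tokenizer(prompt, return_tensors="pt")

    # The 8K training context allows much longer inputs than typical 2K open-source
    # models; here we only generate a short continuation as a smoke test.
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))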
