PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation

20 September 2021 · arXiv:2109.09519
Siqi Bao
Huang He
Fan Wang
Hua Wu
Haifeng Wang
Wenquan Wu
Zhihua Wu
Zhen Guo
Hua Lu
Xinxian Huang
Xin Tian
Xinchao Xu
Yingzhan Lin
Zhengyu Niu
Communities: VLM · ALM
Abstract

To explore the limits of dialogue generation pre-training, we present PLATO-XL, a family of models with up to 11 billion parameters trained on both Chinese and English social media conversations. To train models at this scale, we adopt a unified transformer architecture with high computational and parameter efficiency. In addition, we carry out multi-party aware pre-training to better distinguish the characteristic information of each speaker in social media conversations. With these designs, PLATO-XL achieves superior performance compared to other approaches in both Chinese and English chitchat. We further explore the capabilities of PLATO-XL on other conversational tasks, such as knowledge-grounded dialogue and task-oriented conversation. The experimental results indicate that PLATO-XL obtains state-of-the-art results across multiple conversational tasks, verifying its potential as a foundation model for conversational AI.
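The abstract compresses two design choices: a single transformer stack shared between understanding and generation, controlled by an attention mask (bidirectional over the dialogue context, causal over the response), and speaker-aware role embeddings added to the input so the model can tell participants apart in multi-party threads. Below is a minimal PyTorch sketch of both ideas; it is not the authors' released implementation, and the names `unified_attention_mask`, `MultiPartyEmbedding`, and the per-token role-id input are illustrative assumptions.

```python
import torch
import torch.nn as nn


def unified_attention_mask(ctx_len: int, resp_len: int) -> torch.Tensor:
    """Boolean attention mask (True = may attend) for one training sample.

    Context tokens attend bidirectionally within the context; response
    tokens attend to the whole context plus earlier response tokens.
    """
    total = ctx_len + resp_len
    mask = torch.zeros(total, total, dtype=torch.bool)
    mask[:, :ctx_len] = True  # every token sees the full dialogue context
    mask[ctx_len:, ctx_len:] = torch.tril(
        torch.ones(resp_len, resp_len, dtype=torch.bool)
    )  # response tokens are generated left-to-right
    return mask


class MultiPartyEmbedding(nn.Module):
    """Sum of token, role (speaker), and position embeddings.

    The per-token role ids are an illustrative input marking which
    participant uttered each token, so the model can separate speakers
    in multi-party social-media conversations.
    """

    def __init__(self, vocab_size: int, n_roles: int, max_len: int, d_model: int):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.role = nn.Embedding(n_roles, d_model)
        self.pos = nn.Embedding(max_len, d_model)

    def forward(self, token_ids: torch.Tensor, role_ids: torch.Tensor) -> torch.Tensor:
        positions = torch.arange(token_ids.size(-1), device=token_ids.device)
        return self.tok(token_ids) + self.role(role_ids) + self.pos(positions)


# Example: a 5-token context followed by a 3-token response. The returned
# boolean mask follows the convention of
# torch.nn.functional.scaled_dot_product_attention (True = take part).
mask = unified_attention_mask(ctx_len=5, resp_len=3)  # shape (8, 8)
emb = MultiPartyEmbedding(vocab_size=1000, n_roles=4, max_len=64, d_model=32)
x = emb(torch.randint(0, 1000, (1, 8)), torch.randint(0, 4, (1, 8)))
```

The mask is what lets one set of parameters serve as both encoder and decoder, which appears to be the computational and parameter efficiency the abstract alludes to.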

View on arXiv