
arXiv:2009.09587
Improving Robustness and Generality of NLP Models Using Disentangled Representations

21 September 2020
Jiawei Wu, Xiaoya Li, Xiang Ao, Yuxian Meng, Fei Wu, Jiwei Li
Topics: OOD, DRL
Abstract

Supervised neural networks, which first map an input $x$ to a single representation $z$ and then map $z$ to the output label $y$, have achieved remarkable success in a wide range of natural language processing (NLP) tasks. Despite this success, neural models lack both robustness and generality: small perturbations to inputs can result in completely different outputs, and the performance of a model trained on one domain drops drastically when it is tested on another. In this paper, we present methods to improve the robustness and generality of NLP models from the standpoint of disentangled representation learning. Instead of mapping $x$ to a single representation $z$, the proposed strategy maps $x$ to a set of representations $\{z_1, z_2, \dots, z_K\}$ while forcing them to be disentangled. These representations are then mapped to different logits $l$, the ensemble of which is used to make the final prediction $y$. We propose different methods to incorporate this idea into currently widely used models, including adding an $L_2$ regularizer on the $z$s and adding a Total Correlation (TC) penalty under the framework of the variational information bottleneck (VIB). We show that models trained with the proposed criteria exhibit better robustness and domain-adaptation ability on a wide range of supervised learning tasks.
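The strategy the abstract describes is concrete enough to sketch. Below is a minimal, hypothetical PyTorch implementation, not the authors' released code: an encoder output is projected into $K$ representations, each feeds its own logit head, and the heads are averaged for the final prediction. The pairwise-distance penalty is one plausible reading of the "$L_2$ regularizer on the $z$s"; the paper's exact criterion, and the TC-under-VIB variant, differ in detail. All names here (DisentangledClassifier, separation_penalty, num_heads) are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DisentangledClassifier(nn.Module):
    """Sketch of the x -> {z_1..z_K} -> ensembled logits strategy."""

    def __init__(self, encoder_dim: int, hidden_dim: int,
                 num_classes: int, num_heads: int = 4):
        super().__init__()
        # One projection per representation z_k.
        self.projections = nn.ModuleList(
            nn.Linear(encoder_dim, hidden_dim) for _ in range(num_heads))
        # One logit head l_k per representation z_k.
        self.heads = nn.ModuleList(
            nn.Linear(hidden_dim, num_classes) for _ in range(num_heads))

    def forward(self, h: torch.Tensor):
        # h: pooled encoder output of shape (batch, encoder_dim),
        # e.g. the [CLS] vector of a sentence encoder.
        zs = [proj(h) for proj in self.projections]
        logits = torch.stack([head(z) for z, head in zip(zs, self.heads)])
        return logits.mean(dim=0), zs  # ensemble of the K logit heads


def separation_penalty(zs):
    # Hypothetical disentanglement term: reward large pairwise L2
    # distances so the z_k do not collapse onto one representation.
    penalty = zs[0].new_zeros(())
    for i in range(len(zs)):
        for j in range(i + 1, len(zs)):
            penalty = penalty - F.mse_loss(zs[i], zs[j])
    return penalty


# Training step (sketch): task loss on the ensembled logits plus a
# weighted disentanglement term.
model = DisentangledClassifier(encoder_dim=768, hidden_dim=256, num_classes=2)
h = torch.randn(8, 768)                # stand-in for encoder output
labels = torch.randint(0, 2, (8,))
ensemble, zs = model(h)
loss = F.cross_entropy(ensemble, labels) + 0.1 * separation_penalty(zs)
loss.backward()
```

Intuitively, the ensemble is where the claimed robustness comes from: a perturbation or domain shift that corrupts one $z_k$ is damped by the remaining heads, which is the motivation for mapping $x$ to several disentangled representations rather than one.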
