ResearchTrend.AI
arXiv:2505.19327
Paying Alignment Tax with Contrastive Learning

25 May 2025
Buse Sibel Korkmaz
Rahul Nair
Elizabeth M. Daly
Antonio del Rio Chanona

Papers citing "Paying Alignment Tax with Contrastive Learning"

8 / 8 papers shown
Mitigating the Alignment Tax of RLHF
Yong Lin
Hangyu Lin
Wei Xiong
Shizhe Diao
Zeming Zheng
...
Han Zhao
Nan Jiang
Heng Ji
Yuan Yao
Tong Zhang
MoMe
CLL
58
76
0
12 Sep 2023
A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation
Neeraj Varshney
Wenlin Yao
Hongming Zhang
Jianshu Chen
Dong Yu
HILM
85
169
0
08 Jul 2023
CLIFF: Contrastive Learning for Improving Faithfulness and Factuality in Abstractive Summarization
Shuyang Cao
Lu Wang
HILM
53
181
0
19 Sep 2021
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
302
384
0
28 Feb 2021
Measuring and Reducing Gendered Correlations in Pre-trained Models
Kellie Webster
Xuezhi Wang
Ian Tenney
Alex Beutel
Emily Pitler
Ellie Pavlick
Jilin Chen
Ed Chi
Slav Petrov
FaML
72
258
0
12 Oct 2020
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models
Samuel Gehman
Suchin Gururangan
Maarten Sap
Yejin Choi
Noah A. Smith
133
1,194
0
24 Sep 2020
Towards Debiasing Sentence Representations
Paul Pu Liang
Irene Li
Emily Zheng
Y. Lim
Ruslan Salakhutdinov
Louis-Philippe Morency
70
238
0
16 Jul 2020
Counterfactual Data Augmentation for Mitigating Gender Stereotypes in Languages with Rich Morphology
Ran Zmigrod
Sabrina J. Mielke
Hanna M. Wallach
Ryan Cotterell
61
281
0
11 Jun 2019