ResearchTrend.AI
Does Alignment Tuning Really Break LLMs' Internal Confidence?

31 August 2024
Hongseok Oh
Wonseok Hwang

Papers citing "Does Alignment Tuning Really Break LLMs' Internal Confidence?"

9 / 9 papers shown
OLMo: Accelerating the Science of Language Models
Dirk Groeneveld
Iz Beltagy
Pete Walsh
Akshita Bhagia
Rodney Michael Kinney
...
Jesse Dodge
Kyle Lo
Luca Soldaini
Noah A. Smith
Hanna Hajishirzi
OSLM
210
413
0
01 Feb 2024
On the Calibration of Large Language Models and Alignment
Chiwei Zhu
Benfeng Xu
Quan Wang
Yongdong Zhang
Zhendong Mao
157
45
0
22 Nov 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH, ALM
550
12,137
0
18 Jul 2023
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
Katherine Tian
E. Mitchell
Allan Zhou
Archit Sharma
Rafael Rafailov
Huaxiu Yao
Chelsea Finn
Christopher D. Manning
167
357
0
24 May 2023
Measuring Massive Multitask Language Understanding
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM, RALM
476
4,587
0
07 Sep 2020
PIQA: Reasoning about Physical Commonsense in Natural Language
Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
OOD, LRM
374
1,853
0
26 Nov 2019
HellaSwag: Can a Machine Really Finish Your Sentence?
Rowan Zellers
Ari Holtzman
Yonatan Bisk
Ali Farhadi
Yejin Choi
235
2,537
0
19 May 2019
Measuring Calibration in Deep Learning
Jeremy Nixon
Michael W. Dusenberry
Ghassen Jerfel
Timothy Nguyen
Jeremiah Zhe Liu
Linchuan Zhang
Dustin Tran
UQCV
103
494
0
02 Apr 2019
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
Peter Clark
Isaac Cowhey
Oren Etzioni
Tushar Khot
Ashish Sabharwal
Carissa Schoenick
Oyvind Tafjord
ELM, RALM, LRM
249
2,679
0
14 Mar 2018