
BeHonest: Benchmarking Honesty in Large Language Models
Papers citing "BeHonest: Benchmarking Honesty in Large Language Models"
17 / 17 papers shown
Title |
---|
![]() Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations Hakan Inan Kartikeya Upasani Jianfeng Chi Rashi Rungta Krithika Iyer ...Michael Tontchev Qing Hu Brian Fuller Davide Testuggine Madian Khabsa |
![]() Qwen Technical Report Jinze Bai Shuai Bai Yunfei Chu Zeyu Cui Kai Dang ...Zhenru Zhang Chang Zhou Jingren Zhou Xiaohuan Zhou Tianhang Zhu |