Title |
---|
Do Not Design, Learn: A Trainable Scoring Function for Uncertainty Estimation in Generative LLMs. D. Yaldiz, Yavuz Faruk Bakman, Baturalp Buyukates, Chenyang Tao, Anil Ramakrishna, Dimitrios Dimitriadis, Jieyu Zhao, Salman Avestimehr |
The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models. Seungone Kim, Juyoung Suk, Ji Yong Cho, Shayne Longpre, Chaeeun Kim, ..., Sean Welleck, Graham Neubig, Moontae Lee, Kyungjae Lee, Minjoon Seo |