Title | Authors |
---|---|
Ineq-Comp: Benchmarking Human-Intuitive Compositional Reasoning in Automated Theorem Proving on Inequalities | Haoyu Zhao, Yihan Geng, Shange Tang, Yong Lin, Bohan Lyu, Hongzhou Lin, Chi Jin, Sanjeev Arora |
When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research | Guijin Son, Jiwoo Hong, Honglu Fan, Heejeong Nam, Hyunwoo Ko, ..., Jinyeop Song, Jinha Choi, Gonçalo Paulo, Youngjae Yu, Stella Biderman |