Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2410.13826
Cited By
Unearthing Skill-Level Insights for Understanding Trade-Offs of Foundation Models
17 October 2024
Mazda Moayeri
Vidhisha Balachandran
Varun Chandrasekaran
Safoora Yousefi
Thomas Fel
Soheil Feizi
Besmira Nushi
Neel Joshi
Vibhav Vineet
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Unearthing Skill-Level Insights for Understanding Trade-Offs of Foundation Models"
17 / 17 papers shown
Title
Embracing Diversity: Interpretable Zero-shot classification beyond one vector per class
Mazda Moayeri
Michael G. Rabbat
Mark Ibrahim
Diane Bouchacourt
VLM
61
1
0
25 Apr 2024
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs
Shengbang Tong
Zhuang Liu
Yuexiang Zhai
Yi-An Ma
Yann LeCun
Saining Xie
VLM
MLLM
79
320
0
11 Jan 2024
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
Xiang Yue
Yuansheng Ni
Kai Zhang
Tianyu Zheng
Ruoqi Liu
...
Yibo Liu
Wenhao Huang
Huan Sun
Yu-Chuan Su
Wenhu Chen
OSLM
ELM
VLM
178
891
0
27 Nov 2023
QualEval: Qualitative Evaluation for Model Improvement
Vishvak Murahari
Ameet Deshpande
Peter Clark
Tanmay Rajpurohit
Ashish Sabharwal
Karthik Narasimhan
Ashwin Kalyan
46
5
0
06 Nov 2023
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts
Pan Lu
Hritik Bansal
Tony Xia
Jiacheng Liu
Chun-yue Li
Hannaneh Hajishirzi
Hao Cheng
Kai-Wei Chang
Michel Galley
Jianfeng Gao
LRM
MLLM
85
605
0
03 Oct 2023
Measuring Progress on Scalable Oversight for Large Language Models
Sam Bowman
Jeeyoon Hyun
Ethan Perez
Edwin Chen
Craig Pettit
...
Tristan Hume
Yuntao Bai
Zac Hatfield-Dodds
Benjamin Mann
Jared Kaplan
ALM
ELM
62
128
0
04 Nov 2022
MTEB: Massive Text Embedding Benchmark
Niklas Muennighoff
Nouamane Tazi
L. Magne
Nils Reimers
357
395
0
13 Oct 2022
When and why vision-language models behave like bags-of-words, and what to do about it?
Mert Yuksekgonul
Federico Bianchi
Pratyusha Kalluri
Dan Jurafsky
James Zou
VLM
CoGe
62
387
0
04 Oct 2022
STaR: Bootstrapping Reasoning With Reasoning
E. Zelikman
Yuhuai Wu
Jesse Mu
Noah D. Goodman
ReLM
LRM
85
481
0
28 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
652
9,267
0
28 Jan 2022
A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers
Shen-Yun Miao
Chao-Chun Liang
Keh-Yih Su
54
338
0
30 Jun 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
791
29,167
0
26 Feb 2021
WILDS: A Benchmark of in-the-Wild Distribution Shifts
Pang Wei Koh
Shiori Sagawa
Henrik Marklund
Sang Michael Xie
Marvin Zhang
...
A. Kundaje
Emma Pierson
Sergey Levine
Chelsea Finn
Percy Liang
OOD
170
1,428
0
14 Dec 2020
Understanding Failures of Deep Networks via Robust Feature Extraction
Sahil Singla
Besmira Nushi
S. Shah
Ece Kamar
Eric Horvitz
FAtt
48
84
0
03 Dec 2020
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
939
12,129
0
27 Aug 2019
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems
Alex Jinpeng Wang
Yada Pruksachatkun
Nikita Nangia
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
220
2,305
0
02 May 2019
A large annotated corpus for learning natural language inference
Samuel R. Bowman
Gabor Angeli
Christopher Potts
Christopher D. Manning
266
4,278
0
21 Aug 2015
1