FREE: Fast and Robust Vision Language Models with Early Exits

7 June 2025
Divya J. Bajpai
M. Hanawal
    VLM
Main: 7 pages
4 figures
Bibliography: 5 pages
12 tables
Appendix: 5 pages
Abstract

In recent years, Vision-Language Models (VLMs) have shown remarkable performance improvements in Vision-Language tasks. However, their large size poses challenges for real-world applications where inference latency is a concern. To tackle this issue, we propose employing Early Exit (EE) strategies in VLMs. Training exit classifiers in VLMs is challenging, though, particularly with limited labeled training data. To address this, we introduce FREE, an adversarial training approach within a GAN-based framework. Here, each exit consists of a transformer layer and a classifier. The transformer layer is adversarially trained to produce feature representations similar to those of the final layer, while a feature classifier serves as the discriminator. Our method performs input-adaptive inference, which increases inference speed with a minimal drop in performance. Experimental results demonstrate the effectiveness of our approach in enhancing accuracy and model robustness by mitigating overthinking and the phenomenon of mid-crisis that we highlight. We experimentally validate that our method speeds up the inference process by more than 1.51x while retaining comparable performance. The source code is available at this https URL.
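
The input-adaptive inference described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exit criterion, so the confidence threshold below is an assumption, and `early_exit_predict`, `toy_layers`, and `toy_heads` are hypothetical names used only for illustration. This is not the authors' implementation and it omits the adversarial training of the exits entirely; it only shows how a sample can leave the network at the first exit that is confident enough.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    features: Vector,
    layers: Sequence[Callable[[Vector], Vector]],
    exit_heads: Sequence[Callable[[Vector], Vector]],
    threshold: float = 0.9,  # assumed criterion; not taken from the paper
) -> Tuple[int, int]:
    """Return (predicted label, index of the exit that was used).

    Layers run in order. After each layer, the attached exit head produces
    logits; inference stops at the first exit whose top-class probability
    reaches `threshold`, or at the final layer if none does.
    """
    if not layers or len(layers) != len(exit_heads):
        raise ValueError("need one exit head per layer and at least one layer")
    hidden = features
    last = len(layers) - 1
    for i, (layer, head) in enumerate(zip(layers, exit_heads)):
        hidden = layer(hidden)
        probs = softmax(head(hidden))
        confidence = max(probs)
        if confidence >= threshold or i == last:
            return probs.index(confidence), i
    raise AssertionError("unreachable")


# Toy stand-ins: each "layer" doubles the activations, each "head" is identity,
# so confidence grows with depth.
toy_layers = [lambda h: [2.0 * x for x in h]] * 4
toy_heads = [lambda h: list(h)] * 4

easy = early_exit_predict([2.0, 0.0], toy_layers, toy_heads)
hard = early_exit_predict([0.5, 0.0], toy_layers, toy_heads)
```

In this toy setup the "easy" input is already confident after the first layer, while the "hard" input needs more layers before its confidence crosses the threshold. Skipping the remaining layers for easy inputs is the kind of saving that early exits aim for.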

@article{bajpai2025_2506.06884,
  title={FREE: Fast and Robust Vision Language Models with Early Exits},
  author={Divya Jyoti Bajpai and Manjesh Kumar Hanawal},
  journal={arXiv preprint arXiv:2506.06884},
  year={2025}
}