Title
Difficult Task Yes but Simple Task No: Unveiling the Laziness in Multimodal LLMs Sihang Zhao Youliang Yuan Xiaoying Tang Pinjia He 92 3 0 15 Oct 2024
Quadratic Gating Mixture of Experts: Statistical Insights into Self-Attention Pedram Akbarian Huy Le Nguyen Xing Han Nhat Ho MoE 81 3 0 15 Oct 2024
LVD-2M: A Long-take Video Dataset with Temporally Dense Captions Tianwei Xiong Yuqing Wang Daquan Zhou Zhijie Lin Jiashi Feng Xihui Liu VGen 121 10 0 14 Oct 2024
Combinatorial Multi-armed Bandits: Arm Selection via Group Testing Arpan Mukherjee Shashanka Ubaru K. Murugesan Karthikeyan Shanmugam A. Tajer 81 0 0 14 Oct 2024
V2M: Visual 2-Dimensional Mamba for Image Representation Learning Chengkun Wang Wenzhao Zheng Yuanhui Huang Jie Zhou Jiwen Lu Mamba 44 2 0 14 Oct 2024
GlobalMamba: Global Image Serialization for Vision Mamba Chengkun Wang Wenzhao Zheng Jie Zhou Jiwen Lu Mamba 98 0 0 14 Oct 2024
SGLP: A Similarity Guided Fast Layer Partition Pruning for Compressing Large Deep Models Yuqi Li Yao Lu Zhihong Zhu Chuanguang Yang Yihao Chen Jianping Gou 76 6 0 14 Oct 2024
ChartKG: A Knowledge-Graph-Based Representation for Chart Images Zhiguang Zhou Haoxuan Wang Zhengqing Zhao Fengling Zheng Yongheng Wang Wei Chen Yong Wang 144 1 0 13 Oct 2024
Uncovering Attacks and Defenses in Secure Aggregation for Federated Deep Learning Yiwei Zhang R. Behnia A. Yavuz Reza Ebrahimi E. Bertino FedML 78 2 0 13 Oct 2024
POPoS: Improving Efficient and Robust Facial Landmark Detection with Parallel Optimal Position Search Chong-Yang Xiang Jun-Yan He Zhi-Qi Cheng Xiao-Jun Wu Xian-Sheng Hua 110 0 0 12 Oct 2024
Preserving Old Memories in Vivid Detail: Human-Interactive Photo Restoration Framework Seung-Yeon Back Geonho Son Dahye Jeong Eunil Park Simon S. Woo 108 0 0 12 Oct 2024
Deep Transfer Learning: Model Framework and Error Analysis Yuling Jiao Huazhen Lin Yuchen Luo Jerry Zhijian Yang 147 1 0 12 Oct 2024
DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention Nguyen Huu Bao Long Chenyu Zhang Yuzhi Shi Tsubasa Hirakawa Takayoshi Yamashita Tohgoroh Matsui H. Fujiyoshi 70 3 0 11 Oct 2024
Unity is Power: Semi-Asynchronous Collaborative Training of Large-Scale Models with Structured Pruning in Resource-Limited Clients Yan Li Mingyi Li Xiao Zhang Guangwei Xu Feng Chen Yuan Yuan Yifei Zou Mengying Zhao Jianbo Lu Dongxiao Yu 67 0 0 11 Oct 2024
Why pre-training is beneficial for downstream classification tasks? Xin Jiang Xu Cheng Zechao Li 75 0 0 11 Oct 2024
TD-Paint: Faster Diffusion Inpainting Through Time Aware Pixel Conditioning Tsiry Mayet Pourya Shamsolmoali Simon Bernard Eric Granger Romain Hérault Clément Chatelain DiffM 84 1 0 11 Oct 2024
Non-transferable Pruning Ruyi Ding Lili Su A. A. Ding Yunsi Fei AAML 76 2 0 10 Oct 2024
BA-Net: Bridge Attention in Deep Neural Networks Ronghui Zhang Runzong Zou Yue Zhao Zirui Zhang Junzhou Chen Yue Cao Chuan Hu Houbing Song 71 1 0 10 Oct 2024
When the Small-Loss Trick is Not Enough: Multi-Label Image Classification with Noisy Labels Applied to CCTV Sewer Inspections Keryan Chelouche Marie Lachaize Marine Bernard Louise Olgiati Remi Cuingnet NoLa 60 0 0 10 Oct 2024
MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models Wenbo Hu Jia-Chen Gu Zi-Yi Dou Mohsen Fayyaz Pan Lu Kai-Wei Chang Nanyun Peng VLM 167 8 0 10 Oct 2024
Segmenting objects with Bayesian fusion of active contour models and convnet priors P. Polewski Jacquelyn A. Shelton W. Yao M. Heurich 114 1 0 09 Oct 2024
Defending Membership Inference Attacks via Privacy-aware Sparsity Tuning Qiang Hu Hengxiang Zhang Jianguo Huang 133 2 0 09 Oct 2024
Compositional Entailment Learning for Hyperbolic Vision-Language Models Avik Pal Max van Spengler Guido Maria D'Amely di Melendugno Alessandro Flaborea Fabio Galasso Pascal Mettes CoGe 108 10 0 09 Oct 2024
Understanding Model Ensemble in Transferable Adversarial Attack Wei Yao Zeliang Zhang Huayi Tang Yong Liu 122 3 0 09 Oct 2024
Prompting DirectSAM for Semantic Contour Extraction in Remote Sensing Images Shiyu Miao Delong Chen Fan Liu Chuanyi Zhang Yanhui Gu Shengjie Guo Jun Zhou 101 2 0 08 Oct 2024
Contrastive Learning to Fine-Tune Feature Extraction Models for the Visual Cortex Alex Mulrooney Austin J. Brockmeier 103 0 0 08 Oct 2024
Mero Nagarikta: Advanced Nepali Citizenship Data Extractor with Deep Learning-Powered Text Detection and OCR Sisir Dhakal Sujan Sigdel Sandesh Prasad Paudel Sharad Kumar Ranabhat Nabin Lamichhane 51 1 0 08 Oct 2024
CLOSER: Towards Better Representation Learning for Few-Shot Class-Incremental Learning Junghun Oh Sungyong Baik Kyoung Mu Lee CLL 91 4 0 08 Oct 2024
Swift Sampler: Efficient Learning of Sampler by 10 Parameters Jiawei Yao Chuming Li Canran Xiao 100 7 0 08 Oct 2024
QT-DoG: Quantization-aware Training for Domain Generalization Saqib Javed Hieu Le Mathieu Salzmann OOD MQ 133 2 0 08 Oct 2024
Rethinking Weak-to-Strong Augmentation in Source-Free Domain Adaptive Object Detection Jiuzheng Yang Song Tang Yangkuiyi Zhang Shuaifeng Li Mao Ye Jianwei Zhang Xiatian Zhu 107 1 0 07 Oct 2024
LevAttention: Time, Space, and Streaming Efficient Algorithm for Heavy Attentions R. Kannan Chiranjib Bhattacharyya Praneeth Kacham David P. Woodruff 114 1 0 07 Oct 2024
A Deep Learning-Based Approach for Mangrove Monitoring Lucas José Velôso de Souza Ingrid Valverde Reis Zreik Adrien Salem-Sermanet Nacéra Seghouani Lionel Pourchier 41 0 0 07 Oct 2024
DAAL: Density-Aware Adaptive Line Margin Loss for Multi-Modal Deep Metric Learning Hadush Hailu Gebrerufael Anil Kumar Tiwari Gaurav Neupane Goitom Ybrah Hailu 65 0 0 07 Oct 2024
Precise Model Benchmarking with Only a Few Observations Riccardo Fogliato Pratik Patil Nil-Jana Akpinar Mathew Monfort 98 0 0 07 Oct 2024
Systematic Literature Review of Vision-Based Approaches to Outdoor Livestock Monitoring with Lessons from Wildlife Studies Stacey D. Scott Zayn J. Abbas Feerass Ellid Eli-Henry Dykhne Muhammad Muhaiminul Islam Weam Ayad Kristina Kacmorova Dan Tulpan Minglun Gong 145 1 0 07 Oct 2024
Recent Advances of Multimodal Continual Learning: A Comprehensive Survey Dianzhi Yu Xinni Zhang Yankai Chen Aiwei Liu Yifei Zhang Philip S. Yu Irwin King VLM CLL 115 13 0 07 Oct 2024
Trained Models Tell Us How to Make Them Robust to Spurious Correlation without Group Annotation Mahdi Ghaznavi Hesam Asadollahzadeh Fahimeh Hosseini Noohdani Soroush Vafaie Tabar Hosein Hasani Taha Akbari Alvanagh M. Rohban M. Baghshah 104 0 0 07 Oct 2024
Improving Image Clustering with Artifacts Attenuation via Inference-Time Attention Engineering Kazumoto Nakamura Yuji Nozawa Yu-Chieh Lin K. Nakata Youyang Ng ViT 71 2 0 07 Oct 2024
CAR: Controllable Autoregressive Modeling for Visual Generation Ziyu Yao Jialin Li Yifeng Zhou Yong Liu Xi Jiang Chengjie Wang Feng Zheng Yuexian Zou Lei Li DiffM 154 16 0 07 Oct 2024
Compression via Pre-trained Transformers: A Study on Byte-Level Multimodal Data David Heurtel-Depeiges Anian Ruoss J. Veness Tim Genewein 220 2 0 07 Oct 2024
Robustness Reprogramming for Representation Learning Zhichao Hou MohamadAli Torkamani Hamid Krim Xiaorui Liu AAML OOD 110 1 0 06 Oct 2024
Is What You Ask For What You Get? Investigating Concept Associations in Text-to-Image Models Salma Abdel Magid Weiwei Pan Simon Warchol Grace Guo Junsik Kim Mahia Rahman Hanspeter Pfister 213 0 0 06 Oct 2024
Implicit to Explicit Entropy Regularization: Benchmarking ViT Fine-tuning under Noisy Labels Maria Marrium Arif Mahmood Mohammed Bennamoun NoLa AAML 107 0 0 05 Oct 2024
RetCompletion: High-Speed Inference Image Completion with Retentive Network Yueyang Cang P. Hu Xiaoteng Zhang Xingtong Wang Yuhang Liu VLM 85 1 0 05 Oct 2024
Impact of Regularization on Calibration and Robustness: from the Representation Space Perspective Jonghyun Park Juyeop Kim Jong-Seok Lee 89 1 0 05 Oct 2024
Classification-Denoising Networks Louis Thiry Florentin Guth 89 1 0 04 Oct 2024
Comparative Analysis and Ensemble Enhancement of Leading CNN Architectures for Breast Cancer Classification Gary Murphy Raghubir Singh 45 3 0 04 Oct 2024
EmojiHeroVR: A Study on Facial Expression Recognition under Partial Occlusion from Head-Mounted Displays Thorben Ortmann Qi Wang Larissa Putzar 57 2 0 04 Oct 2024
AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark Wenhao Chai Enxin Song Y. Du Chenlin Meng Vashisht Madhavan Omer Bar-Tal Jeng-Neng Hwang Saining Xie Christopher D. Manning 3DV 223 37 0 04 Oct 2024