Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2505.03839
Cited By
An Adaptive Data-Resilient Multi-Modal Framework for Hierarchical Multi-Label Book Genre Identification
5 May 2025
Utsav Nareti
S. Chattopadhyay
Prolay Mallick
Suraj Kumar
Ayush Vikas Daga
Chandranath Adak
Adarsh Wase
Arjab Roy
Re-assign community
ArXiv
PDF
HTML
Papers citing
"An Adaptive Data-Resilient Multi-Modal Framework for Hierarchical Multi-Label Book Genre Identification"
32 / 32 papers shown
Title
What Text Design Characterizes Book Genres?
Daichi Haraguchi
Brian Kenji Iwana
Seiichi Uchida
68
3
0
26 Feb 2024
DINOv2: Learning Robust Visual Features without Supervision
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
...
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
325
3,410
0
14 Apr 2023
SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications
Abdelrahman M. Shaker
Muhammad Maaz
H. Rasheed
Salman Khan
Ming-Hsuan Yang
Fahad Shahbaz Khan
ViT
105
94
0
27 Mar 2023
AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities
Zhongzhi Chen
Guangyi Liu
Bo Zhang
Fulong Ye
Qinghong Yang
Ledell Yu Wu
VLM
74
89
0
12 Nov 2022
EfficientFormer: Vision Transformers at MobileNet Speed
Yanyu Li
Geng Yuan
Yang Wen
Eric Hu
Georgios Evangelidis
Sergey Tulyakov
Yanzhi Wang
Jian Ren
ViT
78
366
0
02 Jun 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
532
4,360
0
28 Jan 2022
FLAVA: A Foundational Language And Vision Alignment Model
Amanpreet Singh
Ronghang Hu
Vedanuj Goswami
Guillaume Couairon
Wojciech Galuba
Marcus Rohrbach
Douwe Kiela
CLIP
VLM
93
710
0
08 Dec 2021
SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations
Chenlin Meng
Yutong He
Yang Song
Jiaming Song
Jiajun Wu
Jun-Yan Zhu
Stefano Ermon
DiffM
144
1,492
0
02 Aug 2021
LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
Ben Graham
Alaaeldin El-Nouby
Hugo Touvron
Pierre Stock
Armand Joulin
Hervé Jégou
Matthijs Douze
ViT
76
788
0
02 Apr 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
447
21,439
0
25 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
929
29,436
0
26 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
443
3,856
0
11 Feb 2021
RegNet: Self-Regulated Network for Image Classification
Jing Xu
Yu Pan
Xinglin Pan
Guosheng Lin
Zhang Yi
Zenglin Xu
54
130
0
03 Jan 2021
Deep multi-modal networks for book genre classification based on its cover
Chandra Kundu
Lukun Zheng
AI4TS
40
8
0
15 Nov 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
654
41,103
0
22 Oct 2020
Asymmetric Loss For Multi-Label Classification
Emanuel Ben-Baruch
T. Ridnik
Nadav Zamir
Asaf Noy
Itamar Friedman
M. Protter
Lihi Zelnik-Manor
86
540
0
29 Sep 2020
Book Success Prediction with Pretrained Sentence Embeddings and Readability Scores
Muhammad Khalifa
Aminul Islam
27
1
0
21 Jul 2020
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
232
7,520
0
02 Oct 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
659
24,464
0
26 Jul 2019
XLNet: Generalized Autoregressive Pretraining for Language Understanding
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
AI4CE
230
8,433
0
19 Jun 2019
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Mingxing Tan
Quoc V. Le
3DV
MedIm
139
18,134
0
28 May 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
94,891
0
11 Oct 2018
SimplE Embedding for Link Prediction in Knowledge Graphs
Seyed Mehran Kazemi
David Poole
76
716
0
13 Feb 2018
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Mark Sandler
Andrew G. Howard
Menglong Zhu
A. Zhmoginov
Liang-Chieh Chen
181
19,284
0
13 Jan 2018
Analogical Inference for Multi-Relational Embeddings
Hanxiao Liu
Yuexin Wu
Yiming Yang
78
375
0
06 May 2017
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Zhuowen Tu
Kaiming He
517
10,330
0
16 Nov 2016
Xception: Deep Learning with Depthwise Separable Convolutions
François Chollet
MDE
BDL
PINN
1.4K
14,575
0
07 Oct 2016
Densely Connected Convolutional Networks
Gao Huang
Zhuang Liu
Laurens van der Maaten
Kilian Q. Weinberger
PINN
3DV
772
36,813
0
25 Aug 2016
Complex Embeddings for Simple Link Prediction
Théo Trouillon
Johannes Welbl
Sebastian Riedel
Éric Gaussier
Guillaume Bouchard
BDL
88
2,978
0
20 Jun 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xinming Zhang
Shaoqing Ren
Jian Sun
MedIm
2.2K
194,020
0
10 Dec 2015
Embedding Entities and Relations for Learning and Inference in Knowledge Bases
Bishan Yang
Wen-tau Yih
Xiaodong He
Jianfeng Gao
Li Deng
NAI
95
3,197
0
20 Dec 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt
MDE
1.6K
100,386
0
04 Sep 2014
1