FiLM: Visual Reasoning with a General Conditioning Layer

22 September 2017

Aaron Courville

Papers citing "FiLM: Visual Reasoning with a General Conditioning Layer"

50 / 1,349 papers shown

GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training Renqiu Xia Mingxing Li Hancheng Ye Wenjie Wu Hongbin Zhou ... Zeang Sheng Botian Shi Tao Chen Junchi Yan Bo Zhang 196 10 0 16 Dec 2024
VinTAGe: Joint Video and Text Conditioning for Holistic Audio Generation Saksham Singh Kushwaha Yapeng Tian DiffM VGen 127 2 0 14 Dec 2024
Fast and Robust Visuomotor Riemannian Flow Matching Policy Haoran Ding Noémie Jaquier Jan Peters Leonel Rozo 180 4 0 14 Dec 2024
SQ-Whisper: Speaker-Querying based Whisper Model for Target-Speaker ASR Pengcheng Guo Xuankai Chang Hang Lv Shinji Watanabe Lei Xie 113 1 0 07 Dec 2024
CA-SSLR: Condition-Aware Self-Supervised Learning Representation for Generalized Speech Processing Yen-Ju Lu Jing Liu Thomas Thebaud Laureano Moro-Velazquez Ariya Rastrow Najim Dehak Jesus Villalba 135 1 0 05 Dec 2024
TASR: Timestep-Aware Diffusion Model for Image Super-Resolution Qinwei Lin Xiaopeng Sun Yu Gao Yujie Zhong Dengjie Li Zheng Zhao Haoqian Wang 137 0 0 04 Dec 2024
Diffusion-VLA: Generalizable and Interpretable Robot Foundation Model via Self-Generated Reasoning Junjie Wen Minjie Zhu Yinlin Zhu Zhibin Tang Jinming Li ... Chengmeng Li Xiaoyu Liu Chaomin Shen Yaxin Peng Feifei Feng 149 22 0 04 Dec 2024
Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation Zilyu Ye Zhiyang Chen Tiancheng Li Zemin Huang Weijian Luo Guo-Jun Qi DiffM 132 6 0 02 Dec 2024
Zero-shot Musical Stem Retrieval with Joint-Embedding Predictive Architectures Alain Riou Antonin Gagnere Gaëtan Hadjeres Stefan Lattner Geoffroy Peeters 134 0 0 29 Nov 2024
Unpacking the Individual Components of Diffusion Policy Xiu Yuan 164 0 0 27 Nov 2024
MWFormer: Multi-Weather Image Restoration Using Degradation-Aware Transformers Ruoxi Zhu Zhengzhong Tu Jiaming Liu A. Bovik Yibo Fan ViT 126 10 0 26 Nov 2024
Multi-Resolution Generative Modeling of Human Motion from Limited Data David Eduardo Moreno-Villamarín Anna Hilsmann Peter Eisert DiffM 3DH 127 0 0 25 Nov 2024
Gaussian Scenes: Pose-Free Sparse-View Scene Reconstruction using Depth-Enhanced Diffusion Priors Soumava Paul Prakhar Kaushik Alan Yuille 3DGS DiffM 544 0 0 24 Nov 2024
Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios Shantanu Jaiswal Debaditya Roy Basura Fernando Cheston Tan ReLM LRM 138 2 0 20 Nov 2024
Generating 3D-Consistent Videos from Unposed Internet Photos Gene Chou Kai Zhang Sai Bi Hao Tan Zexiang Xu Fujun Luan Bharath Hariharan Noah Snavely 3DGS VGen 164 3 0 20 Nov 2024
A Comprehensive Survey on Visual Question Answering Datasets and Algorithms Raihan Kabir Naznin Haque Md. Saiful Islam Marium-E. Jannat CoGe 91 1 0 17 Nov 2024
TDSM: Triplet Diffusion for Skeleton-Text Matching in Zero-Shot Action Recognition Jeonghyeok Do Munchurl Kim 152 1 0 16 Nov 2024
NeuralDEM -- Real-time Simulation of Industrial Particulate Flows Benedikt Alkin Tobias Kronlachner Samuele Papa Stefan Pirker Thomas Lichtenegger Johannes Brandstetter PINN AI4CE 128 3 1 14 Nov 2024
Artificial Intelligence for Biomedical Video Generation Linyuan Li Jianing Qiu Anujit Saha Lin Li Poyuan Li Mengxian He Ziyu Guo Wu Yuan VGen 183 0 0 12 Nov 2024
Modulating State Space Model with SlowFast Framework for Compute-Efficient Ultra Low-Latency Speech Enhancement Longbiao Cheng Ashutosh Pandey Buye Xu T. Delbruck V. Ithapu Shih-Chii Liu 71 2 0 04 Nov 2024
Music Foundation Model as Generic Booster for Music Downstream Tasks Weihsiang Liao Yuhta Takida Yukara Ikemiya Zhi-Wei Zhong Chieh-Hsin Lai ... Stefan Uhlich Taketo Akama Woosung Choi Yuichiro Koyama Yuki Mitsufuji 237 1 0 02 Nov 2024
Is Multiple Object Tracking a Matter of Specialization? Gianluca Mancusi Mattia Bernardi Aniello Panariello Angelo Porrello Rita Cucchiara Simone Calderara MoMe 98 2 0 01 Nov 2024
EDT: An Efficient Diffusion Transformer Framework Inspired by Human-like Sketching Xinwang Chen Ning Liu Yinlin Zhu Feifei Feng Jian Tang 47 2 0 31 Oct 2024
BLAST: Block-Level Adaptive Structured Matrices for Efficient Deep Neural Network Inference Changwoo Lee Soo Min Kwon Qing Qu Hun-Seok Kim 90 0 0 28 Oct 2024
Enhancing Lie Detection Accuracy: A Comparative Study of Classic ML, CNN, and GCN Models using Audio-Visual Features Abdelrahman Abdelwahab Ayaan Vaswani Advait Bharathulwar Arnav Kommaraju 72 1 0 26 Oct 2024
GHIL-Glue: Hierarchical Control with Filtered Subgoal Images Kyle Hatch Ashwin Balakrishna Oier Mees Suraj Nair Seohong Park ... Masha Itkina Benjamin Eysenbach Sergey Levine Thomas Kollar Benjamin Burchfiel 119 4 0 26 Oct 2024
Considerations for Distribution Shift Robustness of Diagnostic Models in Healthcare Arno Blaas Adam Goliński Andrew C. Miller Luca Zappella J. Jacobsen Christina Heinze-Deml OOD 74 0 0 25 Oct 2024
Diffusion for Multi-Embodiment Grasping Roman Freiberg Alexander Qualmann Ngo Anh Vien Gerhard Neumann 74 3 0 24 Oct 2024
DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation Yuang Ai Xiaoqiang Zhou Huaibo Huang Xiaotian Han Zhengyu Chen Quanzeng You Hongxia Yang 99 12 0 24 Oct 2024
Unified Microphone Conversion: Many-to-Many Device Mapping via Feature-wise Linear Modulation Myeonghoon Ryu Hongseok Oh Suji Lee Han Park 81 0 0 23 Oct 2024
Composing Diffusion Policies for Few-shot Learning of Movement Trajectories Omkar Patil Anant Sah N. Gopalan DiffM 64 1 0 22 Oct 2024
Allegro: Open the Black Box of Commercial-Level Video Generation Model Yuan Zhou Qiuyue Wang Yuxuan Cai Huan Yang VGen VLM 155 37 0 20 Oct 2024
LoRA-IR: Taming Low-Rank Experts for Efficient All-in-One Image Restoration Yuang Ai Huaibo Huang Ran He 86 5 0 20 Oct 2024
FoMo: A Foundation Model for Mobile Traffic Forecasting with Diffusion Model Haoye Chai Shiyuan Zhang Xiaoqian Qi Yong Li 175 1 0 20 Oct 2024
CAGE: Causal Attention Enables Data-Efficient Generalizable Robotic Manipulation Shangning Xia Hongjie Fang Hao-Shu Fang Cewu Lu CML 86 5 0 19 Oct 2024
Diff-DAgger: Uncertainty Estimation with Diffusion Policy for Robotic Manipulation Sung-Wook Lee Yen-Ling Kuo 127 4 0 18 Oct 2024
GAN-Based Speech Enhancement for Low SNR Using Latent Feature Conditioning Shrishti Saha Shetu Emanuël A. P. Habets Andreas Brendel 38 3 0 17 Oct 2024
The Latent Road to Atoms: Backmapping Coarse-grained Protein Structures with Latent Diffusion Xu Han Yuancheng Sun Kai Chen Kang Liu Qiwei Ye DiffM AI4CE 95 1 0 17 Oct 2024
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance Mitsuhiko Nakamoto Oier Mees Aviral Kumar Sergey Levine OffRL 171 19 0 17 Oct 2024
UniCoN: Universal Conditional Networks for Multi-Age Embryonic Cartilage Segmentation with Sparsely Annotated Data Nishchal Sapkota Yejia Zhang Zihao Zhao Maria Gomez Yuhan Hsi ... Meng Wu E. Jabs J. Richtsmeier S. M. Perrine Benlin Liu AI4CE 60 0 0 16 Oct 2024
BlabberSeg: Real-Time Embedded Open-Vocabulary Aerial Segmentation Haechan Mark Bong Ricardo de Azambuja Giovanni Beltrame VLM 68 0 0 16 Oct 2024
Mind the Gap Between Prototypes and Images in Cross-domain Finetuning Hongduan Tian Feng Liu Zhanke Zhou Tongliang Liu Chengqi Zhang Bo Han VLM 134 1 0 16 Oct 2024
Parametric model reduction of mean-field and stochastic systems via higher-order action matching Jules Berman Tobias Blickhan Benjamin Peherstorfer 149 0 0 15 Oct 2024
Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions Ayush Jain Norio Kosaka Xinhu Li Kyung-Min Kim Erdem Bıyık Joseph J. Lim OffRL 49 0 0 15 Oct 2024
On-the-fly Modulation for Balanced Multimodal Learning Yake Wei D. Hu Henghui Du Ji-Rong Wen 52 11 0 15 Oct 2024
The Ingredients for Robotic Diffusion Transformers Sudeep Dasari Oier Mees Sebastian Zhao Mohan Kumar Srirama Sergey Levine 118 24 0 14 Oct 2024
Conformalized Interactive Imitation Learning: Handling Expert Shift and Intermittent Feedback Michelle Zhao Reid G. Simmons H. Admoni Aaditya Ramdas Andrea Bajcsy 449 4 0 11 Oct 2024
Scaling Laws For Diffusion Transformers Zhengyang Liang Hao He Ceyuan Yang Bo Dai 89 14 0 10 Oct 2024
Diversified and Adaptive Negative Sampling on Knowledge Graphs Ran Liu Zhongzhou Liu Xiaoli Li Hao Wu Yuan Fang 51 0 0 10 Oct 2024
Learning to Generate Diverse Pedestrian Movements from Web Videos with Noisy Labels Zhizheng Liu Joe Lin Wayne Wu Bolei Zhou VGen 450 0 0 10 Oct 2024