FiLM: Visual Reasoning with a General Conditioning Layer

22 September 2017
Ethan Perez
Florian Strub
H. D. Vries
Vincent Dumoulin
Aaron Courville
    FAtt AIMat OffRL AI4CE

Papers citing "FiLM: Visual Reasoning with a General Conditioning Layer"

50 / 1,349 papers shown
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training
Renqiu Xia
Mingxing Li
Hancheng Ye
Wenjie Wu
Hongbin Zhou
...
Zeang Sheng
Botian Shi
Tao Chen
Junchi Yan
Bo Zhang
196
10
0
16 Dec 2024
VinTAGe: Joint Video and Text Conditioning for Holistic Audio Generation
Saksham Singh Kushwaha
Yapeng Tian
DiffM VGen
127
2
0
14 Dec 2024
Fast and Robust Visuomotor Riemannian Flow Matching Policy
Haoran Ding
Noémie Jaquier
Jan Peters
Leonel Rozo
180
4
0
14 Dec 2024
SQ-Whisper: Speaker-Querying based Whisper Model for Target-Speaker ASR
Pengcheng Guo
Xuankai Chang
Hang Lv
Shinji Watanabe
Lei Xie
113
1
0
07 Dec 2024
CA-SSLR: Condition-Aware Self-Supervised Learning Representation for Generalized Speech Processing
Yen-Ju Lu
Jing Liu
Thomas Thebaud
Laureano Moro-Velazquez
Ariya Rastrow
Najim Dehak
Jesus Villalba
135
1
0
05 Dec 2024
TASR: Timestep-Aware Diffusion Model for Image Super-Resolution
Qinwei Lin
Xiaopeng Sun
Yu Gao
Yujie Zhong
Dengjie Li
Zheng Zhao
Haoqian Wang
137
0
0
04 Dec 2024
Diffusion-VLA: Generalizable and Interpretable Robot Foundation Model via Self-Generated Reasoning
Junjie Wen
Minjie Zhu
Yinlin Zhu
Zhibin Tang
Jinming Li
...
Chengmeng Li
Xiaoyu Liu
Chaomin Shen
Yaxin Peng
Feifei Feng
149
22
0
04 Dec 2024
Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation
Zilyu Ye
Zhiyang Chen
Tiancheng Li
Zemin Huang
Weijian Luo
Guo-Jun Qi
DiffM
132
6
0
02 Dec 2024
Zero-shot Musical Stem Retrieval with Joint-Embedding Predictive Architectures
Alain Riou
Antonin Gagnere
Gaëtan Hadjeres
Stefan Lattner
Geoffroy Peeters
134
0
0
29 Nov 2024
Unpacking the Individual Components of Diffusion Policy
Xiu Yuan
164
0
0
27 Nov 2024
MWFormer: Multi-Weather Image Restoration Using Degradation-Aware Transformers
Ruoxi Zhu
Zhengzhong Tu
Jiaming Liu
A. Bovik
Yibo Fan
ViT
126
10
0
26 Nov 2024
Multi-Resolution Generative Modeling of Human Motion from Limited Data
David Eduardo Moreno-Villamarín
Anna Hilsmann
Peter Eisert
DiffM 3DH
127
0
0
25 Nov 2024
Gaussian Scenes: Pose-Free Sparse-View Scene Reconstruction using Depth-Enhanced Diffusion Priors
Soumava Paul
Prakhar Kaushik
Alan Yuille
3DGS DiffM
544
0
0
24 Nov 2024
Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios
Shantanu Jaiswal
Debaditya Roy
Basura Fernando
Cheston Tan
ReLM LRM
138
2
0
20 Nov 2024
Generating 3D-Consistent Videos from Unposed Internet Photos
Gene Chou
Kai Zhang
Sai Bi
Hao Tan
Zexiang Xu
Fujun Luan
Bharath Hariharan
Noah Snavely
3DGS VGen
164
3
0
20 Nov 2024
A Comprehensive Survey on Visual Question Answering Datasets and Algorithms
Raihan Kabir
Naznin Haque
Md. Saiful Islam
Marium-E. Jannat
CoGe
91
1
0
17 Nov 2024
TDSM: Triplet Diffusion for Skeleton-Text Matching in Zero-Shot Action Recognition
Jeonghyeok Do
Munchurl Kim
152
1
0
16 Nov 2024
NeuralDEM -- Real-time Simulation of Industrial Particulate Flows
Benedikt Alkin
Tobias Kronlachner
Samuele Papa
Stefan Pirker
Thomas Lichtenegger
Johannes Brandstetter
PINN AI4CE
128
3
1
14 Nov 2024
Artificial Intelligence for Biomedical Video Generation
Linyuan Li
Jianing Qiu
Anujit Saha
Lin Li
Poyuan Li
Mengxian He
Ziyu Guo
Wu Yuan
VGen
183
0
0
12 Nov 2024
Modulating State Space Model with SlowFast Framework for Compute-Efficient Ultra Low-Latency Speech Enhancement
Longbiao Cheng
Ashutosh Pandey
Buye Xu
T. Delbruck
V. Ithapu
Shih-Chii Liu
71
2
0
04 Nov 2024
Music Foundation Model as Generic Booster for Music Downstream Tasks
Weihsiang Liao
Yuhta Takida
Yukara Ikemiya
Zhi-Wei Zhong
Chieh-Hsin Lai
...
Stefan Uhlich
Taketo Akama
Woosung Choi
Yuichiro Koyama
Yuki Mitsufuji
237
1
0
02 Nov 2024
Is Multiple Object Tracking a Matter of Specialization?
Gianluca Mancusi
Mattia Bernardi
Aniello Panariello
Angelo Porrello
Rita Cucchiara
Simone Calderara
MoMe
98
2
0
01 Nov 2024
EDT: An Efficient Diffusion Transformer Framework Inspired by Human-like Sketching
Xinwang Chen
Ning Liu
Yinlin Zhu
Feifei Feng
Jian Tang
47
2
0
31 Oct 2024
BLAST: Block-Level Adaptive Structured Matrices for Efficient Deep Neural Network Inference
Changwoo Lee
Soo Min Kwon
Qing Qu
Hun-Seok Kim
90
0
0
28 Oct 2024
Enhancing Lie Detection Accuracy: A Comparative Study of Classic ML, CNN, and GCN Models using Audio-Visual Features
Abdelrahman Abdelwahab
Ayaan Vaswani
Advait Bharathulwar
Arnav Kommaraju
72
1
0
26 Oct 2024
GHIL-Glue: Hierarchical Control with Filtered Subgoal Images
Kyle Hatch
Ashwin Balakrishna
Oier Mees
Suraj Nair
Seohong Park
...
Masha Itkina
Benjamin Eysenbach
Sergey Levine
Thomas Kollar
Benjamin Burchfiel
119
4
0
26 Oct 2024
Considerations for Distribution Shift Robustness of Diagnostic Models in Healthcare
Arno Blaas
Adam Goliński
Andrew C. Miller
Luca Zappella
J. Jacobsen
Christina Heinze-Deml
OOD
74
0
0
25 Oct 2024
Diffusion for Multi-Embodiment Grasping
Roman Freiberg
Alexander Qualmann
Ngo Anh Vien
Gerhard Neumann
74
3
0
24 Oct 2024
DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation
Yuang Ai
Xiaoqiang Zhou
Huaibo Huang
Xiaotian Han
Zhengyu Chen
Quanzeng You
Hongxia Yang
99
12
0
24 Oct 2024
Unified Microphone Conversion: Many-to-Many Device Mapping via Feature-wise Linear Modulation
Myeonghoon Ryu
Hongseok Oh
Suji Lee
Han Park
81
0
0
23 Oct 2024
Composing Diffusion Policies for Few-shot Learning of Movement Trajectories
Omkar Patil
Anant Sah
N. Gopalan
DiffM
64
1
0
22 Oct 2024
Allegro: Open the Black Box of Commercial-Level Video Generation Model
Yuan Zhou
Qiuyue Wang
Yuxuan Cai
Huan Yang
VGen VLM
155
37
0
20 Oct 2024
LoRA-IR: Taming Low-Rank Experts for Efficient All-in-One Image Restoration
Yuang Ai
Huaibo Huang
Ran He
86
5
0
20 Oct 2024
FoMo: A Foundation Model for Mobile Traffic Forecasting with Diffusion Model
Haoye Chai
Shiyuan Zhang
Xiaoqian Qi
Yong Li
175
1
0
20 Oct 2024
CAGE: Causal Attention Enables Data-Efficient Generalizable Robotic Manipulation
Shangning Xia
Hongjie Fang
Hao-Shu Fang
Cewu Lu
CML
86
5
0
19 Oct 2024
Diff-DAgger: Uncertainty Estimation with Diffusion Policy for Robotic Manipulation
Sung-Wook Lee
Yen-Ling Kuo
127
4
0
18 Oct 2024
GAN-Based Speech Enhancement for Low SNR Using Latent Feature Conditioning
Shrishti Saha Shetu
Emanuël A. P. Habets
Andreas Brendel
38
3
0
17 Oct 2024
The Latent Road to Atoms: Backmapping Coarse-grained Protein Structures with Latent Diffusion
Xu Han
Yuancheng Sun
Kai Chen
Kang Liu
Qiwei Ye
DiffM AI4CE
95
1
0
17 Oct 2024
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance
Mitsuhiko Nakamoto
Oier Mees
Aviral Kumar
Sergey Levine
OffRL
171
19
0
17 Oct 2024
UniCoN: Universal Conditional Networks for Multi-Age Embryonic Cartilage Segmentation with Sparsely Annotated Data
Nishchal Sapkota
Yejia Zhang
Zihao Zhao
Maria Gomez
Yuhan Hsi
...
Meng Wu
E. Jabs
J. Richtsmeier
S. M. Perrine
Benlin Liu
AI4CE
60
0
0
16 Oct 2024
BlabberSeg: Real-Time Embedded Open-Vocabulary Aerial Segmentation
Haechan Mark Bong
Ricardo de Azambuja
Giovanni Beltrame
VLM
68
0
0
16 Oct 2024
Mind the Gap Between Prototypes and Images in Cross-domain Finetuning
Hongduan Tian
Feng Liu
Zhanke Zhou
Tongliang Liu
Chengqi Zhang
Bo Han
VLM
134
1
0
16 Oct 2024
Parametric model reduction of mean-field and stochastic systems via higher-order action matching
Jules Berman
Tobias Blickhan
Benjamin Peherstorfer
149
0
0
15 Oct 2024
Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions
Ayush Jain
Norio Kosaka
Xinhu Li
Kyung-Min Kim
Erdem Bıyık
Joseph J. Lim
OffRL
49
0
0
15 Oct 2024
On-the-fly Modulation for Balanced Multimodal Learning
Yake Wei
D. Hu
Henghui Du
Ji-Rong Wen
52
11
0
15 Oct 2024
The Ingredients for Robotic Diffusion Transformers
Sudeep Dasari
Oier Mees
Sebastian Zhao
Mohan Kumar Srirama
Sergey Levine
118
24
0
14 Oct 2024
Conformalized Interactive Imitation Learning: Handling Expert Shift and Intermittent Feedback
Michelle Zhao
Reid G. Simmons
H. Admoni
Aaditya Ramdas
Andrea Bajcsy
449
4
0
11 Oct 2024
Scaling Laws For Diffusion Transformers
Zhengyang Liang
Hao He
Ceyuan Yang
Bo Dai
89
14
0
10 Oct 2024
Diversified and Adaptive Negative Sampling on Knowledge Graphs
Ran Liu
Zhongzhou Liu
Xiaoli Li
Hao Wu
Yuan Fang
51
0
0
10 Oct 2024
Learning to Generate Diverse Pedestrian Movements from Web Videos with Noisy Labels
Zhizheng Liu
Joe Lin
Wayne Wu
Bolei Zhou
VGen
450
0
0
10 Oct 2024