1709.07871
FiLM: Visual Reasoning with a General Conditioning Layer
22 September 2017
Ethan Perez
Florian Strub
Harm de Vries
Vincent Dumoulin
Aaron Courville
FAtt
AIMat
OffRL
AI4CE
Papers citing
"FiLM: Visual Reasoning with a General Conditioning Layer"
50 / 1,349 papers shown
Title
Diffusion Models for Robotic Manipulation: A Survey
Rosa Wolf
Yitian Shi
Sheng Liu
Rania Rayyes
127
2
0
01 Jul 2025
Generating Directed Graphs with Dual Attention and Asymmetric Encoding
Alba Carballo-Castro
Manuel Madeira
Yiming Qin
D. Thanou
Pascal Frossard
22
0
0
19 Jun 2025
Towards Bitrate-Efficient and Noise-Robust Speech Coding with Variable Bitrate RVQ
Yunkee Chae
Kyogu Lee
26
0
0
19 Jun 2025
NTIRE 2025 Image Shadow Removal Challenge Report
Florin-Alexandru Vasluianu
Tim Seizinger
Z. Zhou
C. L. Philip Chen
Zongwei Wu
...
Suiyi Zhao
Bo Wang
Yan Luo
M. Y. Wang
Yilin Zhang
56
1
0
18 Jun 2025
POCO: Scalable Neural Forecasting through Population Conditioning
Yu Duan
Hamza Tahir Chaudhry
Misha B. Ahrens
Christopher D Harvey
Matthew G Perich
Karl Deisseroth
Kanaka Rajan
AI4CE
23
0
0
17 Jun 2025
A Variational Framework for Improving Naturalness in Generative Spoken Language Models
Li-Wei Chen
Takuya Higuchi
Zakaria Aldeneh
Ahmed Hussen Abdelaziz
Alexander I. Rudnicky
29
0
0
17 Jun 2025
Multi-Scale Finetuning for Encoder-based Time Series Foundation Models
Zhongzheng Qiao
Chenghao Liu
Y. Zhang
Ming Jin
Quang Pham
Qingsong Wen
P.N. Suganthan
Xudong Jiang
Savitha Ramasamy
AI4TS
AI4CE
22
0
0
17 Jun 2025
Improving Multimodal Learning Balance and Sufficiency through Data Remixing
Xiaoyu Ma
Hao Chen
Yongjian Deng
24
0
0
13 Jun 2025
RT-VC: Real-Time Zero-Shot Voice Conversion with Speech Articulatory Coding
Yisi Liu
Chenyang Wang
Hanjo Kim
Raniya Khan
Gopala Anumanchipalli
107
0
0
12 Jun 2025
UAD: Unsupervised Affordance Distillation for Generalization in Robotic Manipulation
Yihe Tang
Wenlong Huang
Yingke Wang
Chengshu Li
Roy Yuan
Ruohan Zhang
Jiajun Wu
Li Fei-Fei
50
0
0
10 Jun 2025
Why Masking Diffusion Works: Condition on the Jump Schedule for Improved Discrete Diffusion
Alan Nawzad Amin
Nate Gruver
Andrew Gordon Wilson
25
0
0
10 Jun 2025
JAFAR: Jack up Any Feature at Any Resolution
Paul Couairon
Loick Chambon
Louis Serrano
Jean-Emmanuel Haugeard
Matthieu Cord
Nicolas Thome
MDE
42
0
0
10 Jun 2025
PropMEND: Hypernetworks for Knowledge Propagation in LLMs
Zeyu Leo Liu
Greg Durrett
Eunsol Choi
KELM
32
0
0
10 Jun 2025
A Review on Score-based Generative Models for Audio Applications
Ge Zhu
Yutong Wen
Zhiyao Duan
DiffM
MedIm
39
0
0
10 Jun 2025
CXR-LT 2024: A MICCAI challenge on long-tailed, multi-label, and zero-shot disease classification from chest X-ray
Mingquan Lin
G. Holste
Song Wang
Yiliang Zhou
Yishu Wei
...
Hao Chen
Adam Flanders
George Shih
Zhangyang Wang
Yifan Peng
LM&MA
27
0
0
09 Jun 2025
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers
Zhengyao Lv
Tianlin Pan
Chenyang Si
Zhaoxi Chen
W. Zuo
Ziwei Liu
Kwan-Yee K. Wong
33
0
0
09 Jun 2025
FuncGNN: Learning Functional Semantics of Logic Circuits with Graph Neural Networks
Qiyun Zhao
GNN
22
0
0
07 Jun 2025
Dynamic Mixture of Progressive Parameter-Efficient Expert Library for Lifelong Robot Learning
Yuheng Lei
Sitong Mao
Shunbo Zhou
Hongyuan Zhang
Xuelong Li
Ping Luo
CLL
42
0
0
06 Jun 2025
Graph Diffusion that can Insert and Delete
Matteo Ninniri
Marco Podda
D. Bacciu
DiffM
24
0
0
06 Jun 2025
BEAST: Efficient Tokenization of B-Splines Encoded Action Sequences for Imitation Learning
Hongyi Zhou
Weiran Liao
Xi Huang
Yucheng Tang
Fabian Otto
...
Qian Wang
Ömer Erdinç Yagmurlu
Nils Blank
Moritz Reuss
Rudolf Lioutikov
63
0
0
06 Jun 2025
Feature-aware Hypergraph Generation via Next-Scale Prediction
Dorian Gailhard
Enzo Tartaglione
Lirida Naviner
Jhony H. Giraldo
58
0
0
02 Jun 2025
MAGIK: Mapping to Analogous Goals via Imagination-enabled Knowledge Transfer
Ajsal Shereef Palattuparambil
Thommen George Karimpanal
Santu Rana
OffRL
56
0
0
02 Jun 2025
DS-TTS: Zero-Shot Speaker Style Adaptation from Voice Clips via Dynamic Dual-Style Feature Modulation
Ming Meng
Ziyi Yang
Jian Yang
Zhenjie Su
Yonggui Zhu
Zhaoxin Fan
DiffM
VLM
51
0
0
01 Jun 2025
DiffDSR: Dysarthric Speech Reconstruction Using Latent Diffusion Model
Xueyuan Chen
Dongchao Yang
Wenxuan Wu
Minglin Wu
Jing Xu
Xixin Wu
Zhiyong Wu
Helen M. Meng
DiffM
39
0
0
31 May 2025
PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations
Benjamin Holzschuh
Qiang Liu
Georg Kohl
Nils Thuerey
AI4CE
50
1
0
30 May 2025
Walking the Weight Manifold: a Topological Approach to Conditioning Inspired by Neuromodulation
Ari S. Benjamin
Kyle Daruwalla
Christian Pehle
Anthony M. Zador
46
0
0
29 May 2025
UniMoGen: Universal Motion Generation
Aliasghar Khani
Arianna Rampini
Evan Atherton
Bruno Roy
29
0
0
28 May 2025
DiMoSR: Feature Modulation via Multi-Branch Dilated Convolutions for Efficient Image Super-Resolution
M. Yilmaz
Ahmet Bilican
A. Murat Tekalp
SupR
97
0
0
27 May 2025
Learning Optimal Multimodal Information Bottleneck Representations
Qilong Wu
Yiyang Shao
Jun Wang
Xiaobo Sun
33
0
0
26 May 2025
MVP: Multi-source Voice Pathology detection
Alkis Koudounas
Moreno La Quatra
Gabriele Ciravegna
M. Fantini
Erika Crosetti
G. Succo
Tania Cerquitelli
Sabato Marco Siniscalchi
Elena Baralis
45
0
0
26 May 2025
Attractor-Based Speech Separation of Multiple Utterances by Unknown Number of Speakers
Yuzhu Wang
Archontis Politis
Konstantinos Drossos
Tuomas Virtanen
48
0
0
22 May 2025
Neural Conditional Transport Maps
Carlos Rodriguez-Pardo
Leonardo Chiani
Emanuele Borgonovo
Massimo Tavoni
OT
84
0
0
21 May 2025
Utilizing Strategic Pre-training to Reduce Overfitting: Baguan -- A Pre-trained Weather Forecasting Model
Peisong Niu
Ziqing Ma
Tian Zhou
Weiqi Chen
Lefei Shen
Rong Jin
Liang Sun
AI4CE
45
0
0
20 May 2025
Scalable Autoregressive 3D Molecule Generation
Austin H. Cheng
Chong Sun
Alán Aspuru-Guzik
100
1
0
20 May 2025
Neural Functional: Learning Function to Scalar Maps for Neural PDE Surrogates
Anthony Zhou
Amir Barati Farimani
AI4CE
70
0
0
19 May 2025
VLC Fusion: Vision-Language Conditioned Sensor Fusion for Robust Object Detection
Aditya Taparia
Noel Ngu
Mario Leiva
Joshua Shay Kricheli
John Corcoran
Nathaniel D. Bastian
Gerardo Simari
Paulo Shakarian
Ransalu Senanayake
ObjD
86
0
0
19 May 2025
Model alignment using inter-modal bridges
Ali Gholamzadeh
Noor Sajid
231
0
0
18 May 2025
SEPT: Standard-Definition Map Enhanced Scene Perception and Topology Reasoning for Autonomous Driving
Muleilan Pei
Jiayao Shan
Peiliang Li
Jieqi Shi
Jing Huo
Yang Gao
Shaojie Shen
155
0
0
18 May 2025
SAINT: Attention-Based Modeling of Sub-Action Dependencies in Multi-Action Policies
Matthew Landers
Taylor W. Killian
Thomas Hartvigsen
Afsaneh Doryab
61
0
0
17 May 2025
CrafText Benchmark: Advancing Instruction Following in Complex Multimodal Open-Ended World
Zoya Volovikova
G. Gorbov
Petr Kuderov
Aleksandr I. Panov
A. Skrynnik
95
0
0
17 May 2025
LipDiffuser: Lip-to-Speech Generation with Conditional Diffusion Models
Danilo de Oliveira
Julius Richter
Tal Peer
Timo Gerkmann
DiffM
103
0
0
16 May 2025
FlowDreamer: A RGB-D World Model with Flow-based Motion Representations for Robot Manipulation
Jun Guo
Xiaojian Ma
Yikai Wang
Min Yang
Huaping Liu
Qing Li
VGen
72
0
0
15 May 2025
Diffusion-SAFE: Shared Autonomy Framework with Diffusion for Safe Human-to-Robot Driving Handover
Yunxin Fan
Monroe Kennedy III
55
0
0
15 May 2025
Dyadic Mamba: Long-term Dyadic Human Motion Synthesis
Julian Tanke
Takashi Shibuya
Kengo Uchida
Koichi Saito
Yuki Mitsufuji
Mamba
84
0
0
14 May 2025
Mini Diffuser: Fast Multi-task Diffusion Policy Training Using Two-level Mini-batches
Yutong Hu
Pinhao Song
Kehan Wen
Renaud Detry
VLM
80
0
0
14 May 2025
ReCDAP: Relation-Based Conditional Diffusion with Attention Pooling for Few-Shot Knowledge Graph Completion
Jeongho Kim
Chanyeong Heo
Jaehee Jung
129
0
0
12 May 2025
Efficient Robotic Policy Learning via Latent Space Backward Planning
Dongxiu Liu
Haoyi Niu
Zhihao Wang
Jinliang Zheng
Yinan Zheng
Zhonghong Ou
Jianming Hu
Jianxiong Li
Xianyuan Zhan
116
0
0
11 May 2025
Automated Learning of Semantic Embedding Representations for Diffusion Models
Limai Jiang
Yunpeng Cai
DiffM
64
0
0
09 May 2025
3D CAVLA: Leveraging Depth and 3D Context to Generalize Vision Language Action Models for Unseen Tasks
V. Bhat
Yu-Hsiang Lan
Prashanth Krishnamurthy
Ramesh Karri
Farshad Khorrami
136
0
0
09 May 2025
Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments
Pranav Guruprasad
Yangyue Wang
Sudipta Chowdhury
Harshvardhan Sikka
Paul Pu Liang
LM&Ro
VLM
455
1
0
08 May 2025