FiLM: Visual Reasoning with a General Conditioning Layer

22 September 2017
Ethan Perez
Florian Strub
H. D. Vries
Vincent Dumoulin
Aaron Courville
FAtt, AIMat, OffRL, AI4CE

Papers citing "FiLM: Visual Reasoning with a General Conditioning Layer"

50 / 1,349 papers shown
Diffusion Models for Robotic Manipulation: A Survey
Rosa Wolf
Yitian Shi
Sheng Liu
Rania Rayyes
127
2
0
01 Jul 2025
Generating Directed Graphs with Dual Attention and Asymmetric Encoding
Alba Carballo-Castro
Manuel Madeira
Yiming Qin
D. Thanou
Pascal Frossard
22
0
0
19 Jun 2025
Towards Bitrate-Efficient and Noise-Robust Speech Coding with Variable Bitrate RVQ
Yunkee Chae
Kyogu Lee
26
0
0
19 Jun 2025
NTIRE 2025 Image Shadow Removal Challenge Report
Florin-Alexandru Vasluianu
Tim Seizinger
Z. Zhou
C. L. Philip Chen
Zongwei Wu
...
Suiyi Zhao
Bo Wang
Yan Luo
M. Y. Wang
Yilin Zhang
56
1
0
18 Jun 2025
POCO: Scalable Neural Forecasting through Population Conditioning
Yu Duan
Hamza Tahir Chaudhry
Misha B. Ahrens
Christopher D Harvey
Matthew G Perich
Karl Deisseroth
Kanaka Rajan
AI4CE
23
0
0
17 Jun 2025
A Variational Framework for Improving Naturalness in Generative Spoken Language Models
Li-Wei Chen
Takuya Higuchi
Zakaria Aldeneh
Ahmed Hussen Abdelaziz
Alexander I. Rudnicky
29
0
0
17 Jun 2025
Multi-Scale Finetuning for Encoder-based Time Series Foundation Models
Zhongzheng Qiao
Chenghao Liu
Y. Zhang
Ming Jin
Quang Pham
Qingsong Wen
P.N. Suganthan
Xudong Jiang
Savitha Ramasamy
AI4TS, AI4CE
22
0
0
17 Jun 2025
Improving Multimodal Learning Balance and Sufficiency through Data Remixing
Xiaoyu Ma
Hao Chen
Yongjian Deng
24
0
0
13 Jun 2025
RT-VC: Real-Time Zero-Shot Voice Conversion with Speech Articulatory Coding
Yisi Liu
Chenyang Wang
Hanjo Kim
Raniya Khan
Gopala Anumanchipalli
107
0
0
12 Jun 2025
UAD: Unsupervised Affordance Distillation for Generalization in Robotic Manipulation
Yihe Tang
Wenlong Huang
Yingke Wang
Chengshu Li
Roy Yuan
Ruohan Zhang
Jiajun Wu
Li Fei-Fei
50
0
0
10 Jun 2025
Why Masking Diffusion Works: Condition on the Jump Schedule for Improved Discrete Diffusion
Alan Nawzad Amin
Nate Gruver
Andrew Gordon Wilson
25
0
0
10 Jun 2025
JAFAR: Jack up Any Feature at Any Resolution
Paul Couairon
Loick Chambon
Louis Serrano
Jean-Emmanuel Haugeard
Matthieu Cord
Nicolas Thome
MDE
42
0
0
10 Jun 2025
PropMEND: Hypernetworks for Knowledge Propagation in LLMs
Zeyu Leo Liu
Greg Durrett
Eunsol Choi
KELM
32
0
0
10 Jun 2025
A Review on Score-based Generative Models for Audio Applications
Ge Zhu
Yutong Wen
Zhiyao Duan
DiffM, MedIm
39
0
0
10 Jun 2025
CXR-LT 2024: A MICCAI challenge on long-tailed, multi-label, and zero-shot disease classification from chest X-ray
Mingquan Lin
G. Holste
Song Wang
Yiliang Zhou
Yishu Wei
...
Hao Chen
Adam Flanders
George Shih
Zhangyang Wang
Yifan Peng
LM&MA
27
0
0
09 Jun 2025
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers
Zhengyao Lv
Tianlin Pan
Chenyang Si
Zhaoxi Chen
W. Zuo
Ziwei Liu
Kwan-Yee K. Wong
33
0
0
09 Jun 2025
FuncGNN: Learning Functional Semantics of Logic Circuits with Graph Neural Networks
Qiyun Zhao
GNN
22
0
0
07 Jun 2025
Dynamic Mixture of Progressive Parameter-Efficient Expert Library for Lifelong Robot Learning
Yuheng Lei
Sitong Mao
Shunbo Zhou
Hongyuan Zhang
Xuelong Li
Ping Luo
CLL
42
0
0
06 Jun 2025
Graph Diffusion that can Insert and Delete
Matteo Ninniri
Marco Podda
D. Bacciu
DiffM
24
0
0
06 Jun 2025
BEAST: Efficient Tokenization of B-Splines Encoded Action Sequences for Imitation Learning
Hongyi Zhou
Weiran Liao
Xi Huang
Yucheng Tang
Fabian Otto
...
Qian Wang
Ömer Erdinç Yagmurlu
Nils Blank
Moritz Reuss
Rudolf Lioutikov
63
0
0
06 Jun 2025
Feature-aware Hypergraph Generation via Next-Scale Prediction
Dorian Gailhard
Enzo Tartaglione
Lirida Naviner
Jhony H. Giraldo
58
0
0
02 Jun 2025
MAGIK: Mapping to Analogous Goals via Imagination-enabled Knowledge Transfer
Ajsal Shereef Palattuparambil
Thommen George Karimpanal
Santu Rana
OffRL
56
0
0
02 Jun 2025
DS-TTS: Zero-Shot Speaker Style Adaptation from Voice Clips via Dynamic Dual-Style Feature Modulation
Ming Meng
Ziyi Yang
Jian Yang
Zhenjie Su
Yonggui Zhu
Zhaoxin Fan
DiffM, VLM
51
0
0
01 Jun 2025
DiffDSR: Dysarthric Speech Reconstruction Using Latent Diffusion Model
Xueyuan Chen
Dongchao Yang
Wenxuan Wu
Minglin Wu
Jing Xu
Xixin Wu
Zhiyong Wu
Helen M. Meng
DiffM
39
0
0
31 May 2025
PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations
Benjamin Holzschuh
Qiang Liu
Georg Kohl
Nils Thuerey
AI4CE
50
1
0
30 May 2025
Walking the Weight Manifold: a Topological Approach to Conditioning Inspired by Neuromodulation
Ari S. Benjamin
Kyle Daruwalla
Christian Pehle
Anthony M. Zador
46
0
0
29 May 2025
UniMoGen: Universal Motion Generation
Aliasghar Khani
Arianna Rampini
Evan Atherton
Bruno Roy
29
0
0
28 May 2025
DiMoSR: Feature Modulation via Multi-Branch Dilated Convolutions for Efficient Image Super-Resolution
M. Yilmaz
Ahmet Bilican
A. Murat Tekalp
SupR
97
0
0
27 May 2025
Learning Optimal Multimodal Information Bottleneck Representations
Qilong Wu
Yiyang Shao
Jun Wang
Xiaobo Sun
33
0
0
26 May 2025
MVP: Multi-source Voice Pathology detection
Alkis Koudounas
Moreno La Quatra
Gabriele Ciravegna
M. Fantini
Erika Crosetti
G. Succo
Tania Cerquitelli
Sabato Marco Siniscalchi
Elena Baralis
45
0
0
26 May 2025
Attractor-Based Speech Separation of Multiple Utterances by Unknown Number of Speakers
Yuzhu Wang
Archontis Politis
Konstantinos Drossos
Tuomas Virtanen
48
0
0
22 May 2025
Neural Conditional Transport Maps
Carlos Rodriguez-Pardo
Leonardo Chiani
Emanuele Borgonovo
Massimo Tavoni
OT
84
0
0
21 May 2025
Utilizing Strategic Pre-training to Reduce Overfitting: Baguan -- A Pre-trained Weather Forecasting Model
Peisong Niu
Ziqing Ma
Tian Zhou
Weiqi Chen
Lefei Shen
Rong Jin
Liang Sun
AI4CE
45
0
0
20 May 2025
Scalable Autoregressive 3D Molecule Generation
Austin H. Cheng
Chong Sun
Alán Aspuru-Guzik
100
1
0
20 May 2025
Neural Functional: Learning Function to Scalar Maps for Neural PDE Surrogates
Anthony Zhou
Amir Barati Farimani
AI4CE
70
0
0
19 May 2025
VLC Fusion: Vision-Language Conditioned Sensor Fusion for Robust Object Detection
Aditya Taparia
Noel Ngu
Mario Leiva
Joshua Shay Kricheli
John Corcoran
Nathaniel D. Bastian
Gerardo Simari
Paulo Shakarian
Ransalu Senanayake
ObjD
86
0
0
19 May 2025
Model alignment using inter-modal bridges
Ali Gholamzadeh
Noor Sajid
231
0
0
18 May 2025
SEPT: Standard-Definition Map Enhanced Scene Perception and Topology Reasoning for Autonomous Driving
Muleilan Pei
Jiayao Shan
Peiliang Li
Jieqi Shi
Jing Huo
Yang Gao
Shaojie Shen
155
0
0
18 May 2025
SAINT: Attention-Based Modeling of Sub-Action Dependencies in Multi-Action Policies
Matthew Landers
Taylor W. Killian
Thomas Hartvigsen
Afsaneh Doryab
61
0
0
17 May 2025
CrafText Benchmark: Advancing Instruction Following in Complex Multimodal Open-Ended World
Zoya Volovikova
G. Gorbov
Petr Kuderov
Aleksandr I. Panov
A. Skrynnik
95
0
0
17 May 2025
LipDiffuser: Lip-to-Speech Generation with Conditional Diffusion Models
Danilo de Oliveira
Julius Richter
Tal Peer
Timo Gerkmann
DiffM
103
0
0
16 May 2025
FlowDreamer: A RGB-D World Model with Flow-based Motion Representations for Robot Manipulation
Jun Guo
Xiaojian Ma
Yikai Wang
Min Yang
Huaping Liu
Qing Li
VGen
72
0
0
15 May 2025
Diffusion-SAFE: Shared Autonomy Framework with Diffusion for Safe Human-to-Robot Driving Handover
Yunxin Fan
Monroe Kennedy III
55
0
0
15 May 2025
Dyadic Mamba: Long-term Dyadic Human Motion Synthesis
Julian Tanke
Takashi Shibuya
Kengo Uchida
Koichi Saito
Yuki Mitsufuji
Mamba
84
0
0
14 May 2025
Mini Diffuser: Fast Multi-task Diffusion Policy Training Using Two-level Mini-batches
Yutong Hu
Pinhao Song
Kehan Wen
Renaud Detry
VLM
80
0
0
14 May 2025
ReCDAP: Relation-Based Conditional Diffusion with Attention Pooling for Few-Shot Knowledge Graph Completion
Jeongho Kim
Chanyeong Heo
Jaehee Jung
129
0
0
12 May 2025
Efficient Robotic Policy Learning via Latent Space Backward Planning
Dongxiu Liu
Haoyi Niu
Zhihao Wang
Jinliang Zheng
Yinan Zheng
Zhonghong Ou
Jianming Hu
Jianxiong Li
Xianyuan Zhan
116
0
0
11 May 2025
Automated Learning of Semantic Embedding Representations for Diffusion Models
Limai Jiang
Yunpeng Cai
DiffM
64
0
0
09 May 2025
3D CAVLA: Leveraging Depth and 3D Context to Generalize Vision Language Action Models for Unseen Tasks
V. Bhat
Yu-Hsiang Lan
Prashanth Krishnamurthy
Ramesh Karri
Farshad Khorrami
136
0
0
09 May 2025
Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments
Pranav Guruprasad
Yangyue Wang
Sudipta Chowdhury
Harshvardhan Sikka
Paul Pu Liang
LM&Ro, VLM
455
1
0
08 May 2025