ResearchTrend.AI
FiLM: Visual Reasoning with a General Conditioning Layer

22 September 2017
Ethan Perez
Florian Strub
H. D. Vries
Vincent Dumoulin
Aaron Courville
    FAtt
    AIMat
    OffRL
    AI4CE

Papers citing "FiLM: Visual Reasoning with a General Conditioning Layer"

50 / 1,312 papers shown
Towards Efficient and Scalable Training of Differentially Private Deep Learning
Sebastian Rodriguez Beltran
Marlon Tobaben
Niki Loppi
Antti Honkela
34
0
0
25 Jun 2024
F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data
Zexing Xu
Linjun Zhang
Sitan Yang
Rasoul Etesami
Hanghang Tong
Huan Zhang
Jiawei Han
AI4TS
36
3
0
23 Jun 2024
Multimodal Multilabel Classification by CLIP
Yanming Guo
VLM
32
0
0
23 Jun 2024
Open-vocabulary Pick and Place via Patch-level Semantic Maps
Mingxi Jia
Haojie Huang
Zhewen Zhang
Chenghao Wang
Linfeng Zhao
Dian Wang
J. Liu
Robin Walters
Robert Platt
Stefanie Tellex
LM&Ro
44
5
0
21 Jun 2024
Low Fidelity Visuo-Tactile Pretraining Improves Vision-Only Manipulation Performance
Selam Gano
Abraham George
A. Farimani
OnRL
45
1
0
21 Jun 2024
CONMOD: Controllable Neural Frame-based Modulation Effects
Gyubin Lee
Hounsu Kim
Junwon Lee
Juhan Nam
35
0
0
20 Jun 2024
Articulatory Encodec: Coding Speech through Vocal Tract Kinematics
Cheol Jun Cho
Peter Wu
Tejas S. Prabhune
Dhruv Agarwal
Gopala K. Anumanchipalli
36
1
0
18 Jun 2024
Improving Text-To-Audio Models with Synthetic Captions
Zhifeng Kong
Sang-gil Lee
Deepanway Ghosal
Navonil Majumder
Ambuj Mehrish
Rafael Valle
Soujanya Poria
Bryan Catanzaro
53
11
0
18 Jun 2024
Understanding Multi-Granularity for Open-Vocabulary Part Segmentation
Jiho Choi
Seonho Lee
Seungho Lee
Minhyun Lee
Hyunjung Shim
OCL
45
0
0
17 Jun 2024
Improving Reward-Conditioned Policies for Multi-Armed Bandits using Normalized Weight Functions
Kai Xu
Farid Tajaddodianfar
Ben Allison
21
0
0
16 Jun 2024
CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making
Zibin Dong
Yifu Yuan
Jianye Hao
Fei Ni
Yi Ma
Pengyi Li
Yan Zheng
DiffM
58
9
0
13 Jun 2024
Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models
Qihao Liu
Zhanpeng Zeng
Ju He
Qihang Yu
Xiaohui Shen
Liang-Chieh Chen
53
19
0
13 Jun 2024
Advancing Graph Generation through Beta Diffusion
Yilin He
Xinyang Liu
Bo Chen
Mingyuan Zhou
DiffM
31
0
0
13 Jun 2024
End-to-end Streaming model for Low-Latency Speech Anonymization
Waris Quamer
Ricardo Gutierrez-Osuna
39
0
0
13 Jun 2024
TSE-PI: Target Sound Extraction under Reverberant Environments with Pitch Information
Yiwen Wang
Xihong Wu
46
2
0
13 Jun 2024
Meta-Learning Neural Procedural Biases
Christian Raymond
Qi Chen
Bing Xue
Mengjie Zhang
52
1
0
12 Jun 2024
Comparative Analysis of Personalized Voice Activity Detection Systems: Assessing Real-World Effectiveness
Satyam Kumar
Sai Srujana Buddi
U. Sarawgi
Vineet Garg
Shivesh Ranjan
Ognjen Rudovic
Ahmed Hussen Abdelaziz
Saurabh N. Adya
53
2
0
12 Jun 2024
AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation
Kai Wang
Shijian Deng
Jing Shi
Dimitrios Hatzinakos
Yapeng Tian
VGen
80
10
0
11 Jun 2024
BAKU: An Efficient Transformer for Multi-Task Policy Learning
Siddhant Haldar
Zhuoran Peng
Lerrel Pinto
OffRL
44
27
0
11 Jun 2024
ICGAN: An implicit conditioning method for interpretable feature control of neural audio synthesis
Yunyi Liu
Craig Jin
32
0
0
11 Jun 2024
FlexLoc: Conditional Neural Networks for Zero-Shot Sensor Perspective Invariance in Object Localization with Distributed Multimodal Sensors
Jason Wu
Ziqi Wang
Xiaomin Ouyang
Ho Lyun Jeong
Colin Samplawski
Lance M. Kaplan
Benjamin M. Marlin
Mani Srivastava
HAI
52
1
0
10 Jun 2024
Compute Better Spent: Replacing Dense Layers with Structured Matrices
Shikai Qiu
Andres Potapczynski
Marc Finzi
Micah Goldblum
Andrew Gordon Wilson
40
11
0
10 Jun 2024
Space-Time Continuous PDE Forecasting using Equivariant Neural Fields
David M. Knigge
David R. Wessels
Riccardo Valperga
Samuele Papa
J. Sonke
E. Gavves
Erik J. Bekkers
AI4CE
36
4
0
10 Jun 2024
Grounding Continuous Representations in Geometry: Equivariant Neural Fields
David R. Wessels
David M. Knigge
Samuele Papa
Riccardo Valperga
Sharvaree P. Vadgama
E. Gavves
Erik J. Bekkers
52
7
0
09 Jun 2024
CityCraft: A Real Crafter for 3D City Generation
Jie Deng
Wenhao Chai
Junsheng Huang
Zhonghan Zhao
Qixuan Huang
...
Shengyu Hao
Wenhao Hu
Lei Li
X. Li
Gaoang Wang
44
12
0
07 Jun 2024
Diffusion Models in $\textit{De Novo}$ Drug Design
Amira Alakhdar
Barnabás Póczos
Newell Washburn
MedIm
38
13
0
07 Jun 2024
A Human-in-the-Loop Approach to Improving Cross-Text Prosody Transfer
Himanshu Maurya
A. Sigurgeirsson
30
0
0
06 Jun 2024
Vectorized Conditional Neural Fields: A Framework for Solving Time-dependent Parametric Partial Differential Equations
Jan Hagnberger
Marimuthu Kalimuthu
Daniel Musekamp
Mathias Niepert
AI4TS
AI4CE
47
5
0
06 Jun 2024
The Brain's Bitter Lesson: Scaling Speech Decoding With Self-Supervised Learning
D. Jayalath
Gilad Landau
Brendan Shillingford
M. Woolrich
Oiwi Parker Jones
SSL
62
4
0
06 Jun 2024
Wings: Learning Multimodal LLMs without Text-only Forgetting
Yi-Kai Zhang
Shiyin Lu
Yang Li
Yanqing Ma
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
De-Chuan Zhan
Han-Jia Ye
VLM
35
6
0
05 Jun 2024
Textless Acoustic Model with Self-Supervised Distillation for Noise-Robust Expressive Speech-to-Speech Translation
Min-Jae Hwang
Ilia Kulikov
Benjamin Peloquin
Hongyu Gong
Peng-Jen Chen
Ann Lee
35
1
0
04 Jun 2024
Operational Latent Spaces
Scott H. Hawley
Austin R. Tackett
25
0
0
04 Jun 2024
PDP: Physics-Based Character Animation via Diffusion Policy
Takara E. Truong
Michael Piseno
Zhaoming Xie
Chenxi Liu
VGen
51
6
0
03 Jun 2024
Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering
Xingrui Wang
Wufei Ma
Angtian Wang
Shuo Chen
Adam Kortylewski
Alan L. Yuille
34
3
0
02 Jun 2024
Slight Corruption in Pre-training Data Makes Better Diffusion Models
Hao Chen
Yujin Han
Diganta Misra
Xiang Li
Kai Hu
Difan Zou
Masashi Sugiyama
Jindong Wang
Bhiksha Raj
DiffM
47
5
0
30 May 2024
Instruction-Guided Visual Masking
Jinliang Zheng
Jianxiong Li
Si Cheng
Yinan Zheng
Jiaming Li
Jihao Liu
Yu Liu
Jingjing Liu
Xianyuan Zhan
53
5
0
30 May 2024
Empowering Embodied Manipulation: A Bimanual-Mobile Robot Manipulation Dataset for Household Tasks
Tianle Zhang
Dongjiang Li
Yihang Li
Zecui Zeng
Lin Zhao
...
Yue Chen
Xuelong Wei
Yibing Zhan
Lusong Li
Xiaodong He
37
7
0
29 May 2024
MOKD: Cross-domain Finetuning for Few-shot Classification via Maximizing Optimized Kernel Dependence
Hongduan Tian
Feng Liu
Tongliang Liu
Bo Du
Yiu-ming Cheung
Bo Han
29
1
0
29 May 2024
On the Limits of Multi-modal Meta-Learning with Auxiliary Task Modulation Using Conditional Batch Normalization
Jordi Armengol-Estapé
Vincent Michalski
Ramnath Kumar
P. St-Charles
Doina Precup
Samira Ebrahimi Kahou
35
0
0
29 May 2024
Improving global awareness of linkset predictions using Cross-Attentive Modulation tokens
Félix Marcoccia
C. Adjih
P. Mühlethaler
46
0
0
28 May 2024
DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention
Lianghui Zhu
Zilong Huang
Bencheng Liao
Jun Hao Liew
Hanshu Yan
Jiashi Feng
Xinggang Wang
70
13
0
28 May 2024
Simplicity Bias of Two-Layer Networks beyond Linearly Separable Data
Nikita Tsoy
Nikola Konstantinov
37
4
0
27 May 2024
Unisolver: PDE-Conditional Transformers Are Universal PDE Solvers
Zhou Hang
Yuezhou Ma
Haixu Wu
Haowen Wang
Mingsheng Long
AI4CE
36
9
0
27 May 2024
PTQ4DiT: Post-training Quantization for Diffusion Transformers
Junyi Wu
Haoxuan Wang
Yuzhang Shang
Mubarak Shah
Yan Yan
MQ
33
19
0
25 May 2024
Self-distilled Dynamic Fusion Network for Language-based Fashion Retrieval
Yiming Wu
Hangfei Li
Fangfang Wang
Yilong Zhang
Ronghua Liang
13
3
0
24 May 2024
StyleMaster: Towards Flexible Stylized Image Generation with Diffusion Models
Chengming Xu
Kai Hu
Donghao Luo
Jiangning Zhang
Wei Li
Yanhao Ge
Chengjie Wang
DiffM
45
0
0
24 May 2024
Semantica: An Adaptable Image-Conditioned Diffusion Model
Manoj Kumar
N. Houlsby
Emiel Hoogeboom
DiffM
VLM
40
0
0
23 May 2024
Defining error accumulation in ML atmospheric simulators
R. Parthipan
Mohit Anand
Hannah M. Christensen
J. S. Hosking
Damon J. Wischik
29
1
0
23 May 2024
ArchesWeather: An efficient AI weather forecasting model at 1.5° resolution
Guillaume Couairon
Christian Lessig
A. Charantonis
C. Monteleoni
27
1
0
23 May 2024
TerDiT: Ternary Diffusion Models with Transformers
Xudong Lu
Aojun Zhou
Ziyi Lin
Qi Liu
Yuhui Xu
Renrui Zhang
Yafei Wen
Shuai Ren
Peng Gao
Junchi Yan
MQ
55
2
0
23 May 2024