CaTs and DAGs: Integrating Directed Acyclic Graphs with Transformers and Fully-Connected Neural Networks for Causally Constrained Predictions

18 October 2024

Papers citing "CaTs and DAGs: Integrating Directed Acyclic Graphs with Transformers and Fully-Connected Neural Networks for Causally Constrained Predictions"

Smoke and Mirrors in Causal Downstream Tasks Riccardo Cadei Lukas Lindorfer Sylvia Cremer Cordelia Schmid Francesco Locatello CML 76 5 0 27 May 2024
StableMask: Refining Causal Masking in Decoder-only Transformer Qingyu Yin Xuzheng He Zhuang Xiang Yu Zhao Jianhua Yao Xiaoyu Shen Qiang Zhang 23 11 0 07 Feb 2024
The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry Michael Zhang Kush S. Bhatia Hermann Kumbong Christopher Ré 61 54 0 06 Feb 2024
Neural Topological Ordering for Computation Graphs Mukul Gagrani Corrado Rainone Yang Yang Harris Teague Wonseok Jeon H. V. Hoof Weizhen Zeng P. Zappi Chris Lott Roberto Bondesan 52 12 0 13 Jul 2022
Directed Acyclic Graph Network for Conversational Emotion Recognition Weizhou Shen Siyue Wu Yunyi Yang Xiaojun Quan 65 241 0 27 May 2021
D'ya like DAGs? A Survey on Structure Learning and Causal Discovery M. Vowels Necati Cihan Camgöz Richard Bowden CML 109 300 0 03 Mar 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale Alexey Dosovitskiy Lucas Beyer Alexander Kolesnikov Dirk Weissenborn Xiaohua Zhai ... Matthias Minderer G. Heigold Sylvain Gelly Jakob Uszkoreit N. Houlsby ViT 632 41,003 0 22 Oct 2020
Rethinking Attention with Performers K. Choromanski Valerii Likhosherstov David Dohan Xingyou Song Andreea Gane ... Afroz Mohiuddin Lukasz Kaiser David Belanger Lucy J. Colwell Adrian Weller 179 1,580 0 30 Sep 2020
Language Models are Few-Shot Learners Tom B. Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared Kaplan ... Christopher Berner Sam McCandlish Alec Radford Ilya Sutskever Dario Amodei BDL 743 41,932 0 28 May 2020
Causality matters in medical imaging Daniel Coelho De Castro Ian Walker Ben Glocker CML 38 344 0 17 Dec 2019
Adapting Neural Networks for the Estimation of Treatment Effects Claudia Shi David M. Blei Victor Veitch CML 145 374 0 05 Jun 2019
On the Fairness of Disentangled Representations Francesco Locatello G. Abbati Tom Rainforth Stefan Bauer Bernhard Schölkopf Olivier Bachem FaML DRL 73 226 0 31 May 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Jacob Devlin Ming-Wei Chang Kenton Lee Kristina Toutanova VLM SSL SSeg 1.7K 94,770 0 11 Oct 2018
Hyperparameters and Tuning Strategies for Random Forest Philipp Probst Marvin N. Wright A. Boulesteix 109 1,396 0 10 Apr 2018
Disentangled Sequential Autoencoder Yingzhen Li Stephan Mandt CoGe 71 271 0 08 Mar 2018
Causal Effect Inference with Deep Latent-Variable Models Christos Louizos Uri Shalit Joris Mooij David Sontag R. Zemel Max Welling CML BDL 198 743 0 24 May 2017
Layer Normalization Jimmy Lei Ba J. Kiros Geoffrey E. Hinton 410 10,482 0 21 Jul 2016
MADE: Masked Autoencoder for Distribution Estimation M. Germain Karol Gregor Iain Murray Hugo Larochelle OOD SyDa UQCV 170 867 0 12 Feb 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift Sergey Ioffe Christian Szegedy OOD 463 43,289 0 11 Feb 2015
A Deep and Tractable Density Estimator Benigno Uria Iain Murray Hugo Larochelle BDL 94 194 0 07 Oct 2013