ResearchTrend.AI

From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification

5 February 2016
André F. T. Martins
Ramón Fernández Astudillo
arXiv: 1602.02068

Papers citing "From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification"

50 / 128 papers shown
CLCNet: Rethinking of Ensemble Modeling with Classification Confidence Network
Yaodong Yu
S. Horng
21
0
0
19 May 2022
Time-Series Domain Adaptation via Sparse Associative Structure Alignment: Learning Invariance and Variance
Zijian Li
Ruichu Cai
Jiawei Chen
Yuguang Yan
Wei Chen
Keli Zhang
Junjian Ye
CML
TTA
AI4TS
42
5
0
07 May 2022
Learning to Scaffold: Optimizing Model Explanations for Teaching
Patrick Fernandes
Marcos Vinícius Treviso
Danish Pruthi
André F. T. Martins
Graham Neubig
FAtt
32
22
0
22 Apr 2022
Learning Self-Modulating Attention in Continuous Time Space with Applications to Sequential Recommendation
Chao Chen
Haoyu Geng
Nianzu Yang
Junchi Yan
Daiyue Xue
Jianping Yu
Xiaokang Yang
HAI
AI4TS
27
11
0
30 Mar 2022
On Neural Network Equivalence Checking using SMT Solvers
Charis Eleftheriadis
Nikolaos Kekatos
Panagiotis Katsaros
S. Tripakis
AAML
32
12
0
22 Mar 2022
TraceNet: Tracing and Locating the Key Elements in Sentiment Analysis
Qinghua Zhao
Shuai Ma
19
0
0
28 Feb 2022
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms
Romain Laroche
Rémi Tachet des Combes
48
2
0
15 Feb 2022
Are Transformers More Robust? Towards Exact Robustness Verification for Transformers
B. Liao
Chih-Hong Cheng
Hasan Esen
Alois Knoll
AAML
26
1
0
08 Feb 2022
Distance-Ratio-Based Formulation for Metric Learning
Hyeongji Kim
P. Parviainen
K. Malde
19
1
0
21 Jan 2022
Taming Overconfident Prediction on Unlabeled Data from Hindsight
Jing Li
Yuangang Pan
Ivor W. Tsang
23
1
0
15 Dec 2021
Towards Controllable Agent in MOBA Games with Generative Modeling
Shubao Zhang
42
0
0
15 Dec 2021
Exploring Social Posterior Collapse in Variational Autoencoder for Interaction Modeling
Chen Tang
Wei Zhan
Masayoshi Tomizuka
DRL
39
19
0
01 Dec 2021
Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models
Phil Chen
Masha Itkina
Ransalu Senanayake
Mykel J. Kochenderfer
41
6
0
27 Oct 2021
Understanding Interlocking Dynamics of Cooperative Rationalization
Mo Yu
Yang Zhang
Shiyu Chang
Tommi Jaakkola
29
41
0
26 Oct 2021
Deep Neural Networks and Tabular Data: A Survey
V. Borisov
Tobias Leemann
Kathrin Seßler
Johannes Haug
Martin Pawelczyk
Gjergji Kasneci
LMTD
52
650
0
05 Oct 2021
Trustworthy AI: From Principles to Practices
Bo Li
Peng Qi
Bo Liu
Shuai Di
Jingen Liu
Jiquan Pei
Jinfeng Yi
Bowen Zhou
119
357
0
04 Oct 2021
A Deep Learning Perspective on Connected Automated Vehicle (CAV) Cybersecurity and Threat Intelligence
M. Basnet
Mohd. Hasan Ali
28
7
0
22 Sep 2021
Identifying Autism Spectrum Disorder Based on Individual-Aware Down-Sampling and Multi-Modal Learning
Li Pan
Jundong Liu
M. Shi
C. Wong
K. Chan
32
11
0
19 Sep 2021
Subword Mapping and Anchoring across Languages
Giorgos Vernikos
Andrei Popescu-Belis
72
12
0
09 Sep 2021
Layer-wise Adaptive Graph Convolution Networks Using Generalized Pagerank
Kishan Wimalawarne
Taiji Suzuki
GNN
22
2
0
24 Aug 2021
ARM-Net: Adaptive Relation Modeling Network for Structured Data
Shaofeng Cai
Kaiping Zheng
Gang Chen
H. V. Jagadish
Beng Chin Ooi
Meihui Zhang
40
50
0
05 Jul 2021
Attention-based multi-channel speaker verification with ad-hoc microphone arrays
Che-Yuan Liang
Junqi Chen
Shanzheng Guan
Xiao-Lei Zhang
25
9
0
01 Jul 2021
Iterative Methods for Private Synthetic Data: Unifying Framework and New Methods
Terrance Liu
G. Vietri
Zhiwei Steven Wu
SyDa
33
61
0
14 Jun 2021
BERTTune: Fine-Tuning Neural Machine Translation with BERTScore
Inigo Jauregi Unanue
Jacob Parnell
Massimo Piccardi
26
32
0
04 Jun 2021
Scaling sparsemax based channel selection for speech recognition with ad-hoc microphone arrays
Junqi Chen
Xiao-Lei Zhang
13
10
0
29 Mar 2021
Differentially Private Query Release Through Adaptive Projection
Sergul Aydore
William Brown
Michael Kearns
K. Kenthapadi
Luca Melis
Aaron Roth
Ankit Siva
54
64
0
11 Mar 2021
MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers
Krishna Pillutla
Swabha Swayamdipta
Rowan Zellers
John Thickstun
Sean Welleck
Yejin Choi
Zaïd Harchaoui
48
343
0
02 Feb 2021
Explain and Predict, and then Predict Again
Zijian Zhang
Koustav Rudra
Avishek Anand
FAtt
33
51
0
11 Jan 2021
Time Series Domain Adaptation via Sparse Associative Structure Alignment
Ruichu Cai
Jiawei Chen
Zijian Li
Wei Chen
Keli Zhang
Junjian Ye
Zhuozhang Li
Xiaoyan Yang
Zhenjie Zhang
CML
TTA
OOD
AI4TS
27
88
0
22 Dec 2020
Know Your Limits: Uncertainty Estimation with ReLU Classifiers Fails at Reliable OOD Detection
Dennis Ulmer
Giovanni Cina
OODD
37
31
0
09 Dec 2020
Optimal Approximation-Smoothness Tradeoffs for Soft-Max Functions
Alessandro Epasto
Mohammad Mahdian
Vahab Mirrokni
Manolis Zampetakis
25
15
0
22 Oct 2020
A Comparative Study of Deep Learning Loss Functions for Multi-Label Remote Sensing Image Classification
Hichame Yessou
Gencer Sumbul
Begüm Demir
19
31
0
29 Sep 2020
Sparse Graph to Sequence Learning for Vision Conditioned Long Textual Sequence Generation
Aditya Mogadala
Marius Mosbach
Dietrich Klakow
VLM
184
0
0
12 Jul 2020
Learning Abstract Models for Strategic Exploration and Fast Reward Transfer
E. Liu
Ramtin Keramati
Sudarshan Seshadri
Kelvin Guu
Panupong Pasupat
Emma Brunskill
Percy Liang
OffRL
27
5
0
12 Jul 2020
Sparse Randomized Shortest Paths Routing with Tsallis Divergence Regularization
P. Leleux
Sylvain Courtain
Guillaume Guex
M. Saerens
OT
24
5
0
01 Jul 2020
Gradient Estimation with Stochastic Softmax Tricks
Max B. Paulus
Dami Choi
Daniel Tarlow
Andreas Krause
Chris J. Maddison
BDL
41
85
0
15 Jun 2020
Why Attentions May Not Be Interpretable?
Bing Bai
Jian Liang
Guanhua Zhang
Hao Li
Kun Bai
Fei Wang
FAtt
25
56
0
10 Jun 2020
Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport
Kyle Swanson
L. Yu
Tao Lei
OT
29
37
0
27 May 2020
Towards Transparent and Explainable Attention Models
Akash Kumar Mohankumar
Preksha Nema
Sharan Narasimhan
Mitesh M. Khapra
Balaji Vasan Srinivasan
Balaraman Ravindran
44
99
0
29 Apr 2020
Who2com: Collaborative Perception via Learnable Handshake Communication
Yen-Cheng Liu
Junjiao Tian
Chih-Yao Ma
Nathan Glaser
Chia-Wen Kuo
Z. Kira
23
162
0
21 Mar 2020
RP-DNN: A Tweet level propagation context based deep neural networks for early rumor detection in Social Media
Jie Gao
Sooji Han
Xingyi Song
F. Ciravegna
23
20
0
28 Feb 2020
Sparse Sinkhorn Attention
Yi Tay
Dara Bahri
Liu Yang
Donald Metzler
Da-Cheng Juan
23
331
0
26 Feb 2020
From English To Foreign Languages: Transferring Pre-trained Language Models
Ke M. Tran
30
49
0
18 Feb 2020
Linking Social Media Posts to News with Siamese Transformers
Jacob Danovitch
24
2
0
10 Jan 2020
Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection
Guangxiang Zhao
Junyang Lin
Zhiyuan Zhang
Xuancheng Ren
Qi Su
Xu Sun
22
108
0
25 Dec 2019
Automatic Design of CNNs via Differentiable Neural Architecture Search for PolSAR Image Classification
Hongwei Dong
Siyu Zhang
B. Zou
Lamei Zhang
16
47
0
16 Nov 2019
Multi-attention Networks for Temporal Localization of Video-level Labels
Lijun Zhang
Srinath Nizampatnam
Ahana Gangopadhyay
Marcos V. Conde
30
7
0
15 Nov 2019
Hierarchical Graph Pooling with Structure Learning
Zhen Zhang
Jiajun Bu
Martin Ester
Jianfeng Zhang
Chengwei Yao
Zhi Yu
Can Wang
30
174
0
14 Nov 2019
Differentiable Convex Optimization Layers
Akshay Agrawal
Brandon Amos
Shane T. Barratt
Stephen P. Boyd
Steven Diamond
Zico Kolter
47
640
0
28 Oct 2019
Structured Prediction with Projection Oracles
Mathieu Blondel
21
33
0
24 Oct 2019