Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1612.08083
Cited By
Language Modeling with Gated Convolutional Networks
23 December 2016
Yann N. Dauphin
Angela Fan
Michael Auli
David Grangier
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Language Modeling with Gated Convolutional Networks"
50 / 294 papers shown
Title
Qwen3 Technical Report
A. Yang
A. Li
Baosong Yang
Beichen Zhang
Binyuan Hui
...
Zekun Wang
Zeyu Cui
Z. Zhang
Zhenhong Zhou
Zihan Qiu
LLMAG
OSLM
LRM
42
0
0
14 May 2025
CoGenAV: Versatile Audio-Visual Representation Learning via Contrastive-Generative Synchronization
Detao Bai
Zhiheng Ma
Xihan Wei
Liefeng Bo
120
0
0
06 May 2025
Faster MoE LLM Inference for Extremely Large Models
Haoqi Yang
Luohe Shi
Qiwei Li
Zuchao Li
Ping Wang
Bo Du
Mengjia Shen
Hai Zhao
MoE
63
0
0
06 May 2025
Bielik v3 Small: Technical Report
Krzysztof Ociepa
Łukasz Flis
Remigiusz Kinas
Krzysztof Wróbel
Adrian Gwoździej
27
0
0
05 May 2025
Bielik 11B v2 Technical Report
Krzysztof Ociepa
Łukasz Flis
Krzysztof Wróbel
Adrian Gwoździej
Remigiusz Kinas
34
0
0
05 May 2025
TriniMark: A Robust Generative Speech Watermarking Method for Trinity-Level Attribution
Yue Li
Wei Liu
Dongdong Lin
44
0
0
29 Apr 2025
Revisiting Reset Mechanisms in Spiking Neural Networks for Sequential Modeling: Specialized Discretization for Binary Activated RNN
Enqi Zhang
MQ
149
0
0
24 Apr 2025
Hadamard product in deep learning: Introduction, Advances and Challenges
Grigorios G. Chrysos
Yongtao Wu
Razvan Pascanu
Philip Torr
V. Cevher
AAML
98
0
0
17 Apr 2025
IgCraft: A versatile sequence generation framework for antibody discovery and engineering
Matthew Greenig
Haowen Zhao
Vladimir Radenkovic
Aubin Ramon
Pietro Sormanni
44
0
0
25 Mar 2025
SCSegamba: Lightweight Structure-Aware Vision Mamba for Crack Segmentation in Structures
Hui Liu
Chen Jia
Fan Shi
Xu Cheng
Shengyong Chen
Mamba
47
0
0
03 Mar 2025
Similarity-Distance-Magnitude Universal Verification
Allen Schmaltz
UQCV
AAML
149
0
0
27 Feb 2025
Encryption-Friendly LLM Architecture
Donghwan Rho
Taeseong Kim
Minje Park
Jung Woo Kim
Hyunsik Chae
Jung Hee Cheon
Ernest K. Ryu
57
2
0
24 Feb 2025
ChordFormer: A Conformer-Based Architecture for Large-Vocabulary Audio Chord Recognition
Muhammad Waseem Akram
Stefano Dettori
V. Colla
Giorgio Buttazzo
54
0
0
17 Feb 2025
A Study of the Plausibility of Attention between RNN Encoders in Natural Language Inference
Duc Hau Nguyen
Duc Hau Nguyen
Pascale Sébillot
52
5
0
23 Jan 2025
Modeling Time-Variant Responses of Optical Compressors with Selective State Space Models
Riccardo Simionato
Stefano Fasciani
78
1
0
17 Jan 2025
CURing Large Models: Compression via CUR Decomposition
Sanghyeon Park
Soo-Mook Moon
41
0
0
08 Jan 2025
CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation
Ji-Hoon Kim
Hong-Sun Yang
Yoon-Cheol Ju
Il-Hwan Kim
Byeong-Yeol Kim
Joon Son Chung
BDL
54
0
0
31 Dec 2024
Efficient LLM Inference using Dynamic Input Pruning and Cache-Aware Masking
Marco Federici
Davide Belli
M. V. Baalen
Amir Jalalirad
Andrii Skliar
Bence Major
Markus Nagel
Paul N. Whatmough
76
0
0
02 Dec 2024
MambaTrack: Exploiting Dual-Enhancement for Night UAV Tracking
Chunhui Zhang
Li Liu
Hao-Kai Wen
Xi Zhou
Y. Wang
Mamba
105
2
0
24 Nov 2024
HMIL: Hierarchical Multi-Instance Learning for Fine-Grained Whole Slide Image Classification
C. Jin
Luyang Luo
Huangjing Lin
Jun Hou
Hao Chen
47
3
0
12 Nov 2024
Sparsing Law: Towards Large Language Models with Greater Activation Sparsity
Yuqi Luo
Chenyang Song
Xu Han
Y. Chen
Chaojun Xiao
Zhiyuan Liu
Maosong Sun
49
3
0
04 Nov 2024
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers
Enze Xie
Junsong Chen
Junyu Chen
Han Cai
Haotian Tang
...
Zhekai Zhang
Muyang Li
Ligeng Zhu
Yaojie Lu
Song Han
VLM
46
49
0
14 Oct 2024
Fusion Matrix Prompt Enhanced Self-Attention Spatial-Temporal Interactive Traffic Forecasting Framework
Mu Liu
MingChen Sun YingJi Li
Ying Wang
AI4TS
39
0
0
12 Oct 2024
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
Zhihao He
Hang Yu
Zi Gong
Shizhan Liu
J. Li
Weiyao Lin
VLM
38
1
0
09 Oct 2024
Oscillatory State-Space Models
T. Konstantin Rusch
Daniela Rus
AI4TS
141
5
0
04 Oct 2024
GateAttentionPose: Enhancing Pose Estimation with Agent Attention and Improved Gated Convolutions
Liang Feng
Zhixuan Shen
Lihua Wen
Shiyao Li
Ming Xu
CVBM
33
0
0
12 Sep 2024
Can Transformers Do Enumerative Geometry?
Baran Hashemi
Roderic G. Corominas
Alessandro Giacchetto
44
2
0
27 Aug 2024
CROME: Cross-Modal Adapters for Efficient Multimodal LLM
Sayna Ebrahimi
Sercan Ö. Arik
Tejas Nama
Tomas Pfister
44
1
0
13 Aug 2024
GROOT: Generating Robust Watermark for Diffusion-Model-Based Audio Synthesis
Weizhi Liu
Yue Li
Dongdong Lin
Hui Tian
Haizhou Li
WIGM
36
9
0
15 Jul 2024
RouteFinder: Towards Foundation Models for Vehicle Routing Problems
Federico Berto
Chuanbo Hua
Nayeli Gast Zepeda
André Hottung
N. Wouda
Leon Lan
Kevin Tierney
J. Park
Jinkyoo Park
56
10
0
21 Jun 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren
Yang Liu
Yadong Lu
Yelong Shen
Chen Liang
Weizhu Chen
Mamba
74
56
0
11 Jun 2024
Hidden Holes: topological aspects of language models
Stephen Fitz
P. Romero
Jiyan Jonas Schneider
35
0
0
09 Jun 2024
HAFFormer: A Hierarchical Attention-Free Framework for Alzheimer's Disease Detection From Spontaneous Speech
Zhongren Dong
Zixing Zhang
Weixiang Xu
Jing Han
Jianjun Ou
Björn W. Schuller
40
1
0
07 May 2024
Bridging Expert Knowledge with Deep Learning Techniques for Just-In-Time Defect Prediction
Xin Zhou
Donggyun Han
David Lo
VLM
29
2
0
17 Mar 2024
Multi-Level Attention Aggregation for Language-Agnostic Speaker Replication
Yejin Jeon
Gary Geunbae Lee
26
2
0
06 Mar 2024
Multimodal Clinical Trial Outcome Prediction with Large Language Models
Wenhao Zheng
Dongsheng Peng
Hongxia Xu
Yun-Qing Li
Hongtu Zhu
Tianfan Fu
Huaxiu Yao
Huaxiu Yao
50
5
0
09 Feb 2024
Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury
Cornelia Caragea
39
1
0
01 Feb 2024
A Primer on Temporal Graph Learning
Aniq Ur Rahman
J. Coon
AI4CE
42
1
0
08 Jan 2024
MossFormer2: Combining Transformer and RNN-Free Recurrent Network for Enhanced Time-Domain Monaural Speech Separation
Shengkui Zhao
Yukun Ma
Chongjia Ni
Chong Zhang
Hao Wang
Trung Hieu Nguyen
Kun Zhou
J. Yip
Dianwen Ng
Bin Ma
13
21
0
19 Dec 2023
Attention-Challenging Multiple Instance Learning for Whole Slide Image Classification
Yunlong Zhang
Honglin Li
Yuxuan Sun
Sunyi Zheng
Chenglu Zhu
Lin Yang
25
27
0
13 Nov 2023
Baichuan 2: Open Large-scale Language Models
Ai Ming Yang
Bin Xiao
Bingning Wang
Borong Zhang
Ce Bian
...
Youxin Jiang
Yuchen Gao
Yupeng Zhang
Zenan Zhou
Zhiying Wu
ELM
LRM
66
703
0
19 Sep 2023
Auto-Regressive Next-Token Predictors are Universal Learners
Eran Malach
LRM
24
36
0
13 Sep 2023
Evaluating ChatGPT as a Recommender System: A Rigorous Approach
Dario Di Palma
Giovanni Maria Biancofiore
Vito Walter Anelli
Fedelucio Narducci
Tommaso Di Noia
E. Sciascio
ALM
46
27
0
07 Sep 2023
Improving Social Media Popularity Prediction with Multiple Post Dependencies
Zhizhen Zhang
Xiao-Zhu Xie
Meng Yang
Ye Tian
Yong-jia Jiang
Yong Cui
21
5
0
28 Jul 2023
On the unreasonable vulnerability of transformers for image restoration -- and an easy fix
Shashank Agnihotri
Kanchana Vaishnavi Gandikota
Julia Grabinski
Paramanand Chandramouli
M. Keuper
32
9
0
25 Jul 2023
Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems
Mingyu Cui
Jiawen Kang
Jiajun Deng
Xiaoyue Yin
Yutao Xie
Xie Chen
Xunying Liu
29
8
0
23 Jun 2023
Bringing regularized optimal transport to lightspeed: a splitting method adapted for GPUs
Jacob Lindbäck
Zesen Wang
Mikael Johansson
OT
40
1
0
29 May 2023
Neural Machine Translation for Mathematical Formulae
Felix Petersen
M. Schubotz
André Greiner-Petter
Bela Gipp
23
7
0
25 May 2023
A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks
Yifan Peng
Kwangyoun Kim
Felix Wu
Brian Yan
Siddhant Arora
William Chen
Jiyang Tang
Suwon Shon
Prashant Sridhar
Shinji Watanabe
24
17
0
18 May 2023
EENED: End-to-End Neural Epilepsy Detection based on Convolutional Transformer
Chenyu Liu
Xin-qiu Zhou
Yang Liu
ViT
MedIm
20
1
0
17 May 2023
1
2
3
4
5
6
Next