Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation

15 August 2013
Yoshua Bengio
Nicholas Léonard
Aaron Courville

Papers citing "Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation"

50 / 1,513 papers shown
Title
Accurate Mapping of RNNs on Neuromorphic Hardware with Adaptive Spiking Neurons
Gauthier Boeshertz
Giacomo Indiveri
M. Nair
Alpha Renner
48
2
0
18 Jul 2024
Tiled Bit Networks: Sub-Bit Neural Network Compression Through Reuse of Learnable Binary Vectors
Matt Gorbett
Hossein Shirazi
Indrakshi Ray
MQ
114
0
0
16 Jul 2024
Exploring Quantization for Efficient Pre-Training of Transformer Language Models
Kamran Chitsaz
Quentin Fournier
Gonçalo Mordido
Sarath Chandar
MQ
95
4
0
16 Jul 2024
NITRO-D: Native Integer-only Training of Deep Convolutional Neural Networks
Alberto Pirillo
Luca Colombo
Manuel Roveri
MQ
112
0
0
16 Jul 2024
Scaling Diffusion Transformers to 16 Billion Parameters
Zhengcong Fei
Mingyuan Fan
Changqian Yu
Debang Li
Junshi Huang
DiffM, MoE
115
21
0
16 Jul 2024
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
Hongyu Wang
Shuming Ma
Ruiping Wang
Furu Wei
MoE
88
13
0
15 Jul 2024
Low-Rank Interconnected Adaptation across Layers
Yibo Zhong
Jinman Zhao
Yao Zhou
OffRL, MoE
114
1
0
13 Jul 2024
Trainable Highly-expressive Activation Functions
Irit Chelly
Shahaf E. Finder
Shira Ifergane
Oren Freifeld
100
6
0
10 Jul 2024
Dynamic Encoder Size Based on Data-Driven Layer-wise Pruning for Speech Recognition
Jingjing Xu
Wei Zhou
Zijian Yang
Eugen Beck
Ralf Schlueter
93
3
0
10 Jul 2024
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
Mengzhao Chen
Wenqi Shao
Peng Xu
Jiahao Wang
Peng Gao
Kaipeng Zhang
Ping Luo
MQ
160
35
0
10 Jul 2024
FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation
Liqun Ma
Mingjie Sun
Zhiqiang Shen
90
9
0
09 Jul 2024
Mixture-of-Modules: Reinventing Transformers as Dynamic Assemblies of Modules
Zhuocheng Gong
Ang Lv
Jian Guan
Junxi Yan
Wei Wu
Huishuai Zhang
Minlie Huang
Dongyan Zhao
Rui Yan
MoE
86
7
0
09 Jul 2024
OvSW: Overcoming Silent Weights for Accurate Binary Neural Networks
Jingyang Xiang
Zuohui Chen
Siqi Li
Qing Wu
Yong-Jin Liu
90
1
0
07 Jul 2024
Scalable Variational Causal Discovery Unconstrained by Acyclicity
Nu Hoang
Bao Duong
Thin Nguyen
CML
101
0
0
06 Jul 2024
Balance of Number of Embedding and their Dimensions in Vector Quantization
Hang Chen
Sankepally Sainath Reddy
Ziwei Chen
Dianbo Liu
93
2
0
06 Jul 2024
Resource-Efficient Speech Quality Prediction through Quantization Aware Training and Binary Activation Maps
Mattias Nilsson
Riccardo Miccini
Clément Laroche
Tobias Piechowiak
Friedemann Zenke
MQ
66
0
0
05 Jul 2024
ISQuant: apply squant to the real deployment
Dezan Zhao
MQ
74
0
0
05 Jul 2024
Learning Interpretable Differentiable Logic Networks
Chang Yue
N. Jha
NAI, AI4CE
72
1
0
04 Jul 2024
Query-Guided Self-Supervised Summarization of Nursing Notes
Ya Gao
H. Moen
S. Koivusalo
M. Koskinen
Pekka Marttinen
74
1
0
04 Jul 2024
Functional Faithfulness in the Wild: Circuit Discovery with Differentiable Computation Graph Pruning
Lei Yu
Jingcheng Niu
Zining Zhu
Gerald Penn
81
7
0
04 Jul 2024
Fisher-aware Quantization for DETR Detectors with Critical-category Objectives
Huanrui Yang
Yafeng Huang
Zhen Dong
Denis A. Gudovskiy
Tomoyuki Okuno
Yohei Nakata
Yuan Du
Kurt Keutzer
Shanghang Zhang
MQ
106
0
0
03 Jul 2024
Towards Federated Learning with On-device Training and Communication in 8-bit Floating Point
Bokun Wang
Axel Berg
D. A. E. Acar
Chuteng Zhou
FedML, MQ
132
0
0
02 Jul 2024
Quantum Circuit Synthesis and Compilation Optimization: Overview and Prospects
Yan Ge
Wu Wenjie
Chen Yuheng
Pan Kaisen
Lu Xudong
Zhou Zixiang
Wang Yuhan
Wang Ruocheng
Yan Junchi
76
17
0
30 Jun 2024
Kolmogorov-Smirnov GAN
Maciej Falkiewicz
Naoya Takeishi
Alexandros Kalousis
GAN
57
0
0
28 Jun 2024
Directly Training Temporal Spiking Neural Network with Sparse Surrogate Gradient
Yang Li
Feifei Zhao
Dongcheng Zhao
Yi Zeng
77
3
0
28 Jun 2024
Efficient World Models with Context-Aware Tokenization
Vincent Micheli
Eloi Alonso
François Fleuret
OffRL, VLM
80
6
0
27 Jun 2024
OutlierTune: Efficient Channel-Wise Quantization for Large Language Models
Jinguang Wang
Yuexi Yin
Haifeng Sun
Qi Qi
Jingyu Wang
Zirui Zhuang
Tingting Yang
Jianxin Liao
76
2
0
27 Jun 2024
Neural Texture Block Compression
S. Fujieda
Takahiro Harada
58
2
0
27 Jun 2024
ViT-1.58b: Mobile Vision Transformers in the 1-bit Era
Zhengqing Yuan
Rong Zhou
Hongyi Wang
Lifang He
Yanfang Ye
Lichao Sun
MQ
62
8
0
26 Jun 2024
Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers
Lei Chen
Yuan Meng
Chen Tang
Xinzhu Ma
Jingyan Jiang
Xin Wang
Zhi Wang
Wenwu Zhu
MQ
127
31
0
25 Jun 2024
Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers
Chao Lou
Zixia Jia
Zilong Zheng
Kewei Tu
ODL
85
26
0
24 Jun 2024
SimSMoE: Solving Representational Collapse via Similarity Measure
Giang Do
Hung Le
T. Tran
MoE
103
1
0
22 Jun 2024
BrowNNe: Brownian Nonlocal Neurons & Activation Functions
Sriram Nagaraj
Truman Hickok
89
0
0
21 Jun 2024
Older and Wiser: The Marriage of Device Aging and Intellectual Property Protection of Deep Neural Networks
Ning Lin
Shaocong Wang
Yue Zhang
Yangu He
Kwunhang Wong
Arindam Basu
Dashan Shang
Xiaoming Chen
Zhongrui Wang
AAML
41
1
0
21 Jun 2024
MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset
Kim Sung-Bin
Lee Chae-Yeon
Gihun Son
Oh Hyun-Bin
Janghoon Ju
Suekyeong Nam
Tae-Hyun Oh
92
12
0
20 Jun 2024
HIGHT: Hierarchical Graph Tokenization for Molecule-Language Alignment
Yongqiang Chen
Quanming Yao
Juzheng Zhang
James Cheng
Yatao Bian
131
3
0
20 Jun 2024
Learned Compression of Encoding Distributions
Mateen Ulhaq
Ivan V. Bajić
61
1
0
18 Jun 2024
Bayesian-LoRA: LoRA based Parameter Efficient Fine-Tuning using Optimal Quantization levels and Rank Values trough Differentiable Bayesian Gates
Cristian Meo
Ksenia Sycheva
Anirudh Goyal
Justin Dauwels
MQ
75
5
0
18 Jun 2024
TokenRec: Learning to Tokenize ID for LLM-based Generative Recommendation
Haohao Qu
Wenqi Fan
Zihuai Zhao
Qing Li
91
20
0
15 Jun 2024
Towards Adaptive Neighborhood for Advancing Temporal Interaction Graph Modeling
Siwei Zhang
Xi Chen
Yun Xiong
Xixi Wu
Yao Zhang
Yongrui Fu
Yinglong Zhao
Jiawei Zhang
137
8
0
14 Jun 2024
ToneUnit: A Speech Discretization Approach for Tonal Language Speech Synthesis
Dehua Tao
Daxin Tan
Y. Yeung
Xiao Chen
Tan Lee
84
3
0
13 Jun 2024
Neural NeRF Compression
Tuan Pham
Stephan Mandt
73
3
0
13 Jun 2024
To be Continuous, or to be Discrete, Those are Bits of Questions
Yiran Wang
Masao Utiyama
84
4
0
12 Jun 2024
Image and Video Tokenization with Binary Spherical Quantization
Yue Zhao
Yuanjun Xiong
Philipp Krahenbuhl
94
24
0
11 Jun 2024
Visual Representation Learning with Stochastic Frame Prediction
Huiwon Jang
Dongyoung Kim
Junsu Kim
Jinwoo Shin
Pieter Abbeel
Younggyo Seo
99
3
0
11 Jun 2024
TernaryLLM: Ternarized Large Language Model
Tianqi Chen
Zhe Li
Weixiang Xu
Zeyu Zhu
Dong Li
Lu Tian
E. Barsoum
Peisong Wang
Jian Cheng
77
7
0
11 Jun 2024
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
Peize Sun
Yi Jiang
Shoufa Chen
Shilong Zhang
Bingyue Peng
Ping Luo
Zehuan Yuan
VLM
136
301
0
10 Jun 2024
Low-Rank Quantization-Aware Training for LLMs
Yelysei Bondarenko
Riccardo Del Chiaro
Markus Nagel
MQ
77
14
0
10 Jun 2024
Factor Graph Optimization of Error-Correcting Codes for Belief Propagation Decoding
Yoni Choukroun
Lior Wolf
70
1
0
09 Jun 2024
Binarized Diffusion Model for Image Super-Resolution
Zheng Chen
Haotong Qin
Yong Guo
Xiongfei Su
Xin Yuan
Linghe Kong
Yulun Zhang
DiffM
88
9
0
09 Jun 2024