2408.04535
Synchronous Multi-modal Semantic Communication System with Packet-level Coding
8 August 2024
Yun Tian
Jingkai Ying
Zhijin Qin
Ye Jin
Xiaoming Tao
Papers citing
"Synchronous Multi-modal Semantic Communication System with Packet-level Coding"
23 / 23 papers shown
Title
EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing
Gaoxiang Cong
Jiadong Pan
Liang-Sheng Li
Yuankai Qi
Yuxin Peng
Anton Van Den Hengel
Jian Yang
Qingming Huang
139
6
0
12 Dec 2024
Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition
Jiaxiang Tang
Kaisiyuan Wang
Hang Zhou
Xiaokang Chen
Dongliang He
Tianshu Hu
Jingtuo Liu
Gang Zeng
Jingdong Wang
3DH
48
78
0
22 Nov 2022
Towards Semantic Communications: Deep Learning-Based Image Semantic Coding
Danlan Huang
Fei Gao
Xiaoming Tao
Qiyuan Du
Jianhua Lu
57
164
0
08 Aug 2022
Robust Semantic Communications with Masked VQ-VAE Enabled Codebook
Qiyu Hu
Guangyi Zhang
Zhijin Qin
Yunlong Cai
Guanding Yu
Geoffrey Ye Li
AAML
37
147
0
08 Jun 2022
Wireless Deep Video Semantic Transmission
Sixian Wang
Jincheng Dai
Zijian Liang
K. Niu
Zhongwei Si
Chao Dong
Xiaoqi Qin
Ping Zhang
3DV
DiffM
71
147
0
26 May 2022
UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022
Takaaki Saeki
Detai Xin
Wataru Nakata
Tomoki Koriyama
Shinnosuke Takamichi
Hiroshi Saruwatari
75
207
0
05 Apr 2022
Semantic Communications: Principles and Challenges
Zhijin Qin
Xiaoming Tao
Jianhua Lu
Wen Tong
Geoffrey Ye Li
74
346
0
30 Dec 2021
DeepWiVe: Deep-Learning-Aided Wireless Video Transmission
Tze-Yang Tung
Deniz Gündüz
70
122
0
25 Nov 2021
V2C: Visual Voice Cloning
Qi Chen
Yuanqing Li
Yuankai Qi
Jiaqiu Zhou
Mingkui Tan
Qi Wu
VGen
52
27
0
25 Nov 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
439
7,731
0
11 Nov 2021
PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering
Yurui Ren
Gezhong Li
Yuanqi Chen
Thomas H. Li
Shan Liu
DiffM
VGen
98
227
0
17 Sep 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
593
40,961
0
22 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
168
1,931
0
12 Oct 2020
A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild
Prajwal K R
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
EGVM
96
777
0
23 Aug 2020
Wireless Image Retrieval at the Edge
Mikolaj Jankowski
Deniz Gunduz
K. Mikolajczyk
124
208
0
21 Jul 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
105
1,396
0
08 Jun 2020
Image Quality Assessment: Unifying Structure and Texture Similarity
Keyan Ding
Kede Ma
Shiqi Wang
Eero P. Simoncelli
96
780
0
16 Apr 2020
DeepJSCC-f: Deep Joint Source-Channel Coding of Images with Feedback
David Burth Kurka
Deniz Gündüz
73
305
0
25 Nov 2019
Deep Learning for Joint Source-Channel Coding of Text
Nariman Farsad
Milind Rao
Andrea J. Goldsmith
53
351
0
19 Feb 2018
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
Richard Y. Zhang
Phillip Isola
Alexei A. Efros
Eli Shechtman
Oliver Wang
EGVM
347
11,784
0
11 Jan 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
Jonathan Shen
Ruoming Pang
Ron J. Weiss
M. Schuster
Navdeep Jaitly
...
Yuxuan Wang
RJ Skerry-Ryan
Rif A. Saurous
Yannis Agiomyrgiannakis
Yonghui Wu
77
2,697
0
16 Dec 2017
VoxCeleb: a large-scale speaker identification dataset
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
122
2,273
0
26 Jun 2017
Perceptual Losses for Real-Time Style Transfer and Super-Resolution
Justin Johnson
Alexandre Alahi
Li Fei-Fei
SupR
228
10,246
0
27 Mar 2016