Synchronous Multi-modal Semantic Communication System with Packet-level Coding

8 August 2024

Papers citing "Synchronous Multi-modal Semantic Communication System with Packet-level Coding"


Title
EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing | Gaoxiang Cong, Jiadong Pan, Liang-Sheng Li, Yuankai Qi, Yuxin Peng, Anton Van Den Hengel, Jian Yang, Qingming Huang | | 139 6 0 | 12 Dec 2024
Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition | Jiaxiang Tang, Kaisiyuan Wang, Hang Zhou, Xiaokang Chen, Dongliang He, Tianshu Hu, Jingtuo Liu, Gang Zeng, Jingdong Wang | 3DH | 48 78 0 | 22 Nov 2022
Towards Semantic Communications: Deep Learning-Based Image Semantic Coding | Danlan Huang, Fei Gao, Xiaoming Tao, Qiyuan Du, Jianhua Lu | | 57 164 0 | 08 Aug 2022
Robust Semantic Communications with Masked VQ-VAE Enabled Codebook | Qiyu Hu, Guangyi Zhang, Zhijin Qin, Yunlong Cai, Guanding Yu, Geoffrey Ye Li | AAML | 37 147 0 | 08 Jun 2022
Wireless Deep Video Semantic Transmission | Sixian Wang, Jincheng Dai, Zijian Liang, K. Niu, Zhongwei Si, Chao Dong, Xiaoqi Qin, Ping Zhang | 3DV DiffM | 71 147 0 | 26 May 2022
UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022 | Takaaki Saeki, Detai Xin, Wataru Nakata, Tomoki Koriyama, Shinnosuke Takamichi, Hiroshi Saruwatari | | 75 207 0 | 05 Apr 2022
Semantic Communications: Principles and Challenges | Zhijin Qin, Xiaoming Tao, Jianhua Lu, Wen Tong, Geoffrey Ye Li | | 74 346 0 | 30 Dec 2021
DeepWiVe: Deep-Learning-Aided Wireless Video Transmission | Tze-Yang Tung, Deniz Gündüz | | 70 122 0 | 25 Nov 2021
V2C: Visual Voice Cloning | Qi Chen, Yuanqing Li, Yuankai Qi, Jiaqiu Zhou, Mingkui Tan, Qi Wu | VGen | 52 27 0 | 25 Nov 2021
Masked Autoencoders Are Scalable Vision Learners | Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross B. Girshick | ViT TPM | 439 7,731 0 | 11 Nov 2021
PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering | Yurui Ren, Gezhong Li, Yuanqi Chen, Thomas H. Li, Shan Liu | DiffM VGen | 98 227 0 | 17 Sep 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, ..., Matthias Minderer, G. Heigold, Sylvain Gelly, Jakob Uszkoreit, N. Houlsby | ViT | 593 40,961 0 | 22 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis | Jungil Kong, Jaehyeon Kim, Jaekyoung Bae | | 168 1,931 0 | 12 Oct 2020
A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild | Prajwal K R, Rudrabha Mukhopadhyay, Vinay P. Namboodiri, C. V. Jawahar | EGVM | 96 777 0 | 23 Aug 2020
Wireless Image Retrieval at the Edge | Mikolaj Jankowski, Deniz Gündüz, K. Mikolajczyk | | 124 208 0 | 21 Jul 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech | Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu | | 105 1,396 0 | 08 Jun 2020
Image Quality Assessment: Unifying Structure and Texture Similarity | Keyan Ding, Kede Ma, Shiqi Wang, Eero P. Simoncelli | | 96 780 0 | 16 Apr 2020
DeepJSCC-f: Deep Joint Source-Channel Coding of Images with Feedback | David Burth Kurka, Deniz Gündüz | | 73 305 0 | 25 Nov 2019
Deep Learning for Joint Source-Channel Coding of Text | Nariman Farsad, Milind Rao, Andrea J. Goldsmith | | 53 351 0 | 19 Feb 2018
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric | Richard Y. Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, Oliver Wang | EGVM | 347 11,784 0 | 11 Jan 2018
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions | Jonathan Shen, Ruoming Pang, Ron J. Weiss, M. Schuster, Navdeep Jaitly, ..., Yuxuan Wang, RJ Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu | | 77 2,697 0 | 16 Dec 2017
VoxCeleb: a large-scale speaker identification dataset | Arsha Nagrani, Joon Son Chung, Andrew Zisserman | | 122 2,273 0 | 26 Jun 2017
Perceptual Losses for Real-Time Style Transfer and Super-Resolution | Justin Johnson, Alexandre Alahi, Li Fei-Fei | SupR | 228 10,246 0 | 27 Mar 2016