2210.09173
Visual onoma-to-wave: environmental sound synthesis from visual onomatopoeias and sound-source images
17 October 2022
Hien Ohnaka
Shinnosuke Takamichi
Keisuke Imoto
Yuki Okamoto
Kazuki Fujii
Hiroshi Saruwatari
DiffM
Papers citing
"Visual onoma-to-wave: environmental sound synthesis from visual onomatopoeias and sound-source images"
9 / 9 papers shown
AudioGen: Textually Guided Audio Generation
Felix Kreuk
Gabriel Synnaeve
Adam Polyak
Uriel Singer
Alexandre Défossez
Jade Copet
Devi Parikh
Yaniv Taigman
Yossi Adi
DiffM
61
308
0
30 Sep 2022
Diffsound: Discrete Diffusion Model for Text-to-sound Generation
Dongchao Yang
Jianwei Yu
Helin Wang
Wen Wang
Chao Weng
Yuexian Zou
Dong Yu
DiffM
79
304
0
20 Jul 2022
COO: Comic Onomatopoeia Dataset for Recognizing Arbitrary or Truncated Texts
Jeonghun Baek
Yusuke Matsui
Kiyoharu Aizawa
63
13
0
11 Jul 2022
vTTS: visual-text to speech
Yoshifumi Nakano
Takaaki Saeki
Shinnosuke Takamichi
Katsuhito Sudoh
Hiroshi Saruwatari
48
4
0
28 Mar 2022
Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech
C. Chien
Jheng-hao Lin
Chien-yu Huang
Po-Chun Hsu
Hung-yi Lee
70
70
0
06 Mar 2021
Onoma-to-wave: Environmental sound synthesis from onomatopoeic words
Yuki Okamoto
Keisuke Imoto
Shinnosuke Takamichi
Ryosuke Yamanishi
Takahiro Fukumori
Y. Yamashita
23
14
0
11 Feb 2021
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
177
1,931
0
12 Oct 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
105
1,396
0
08 Jun 2020
Tacotron: Towards End-to-End Speech Synthesis
Yuxuan Wang
RJ Skerry-Ryan
Daisy Stanton
Yonghui Wu
Ron J. Weiss
...
Samy Bengio
Quoc V. Le
Yannis Agiomyrgiannakis
R. Clark
Rif A. Saurous
155
1,823
0
29 Mar 2017