2312.15821
Cited By
Audiobox: Unified Audio Generation with Natural Language Prompts
25 December 2023
Apoorv Vyas
Bowen Shi
Matt Le
Andros Tjandra
Yi-Chiao Wu
Baishan Guo
Jiemin Zhang
Xinyue Zhang
Robert Adkins
W.K.F. Ngan
Jeff Wang
Ivan Cruz
Bapi Akula
A. Akinyemi
Brian Ellis
Rashel Moritz
Yael Yungster
Alice Rakotoarison
Liang Tan
Chris Summers
Carleigh Wood
Joshua Lane
Mary Williamson
Wei-Ning Hsu
Papers citing "Audiobox: Unified Audio Generation with Natural Language Prompts"
10 / 60 papers shown
Title
Proactive Detection of Voice Cloning with Localized Watermarking
Robin San Roman
Pierre Fernandez
Alexandre Défossez
Teddy Furon
Tuan Tran
Hady ElSahar
53
41
0
30 Jan 2024
Powerset multi-class cross entropy loss for neural speaker diarization
Alexis Plaquet
H. Bredin
109
91
0
19 Oct 2023
Exploring In-Context Learning of Textless Speech Language Model for Speech Classification Tasks
Ming-Hao Hsu
Kai-Wei Chang
Shang-Wen Li
Hung-yi Lee
34
7
0
19 Oct 2023
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT
Zhihao Du
Jiaming Wang
Qian Chen
Yunfei Chu
Zhifu Gao
...
Wen Wang
Siqi Zheng
Chang Zhou
Zhijie Yan
Shiliang Zhang
LLMAG
VLM
AuLLM
LM&MA
39
80
0
07 Oct 2023
Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model
Deepanway Ghosal
Navonil Majumder
Ambuj Mehrish
Soujanya Poria
152
144
0
24 Apr 2023
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models
Rongjie Huang
Jia-Bin Huang
Dongchao Yang
Yi Ren
Luping Liu
Mingze Li
Zhenhui Ye
Jinglin Liu
Xiaoyue Yin
Zhou Zhao
DiffM
145
317
0
30 Jan 2023
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Chengyi Wang
Sanyuan Chen
Yu-Huan Wu
Zi-Hua Zhang
Long Zhou
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
48
644
0
05 Jan 2023
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
124
264
0
02 Feb 2022
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
253
695
0
27 Aug 2021
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
333
75,834
0
18 May 2015