ResearchTrend.AI
Automated Audio Captioning using Transfer Learning and Reconstruction Latent Space Similarity Regularization

10 August 2021
Andrew Koh
Fuzhao Xue
Chng Eng Siong
arXiv: 2108.04692

Papers citing "Automated Audio Captioning using Transfer Learning and Reconstruction Latent Space Similarity Regularization"

16 / 16 papers shown
Title
Audio-Language Datasets of Scenes and Events: A Survey
Gijs Wijngaard
Elia Formisano
Michele Esposito
M. Dumontier
81
2
0
10 Jan 2025
AVCap: Leveraging Audio-Visual Features as Text Tokens for Captioning
Jongsuk Kim
Jiwon Shin
Junmo Kim
41
2
0
10 Jul 2024
Zero-Shot Audio Captioning Using Soft and Hard Prompts
Yiming Zhang
Xuenan Xu
Ruoyi Du
Haohe Liu
Yuan Dong
Zheng-Hua Tan
Wenwu Wang
Zhanyu Ma
VLM
35
4
0
10 Jun 2024
SECap: Speech Emotion Captioning with Large Language Model
Yaoxun Xu
Hangting Chen
Jianwei Yu
Qiaochu Huang
Zhiyong Wu
Shixiong Zhang
Guangzhi Li
Yi Luo
Rongzhi Gu
33
22
0
16 Dec 2023
Weakly-supervised Automated Audio Captioning via text only training
Theodoros Kouzelis
Vassilis Katsouros
CLIP
38
6
0
21 Sep 2023
RECAP: Retrieval-Augmented Audio Captioning
Sreyan Ghosh
Sonal Kumar
Chandra Kiran Reddy Evuru
R. Duraiswami
Tianyi Zhou
VLM
70
17
0
18 Sep 2023
UnIVAL: Unified Model for Image, Video, Audio and Language Tasks
Mustafa Shukor
Corentin Dancette
Alexandre Ramé
Matthieu Cord
MoMe
MLLM
61
42
0
30 Jul 2023
VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
Sihan Chen
Xingjian He
Longteng Guo
Xinxin Zhu
Weining Wang
Jinhui Tang
Jing Liu
VLM
34
103
0
17 Apr 2023
Prefix tuning for automated audio captioning
Minkyu Kim
Kim Sung-Bin
Tae-Hyun Oh
21
42
0
30 Mar 2023
Text-to-Audio Grounding Based Novel Metric for Evaluating Audio Caption Similarity
Swapnil Bhosale
Rupayan Chakraborty
Sunil Kumar Kopparapu
27
1
0
03 Oct 2022
Language-Based Audio Retrieval with Converging Tied Layers and Contrastive Loss
Andrew Koh
Chng Eng Siong
29
1
0
29 Jun 2022
Automated Audio Captioning with Epochal Difficult Captions for Curriculum Learning
Andrew Koh
Soham Dinesh Tiwari
Chng Eng Siong
17
1
0
04 Jun 2022
Automated Audio Captioning: An Overview of Recent Progress and New Challenges
Xinhao Mei
Xubo Liu
Mark D. Plumbley
Wenwu Wang
29
37
0
12 May 2022
Beyond the Status Quo: A Contemporary Survey of Advances and Challenges in Audio Captioning
Xuenan Xu
Zeyu Xie
Mengyue Wu
K. Yu
36
13
0
11 May 2022
Interactive Audio-text Representation for Automated Audio Captioning with Contrastive Learning
Chen Chen
Nana Hou
Yuchen Hu
Heqing Zou
Xiaofeng Qi
Chng Eng Siong
VLM
26
21
0
29 Mar 2022
Leveraging Pre-trained BERT for Audio Captioning
Xubo Liu
Xinhao Mei
Qiushi Huang
Jianyuan Sun
Jinzheng Zhao
Haohe Liu
Mark D. Plumbley
Volkan Kılıç
Wenwu Wang
33
29
0
06 Mar 2022