ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2112.08995
  4. Cited By
Connecting the Dots between Audio and Text without Parallel Data through
  Visual Knowledge Transfer

Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer

16 December 2021
Yanpeng Zhao
Jack Hessel
Youngjae Yu
Ximing Lu
Rowan Zellers
Yejin Choi
ArXivPDFHTML

Papers citing "Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer"

13 / 13 papers shown
Title
Audio-Language Datasets of Scenes and Events: A Survey
Audio-Language Datasets of Scenes and Events: A Survey
Gijs Wijngaard
Elia Formisano
Michele Esposito
M. Dumontier
81
2
0
10 Jan 2025
Gramian Multimodal Representation Learning and Alignment
Gramian Multimodal Representation Learning and Alignment
Giordano Cicchetti
Eleonora Grassucci
Luigi Sigillo
Danilo Comminiello
102
1
0
16 Dec 2024
Acoustic Identification of Ae. aegypti Mosquitoes using Smartphone Apps
  and Residual Convolutional Neural Networks
Acoustic Identification of Ae. aegypti Mosquitoes using Smartphone Apps and Residual Convolutional Neural Networks
K. Paim
Ricardo Rohweder
M. R. Mendoza
R. Mansilha
Weverton Cordeiro
27
2
0
16 Jun 2023
Harvesting Event Schemas from Large Language Models
Harvesting Event Schemas from Large Language Models
Jialong Tang
Hongyu Lin
Zhuoqun Li
Yaojie Lu
Xianpei Han
Le Sun
26
4
0
12 May 2023
Prefix tuning for automated audio captioning
Prefix tuning for automated audio captioning
Minkyu Kim
Kim Sung-Bin
Tae-Hyun Oh
21
43
0
30 Mar 2023
BLAT: Bootstrapping Language-Audio Pre-training based on AudioSet
  Tag-guided Synthetic Data
BLAT: Bootstrapping Language-Audio Pre-training based on AudioSet Tag-guided Synthetic Data
Xuenan Xu
Zhiling Zhang
Zelin Zhou
Pingyue Zhang
Zeyu Xie
Mengyue Wu
Ke Zhu
CLIP
73
14
0
14 Mar 2023
CAT: Causal Audio Transformer for Audio Classification
CAT: Causal Audio Transformer for Audio Classification
Xiaoyu Liu
Hanlin Lu
Jianbo Yuan
Xinyu Li
ViT
28
22
0
14 Mar 2023
Contrastive Audio-Language Learning for Music
Contrastive Audio-Language Learning for Music
Ilaria Manco
Emmanouil Benetos
Elio Quinton
Gyorgy Fazekas
27
44
0
25 Aug 2022
ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound
ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound
Yan-Bo Lin
Jie Lei
Joey Tianyi Zhou
Gedas Bertasius
54
39
0
06 Apr 2022
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
Andy Zeng
Maria Attarian
Brian Ichter
K. Choromanski
Adrian S. Wong
...
Michael S. Ryoo
Vikas Sindhwani
Johnny Lee
Vincent Vanhoucke
Peter R. Florence
ReLM
LRM
49
574
0
01 Apr 2022
Multimodal Self-Supervised Learning of General Audio Representations
Multimodal Self-Supervised Learning of General Audio Representations
Luyu Wang
Pauline Luc
Adrià Recasens
Jean-Baptiste Alayrac
Aaron van den Oord
SSL
78
41
0
26 Apr 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw
  Video, Audio and Text
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Huayu Chen
Boqing Gong
ViT
251
577
0
22 Apr 2021
Scaling Laws for Neural Language Models
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
264
4,532
0
23 Jan 2020
1