ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2207.09732
  4. Cited By
Introducing Auxiliary Text Query-modifier to Content-based Audio
  Retrieval

Introducing Auxiliary Text Query-modifier to Content-based Audio Retrieval

20 July 2022
Daiki Takeuchi
Yasunori Ohishi
Daisuke Niizumi
Noboru Harada
K. Kashino
ArXivPDFHTML

Papers citing "Introducing Auxiliary Text Query-modifier to Content-based Audio Retrieval"

18 / 18 papers shown
Title
Audio-Language Datasets of Scenes and Events: A Survey
Audio-Language Datasets of Scenes and Events: A Survey
Gijs Wijngaard
Elia Formisano
Michele Esposito
M. Dumontier
116
2
0
10 Jan 2025
Audio Retrieval with Natural Language Queries: A Benchmark Study
Audio Retrieval with Natural Language Queries: A Benchmark Study
A. Sophia Koepke
Andreea-Maria Oncescu
João F. Henriques
Zeynep Akata
Samuel Albanie
62
100
0
17 Dec 2021
AudioCLIP: Extending CLIP to Image, Text and Audio
AudioCLIP: Extending CLIP to Image, Text and Audio
A. Guzhov
Federico Raue
Jörn Hees
Andreas Dengel
CLIP
VLM
96
365
0
24 Jun 2021
Audio Retrieval with Natural Language Queries
Audio Retrieval with Natural Language Queries
Andreea-Maria Oncescu
A. Sophia Koepke
João F. Henriques
Zeynep Akata
Samuel Albanie
45
79
0
05 May 2021
AST: Audio Spectrogram Transformer
AST: Audio Spectrogram Transformer
Yuan Gong
Yu-An Chung
James R. Glass
ViT
104
862
0
05 Apr 2021
Learning Transferable Visual Models From Natural Language Supervision
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
861
29,341
0
26 Feb 2021
FSD50K: An Open Dataset of Human-Labeled Sound Events
FSD50K: An Open Dataset of Human-Labeled Sound Events
Eduardo Fonseca
Xavier Favory
Jordi Pons
F. Font
Xavier Serra
67
458
0
01 Oct 2020
Effects of Word-frequency based Pre- and Post- Processings for Audio
  Captioning
Effects of Word-frequency based Pre- and Post- Processings for Audio Captioning
Daiki Takeuchi
Yuma Koizumi
Yasunori Ohishi
Noboru Harada
K. Kashino
38
27
0
24 Sep 2020
AVLnet: Learning Audio-Visual Language Representations from
  Instructional Videos
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos
Andrew Rouditchenko
Angie Boggust
David Harwath
Brian Chen
D. Joshi
...
Rogerio Feris
Brian Kingsbury
M. Picheny
Antonio Torralba
James R. Glass
SSL
57
142
0
16 Jun 2020
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern
  Recognition
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
Qiuqiang Kong
Yin Cao
Turab Iqbal
Yuxuan Wang
Wenwu Wang
Mark D. Plumbley
VLM
SSL
184
1,075
0
21 Dec 2019
Clotho: An Audio Captioning Dataset
Clotho: An Audio Captioning Dataset
Konstantinos Drossos
Samuel Lipping
Tuomas Virtanen
87
389
0
21 Oct 2019
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and
  lighter
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
218
7,481
0
02 Oct 2019
UMAP: Uniform Manifold Approximation and Projection for Dimension
  Reduction
UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction
Leland McInnes
John Healy
James Melville
154
9,409
0
09 Feb 2018
Cross-modal Embeddings for Video and Audio Retrieval
Cross-modal Embeddings for Video and Audio Retrieval
Dídac Surís
A. Duarte
Amaia Salvador
Jordi Torres
Xavier Giró-i-Nieto
SSL
44
69
0
07 Jan 2018
Content-based Representations of audio using Siamese neural networks
Content-based Representations of audio using Siamese neural networks
Pranay Manocha
Rohan Badlani
Anurag Kumar
Ankit Parag Shah
Benjamin Elizalde
Bhiksha Raj
48
33
0
30 Oct 2017
CNN Architectures for Large-Scale Audio Classification
CNN Architectures for Large-Scale Audio Classification
Shawn Hershey
Sourish Chaudhuri
D. Ellis
J. Gemmeke
A. Jansen
...
Rif A. Saurous
Bryan Seybold
M. Slaney
Ron J. Weiss
K. Wilson
114
2,497
0
29 Sep 2016
Gaussian Error Linear Units (GELUs)
Gaussian Error Linear Units (GELUs)
Dan Hendrycks
Kevin Gimpel
169
5,001
0
27 Jun 2016
Adam: A Method for Stochastic Optimization
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.7K
150,006
0
22 Dec 2014
1