TransforMerger: Transformer-based Voice-Gesture Fusion for Robust Human-Robot Communication


2 April 2025
Petr Vanc
Karla Stepanova

Papers citing "TransforMerger: Transformer-based Voice-Gesture Fusion for Robust Human-Robot Communication"

9 / 9 papers shown
LaMI: Large Language Models for Multi-Modal Human-Robot Interaction
Chao Wang
Stephan Hasler
Daniel Tanneberg
Felix Ocker
Frank Joublin
Antonello Ceravola
Joerg Deigmoeller
Michael Gienger
LM&Ro
78
30
0
26 Jan 2024
Context-aware robot control using gesture episodes
Petr Vanc
Jan Kristof Behrens
Karla Stepanova
70
4
0
24 Jan 2023
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM, VLM
385
3,542
0
29 Apr 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro, LRM, AI4CE, ReLM
817
9,576
0
28 Jan 2022
Finetuned Language Models Are Zero-Shot Learners
Jason W. Wei
Maarten Bosma
Vincent Zhao
Kelvin Guu
Adams Wei Yu
Brian Lester
Nan Du
Andrew M. Dai
Quoc V. Le
ALM, UQCV
206
3,750
0
03 Sep 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
657
41,103
0
22 Oct 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
795
42,055
0
28 May 2020
Multimodal Uncertainty Reduction for Intention Recognition in Human-Robot Interaction
Susanne Trick
Dorothea Koert
Jan Peters
Constantin Rothkopf
10
33
0
04 Jul 2019
Multimodal Machine Learning: A Survey and Taxonomy
T. Baltrušaitis
Chaitanya Ahuja
Louis-Philippe Morency
101
2,932
0
26 May 2017