ICNLSP 2023: Direct Speech to Text Translation: Bridging the Modality Gap Using SimSiam

ICNLSP Conference

21 views

Published 14 Oct 2024
Title of the presentation: Direct Speech to Text Translation: Bridging the Modality Gap Using SimSiam.
By: Balaram Sarkar (Indian Institute of Technology Indore); Chandresh K Maurya (IBM Research); Anshuman Agrahri (IIT, Indore), India.
6th International Conference on Natural Language and Speech Processing.
icnlsp.org/202...
Abstract:
Learning similar representations for spoken utterances and their written text involves understanding both forms in a shared manner. Developing similar representations for semantically related speech and text is essential, particularly for tasks like speech-to-text (S2T) translation. To that end, we propose a SimSiam-based S2T (S3T) model that leverages the SimSiam network, a state-of-the-art unsupervised learning architecture, to bridge the modality gap between speech and text. The proposed model does not require negative sample mining. A comparative study on four translation directions of the standard MuST-C dataset (Di Gangi et al., 2019) demonstrates that the proposed S3T translation model outperforms all existing methods and achieves an average BLEU score of 30.02. Our analysis affirms that S3T effectively bridges the representation gap between the two modalities.
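
To make the "no negative sample mining" point concrete, here is a minimal sketch of the generic SimSiam objective: a symmetric negative cosine similarity between a predictor output from one view and the encoder output of the other view. Treating the speech and text embeddings as the two views is an assumption based on the abstract, not the authors' confirmed formulation, and the function and argument names (`simsiam_loss`, `p_speech`, `z_text`, and so on) are hypothetical. In a real training setup the `z` arguments would be wrapped in a stop-gradient, which plain Python cannot express.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def simsiam_loss(p_speech, z_text, p_text, z_speech):
    """Symmetric negative cosine similarity (generic SimSiam objective).

    p_* are predictor outputs, z_* are encoder outputs. Assumption: in an
    autodiff framework the z_* inputs would be detached (stop-gradient).
    Only positive pairs are used; no negative samples appear in the loss.
    """
    return -0.5 * cosine(p_speech, z_text) - 0.5 * cosine(p_text, z_speech)

# Hypothetical toy embeddings, for illustration only.
aligned = simsiam_loss([1.0, 0.0], [2.0, 0.0], [0.0, 3.0], [0.0, 1.0])
orthogonal = simsiam_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0])
```

The loss reaches its minimum of -1 when each speech/text pair points in the same direction, and it sits at 0 when the pairs are orthogonal, so lowering it pulls paired representations together without comparing against unrelated samples.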
