ICNLSP 2024: Probing Whisper Predictions for French, English and Persian Transcriptions

Probing Whisper Predictions for French, English and Persian Transcriptions
By: Nicolas Ballier, Léa Burin, Behnoosh Namdarzadeh, Sara B Ng, Richard Wright and Jean-Baptiste Yunès
Université Paris Cité
7th International Conference on Natural Language and Speech Processing.
icnlsp.org/2024welcome
Abstract:
Whisper is a widely-used open-access Large Language Model (LLM) trained using a multilingual paradigm. As such it represents an important opportunity for researchers to study how multilingual LLMs function across languages. In this paper, we analyse Whisper's Large and Medium models for Persian, English and French using a transcription task. To investigate the calibration of Whisper models, we use a customised C++ version of Whisper to probe Whisper's internal representations by extracting the subtoken probabilities for transcriptions of speech samples of the target languages. We discuss our subtoken-based evaluation of prediction accuracy as a proxy for standard Word Error Rate evaluation of the different Whisper models. The accuracy of the ASR predictions is investigated as a function of target language and part of speech. Our analysis reveals an architectural bias for French and discrepancies in accuracy in relation to the size of the training data. The results of our novel subtoken-based evaluation supplement previously-reported cross-lingual evaluations of Whisper, and enable better fine-tuning by suggesting types of data that may improve calibration.

Videos

ICNLSP 2024: On Barriers to Archival Audio Processing (Peter Sullivan and Muhammad Abdul-Mageed)

11:47

2 views, 14 days ago

On Barriers to Archival Audio Processing
By: Peter R Sullivan and Muhammad Abdul-Mageed
University of British Columbia
7th International Conference on Natural Language and Speech Processing.
icnlsp.org/2024welcome
Abstract:
In this study, we leverage a unique UNESCO collection of mid-20th century radio recordings to probe the robustness of modern off-the-shelf language identification (LID) and ...

ICNLSP 2024: Double Decoder: Improving latency for Streaming End-to-end ASR Models

10:54

4 views, 14 days ago

Double Decoder: Improving latency for Streaming End-to-end ASR Models
By: Riqiang Wang, Shreekantha Nadig, Daniil Kulko, Simon Vandieken, Chia-tien Chang, Seyyed Saeed Sarfjoo and Jonas Robertson
DIALPAD
7th International Conference on Natural Language and Speech Processing.
icnlsp.org/2024welcome
Abstract:
In this paper, we propose a novel decoding algorithm for streaming End-to-end (E2E) auto...

ICNLSP 2024: Investigating Gender Bias in Large Language Models Through Text Generation

13:40

4 views, 14 days ago

Investigating Gender Bias in Large Language Models Through Text Generation
By: Shweta Soundararajan and Sarah Jane Delany
Technological University Dublin
7th International Conference on Natural Language and Speech Processing.
icnlsp.org/2024welcome
Abstract:
Large Language Models (LLMs) have swiftly become essential tools across diverse applications such as automated content creation, personal ...

ICNLSP 2024: Design and Comparison of Arabic Negotiation Bots Using LLMs versus Seq2Seq Models .....

11:25

5 views, 14 days ago

Design and Comparison of Arabic Negotiation Bots Using LLMs versus Seq2Seq Models with Reinforcement Learning
By: Ahmad Hajj, Yasmine A Abu Adla, Samah Albast, Hazem Hajj, Shady Elbassuoni, Wassim El Hajj, Khaled Shaban
University of Wisconsin-Madison
7th International Conference on Natural Language and Speech Processing.
icnlsp.org/2024welcome
Abstract:
Negotiation is a crucial aspect of daily...

ICNLSP 2024: Enhancing LLM-based Arabic Negotiation by Fine Tuning on Dialogue Shortcomings

11:36

9 views, 14 days ago

Enhancing LLM-based Arabic Negotiation by Fine Tuning on Dialogue Shortcomings
By: Yasmine A Abu Adla, Hazem Hajj, Shady Elbassuoni, Khaled Shaban, Wassim El Hajj
American University of Beirut
7th International Conference on Natural Language and Speech Processing.
icnlsp.org/2024welcome
Abstract:
This study advances Arabic dialogue negotiation by enriching the responses of Large Language Models...

ICNLSP 2024: EEG Signal Analysis for Multimodal Simple Concepts Decoding

14:38

2 views, 14 days ago

EEG Signal Analysis for Multimodal Simple Concepts Decoding
By: Sergio Guillén Jiménez, Lorenzo J. Tardón, Ana M Barbancho, Isabel Barbancho
Universidad de Málaga
7th International Conference on Natural Language and Speech Processing.
icnlsp.org/2024welcome
Abstract:
In this paper, we explore the use of a feature extraction model for the detection of basic decision-making concepts such as 'yes'...

ICNLSP 2024: Asking the Right Questions: Exploiting Hidden Interactions in a Generative Framework...

14:06

4 views, 14 days ago

Asking the Right Questions: Exploiting Hidden Interactions in a Generative Framework for Multilingual, Multitask Classification
By: Sebastian-Antonio Toma, Camelia Lemnaru, Vlad Andrei Negru, Rodica Potolea
Technical University of Cluj-Napoca
7th International Conference on Natural Language and Speech Processing.
icnlsp.org/2024welcome
Abstract:
This study explores the potential of leveraging a...

ICNLSP 2024: Dual-Task Learning for AI-Generated Medical Text Detection and Named Entity Recognition

11:56

2 views, 14 days ago

Dual-Task Learning for AI-Generated Medical Text Detection and Named Entity Recognition
By: Saja B. Al-Dabet, Ban Alomar, Sherzod R Turaev, Abdelkader Belkacem
United Arab Emirates University
7th International Conference on Natural Language and Speech Processing.
icnlsp.org/2024welcome
Abstract:
"The integration of artificial intelligence (AI) into the medical field has revolutionized documenta...

ICNLSP 2024: Native Language Identification Improves Authorship Attribution

12:56

4 views, 14 days ago

Native Language Identification Improves Authorship Attribution
By: Ahmet Yavuz Uluslu, Gerold Schneider, Can Yildizli
University of Zurich
7th International Conference on Natural Language and Speech Processing.
icnlsp.org/2024welcome
Abstract:
This study investigates the integration of native language identification into authorship attribution, a previously unexplored aspect that is particularl...

ICNLSP 2024: Sawaal: A Framework for Automatic Question Generation in Urdu

13:55

2 views, 14 days ago

Sawaal: A Framework for Automatic Question Generation in Urdu
By: Maria Rahim, Shakeel Ahmed Khoja
Institute of Business Administration
7th International Conference on Natural Language and Speech Processing.
icnlsp.org/2024welcome
Abstract:
This study proposes a novel framework for automatic question generation (AQG) designed specifically for the Urdu language. The framework encompasses seven s...

ICNLSP 2024: Detecting ChatGPT-Generated Text with GZIP-KNN: A No-Training, Low-Resource Approach

17:20

3 views, 14 days ago

Detecting ChatGPT-Generated Text with GZIP-KNN: A No-Training, Low-Resource Approach
By: Matthias Berchtold, Sandra Mitrovic, Davide Andreoletti, Daniele Puccinelli, Omran Ayoub
Istituto Dalle Molle di Studi sull'Intelligenza Artificiale
7th International Conference on Natural Language and Speech Processing.
icnlsp.org/2024welcome
Abstract:
Text classification is a fundamental Natural Language ...

ICNLSP 2024: Semantically Enriched Text Generation for QA through Dense Paraphrasing

14:23

9 views, 14 days ago

Semantically Enriched Text Generation for QA through Dense Paraphrasing
By: Timothy Obiso, Bingyang Ye, Kyeongmin Rim, James Pustejovsky
Brandeis University
7th International Conference on Natural Language and Speech Processing.
icnlsp.org/2024welcome
Abstract:
Large language models (LLMs) are very effective at extractive language tasks such as QA. While LLMs can improve their performance on th...

ICNLSP 2024: Human and Machine: Language Processing in Translation Tasks

8:44

3 views, 14 days ago

Human and Machine: Language Processing in Translation Tasks
By: Hening Wang, Leixin Zhang, Ondřej Bojar
Universität Tübingen
7th International Conference on Natural Language and Speech Processing.
icnlsp.org/2024welcome
Abstract:
The present study analyzes the influence of linguistic factors (sentence ambiguities) and non-linguistic factors (visual cues) on online language processing in transla...

ICNLSP 2024: Bulgarian Grammar Error Correction with Data Augmentation and Machine Translation Tech.

11:44

2 views, 14 days ago

Bulgarian Grammar Error Correction with Data Augmentation and Machine Translation Techniques
By: Bozhidar Klouchek and Riza Batista-Navarro
The University of Manchester
7th International Conference on Natural Language and Speech Processing.
icnlsp.org/2024welcome
Abstract:
Grammar Error Correction (GEC) in Bulgarian is particularly difficult because of the lack of specialised linguistic resourc...

ICNLSP 2024: Large-scale Summarization of Chat Transcripts in the Absence of Annotated Summaries

15:45

3 views, 14 days ago

ICNLSP 2024: Conversational Exploratory Search of Scholarly Publications Using Knowledge Graphs

12:29

8 views, 14 days ago

ICNLSP 2024: A Hybrid Retrieval Approach for Advancing Retrieval-Augmented Generation Systems

12:25

7 views, 14 days ago

ICNLSP 2024: Linking Quran and Hadith Topics in an Ontology using Word Embeddings and Cellfie Plugin

11:21

7 views, 14 days ago

ICNLSP 2024: Modeling Score Estimation for Japanese Essays with Generative Pre-trained Transformers

14:34

14 views, 14 days ago

ICNLSP 2024: PoliTun: Tunisian Political Dataset for Detecting Public Opinions and Categories ...

12:28

4 views, 14 days ago

ICNLSP 2024: Data Bias According to Bipol: Men are Naturally Right and It is the Role of Women to ..

10:45

5 views, 14 days ago

ICNLSP 2024: Improving Long-term F0 representation using post-processing techniques

13:54

8 views, 28 days ago

ICNLSP 2024: FeruzaSpeech: A 60 Hour Uzbek Read Speech Corpus with Punctuation, Casing, and Context

8:38

9 views, 28 days ago

ICNLSP 2024: Deep Information Maximisation to Mitigate Information Loss in Text Independent ...

12:41

11 views, 28 days ago

ICNLSP 2024: Improved Spoken Emotion Recognition With Combined Segment-Based Processing ...

9:27

6 views, 1 month ago

ICNLSP 2024: GemST: Continual Learning for End-to-End Speech-to-Text Translation

6:14

7 views, 1 month ago

ICNLSP 2024: Thonburian Whisper: Robust Fine-tuned and Distilled Whisper for Thai

12:51

26 views, 1 month ago

ICNLSP 2024: Personalised Abusive Language Detection Using LLMs and Retrieval-Augmented Generation

8:56

10 views, 1 month ago

ICNLSP 2024: Resolving Gender Biases in LLMs at Inference Time with Novel Dijkstra’s-based .......

12:42

35 views, 1 month ago

ICNLSP Conference