ISPL KU
  • 161
  • 9 529

Videos

[ISPL seminar]MemoryBank: Enhancing Large Language Models with Long-Term Memory (AAAI, 2024)
4 views • 2 months ago
August 14, 2024, 10:00 AM. Presenter: 홍윤아. MemoryBank: Enhancing Large Language Models with Long-Term Memory (AAAI, 2024)
[ISPL seminar]Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-talker Speech
19 views • 3 months ago
July 10, 2024, 10:00 AM. Presenter: 최철원. Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-talker Speech (ICASSP 2024)
[ISPL seminar]MIXLORA Enhancing Large Language Models
44 views • 4 months ago
June 26, 2024, 10:00 AM. Presenter: Maab. MIXLORA Enhancing Large Language Models
Image-to-Text Generation Demo 3
11 views • 4 months ago
Image-to-Text Generation Demo 2
4 views • 4 months ago
Image-to-Text Generation Demo 1
7 views • 4 months ago
[ISPL seminar]NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers
10 views • 4 months ago
May 29, 2024, 10:00 AM. Presenter: 민정기. NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers (ICLR 2024, Microsoft)
[ISPL seminar]Fine-tuning Pre-trained Language Models for Few-shot Intent Detection
50 views • 5 months ago
May 1, 2024, 10:00 AM. Presenter: 여은기. Fine-tuning Pre-trained Language Models for Few-shot Intent Detection: Supervised Pre-training and Isotropization
[ISPL seminar]Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification
33 views • 6 months ago
April 3, 2024, 10:00 AM. Presenter: 김용민. MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification (INTERSPEECH 2022, Tsinghua University & TEG AI)
[ISPL seminar]Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data
60 views • 7 months ago
March 27, 2024, 10:00 AM. Presenter: 이원명. Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data
[ISPL seminar]Target Speaker Extraction with Ultra-short Reference Speech by VE-VE Framework
14 views • 7 months ago
March 20, 2024, 10:00 AM. Presenter: 고경득. Target Speaker Extraction with Ultra-short Reference Speech by VE-VE Framework (ICASSP 2023, Samsung Research)
[ISPL seminar]Time-Series Representation Learning via Temporal and Contextual Contrasting
44 views • 7 months ago
March 13, 2024, 10:00 AM. Presenter: 이준엽. Time-Series Representation Learning via Temporal and Contextual Contrasting
[ISPL seminar]Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors
53 views • 7 months ago
March 6, 2024, 10:00 AM. Presenter: 이석한. Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors
[ISPL seminar]The Internal State of an LLM Knows When It’s Lying
69 views • 7 months ago
February 28, 2024, 10:00 AM. Presenter: 박노진. The Internal State of an LLM Knows When It's Lying
[ISPL seminar]LivelySpeaker: Towards Semantic-Aware Co-Speech Gesture Generation
31 views • 8 months ago
[ISPL seminar]An MOE-based Parameter Efficient Fine-Tuning Method for Multi-task Medical Application
44 views • 8 months ago
[ISPL Seminar]Structure-aware Editable Morphable Model for 3D Facial Detail Animation and Manipulation
17 views • 8 months ago
[ISPL Seminar]Dense X Retrieval: What Retrieval Granularity Should We Use?
63 views • 8 months ago
[VIP 701] Text-Conditional Contextualized Avatars
13 views • a year ago
[VIP 701] 3D Human pose estimation
56 views • a year ago
[VIP 701] K-Planes: explicit radiance fields
209 views • a year ago
[VIP 701] 3D Face Reconstruction
35 views • a year ago
[VIP 701] Source separation (speech)
19 views • a year ago
[VIP 701] Self-supervised learning (audio)
20 views • a year ago
[VIP 701] Object detection and classification
14 views • a year ago
[VIP 701] Facial expression based emotion recognition
24 views • a year ago
[VIP 701] TTS
23 views • a year ago
[VIP 701] Voice Conversion
17 views • a year ago
[VIP 701] Text-to-gesture training
12 views • a year ago

COMMENTS

  • @harmonite99 • 6 months ago

    Thank you for this amazing research and the video. I read the paper and would like to test out the framework. Is there a GitHub repo for this project?

  • @김용민-y8f • 6 months ago

    Looking at the experiments, the results do not exceed 20 dB. This is understood to be due to the use of DPRNN as the base model. I think it would be better to build on SepFormer or another Transformer-based model rather than DPRNN.

  • @김용민-y8f • 7 months ago

    Overall, it seems like you've done a great job explaining the contents of the slides. The experimental results report accuracy, but I was wondering whether there were any findings that used manifold methods to visually inspect whether the class features were effectively separated after reducing the high-dimensional features to lower dimensions (2D or 3D). Furthermore, applying strong and weak augmentation to other tasks to improve generalization also looks like a good approach. Thank you.
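A minimal sketch of the kind of manifold-based inspection suggested in the comment above, assuming the trained encoder's features and class labels are available as arrays; the `features` and `labels` variables here are synthetic placeholders, not results from the paper:

```python
# Hedged sketch: project high-dimensional representations to 2D with t-SNE
# and color points by class to eyeball how well the classes separate.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
features = rng.normal(size=(300, 128))   # placeholder for encoder outputs (N x D)
labels = rng.integers(0, 5, size=300)    # placeholder for class labels (N,)

# Reduce to 2D; perplexity and init are typical defaults, not values from the paper.
embedded = TSNE(n_components=2, perplexity=30, init="pca",
                random_state=0).fit_transform(features)

plt.figure(figsize=(5, 5))
scatter = plt.scatter(embedded[:, 0], embedded[:, 1], c=labels, cmap="tab10", s=10)
plt.legend(*scatter.legend_elements(), title="class")
plt.title("t-SNE of learned representations")
plt.tight_layout()
plt.show()
```

Compact, well-separated clusters per class would suggest the representations discriminate the classes well; heavy overlap would suggest they do not.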

  • @HuyNgô-e3g • 8 months ago

    Hello, is there any way I can contact you?

  • @HuyNgô-e3g • 8 months ago

    Hello, I really like this demo. Can you give me its source code?

  • @kottapallisaiswaroop9849 • 8 months ago

    Is an English version available?

  • @oskarikaadinugroho1332 • 10 months ago

    Hello, I can't see the video image clearly.

  • @meme2002a • 10 months ago

    Can you send the presentation?

  • @luiswu7885 • a year ago

    Can it tell the difference between dream talk and speech? Or snoring and breathing? Thanks.

  • @vaishalishiv5529 • a year ago

    Can you provide the code for learning?

  • @vaishalishiv5529 • a year ago

    Hello

  • @uprisingjc_sub6664 • a year ago

    Does this use two models: an object detection model that takes the whole image as input to detect persons, and then an action recognition model applied to each detected person?
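For reference, a minimal sketch of the two-stage pipeline the question describes, assuming a torchvision person detector and a separate per-crop action classifier; `action_model` below is a stand-in placeholder, not the model used in the video:

```python
# Hedged sketch of a two-stage pipeline: detect persons on the full frame,
# then run an action-recognition model on each person crop.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()
action_model = torch.nn.Sequential(            # stand-in for a real action classifier
    torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(), torch.nn.Linear(3, 10)
)

@torch.no_grad()
def recognize_actions(image_pil, score_thr=0.7):
    img = to_tensor(image_pil)                 # C x H x W float tensor in [0, 1]
    det = detector([img])[0]                   # dict with boxes, labels, scores
    results = []
    for box, label, score in zip(det["boxes"], det["labels"], det["scores"]):
        if label.item() != 1 or score < score_thr:   # COCO class 1 == person
            continue
        x1, y1, x2, y2 = box.int().tolist()
        crop = img[:, y1:y2, x1:x2].unsqueeze(0)      # 1 x C x h x w person crop
        action_logits = action_model(crop)            # per-person action prediction
        results.append((box.tolist(), action_logits.argmax(dim=1).item()))
    return results
```

A single-stage alternative would predict boxes and actions jointly, but the two-stage split keeps the detector and the action classifier independently trainable and swappable.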

  • @이상윤-n7d • 3 years ago

    How can D_m be trained with only a small amount of paired data?

  • @mohamed_bouallegue • 3 years ago

    Any English subtitles?