ISPL KU
South Korea
Joined Aug 27, 2020
The YouTube channel of the Intelligent Signal Processing Laboratory (ISPL), School of Electrical and Computer Engineering, Korea University.
[ISPL seminar] Mini-Monkey: Multi-Scale Adaptive Cropping for Multimodal Large Language Models
September 4, 2024, 10:00 AM · Presenter: 노성혁
Views: 27
Videos
[ISPL seminar] MemoryBank: Enhancing Large Language Models with Long-Term Memory (AAAI 2024)
Views: 4 · 2 months ago
August 14, 2024, 10:00 AM · Presenter: 홍윤아
[ISPL seminar] Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-talker Speech (ICASSP 2024)
Views: 19 · 3 months ago
July 10, 2024, 10:00 AM · Presenter: 최철원
[ISPL seminar] MixLoRA: Enhancing Large Language Models
Views: 44 · 4 months ago
June 26, 2024, 10:00 AM · Presenter: Maab
[ISPL seminar] NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers (ICLR 2024, Microsoft)
Views: 10 · 4 months ago
May 29, 2024, 10:00 AM · Presenter: 민정기
[ISPL seminar] Fine-tuning Pre-trained Language Models for Few-shot Intent Detection: Supervised Pre-training and Isotropization
Views: 50 · 5 months ago
May 1, 2024, 10:00 AM · Presenter: 여은기
[ISPL seminar] MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification (INTERSPEECH 2022, Tsinghua University & TEG AI)
Views: 33 · 6 months ago
April 3, 2024, 10:00 AM · Presenter: 김용민
[ISPL seminar] Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data
Views: 60 · 7 months ago
March 27, 2024, 10:00 AM · Presenter: 이원명
[ISPL seminar] Target Speaker Extraction with Ultra-short Reference Speech by VE-VE Framework (ICASSP 2023, Samsung Research)
Views: 14 · 7 months ago
March 20, 2024, 10:00 AM · Presenter: 고경득
[ISPL seminar] Time-Series Representation Learning via Temporal and Contextual Contrasting
Views: 44 · 7 months ago
March 13, 2024, 10:00 AM · Presenter: 이준엽
[ISPL seminar] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors
Views: 53 · 7 months ago
March 6, 2024, 10:00 AM · Presenter: 이석한
[ISPL seminar] The Internal State of an LLM Knows When It's Lying
Views: 69 · 7 months ago
February 28, 2024, 10:00 AM · Presenter: 박노진
[ISPL seminar] LivelySpeaker: Towards Semantic-Aware Co-Speech Gesture Generation
Views: 31 · 8 months ago
[ISPL seminar] An MoE-based Parameter-Efficient Fine-Tuning Method for Multi-task Medical Application
Views: 44 · 8 months ago
[ISPL Seminar] Structure-aware Editable Morphable Model for 3D Facial Detail Animation and Manipulation
Views: 17 · 8 months ago
[ISPL Seminar] Dense X Retrieval: What Retrieval Granularity Should We Use?
Views: 63 · 8 months ago
[VIP 701] Text-Conditional Contextualized Avatars
Views: 13 · a year ago
[VIP 701] K-Planes: Explicit Radiance Fields
Views: 209 · a year ago
[VIP 701] Self-supervised Learning (Audio)
Views: 20 · a year ago
[VIP 701] Object Detection and Classification
Views: 14 · a year ago
[VIP 701] Facial Expression Based Emotion Recognition
Views: 24 · a year ago
Comments

Thank you for this amazing research and the video. I read the paper and would like to test out the framework. Is there a GitHub repo for this project?
Looking at the experiments, the results do not exceed 20 dB, which appears to be due to using DPRNN as the base model. It might be better to build on SepFormer or another Transformer-based model rather than DPRNN.
Overall, you've done a great job explaining the slides. The experimental results report accuracy, but I was wondering whether there were any findings from using manifold methods to reduce the high-dimensional class features to lower dimensions (2D or 3D) and visually inspect whether the classes are effectively separated. The idea of applying strong and weak augmentation to other tasks to improve generalization also seems promising. Thank you.
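The visualization idea in that comment is usually done with t-SNE or UMAP; a NumPy-only PCA projection (a simpler linear stand-in, with synthetic features in place of the model's real embeddings, both assumptions) illustrates the inspection step:

```python
import numpy as np

def pca_2d(features):
    # Project high-dimensional features to 2-D via PCA; a linear
    # stand-in for t-SNE/UMAP, used only for illustration.
    centered = features - features.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    top2 = vecs[:, np.argsort(vals)[::-1][:2]]  # two largest-variance axes
    return centered @ top2

# Synthetic "class features": two Gaussian clusters in 64-D.
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(c, 0.5, size=(100, 64)) for c in (0.0, 3.0)])
emb = pca_2d(feats)  # shape (200, 2); scatter-plot colored by class to inspect separation
```

If the classes are well differentiated in feature space, their 2-D projections form visibly distinct clusters; overlapping clouds suggest the features are not discriminative enough.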
Hello, is there any way I can contact you?
Hello, I really like this demo, can you give me the source code of this demo?
Is an English version available?
Hello, I can't see the video image clearly.
Can you send the presentation?
Can it tell the difference between sleep talking and speech? Or snoring and breathing? Thanks.
Can you provide the training code?
Hello
Does this use two models? An object detection model that takes the whole image as input to detect persons, and then an action recognition model applied to each detected person?
How can D_m be trained on only a small amount of paired data?
Any English subtitles?