Center for Language & Speech Processing (CLSP), JHU
Streaming Sequence Transduction through Dynamic Compression
We introduce STAR (Stream Transduction with Anchor Representations), a novel Transformer-based model designed for efficient sequence-to-sequence transduction over streams. STAR dynamically segments input streams to create compressed anchor representations, achieving nearly lossless compression (12x) in Automatic Speech Recognition (ASR) and outperforming existing methods. Moreover, STAR demonstrates superior segmentation and latency-quality trade-offs in simultaneous speech-to-text tasks, optimizing latency, memory footprint, and quality.
Views: 8

Videos

Fighting Bias from Bias: Robust Natural Language Techniques to Promote Health Equity
Views: 22 · 2 hours ago
As artificial intelligence (AI) continues to rapidly expand into existing healthcare infrastructure - e.g., clinical decision support, administrative tasks, and public health surveillance - it is perhaps more important than ever to reflect on the broader purpose of such systems. While much focus has been on the potential for this technology to improve general health outcomes, there also exists ...
Adversarial and Poisoning Attacks against Speech Systems: Where to Find Them?
Views: 30 · 4 hours ago
In this presentation, we delve into the intricate world of machine learning system vulnerabilities, focusing primarily on poisoning attacks and their impact on data integrity. The speaker, research scientist Thomas Thebaud, offers a comprehensive breakdown of how data can be maliciously altered to affect machine learning outcomes, highlighting the dangers and effectiveness of both "dirty labels...
Artificial Language Learning and Language Acquisition - Rebecca Gomez (JHU) - 2000
Views: 76 · 7 hours ago
Abstract The rapidity with which children acquire language is one of the mysteries of human cognition. A widely held view is that children master language by means of a language-specific learning device. An earlier proposal, generating renewed interest, is that children make use of domain-general, associative learning mechanisms in acquiring language. However, we know little about the actual le...
Error Norm Truncation: Robust Training in the Presence of Data Noise for Text Generation Models
Views: 59 · 14 hours ago
ICLR 2024 arxiv.org/abs/2310.00840 Abstract: Text generation models are notoriously vulnerable to errors in the training data. With the widespread availability of massive amounts of web-crawled data, how can we enhance the robustness of models trained on a massive amount of noisy web-crawled text? In our work, we propose Error Norm Truncation (ENT), a robust enhanceme...
Eric Fosler-Lussier: Speech Recognition - 2000
Views: 15 · a day ago
Tutorial: Efficient language models -- Tianjian Li (JHU)
Views: 229 · a day ago
Lecture given as part of CS 601.471/671 NLP: Self-supervised Models: self-supervised.cs.jhu.edu/sp2024/ Speaker: Tianjian Li: tianjianl.github.io/
Data-distributional Approaches for Generalizable Language Models -- Sang Michael Xie (Stanford)
Views: 125 · 14 days ago
Abstract: High-quality datasets are crucial for improving the capabilities and training efficiency of large language models. However, current datasets are typically prepared in an ad hoc, heuristic way. In this talk, Sang Michael Xie will present principled approaches to improving and understanding language models centered on the pre-training data distribution. First, he will describe how to im...
UBM based Acoustic Modeling for ASR (Daniel Povey) - 2009
Views: 37 · 14 days ago
A Universe To Be Decided: Towards Specialized Foundation Models for Advancing Astronomy
Views: 89 · 14 days ago
Abstract: I discuss the application of Foundation Models in Astronomy through the collaborative efforts of the UniverseTBD consortium with a mission to democratize Science for everyone. One of our key objectives is to overcome the limitations of general-purpose Foundation Models, such as producing limited information in specialized fields. To this end, we have developed the first specialized la...
Neurosymbolic AI or: How I Learned to Stop Worrying and Love the Large Language Model
Views: 426 · 14 days ago
Abstract: Large language models like ChatGPT have shown extraordinary abilities for writing. While impressive at first glance, large language models aren't perfect and often make mistakes humans would not make. The main architecture behind ChatGPT mostly doesn't differ from early neural networks, and as a consequence, carries some of the same limitations. My work revolves around the use of neur...
Intro to information retrieval (James Mayfield) - 2009
Views: 60 · 21 days ago
On Representing Acoustics of Speech for Speech Processing - Bishnu Atal (UW) - 2009
Views: 65 · a month ago
Abstract Proper representation of the acoustic speech signal is crucial for almost every speech processing application. We often use short-time Fourier transform to convert the time-domain speech waveform to a new signal that is a function of both time and frequency by applying a moving time window of about 20 ms in duration. There are many issues, such as the size and shape of the window, that...
Bayesian Nonparametric Methods for Complex Dynamical Phenomena - Emily Fox (UPenn) - 2012
Views: 49 · a month ago
Abstract Markov switching processes, such as hidden Markov models (HMMs) and switching linear dynamical systems (SLDSs), are often used to describe rich classes of dynamical phenomena. They describe complex temporal behavior via repeated returns to a set of simpler models: imagine, for example, a person alternating between walking, running and jumping behaviors, or a stock index switching betwe...
To Sentences and Beyond! Paving the way for Context-Aware Machine Translation
Views: 57 · a month ago
Presentation by Rachel Wicks. Most machine translation systems operate at the sentence level, while humans write and translate within a given context. Operating on individual sentences forces error-prone sentence segmentation into the machine translation pipeline. This limits the upper-bound performance of these systems by creating noisy training bitext. Further, many grammatical features necessi...
Machine Learning - Finding Patterns in the World - Mark Dredze (JHU) - 2009
Views: 530 · a month ago
Clean Label Poisoning Attacks: from Classification to Speech Recognition
Views: 87 · a month ago
How Geometric Should Our Semantic Models Be? - Katrin Erk (University of Texas)
Views: 36 · a month ago
Fast, Accurate and Robust Multilingual Syntactic Analysis - Slav Petrov (Google) - 2012
Views: 54 · a month ago
Improving machine translation by propagating uncertainty - Chris Dyer (2009)
Views: 93 · 2 months ago
CDG-Based Language Models (Mary Harper) - 2009
Views: 35 · 2 months ago
Speech and Audio Processing in Non-Invasive Brain-Computer Interfaces at Meta [Michael Mandel]
Views: 392 · 2 months ago
EM Works for Pronoun-Anaphora Resolution - Eugene Charniak (Brown University) - 2009
Views: 32 · 2 months ago
Student Lightning Talks - Tianjian, Lingfeng, Kaiser
Views: 122 · 3 months ago
Probabilistic Models of Relational Domains - Daphne Koller (Stanford)
Views: 89 · 3 months ago
Prominence in conversational speech: pitch accent, contrast and givenness - Ani Nenkova (UPenn) 2008
Views: 55 · 3 months ago
Bipartite Graph Factorization in Static Decoding Graphs with Long-Span Acoustic Context - G. Zweig
Views: 62 · 4 months ago
Foundation Models and the Transfer of Embodied Autonomy -- Alvaro Velasquez (DARPA)
Views: 184 · 4 months ago
An Overview of Digital Libraries - Tim DiLauro (Johns Hopkins University) - 2003
Views: 48 · 4 months ago
SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation -- Jack Zhang (JHU)
Views: 63 · 4 months ago

COMMENTS

  • @johndoolan9732
    @johndoolan9732 18 days ago

    Now why not try visualisation in your mind? That would teach more, with different teaching methods, and with much more speed and better results, because the mind is our best tool.

  • @EvanTanMusic
    @EvanTanMusic 28 days ago

    fantastic

  • @__________________________6910
    @__________________________6910 a month ago

    Thanks for this wonderful tutorial ❤

  • @__________________________6910
    @__________________________6910 a month ago

    Upload the latest videos, not old ones

  • @qalabeabbas6114
    @qalabeabbas6114 a month ago

    Hi, great talk. Is it possible to get that presentation material?

  • @whoknows4756
    @whoknows4756 2 months ago

    *She does not know anything...lawyer*

  • @pulkitmehta1795
    @pulkitmehta1795 2 months ago

    This is a great talk. I learnt a lot. I am currently working on a speaker diarization problem and pyannote is by far the best for our use case. Great work Herve.

  • @Phooenixification
    @Phooenixification 2 months ago

    This was really interesting. But it would be more interesting if it were an uncensored experiment, since our world in itself is uncensored (I get that this could get wild, and what you show here is only a concept). Like one of the things you said: they always address each other formally, very similar to GPT. If it were more unleashed, the agents might start to develop their own kind of language over time and between each other, and saying good morning to your spouse gets shortened to just a good morning, or something else they develop to say to each other. And they might start to make inside jokes, perhaps? But that would require the emotion part I'll talk about later. I don't know how deep the generative part goes, or if GPT needs to reach "system 2" before we can see this type of behaviour. And since agents don't really have a mood, they will always be pretty neutral in all encounters, and mood is only simulated through already learnt behaviour.

    Although I think it wouldn't give a real interpretation of our world even if it were uncensored: stuff like emotions and consequences, and the emotions which come due to those consequences (like serving jail time), and the fact that we have a limited lifespan and that our life is sectioned up into parts (child, teen, young adult, mid adult, etc.), need to be addressed as well. For example, an old man might be more inclined to commit a certain crime than a younger person, because his life is soon over anyway. For a hypothetical Smallville example: John invites his crush Jennifer to his birthday party, but Kenny invites Jennifer at the same time to watch a new Netflix series and she goes there instead; John resents this and kills Kenny after his birthday party to get Jennifer to himself. A real person would go through so much reasoning and consequence-thinking before reaching such a conclusion, and killing another person for such a reason is primarily just emotion, and all our actions come from some kind of emotion.

    So some kind of at least basic simulated emotion and consequence-thinking (The Sims style-ish), to get the really interesting drama to come out, would be a good next step; that would be really cool to see as a Smallville 2.

  • @makdiose
    @makdiose 3 months ago

    What would it take to have this mini community available online, like on a website, where visitors like us can view these agents in real time and see what they're doing at that specific moment? Fascinating to see what they're up to next.

  • @lincolnkroll
    @lincolnkroll 4 months ago

    At 24:05 an erroneous result is presented that is accepted as fact by the panel of experts, and is in fact presented when I Google search the same question. Pearls from otters are NOT used in inlays for guitars; rather it is mother-of-pearl, which comes from abalone shell. It is easy to see how the mistake is made, but it illustrates the difficulty of fact-checking AI answers.

  • @markdisney260
    @markdisney260 4 months ago

    Thank you. Interesting. These deep dives into how the media mould minds are fascinating. I may have misunderstood, but the slides suggest that in 2002 more people received their news from TV than the internet. 20 years ago that might have been true, but I'd be shocked if this were so today.

  • @annaf8143
    @annaf8143 5 months ago

    I LOVE this! Please collaborate with Electronic Arts and make a Sims 4 style computer game with similar graphics but generative agents <3

    • @Zanthous_
      @Zanthous_ 16 days ago

      Sorry for the late response, but there are legal issues with the generated content that pose a risk companies won't want to take (users may prompt agents to say dangerous things, e.g. how to create weapons). Aside from that, I saw someone say simulating just 3 agents for an hour was like $8, so costs have to come down an order of magnitude (which might happen soon enough, and starting to develop an app/game now might make sense if not for the legal issues). There is one team working on a game like this right now, based on Animal Crossing, called Campfire - cozy AI villagers. I'm considering making a small game prototype as well.

  • @zoeytala
    @zoeytala 5 months ago

    Thanks a lot for this talk! I have one question regarding the co-occurrence matrix. At 19:45 you said that the content of the matrix is how long the "colors" overlap, in seconds. So my question is: why is there a 2 for blue and grey, if they overlap twice for altogether at least as long as pink and red? Shouldn't grey and blue be at least 3, if not 4? I would greatly appreciate it if you could tell me what I am missing. Thanks a lot!

    • @hervebredin
      @hervebredin 3 months ago

      You are right, that's a mistake on my slide. It does not change the mapping nor the message I was trying to convey, though.
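The overlap bookkeeping debated in this exchange can be made concrete with a short sketch. The segment times below are invented for illustration, and `overlap_duration` is a hypothetical helper, not pyannote's actual API:

```python
# Hypothetical sketch of the co-occurrence computation discussed above:
# each speaker is a list of (start, end) segments in seconds, and the
# matrix entry for a pair of speakers is their total overlap duration.

def overlap_duration(segs_a, segs_b):
    """Total time (seconds) during which segments in segs_a overlap segments in segs_b."""
    total = 0.0
    for a_start, a_end in segs_a:
        for b_start, b_end in segs_b:
            # The overlap of two intervals is clamped at zero when they are disjoint.
            total += max(0.0, min(a_end, b_end) - max(a_start, b_start))
    return total

# Invented segments: "blue" and "grey" overlap twice, for one second each time.
blue = [(0.0, 4.0), (10.0, 12.0)]
grey = [(3.0, 5.0), (11.0, 14.0)]
print(overlap_duration(blue, grey))  # 1s (3-4) + 1s (11-12) = 2.0
```

Measuring seconds of overlap, rather than counting overlapping regions, is why a matrix entry depends on segment lengths, which is the crux of the commenter's arithmetic.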

  • @imenbenamor1367
    @imenbenamor1367 5 months ago

    BA-LR not BA-LHR. Could you please correct it in the title? Thank you

  • @hervebredin5734
    @hervebredin5734 5 months ago

    s/Berdin/Bredin

  • @abdulshabazz8597
    @abdulshabazz8597 6 months ago

    Wow. The CMU engineering program is a phenomenal juggernaut. Such a large body of high-quality research.

  • @AlgoNudger
    @AlgoNudger 6 months ago

    Thanks.

  • @jennyhorner
    @jennyhorner 6 months ago

    Fascinating! I have a little AI staff team I’m trying to learn how to get them to be more independent! A question that comes up for me: Klaus is a dedicated sociology student who has an interest in gentrification. Is this ‘experienced’ as a superficial identity label/label given to explain his activity, or does he see Smallville through the lens of a sociology student? Does he observe things related to gentrification in a way which the other characters wouldn’t even notice?

  • @fitybux4664
    @fitybux4664 6 months ago

    Have you considered allowing them to have money? "Your job just paid your $1500 monthly paycheck. You have a monthly rent of $700." (Either as some sort of disembodied "world character", or with the people doing the charges and payments themselves in the world, such as a landlord/job boss/etc.) I think having negative stimulus can really help things along. "You didn't pay your rent. Now you have to live outside in a cardboard box. Your living conditions are terrible." What if you had a disembodied "world character" that doles out negative stimulus randomly? 😲 ("Today, you got into a car accident.")

  • @nekomatic
    @nekomatic 7 months ago

    I wonder how this experiment would behave on smaller models, i.e. agents would use a specific (specialized?) small model depending on their role, or select from a pool of small models depending on the situation?

    • @Phooenixification
      @Phooenixification 2 months ago

      Wouldn't they lose their individuality then? If two persons are in the same situation at some point, wouldn't they then use the same model and reason similarly? Or what do you mean?

  • @levioptionallastname6749
    @levioptionallastname6749 7 months ago

    Ugh, you beat me to it! In my defense, I am only one person!

  • @user-fd5lf8op6s
    @user-fd5lf8op6s 7 months ago

    That's why Americans are afraid of Asians: they are evil smart.

  • @beautifulmind684
    @beautifulmind684 11 months ago

    wow, surprising finding, achievements/opportunities 😮 so interesting 🧐

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 11 months ago

    wow, way too many questions. hard to follow the presentation.

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 11 months ago

    Audio could be better

  • @Silly.Old.Sisyphus
    @Silly.Old.Sisyphus 11 months ago

    if you can't think it, fake it

  • @eva__4380
    @eva__4380 11 months ago

    Is it possible that the model has seen the data used for these benchmarks during training?

  • @gsm1
    @gsm1 11 months ago

    Thanks for uploading this. However, I noticed that the text in your videos can be a bit hard to read due to the small size and it's somewhat blurry at times. I think your videos would be even better in a higher resolution, perhaps greater than 480p!

  • @zizifn9142
    @zizifn9142 a year ago

    16:00 lol Google uses the OpenAI Playground for the demo.....

  • @bindurao3463
    @bindurao3463 a year ago

    Really helped

  • @bindurao3463
    @bindurao3463 a year ago

    Good work

  • @VerseUtopia
    @VerseUtopia a year ago

    Seems the professor is too young for AI development..

  • @disarmyouwitha
    @disarmyouwitha a year ago

    Ah yes, the timeless topic of emergence and reasoning in large language models! As I rest my weary fingers upon the keyboard, preparing to share my innermost thoughts and wisdom on the subject, it occurs to me that, much like any good lasagna, this particular topic comprises multiple layers of complexity and intrigue. So, let's dive right in, my fellow internet sojourners! First and foremost, credit must be given where credit is due. Mr. Wei's elegant soliloquy on large language models at the prestigious Google headquarters resonates with both the seasoned researcher and the neophyte alike. As a cardinal for the internet comment realm, I must express my gratitude to Jason for regaling us with his insight. Now, one simply cannot discuss large language models without acknowledging their capacity to simulate almost mind-boggling levels of human-like cognition. From composing impressive literary works to identifying penguins in a sea of malformed pixels that only a madman would consider "images," these computational wünderkinds represent the apex of human innovation. Or do they, my dear reader? For, are we not jeopardizing our intellectual sovereignty as we relinquish our authorship to these silicon sages? Potentially. Perhaps. Who's to say, really? Aside from philosophical conundrums, we cannot ignore the computational intricacies of these vivacious virtual virtuosos. The nuance and finesse that constitute their digital DNA, and their thirst for modular knowledge, undeniably place them amongst the most fascinating creations of humankind. Now, as I elucidate the enigmatic world of such prodigious language models, let us not forget the immortal words of Albert Einstein: "Two things are infinite: the universe and a YouTube comment attempting to summarize the complexity of large language models; and I'm not sure about the universe." Ah, such a paragon of wisdom. 
In conclusion, as the night envelops us all in its comforting embrace and my eyelids grow heavier with each passing keystroke, I am reminded that, sometimes, the very answers we seek within the realms of technology transcend the limits of our understanding. Language models shall guide us through the labyrinthine fortress of knowledge. Just like a lighthouse in a stormy sea, they are but humble beacons, pointing us towards our destiny… which hopefully involves making lasagna with a competitive edge in Robot MasterChef. AND POST!

  • @CalculatingMonkey
    @CalculatingMonkey a year ago

    So insightful!! Thanks!!

  • @ellepeterson9992
    @ellepeterson9992 a year ago

    Brilliant man, glad this exists

  • @DistortedV12
    @DistortedV12 a year ago

    I'm going to call out the elephant in the room. This is the stupidest thing I've heard. "yeah it sorta works, but we don't know why"...does this guy have a Ph.D?

  • @fumikaisono4706
    @fumikaisono4706 a year ago

    What is the name of the paper that is mentioned at 32:09?

  • @billykotsos4642
    @billykotsos4642 a year ago

    39:50 Yeah, but it gets extra tricky when the 'reasoning path' is wrong but the final answer is correct!

  • @MichiaLatia
    @MichiaLatia a year ago

    The volume of the speaker is incredibly low. Could not hear anything w/ max volume on speakers.

  • @Pruthvikajaykumar
    @Pruthvikajaykumar a year ago

    Thank you, really helpful

    • @jhuclsp
      @jhuclsp a year ago

      You're welcome!

  • @nintishia
    @nintishia a year ago

    1. Is it possible to lower the scale at which emergence occurs by choosing a suitable architecture? 2. Is there a possibility that we decompose large language models into parts that deal with syntax, knowledge and reasoning?

  • @andrewparsons1041
    @andrewparsons1041 a year ago

    Below is an approximate chapter index.

    Introduction 0:00
    CLSP and Faculty 3:22
    Admissions Process 9:47
    Admissions Process: Availability 9:55
    Admissions Process: Stats 10:30
    Admissions Process: Triage 11:00
    Admissions Process: Faculty Review 11:31
    Admissions Process: Interviews 12:30
    Admissions Process: Visit Day 13:30
    Admissions Process: Rolling Admissions 14:20
    Application Review 15:36
    Application Review: The Cons of the Common Approach used by Other Institutions 18:10
    Application Review: Normalization 20:10
    Application: Items in Order of Importance 21:37
    Application: Letters of Recommendation 22:07
    Application: Statements 25:20
    Common Misconceptions 30:30
    Vivien Thomas Scholars Initiative 35:10
    Application help 37:17
    Question: Deadlines 38:30
    Question: Summer undergraduate research opportunities? 40:17
    Question: Do I need to major in computer science? 41:31
    Question: Do I need to publish before applying? 42:26
    Question: Evaluation of applicants with master's degrees 43:01
    Question: How does funding work? 44:00
    Question: GRE and fee waiver 46:52
    Question: Relation to other departments and faculty 47:11
    Question: Statement, research interest 48:23
    Question: How many international vs. domestic applicants? 50:53
    Question: Disability accommodations 52:04
    Question: Other questions? 53:07
    Question: Time to graduate? 54:25
    Question: State of the market (academia, industry)? 57:16
    Question: Should I notify the admissions committee of my achievements which occurred after submission? 59:57
    Question: Personal Statement vs. Statement of Purpose 1:03:00
    Question: How much math and programming do you need? 1:05:23
    Question: What are good attributes for an applicant coming from arts and humanities? 1:08:08
    Question: Which faculty are taking students? 1:11:11
    Question: Will I be rejected if I have a specific research area not available at JHU (this year)? 1:14:00
    Question: Will I get an email if I am rejected? 1:16:40
    Question: How should I prepare for interviews? 1:17:48
    Thanks and goodbye 1:20:16