Lecture 10: Evaluation of Language Models, Basic Smoothing

  • Published 25 Aug 2024

COMMENTS • 16

  • @pawanchoure1289 · 2 years ago · +1

    One solution to probability density estimation is Maximum Likelihood Estimation, or MLE for short. It involves defining a parameter theta that specifies both the choice of probability density function and the parameters of that distribution, and then picking the value of theta that maximizes the likelihood of the observed data.
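
A minimal sketch of the idea in Python (not from the lecture): for a unigram language model, theta is just the vector of word probabilities, and the maximum likelihood solution has the closed form of relative frequency.

    from collections import Counter

    def mle_unigram(tokens):
        """MLE of a unigram model: theta_w = count(w) / total tokens."""
        counts = Counter(tokens)
        total = sum(counts.values())
        return {w: c / total for w, c in counts.items()}

    theta = mle_unigram("the cat sat on the mat".split())
    print(theta["the"])  # 2/6 ≈ 0.333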

  • @pawanchoure1289 · 2 years ago · +1

    Traditionally, language model performance is measured by perplexity, cross-entropy, and bits-per-character (BPC). As language models are increasingly being used as pre-trained models for other NLP tasks, they are often also evaluated based on how well they perform on downstream tasks.
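
As a rough illustration of how these three metrics relate (toy numbers, not from the lecture): perplexity is 2 raised to the cross-entropy in bits per token, and bits-per-character is the same negative log-likelihood divided by the character count instead of the token count.

    import math

    def cross_entropy_bits(token_probs):
        """Average negative log2 probability per token."""
        return -sum(math.log2(p) for p in token_probs) / len(token_probs)

    probs = [0.1, 0.2, 0.05]       # made-up model probabilities of each test token
    H = cross_entropy_bits(probs)  # cross-entropy, in bits per token
    print(2 ** H)                  # perplexity = 2^H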

  • @louerleseigneur4532 · 4 years ago · +2

    Thanks sir

  • @pawanchoure1289 · 2 years ago

    Perplexity is a measurement of how well a probability model predicts a sample. In the context of Natural Language Processing, perplexity is one way to evaluate language models.

  • @pawanchoure1289 · 2 years ago

    A 2-gram (or bigram) is a two-word sequence, like “I love”, “love reading”, or “Analytics Vidhya”, and a 3-gram (or trigram) is a three-word sequence, like “I love reading”, “about data science”, or “on Analytics Vidhya”.
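
A minimal sketch of n-gram extraction (Python; the function name is mine):

    def ngrams(tokens, n):
        """All contiguous n-word sequences in a token list."""
        return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

    words = "I love reading about data science".split()
    print(ngrams(words, 2))  # bigrams: ('I', 'love'), ('love', 'reading'), ...
    print(ngrams(words, 3))  # trigrams: ('I', 'love', 'reading'), ...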

  • @pawanchoure1289 · 2 years ago

    In information theory, perplexity is a measurement of how well a probability distribution or probability model predicts a sample. It may be used to compare probability models. A low perplexity indicates the probability distribution is good at predicting the sample.

  • @pawanchoure1289 · 2 years ago

    The Shannon Visualization Method:
    1. Choose a random bigram (<s>, w) according to its probability.
    2. Now choose a random bigram (w, x) according to its probability.
    3. And so on, until we choose </s>.
    4. Then string the words together.
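
A minimal sketch of this sampling procedure (Python; the bigram probabilities are made up for illustration):

    import random

    # Toy bigram model: P(next word | current word).
    bigrams = {
        "<s>":     {"I": 0.6, "the": 0.4},
        "I":       {"love": 1.0},
        "love":    {"reading": 0.5, "</s>": 0.5},
        "reading": {"</s>": 1.0},
        "the":     {"cat": 1.0},
        "cat":     {"</s>": 1.0},
    }

    def shannon_sentence(model):
        word, out = "<s>", []
        while True:
            # Steps 1-2: pick the next bigram according to its probability.
            nxt = random.choices(list(model[word]), weights=model[word].values())[0]
            if nxt == "</s>":          # Step 3: stop at the sentence-end token.
                return " ".join(out)   # Step 4: string the words together.
            out.append(nxt)
            word = nxt

    print(shannon_sentence(bigrams))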

  • @pawanchoure1289 · 2 years ago

    The term smoothing refers to the adjustment of the maximum likelihood estimator of a language model so that it will be more accurate. ... When estimating a language model based on a limited amount of text, such as a single document, smoothing of the maximum likelihood model is extremely important.
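
The simplest such adjustment is add-one (Laplace) smoothing; a minimal sketch in Python (names are mine):

    from collections import Counter

    def laplace_unigram(tokens, vocab):
        """Add one to every vocabulary word's count, so unseen words
        get nonzero probability instead of the MLE's zero."""
        counts = Counter(tokens)
        total = sum(counts.values()) + len(vocab)
        return {w: (counts[w] + 1) / total for w in vocab}

    p = laplace_unigram("the cat sat".split(), vocab={"the", "cat", "sat", "dog"})
    print(p["dog"])  # (0 + 1) / (3 + 4) ≈ 0.143, not zero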

  • @pawanchoure1289 · 2 years ago

    Perplexity is the inverse probability of the test set, normalized by the number of words. In the case of unigrams: PP(W) = P(w_1 w_2 … w_N)^(-1/N) = (Π_i 1/P(w_i))^(1/N). Now, suppose you have already constructed the unigram model, meaning that for each word you have the relevant probability.
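
A minimal sketch of that computation (Python; the probabilities are made up), working in log space for numerical stability:

    import math

    def perplexity(test_tokens, unigram_probs):
        """PP(W) = (product over i of 1/P(w_i))^(1/N)."""
        N = len(test_tokens)
        log_sum = sum(math.log(unigram_probs[w]) for w in test_tokens)
        return math.exp(-log_sum / N)

    model = {"the": 0.4, "cat": 0.3, "sat": 0.3}     # made-up unigram probabilities
    print(perplexity("the cat sat".split(), model))  # ≈ 3.03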

  • @pawanchoure1289 · 2 years ago

    What is extrinsic and intrinsic evaluation?
    In an intrinsic evaluation, the quality of an NLP system's outputs is evaluated against predetermined ground truth (reference text), whereas an extrinsic evaluation aims to evaluate the outputs based on their impact on the performance of other NLP systems.

  • @pawanchoure1289 · 2 years ago

    Unigram prior smoothing: instead of adding the same constant to every bigram count, add pseudo-counts in proportion to each word's unigram probability: P(w_i | w_{i-1}) = (c(w_{i-1} w_i) + m · P(w_i)) / (c(w_{i-1}) + m).
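
A minimal sketch of that formula for bigrams (Python; the pseudo-count m and the helper names are mine):

    from collections import Counter

    def unigram_prior_bigram(tokens, m=1.0):
        """P(w2 | w1) = (c(w1 w2) + m * P(w2)) / (c(w1) + m),
        so unseen bigrams fall back on the unigram distribution."""
        uni = Counter(tokens)
        bi = Counter(zip(tokens, tokens[1:]))
        total = len(tokens)
        return lambda w1, w2: (bi[(w1, w2)] + m * uni[w2] / total) / (uni[w1] + m)

    p = unigram_prior_bigram("the cat sat on the mat".split())
    print(p("the", "cat"))  # seen bigram
    print(p("the", "sat"))  # unseen bigram still gets mass from the prior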

  • @divyanshukumar2605 · 3 years ago · +5

    Never goes into depth on any concept; he just says a bunch of technical words without explaining them explicitly, and even the explanations are copied word for word from Dan Jurafsky's lectures.

  • @sumonchakrabarty6805 · 1 year ago · +3

    Worst teacher I have ever seen in my life. He doesn't even know English properly, and his vocabulary is worse. Professors like this should be fired from the IITs immediately. They are polluting the teaching process...

  • @divyanshukumar2605 · 3 years ago · +1

    A third-grade teacher; he should be teaching 5th graders.