Fine Tuning BERT for Named Entity Recognition (NER) | NLP | Data Science | Machine Learning

  • Published 3 Oct 2024
  • 🔥🐍 Check out the MASSIVELY UPGRADED 2nd Edition of my Book (with 1300+ pages of Dense Python Knowledge) covering 350+ Python 🐍 core concepts
    🟠 Book Link - rohanpaul.gumr...
    ---------------------
    Hi, I am a Machine Learning Engineer | Kaggle Master. Connect with me on 🐦 TWITTER: / rohanpaul_ai - for daily in-depth coverage of Machine Learning / LLM / OpenAI / LangChain / Python Intricacies Topics.
    Code in GitHub - github.com/roh...
    ======================
    You can find me here:
    **********************************************
    🐦 TWITTER: / rohanpaul_ai
    👨🏻‍💼 LINKEDIN: / rohan-paul-ai
    👨‍🔧 Kaggle: www.kaggle.com...
    👨‍💻 GITHUB: github.com/roh...
    🧑‍🦰 Facebook Page: / rohanpaulai
    📸 Instagram: / rohan_paul_2020
    **********************************************
    Other playlists you might like 👇
    🟠 Natural Language Processing with Deep Learning : bit.ly/3P6r2CL
    🟠 MachineLearning & DeepLearning Concepts & interview Question Playlist - bit.ly/380eYDj
    🟠 DataScience | MachineLearning Projects Implementation Playlist - bit.ly/39MEigt
    🟠 ComputerVision / DeepLearning Algorithms Implementation Playlist - bit.ly/36jEvpI
    #NLP #machinelearning #datascience #textprocessing #kaggle #tensorflow #pytorch #deeplearning #deeplearningai #100daysofmlcode #pythonprogramming #100DaysOfMLCode

COMMENTS • 48

  • @ashleypaloalto3259
    @ashleypaloalto3259 1 year ago +1

    This is a great video, Rohan. I really learned how to implement NER, and for smaller private datasets (which is mostly the case for corporate proprietary data), using BERT for NER is great value for money. No LLM required here. Thanks again for the video.

  • @doyleBellamy03
    @doyleBellamy03 7 months ago

    After watching this video alone, I now grasp the basic idea behind NER. Thank you; keep it up.

  • @noorhassanwazir8133
    @noorhassanwazir8133 1 year ago

    Explained very well with a very clear voice.

  • @Brahma2012
    @Brahma2012 1 year ago

    This is a very useful, unique and insightful video on NER using the BERT model. Thank you, Rohan.

    • @RohanPaul-AI
      @RohanPaul-AI  1 year ago

      Very happy to know that, Brahma. Thanks for watching.

  • @shivakiranreddy4654
    @shivakiranreddy4654 1 year ago +5

    How do we create our own new entity?

  • @Raaj_ML
    @Raaj_ML 2 months ago

    Great tutorial! Would you recommend BERT over spaCy custom NER, even though the data prep here looks complex?

    • @RohanPaul-AI
      @RohanPaul-AI  2 months ago +1

      Hi Rajarams,
      This is an interesting question. For smaller projects, or if you don't have enough data or resources, I would recommend using spaCy for NER.
      Another point is the type of data. If your text is very domain-specific, BERT may be suitable, because you will need intensive fine-tuning/retraining. Even in that case, if you have enough data but limited resources, I would still use spaCy.

    • @Raaj_ML
      @Raaj_ML 2 months ago

      @RohanPaul-AI Thanks for that quick clarification. Just wanted to confirm: LLMs like GPT, Llama, etc. are not used much for NER, right?

    • @RohanPaul-AI
      @RohanPaul-AI  2 months ago +1

      They are indeed used as well. To give an example, I did a video on a company, UBIAI, that uses GPT 3.5 / 4 for NER:
      ua-cam.com/video/AEtjSIdjwlQ/v-deo.html

  • @softwine91
    @softwine91 1 year ago +4

    That was the best tutorial that I came across for NER with transformers.
    Thank you very much.
    Please keep sharing your great content in NLP.

  • @TheMaxxit
    @TheMaxxit 1 year ago

    very helpful thank you

  • @arunp1291
    @arunp1291 4 months ago

    Exceptional overview of Named Entity Recognition (NER) using the BERT model. It's incredibly informative and offers valuable insights.
    Quick question:
    Is there any alternative to the tokenize_and_align_labels function, or to mapping to the conll2003 format, when I already have tokenized inputs and NER tags available?
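
A minimal sketch related to the question above, not taken from the video. BERT's tokenizer splits words into subwords, so word-level tags usually still need to be expanded to one label per subword even when the input is already split into words; a CoNLL file is not required for that. With a fast Hugging Face tokenizer, the word indices can be obtained from `tokenizer(words, is_split_into_words=True).word_ids()`. The helper name `align_labels` and the example values are hypothetical, and the word indices are hard-coded here so the snippet runs without downloading a model.

```python
from typing import List, Optional

# Label value that PyTorch's cross-entropy loss ignores by default.
IGNORE_INDEX = -100


def align_labels(word_ids: List[Optional[int]], word_labels: List[int],
                 label_all_subwords: bool = False) -> List[int]:
    """Expand word-level tag ids to one label per subword token."""
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            # Special tokens such as [CLS] and [SEP] belong to no word.
            aligned.append(IGNORE_INDEX)
        elif word_id != previous:
            # The first subword of a word carries the word's tag.
            aligned.append(word_labels[word_id])
        else:
            # Later subwords are either ignored or given the same tag.
            aligned.append(word_labels[word_id] if label_all_subwords else IGNORE_INDEX)
        previous = word_id
    return aligned


# Hypothetical example: three words, the last one split into two subwords.
word_ids = [None, 0, 1, 2, 2, None]
word_labels = [1, 0, 5]
print(align_labels(word_ids, word_labels))  # [-100, 1, 0, 5, -100, -100]
```

Whether to label only the first subword or all subwords is a design choice; the sketch exposes it as the `label_all_subwords` flag.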

  • @saurabhmukherjee9757
    @saurabhmukherjee9757 1 month ago

    Very, very nice. Please keep uploading more.

  • @jipabr
    @jipabr 1 year ago +1

    Great video. I have a question: is there an easy way of creating a custom NER dataset (with new labels)?

    • @RohanPaul-AI
      @RohanPaul-AI  1 year ago +1

      For that, the standard workflow is:
      Data collection (sources could be news articles, social media, websites, or any other text relevant to your domain).
      Define your annotation guidelines for annotating the entities in your text, e.g. the new labels you want to use, examples of each label, and instructions on how to handle ambiguous cases.
      For data annotation, you can annotate manually or use an annotation tool (like Prodigy, Doccano, brat, GATE, etc.).
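
As an illustration of what comes after annotation, here is a minimal sketch that converts character-offset spans into token-level BIO tags with new labels. It assumes the annotation tool exports each entity as (start, end, label) character offsets and that whitespace tokenization is good enough; the function name `spans_to_bio`, the `JOB_TITLE` label, and the example sentence are all hypothetical.

```python
from typing import List, Tuple


def spans_to_bio(text: str, spans: List[Tuple[int, int, str]]):
    """Whitespace-tokenize text and assign a BIO tag to each token."""
    tokens, tags = [], []
    position = 0
    for token in text.split():
        # Locate the token in the original string to get its character offsets.
        start = text.index(token, position)
        end = start + len(token)
        position = end
        tag = "O"
        for span_start, span_end, label in spans:
            if start >= span_start and end <= span_end:
                # The token that opens a span gets B-, the rest get I-.
                tag = ("B-" if start == span_start else "I-") + label
                break
        tokens.append(token)
        tags.append(tag)
    return tokens, tags


# Hypothetical annotated example with a custom JOB_TITLE label.
text = "Priya Nair joined Acme Corp as Data Engineer"
spans = [(0, 10, "PER"), (18, 27, "ORG"), (31, 44, "JOB_TITLE")]
tokens, tags = spans_to_bio(text, spans)

# Map string tags to integer ids, the form a token-classification model trains on.
label_list = sorted(set(tags))
label2id = {label: i for i, label in enumerate(label_list)}
tag_ids = [label2id[tag] for tag in tags]

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

In practice the label list would be built once over the whole dataset rather than per sentence, so that ids stay consistent.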

  • @gnanaprakashm843
    @gnanaprakashm843 1 year ago +1

    How can I use custom CSV datasets for training?

    • @XavierPhilipponneau
      @XavierPhilipponneau 1 year ago

      Very good question. I'm trying to reproduce this example and I still don't know how to transform the chunk tags from spaCy (or any other library) into the same format. I heard the chunk tags are related to the model you will use.
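
One possible approach to the CSV question, sketched under assumptions: the file has one token per row with hypothetical columns `sentence_id`, `word`, and `tag`, and rows of the same sentence are contiguous. For NER fine-tuning only the tokens and their NER tags should be needed; in conll2003 the `pos_tags` and `chunk_tags` columns are separate annotations. The inline CSV text stands in for a real file.

```python
import csv
import io
from itertools import groupby

# Hypothetical CSV content; with a real file, pass open("train.csv", newline="") instead.
CSV_TEXT = """sentence_id,word,tag
1,Asha,B-PER
1,lives,O
1,in,O
1,Kolkata,B-LOC
2,Acme,B-ORG
2,Corp,I-ORG
2,hired,O
2,her,O
"""


def read_sentences(handle):
    """Group token rows into sentences with parallel token and tag lists."""
    rows = csv.DictReader(handle)
    sentences = []
    # groupby relies on rows of one sentence being adjacent in the file.
    for _, group in groupby(rows, key=lambda row: row["sentence_id"]):
        group = list(group)
        sentences.append({
            "tokens": [row["word"] for row in group],
            "ner_tags": [row["tag"] for row in group],
        })
    return sentences


sentences = read_sentences(io.StringIO(CSV_TEXT))

# Build the label vocabulary and replace string tags with integer ids.
label_list = sorted({tag for s in sentences for tag in s["ner_tags"]})
label2id = {label: i for i, label in enumerate(label_list)}
for s in sentences:
    s["ner_tags"] = [label2id[tag] for tag in s["ner_tags"]]

print(label_list)
print(sentences[0])
```

The resulting list of dictionaries has the same `tokens` / `ner_tags` shape as conll2003 examples, so it could then be wrapped in a Hugging Face dataset (for example with something like `datasets.Dataset.from_list`) and passed through the same tokenization step.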

  • @KhaliDALKhafaji
    @KhaliDALKhafaji 11 months ago +1

    How do I get the dataset?

    • @RohanPaul-AI
      @RohanPaul-AI  11 months ago +1

      huggingface.co/datasets/conll2003

    • @KhaliDALKhafaji
      @KhaliDALKhafaji 11 months ago

      @RohanPaul-AI Thanks a lot, my friend, I really appreciate it.

  • @captainmonkey4605
    @captainmonkey4605 1 year ago

    The variables label_list and metric used in compute_metrics() seem to exist only in the global context. Is there any way to pass them into compute_metrics()?
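
One common way to avoid the globals is a closure (or `functools.partial`): a factory receives `label_list` and the metric and returns a function that takes the single evaluation argument the trainer passes in. The sketch below is an assumption-laden illustration, not the video's code: `make_compute_metrics` and `token_accuracy` are hypothetical names, the metric is a simple stand-in, and plain nested lists are used in place of the arrays a trainer would supply.

```python
from typing import Callable, Dict, List


def token_accuracy(predictions: List[List[str]], references: List[List[str]]) -> Dict[str, float]:
    """Stand-in metric: fraction of tokens whose predicted tag matches the gold tag."""
    pairs = [(p, r) for ps, rs in zip(predictions, references) for p, r in zip(ps, rs)]
    correct = sum(p == r for p, r in pairs)
    return {"accuracy": correct / len(pairs)}


def make_compute_metrics(label_list: List[str], metric_fn: Callable) -> Callable:
    """Return a compute_metrics function that closes over label_list and metric_fn."""

    def compute_metrics(eval_pred):
        logits, labels = eval_pred
        all_preds, all_golds = [], []
        for sent_logits, sent_labels in zip(logits, labels):
            pred_tags, gold_tags = [], []
            for token_logits, label_id in zip(sent_logits, sent_labels):
                if label_id == -100:
                    # Skip special tokens and ignored subwords.
                    continue
                # Argmax over the class scores of this token.
                best = max(range(len(token_logits)), key=token_logits.__getitem__)
                pred_tags.append(label_list[best])
                gold_tags.append(label_list[label_id])
            all_preds.append(pred_tags)
            all_golds.append(gold_tags)
        return metric_fn(all_preds, all_golds)

    return compute_metrics


# Hypothetical toy batch: one sentence, four tokens, two classes.
compute_metrics = make_compute_metrics(["O", "B-PER"], token_accuracy)
logits = [[[0.1, 0.9], [2.0, -1.0], [0.3, 0.2], [0.0, 1.0]]]
labels = [[-100, 0, 1, 1]]
result = compute_metrics((logits, labels))
print(result)
```

The returned function could then be passed as `compute_metrics=make_compute_metrics(label_list, metric_fn)` when building the trainer, with a real sequence-labeling metric substituted for the stand-in.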

  • @alwanrahmanasubian471
    @alwanrahmanasubian471 9 months ago

    Should the dataset be formatted the way you showed in the video?

  • @muskanrath7125
    @muskanrath7125 4 months ago

    I want to use an LLM to extract job titles from text. Could you please suggest which LLM to use for this? I used a fine-tuned xlm-roberta model, but it only identifies Person, Organisation and Location.

    • @RohanPaul-AI
      @RohanPaul-AI  4 months ago

      mDebertaV3 or XLM-ROBERTA are great for named entity recognition. Before fine-tuning, you would probably have to label on the order of 50-100 samples for this. The more, the better.
      If you want to use an LLM because you don't have training data to fine-tune an encoder, this one was built specifically for entity extraction: github.com/hitz-zentroa/GoLLIE

  • @VladimirIglovikov-fn5mr
    @VladimirIglovikov-fn5mr 1 year ago +1

    Great in-depth coverage here, thanks. Also your informative Twitter posts are my daily go-to for learning. Please keep the invaluable content coming!

  • @nandkishoreswami1267
    @nandkishoreswami1267 10 months ago

    How can I use a GPU for training here?

  • @opusdei1151
    @opusdei1151 11 months ago

    How much does it cost to fine tune such a model?

  • @surajmishra3330
    @surajmishra3330 7 months ago

    Please upload a video explaining NER with the pre-trained Hugging Face model Universal-NER/UniNER-7B-all.

  • @maximocuadros279
    @maximocuadros279 1 year ago

    Super nice. Learned NER within 40 minutes. Thanks, man 👏👏

  • @lovingdata8362
    @lovingdata8362 1 year ago

    Beautiful Tutorial here. Just detailed enough for the topic. Thanks Rohan.

  • @mrityunjaykarmankar9239
    @mrityunjaykarmankar9239 1 year ago

    Which IDE are you using?

  • @LearningWorldChatGPT
    @LearningWorldChatGPT 2 years ago

    Too fantastic!
    Thank you very much for sharing your knowledge.

    • @RohanPaul-AI
      @RohanPaul-AI  2 years ago

      Very happy to know you liked it, Eddy. Thanks for the appreciation.