Fine Tuning BERT for Named Entity Recognition (NER) | NLP | Data Science | Machine Learning
- Published 3 Oct 2024
- 🔥🐍 Check out the MASSIVELY UPGRADED 2nd Edition of my Book (with 1300+ pages of Dense Python Knowledge) covering 350+ Python 🐍 core concepts
🟠 Book Link - rohanpaul.gumr...
---------------------
Hi, I am a Machine Learning Engineer | Kaggle Master. Connect with me on 🐦 TWITTER: / rohanpaul_ai - for daily in-depth coverage of Machine Learning / LLM / OpenAI / LangChain / Python intricacies.
Code in GitHub - github.com/roh...
======================
You can find me here:
**********************************************
🐦 TWITTER: / rohanpaul_ai
👨🏻💼 LINKEDIN: / rohan-paul-ai
👨🔧 Kaggle: www.kaggle.com...
👨💻 GITHUB: github.com/roh...
🧑🦰 Facebook Page: / rohanpaulai
📸 Instagram: / rohan_paul_2020
**********************************************
Other Playlist you might like 👇
🟠 Natural Language Processing with Deep Learning : bit.ly/3P6r2CL
🟠 MachineLearning & DeepLearning Concepts & interview Question Playlist - bit.ly/380eYDj
🟠 DataScience | MachineLearning Projects Implementation Playlist - bit.ly/39MEigt
🟠 ComputerVision / DeepLearning Algorithms Implementation Playlist - bit.ly/36jEvpI
#NLP #machinelearning #datascience #textprocessing #kaggle #tensorflow #pytorch #deeplearning #deeplearningai #100daysofmlcode #pythonprogramming #100DaysOfMLCode
This is a great video, Rohan. I have really learned how to implement NER now, and for smaller private datasets (which are mostly the case for corporate proprietary data), using BERT for NER is great value for money. No LLM required here. Thanks again for the video.
Glad it was helpful, Ashley!
After watching this video alone, I now grasp the basic idea behind NER. Thank you; keep it up.
Great to hear that!
Explained very well with a very clear voice.
Great to know it was helpful, @noorhassanwazir8133.
@@RohanPaul-AI 👏👏❤
This is a very useful, unique and insightful video on NER using BERT Model. Thank you Rohan
Very happy to hear that, Brahma. Thanks for watching.
How do we create our own new entity?
Great tutorial!! Would you recommend BERT over spaCy custom NER, though the data prep here looks complex?
Hi Rajarams,
That's an interesting question. For smaller projects, or if you don't have enough data or resources, I would recommend using spaCy for NER.
Another point is the type of data: if your textual data is very domain-specific, then BERT may be suitable, as you will need intensive fine-tuning/retraining. Even in that case, though, if you have enough data but limited resources, I would still use spaCy.
@@RohanPaul-AI Thanks for that quick clarification. Just wanted to confirm: LLMs like GPT, Llama, etc. are not much used for NER, right?
They are indeed also used. To give an example, I did a video on a company, UBIAI, which uses GPT 3.5 / 4 for NER:
ua-cam.com/video/AEtjSIdjwlQ/v-deo.html
That was the best tutorial that I came across for NER with transformers.
Thank you very much.
Please keep sharing your great content in NLP.
Thanks, will do!
Very helpful, thank you!
Thanks for watching Max.
Exceptional overview of Named Entity Recognition (NER) using the BERT model. It's incredibly informative and offers valuable insights.
Quick question:
Is there any alternative to the `tokenize_and_align_labels` function, or to mapping into the conll2003 format, when I already have tokenized inputs and NER tags available?
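(For anyone wondering the same thing: if your words and word-level tags already exist, the only remaining step is spreading those tags across the sub-word pieces the BERT tokenizer produces. A minimal sketch of that alignment logic, assuming the `word_ids()` output of a HuggingFace fast tokenizer and integer word-level labels; the function name here is illustrative, not from the video:)

```python
def align_labels(word_ids, word_labels, label_all_subwords=False):
    """Map word-level NER labels onto sub-word tokens.

    word_ids: the tokenizer's word_ids() output (None for special tokens).
    word_labels: one integer tag per original word.
    Special tokens and, by default, non-first sub-word pieces get -100,
    which the cross-entropy loss ignores.
    """
    labels = []
    prev = None
    for wid in word_ids:
        if wid is None:                 # [CLS], [SEP], padding
            labels.append(-100)
        elif wid != prev:               # first sub-token of a word
            labels.append(word_labels[wid])
        else:                           # later sub-tokens of the same word
            labels.append(word_labels[wid] if label_all_subwords else -100)
        prev = wid
    return labels

# e.g. word_ids [None, 0, 1, 1, 2, None] with word labels [3, 0, 7]
# -> [-100, 3, 0, -100, 7, -100]
```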
Very, very nice. Please keep uploading more.
Great video. I have a question, is there an easy way of creating a custom ner dataset (with new labels)?
For that, the standard workflow is:
- Data collection (sources could be news articles, social media, websites, or any other text relevant to your domain).
- Define your annotation guidelines for annotating the entities in your text, e.g. specifying the new labels you want to use, examples of each label, and instructions on how to handle ambiguous cases.
- Data annotation: you can do it manually or use an annotation tool (like Prodigy, Doccano, brat, GATE, etc.).
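(Most annotation tools export character-level spans, which then have to be converted into token-level BIO tags before training. A rough sketch of that conversion, assuming tokens carry character offsets and entity spans are contiguous; the function and data layout here are illustrative, not from the video:)

```python
def spans_to_bio(tokens, spans):
    """Convert character-level entity spans to BIO tags.

    tokens: list of (text, char_start, char_end) tuples.
    spans:  list of (char_start, char_end, label) tuples from the annotator.
    """
    tags = ["O"] * len(tokens)
    for s_start, s_end, label in spans:
        inside = False                       # first token of the span -> B-, rest -> I-
        for i, (_, t_start, t_end) in enumerate(tokens):
            if t_start >= s_start and t_end <= s_end:
                tags[i] = ("I-" if inside else "B-") + label
                inside = True
    return tags
```

For example, the sentence "Rohan Paul lives in Delhi" with a PER span over characters 0-10 and a LOC span over 20-25 yields `["B-PER", "I-PER", "O", "O", "B-LOC"]`.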
How do we use custom CSV datasets to train?
Very good question. I'm trying to reproduce this example and I still don't know how to transform the chunk tags from spaCy (or any other library) into the same format. I heard the chunk tags are related to the model you will use.
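(One common layout that makes this easy is a token-per-row CSV with a sentence-id column; grouping it back into conll2003-style examples is then a few lines of stdlib code. A sketch assuming that layout, with hypothetical column names; the resulting list can be wrapped with `datasets.Dataset.from_list` for training:)

```python
import csv
import io

def csv_to_examples(csv_text):
    """Group a token-per-row CSV (sentence_id, token, tag) into
    conll2003-style examples: {"tokens": [...], "ner_tags": [...]}."""
    examples = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        ex = examples.setdefault(row["sentence_id"],
                                 {"tokens": [], "ner_tags": []})
        ex["tokens"].append(row["token"])
        ex["ner_tags"].append(row["tag"])
    return list(examples.values())   # insertion order preserved (Py 3.7+)

data = """sentence_id,token,tag
0,Rohan,B-PER
0,tweets,O
1,Delhi,B-LOC
"""
# csv_to_examples(data) gives two examples, one per sentence_id,
# which you could then pass to datasets.Dataset.from_list(...)
```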
how to get the dataset
huggingface.co/datasets/conll2003
Thanks a lot my friend, I really appreciate it @@RohanPaul-AI
The variables `label_list` and `metric` in the compute_metrics() function seem to appear only in the global context. Is there any way to pass them into compute_metrics()?
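(One clean way to avoid the globals is a factory function that closes over those variables; the Trainer only requires a callable taking `eval_pred`. A sketch, with the actual metric computation stubbed out since it is specific to the notebook:)

```python
def make_compute_metrics(label_list, metric):
    """Return a compute_metrics callable that captures label_list and
    metric in a closure instead of reading them from module globals."""
    def compute_metrics(eval_pred):
        predictions, labels = eval_pred
        # ... same body as in the notebook, using the captured names;
        # a stand-in return value replaces metric.compute(...) here.
        return {"num_labels": len(label_list)}
    return compute_metrics

# Then pass the result to the Trainer:
# trainer = Trainer(..., compute_metrics=make_compute_metrics(label_list, metric))
```

`functools.partial(compute_metrics, label_list=label_list, metric=metric)` works just as well if you prefer keeping the extra arguments explicit in the signature.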
Does the dataset have to be formatted the way you showed in the video?
I wanted to use an LLM to extract job titles from text. Could you please suggest which LLM should be used to extract job titles? I used a fine-tuned xlm-roberta model, but it only identifies Person, Organisation and Location.
mDebertaV3 or XLM-RoBERTa is great for named entity recognition. Before fine-tuning, you would probably have to label on the order of 50-100 samples for this. The more the better.
If you want to use an LLM because you don't have training data to fine-tune an encoder, this one was built specifically for entity extraction: github.com/hitz-zentroa/GoLLIE
Great in-depth coverage here, thanks. Also your informative Twitter posts are my daily go-to for learning. Please keep the invaluable content coming!
Thanks, will do!
How can I use a GPU for training here?
How much does it cost to fine tune such a model?
Please upload a video explaining NER with the pre-trained Hugging Face model Universal-NER/UniNER-7B-all.
Will try.
Super nice. Learned NER within 40 minutes. Thanks, man 👏👏
Glad you liked it!
Beautiful Tutorial here. Just detailed enough for the topic. Thanks Rohan.
Most welcome!
Which IDE are you using?
It's VS Code.
Too fantastic!
Thank you very much for sharing your knowledge.
Very happy to know you liked it, Eddy. Thanks for the appreciation.