Yes! Please, definitely make a second part. I teach in the Humanities (college literature and creative writing classes), and I'm actively searching for tools I can use for creative experiments with texts.
Check out Microsoft Power Automate AI Builder
@@NickWindham hey is the second part out yet?
Hey, I'm not getting the expected output at 1:57:26. It's showing KeyError: 0. Can you help me with that?
Where do you teach?
Bangers one after another. This channel is a treasure.
46:00
# Print tokens and their part-of-speech tags
print("Tokens and their POS tags:")
for token in doc:
    print(f"{token.text}: {token.pos_}")

# Print sentences
print('\nSentences:')
for sent in doc.sents:
    print(sent)

# Print named entities
print("\nNamed Entities:")
for ent in doc.ents:
    print(f"{ent.text} ({ent.label_})")
45:04
*Named Entity Recognition:*
A named entity is a real-world object that you can refer to by a proper name: a person, organization, location, or similar. Named entities matter in NLP because they identify the people, places, and organizations a text is talking about.

doc = nlp(u'I have flown to Islamabad. Now I am flying to Lahore.')
for token in doc:
    if token.ent_type != 0:  # a token whose ent_type attribute is non-zero
        print(token.text, token.ent_type_)  # is part of a named entity
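A shorter way to get the same information, for anyone following along, is the Doc's built-in ents property (standard spaCy API, nothing assumed beyond a loaded pipeline with NER):

```python
# doc.ents yields full entity spans rather than individual tokens,
# so multi-word names come back as a single span with one label.
for ent in doc.ents:
    print(ent.text, ent.label_)
```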
Clicked this video by accident but got hypnotised by the shirt and now I'm learning Python.
doc = nlp(u'A severe storm hit the beach. It started to rain.')

# Three ways to list the tokens: per sentence (two variants) and for the whole doc
for sent in doc.sents:
    print([sent[i] for i in range(len(sent))])
for sent in doc.sents:
    print([word for word in sent])
print([doc[i] for i in range(len(doc))])

# Check if the first word of the second sentence is a pronoun
for i, sent in enumerate(doc.sents):
    if i == 1 and sent[0].pos_ == 'PRON':
        print('The second sentence begins with a pronoun.')

# Count the sentences that end with a verb
# (sent[-2] skips the final punctuation token)
counter = 0
for sent in doc.sents:
    if sent[-2].pos_ == 'VERB':
        counter += 1
print(f'{counter} sentence(s) in the document end with a verb.')
50 minutes in and it is already the best practical explanation of how spaCy works.
Best helpline for those who really want to learn NLP with ease and for free, can't wait for part 2
this is absurd, opened yt for NLP videos and it was uploaded 1 sec ago.
Happened to me for a deep learning course.
@@avnishpanwar9502 this channel is a boon
Destiny
I started building a personal NLP agent and this was immediately recommended
Wild
I’ve been watching Dr. Mattingly’s other videos and they’re great.
I’ve come back to this video several times. The ONLY tutorial I’ve seen which walks through the whole process. The Python Tutorials for the Digital Humanities videos are also great. I am focused on biomedical text, but text is text when you are trying to get started.
I was searching for spaCy tutorials yesterday, and FCC uploaded it - thank you 💝. Interested in part 2.
Hi, can you tell me where I can find the repository for the data?
I can't believe such good content is for free, thank you.
Thank you so much. The best course on spaCy I have found. Please make part two! We are waiting for it!
so much value! thanks for making this material available for free. Incredible value
This video lesson was great. Looking forward to seeing the second part.
Thank you, Dr. William, for taking me through such a wonderful journey in NLP - it was my first exposure to this area of Python, and I found it quite useful and am excited to do some more. Looking forward to having your part 2 soon!
Where can I find the textbook?
Thanks for this incredible class and textbook, it was very helpful. Greetings from Brazil
Definitely interested in part2 of this course
I'm definitely interested in the ML aspects of spaCy) Thank you very much for the video!
Definitely important to dig into the .similarity() output before using it in one's own work. One of its flaws is that it cares too much about the number of words in the spans being compared. For example:
print(nlp2("fries").similarity(nlp2("burgers"))) = .65
print(nlp2("fries").similarity(nlp2("hamburgers"))) = .58
print(nlp2("fries").similarity(nlp2("ham burgers"))) = .70
print(nlp2("french fries").similarity(nlp2("hamburgers"))) = .46
print(nlp2("french fries").similarity(nlp2("ham burgers"))) = .64
Also, I find that the small model correctly identifies West Chestertenfieldville as a GPE without modification, and I find that nlp.add_pipe("entity_ruler") does not add to the end of the pipeline description we see via nlp.analyze_pipes(). Rather, the elements of that description seem to be in alphabetical order, and every nested sub-element is also alphabetized. I suspect this does not say anything about the true order of the pipeline.
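One reason for the word-count sensitivity: a Doc's vector is just the average of its token vectors, so adding or splitting tokens shifts the mean. A minimal sketch to reproduce scores like those above (the model name is an assumption; the sm models ship no static word vectors and warn that their .similarity() results are unreliable):

```python
import spacy

# Medium and large models include static word vectors;
# the sm models do not, so use md or lg for similarity work.
nlp2 = spacy.load("en_core_web_md")

pairs = [("fries", "burgers"), ("fries", "hamburgers"),
         ("french fries", "ham burgers")]
for a, b in pairs:
    # Doc vectors average the token vectors, which is why the
    # token count changes the score for multi-word spans.
    print(f"{a} ~ {b}: {nlp2(a).similarity(nlp2(b)):.2f}")
```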
Just finished 1/3 and I have to say, very good introduction. Thanks a lot for sharing.
Also a historian looking for ways to extract info from old documents. Very much looking forward to the second part.
I am trying to have it out in early January.
@@python-programming really looking forward to the next video. One topic I have not seen you address is the question of tools for annotation. When working in specialized language domains, extra training of models is a key step. As a newcomer, I have not yet found a process compatible with spaCy which is reasonably efficient. Prodigy?
Where can I access the textbook? Can someone let me know! Would really appreciate it!
Excellent!!! The best of the best!!!! Please do a second part showing how to train the model.
Super interesting to go deeper into the material. In other words, a bit of a history lesson. 💪💪👍
Great video. Please make the second part ASAP. Keep up the good work.
We can extract noun chunks by iterating over the nouns in the sentence and finding the syntactic children of each noun to form a chunk.

doc = nlp(u'The quick brown fox jumps over the lazy dog.')

# Regular method:
# for chunk in doc.noun_chunks:
#     print(chunk)

# Manual method: prints 'The quick brown fox' and 'the lazy dog'
for token in doc:
    if token.pos_ == 'NOUN':
        chunk = ''
        for w in token.children:
            if w.pos_ == 'DET' or w.pos_ == 'ADJ':
                chunk += w.text + ' '
        chunk += token.text
        print(chunk)
Thank you for the great video!
When I run the most_similar method, copying the code from your notebook, I end up receiving a completely different set of words, some unrelated to the word and some in other languages. Example: country gave me ['country-0,467', 'nationâ\x80\x99s', 'countries-', 'continente', 'Carnations', 'pastille', 'бесплатно', 'Argents', 'Tywysogion', 'Teeters']
Can somebody help me understand why this is happening?
Same here. Curious but...maybe the transformers/models (not sure which) are retrained thus giving us a different set of words? Hopefully someone can answer this!
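In case it helps debugging, here is a minimal version of the usual most_similar recipe (the model name is an assumption). Note that the vectors table in the md/lg models holds keys for an enormous number of rare, misspelled, and non-English strings scraped from web text, so nearest-neighbour queries over the whole table can legitimately surface oddities like these; it doesn't necessarily mean anything is broken.

```python
import numpy as np
import spacy

nlp = spacy.load("en_core_web_md")  # assumption: any model with vectors

# Look up the stored vector for "country" and ask for its 10 nearest
# neighbours across the entire vectors table.
query = np.asarray([nlp.vocab.vectors[nlp.vocab.strings["country"]]])
keys, _, scores = nlp.vocab.vectors.most_similar(query, n=10)
print([nlp.vocab.strings[int(k)] for k in keys[0]])
```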
You're a wizard, W.J.B. Mattingly! Sincerely yours, a stan
Why can't I run the following?
I cannot find where the repository is.

with open("data/wiki_us.txt", "r") as f:
    text = f.read()
Did you find a solution?
@@furkanfiratli7908 Create your own wiki_us.txt: open Notepad, copy the text from the website, and paste it in.
@@bbppchan did you encounter the error:" 'gbk' codec can't decode byte 0x93 in position 1186: illegal multibyte sequence " ?
@@bernardmontgomery3859 no
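For the 'gbk' codec error above: on Windows, open() defaults to the system locale encoding, which cannot decode UTF-8 text. Passing the encoding explicitly is the usual fix (a minimal sketch, assuming the file is saved as UTF-8):

```python
# An explicit encoding avoids locale-dependent decode errors on Windows.
with open("data/wiki_us.txt", "r", encoding="utf-8") as f:
    text = f.read()
```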
You must clone the repository first:
Open the website in the description and click the GitHub button at the top right (pretty hard to find, tbh); it links to the repository.
Clone the repository; then you can run the existing notebook file or create a new one.
Has the Textbook been uploaded somewhere else? The link isn't working
Hello, did you find it?
@@ziya5811 No luck so far
Thanks! Great course, and I love how easy and smooth the explanation is. Explaining each step before diving into it really makes it easier for us to follow. Thanks a lot. BTW, I've spent some time looking for the GitHub account and repo related to this video; here it is if anyone needs it to begin following along. ENJOY...
Where?
Thank you 🙏 , Interested in part 2
Best Helpline for those who really want to learn
Thanks
Thank you! interested in part 2.
The Doc object’s **doc.sents** property lets us separate a text into its individual sentences:
doc = nlp(u'A severe sand storm hit the Gobi desert. It started to rain.')
for sent in doc.sents:
    print([sent[i] for i in range(len(sent))])
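One gotcha worth flagging here (a small sketch of standard spaCy behavior): doc.sents is a generator, so it cannot be indexed directly; wrap it in list() first if you need the nth sentence.

```python
sentences = list(doc.sents)  # doc.sents is a generator, not a list
print(sentences[1])          # the second sentence, now indexable
```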
Please create Part 2!!!!! Part one was 🔥🔥🔥🔥🔥🔥🔥
Thanks!
Superb, Waiting for part 2 with thanks🙏👍
Awesome Video! Can't wait for part 2
crossing my fingers 🤞🤞🤞
Everyone seemed to be asking for part 2, but this coverage is good enough - so good that I don't think it needs a part 2; otherwise a large part of it would just be repetition. I will keep exploring deeper based on this video itself.
There are also a lot of other resources available (and free), if you have time to go through them: course.spacy.io/en/
@@tthtlc thank you so much! do you have other resources like that?
Great tutorial. Learnt a lot about spaCy fundamentals.
(1:35:44) Matcher
03/22/2023 2:21:22
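For anyone else bookmarking the Matcher section (1:35:44), a minimal sketch of the pattern API covered there (the example sentence and pattern name are my own):

```python
import spacy
from spacy.matcher import Matcher

nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)

# A pattern is a list of per-token attribute dicts;
# this one matches a proper noun followed by a verb.
pattern = [{"POS": "PROPN"}, {"POS": "VERB"}]
matcher.add("PROPN_VERB", [pattern])

doc = nlp("Bob ran to the store while Alice slept.")
for match_id, start, end in matcher(doc):
    print(doc[start:end].text)
```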
This video is fantastic! I would really appreciate part 2
Thank you for the explanations. They are very clear and relevant. Excellent video.
Love the work you are doing. Many thanks from India
Please make the second video about machine learning! this was so helpful
Can anyone give me a link to the repo that is being used?
Hi, quick question: where is the repo mentioned at 22:38 located?
Very much interested in the machine learning aspect of SpaCy. Thank you, this course was informative and handy.
Where’s part 2!!! If there’s time in part 2, I would definitely be interested to know how to train ML to help with research and literature reviews as an example
Very simple and easy to understand, thank you for this.
Let's do the second part of it 🙂
Thanks so much Dr. Mattingly. Where can we find the machine learning related video?
Outstanding overview of Spacy, can't wait for part 2! Thank you so much.
is part 2 out?
Thank you, Dr. William. Looking forward to part two.
Thank you very much for making this video. I want to create my own corpus to analyze data. But as a newbie to Python, I found it really hard to start without a clear direction. Looking forward to Part 2!
Very very helpful stuff! 31 minutes in the video and I'm already using spacy for my own analyses! Thank you so much!
I'm getting an error early on, at 22:40, when opening the first text file:
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-…> in <module>
----> 1 with open("data/wiki_us.txt", "r") as f:
      2     text = f.read()
FileNotFoundError: [Errno 2] No such file or directory: 'data/wiki_us.txt'
Anyone know why? I am running Jupyter notebooks via Anaconda on a 2021 MacBook Pro M1.
This is a great NLP tutorial. I have checked out a few others but this one here takes the cake. Thanks for the excellent resource!
Really fascinating and accessible. Thank you.
Lol. spaCy's most recent build as of Feb 2, 2022 does properly identify West Chestertenfieldville as a GPE. ua-cam.com/video/dIUTsFT2MeQ/v-deo.html
EDIT: Just finished the course - a phenomenal piece of work. Thanks so much for doing this. I can tell you put an incredible amount of time and effort into it, and you provide it so graciously for free. It is so much clearer than spaCy's documentation. They have a new spaCy 101, so I'll give that a go now to cement this all in the noggin. And yes, eagerly awaiting part 2.
Awesome content there, Dr. William. I was really hyped during the series, and you've described every aspect of spaCy perfectly. Now I'm interested in the ML side of spaCy, and it'd be great if you covered that next.
Waiting for the second part ! This tutorial is perfect , thank you so much !
This tutorial is so freaking inspiring to me. NLP is so exciting and I'd love to integrate it with machine learning!!!!
I'd be 100% down to watch a tutorial with part 2!!!!!
Thanks! Good to know. I think I will start planning it this week.
@@python-programming Hi hi, any updates on part 2? I hope everything's ok :)
@@andrijor indeed! I am still working on it. Between the textbook and the video it takes a while to make. I am hoping to have it ready in early January.
@@python-programming Looking forward to it! 😊
This video is very useful for me. Thanks for always bringing great videos. Mad respect from me.
Great work! Really a good video to learn using spaCy.
Thanks for your awesome introduction :). Would love to have your next course on using spaCy for ML.
This is a really awesome teaching video. I'm highly interested in a part two.
Where can I get the datasets?
Excellent tutorial. Straight into the subject. Hats off to you !!
Such a nice video, 2nd part please!!
Where can we find these datasets?
Enjoyed this video, waiting for part 2.
I did not get the INFO: confirmations at 16:23 when running import spacy. Any hints?
Eagerly waiting for the second part.
This is super awesome tutorial. Just what I need. Thanks!
Thanks for the depth with this library sir
Thank you so much! Such a wonderful video.
Problems:
1) At minute 1:42:23: it returns an empty list even though I wrote the very same code.
2) At minute 2:14:32: I don't understand why nlp doesn't recognize and return 'Mary' as an entity. I'm using the same "en_core_web_sm".
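On problem 2: small statistical models do miss entities like this. One way to patch a specific miss is the EntityRuler the video demonstrates for West Chestertenfieldville (a sketch; the sentence is my own, and placing the ruler before "ner" lets the rule take priority):

```python
import spacy

nlp = spacy.load("en_core_web_sm")

# A rule-based pass before the statistical NER guarantees the label.
ruler = nlp.add_pipe("entity_ruler", before="ner")
ruler.add_patterns([{"label": "PERSON", "pattern": "Mary"}])

doc = nlp("Mary went to the market.")
print([(ent.text, ent.label_) for ent in doc.ents])
```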
Hey, I'm not getting the expected output at 1:57:26. It's showing KeyError: 0. Can anybody help me with that?
This is very helpful, thank you!
Hi, and thank you very much for your tutorial. I really enjoyed it and am looking forward to the second part of the tutorial.
Hi guys, could someone tell me which is harder to learn/master: computer vision or NLP?
Yes please for a part 2 on Machine Learning with Spacy!
Can't wait for Part 2
i found this to be an excellent tutorial - very clear, great examples and thorough. thank you for sharing this and i look forward to seeing you continue with another covering machine learning in spacy.
Has anyone been able to find the source material for this? He's referencing the text and says it's in the description but it is not...
This is very helpful. I am very new to machine learning and NLP. I am in a situation where I have thousands of documents which don't always have correct spellings. I have to analyze these documents to look for trends related to parts failure, especially where the failure has resulted in death or injury. Ideally I'd like to learn from the data in a way that can flag future failures before there is a death or injury. Can spaCy help with this?
I would like the ML version too. So looking forward to seeing that
Thank you so much. It is very helpful.
Hi all,
for some reason I get a different output from the script at 57:35:
['POVERTY', 'inner-city', 'Poverty', 'INTERSECT', 'INEQUALITY', 'Inequality', 'ILLITERACY', 'illiteracy', 'handicaps', 'poorest']
Did something go wrong?
I get the same thing: ['POVERTY', 'inner-city', 'Poverty', 'INTERSECT', 'INEQUALITY', 'Inequality', 'ILLITERACY', 'illiteracy', 'handicaps', 'poorest']
How do I reach the data folder that you work with?
Thanks!
Did you ever find out? I get an error because, obviously, we don't have that data file at the same path.
I don't think we can🤗
Looking forward to the machine learning aspects of spaCy.
Waiting for the 2nd part, sir 👌🙏
That was a very nice explanation and an awesome tutorial. Waiting for the machine learning part.
Where can I find the text book? The link in the description is dead.
Super interesting, already subscribed to both of the channels (y)
Your tutorials and your YouTube channel are great. Thanks so much for sharing your knowledge online. So helpful and well made.