Identify Stocks on Reddit with SpaCy (NER in Python)

James Briggs

Views: 2,378


Published 11 Oct 2024

COMMENTS • 10

@AlgoTradingX 3 years ago ⁺²
One of the best YouTube videos. Thanks James!
@jamesbriggs 3 years ago
Super happy you think so, thanks Sajid!
@lemuffinity 3 years ago
Many thanks. Super helpful!
@egomalego 3 years ago
What's the advantage of using spaCy as opposed to having a CSV of ticker names and comparing it to the data scraped from the Reddit API? Is spaCy faster and/or more efficient?
@jamesbriggs 3 years ago ⁺¹
Good question! It depends really, spaCy will be slower for sure, as there is a lot more complexity under the hood - but it will also pick up on different versions of the same organization name (TSLA, $TSLA, Tesla, Tesla Motors) and differentiate similar words/names (Nikola Tesla), whereas a rule-based approach (the CSV) would struggle with that.
After that, however, we need to build out a rule-based/intelligent process for compiling all of the different versions of the organization names into one - which is something I want to explore, but I would imagine a simple 'similarity' match would be pretty effective - although I'm sure there are methods built specifically for this too :)
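The "simple similarity match" idea above could be sketched roughly as follows, using only the standard library's `difflib`. This is an illustration under stated assumptions, not the method from the video: `normalize`, `similarity`, `group_mentions`, and the `0.8` threshold are all hypothetical choices, and the mentions are invented.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase a mention and drop a leading '$' so '$TSLA' and 'TSLA' compare equal."""
    return name.lower().lstrip("$").strip()


def similarity(a, b):
    """Return a character-level similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def group_mentions(mentions, threshold=0.8):
    """Greedily group mentions; the first member of each group is its representative."""
    groups = []
    for mention in mentions:
        for group in groups:
            # Compare against the representative only (an assumed simplification).
            if similarity(mention, group[0]) >= threshold:
                group.append(mention)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([mention])
    return groups


print(group_mentions(["TSLA", "$TSLA", "Tesla", "Apple"]))
```

A character-level ratio like this handles short variants such as TSLA, $TSLA, and Tesla, but longer forms such as "Tesla Motors" score lower against a four-letter ticker, so they would likely need a lower threshold or a token-based comparison.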
@egomalego 3 years ago
@jamesbriggs Ahh okay, that makes a lot of sense. I was thinking about testing each. Thank you for the explanation, and thank you for making these kinds of videos :D
@jamesbriggs 3 years ago ⁺¹
@egomalego definitely try testing each if you have the time - before NER I'd been relying on rule-based stuff / regex, and it still works well :)
Thank you for watching!
@Alex-costanza 3 years ago ⁺¹
This proved really difficult for me. I had CSV files of ticker names for NASDAQ, AMEX, and NYSE, and compared each comment against them, scraping with the PRAW Reddit module. I had a set of rules for identifying a stock symbol/ticker name. I did manage to obtain a reasonable list of the most-mentioned stocks, but there were still a lot of incorrectly identified stocks. In the end I had to add a list of exceptions manually, which is not a very nice solution. For example, there are ticker names such as ONE, GO, and OPEN, which are really commonly used words in the comment section on the WSB subreddit. I started to understand that I needed machine learning/AI and found spaCy. Thank you for making this video :)
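For comparison, the rule-based approach described in this comment might look something like the sketch below. The details are assumptions: the commenter's actual rules are not given, so the regex, the `count_tickers` helper, the "require a `$` prefix for ambiguous words" rule, and the sample data are all hypothetical. In practice `known_tickers` would be loaded from the exchange CSV files.

```python
import re
from collections import Counter

# An optional '$' followed by a standalone run of 1-5 capital letters.
TICKER_PATTERN = re.compile(r"(\$?)\b([A-Z]{1,5})\b")


def count_tickers(comments, known_tickers, ambiguous_words):
    """Count ticker mentions, skipping common words unless written as '$WORD'."""
    counts = Counter()
    for comment in comments:
        for match in TICKER_PATTERN.finditer(comment):
            has_dollar = match.group(1) == "$"
            symbol = match.group(2)
            if symbol not in known_tickers:
                continue  # not a listed ticker at all
            if symbol in ambiguous_words and not has_dollar:
                continue  # e.g. "GO" used as an ordinary word
            counts[symbol] += 1
    return counts


# Invented sample data for illustration only.
known = {"TSLA", "GME", "GO", "OPEN", "ONE"}
ambiguous = {"GO", "OPEN", "ONE"}
comments = [
    "I GO all in on TSLA",
    "$GO is OPEN for business, buy GME",
    "TSLA to the moon",
]
print(count_tickers(comments, known, ambiguous))
```

The manual `ambiguous` set is exactly the exceptions list the comment calls unsatisfying: it has to be maintained by hand and still misses genuine mentions written without a `$`.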
@egomalego 3 years ago
@Alex-costanza I had the exact same problem. I decided, in the end, to go with spaCy and train my own model on data that would make sense for my project. The default model is pretty accurate, but I am only using it for organizations, and I'm looking to only grab ticker symbols.
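Keeping only organization entities from a pretrained spaCy pipeline, as this reply describes, could be sketched as below. It assumes the "default model" is `en_core_web_sm` (the thread does not say which one was used) and that the model has been installed separately; `count_orgs` and `demo` are hypothetical helper names, and the sample comments are invented.

```python
from collections import Counter


def count_orgs(docs):
    """Count ORG entity mentions across an iterable of spaCy Doc objects."""
    counts = Counter()
    for doc in docs:
        for ent in doc.ents:
            # "ORG" is the label spaCy's English models use for organizations.
            if ent.label_ == "ORG":
                counts[ent.text] += 1
    return counts


def demo():
    """Run the counter over a few invented comments (requires spaCy and a model)."""
    # Imported here so count_orgs can be reused without spaCy installed.
    import spacy

    nlp = spacy.load("en_core_web_sm")  # assumed model name
    comments = [
        "Tesla Motors is up again, I'm buying more.",
        "Nikola Tesla would be proud of this chart.",
    ]
    # nlp.pipe processes the texts as a stream and yields Doc objects.
    return count_orgs(nlp.pipe(comments))
```

Calling `demo()` returns a `Counter` of organization names; which spans the model actually tags as `ORG` depends on the model and its version, so no output is shown here.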
