How to use Bounding Boxes with OpenCV (OCR in Python Tutorials 03.02)

Python Tutorials for Digital Humanities

Views: 50,495

Published 19 Sep 2024

COMMENTS • 38

@haniihsanuddin9585 1 year ago ⁺¹⁰
1. Blur the image (to capture the overall structure rather than the text itself)
2. Create a threshold (and kernel) to separate the text blocks
3. Perform dilation (roughly, thickening the white regions)
4. Find contours (the boundaries of each block)
5. Loop over the contours and draw bounding boxes only above a specific size (to exclude small boxes)
@fuemma--7122 1 day ago
The opencv was so easy to understand!
@aayushsinha7439 3 months ago
Thanks for such a simplified explanation, helped me with my ongoing project a lot!
@letslearn2674 1 year ago ⁺³
This is the one I have been looking for. Thank you so much!
@python-programming 1 year ago
No problem!
@BrandonJF4 1 year ago ⁺²
Thank you so much, this really helped me make progress on a project!
@thenotoriousrkf3012 2 years ago ⁺¹⁶
I think there is an error in your code. From 15:45 on, you define the ROI, but the slice adds h to x where it should add w. The ROI should therefore be defined as: roi = image[y:y+h, x:x+w]
Since this typo also appears on your GitHub, you should change it there as well.
Kind regards!
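
To see why the correction matters, here is a small sketch on a blank NumPy array. The helper name crop_roi and the page and box sizes are made up for illustration. NumPy images are indexed as [rows, columns], so the height goes with y and the width with x.

```python
import numpy as np


def crop_roi(image, x, y, w, h):
    """Crop a bounding box given as (x, y, w, h) from a row-major image."""
    return image[y:y + h, x:x + w]


# Hypothetical 100-row by 300-column grayscale page.
page = np.zeros((100, 300), dtype=np.uint8)

correct = crop_roi(page, x=10, y=20, w=200, h=50)
buggy = page[20:20 + 50, 10:10 + 50]  # x + h instead of x + w

print(correct.shape)  # (50, 200)
print(buggy.shape)    # (50, 50): the box is cut to the wrong width
```

With the buggy slice the crop is only as wide as the box is tall, so wide text blocks lose most of their content.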
@Atharva_S9 2 years ago ⁺⁸
Why can't you provide the code for this?
@steffenhalama5558 3 years ago ⁺¹
Very nice video, helped me a lot.
@python-programming 3 years ago
Excellent! Glad it helped!
@conorforster8853 1 year ago ⁺³
Hi, this tutorial series has been the best thing since sliced bread, and I honestly don't know where I'd be without it.
However, I am stumped. I'm trying to convert PDFs to JPEG format, and the problem arises when the files contain tables and images that I would like to either skip or read into a file without wrecking the structure (obviously not images within the images). Ideally I would like this process to be automated, as the final program will be used not by me but by others less acquainted with technology. So far I can't find any documentation that helps with this.
I know it's a long shot, but honestly I've hit a wall, and if by some chance anyone can help, any guidance or advice would mean the world.
@joshuasmitherman1712 1 year ago ⁺³
It's not finding the sections for me. It captures the whole document as a section. Any suggestions?
@ThatRussian 1 month ago
Did you find the solution?
@ThatRussian 1 month ago
I actually found the solution if you're still interested: you basically need to crop the image so no extra blank spaces are left. Since mine was a vertical page I used this code: image[10:1060, 670:1250], i.e. image[start_row:end_row, start_column:end_column]
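
The hard-coded slice above only fits one particular scan. As a hedged alternative (not the commenter's method), the blank margins can be trimmed automatically by locating the first and last rows and columns that contain any ink. The function name trim_blank_margins and the white_level cutoff of 250 are assumptions for illustration.

```python
import numpy as np


def trim_blank_margins(gray, white_level=250):
    """Crop a grayscale page to the tightest box containing non-white pixels."""
    ink = gray < white_level                 # True wherever there is ink
    rows = np.flatnonzero(ink.any(axis=1))   # row indices that contain ink
    cols = np.flatnonzero(ink.any(axis=0))   # column indices that contain ink
    if rows.size == 0:                       # completely blank page
        return gray
    return gray[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


# Hypothetical page: white background with one dark block of "text".
page = np.full((100, 200), 255, dtype=np.uint8)
page[30:60, 80:120] = 0

print(trim_blank_margins(page).shape)  # (30, 40)
```

Scanner noise or a dark page border would count as ink here, so a real scan may need a blur or a lower white_level first.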
@kltr007 1 year ago ⁺³
Short question: in Box [15] it reads "else cents[1]". Is this a typo and should be "else cnts[1]" or did I miss something?
But great content! Keep going!
@pcb5135 1 year ago
I assume it's a typo.
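
The line in question is presumably the common idiom cnts = cnts[0] if len(cnts) == 2 else cnts[1], which copes with cv2.findContours returning a different number of values depending on the OpenCV version. A minimal sketch of the same logic as a helper follows; the name pick_contours is hypothetical, and the tuples in the usage lines are stand-ins rather than real OpenCV output.

```python
def pick_contours(result):
    """Return the contour list from a cv2.findContours return value."""
    if len(result) == 2:   # OpenCV 2.x and 4.x: (contours, hierarchy)
        return result[0]
    if len(result) == 3:   # OpenCV 3.x: (image, contours, hierarchy)
        return result[1]
    raise ValueError("unexpected findContours result of length %d" % len(result))


# Stand-in values that only mimic the two return shapes.
print(pick_contours((["c1", "c2"], "hierarchy")))           # ['c1', 'c2']
print(pick_contours(("image", ["c1", "c2"], "hierarchy")))  # ['c1', 'c2']
```

Looping over the raw return value instead of the contour list passes the wrong object to cv2.boundingRect, which is one likely cause of a "Bad argument" error there.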
@vildanhuseynov6492 3 years ago ⁺¹
Good job, man!!!
@python-programming 3 years ago
Thanks
@niladrimallik3172 11 days ago
At 15:29, after adding the "if h > 200 and w > 20:" statement, I am still getting the same result as without the if statement. Any idea why this is happening? I changed variable names, defined the contours again, but still the same result.
@Maruti_Pai 9 days ago
Rerun the whole code again.
@ridafatima1739 5 months ago
Will this work on colored images as well? If not, what changes should I make for colored images?
@DilipDas-ys5ph 2 years ago
Great, thanks!!
@ppmanguin 1 year ago ⁺¹
At 11:16 I get an error. Can you tell me how to fix it? Thank you!
error: OpenCV(4.7.0) :-1: error: (-5:Bad argument) in function 'boundingRect'
> Overload resolution failed:
> - array is not a numerical tuple
> - Expected Ptr for argument 'array'
@ppmanguin 1 year ago
Fixed by adding a variable, because findContours returns 2 outputs.
cnts, new_variable = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
@mateussaar4071 6 months ago ⁺¹
MAGIC
@khushibaghel220 8 months ago
I am trying to run this in google colab but getting an error: TesseractNotFoundError: C:\Program Files\Tesseract-OCR is not installed or it's not in your PATH. See README file for more information. How to resolve this? I have already added pytesseract in my env variables
@virendartripathi4645 1 year ago
I can't download the images from the course. Can you help me so that I can practice this?
@farahjabeen7707 2 years ago
@Python tutorials for digital humanities can you explain how to make a bounding box using pixel locations?
@breezyfeels1802 2 years ago
Hi, can you please tell me how I can get bounding boxes around each question in a question paper? I have tried a lot, but I'm unable to get it. I would be really glad if you could help me. Thanks!
@ROKKor-hs8tg 11 months ago
How can printed figures in a scanned image be exported to a docx file?
@anjuathouse5370 3 years ago ⁺¹
Could you please make a video on line segmentation of handwritten scanned document images?
@python-programming 3 years ago ⁺²
Sure! I actually wrote that code a year or so ago. I will try to dig it up and make a video on it.
@anjuathouse5370 3 years ago
@python-programming Thank you so much.
@vildanhuseynov6492 3 years ago
Dude, do you have experience with aligned text?
@jumbertparrenas3218 1 year ago
Can I ask whether this can be used in an application, or only on desktop? I'm asking because this is the same as my thesis title.
@nathantafelsky7089 1 year ago
It could be used in the source code of an application, or used on different operating systems. Imports and syntax would vary by language and implementation.
@tiennguyentran9358 3 years ago
I love you so much, thank you
@darksidegumball7205 1 month ago
I am sorry, but you keep saying that you explained these things in the previous videos. I watched all of the previous ones, and all you did was copy code and paste it into your Jupyter notebook without any proper explanation. I hope you can provide a newer tutorial. Otherwise, thanks for the tutorials.
