Best OCR Models to Extract Text from Images (EasyOCR, PyTesseract, Idefics2, Claude, GPT-4, Gemini)

Kevin Wood | Robotics & AI

3,747 views

Published Dec 9, 2024

COMMENTS • 25

@SethMitchell-b6m 1 month ago ⁺²
Wow! This was EXACTLY what I was looking for. Took me going on Reddit to find it lol
@kevinwoodrobotics 1 month ago
Glad to hear!
@arashpirak 1 month ago ⁺¹
Thank you! It was useful 👌
@kevinwoodrobotics 1 month ago
Glad to hear
@ihebennine3110 1 month ago
Thank you for the video! For better visualization of the results, I'd suggest scatter plots: accuracy vs. speed (or accuracy vs. cost), where each dot represents a model. Great content though!
@kevinwoodrobotics 1 month ago ⁺¹
Yes, great idea!
@KevinCleppe-gl7hx 1 month ago ⁺¹
awesome info!
@kevinwoodrobotics 1 month ago
Thanks!
@AzizKerimzhanov-ho3cb 1 month ago ⁺¹
I was doing text recognition on images, namely serial numbers, and would say the most accurate and consistent is AWS Textract. I was comparing GPT-4o mini, Azure Document AI and AWS. GPT-4o mini sometimes misses 1 character out of 16 or adds an extra one so the total becomes 17; sometimes it puts 0 instead of the letter D, and so on (usually this only happens in the middle of the serial number; the beginning and the end were OK).
The images were of pretty good quality (product information), and AWS and Azure detected them all correctly. But AWS can also retrieve info based on a customer query (NLP), which is better than Azure. I tested on 16 images, where Azure and AWS detected all of them correctly, whereas GPT-4o mini detected the correct serial number in only 10 out of 17 images. Just in case someone needs it.
@kevinwoodrobotics 1 month ago ⁺¹
Thanks for sharing your experience! Definitely useful to hear these use cases
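The failure modes described in this thread (a dropped character, an extra character, or D read as 0) can be scored per serial number with an edit distance instead of exact match alone. A minimal Python sketch, assuming ground-truth strings are available; the function names and the sample serial number are hypothetical and do not come from the thread:

```python
from typing import Iterable, Tuple


def edit_distance(truth: str, predicted: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions turning truth into predicted."""
    prev = list(range(len(predicted) + 1))
    for i, ct in enumerate(truth, start=1):
        curr = [i]
        for j, cp in enumerate(predicted, start=1):
            substitution_cost = 0 if ct == cp else 1
            curr.append(min(
                prev[j] + 1,                      # character missing from the OCR output
                curr[j - 1] + 1,                  # extra character in the OCR output
                prev[j - 1] + substitution_cost,  # wrong character (e.g. D read as 0)
            ))
        prev = curr
    return prev[-1]


def exact_match_rate(pairs: Iterable[Tuple[str, str]]) -> float:
    """Fraction of (truth, predicted) pairs that match exactly."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one (truth, predicted) pair")
    return sum(1 for truth, predicted in pairs if truth == predicted) / len(pairs)


# Hypothetical 16-character serial number, for illustration only.
TRUTH = "AB12CD3456EF7890"
SUBSTITUTED = "AB12C03456EF7890"   # D read as 0
DROPPED = "AB12CD346EF7890"        # one character missing (15 characters)
EXTRA = "AB12CD34566EF7890"        # one character added (17 characters)
```

Exact match counts all three variants as plain failures, while the edit distance shows that each is off by a single character, which helps separate "nearly right" models from ones that fail badly.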
@rafaeel731 25 days ago
GPT-4o mini uses way too many tokens too, wouldn't recommend it for vision
@AzizKerimzhanov-ho3cb 25 days ago
@rafaeel731 It took me around 1300 tokens for input, but the output is around 20 tokens, so approximately 0.0015 per image processed
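The per-image estimate in the reply above is token counts multiplied by per-token prices. A rough Python sketch of that arithmetic; the function name and both prices are placeholders (the thread states neither the rates nor the currency), so substitute the provider's current pricing:

```python
def cost_per_image(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimated cost of one request, given prices per one million tokens."""
    input_cost = input_tokens * input_price_per_million / 1_000_000
    output_cost = output_tokens * output_price_per_million / 1_000_000
    return input_cost + output_cost


# Token counts are from the comment; the prices are PLACEHOLDERS, not quoted rates.
PLACEHOLDER_INPUT_PRICE = 1.00   # per million input tokens (assumed)
PLACEHOLDER_OUTPUT_PRICE = 2.00  # per million output tokens (assumed)

estimate = cost_per_image(1300, 20, PLACEHOLDER_INPUT_PRICE, PLACEHOLDER_OUTPUT_PRICE)
print(f"estimated cost per image: {estimate:.6f}")
```

With roughly 1300 input tokens against 20 output tokens, the input side dominates the total unless the output price is many times higher, so image size and resolution settings matter more for cost than the length of the extracted text.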
@beingalien6394 4 days ago
Have you created examples using all the above models? Is there a GitHub link for this, or is it on your website?
@kevinwoodrobotics 4 days ago
No, not yet, only for some in my other videos. You can search for easyocr and paddleocr on my channel
@AmphibianDev 1 month ago ⁺¹
How about the Llama vision model?
@kevinwoodrobotics 1 month ago
Would be cool to test
@rafaeel731 25 days ago
I find Gemini Flash 1.5 8B to be particularly good. It's a very new model. Plus you can't really run any LLM locally even if small; it has to be open source at least.
@kevinwoodrobotics 25 days ago
Oh good to know!
@maniksarmaal 1 month ago
What about PaddleOCR??
@kevinwoodrobotics 1 month ago ⁺¹
Yes I heard it’s good. Will evaluate it
@tikkivolta2854 28 days ago
very useful. unfortunately easyOCR isn't able to crack a well scanned gas station slip. it just can't.
@kevinwoodrobotics 27 days ago
Oh man. Anything that worked for you?
@tikkivolta2854 27 days ago
@kevinwoodrobotics tried a tuned easyOCR model with at least 6 diff settings: greyscale, blur, contrast, whatnot. had no idea OCR is the endgame. subsequently we need an open source sonnet. no product out there (worst is the praised ABBYY FineReader) can handle this. tried them all. turns out our brain is still the best at deciphering handwriting. in the meantime i'd go for 65% accuracy and then let an LLM stitch the missing parts together. .pdf is the worst invention ever made. never meant to transport data efficiently.
@rafaeel731 25 days ago
@tikkivolta2854 So ABBYY FineReader was bad? A client uses FlexiCapture and it is horrible, wondering if FineReader is better
@ahmedsaliem7041 1 month ago ⁺¹
Valid for Arabic text
