Build A Javascript OCR App Tutorial
- Published 18 Sep 2024
- Check out my courses and become more creative!
developedbyed....
🎁Support me on Patreon for exclusive episodes, discord and more!
/ dev_ed
Microphones I Use
Audio-Technica AT2020 - geni.us/Re78 (Amazon)
Deity V-Mic D3 Pro - geni.us/y0HjQbz (Amazon)
BEHRINGER Audio Interface - geni.us/AcbCpd9 (Amazon)
Camera Gear
Fujifilm X-T3 - geni.us/7IM1 (Amazon)
Fujinon XF18-55mmF2.8-4 - geni.us/sztaN (Amazon)
PC Specs
Kingston SQ500S37/480G 480GB - geni.us/s7HWm (Amazon)
Gigabyte GeForce RTX 2070 - geni.us/uRw71gN (Amazon)
AMD Ryzen 7 2700X - geni.us/NaBSC (Amazon)
Corsair Vengeance LPX 16GB - geni.us/JDqK1KK (Amazon)
ASRock B450M PRO4 - geni.us/YAtI (Amazon)
DeepCool ATX Mid Tower - geni.us/U8xJY (Amazon)
Dell Ultrasharp U2718Q 27-Inch 4K - geni.us/kXHE (Amazon)
Dell Ultra Sharp LED-Lit Monitor 25 2k - geni.us/bilekX (Amazon)
Logitech G305 - geni.us/PIjyn (Amazon)
Logitech MX Keys Advanced - geni.us/YBsCVX0 (Amazon)
DISCLAIMERS:
I am a participant in the Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for us to earn fees by linking to Amazon.com and affiliated sites.
Today we are building a new exciting javascript project!
We will be using OCR (optical character recognition) to recognize an image and extract all the text from it.
We will be using a few packages such as express, multer and tesseract.js. At the end we will convert our image into a pdf.
If you feel like you need more practice with node.js or you just want some project ideas I highly recommend you to follow this tutorial!
🛴 Follow me on:
Twitch: / developedbyed
Twitter: / developedbyed
Instagram: / developedbyed
Github: github.com/dev...
#javascript #ocr
For anyone who is still struggling with 'const worker = new TesseractWorker()' is not a constructor. Just do everything exactly as shown in the video.
Just use npm install tesseract.js@2.0.0-alpha.15 to use the exact same version of tesseract as used in the video.
Not sure if this is necessary but make sure you have "tesseract.js": "^2.0.0-alpha.15" mentioned as a dependency in your JSON file.
***
const {TesseractWorker} = require('tesseract.js')
const worker = new TesseractWorker();
Thank u brother
WoW! looked for solution all around in web but finally solution i want was down in comments!
Thank you pal GOD bLESS YOU!!!
Thanks a lot brother worked for me
Hello my gorgeous friend on the internet
hi
Hello. my. gorgeous. friends. on.. the.. internet..
For anyone still struggling with "TesseractWorker is not a constructor" on 'const worker = new TesseractWorker()': TesseractWorker is deprecated in versions 2+, so we use createWorker() instead.
Here is the fix:
Replace every use of TesseractWorker with createWorker() for versions 2.0+ of tesseract.js. Install with npm install tesseract.js@next and npm install tesseract.js-core
Install tesseract.js-core link: www.npmjs.com/package/tesseract.js-core
Make sure to remove the caret (^) in front of the tesseract.js version in package.json; with the caret, npm may install a newer compatible version instead of the exact one listed. Not sure if this step matters, but the code below is what you use instead of TesseractWorker(), which is deprecated in the 2.0+ versions.
SOLUTION:
const { createWorker } = require("tesseract.js");
// createWorker is a plain function, so call it without `new`
const worker = createWorker({
  logger: m => console.log(m)
});
... rest of backend code in app.js
Man, I just want to say thank you. I've learned so much from you and done so many awesome projects I thought it would take years for me to see realized. I am about to start college and I hope I get to meet people as passionate and as good at explaining stuff as you. You have been a real motivation to keep coding and learning. Don't stop, man. I wish happiness and success to you.
Thanks
sooo....... hows college going? just curious
@@naveenarora6467 bad as fuck
Thank you and Happy anniversary!
Ed back with the great projects
Edit : just finished your react tutorial series
You are more than our expectations. Power to Dev Ed.
Fantastic work, man! I love your over 8570 level of humor, by the way!
just make it over 9000 if you had watched dragon ball z when vegeta's scouter explodes and he says the power level is over 9000
@@meteachesprogramming9395 I was hinting to the fact that it's not 9000 just yet, but getting there. :)
Wow, this is so impresive! Great job, and thanks for sharing with us! Cheers!
orange cucumber raspberries this abbreviation's gonna stay in my mind forever
wow! what an work it is ......an effortless work from this man
Awesome your tutorials are very creative
cool tut as always, keep up the good work :)
In the new tesseract.js a lot of commands have changed. Please make an updated tutorial 🙏🏼🙏🏼
great tutrl
btw, thumbnail is classy!
😂😂😂 first minute and I love the total randomness
Use the code I gave below instead of "new createWorker();":
const { createWorker } = require('tesseract.js');
const worker = createWorker({
});
Thanks, it worked
i think im in love, these are the best tutorials ever
I think you have to do process.env.PORT || 5000; because when you deploy, it will look for PORT in the environment variables and use that for the app. What you did will always set it to 5000, whether or not the environment has PORT.
It depends. If any of your core dependencies has already defined a port, it will use that; otherwise it uses the other defined port.
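The operand order matters because `||` returns its first truthy operand. A minimal sketch of the suggestion above, assuming the fallback port is 5000 as in the video; `resolvePort` is a hypothetical helper name, not something from the tutorial:

```javascript
// Prefer the PORT environment variable and fall back to 5000.
// Writing `5000 || env.PORT` would always yield 5000, because 5000 is truthy.
function resolvePort(env = process.env) {
  // Number(undefined) is NaN, which is falsy, so the fallback applies.
  return Number(env.PORT) || 5000;
}

const PORT = resolvePort();
console.log(`Would listen on port ${PORT}`);
// In the tutorial's app this value would be passed to app.listen(PORT).
```

Hosting platforms typically inject `PORT` at runtime, which is why reading the environment first is the usual choice.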
I still get the error about tesseract not being an object :( even after uninstalling and reinstalling the npm package
Same
Still getting the TesseractWorker is not a constructor on both versions :(
Looking for what changed now
github.com/naptha/tesseract.js
I've been looking at issues about the constructor error on github too.
Did you ever figure it out?
Fix:
github.com/naptha/tesseract.js/issues/346
I'm on mobile so the syntax may not be exact, but:
Change to "tesseract.js": "2.0.0-alpha.13"
and
"tesseract.js-core": "2.0.0-beta.11"
^ Add those to package.json
Then run a plain `npm install` to install the dependencies.
Remove any "^" from package.json so it installs exactly those two versions (I think that's how you do it? Someone smarter than me can correct me :) )
Changing those in package.json seemed to get the app working.
Ily
It will always be port 5000, because 5000 is always truthy, won't it?
Yes, he put it the wrong way around. :-)
Such a good and useful tutorial!
Can you make a PWA please!!
PWA is lame. just do native
@@midsummerstation3345 how can i start native i don't have sufficient knowledge of native.
@@mustafaaur4019 learn react-native or svelte-native. It's not that hard
@@mustafaaur4019 Instagram is built on react-native yet the performance is good and it's successful
According to the GitHub repo of tesseract.js, TesseractWorker has been deprecated and its alternative is createWorker, but that is not working either. Can anyone help me with that? I am unable to use either keyword, and I have tried different versions of tesseract.js.
use this:
---------------------------------------------------------------
const { createWorker } = require("tesseract.js");
const worker = createWorker({});
------------------------------------------------------------------
i just removed the new keyword that was behind createWorker({})
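Why dropping `new` helps can be shown without tesseract.js at all. The sketch below assumes that `createWorker` in the 2.x releases is a factory written as an arrow function (my understanding, not confirmed in this thread); `createWorkerLike` is a hypothetical stand-in used only for illustration:

```javascript
// Hypothetical stand-in for a factory such as createWorker:
// an arrow function that returns a plain worker-like object.
const createWorkerLike = (options = {}) => ({
  options,
  terminate: async () => {},
});

// Calling the factory as a plain function works.
const worker = createWorkerLike({ logger: (m) => console.log(m) });

// Calling it with `new` throws, because arrow functions cannot be constructors.
let errorName = null;
try {
  new createWorkerLike({});
} catch (err) {
  errorName = err.name;
}
console.log(errorName); // TypeError
```

If the installed version exposes `createWorker` differently, the documentation for that version is the place to check.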
You have great sponsors. The message was clear. But the project is nice.
The video is awesome great work
I like the magic 😂😂😂
Coolest Programmer on YouTube 😛🤣😜🤩🤩
Love your videos
You are an amazing teacher.
Thanks ED!
Hey Ed, do what you like, just do it! 0:07
After I click Convert, the OCR does its thing and redirects, but then I'm getting "Cannot GET /download". It doesn't seem to be creating the ocr-result.pdf file. I had to rewrite the worker since the newer version of tesseract.js doesn't like how it's done in the video.
const { createWorker } = require('tesseract.js'); // OCR
const worker = createWorker({
logger: m => console.log(m),
});
app.post('/upload', (request, response) => {
upload(request, response, err => {
console.log(request.file);
fs.readFile(`./uploads/${request.file.originalname}`, (error, data) => {
if (error) return console.log('ERROR: ', error);
(async () => {
await worker.load();
await worker.loadLanguage('eng');
await worker.initialize('eng');
const { data: { text } } = await worker.recognize(data);
//console.log(text);
response.redirect('/download');
await worker.terminate();
})();
});
});
});
yeah same error
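The snippet above recognizes the text but never writes a PDF. That would leave `/download` with nothing to send if that route expects a file on disk, and the error text also suggests the route itself may be missing. A minimal sketch of the missing step, assuming a tesseract.js 2.x worker (where `getPDF` is available, as used elsewhere in these comments); `imageToPdf` and `outputPath` are hypothetical names, and the worker is passed in so the function does not depend on a specific version:

```javascript
const fs = require('fs');

// `worker` is assumed to be a tesseract.js 2.x worker from createWorker().
// `outputPath` is a hypothetical parameter; use whatever file /download serves.
async function imageToPdf(worker, imagePath, outputPath) {
  await worker.load();
  await worker.loadLanguage('eng');
  await worker.initialize('eng');
  const { data: { text } } = await worker.recognize(imagePath);
  // getPDF returns the PDF bytes for the most recent recognize() call.
  const { data: pdfBytes } = await worker.getPDF('Tesseract OCR Result');
  fs.writeFileSync(outputPath, Buffer.from(pdfBytes));
  await worker.terminate();
  return text;
}

// Possible wiring inside the Express app (assumed route names):
//   await imageToPdf(createWorker(), `./uploads/${request.file.originalname}`, pdfPath);
//   response.redirect('/download');
//   app.get('/download', (req, res) => res.download(pdfPath));
```

Redirecting only after the file has been written avoids a race where `/download` runs before the PDF exists.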
I subscribed within the first minute itself, The way you teach is funny and we can watch without getting bored. It would be great if you could share the code as well :D
U always get me with the thumbnail😂
Lol the toothpick :) great work
"Organge, Cucumbers, and Raspberries" hahaha
Awesome Ed it is great:)
Nice video as always Ed ! But how abt my Bar Chart Race tutorial in D3.js 🤔
Hey bruv!....Worker.progress throws an error "worker.progress is not a function".Any solution? And also for worker.then
Anyone know what his visual studio code theme is? love it
Material Theme Palenight High Contrast
Oceanic Next (dimmed bg)
constructor error. This solved the issue ->
const worker = createWorker({
logger: m => console.log(m)
});
Can you do a flutter app next?
Something which you would like to use daily, like the anti-nail-biting hand detection project that you did.
watch thenetninja flutter tutorials
@@AmineAmine-qw4xx Not asking for a tutorial. Some complex but useful UI and backend desgin.
I have been looking for this for a long time
The Google API for OCR worked better for me plus there are some paid ones that are very powerful. I was using them to get results from virtual greyhounds straight from the live stream. :)
Cannot read properties of undefined (reading 'transfer-encoding') Getting this error 17:05
super!!🤩
Thumbs up for magic trick! Even with funny fail!
Meat and the potatoes!! lol 🤣🤣🤣
arigatou sensei !!! 🙏
dude ... u are the best
I am getting an error: TypeError: Cannot read property '_malloc' of undefined. How do I overcome this error? I am using tesseract.js 2.1.4 and the createWorker line.
You the best ! Appreciate dat
RuntimeError: abort(TypeError: Failed to parse URL from /home/bashar/Desktop/project/ocr/node_modules/tesseract.js-core/tesseract-core.wasm). Build with -s ASSERTIONS=1 for more info.
i got the same thing, any fix?
Just wanted to let you know you are awesome.
Can someone explain to me what is the difference between ComSci and IT?
ComSci would be more theoretical (it covers concepts that are more abstract than IT), while IT has fewer theoretical concepts and, I think, covers more technology topics such as networking that may not be as abstract. I think :) but you can google it too.
@@JORDIFUNGULA if I want to focus more on programming, I should choose ComSci?
@@riodacayana7823 I honestly think you would need to decide that for yourself. But in terms of focusing only on programming (with a focus on the underlying theoretical knowledge and concepts needed to write and understand software and solve those problems), yes, probably CompSci. However, there are many overlaps, and it does not mean that IT students lack the experience or knowledge required to do the same thing. CompSci is less common than IT (a lot of people are doing IT, but then again IT and tech are broad), and CompSci is broad too, because there are now so many different fields within it (AI, ML, web dev, software, robotics) and probably many more to come. I would choose CompSci, but I would not turn down or look down on IT either. :)
Hope that helps
It helps a lot thanks
please build like more small vanilla js app.. Thank you for all ur work Big fan from India
ohhh my gggod thats what im looking for thanks ed your great potato ! :)
I have complex tables in my PDF file, will the text output of the OCR deliver the text content in the same format and representation as in the table cell contents of the PDF
Hey Ed, Which theme of visual studio you use?
Amazing tutorial! After poking around with newer Tesseract syntax I got it working wow. If you are stuck, you'd want to search for the documentation on your version of Tesseract.
Hi! I've been trying to make the code work with the current version of Tesseract and I just can't seem to make it work, how did you make it work exactly? Thanks!
Which theme u are using??
Instead of using saved image can we use camera of device to dynamically convert image to text?
This is so fucking cool.
How do you remember all of this?
You'd probably run rings around people who have been doing this for 10 years.
strong grasp on fundamentals, planning and preparations beforehand, googling when running into problems
Good prep on anything. Especially when you're on camera
He also is looking off to the side and retyping stuff.
@@DF999 Yeah I think these guys prepare a draft version and then retype it and add in some stuff to spice it up a bit.
Brad Traversy in his channel once said that mostly his tutorials are scripted and well prepared. That's what good tutor does
Cucumbers 🤣
Dev, that's really helpful! Can we do this in Apps Script to execute in Google Sheets?
I am unable to see what you writing because of subtitles
turn them off then
The subtitles are provided by YouTube, not him. You can always turn them off.
@@epiczeus945 thanks dude
when i click to convert, its pointing this error:
"TypeError: Cannot read property '_malloc' of undefined"
I am getting an error when i am using pdf file to ocr...
Error. Tesseract can't recognize pdf file
Is there any better reason to make some OCR tool on TensorFlow and Python that talks to a web app?
TypeError: worker.recognize(...).progress is not a function
solution please
I have the same error.
Hey i fixed it by replacing the
*
worker
.recognize(data, "eng")
.progress(progress => {
console.log(progress);
})
.then(result => {
res.send(result.text);
})
.finally(() => worker.terminate());
*
with:
*
(async () => {
await worker.load();
await worker.loadLanguage('eng');
await worker.initialize('eng');
const { data: { text } } = await worker.recognize(data);
console.log(text);
await worker.terminate();
res.send(text);
})();
*
(Just remove the * they are just there to distinguish different parts)
@@enemyyellow8979 I used your code but I got this error: UnhandledPromiseRejectionWarning: TypeError: Cannot read property 'load' of undefined.
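The replacement above drops the `.progress()` callback entirely. In the 2.x API shown elsewhere in these comments, progress messages go through the `logger` option of `createWorker` instead. A minimal sketch under that assumption; `recognizeWithProgress` is a hypothetical helper, and `createWorker` is passed in as a parameter so the worker is always defined before `load()` is called (an undefined `worker` variable is one possible cause of the "Cannot read property 'load' of undefined" error):

```javascript
// `createWorker` would come from require('tesseract.js') in the real app;
// it is a parameter here so the sketch does not hard-code a version.
async function recognizeWithProgress(createWorker, image, onProgress) {
  // The `logger` option takes over the role of the old .progress() callback.
  const worker = createWorker({ logger: onProgress });
  try {
    await worker.load();
    await worker.loadLanguage('eng');
    await worker.initialize('eng');
    const { data: { text } } = await worker.recognize(image);
    return text;
  } finally {
    // Mirrors the old .finally(() => worker.terminate()).
    await worker.terminate();
  }
}

// Possible usage inside the route handler (assumed names):
//   const text = await recognizeWithProgress(createWorker, data, (m) => console.log(m));
//   res.send(text);
```

Creating one worker per request keeps the example simple; a long-running app might prefer to reuse a worker instead of terminating it each time.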
Is it possible to get only the first name and last name on the ID card?
Nice work bro. Can you pls create passport scanner app. Which scans the passport and extract all the information.
Please make a deploying app tutorial :)
Can we extract text from scanned PDF(PDF of images) using OCR?
Can you make a video about WordPress for beginners, and also for people with some web development knowledge?
what kind of syntax is const upload= multer({storage:storage}) ? (i don 't get what does the storage:storage do)
thanks you if you will reply
You set the property "storage" of the object to the previously defined variable "storage". They happen to have the same name here, but that is not required.
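The answer above can be checked in plain JavaScript. In this sketch the `storage` value is a stand-in object; in the tutorial it would come from multer's disk storage setup:

```javascript
// Stand-in value; in the tutorial `storage` is created with multer.
const storage = { destination: './uploads' };

// Long form: the key is "storage", the value is the variable `storage`.
const longForm = { storage: storage };

// ES2015 shorthand: same key and same value, written once.
const shortForm = { storage };

console.log(longForm.storage === shortForm.storage); // true
```

So `multer({ storage: storage })` and `multer({ storage })` pass the same options object; the key name is what multer reads, and the variable could have been called anything.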
I challenge you to create a photoshop-style app (online in javascript), where we can do small manipulations on images and save them.
Do you maybe know of anything similar which also works for handwriting?
which tesseract version to install
hhhhh you are awesome , learned a lot from you .thank you ^_^.
Ed, can you share the code with us please ?🙃
Came for the magic trick, stayed for the tutorial..
it is showing the error
worker.recognize(...).progress is not a function
This helped me, along the comments of others (do what they say + this)
Inside the "fs.readFile(`./uploads/${req.file.originalname}`, (err, test) => {" part of the code, replace everything with
(async () => {
const worker = createWorker();
await worker.load();
await worker.loadLanguage('eng');
await worker.initialize('eng');
const { data: { text } } = await worker.recognize(image);
console.log(text);
const { data } = await worker.getPDF('Tesseract OCR Result');
fs.writeFileSync('tesseract-ocr-result.pdf', Buffer.from(data));
console.log('Generate PDF: tesseract-ocr-result.pdf');
await worker.terminate();
})();
Hi bro, can you tell me, best way to extract text from invoice, please do reply, it will help me lot
cool af!
Can anyone tell me the course where I can learn whole express , like file uploads, videos.
Hi, will it work even the image has dot matrix characters ?
This video is great. I tried to create the project but I get this error: TypeError: worker.progress is not a function. Any idea how I can resolve this issue? Thanks
:::?????????:(
This helped me, along the comments of others (do what they say + this)
Inside the "fs.readFile(`./uploads/${req.file.originalname}`, (err, test) => {" part of the code, replace everything with
(async () => {
const worker = createWorker();
await worker.load();
await worker.loadLanguage('eng');
await worker.initialize('eng');
const { data: { text } } = await worker.recognize(image);
console.log(text);
const { data } = await worker.getPDF('Tesseract OCR Result');
fs.writeFileSync('tesseract-ocr-result.pdf', Buffer.from(data));
console.log('Generate PDF: tesseract-ocr-result.pdf');
await worker.terminate();
})();
replace "image" with whatever you're calling the file in the app.post function's parameters (but don't make it "data")
hope this helped
@@paulogodoyp Hi, I have a question. What do you mean by 'changing "image" with whatever you're calling the file in the app.post'? What should I change it to?
@@paulogodoyp I finally realized what you mean, in the line of 'const { data: { text } } = await worker.recognize(image);'
we replace the parameter inside the function worker.recognize() with `./uploads/${req.file.originalname}`
so basically it is now const { data: { text } } = await worker.recognize(`./uploads/${req.file.originalname}`);
To anyone reading this, please do note to use backtick ` ` instead of apostrophe ' ' when filling in the field in the await worker.recognize() parameter.
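The backtick note matters because only template literals interpolate `${...}`. A small sketch with a hypothetical `req` object standing in for what multer attaches to the request; the `uploads` folder name follows the earlier snippets in this thread:

```javascript
// Hypothetical request object; multer normally provides req.file.
const req = { file: { originalname: 'receipt.png' } };

// Backticks: the expression inside ${...} is evaluated.
const withBackticks = `./uploads/${req.file.originalname}`;

// Single quotes: the text is kept literally, so the path points nowhere useful.
const withQuotes = './uploads/${req.file.originalname}';

console.log(withBackticks); // ./uploads/receipt.png
console.log(withQuotes);    // ./uploads/${req.file.originalname}
```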
do not forget to look on your right...
Kanye stop touching my hard disk.
Hello, when I'm trying to convert a jpg or png file, it sends me this error: "TypeError: worker.recognize(...).progress is not a function" and localhost didn't send any data
Can someone help me ?
could you solve it?
im having the same error
.
.progress(progress => {console.log(progress);})
^
TypeError: worker.recognize(...).progress is not a function
I have this error in my console whenever i try to convert. I am using createWorker and my tesseract version is ^2.1.5. Is there any solution for this ?
deprecated
This helped me, along the comments of others (do what they say + this)
Inside the "fs.readFile(`./uploads/${req.file.originalname}`, (err, test) => {" part of the code, replace everything with
(async () => {
const worker = createWorker();
await worker.load();
await worker.loadLanguage('eng');
await worker.initialize('eng');
const { data: { text } } = await worker.recognize(image);
console.log(text);
const { data } = await worker.getPDF('Tesseract OCR Result');
fs.writeFileSync('tesseract-ocr-result.pdf', Buffer.from(data));
console.log('Generate PDF: tesseract-ocr-result.pdf');
await worker.terminate();
})();
@@paulogodoyp const { data: { text } } = await worker.recognize(image);
^
ReferenceError: image is not defined
@@paulogodoyp console o/p
"Callback pyramid of doom" .
How to make this web app on online? What webhosting sites should i use
try deploy using heroku
Hello can you make a video on search with node js
Hi, I got an error: TypeError: TesseractWorker is not a constructor. Does anybody know how to fix this?
Thanks
I tried this and it worked:
const Tesseract = require('tesseract.js')
const worker = Tesseract.createWorker({
logger: m => console.log(m)
});
@@iqaznili Thanks bro!!
i come for the magic tricks 0:42