How to set up Tess4j in Eclipse
- Published 19 Sep 2024
- Hi Everyone.
I made this video to show you how to load Tess4J, a Java JNA wrapper for the Tesseract OCR API, into Eclipse and how to get it to read an image successfully.
Tesseract is written in C++, so anyone working with Java needs a wrapper to call it. Tess4J is ideal for that.
I spent a long time trying to get it set up and found there was limited documentation on how to use it, so I decided to create this video to help anyone wanting to start exploring OCR with Java.
You will need the following:
- Eclipse
- Tess4j (sourceforge.ne...)
- Lept4j (sourceforge.ne...)
- libtesseract302.dll (api.256file.com...)
Additional documentation found for set up:
- tphangout.com/h...
- tess4j.sourcefo...
thank you for making this video. it solved my problem after wasting hours haggling with different versions and tutorials which are half-baked and don't talk about actual configurations.
The only difference for me was that I did not copy over any source, just adding the tess4j jar worked for me.
Thank you so much, this is the perfect video and it still works today. If anyone runs into a problem, I advise them to carry out each and every step given in the video.
Nice!!! It's working for me. I did two things differently: 1. Used the 64-bit dll files. 2. Included all jars from Tess4j-3.3.1-src in the Tess4j build path, otherwise I got various errors.
thanks dude.
Really informative, even after 4 years. Thanks amccu!
You are my HERO!
It works! I couldn't have done this without you. Thank you!
Excellent job!! I think I was looking at the same netbeans project as you though and I was able to simply import that project and it worked fine in eclipse neon!
YESSS youre an angel. thank you so much for making this
i cant find the gsdll64.dll
neither did i but it ran regardless if you included the other files she mentioned
@@garymejia9035 This library comes with GhostScript. If it ran for you, it means that you had GhostScript already installed on your system. If not installed, do a search for gsdll64.dll and download it into the main project folder.
@@garymejia9035 Oh, the gsdll64.dll is not needed for reading JPEG files, as I just found out. It is, however, needed to read PDF files. You'd also need to copy the two pdfbox*.jar files to lib.
Works on NetBeans 13 as well, but keep in mind:
1) If on NetBeans, just add the libraries by right-clicking Libraries, then Add JAR/Folder, and then press Open on the JAR files shown in the video.
2) When adding a folder, you need to add it on properties as well before, and then on NetBeans.
😎😎😎😎
Thankyou so much for sharing this video it had helped me a lot!!
Thanks Alexandra... this tutorial was very helpful
I am getting this error:
The package net.sourceforge.tess4j conflicts with a package accessible from another module: tess4j
Hi Alexandra, Thank you very much for this amazing vid
@Alexandra: Did you make any changes to the Tesseract.java class before or after adding slf4j-api?
The reason I ask is that I saw an error for the class in the recording before you added the slf4j jar, and once you resumed the video with the jar added, it no longer showed up. Did you by any chance modify the class to remove the override keyword?
Yes, I need a help, please create new video with this ocr to display particular field
This is a very nice video explaining what to do step by step, saved me a lot of time trying to figure it out myself. By the way, I get the feeling you think ".dll" stands for "download" ;), it doesn't ^^
Thank you! Worked like a charm
thank you very very much ma'am you just saved my day!
Hi Alexandra, i followed same procedure but i found
"SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Exception in thread "main" java.lang.UnsatisfiedLinkError: The specified module could not be found." this error
please help me to resolve this ASAP.
your voice is really sweeet
thank you, worked out for me...!!!appreciate it
I'm not able to understand how Notepad gets opened at 11:22.
I did everything step by step, but I'm getting an error with slf4j.
Exception in thread "main" java.lang.Error: Unresolved compilation problems:
The type org.slf4j.Logger cannot be resolved. It is indirectly referenced from required .class files
org.slf4j.Logger cannot be resolved to a type
LoggerFactory cannot be resolved
Logger cannot be resolved to a type
It's generated in the "new Tesseract();" part and I can't solve it
May I please know the Eclipse version that you are using?
you are so great :D
i very like the sound of your voice :D
i love you so much :D
I am getting the error "couldn't find module". My laptop is 64-bit, I followed all the steps in your instructions, and I took the dll file from the 64-bit folder, but I couldn't get it to work. Please help me out.
Thank you~! This video helped me a lot.
Thank you, it helped me a lot.
Thanku so much ....awesome.. working great :)
Hi Alexandra, nice video. I am using a MacBook but I cannot find the gsdll32.dll. Do you know where I can find it? Thank you, Paul
I am using a Mac as well and I actually didn't need that .dll file. All I needed to do is install tesseract from terminal with homebrew. I typed "brew install tesseract", and it worked!
I am getting an error in NetBeans. Could you please help me out here?
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3236)
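For a heap space error like the one above, a first step is to see how much heap the JVM is actually allowed to use. This is only a minimal sketch: the class and method names are made up for illustration, and whether a larger heap fixes the OCR run depends on the input image, which the comment does not describe.

```java
// Sketch: report the maximum heap the running JVM may use.
// HeapInfo and maxHeapMiB are hypothetical names, not part of Tess4J.
class HeapInfo {

    /** Returns the JVM's maximum heap size in mebibytes. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMiB() + " MiB");
    }
}
```

If the reported limit is small, the standard `-Xmx` VM option (for example `-Xmx1024m`) raises it; in NetBeans this typically goes in the project's Run settings under VM Options.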
Thank you, worked fine.
I did what you show in the video, but my program does not run. The error message is java.lang.UnsatisfiedLinkError. My Microsoft Visual C++ is up to date. Can you help me with this?
I need to recognize a handwritten text from different people and then extract the text from the image, how can I do it? I am getting error, it says "net.sourceforge.tess4j.TesseractException: javax.imageio.IIOException: Inconsistent metadata read from stream". I need your help, thank you!
thanks from India
Thanks. I know it's been a long time since you posted this video. I'm currently facing one issue: this works perfectly as a core Java application, but when I implement the same thing as a web project, build a WAR, and deploy it on a server, it does not work. I searched everywhere and everyone seems to be facing the same issue. Can you help?
or you can just use gradle and homebrew instead of all this, STEPS:
In the Gradle dependencies:
compile group: 'net.sourceforge.tess4j', name: 'tess4j', version: '3.4.6'
In terminal:
brew install tesseract
DONE, start coding
I am getting the error java.lang.UnsatisfiedLinkError: could not find the module
in NetBeans. Can you help me with this?
Have this as well.
Did you find any solution yet?
Hi. This error can also occur if you are trying to load a 32 bit library on a 64 bit system. Can you check what system you have and ensure you take files from the Tess4j that match your system.
I also found that set up in Netbeans is slightly different to Eclipse. Here are netbeans instructions by another user: tphangout.com/how-to-use-the-tesseract-api-to-perform-ocr-in-your-java-code/
Alexandra McCusker I'm getting the error in NetBeans... let me try it in Eclipse.
Hey buddies, I had that error too, and you're going to love me for this: if you REALLY followed the tutorial in every detail, then HERE is the solution: update your Microsoft Visual C++ to at least the 2015 version (v.14.0.24215). Let me know if it worked ;)
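Before changing runtimes, it can help to confirm that the native files are actually where you think they are. The following is a rough sketch, not something shown in the video: the class and method names are hypothetical, and `libtesseract302.dll` is simply the file name from the video description, so substitute whatever DLLs your Tess4J download ships. A file being present does not rule out a missing dependency of that DLL, such as the Visual C++ runtime mentioned above.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Sketch: list which expected native files are absent from a directory.
// NativeFileCheck and missing(...) are hypothetical names for illustration.
class NativeFileCheck {

    /** Returns the names from 'names' that do not exist as regular files in 'dir'. */
    static List<String> missing(File dir, String... names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (!new File(dir, name).isFile()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumption: the DLLs were copied into the project's main folder,
        // which is also the working directory when launched from the IDE.
        File projectDir = new File(System.getProperty("user.dir"));
        List<String> absent = missing(projectDir, "libtesseract302.dll");

        System.out.println("Checked folder: " + projectDir.getAbsolutePath());
        System.out.println("java.library.path: " + System.getProperty("java.library.path"));
        System.out.println(absent.isEmpty() ? "All listed files found" : "Missing: " + absent);
    }
}
```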
i dont have an idea about your pointer optimization
Hi, thanks for the video. Is it possible to explain how to get text from a PDF with Tess4j?
I tried doing this in IntelliJ IDEA but it did not work. Could someone help ?
I followed the same process as in this video but I don't get any output.
The error is:
Exception in thread "main" java.lang.UnsatisfiedLinkError: The specified module could not be found
Ranjith shaun... Have u got solution
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3236)
I am getting the error 16:43:57.656 [JavaFX Application Thread] ERROR net.sourceforge.tess4j.Tesseract - Input not set
java.lang.IllegalStateException: Input not set
Me too, have you solved it?
nope, just gave up
Which version of Eclipse are you using?
Thanks for the tutorial.
I am getting the same exception too, i.e. java.lang.UnsatisfiedLinkError: could not find the module. I have already spent a few hours on this and have switched my OS, JVM, Eclipse, and dlls all to 32-bit. Still, I am facing the same issue.
Could you please share your version of java and eclipse and also have you placed the libraries anywhere else besides than this project ?
Have you installed visual studio or is it already installed at your end
I didn't use visual studio for this project.
The previous comment mentions this same error. I believe this can be due to trying to load a 32 bit library on a 64 bit system, or vice versa. I suggest checking what type of system you have, then if you look back at my instructions, I took some of the dll files from the downloaded Tess4J zip in the directory: Tess4J-3.3.1-src/Tess4J/lib. In this lib folder there are two other folders; win32-x86 and win32-x86-64. Depending on whether or not your system is 32-bit or 64-bit, take only the dll's from the appropriate folders. I.e. 32-bit system = win32-x86
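As a rough illustration of the 32-bit versus 64-bit check described above, the sketch below asks the running JVM for its bitness and maps it to the two folder names from the Tess4J zip. Note that what has to match the DLLs is the JVM that runs your program, which can be 32-bit even on a 64-bit Windows install. The class and method names are hypothetical, and `sun.arch.data.model` is a HotSpot-specific property that may be absent on other JVMs, hence the fallback.

```java
// Sketch: pick the Tess4J native folder that matches the running JVM.
// NativeFolderPicker and its methods are hypothetical names for illustration.
class NativeFolderPicker {

    /** Maps a JVM bitness string ("32" or "64") to the matching lib subfolder. */
    static String folderFor(String dataModel) {
        return "64".equals(dataModel) ? "win32-x86-64" : "win32-x86";
    }

    /** Best-effort detection of the running JVM's bitness, returned as "32" or "64". */
    static String detectDataModel() {
        // Set by HotSpot JVMs to "32", "64", or "unknown"; may be null elsewhere.
        String model = System.getProperty("sun.arch.data.model");
        if ("32".equals(model) || "64".equals(model)) {
            return model;
        }
        // Fallback: os.arch values such as amd64, x86_64, or aarch64 contain "64".
        String arch = System.getProperty("os.arch", "");
        return arch.contains("64") ? "64" : "32";
    }

    public static void main(String[] args) {
        String model = detectDataModel();
        System.out.println("JVM is " + model + "-bit; take the DLLs from " + folderFor(model));
    }
}
```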
Exception in thread "main" java.lang.Error: Unresolved compilation problems:
Access restriction: The type IIOImage is not accessible due to restriction on required library C:\Program Files\Java\jre8\lib\rt.jar
Syntax error, static imports are only available if source level is 1.5 or greater
The import net.sourceforge.lept4j.ILeptonica.L_CLONE cannot be resolved
Syntax error, static imports are only available if source level is 1.5 or greater
The import net.sourceforge.tess4j.ITessAPI.TRUE cannot be resolved
The type Tesseract must implement the inherited abstract method ITesseract.createDocuments(String[], String[], List)
The type Tesseract must implement the inherited abstract method ITesseract.createDocuments(String, String, List)
RenderedFormat cannot be resolved to a type
enum cannot be resolved to a type
Syntax error, parameterized types are only available if source level is 1.5 or greater
Syntax error, parameterized types are only available if source level is 1.5 or greater
Syntax error, annotations are only available if source level is 1.5 or greater
Syntax error, annotations are only available if source level is 1.5 or greater
Syntax error, annotations are only available if source level is 1.5 or greater
Syntax error, annotations are only available if source level is 1.5 or greater
Syntax error, annotations are only available if source level is 1.5 or greater
RenderedFormat cannot be resolved to a type
enum cannot be resolved to a type
enum cannot be resolved to a type
Syntax error, annotations are only available if source level is 1.5 or greater
Syntax error, annotations are only available if source level is 1.5 or greater
Syntax error, parameterized types are only available if source level is 1.5 or greater
Syntax error, annotations are only available if source level is 1.5 or greater
Syntax error, annotations are only available if source level is 1.5 or greater
Syntax error, annotations are only available if source level is 1.5 or greater
Syntax error, annotations are only available if source level is 1.5 or greater
Syntax error, annotations are only available if source level is 1.5 or greater
Syntax error, parameterized types are only available if source level is 1.5 or greater
Access restriction: The type IIOImage is not accessible due to restriction on required library C:\Program Files\Java\jre8\lib\rt.jar
Syntax error, annotations are only available if source level is 1.5 or greater
Syntax error, parameterized types are only available if source level is 1.5 or greater
Access restriction: The type IIOImage is not accessible due to restriction on required library C:\Program Files\Java\jre8\lib\rt.jar
Syntax error, 'for each' statements are only available if source level is 1.5 or greater
Access restriction: The type IIOImage is not accessible due to restriction on required library C:\Program Files\Java\jre8\lib\rt.jar
Access restriction: The method getRenderedImage() from the type IIOImage is not accessible due to restriction on required library C:\Program Files\Java\jre8\lib\rt.jar
RenderedFormat cannot be resolved to a type
enum cannot be resolved to a type
Syntax error, annotations are only available if source level is 1.5 or greater
Syntax error, annotations are only available if source level is 1.5 or greater
The constructor StringArray(T[]) is undefined
Syntax error, parameterized types are only available if source level is 1.5 or greater
RenderedFormat cannot be resolved to a type
enum cannot be resolved to a type
Syntax error, parameterized types are only available if source level is 1.5 or greater
RenderedFormat cannot be resolved to a type
Syntax error, 'for each' statements are only available if source level is 1.5 or greater
RenderedFormat cannot be resolved to a type
TEXT cannot be resolved to a variable
HOCR cannot be resolved to a variable
PDF cannot be resolved to a variable
BOX cannot be resolved to a variable
UNLV cannot be resolved to a variable
Syntax error, annotations are only available if source level is 1.5 or greater
Name clash: The method createDocuments(String, String, List) of type Tesseract has the same erasure as createDocuments(String, String, List) of type ITesseract but does not override it
Syntax error, parameterized types are only available if source level is 1.5 or greater
RenderedFormat cannot be resolved to a type
Syntax error, annotations are only available if source level is 1.5 or greater
Name clash: The method createDocuments(String[], String[], List) of type Tesseract has the same erasure as createDocuments(String[], String[], List) of type ITesseract but does not override it
Syntax error, parameterized types are only available if source level is 1.5 or greater
RenderedFormat cannot be resolved to a type
Syntax error, annotations are only available if source level is 1.5 or greater
Syntax error, parameterized types are only available if source level is 1.5 or greater
Syntax error, parameterized types are only available if source level is 1.5 or greater
Syntax error, parameterized types are only available if source level is 1.5 or greater
TRUE cannot be resolved to a variable
L_CLONE cannot be resolved to a variable
Syntax error, annotations are only available if source level is 1.5 or greater
Syntax error, parameterized types are only available if source level is 1.5 or greater
Syntax error, parameterized types are only available if source level is 1.5 or greater
Syntax error, parameterized types are only available if source level is 1.5 or greater
TRUE cannot be resolved to a variable
at net.sourceforge.tess4j.Tesseract.(Tesseract.java:27)
at tess.Test.main(Test.java:9)
I'm not entirely sure. Can you tell me what Lept4J file you are using and where you got it from?
(Please don't delete comments arbitrarily. My previous one was about the problems I was facing while trying to follow your video.) At 18:30, where you are configuring the data path, why not use a relative path such as "../tessdata"? Also, will this work if it is compiled into an executable JAR?
I didn't delete your comment, or any other. I'm sorry this happened to you.
I didn't use a relative path but I imagine it would work if you did. I am also not sure if it compiles into an executable JAR.
@@alexandramccusker4158 Thank you for the reply. Yes, I realize that this is a tutorial for Tess4J and not for strict Java coding standards. In any case, if anyone is interested, you can use this: instance.setDatapath(System.getProperty("user.dir") + "\\tessdata");
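A slightly more portable variant of the same idea, offered only as a sketch: building the path with `java.nio.file.Paths` avoids hard-coding the Windows `\\` separator. The class and method names here are hypothetical; the resulting string is what you would pass to `instance.setDatapath(...)` from the comment above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch: build the tessdata path without hard-coding a path separator.
// TessdataPath and resolve(...) are hypothetical names for illustration.
class TessdataPath {

    /** Returns the absolute, normalized path of a "tessdata" folder inside baseDir. */
    static Path resolve(String baseDir) {
        return Paths.get(baseDir, "tessdata").toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        // user.dir is the directory the JVM was started from.
        Path tessdata = resolve(System.getProperty("user.dir"));
        System.out.println("Data path to hand to setDatapath: " + tessdata);
    }
}
```

Because `user.dir` is the launch directory rather than the location of the JAR, an executable JAR started from another folder would look for `tessdata` there, so this approach assumes the program is launched from the project or install directory.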
@@alexandramccusker4158 My previous comment had to do with copying of files. In the video, your mouse pointer moves differently from the items selected. I had Eclipse open and was playing / pausing the vid to follow along. When you said, "we need this file and this file", I was having trouble as you were not mentioning the names and the mouse was elsewhere.
Hello.
Do you have any idea how well Tess4j works in terms of handwriting recognition?
I haven't actually tried it with handwriting before.
I tried it now, unfortunately it doesn't work flawlessly. It's fairly inaccurate.
I am unable to read a JPG files. Could someone share me the code for reading JPG files if possible.
convert it to a png in ur code lol
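The conversion suggested above could be sketched with the JDK's own `javax.imageio` classes, as below. This is an assumption about what the commenter meant, not code from the video, and the class and method names are hypothetical. It only re-encodes the image; whether that resolves a particular JPG problem in Tess4J is not something the thread confirms.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

// Sketch: re-encode a JPG file as PNG using only the JDK.
// JpgToPng and convert(...) are hypothetical names for illustration.
class JpgToPng {

    /** Reads the image at 'jpg' and writes it to 'png' in PNG format. */
    static File convert(File jpg, File png) throws IOException {
        BufferedImage image = ImageIO.read(jpg);
        if (image == null) {
            // ImageIO.read returns null when no registered reader can decode the file.
            throw new IOException("Not a readable image: " + jpg);
        }
        if (!ImageIO.write(image, "png", png)) {
            throw new IOException("No PNG writer available");
        }
        return png;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JpgToPng <input.jpg> <output.png>");
            return;
        }
        File result = convert(new File(args[0]), new File(args[1]));
        System.out.println("Wrote " + result.getAbsolutePath());
    }
}
```

The PNG file written by `convert` can then be passed to the OCR call in place of the original JPG.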
Tesseract is not written in C#.
Tesseract is written in C++ not C#
thx bb :)
Hey, could you please send me your working project if possible?
will this work in ubuntu ?
I don't know.
Yes, this works in Ubuntu.
15:28
With Maven it takes 5 minutes, and that's an overestimate.
With which dependencies?
@@LucasRampillon Can't say off the top of my head, but check Stack Overflow.
Alexandra, I love you :)