Thanks a lot for sharing this better OCR Engine
Can docTR OCR be used for commercial purposes?
Thanks for such an informative video.
Where can I get the code?
Is there any way to turn the exported JS object/JSON back into a PDF?
Thanks a lot for sharing this concept.
Can you explain how to train docTR's text detection and recognition models?
Pls
I have a problem: I want the extracted text in the same format as the image. Can you tell me how to get structured output that matches the image?
Please can you make a video on how to fine-tune docTR on a custom dataset?
Hi, please help me.
I got this error: partially initialized module 'doctr.models' has no attribute 'classification' (most likely due to a circular import)
Do you have any process for getting text from scans of different banks' passbooks? Information like account holder name, account no., nominee name and IFSC code, saved in a dataframe.
But remember, the passbooks all have different layouts and different clarity and quality.
You can train a layout model to extract such entities from the banks' templates.
@karndeepsingh How do I train a layout model, Karn?
@karndeepsingh Can we extract only the needed text from entities as key-value pairs, like (account number: 12345)?
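For the key-value question above, a rule-based pass over the OCR text is one possible starting point before training anything. This is only a rough sketch: the field names, the label wording in the patterns and the 5 to 18 digit range for account numbers are assumptions made for illustration, and real passbooks will need their own patterns. The IFSC pattern follows the usual 11-character shape (four letters, a zero, six alphanumerics).

```python
import re

# Hypothetical patterns; adjust the label wording to what your passbooks print.
FIELD_PATTERNS = {
    # e.g. "Account Number : 12345678", "A/C No. 12345678" (digit range is assumed)
    "account_number": re.compile(
        r"(?:a/c|account)\s*(?:no\.?|number)\s*[:\-]?\s*(\d{5,18})", re.IGNORECASE
    ),
    # Usual IFSC shape: 4 letters, a literal 0, then 6 alphanumerics
    "ifsc": re.compile(r"\b([A-Z]{4}0[A-Z0-9]{6})\b"),
}


def extract_fields(text: str) -> dict:
    """Return the first match for each field, or None when nothing matches."""
    found = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        found[name] = match.group(1) if match else None
    return found


if __name__ == "__main__":
    ocr_text = "Account Number : 12345678\nIFSC Code: SBIN0001234"
    print(extract_fields(ocr_text))
```

Names such as account holder or nominee do not follow a fixed pattern, so they are where a layout model or NER is more likely to be needed.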
Not able to read the PDF file.
Error: module 'pypdfium2' has no attribute 'render_pdf_topil'
You need to downgrade pypdfium2: pip install pypdfium2==1.0.0
Can I use it offline?
Hi, I am facing an error related to doctr_io.
Hi buddy, I followed your video "OCR Text from PDFs and Image Documents using docTR | Better than Tesseract OCR | Text Extraction" and got a JSON file of the text in my images. Can you tell me how to get that text into a .txt or .docx file, or any other format you suggest, while keeping the same structure it had in the image? I have tried everything I could think of and nothing worked. Can you help me out? It's for my FYP. Thanks in advance.
Same situation here. I tried everything I could too. I used PaddleOCR, which gives text output, but not in the same structured layout as the image.
Use result.render() 😊 instead of .export()
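If all you have is the saved JSON, something like the sketch below can rebuild roughly positioned text from it. It assumes the JSON has the pages → blocks → lines → words nesting, with each word carrying a "value" and a "geometry" of ((xmin, ymin), (xmax, ymax)) in relative 0 to 1 coordinates (straight, non-rotated boxes). The function name and the `cols` parameter are made up for this example, and it does not merge lines from different blocks that sit on the same visual row.

```python
def render_with_layout(exported: dict, cols: int = 80) -> str:
    """Rebuild plain text from an export-style dict, indenting words by x position."""
    out = []
    for page in exported.get("pages", []):
        lines = [ln for blk in page.get("blocks", []) for ln in blk.get("lines", [])]
        # Order lines top-to-bottom, then left-to-right, using the top-left corner
        lines.sort(key=lambda ln: (ln["geometry"][0][1], ln["geometry"][0][0]))
        for ln in lines:
            text = ""
            words = sorted(ln.get("words", []), key=lambda w: w["geometry"][0][0])
            for word in words:
                col = int(word["geometry"][0][0] * cols)  # target column for this word
                # Keep at least one space between consecutive words
                target = max(col, len(text) + 1) if text else col
                text = text.ljust(target) + word["value"]
            out.append(text.rstrip())
        out.append("")  # blank line between pages
    return "\n".join(out).rstrip()


if __name__ == "__main__":
    sample = {"pages": [{"blocks": [{"lines": [{
        "geometry": [[0.0, 0.1], [0.8, 0.15]],
        "words": [
            {"value": "Name", "geometry": [[0.0, 0.1], [0.2, 0.15]]},
            {"value": "John", "geometry": [[0.5, 0.1], [0.8, 0.15]]},
        ],
    }]}]}]}
    print(render_with_layout(sample, cols=40))
```

The returned string can be written out with `open("output.txt", "w", encoding="utf-8")`; a monospace font is needed for the columns to line up.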
Hey, did you try swapping in different extraction algorithms like Master or sar_resnet31? I tried and they're not working. Did they not release those models as open source?
I haven't tried different model variants, but it should work.
Hey, how do we convert a scanned-image PDF containing many individuals' ID cards into Excel?
If you want specific fields extracted, you can run object detection (only if the templates stay the same) and then apply OCR to the detected regions; otherwise, apply OCR first and then NER.
After extracting the text, could you please show us how to store the values in an Excel file or a dataframe?
Once you have the JSON output, you can convert it to any format.
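As one way to do that conversion, the sketch below flattens an export-style dict into one row per word and writes the rows as CSV using only the standard library. It assumes the pages → blocks → lines → words nesting with "value" and "confidence" on each word; the function names and column names are invented for this example.

```python
import csv
import io

COLUMNS = ["page", "block", "line", "text", "confidence"]


def export_to_rows(exported: dict) -> list:
    """Flatten the nested export dict into one row (dict) per recognised word."""
    rows = []
    for page in exported.get("pages", []):
        for b_idx, block in enumerate(page.get("blocks", [])):
            for l_idx, line in enumerate(block.get("lines", [])):
                for word in line.get("words", []):
                    rows.append({
                        "page": page.get("page_idx", 0),
                        "block": b_idx,
                        "line": l_idx,
                        "text": word["value"],
                        "confidence": word.get("confidence"),
                    })
    return rows


def rows_to_csv(rows: list) -> str:
    """Serialise the rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    sample = {"pages": [{"page_idx": 0, "blocks": [{"lines": [{"words": [
        {"value": "Account", "confidence": 0.99},
        {"value": "12345", "confidence": 0.87},
    ]}]}]}]}
    print(rows_to_csv(export_to_rows(sample)))
```

If pandas is installed, the same list of rows can be passed to `pandas.DataFrame(rows)`, and from there `to_excel` (which needs an Excel writer such as openpyxl) gives an .xlsx file.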
Thanks for the video. When I try to install docTR on Jupyter, I get the following error:
OSError: cannot load library 'gobject-2.0-0': error 0x7e. Additionally, ctypes.util.find_library() did not manage to locate a library called 'gobject-2.0-0'
However, I am able to install it on Google Colab. Any help with the Jupyter installation would be great!
Maybe some of the dependencies have changed.
You can try installing an older version of the OCR package.
Nice video. Could you please share the Colab notebook link?
I am facing an error with pypdfium2: AttributeError: module 'pypdfium2' has no attribute 'render_pdf_topil'. I even downgraded pypdfium2 to 1.0.0, but that didn't solve it. Could you shed some light on it?
Thanks
Hey, did you find any solution yet?
Hi. Please make a video on extracting Hindi tables (text in Devanagari, UTF-8) from images to CSV. I searched a lot on the internet but couldn't find any video or method. Please make a video on this; it would help a lot.
Sure.
Code