“Automation 2.0 coming…No more boring data entry job”
- Published 1 Jun 2024
- The real AI automation is coming: let GPT read invoices and enter the data into Xero. A step-by-step guide, from extracting structured data from documents to sending the data to Xero, HubSpot and more.
🤘 Get 1 month Pro plan on make.com free: www.make.com/en/register?pc=j...
🔗 Links
- Follow me on twitter: / jasonzhou1993
- Join my AI email list: www.ai-jason.com/
- My discord: / discord
- Github link: github.com/JayZeeDesign/gpt-d...
- Zoum’s video on extracting data from PDF: • How to Extract Text fr...
- No code alternative: relevanceai.com/
⏱️ Timestamps
0:00 Intro
1:35 Quick demo
2:05 Step1: PDF to Text
6:05 Step2: LLM extract structured data
7:55 Step3: Streamlit GUI
10:48 Step4: Xero integration
16:00 No code alternative
👋🏻 About Me
My name is Jason Zhou, a product designer who shares interesting AI experiments & products. Email me if you need help building AI apps! ask@ai-jason.com
#gpt #autogpt #ai #artificialintelligence #tutorial #stepbystep #openai #llm #largelanguagemodels #largelanguagemodel #langchain #nocode #langflow #flowise #chatgpt #automation #aiautomation #aiautomationagency - Science & Technology
I think yours is the only channel that shows practical usage for gpt and automation with existing tools.
I learn a lot here, thank you man.
That is the fact, this is my go to channel for learning.
This is an incredible SaaS product on its own. Now you just need an easy-to-use frontend for the user to take pictures and export a well-defined Excel spreadsheet. Incredible work!
Thanks! Good idea to turn this into a micro SaaS with a simple scanning function
the bottleneck is: no company wants to send private, highly sensitive data as cleartext to OpenAI's ChatGPT for processing. Not in the USA, not in Europe.
@@amandamate9117 maybe some encrypted solution
@@amandamate9117 Any large company could afford to run an open source LLM internally on a private network.
EDIT: or even private Microsoft OpenAI endpoints
@@amandamate9117 the OpenAI API data is never used for training etc…
Great video Jason, you are awesome at explaining these things. I personally support doing more of these guides in core coding format like here, it is super helpful for understanding.
Your videos are pretty helpful. The way you logically explain each tool is helpful.
Hey Jason, you are the greatest teacher I have encountered! This is exactly how people need to learn to build AI apps. You're going to be very successful if you keep teaching us like this. Thank you for all the great work, man!
Thank you for all of these videos bro, please keep making them!
Jason is always giving amazing and practical use cases
Thank you Jason. Great work as always. Very practical user case
As a beginner in ML I am very glad to have found your channel. I learn a lot, and you make every topic understandable. Many thanks
dude you're on fire! keep it up, I can't wait to apply knowledge from your videos
Thank you!!
Hi Jason, fantastic video, I learned a lot from your content. Please keep up the good work. Cheers
Fantastic simple walk-thru of e2e Business Scenario
man, your content is brilliant, by the way the thumbnails ROCK :)
This content is so GREAT. Thank you. Very transparent.
That is amazing. Thank you so much for this great code and tutorial!
Thank you for exposing me to Make, just signed up. great tool will use this in a lot my projects, and it will make my life a lot easier.
Freaking awesome video, Jason! So much info! Keep these videos coming!
Another great video, Jason!
Another banger tutorial, thanks!
Good Job Jason. Top content🔥
Incredible work as always Jason!
P.S.: I just realised Jin Yang and you have over 90% resemblance. What a doppelgänger! Minus the hair of course
Another wonderful tutorial, thank you so much Jason ❤. In a perfect world, there should be no manual intervention: the POS machine should just talk to the bank, with AI in the middle transforming the semi-structured or unstructured data into structured data, which then gets fed into your online banking and accounting software. Scanning is a serious pain when the number of transactions gets large, and digitised receipts save a lot of trees and ink too 😂
Awesome tutorial !! 🎉
This is amazing, love all your content, thank you! Would you be able to make this video’s git public? Also, love the thumbnails 😂
I was here before he was subscribed by every AI enthusiast
Incredible video as always, thank you!
Thank you for the video.
Brother, you’re amazing
Great channel!
Thanks!
You are a genius bro. 👏
Exactly what I’m looking for 😭
This is Amazing !
Very nice video. Have you tried to use function calling in GPT instead of asking it to return a string json ?
I was thinking of using this just to extract text from PDFs for embedding, if it's better than LangChain. I guess your example is good for forms and invoices, but for an instructional document or a PDF of Wikipedia, Tesseract doesn't handle some data that well.
but still its a very good guide.. thx for sharing
waiting for this
Best llm content on the web!
Why OCR instead of native PDF text retrieval, though? Don’t you risk OCR-related mistakes?
I mean, you already have the “real” text! Thank you
Seriously, this is a great video for educational purposes, and I have two specific questions: 1) have you got access to the GPT-4 API? 2) this is great educational content; have you ever thought about productizing an idea such as this one? Filing tax returns seems to be in high demand for a lot of people
I like the approach above here , as I require to do alot of admin work as well.
Was wondering is there a way to protect your data ? Bit concern with data privacy!! T.T
Surprised you could not access the PDF object model to get text from the pages. But yes, Tesseract does work well
believe me you do fantastic AI use case to handle business processes which anyone can use to get a job in AI. It will be great if you can do more use case in AI. would be really helpful to me and many others. At the end thanks a lot. 😃
Awesome. Jason do you offer 1-on-1 consulting?
Hey Jason, your content is really amazing. Thanks for creating AI related content. I wanted to ask if there's any advantage of saving the image in jpeg format before extracting text because if there's no actual advantage the same can be done with just 3 lines of code which also makes the process faster.
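import pypdfium2 as pdfium
from pytesseract import image_to_string

def parse_pdf(file_path, scale=300/72):
    # Render each page at about 300 DPI (PDF points are 1/72 inch) and OCR it in memory.
    pdf_file = pdfium.PdfDocument(file_path)
    # Assumes a pypdfium2 release that still provides PdfDocument.render().
    renderer = pdf_file.render(
        pdfium.PdfBitmap.to_pil,
        scale=scale
    )
    return "\n".join(image_to_string(img) for img in renderer)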
Hey, you seem to understand the field. Looking to launch this idea into the market? Sales guy here looking for a tech cofounder. Cheers!
Why did you use a simple langchain prompt template instead of using openai’ s function api to get the structured data?
Would not function calling be more appropriate for formatting invoice data into a JSON format you need?
when Jason drops a video I can't click fast enough
This is a great as you.
Hi Jason, when I try to run your code I encounter the following error: PdfiumError: Failed to load document (PDFium: File access error). Do you know what might be causing this and how to rectify it? Thanks
+1 I am facing the same error. Appreciate if someone has an advice on how to solve it
i also have the same error :/, someone help pls
facing same error
The NamedTemporaryFile is getting deleted. You can change it like - with NamedTemporaryFile(suffix='.pdf',delete=False) as f:
Thanks. It worked for me.@@bibinbalakrishnan
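The fix suggested in this thread can be sketched as follows. This is a minimal illustration, not the code from the video: it assumes the uploaded PDF arrives as raw bytes, and `save_upload_to_temp_pdf` is a hypothetical helper name. With `delete=False` the temporary file stays on disk after it is closed, so a PDF library can reopen it by path; the caller then has to remove it.

```python
import os
from tempfile import NamedTemporaryFile


def save_upload_to_temp_pdf(data: bytes) -> str:
    """Write uploaded bytes to a temporary .pdf file and return its path.

    Hypothetical helper for illustration only.
    """
    # delete=False keeps the file on disk after the `with` block closes it,
    # so another library can reopen it by path afterwards.
    with NamedTemporaryFile(suffix=".pdf", delete=False) as f:
        f.write(data)
        return f.name


if __name__ == "__main__":
    path = save_upload_to_temp_pdf(b"%PDF-1.4 placeholder bytes")
    try:
        print(os.path.exists(path))  # the file is still there after closing
    finally:
        os.remove(path)  # clean up manually, since delete=False
```

The trade-off is that cleanup becomes the caller's job, hence the `try`/`finally` in the usage example.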
And are those language model libraries available in Python? You said you would explain it later in the video, but I think you didn't
If the PDF has many pages (for example, a contract), do I need to go through the process of splitting it into smaller chunks, or can I simply insert any PDF, regardless of the text size?
the function auto split them into pages!
I'm curious how's the accuracy of pytesseract. I did the exact same project a long time ago (it's in production up to this date) and we used Google Vision API to perform OCR. The biggest issue is that although the accuracy is at idk like 99.9% it's still at least one wrong character recognized in each invoice! And since there's a lot of numeric data (prices, VAT values, amounts, different units of measures) writing validation for this all took more time than the rest of the project. You never actually knew what the OCR will return and you REALLY don't want to put the wrong data for accounting.
And actually here's the thing, in the video the Transaction ID wasn't recognized 100% correctly
@@senxo.visuals You're right, it's missing an extra W @5:46, eagle eyes🦅! I suppose you could feed this output to another LLM that checks whether the number sequences from another run match, repeating until you reach whatever accuracy you want. It wouldn't ever be perfect, though, and the cost would add up quickly💸
yh thats one of the problems w all these ai apps, problems where u need to be 100% accurate or there could be big consequences is hard to actually solve with ai
@@TheParagamer ohh Having 2 OCR service to do text extraction & LLM to validate, this is 🧠
@@senxo.visuals ahh good catch! i really like @TheParagamer idea on having 2 service for validating the result, will give it a try
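The validation idea discussed in this thread could look roughly like the sketch below. It is an assumption-laden illustration, not something shown in the video: the field names and helper functions are hypothetical, and extracted values are assumed to arrive as strings. It shows two cheap checks that need no second LLM call: whether the line amounts add up to the stated total, and which fields differ between two independent OCR runs.

```python
from decimal import Decimal, InvalidOperation


def parse_amount(text: str) -> Decimal:
    """Parse an amount such as '1,234.50' or '$99.00'; raise ValueError otherwise."""
    cleaned = text.replace(",", "").replace("$", "").strip()
    try:
        return Decimal(cleaned)
    except InvalidOperation as exc:
        # Typical OCR slip: a letter O read in place of a zero.
        raise ValueError(f"not a valid amount: {text!r}") from exc


def totals_match(line_amounts: list[str], total: str) -> bool:
    """Check that the line amounts add up exactly to the stated total."""
    line_sum = sum((parse_amount(a) for a in line_amounts), Decimal("0"))
    return line_sum == parse_amount(total)


def fields_that_disagree(run_a: dict[str, str], run_b: dict[str, str]) -> list[str]:
    """Return the field names whose values differ between two extraction runs."""
    keys = sorted(set(run_a) | set(run_b))
    return [k for k in keys if run_a.get(k) != run_b.get(k)]


if __name__ == "__main__":
    print(totals_match(["100.00", "25.50"], "125.50"))
    print(fields_that_disagree(
        {"transaction_id": "TXN-1234", "total": "125.50"},
        {"transaction_id": "TXN-1284", "total": "125.50"},
    ))
```

`Decimal` is used instead of `float` so that currency sums compare exactly. Fields flagged by either check would still need a human to look at the original invoice.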
when the invoice has a subtotal with an indented item, it gets read as a duplicate item (as pytesseract doesn't recognize the indent), and therefore the total doesn't match the invoice total... do you have any suggestions for this kind of error?
Hey Jason, this is great! But can you use Llama 2 to achieve the same?
great content! thank you for taking the time to put together and share!
With function calling, is it more convenient for LLMs to extract structured data?
I'm getting an error that says Import "dotenv" could not be resolved Pylance (reportMissingImports) [Ln 4, Col 6], what am I doing wrong?
Not clear yet: what are the output differences between pypdf/LangChain PDF loading and pdf->img->text? Does the latter keep some of the structure of the info in a certain way, or what's good/bad about these 2 approaches?
When I tried pypdf/langchain unstructured file upload, it only extract like 10~20% of the text from img, so almost unusable
Can you do an assist filling form using langchain tutorial?
tons of companies have been doing this with OCR. I don't know what are you saying!!
But ocr is built already right? why cant we directly use that
One of the key takeaways from this amazing tutorial is: AI by itself will not replace you, but rather the one who uses AI effectively is the one who will, insha’Allah (God willing). So go learn how to use AI in your day-to-day job now and impress your employers with your ideas.
Great tutorial Jason.
Given the thumbnail I have to ask... when do we get the Hot Dog or Not Hot Dog App?
Hi Jason! thanks for the great video. looks like your github link is broken. would love an updated link to access the code!
Sorry forgot to set it public, just updated it! github.com/JayZeeDesign/gpt-data-extraction
how do I get the file_url to be passed from make to relevance?
AI Jason is a must watch, now I wanna make a copycat of him on Chinese web, what about NewAI Jason for my channel 👨🏿🔧👨🏿🔧👨🏿🔧
AWS has a really good system to extract data from a document and it costs $1.50 per 1000 pages... so it's super efficient
oh nice, didnt know that, will give it a try! whats the name of the service?
please attach link to medium article
What about safety concerns regarding data? Anyway to overcome this? Good video.
How does GPT4 vision affect this ? better or worse?
🤬 why am I having so much trouble importing? What am I missing?
Nice one, man!
all repetitive work using computers can be automated within 2 years by ai.
Why are we converting the PDF to images? Instead, you can use any Python library to get text from a PDF
would using OpenAI's function calling be useful here?
You can try function calling for data extraction for sure! but still need a way to turn PDF text well first
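Several comments ask about function calling. As a rough sketch of what that could involve, the block below defines a tool in the JSON-schema shape used by OpenAI-style function calling and parses the arguments string such a call returns. Everything specific here is an assumption for illustration: the function name `record_invoice`, the field names, and the sample arguments string are made up, and no API request is made.

```python
import json

# Tool definition in the JSON-schema shape used by OpenAI-style function calling.
# The function name and every field name are hypothetical.
RECORD_INVOICE_TOOL = {
    "type": "function",
    "function": {
        "name": "record_invoice",
        "description": "Record the structured fields extracted from one invoice.",
        "parameters": {
            "type": "object",
            "properties": {
                "invoice_number": {"type": "string"},
                "invoice_date": {"type": "string", "description": "ISO date"},
                "total": {"type": "number"},
            },
            "required": ["invoice_number", "invoice_date", "total"],
        },
    },
}


def parse_tool_arguments(arguments_json: str) -> dict:
    """Decode a tool call's arguments string and check the required keys."""
    data = json.loads(arguments_json)
    required = RECORD_INVOICE_TOOL["function"]["parameters"]["required"]
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return data


if __name__ == "__main__":
    # Made-up example of the arguments string a model might return.
    sample = '{"invoice_number": "INV-001", "invoice_date": "2023-07-01", "total": 125.5}'
    print(parse_tool_arguments(sample))
```

The schema constrains the shape of the reply, but the values still come from the OCR text, so it does not remove the need for a good PDF-to-text step.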
Hey everyone, im pretty new to all of this. im the type to just dive in and do, i keep getting this error after i pip install anything "is not recognized as the name of a cmdlet, function, script file, or operable program." any help?
Ask chatgpt
The GitHub link seems to be broken. Could you repost the link pls? :)
he took it down looks like he will be turning it into a micro service
Sorry forgot to set it public, just updated it! github.com/JayZeeDesign/gpt-data-extraction
@@jmanhype1 Sorry forgot to set it public, just updated it! github.com/JayZeeDesign/gpt-data-extraction
hey, github link doesn't work :(
Sorry forgot to set it public, just updated it! github.com/JayZeeDesign/gpt-data-extraction
@@AIJasonZ Thanks!
Vid content aside, your voice really does sound a lot like Jimmy O. Yang, hahahaha
This is nice. maybe deploying these models on MS Azure so we can have their API?
and for the next video try making a simple streamlit app with that API
Really appreciate the work you are doing. Thank you very much
wow the python coding tutorials keep getting more and more complicated lately thats good.
Jason for president 2024
Can you do the all SLOWLY.. Again I COULDN'T FOLLOW YOU 😮😮😮😮
Each time he says pdffiles 👀
Use new Microsoft office features xs
im a marketer, i just don't understand the whole coding part, it's like chinese for me.
Anyone with tech background wanna work on this? I'm looking to launch a SaaS company and I have more than 10 years in Sales working on B2B Finance. Reply here and I will get in touch!
the bottleneck is: no company wants to send private, highly sensitive data as cleartext to OpenAI's ChatGPT for processing. Not in the USA, not in Europe.
yea you are right; I'm making a new video about how companies can handle data privacy soon, so hopefully it can address that :) But in general, hosting on a private cloud or using an open-source LLM should solve that
Looking forward to this video!
What you call a boring data entry job feeds millions of families around the world where the bread "earner" has no better skills. I find the attitude of CS and esp. AI folks distasteful. You guys are so flippant about the destruction of families and communities caused by AI taking over jobs. There will be a day of reckoning, I am afraid, when the world turns against CS folks. Please watch your language and leave the commentary out....
Thanks!