- 13
- 280 017
LLMs for Devs
Canada
Joined Jun 14, 2023
Developers learning AI & LLM.
app.catswithbats.com/90d4bd29
Comparing 10 different models, including Gemini Flash 2.0, Grok, Claude, GPT, and Llama, for OCR
Code: github.com/trancethehuman/ai-workshop-code/tree/main/projects/ocr-battle
My deep dive content (paid) app.catswithbats.com/90d4bd29
Follow me on X: x.com/haithehuman
This workshop was made possible by Invest Ottawa. IO supports tech founders across the National Capital Region of Canada through their Venture Path programs. Each step in the IO Venture Path is designed to shorten the path to growth. They help today’s entrepreneurs leverage the experience, expertise and insights of business leaders who have been there, done that: successfully launched and grown world-class technology enterprises. To date, they’ve supported over 1100 startups and scaleups.
If you’re a tech founder in the National Capital Region (or thinking of becoming one) - check out IO’s Venture Path: www.investottawa.ca/venture-path/
Views: 4,363
Videos
Compare Tavily, Perplexity API, Google Search Grounding (Gemini), Exa with LLM as Judge in LangSmith
1.7K views · 1 month ago
Code: github.com/trancethehuman/ai-workshop-code/tree/main/projects/web_search_battle judges library: pypi.org/project/judges/ My exclusive AI Engineer content (premium): app.catswithbats.com/90d4bd29 Follow me on X: x.com/haithehuman This workshop was made possible by Invest Ottawa. IO supports tech founders across the National Capital Region of Canada through their Venture Path programs. Each...
Setup your first LLM observability traces with LangSmith and iterate on prompts with Quotient AI
2.6K views · 2 months ago
Code: github.com/trancethehuman/ai-workshop-code/tree/main/tracing_eval Tools mentioned: www.quotientai.co/ www.langchain.com/langsmith My exclusive AI Engineer content (premium): app.catswithbats.com/90d4bd29 Follow me on X: x.com/haithehuman This workshop was made possible by Invest Ottawa. IO supports tech founders across the National Capital Region of Canada through their Venture Path progr...
Onboard new users faster by scraping their websites and use LLM to extract info (Groq / Firecrawl)
3.1K views · 2 months ago
Code: github.com/trancethehuman/ai-workshop-code/blob/main/Onboard_customers_quickly_firecrawl_llm_extract.ipynb Tools mentioned: Firecrawl: www.firecrawl.dev/ Groq: groq.com/ My exclusive AI Engineer content (premium): app.catswithbats.com/90d4bd29 This workshop was made possible by Invest Ottawa. IO supports tech founders across the National Capital Region of Canada through their Venture Path...
Agentically scrape the web with Firecrawl & LangGraph (LangChain)
3.9K views · 2 months ago
Code: github.com/trancethehuman/ai-workshop-code/blob/main/Scrape_the_web_agentically_with_Firecrawl_and_LangGraph.ipynb Firecrawl: www.firecrawl.dev/ LangGraph: www.langchain.com/langgraph 🧠 My exclusive AI Engineer content (premium): app.catswithbats.com/90d4bd29 Follow me on X: x.com/haithehuman
Don't naive RAG do hybrid search instead (Pinecone Weaviate or pgvector + full text search & rerank)
11K views · 5 months ago
We'll compare hybrid search using Pinecone, Weaviate, and Postgres (Supabase) full-text search plus pgvector, then rerank the results using Jina AI's Reranker. 👨💻 Code: github.com/trancethehuman/ai-workshop-code/blob/main/Hybrid_Search_Workshop.ipynb Tools mentioned: @pinecone-io: www.pinecone.io/ @Weaviate: weaviate.io/ @Supabase: supabase.com/ @JinaAI Reranker: jina.ai/reranker/ Vecs (Python): github....
How to scrape the web for LLM in 2024: Jina AI (Reader API), Mendable (firecrawl) and Scrapegraph-ai
191K views · 7 months ago
👨💻 Code: github.com/trancethehuman/ai-workshop-code/blob/main/Web_scraping_for_LLM_in_2024.ipynb (if there are issues with viewing the code, just fork and clone the repository. It's just a current problem with GitHub's way of displaying Jupyter notebooks - nbconvert) Tools mentioned: Jina AI: jina.ai/reader Mendable's Firecrawl: www.firecrawl.dev/ Scrapegraph-ai: github.com/VinciGit00/Scrapegr...
5 tiers of long-term memory and personalization for LLM applications (in-person workshop)
6K views · 7 months ago
👨💻 Code: github.com/trancethehuman/ai-workshop-code/blob/main/Long_term_memory_&_personalized_LLM_responses.ipynb 🧠 My exclusive AI Engineer content (premium): app.catswithbats.com/90d4bd29 Follow me on X: x.com/haithehuman This workshop was made possible by Invest Ottawa. IO supports tech founders across the National Capital Region of Canada through their Venture Path programs. Each step in t...
Scrape any website with OpenAI Functions & LangChain
50K views · 1 year ago
👨💻 GitHub code: github.com/trancethehuman/entities-extraction-web-scraper 🚌 Premium AI for developers content: app.catswithbats.com/90d4bd29/a15db702 👋 Follow me on X / Twitter: haithehuman #ai #langchain #openai #webscraping #llm
AI Companion Clone (Replika) with LangChain and OpenAI Functions as entity extractor
1.9K views · 1 year ago
Build an AI companion using this codebase. Sign up for my paid courses: tally.so/r/n9daQ1
Pi (inflection.ai) has become Samantha from Her (2014)
1.2K views · 1 year ago
Pi (inflection.ai) has become Samantha from Her (2014)
Can't pick a vector database? Use this pattern and try them all.
1.4K views · 1 year ago
GitHub: github.com/trancethehuman/factory-pattern-vectorstore-interface Connect with me: www.linkedin.com/in/haiphunghiem/ #vectorstore #langchain #openai #gpt #gpt-3.5
Langfuse is open source and can be self-hosted. Or you can go the old reliable route: log everything and run something like the ELK stack on your logs.
true
My gut feeling tells me that the only way to actually have long-term memory without the flaws and scaffolding that always accompany RAG is continuous fine-tuning.
Great video, thank you! By the way, which LLM was used as the judge model? And doesn't using one LLM over another as the judge bias the evaluation?
How can we store the fitted model? I want to use the fitted BM25 model repeatedly in my app. Is there a way to keep it?
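One possible way to approach the question above, as a minimal sketch only. The `TinyBM25` class, its `save`/`load` methods, and the JSON file layout are hypothetical and written for illustration; they are not the encoder used in the video. The idea is that a fitted BM25 model is just a handful of corpus statistics, so it can be written to disk once and reloaded at app startup instead of being refitted.

```python
import json
import math
from collections import Counter


class TinyBM25:
    """Illustrative Okapi BM25 scorer (hypothetical, not the video's library)."""

    def __init__(self, k1=1.5, b=0.75):
        self.k1 = k1
        self.b = b

    def fit(self, corpus):
        # corpus is a list of documents, each a list of tokens.
        n = len(corpus)
        self.doc_freqs = [dict(Counter(doc)) for doc in corpus]
        self.doc_lens = [len(doc) for doc in corpus]
        self.avgdl = sum(self.doc_lens) / n
        # Number of documents that contain each term.
        df = Counter(term for doc in corpus for term in set(doc))
        self.idf = {
            term: math.log(1 + (n - count + 0.5) / (count + 0.5))
            for term, count in df.items()
        }
        return self

    def scores(self, query):
        # Returns one BM25 score per fitted document, in corpus order.
        result = []
        for freqs, doc_len in zip(self.doc_freqs, self.doc_lens):
            score = 0.0
            for term in query:
                f = freqs.get(term, 0)
                if f == 0:
                    continue
                norm = f + self.k1 * (1 - self.b + self.b * doc_len / self.avgdl)
                score += self.idf[term] * f * (self.k1 + 1) / norm
            result.append(score)
        return result

    def save(self, path):
        # The fitted state is plain numbers and dicts, so JSON is enough.
        state = {
            "k1": self.k1,
            "b": self.b,
            "doc_freqs": self.doc_freqs,
            "doc_lens": self.doc_lens,
            "avgdl": self.avgdl,
            "idf": self.idf,
        }
        with open(path, "w", encoding="utf-8") as fh:
            json.dump(state, fh)

    @classmethod
    def load(cls, path):
        with open(path, encoding="utf-8") as fh:
            state = json.load(fh)
        model = cls(state["k1"], state["b"])
        model.doc_freqs = state["doc_freqs"]
        model.doc_lens = state["doc_lens"]
        model.avgdl = state["avgdl"]
        model.idf = state["idf"]
        return model
```

Usage would be `TinyBM25().fit(corpus).save("bm25.json")` once, then `TinyBM25.load("bm25.json")` wherever the app needs it. If the BM25 library you use ships its own persistence helpers, prefer those.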
Bro, that chewing gum... Just don't.
www.amazon.ca/gp/product/B01CU5ZJP8/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1
it’s bad I know, but I just got this recently and it’s fire
@@devlearnllm 😂👍
Thanks for the video. Can you please make a video on LLM deployment on AWS / LLMOps?
Yes.
How do we configure Jina to scrape specific things from a website (images, avatars, links, etc.)?
Sure!
7:25 Transforming HTML to Markdown? It's amazing. Why don't we do this to the entire Web, get rid of all the front-end JavaScript code and all the bullshit, and leave just the text so we humans can read without the gibberish? Someone needs to make a browser that uses LLMs to read the stupid HTML, get rid of the crap, and present beautiful Markdown that can be rendered directly with DirectX 12 on my quad-HD screen. Maybe I could make that. How about we make the web go back to HATEOAS?
I like that. Can't even read anything on Forbes nowadays with 4 ad banners. If you haven't seen it already, Mozilla has an open source reader mode repo that's really neat.
Hmmm. ChatGPT recommended your video 🎉. Just thought I'd share in case you aren't aware.
that’s crazy. how does that work?
Can someone just give me the conclusion
Gemini 2.0 Flash best.
@@devlearnllm Thank you!
Can anybody explain to me why I should use paid AI services for web scraping rather than using my own scripts with open source libraries like Selenium?
Thank you for the video. Can't wait for you to do the same for TTS & TTS, please 🙏
You got it. I'm thinking of comparing the real-time APIs across a few providers.
Horrible disrespectful chewing. Could not follow it due to chewing.
Unacceptable.
Interesting videos on your channel but could you please not chew gum as the mic picks it up too much? Keep it up 👍
Yep. I got addicted to gum recently but will keep that in mind!
DELETE THIS VIDEO RN, IT'S TOO POWERFUL 🫨
IT'S TOO LATE
This method is outdated now... JINA is not the best choice compared with Crawl4AI + Pydantic
Excellent video!
Thank you! Cheers!
greatttt
Thanks!
Would really appreciate a similar video for translation.
you got it. putting that on my list
Your videos are fun to watch
Great Video! The chewing is a Little bit annoying though.😂
Haha my bad!
@@devlearnllm still a pleasure to watch ;)
Maybe I missed it in the video, but why do you think Exa was not able to answer any of the queries correctly? Like 0% every time.
I didn't have an answer. But I did have high expectations for it given the promises on their landing page, and the docs. I know some folks who have strong opinions about it haha
This is awesome stuff, very high value investment of 15 minutes.
Glad you thought so!
great
That’s a great tutorial/code example. Thanks a lot! It inspired me to include something similar in my project. I especially like the idea of first reading the sitemap and then looping through the pages! And thanks for the other comments with ideas about more scraping tools; I need to look them up! At the moment I’m using DataForSEO, which works pretty well, too.
Glad it helped!
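The sitemap-first idea mentioned in this thread can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the video: `urls_from_sitemap`, `scrape_site`, and `EXAMPLE_SITEMAP` are hypothetical names, the sitemap is passed in as text that has already been downloaded, and `fetch_page` stands in for whichever scraper you call per URL.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol for <urlset> documents.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content, inlined so the sketch runs without network access.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
</urlset>"""


def urls_from_sitemap(xml_text):
    """Return every <loc> URL listed in a sitemap document, in order."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def scrape_site(xml_text, fetch_page):
    """Loop through the sitemap's pages and collect what fetch_page returns.

    fetch_page is any callable taking a URL and returning page content;
    it is a placeholder for a real scraping call.
    """
    return {url: fetch_page(url) for url in urls_from_sitemap(xml_text)}
```

For example, `scrape_site(EXAMPLE_SITEMAP, my_scraper)` would call `my_scraper` once per listed page. Sitemap index files, which point at further sitemaps, are not handled here.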
Great work! Keep it up!
Appreciate it!
awesome vid!
Pure gold, thanks for sharing
Always very good content…..
More content here! app.catswithbats.com/90d4bd29
It's a cool product, but the only issue is Jina getting blocked as a bot, so it's not making it past the "Are you human?" screen.
Yeah, Firecrawl recently fared better
This video made my SaaS possible thanks - I had no idea 5 months ago what LLM scraping was.
I'm glad! I talk further in-depth about web scraping here: app.catswithbats.com/90d4bd29/a15db702
Why isn't this video on the trending list?!
IKR
Hey, I want to connect with you guys. Is there an online community of some kind?
Here you go: app.catswithbats.com/90d4bd29 (P.S. the Discord is in the app once you sign up)
Always happy when you upload a new video
i watch these videos, i have evolved
LFG
allergies season is finally over!
you're a great teacher!
Thanks! Let me know if you have any feedback
Who else was disappointed after clicking the thumbnail and seeing a dude?
me
You still need a lot of compute and storage to actually build her.
exactly
Nice demo <3
Thank you!
Hey, great content! Thanks for sharing your knowledge. However, instead of just using tsvector in PostgreSQL, you can leverage sparse vector search by utilizing the pg_search extension, right?
Yup, they're both full-text search. Or use PGroonga
that sounds illegal?
No because you have to ask your new user for their website to be able to crawl.
crappy camera work, good content
Lol what about the camera work that's bad? I'll try to fix it.
@@devlearnllm Just have it on a tripod
video lacks a good hook
good point; what would make a good hook? Overview of what's gonna be built?
@@devlearnllm overview AND a demo. I would love to see what's the end goal and if it's worth watching for or not. It can be a double edged sword I guess!
@@invictus8441 agreed! Will do for the next vid
@@devlearnllm Your content is outstanding so I already know it's worth watching to the end. I've learned a ton of useful things from you. Please keep it up. BTW, most people will overlook crappy camera work for great content. We're not here to be entertained. Just focus on what's important, the material.
Haha, in the old days I just used lynx URL | perl -e '....' to achieve much the same thing. Nowadays lynx usually doesn't work anymore, since websites no longer contain content but executables, and Perl has become the most hated language around, so nobody likes to parse anything with Perl anymore. My colleagues very often struggle even with regexes. So yeah, using a third-party tool to get content out of websites and then asking an LLM to answer your question is probably a million or a billion times less compute-efficient than the old way, but it can be done by more people. Still, I remember we analyzed huge product catalogues like Amazon, Icecat, .... to get the product descriptions of every category with our approach for probably just some dollars in total, while processing gigabytes of raw content (so close to a billion tokens in these terms).
Thank you for the presentation. What is the best (cheapest) way to scrape data from Instagram? All I need is to traverse the graph of accounts to find accounts above a certain follower threshold.
Are you having difficulty breathing?
Yeah allergies
Firecrawl is not as good as Scrapfly ... Firecrawl seems like it was invented by a single guy in his bedroom, while Scrapfly is enterprise grade.
basically that's how the story went - they built it very recently. But they ship fast and the DX is much better. I'm willing to bet on it.
firecrawl seems cool but I still feel like jina might be a bit better overall...
Jina's a competitive research lab too lol
I like to use Scrapy because it's open source and LLM friendly
Nice! Never tried it but looking now