LLMs for Devs
  • 13
  • 280 017
Comparing 10 different models, including Gemini Flash 2.0, Grok, Claude, GPT, Llama for OCR
Code: github.com/trancethehuman/ai-workshop-code/tree/main/projects/ocr-battle
My deep dive content (paid) app.catswithbats.com/90d4bd29
Follow me on X: x.com/haithehuman
This workshop was made possible by Invest Ottawa. IO supports tech founders across the National Capital Region of Canada through their Venture Path programs. Each step in the IO Venture Path is designed to shorten the path to growth. They help today’s entrepreneurs leverage the experience, expertise and insights of business leaders who have been there, done that - successfully launched and grown world-class technology enterprises. To date, they’ve supported over 1100 startups and scaleups.
If you’re a tech founder in the National Capital Region (or thinking of becoming one) - check out IO’s Venture Path: www.investottawa.ca/venture-path/
Views: 4,363

Videos

Compare Tavily, Perplexity API, Google Search Grounding (Gemini), Exa with LLM as Judge in LangSmith
1.7K views · 1 month ago
Code: github.com/trancethehuman/ai-workshop-code/tree/main/projects/web_search_battle judges library: pypi.org/project/judges/ My exclusive AI Engineer content (premium): app.catswithbats.com/90d4bd29 Follow me on X: x.com/haithehuman This workshop was made possible by Invest Ottawa. IO supports tech founders across the National Capital Region of Canada through their Venture Path programs. Each...
Setup your first LLM observability traces with LangSmith and iterate on prompts with Quotient AI
2.6K views · 2 months ago
Code: github.com/trancethehuman/ai-workshop-code/tree/main/tracing_eval Tools mentioned: www.quotientai.co/ www.langchain.com/langsmith My exclusive AI Engineer content (premium): app.catswithbats.com/90d4bd29 Follow me on X: x.com/haithehuman This workshop was made possible by Invest Ottawa. IO supports tech founders across the National Capital Region of Canada through their Venture Path progr...
Onboard new users faster by scraping their websites and use LLM to extract info (Groq / Firecrawl)
3.1K views · 2 months ago
Code: github.com/trancethehuman/ai-workshop-code/blob/main/Onboard_customers_quickly_firecrawl_llm_extract.ipynb Tools mentioned: Firecrawl: www.firecrawl.dev/ Groq: groq.com/ My exclusive AI Engineer content (premium): app.catswithbats.com/90d4bd29 This workshop was made possible by Invest Ottawa. IO supports tech founders across the National Capital Region of Canada through their Venture Path...
Agentically scrape the web with Firecrawl & LangGraph (LangChain)
3.9K views · 2 months ago
Code: github.com/trancethehuman/ai-workshop-code/blob/main/Scrape_the_web_agentically_with_Firecrawl_and_LangGraph.ipynb Firecrawl: www.firecrawl.dev/ LangGraph: www.langchain.com/langgraph 🧠 My exclusive AI Engineer content (premium): app.catswithbats.com/90d4bd29 Follow me on X: x.com/haithehuman
Don't naive RAG do hybrid search instead (Pinecone Weaviate or pgvector + full text search & rerank)
11K views · 5 months ago
We'll compare hybrid search using Pinecone, Weaviate and then Postgres (Supabase) full text search pgvector and then rerank using Jina AI's Reranker. 👨‍💻 Code: github.com/trancethehuman/ai-workshop-code/blob/main/Hybrid_Search_Workshop.ipynb Tools mentioned: @pinecone-io: www.pinecone.io/ @Weaviate: weaviate.io/ @Supabase: supabase.com/ @JinaAI Reranker: jina.ai/reranker/ Vecs (Python): github....
How to scrape the web for LLM in 2024: Jina AI (Reader API), Mendable (firecrawl) and Scrapegraph-ai
191K views · 7 months ago
👨‍💻 Code: github.com/trancethehuman/ai-workshop-code/blob/main/Web_scraping_for_LLM_in_2024.ipynb (if there are issues with viewing the code, just fork and clone the repository. It's just a current problem with GitHub's way of displaying Jupyter notebooks - nbconvert) Tools mentioned: Jina AI: jina.ai/reader Mendable's Firecrawl: www.firecrawl.dev/ Scrapegraph-ai: github.com/VinciGit00/Scrapegr...
5 tiers of long-term memory and personalization for LLM applications (in-person workshop)
6K views · 7 months ago
👨‍💻 Code: github.com/trancethehuman/ai-workshop-code/blob/main/Long_term_memory_&_personalized_LLM_responses.ipynb 🧠 My exclusive AI Engineer content (premium): app.catswithbats.com/90d4bd29 Follow me on X: x.com/haithehuman This workshop was made possible by Invest Ottawa. IO supports tech founders across the National Capital Region of Canada through their Venture Path programs. Each step in t...
Scrape any website with OpenAI Functions & LangChain
50K views · 1 year ago
👨‍💻 GitHub code: github.com/trancethehuman/entities-extraction-web-scraper 🚌 Premium AI for developers content: app.catswithbats.com/90d4bd29/a15db702 👋 Follow me on X / Twitter: haithehuman #ai #langchain #openai #webscraping #llm
AI Companion Clone (Replika) with LangChain and OpenAI Functions as entity extractor
1.9K views · 1 year ago
Build an AI companion using this codebase. Sign up for my paid courses: tally.so/r/n9daQ1
Pi (inflection.ai) has become Samantha from Her (2014)
1.2K views · 1 year ago
Pi (inflection.ai) has become Samantha from Her (2014)
Can't pick a vector database? Use this pattern and try them all.
1.4K views · 1 year ago
GitHub: github.com/trancethehuman/factory-pattern-vectorstore-interface Connect with me: www.linkedin.com/in/haiphunghiem/ #vectorstore #langchain #openai #gpt #gpt-3.5

COMMENTS

  • @alx8439
    @alx8439 1 day ago

    Langfuse is open source and can be self-hosted. Or you can go the old reliable route: log everything and run something like the ELK stack on your logs

  • @alx8439
    @alx8439 1 day ago

    My gut feeling tells me that the only way you can actually have long-term memory without the flaws and scaffolding that always accompany RAG is continuous fine-tuning.

  • @nobody84980
    @nobody84980 3 days ago

    Great video, thank you! Btw, which LLM was used as the judge model? And doesn't using one LLM rather than another as the judge bias the evaluation?

  • @samarthsaraogi6088
    @samarthsaraogi6088 4 days ago

    How can we store the fitted model? I want to use the fitted BM25 model repeatedly on my app. Is there a way to keep it?

  • @svenst
    @svenst 6 days ago

    Bro, that chewing gum …. Just dont

    • @devlearnllm
      @devlearnllm 6 days ago

      www.amazon.ca/gp/product/B01CU5ZJP8/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1

    • @devlearnllm
      @devlearnllm 6 days ago

      it’s bad I know, but I just got this recently and it’s fire

    • @svenst
      @svenst 6 days ago

      @@devlearnllm 😂👍

  • @alt_realm
    @alt_realm 9 days ago

    Thanks for the video, Can you pls make a video on LLM deployment in AWS/LLMOps

  • @fuhodev9548
    @fuhodev9548 9 days ago

    how to config Jina to scrape what we want in a specific website? (image, avatar, links, etc)

  • @monad_tcp
    @monad_tcp 11 days ago

    7:25 transforming HTML to Markdown ? its amazing. Why don't we do this to the entire Web and get rid of all the front-end code in Javascript and all the bullshit and leave just the Text so we humans can read without the gibberish. Someone needs to make a Browser that uses LLMs to read the stupid HTML and get rid of the crap and present a beautiful markdown that can be directly rendered using DirectX12 to my quad-hd screen. Maybe I could make that. How about we make the web go back to HATEOAS

    • @devlearnllm
      @devlearnllm 8 days ago

      I like that. Can't even read anything on Forbes nowadays with 4 ad banners. If you haven't already seen it, Mozilla has an open source reader mode repo that's really neat

  • @NjabuloHadebe
    @NjabuloHadebe 12 days ago

    Hmmm. ChatGPT recommended your video🎉. Just thought I'd share in case you aren't aware.

    • @devlearnllm
      @devlearnllm 11 days ago

      that’s crazy. how does that work?

  • @loganhallucinates
    @loganhallucinates 16 days ago

    Can someone just give me the conclusion

  • @Lhtokbgkmvfknv
    @Lhtokbgkmvfknv 16 days ago

    Can anybody explain to me why I should use paid AI services for web scraping rather than using my own scripts with open source libraries like Selenium?

  • @ahmedd.masoud6809
    @ahmedd.masoud6809 17 days ago

    Thank you for the video, Can't wait if you can do the same for TTS & TTS please 🙏

    • @devlearnllm
      @devlearnllm 17 days ago

      You got it. I'm thinking of comparing the real-time APIs between a few providers

  • @radudamianov
    @radudamianov 17 days ago

    Horrible disrespectful chewing. Could not follow it due to chewing.

  • @vitalis
    @vitalis 19 days ago

    Interesting videos on your channel but could you please not chew gum as the mic picks it up too much? Keep it up 👍

    • @devlearnllm
      @devlearnllm 19 days ago

      Yep. I got addicted to gum recently but will keep that in mind!

  • @lukez3618
    @lukez3618 19 days ago

    DELETE THIS VIDEO RN, ITS TO POWERFUL 🫨

    • @devlearnllm
      @devlearnllm 19 days ago

      ITS TOO LATE

    • @readmarketings9061
      @readmarketings9061 17 days ago

      This method is outdated now... JINA is not the best choice compared with Crawl4AI + Pydantic

  • @gregmeldrum
    @gregmeldrum 20 days ago

    Excellent video!

  • @reserseAI
    @reserseAI 20 days ago

    greatttt

  • @thefatcat-hd6ze
    @thefatcat-hd6ze 20 days ago

    Would really appreciate a similar video for translation.

    • @devlearnllm
      @devlearnllm 20 days ago

      you got it. putting that on my list

  • @jonathanpark873
    @jonathanpark873 20 days ago

    Your videos are fun to watch

  • @janigiovanni6075
    @janigiovanni6075 20 days ago

    Great Video! The chewing is a Little bit annoying though.😂

  • @crimsonkim6824
    @crimsonkim6824 24 days ago

    maybe i've missed it on the video, but why do you think Exa was not able to answer any of the queries correctly? like 0% every time

    • @devlearnllm
      @devlearnllm 22 days ago

      I didn't have an answer. But I did have high expectations for it given the promises on their landing page, and the docs. I know some folks who have strong opinions about it haha

  • @Buildthingsthatbuildthings
    @Buildthingsthatbuildthings 1 month ago

    This is awesome stuff, very high value investment of 15 minutes.

  • @ihebakermi943
    @ihebakermi943 1 month ago

    great

  • @empfehlbar
    @empfehlbar 1 month ago

    That’s a great tutorial/ code example. Thanks a lot! It inspired me to include something similar in my project. I especially like the idea to first read out the sitemap and then loop through the pages! And thanks for the other comments with ideas about more scraping tools, I need to look them up! At the moment I’m using DataForSEO, which works pretty well, too.

  • @songzeyang1725
    @songzeyang1725 1 month ago

    Great work! Keep it up!

  • @som6553
    @som6553 1 month ago

    awesome vid!

  • @Future_me_66525
    @Future_me_66525 1 month ago

    Pure gold, thanks for sharing

  • @reserseAI
    @reserseAI 1 month ago

    Always very good content…..

  • @devlearnllm
    @devlearnllm 1 month ago

    More content here! app.catswithbats.com/90d4bd29

  • @amzpro5734
    @amzpro5734 1 month ago

    It's a cool product, but the only issue is Jina getting blocked as a bot, so it's not making it past the "Are you human?" screen.

    • @devlearnllm
      @devlearnllm 1 month ago

      Yeah, Firecrawl recently fared better

  • @Incomestreamsurfers
    @Incomestreamsurfers 1 month ago

    This video made my SaaS possible thanks - I had no idea 5 months ago what LLM scraping was.

    • @devlearnllm
      @devlearnllm 1 month ago

      I'm glad! I talk further in-depth about web scraping here: app.catswithbats.com/90d4bd29/a15db702

  • @imshaiknasir
    @imshaiknasir 2 months ago

    why this video isn't in trending list !!!!!!!!!!!!!!!

  • @rajshekar3108
    @rajshekar3108 2 months ago

    Hey, I wanna connect with you guys. Is there an online community kind of thing?

    • @devlearnllm
      @devlearnllm 2 months ago

      Here you go: app.catswithbats.com/90d4bd29 (p.s the Discord is in the app when you sign up)

  • @reserseAI
    @reserseAI 2 months ago

    Always happy when you upload new video

  • @zeropointengineer
    @zeropointengineer 2 months ago

    i watch these videos, i have evolved

  • @devlearnllm
    @devlearnllm 2 months ago

    allergies season is finally over!

  • @som6553
    @som6553 2 months ago

    you're a great teacher!

    • @devlearnllm
      @devlearnllm 2 months ago

      Thanks! Let me know if you have any feedback

  • @priapushk996
    @priapushk996 2 months ago

    Who else was disappointed after clicking the thumbnail and seeing a dude?

  • @CreepyFilmz
    @CreepyFilmz 2 months ago

    You still need a lot of compute and storage to actually build her.

  • @KasunWijesekara
    @KasunWijesekara 2 months ago

    Nice demo <3

  • @SandeeshCroos
    @SandeeshCroos 2 months ago

    Hey, great content! Thanks for sharing your knowledge. However, instead of just using tsvector in PostgreSQL, you can leverage sparse vector search by utilizing the pg_search extension, right?

    • @devlearnllm
      @devlearnllm 2 months ago

      yup, they're both full text search. Or use pgroonga

  • @IStMl
    @IStMl 2 months ago

    that sounds illegal?

    • @devlearnllm
      @devlearnllm 2 months ago

      No because you have to ask your new user for their website to be able to crawl.

  • @intelpakistan
    @intelpakistan 2 months ago

    crappy camera work, good content

    • @devlearnllm
      @devlearnllm 2 months ago

      Lol what about the camera work that's bad? I'll try to fix it.

    • @somechrisguy
      @somechrisguy 2 months ago

      @@devlearnllm Just have it on a tripod

  • @inviciously
    @inviciously 2 months ago

    video lacks a good hook

    • @devlearnllm
      @devlearnllm 2 months ago

      good point; what would make a good hook? Overview of what's gonna be built?

    • @invictus8441
      @invictus8441 2 months ago

      @@devlearnllm overview AND a demo. I would love to see what's the end goal and if it's worth watching for or not. It can be a double edged sword I guess!

    • @devlearnllm
      @devlearnllm 2 months ago

      @@invictus8441 agreed! Will do for the next vid

    • @raccoon_dad
      @raccoon_dad 1 month ago

      @@devlearnllm Your content is outstanding so I already know it's worth watching to the end. I've learned a ton of useful things from you. Please keep it up. BTW, most people will overlook crappy camera work for great content. We're not here to be entertained. Just focus on what's important, the material.

  • @janekschleicher9661
    @janekschleicher9661 2 months ago

    Haha, in the old days I just used lynx URL | perl -e '....' to achieve roughly the same thing. Nowadays lynx usually doesn't work anymore, since websites no longer contain content but executables, and of course Perl became the most hated language around, so nobody likes to parse anything with Perl anymore. My colleagues very often even struggle with regexes. So yeah, using a 3rd-party tool to get content out of websites and then asking an LLM to answer your question is probably a million or a billion times less compute-efficient than the old way, but it can be done by more people. But I remember we analyzed huge product catalogues like amazon, icecat, .... to get product descriptions for every category with our approach for probably only some dollars in total while processing gigabytes of raw content (so close to a billion tokens in these terms).

  • @jimshtepa5423
    @jimshtepa5423 2 months ago

    Thank you for the presentation. What is the best (cheapest) way to scrape data from Instagram? All I need is to traverse the graph of accounts to find accounts above a certain follower threshold.

  • @kocokan
    @kocokan 2 months ago

    Are you having difficulty breathing?

  • @perc-ai
    @perc-ai 2 months ago

    Firecrawl is not as good as scrapfly ... firecrawl seems like it was invented by a single guy in his bedroom while scrapfly is enterprise grade.

    • @devlearnllm
      @devlearnllm 2 months ago

      basically that's how the story went - they built it very recently. But they ship fast and the DX is much better. I'm willing to bet on it.

  • @NamTran-jd6lp
    @NamTran-jd6lp 2 months ago

    firecrawl seems cool but I still feel like jina might be a bit better overall...

    • @devlearnllm
      @devlearnllm 2 months ago

      Jina's a competitive research lab too lol

  • @reydelmundo6284
    @reydelmundo6284 2 months ago

    i like to use scrapy because its open source and LLM friendly

    • @devlearnllm
      @devlearnllm 2 months ago

      Nice! Never tried it but looking now