debugverse
  • 11
  • 16 653

Videos

Scrape ANY Website in JUST 5 Lines of Code
2K views, 2 months ago
Scraping is easy, see? Autoscraper project repo github.com/alirezamika/autoscraper
DIY OpenAI Voice Assistant UNDER 100 Lines of Code
272 views, 2 months ago
Self-hosted voice assistant with Ollama, Langchain, Python FastAPI and a little bit of that good JS. Source code: github.com/debugverse/debugverse-youtube
SUPERFAST Audio Transcription with OpenAI Whisper Turbo - Python Tutorial
801 views, 3 months ago
A new OpenAI Whisper Turbo model lets anybody transcribe audio files into text locally, with high performance and high accuracy. Source code: github.com/debugverse/debugverse-youtube
AI Image Organizer with Python Tutorial
114 views, 4 months ago
A short tutorial on an AI-powered, context-aware image renaming program in Python using the Google Gemini Flash model.
AI Summarize HUGE Documents Locally! (Langchain + Ollama + Python)
11K views, 4 months ago
Today we are looking at a way to efficiently summarize huge PDF (or any other text) documents using a clustering method with HuggingFace embeddings, the Langchain Python framework and the Ollama Llama 3.1 model.
5 FastAPI ProTips For Writing Better API
57 views, 4 months ago
Five quick tips to make your FastAPI app work faster and better.
AI ChatBot with Gemini + Tool calling (Golang Tutorial)
452 views, 4 months ago
In this tutorial we will take a look at the genAI package and write a chat bot in Go with the ability to call tools using a Gemini Flash model.
AI Generated Blog with Langchain + FastAPI Python Tutorial (Part 2)
678 views, 4 months ago
In the second part we focus on Pydantic models for data serialization, MongoDB for storing and retrieving data, and Jinja2 templating.
AI Generated Blog with Langchain + FastAPI Python Tutorial (Part 1)
263 views, 4 months ago
In this project we will be creating an API which will be able to automatically generate a blog post on any topic, complete with title, tags and content. We will store the blog to MongoDB and create endpoints for listing posts and generating them.
AI PDF Summarize and Sentiment analysis with Langchain + Ollama Python tutorial
675 views, 4 months ago
Summarize a PDF using a 100% local Llama 3.1 AI model and generate a sentiment for it using the Langchain framework and structured output.

COMMENTS

  • @chulung3190
    @chulung3190 4 days ago

    Hi, I am working on a company project. Can this help me extract the required data from a PDF? I receive a monthly PDF that includes all our company clients' monthly statements. I need to extract the 'Brought Forward' and 'Realized Loss/Profit Amount' from the PDF, which is nearly a thousand pages long. I will need to perform this process monthly.

    • @DebugVerseTutorials
      @DebugVerseTutorials 2 days ago

      I have worked on a similar task with both a vision LLM and pdfminer, so I would recommend those tools.

  • @jonm691
    @jonm691 5 days ago

    Nice video - thanks for sharing that

  • @mikew2883
    @mikew2883 9 days ago

    Very cool! Do you mind providing an example of how to filter the data like you mention in closing?

    • @jonm691
      @jonm691 5 days ago

      I looked at this. Basically, you use the results to provide your source pages, and then use that as the context. For example:

          # embeddings, texts and llm are the objects already defined in the tutorial code
          from langchain_community.document_transformers import EmbeddingsClusteringFilter

          filter = EmbeddingsClusteringFilter(embeddings=embeddings, num_clusters=10, num_closest=3)
          result = filter.transform_documents(documents=texts)
          # Combine the selected pages into a single text blob
          context = ""
          for doc in result:
              context += f"{doc.page_content} "
          # f-string, so that {context} is actually substituted into the prompt
          prompt = f"Ask your question here... use the context within triple backticks ```{context}```"
          response = llm.invoke(prompt)
          print(response)

      However, this is not a replacement for RAG: much of the document has been discarded, so you're unlikely to find your answer. K-means is basically just collating similar pages, not necessarily picking the one with the unique information you need. It is therefore great for summarisation, but not necessarily good for specific questions. If your question relates to something summary-like, the result should be more relevant. Maybe I've missed something here, but that's my conclusion from playing with it.
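The selection step described in the reply above (cluster the page embeddings, then keep only the pages nearest each cluster centre) can be sketched in plain Python. This is a toy illustration, not LangChain's implementation: the 2-D vectors are made-up stand-ins for real embeddings, and `kmeans` and `closest_to_centroids` are hypothetical helper names.

```python
import math

def kmeans(vectors, k, iterations=20):
    """Plain k-means. The first k vectors seed the centroids so the run is deterministic."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda c: math.dist(v, centroids[c]))
            clusters[nearest].append(v)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids

def closest_to_centroids(vectors, centroids, num_closest=1):
    """Return the sorted indices of the num_closest vectors nearest each centroid."""
    picked = set()
    for centroid in centroids:
        ranked = sorted(range(len(vectors)), key=lambda i: math.dist(vectors[i], centroid))
        picked.update(ranked[:num_closest])
    return sorted(picked)

# Nine toy "page embeddings" forming three well-separated groups (made-up values).
page_vectors = [
    (0, 0), (10, 10), (20, 0),      # pages 0-2
    (1, 0), (11, 10), (21, 0),      # pages 3-5
    (2, 0.3), (12, 10.3), (22, 0.3) # pages 6-8
]
centroids = kmeans(page_vectors, k=3)
representative_pages = closest_to_centroids(page_vectors, centroids)
print(representative_pages)
```

Only the representative pages would be passed to the LLM; every other page is dropped, which is why this works for summaries but can miss a detail that lives on a single discarded page.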

  • @RedCloudServices
    @RedCloudServices 9 days ago

    I think the latest vision models will make RAG obsolete

  • @danila8823
    @danila8823 19 days ago

    Using Gemini vision to describe the video? Nice technique.

  • @jakubzakowski7422
    @jakubzakowski7422 28 days ago

    One of the best videos I have ever seen. I just want to say thank you and good job.

  • @ajays6393
    @ajays6393 28 days ago

    Thank you, very informative!

  • @thingX1x
    @thingX1x 1 month ago

    Will this work for a procedurally generated file containing a conversation? Or should I look at another method?

  • @HalkawtMawlood
    @HalkawtMawlood 1 month ago

    I really love your YouTube content, but so far I haven't found a tutorial on clustering client feedback. For example, one client says "we need electricity" and another says "lack of electricity". I want to write Python code that automatically groups similar comments like these under one common label. I don't want to delete the feedback column in Excel; I want to add another column next to it that holds the cluster, so that I can check how accurate the grouping is. (Note: I don't want a summary. With 500 feedback entries, each one should get a cluster, and when I filter by cluster I should end up with around five to ten groups of similar feedback.) If this is possible with Python or any other program, I would be happy and grateful.
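One possible shape for the request above, as a rough sketch only: each comment keeps its original text and gains a cluster number in a second column. To stay dependency-free, the example swaps in simple word overlap (Jaccard similarity) instead of the embeddings used in the videos, so it only catches comments that share words. The stopword list, the 0.3 threshold and all function names are illustrative assumptions.

```python
import re

STOPWORDS = {"we", "the", "is", "of", "a"}  # tiny illustrative list

def tokens(text):
    """Lowercased word set with the stopwords removed."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def jaccard(a, b):
    """Shared words divided by all words; 0.0 when both sets are empty."""
    return len(a & b) / len(a | b) if a | b else 0.0

def assign_clusters(feedback, threshold=0.3):
    """Greedy grouping: join the first cluster whose first comment is similar enough."""
    representatives = []  # token set of the first comment in each cluster
    labels = []
    for text in feedback:
        current = tokens(text)
        for cluster_id, rep in enumerate(representatives, start=1):
            if jaccard(current, rep) >= threshold:
                labels.append(cluster_id)
                break
        else:  # no existing cluster matched, so start a new one
            representatives.append(current)
            labels.append(len(representatives))
    return labels

feedback = [
    "We need electricity",
    "The road is broken",
    "Lack of electricity",
    "Need clean water",
    "Broken road near school",
    "No clean water",
]
labels = assign_clusters(feedback)
# The feedback column is kept; the cluster is added as a second column.
rows = [{"feedback": text, "cluster": label} for text, label in zip(feedback, labels)]
for row in rows:
    print(row["cluster"], row["feedback"])
```

The rows can be written out with `csv.DictWriter` and opened in Excel. For real data, replacing `jaccard` with cosine similarity over sentence embeddings should group paraphrases that share no words.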

  • @rundeks
    @rundeks 2 months ago

    LangChain has moved to Pydantic 2. To update this code change "from langchain_core.pydantic_v1 import BaseModel, Field" to "from pydantic import BaseModel, Field". This caused me to get some errors with score which I had to change to a float since the definition of a number between 0 and 1 implies float to the system.
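A minimal sketch of what the updated schema might look like after the import change described above. The `Sentiment` class and its field names are hypothetical, since the tutorial's actual model is not shown here; only the `pydantic` import and the float score bounded to the range 0 to 1 follow the comment.

```python
from pydantic import BaseModel, Field, ValidationError

class Sentiment(BaseModel):
    """Hypothetical structured-output schema; the field names are illustrative."""
    label: str = Field(description="Overall sentiment, e.g. positive, negative or neutral")
    # Declared as float, with explicit bounds for "a number between 0 and 1"
    score: float = Field(ge=0, le=1, description="Confidence between 0 and 1")

ok = Sentiment(label="positive", score=0.87)

try:
    Sentiment(label="positive", score=1.5)  # outside the declared range
    out_of_range_rejected = False
except ValidationError:
    out_of_range_rejected = True

print(ok.score, out_of_range_rejected)
```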

  • @ahassan7270
    @ahassan7270 2 months ago

    Wonderful. Thank you so much for sharing this valuable tool.

  • @KHe3CaspianXI
    @KHe3CaspianXI 2 months ago

    Not working, I'll just stick with selectolax.

  • @srivenkateswaraswamy3403
    @srivenkateswaraswamy3403 2 months ago

    What if the document contains images of tables and equations?

  • @adrianooficial2008
    @adrianooficial2008 2 months ago

    WHY THE AI VOICE 😩

  • @mightyboessu
    @mightyboessu 2 months ago

    Why do you use the HuggingFaceBgeEmbeddings and not OllamaEmbeddings?

  • @terryliu3635
    @terryliu3635 3 months ago

    Nice demo. Quick question: do you know how PyPDFLoader processes the images and tables within the PDF file? Thanks.

    • @DebugVerseTutorials
      @DebugVerseTutorials 3 months ago

      Hi, images are not processed by default, and tables are (clumsily) converted to text where possible. If you are looking for more advanced extraction, one approach I tried is to convert a PDF page to PNG and give it to a vision LLM, which can understand pictures and graphs better.

  • @LibertyRecordsFree
    @LibertyRecordsFree 3 months ago

    Can you do a full install tutorial for Windows? I want to use Whisper v3 Turbo in my Python program but still haven't figured out a proper install ^^

    • @DebugVerseTutorials
      @DebugVerseTutorials 3 months ago

      Hi, on Windows you can use the openai-whisper package. See pypi.org/project/openai-whisper/ for more details. Either way, on Windows I recommend using WSL for better compatibility.

  • @meereslicht
    @meereslicht 3 months ago

    Excellent, thank you! A very clever strategy for large documents. However, I am a little at a loss in my search for a good embedding model for texts in Spanish. I am not sure whether the BGE models are the best option for these. Can you suggest one that could be integrated seamlessly into your code?

    • @DebugVerseTutorials
      @DebugVerseTutorials 3 months ago

      Hi, for Spanish take a look at jinaai/jina-embeddings-v2-base-es. In your code simply replace the model_name variable and everything should work.

    • @meereslicht
      @meereslicht 3 months ago

      @DebugVerseTutorials Thank you very much for your kind answer. I'll do that 😊🤗🤗

    • @igorcastilhos
      @igorcastilhos 2 months ago

      @DebugVerseTutorials Hi, if I wanted to use an Ollama model, how can I find the exact name to put in model_name?

    • @mukeshkund4465
      @mukeshkund4465 1 month ago

      @igorcastilhos Run ollama list to see the available models and copy the name.

    • @allok501
      @allok501 1 month ago

      You can use the latest jina embeddings v3, as it is multilingual.

  • @DebugVerseTutorials
    @DebugVerseTutorials 4 months ago

    Source code github.com/debugverse/debugverse-youtube

  • @DebugVerseTutorials
    @DebugVerseTutorials 4 months ago

    Source code github.com/debugverse/debugverse-youtube/tree/main/summarize_huge_documents_kmeans

  • @DebugVerseTutorials
    @DebugVerseTutorials 4 months ago

    Source code: github.com/debugverse/debugverse-youtube/tree/main/go-genai-chatbot

  • @DebugVerseTutorials
    @DebugVerseTutorials 4 months ago

    Source code github.com/debugverse/debugverse-youtube/tree/main/ai_blog

  • @DebugVerseTutorials
    @DebugVerseTutorials 4 months ago

    Source code github.com/debugverse/debugverse-youtube/tree/main/ai_blog

  • @irvingpichardo310
    @irvingpichardo310 5 months ago

    Code source, please.

    • @DebugVerseTutorials
      @DebugVerseTutorials 5 months ago

      Here you go github.com/debugverse/debugverse-youtube/tree/main/pdf_summary_and_sentiment