Data Science Basics
Finland
Joined 30 Dec 2022
Hi, I am a data enthusiast like you! On this channel, I teach data science as well as recent AI trends (LLMs) in the simplest manner possible.
Currently, video is one of the most important and go-to content types online. I aim to make Data Science Basics a go-to UA-cam channel for practical videos on data science.
If you find the content helpful then consider subscribing.
For business inquiries email at: basicsdatascience@gmail.com
💼 Consulting: topmate.io/sudarshan_koirala
100% Local RAG Using LangChain, DeepSeek, Ollama, Qdrant, Docling, Huggingface & Chainlit
In this video, we create a fully local Retrieval-Augmented Generation (RAG) application using the DeepSeek model via Ollama. The architecture involves utilising Docling to extract information from PDFs, segmenting the document into smaller chunks, and then embedding these segments using HuggingFace embeddings. The segments are stored in a local Qdrant Vectorstore using Docker. When a query is made, embeddings are generated and queried against the knowledge base to retrieve relevant information. We leverage LangChain for orchestration and interface with a UI bot using Chainlit, implementing a step-by-step approach that highlights the capabilities of DeepSeek in handling and reasoning over the retrieved data. The video also discusses prerequisites like installing Docker and setting up the environment, while walking through the code implementation and testing the application with sample queries.
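For reference, here is a rough sketch of the pipeline described above in Python. It is not the exact code from the linked GitHub repo: the PDF name, chunk sizes, embedding model, collection name and DeepSeek tag are illustrative choices, and it assumes a Qdrant container is already running locally on port 6333.

```python
# Hedged sketch of the local RAG flow: Docling parses the PDF, the text is chunked,
# embedded with a HuggingFace model, stored in a local Qdrant instance, and queried
# with a DeepSeek model served by Ollama. Names and tags below are placeholders.
from docling.document_converter import DocumentConverter
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_qdrant import QdrantVectorStore
from langchain_ollama import ChatOllama

# 1. Ingest: convert the PDF to Markdown with Docling and split it into chunks.
markdown = DocumentConverter().convert("sample.pdf").document.export_to_markdown()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_text(markdown)
docs = [Document(page_content=c) for c in chunks]

# 2. Embed and index the chunks in a local Qdrant container
#    (e.g. docker run -p 6333:6333 qdrant/qdrant).
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = QdrantVectorStore.from_documents(
    docs, embedding=embeddings, url="http://localhost:6333", collection_name="rag_demo"
)

# 3. Retrieve relevant chunks for a query and let DeepSeek (via Ollama) answer.
llm = ChatOllama(model="deepseek-r1:7b")
question = "What is this document about?"
context = "\n\n".join(d.page_content for d in vectorstore.similarity_search(question, k=4))
answer = llm.invoke(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
print(answer.content)
```

In the video this retrieval step is wrapped in a Chainlit UI; the sketch keeps only the core chain so the data flow stays visible.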
TimeStamps ⏰
00:00 Introduction
00:10 Understanding the Architecture
01:50 Setting Up the Environment
04:03 Ingesting Documents
04:58 Creating the Vector Database
07:30 Running the Application
14:59 Using the UI and Asking Questions
22:10 Conclusion and Final Thoughts
Links ⛓️💥
ollama.com/
qdrant.tech/documentation/quickstart/
www.docker.com/get-started/
docs.phidata.com/introduction
blog.gopenai.com/how-to-build-a-chatbot-to-chat-with-your-pdf-9abb9beaf0c4
ds4sd.github.io/docling/examples/rag_langchain/
github.com/sudarshan-koirala/youtube-stuffs/tree/main/chainlit
------------------------------------------------------------------------------------------
☕ Buy me a Coffee: ko-fi.com/datasciencebasics
✌️Patreon: www.patreon.com/datasciencebasics
------------------------------------------------------------------------------------------
🤝 Connect with me:
📺 UA-cam: www.youtube.com/@datasciencebasics?sub_confirmation=1
👔 LinkedIn: www.linkedin.com/in/sudarshan-koirala/
🐦 Twitter: mesudarshan
🔉Medium: medium.com/@sudarshan-koirala
💼 Consulting: topmate.io/sudarshan_koirala
#ollama #deepseek #qdrant #langchain #chainlit #docling
Views: 1,453
Videos
RUN DeepSeek R1 LOCALLY
Views: 781 • 21 hours ago
Try DeepSeek R1 locally on your machine. DeepSeek's first-generation reasoning models achieve performance comparable to OpenAI-o1 across math, code, and reasoning tasks. Links ⛓️💥 ollama.com/ api-docs.deepseek.com/news/news250120 ollama.com/library/deepseek-r1 github.com/open-webui/open-webui ☕ Buy me a Coffee: ko-fi.com/datasciencebasics ✌️Patreon: www.patreon.com/datasciencebasics 🤝 Conne...
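As a quick illustration (not taken from the video itself): once a distilled DeepSeek R1 model has been pulled with Ollama, it can also be called from Python via the ollama package; the model tag below is just an example, pick the size that fits your RAM.

```python
# Minimal sketch: chat with a locally pulled DeepSeek R1 distilled model via the
# ollama Python client (pip install ollama; run `ollama pull deepseek-r1:7b` first).
import ollama

response = ollama.chat(
    model="deepseek-r1:7b",  # example tag; larger/smaller distillations are available
    messages=[{"role": "user", "content": "Explain retrieval-augmented generation in two sentences."}],
)
print(response["message"]["content"])
```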
Agentic RAG: Build Your First Agentic RAG Using Qdrant and Phidata
Views: 2.4K • 14 days ago
Curious about the difference between traditional RAG and Agentic RAG? 🔹 Traditional RAG: Uses simple search and prompt stuffing - great for straightforward tasks but struggles with complex queries 😕 🔹 Agentic RAG: Gives the agent a tool to search for information independently, exactly when it needs it! 🚀 In this video, we’ll break down why Agentic RAG is the next step for more nuanced, powerful...
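To make the idea concrete, here is a hedged sketch of an agentic RAG setup in the spirit of this video. It follows the Phidata examples as I recall them, so module paths and parameter names may differ between versions; the PDF URL, collection name and model ID are placeholders, and it assumes OpenAI credentials plus a local Qdrant instance.

```python
# Hedged sketch of agentic RAG: the agent gets the knowledge base as a searchable
# tool and decides itself when to query it (instead of always stuffing retrieved
# chunks into the prompt). Paths/params follow Phidata examples and may vary.
from phi.agent import Agent
from phi.model.openai import OpenAIChat
from phi.knowledge.pdf import PDFUrlKnowledgeBase
from phi.vectordb.qdrant import Qdrant

vector_db = Qdrant(collection="agentic_rag", url="http://localhost:6333")
knowledge_base = PDFUrlKnowledgeBase(
    urls=["https://example.com/some-document.pdf"],  # placeholder document
    vector_db=vector_db,
)
knowledge_base.load(recreate=False)  # embed and index the PDF once

agent = Agent(
    model=OpenAIChat(id="gpt-4o-mini"),
    knowledge=knowledge_base,
    search_knowledge=True,   # expose the knowledge base as a tool the agent calls on demand
    show_tool_calls=True,
)
agent.print_response("Summarise the key points of the document.", stream=True)
```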
Building no-code AI Agents Using LangFlow | For Complete Beginners
Views: 1.6K • 21 days ago
In this beginner-friendly tutorial, we’ll explore how to build AI agents without writing any code using LangFlow, a powerful low-code platform designed for creating AI applications. LangFlow simplifies the development of AI agents by providing a visual interface where you can connect pre-built components to design complex workflows effortlessly. What You’ll Learn: Introduction to LangFlow: Under...
Building no-code low-code RAG Application Using LangFlow
Views: 1K • 21 days ago
In this video, I will explore Langflow, a no-code low-code tool for building AI applications using drag-and-drop functionality. The video is divided into two parts: the first shows creating simple chat applications and the second creating retrieval-augmented generation (RAG) applications using Langflow's managed service. Langflow is a low-code app builder for RAG and multi-agent AI applications. I...
Mastering Document Parsing with LlamaParse from LlamaIndex: Complete Guide
Views: 1.8K • 1 month ago
In this video, I will walk you through the document parsing using LlamaParse from LlamaIndex. LlamaParse allows you to securely parse complex documents such as PDFs, PowerPoints, Word documents and spreadsheets into structured data using state-of-the-art AI. LlamaParse is available as a standalone REST API, a Python package, a TypeScript SDK, and a web UI. First, I will walk you through the UI ...
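For context, a minimal sketch of the Python-package route (not the exact notebook from the video); it assumes a LlamaCloud API key and uses a placeholder file name.

```python
# Minimal sketch of parsing a document with the LlamaParse package
# (pip install llama-parse; requires a LlamaCloud API key).
from llama_parse import LlamaParse

parser = LlamaParse(api_key="llx-...", result_type="markdown")  # or result_type="text"
documents = parser.load_data("complex_report.pdf")              # placeholder file name
print(documents[0].text[:500])  # parsed content returned as LlamaIndex Document objects
```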
Building Your First AI Agents With Phidata & models from Groq | Beginners Guide
Views: 8K • 1 month ago
In this video, I will show how you can create a simple agent, multi-agent using Phidata. We start with the basics of setting up an AI project in a virtual environment, proceed with creating individual agents such as a web search agent using DuckDuckGo and a finance agent utilising Yahoo Finance. We then demonstrate how to combine these agents into a multi-agent system and run everything from th...
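As a rough illustration of the setup described above (not the exact code from the video), here is a single-file sketch following the common Phidata examples; module paths and the Groq model ID may differ by version, and GROQ_API_KEY must be set.

```python
# Hedged sketch: a web-search agent, a finance agent, and a team agent that routes
# between them, using Groq-hosted models. Names and model IDs are illustrative.
from phi.agent import Agent
from phi.model.groq import Groq
from phi.tools.duckduckgo import DuckDuckGo
from phi.tools.yfinance import YFinanceTools

web_agent = Agent(
    name="Web Agent",
    model=Groq(id="llama-3.3-70b-versatile"),
    tools=[DuckDuckGo()],
    instructions=["Always include sources"],
)
finance_agent = Agent(
    name="Finance Agent",
    model=Groq(id="llama-3.3-70b-versatile"),
    tools=[YFinanceTools(stock_price=True, analyst_recommendations=True)],
    instructions=["Use tables to display data"],
)

# The team agent delegates parts of the request to the members above.
agent_team = Agent(team=[web_agent, finance_agent], model=Groq(id="llama-3.3-70b-versatile"))
agent_team.print_response("Summarise analyst recommendations and latest news for NVDA", stream=True)
```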
Docling from IBM | Open Source Library To Make Documents AI Ready | LlamaIndex
Views: 2.1K • 1 month ago
Dive into the capabilities of IBM's open source AI tool, Docling, designed for efficient document parsing and exporting. This video explores how Docling works, its easy-to-use interface, and its ability to handle various document types including PDFs, DOCX, PowerPoints, and more. The video covers setting up the environment, basic and advanced features, and integrating Docling with LlamaIndex fo...
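For reference, Docling's core conversion call looks roughly like this (a minimal sketch with a placeholder file name, not the exact notebook from the video).

```python
# Minimal sketch of Docling's basic API: convert a document and export AI-ready Markdown.
# pip install docling
from docling.document_converter import DocumentConverter

converter = DocumentConverter()
result = converter.convert("annual_report.pdf")     # also handles DOCX, PPTX, HTML, images
print(result.document.export_to_markdown()[:500])   # Markdown suitable for downstream RAG
```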
Get Started With Github Copilot Free in Visual Studio Code 🔥
Views: 1.1K • 1 month ago
In this video, I will explore how to set up and use GitHub Copilot in VS Code effectively. Learn about the announcements made on December 18th regarding GitHub Copilot's free plan, how to configure it, and various commands you can run. We also cover privacy settings, creating projects from scratch or existing ones, generating commit messages, and using Copilot Edit for multi-file editing. Perfe...
Extremely Fast Python Package Manager | written in Rust 🚀
Views: 823 • 1 month ago
In this video, we explore UV, a versatile and ultra-fast tool for managing Python projects and packages. Learn how to install UV, initialise projects, manage dependencies, and utilize various useful commands. The video highlights UV's speed and efficiency in handling multiple Python versions, creating virtual environments, and running scripts. Discover why UV is a powerful alternative to other ...
All You Need To Know About Amazon Bedrock
Views: 478 • 1 month ago
In this video, I will cover the highlights of AWS re:Invent 2024 and take a detailed look into the updates and features of Amazon Bedrock. From exploring the Bedrock console UI, configurations, and newly added models in the Bedrock marketplace to advanced functionalities like prompt routers, model routing, and watermark detection, we guide you through all the essential aspects of Bedrock. Additi...
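As a small, hedged example of calling a Bedrock-hosted model from Python (not taken from the video): the boto3 Converse API, assuming AWS credentials are configured and the example model is enabled in your region.

```python
# Minimal sketch: invoke a model on Amazon Bedrock via the boto3 Converse API.
# The region and model ID below are examples; use ones enabled in your account.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")
response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[{"role": "user", "content": [{"text": "What is Amazon Bedrock?"}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```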
Top 5 Essential Resources for Learning Generative AI
Views: 383 • 1 month ago
In this video, I'll guide you through five essential resources for anyone interested in learning about AI, particularly generative AI. We start with Hugging Face, a platform for collaborating on machine learning models and applications. Next is DeepLearning.AI, founded by Andrew Ng, offering a variety of AI courses and practical applications. The third resource is a site that evaluates AI model...
aisuite: Unified Interface for Multiple Generative AI Providers
Views: 330 • 2 months ago
In this video, we dive into aisuite, an exciting new package from Andrew Ng and his team that provides a simple, unified interface to interact with multiple generative AI models, including OpenAI, LLaMA, and others. We explore its features, demonstrate installation and implementation steps, and highlight how it allows developers to switch and compare responses from different large language mode...
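To illustrate the unified interface (a minimal sketch, not the exact demo from the video), the same OpenAI-style call is reused across providers simply by changing the model string; provider API keys are assumed to be set as environment variables.

```python
# Minimal sketch of aisuite's unified "provider:model" interface (pip install aisuite).
import aisuite as ai

client = ai.Client()
messages = [{"role": "user", "content": "Give me one tip for learning data science."}]

# Swap providers by changing the model string; the call shape stays the same.
for model in ["openai:gpt-4o-mini", "anthropic:claude-3-5-sonnet-20240620"]:
    response = client.chat.completions.create(model=model, messages=messages)
    print(model, "->", response.choices[0].message.content)
```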
Mastering Prompt Engineering with LangSmith's Prompt Canvas
Views: 628 • 2 months ago
In this video, we dive into LangSmith's Prompt Canvas, an innovative tool for developing and optimising AI prompts. The video explores the user interface and features of Prompt Canvas as a simplified and efficient prompt creation experience inspired by OpenAI's canvas UX. The host demonstrates how to use the tool, provides walkthroughs of various functionalities like editing prompts, utilising ...
Exploring Open Canvas: The Open Source Alternative to ChatGPT Canvas
Views: 2.2K • 3 months ago
In this video, we will delve into Open Canvas from LangChain, an open-source alternative to ChatGPT Canvas. We explore its key features, including built-in memory, the ability to start from existing documents, and comprehensive UX for writing and coding. The video also provides a step-by-step guide on how to use Open Canvas both online and locally. Additionally, we discuss different functionali...
Maximize Your Efficiency: Exploring Canvas in ChatGPT for Writing and Coding
Views: 381 • 3 months ago
Run GGUF models from Hugging Face Hub on Ollama and OpenWebUI
Views: 3.6K • 3 months ago
Prompt Generator From OpenAI | ANYONE Can Write Prompts With This New Feature
Views: 1.7K • 3 months ago
Super Easy Way To Parse Documents | LlamaParse Premium 🔥
Views: 2K • 4 months ago
AI/BI Dashboards | Databricks New AI Powered Visualization Tool
Views: 1.7K • 4 months ago
DATABRICKS AI/BI GENIE | No Code Interface For Your Data | Text TO SQL
Views: 970 • 4 months ago
Exploring Databricks Notebook: New Features and Functionalities Overview
Views: 663 • 4 months ago
Use Llava In GroqCloud & OpenWebUI
Views: 1.1K • 4 months ago
Open WebUI: Local ChatGPT Alternative | For Complete Beginners | Full Tutorial
Views: 19K • 5 months ago
Extract Table Info From SCANNED PDF & Summarise It Using Llama3.1 via Ollama | LangChain
Views: 3.4K • 5 months ago
Installing and Using LangGraph Studio | First Agent IDE
Views: 4.1K • 5 months ago
Introduction to LangGraph: Building and Enhancing LLM Agents
Views: 1.6K • 5 months ago
Implementing Guardrails in Amazon Bedrock: A Step-by-Step Guide
Views: 884 • 6 months ago
This functionality, provided by pipreqs, is exactly what I was hoping to find. Thanks for presenting it. Your presentation manner is very clear, and at a speed that allows viewers to assimilate what you are saying and typing.
Thank you!!! I had been trying to parse a low-quality image for the last 4 days. Then I found this video. Superb!!! Thanks for this video, sir!!! Grateful!!!
You are welcome. Glad that the video was helpful !!
What is the advantage of using AWS Bedrock instead of directly calling Anthropic API? Is the price cheaper?
AWS Bedrock offers seamless integration with AWS services and eliminates infrastructure management through its serverless architecture, while providing enterprise-grade security and access to multiple AI models beyond Anthropic’s Claude. Pricing for Anthropic models through Bedrock is consistent with direct API costs, but Bedrock provides additional cost-saving options like batch inference at 50% lower rates. The choice depends on needing AWS ecosystem integration versus direct model access.
Can you tell me how to stop the running LLM and revert?
If you want to stop while the LLM is responding, there is a stop icon on the right side where you ask the question. If you want to stop OpenWebUI, you can stop it from the terminal.
What are the system requirements?
It depends on which model you want to use from Ollama. You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
Thanks for this amazing content. I have two questions. I am using the Docling library for PDF extraction, but my PDF has more than 500 pages and a lot of similar tables. When I use nomic-embed-text for the embeddings and a FAISS vector database retriever, the LLM does not get the correct context from the retriever, so I get wrong answers. Will this method work for large PDFs with more than 150 tables? I would also like to know about different solutions I could use through open-source libraries.
I am also looking for a sustainable solution to a similar problem; we should collab on it.
@ Sure
How can we share a dashboard within the organization without granting access to the workspace?
Two questions: 1. Can it be deployed on a VPS? 2. Can it read scanned docs or handwritten docs? (Heads up, I'm a beginner!)
Thanks, this helped!
Glad that it was helpful !!
LocalGPT-Vision seems better to me.
Again, fantastic content! Very valuable information. What do you prefer using for RAG: LangChain or LlamaIndex? I'm using Docling more and more thanks to you, but I am still searching for a local ingestion/validation solution (e.g. ColPali or similar) to handle the ingestion of "difficult" data (scanned complex table data). LlamaParse does this pretty well, but I need a local solution. If you have any ideas ... I would love to hear your thoughts. Or maybe an idea for the next video ;) ...
Can this be deployed on a VPS (with an API, even better)?
Where should I put the API key for the local Docker Qdrant? How do I generate it? Thanks in advance!
You can take help from this video Agentic RAG: Build Your First Agentic RAG Using Qdrant and Phidata ua-cam.com/video/3vCNHpLs2-M/v-deo.html
Thanks a Lot
You are welcome !!
Amazing Man !!
I can't see the guardrails on the agent creation panel :/ How can I use them on my agents?
Got it: it depends on the region.
This is not the R1 model that's causing all the fuss; this is one of the smaller distillations DeepSeek made.
Thanks for this helpful video!
Thanks for putting together this nice how-to video 👏
You are welcome !!
Nice! What configuration of Mac are you using, and what configuration is needed? Please let me know.
I am using a MacBook M3 Pro with 36 GB RAM. Having said that, to run small distilled models you don't need that much power. I mentioned it in the video itself, and it's also available on Ollama's GitHub page. Here is what is mentioned: "You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models."
Without an API key, can you provide some examples of chatting with a PDF?
There are many videos on my channel; please navigate through the channel.
I'm looking for a program that could analyze PDF questionnaire responses and, based on the input, output a deficiency list of corrections to the completed questionnaire. Is this possible?
It's great that there are so many models. It would be useful for beginners to have information on how to load such a model onto a laptop, how to attach context to the model, and whether it is possible to attach it at all.
Nice overview, thanks!
I have tried PhiData and it is great. Easy to use and very powerful for some use cases.
Hi sir, thank you for the great explanation... Just a question: will this work well with PDFs containing tables?
You are welcome. It should work, give it a try. If not, you can use some other tools to parse the PDF (Docling, LlamaParse, etc.).
Phidata 🔥🔥🔥
Very well organized content and clear explanations. Thank you!
You are welcome !!
Thank you so much
You are welcome !!
I am getting the "Unable to get a page count. Is Poppler installed and in PATH?" error even after following all the steps; any help, pls?
Run these:
!sudo apt-get install poppler-utils
!apt-get update
!apt-get install -y poppler-utils
Have you tried out n8n or Flowise as well? Any preference?
I haven't used n8n yet, but I have experience with Flowise, and I've shared related videos on my channel. Personally, I prefer Langflow because it's Python-based!!
@@datasciencebasics Thanks. n8n uses LangChain / LangGraph as components, so I believe it's Python-based too.
This is a gem, bro. How do you find these tools?
Thanks, researching these technologies helps me find different tools :)
Dude, I am so happy to have found your channel. Gonna binge your stuff now, and gonna learn me some stuff. You rule 🤘 You get error messages, read them, understand them, then explain them... I could watch this 24/7 Nobody does it this well!
Thank you for the feedback, glad that you find the contents helpful !!
What if the PDF content has multiple languages, such as English and Chinese? Good content btw 😊
Sir, I have tried using the Ollama model and need to connect it with the agent to access the tools. However, I am unable to do so. Is there any possible way to connect Ollama with the agent? Thank you.
There are many tutorials on YT on how to do it.
I am getting an error when I try to run the command pip3 install -r requirements.txt; it's like there is no requirements.txt file. Do you know if something recently changed in this process?
Excellent details, I can find these nowhere else :-)
Glad that you find it helpful !!
I like to use pyenv to install different Python versions. The only problem is that I need to build a PyQt app for Windows XP! So if I install the last version of Python that is officially supported for XP (pyenv install 3.4.4-win32 ; pyenv global 3.4.4-win32), how can I install the last version of PyQt 5.5.1 that works with Python 3.4 / XP? The oldest whl version I can find on PyPI is PyQt5-5.6-cp35-none-win32.whl, but that's Python 3.5. SF does have PyQt5-5.5.1-gpl-Py3.4-Qt5.5.1-x32.exe, but that's a Windows executable installer, so I'm not sure what happens if I install it outside pyenv. Will only Python 3.4.4 be affected and not my other versions of installed Python/PyQt? Interestingly, there is a NEWER whl version of PyQt that supports an OLDER version of Python (PyQt5-5.8-5.8.0-cp34.cp35.cp36.cp37-none-win32.whl).
Is it possible to use a local llama API instead of OpenAI's GPT?
You can install it locally and use local models. For installing locally, you can refer to another video of mine: Building no-code AI Agents Using LangFlow | For Complete Beginners ua-cam.com/video/HwII8r43Fhc/v-deo.html
Thank you for this video. What is your experience with the performance of the solution? It seems that the results take some time to be shown?
My experience is pretty okay. It is simple to use, and the performance depends upon the LLM you use. As for latency, it might also depend on where you use the app from: in the demo I showed, the vector database is hosted in the US but used from Europe.
If your PDF has both tables and text, it was not flattening the table into sentences for the embedding model to understand better.
The table is flattened into Markdown format so the LLM can get info from the table. If you render the Markdown, you can see it as a table again.
Waiting for the next 2 videos: 1. Part 2, and 2. Validation of the RAG results.
Great, you have been great, Koirala ji.
Do I need to download Poppler and Tesseract to use the Unstructured API for PDF files?
If you use it locally then yes; if you use the Unstructured client then no.
@@datasciencebasics Thanks a lot for replying
Ahh, this is so cool! Thanks for including Phidata in your video!
You are welcome !
Can we use Llama 3.2 and the FAISS vector database provided by Meta?
Does a Phidata fine-tuned Python agent app run on Vercel with a Next.js app? I mean, is it compatible with JS on any deployment service?
It should be possible. While Phidata is primarily designed for Python-based AI applications, it’s not directly compatible with Next.js or Vercel’s JavaScript environment. However, you can still leverage Phidata in a Next.js application deployed on Vercel by creating a separate Python backend that uses Phidata and integrating it with your Next.js frontend.
Can these AI commands be limiting?
Great sir, please keep continuing this series.
Awesome video, thanks!
You are welcome !!