How to Create an AI-Assisted Search Engine with Python and txtAI in Seconds! Easy Tutorial

  • Published 27 Oct 2024

COMMENTS • 49

  • @python-programming
    @python-programming  2 years ago +4

    Repository: github.com/wjbmattingly/youtube-txtai
    Help video for Anaconda and Environments: ua-cam.com/video/mIB7IZFCE_k/v-deo.html (a little old but still very useful)

  • @python-programming
    @python-programming  2 years ago +2

    Also in this video, I reference the previous video. That video will go live next week instead.

  • @khalifakhalifa610
    @khalifakhalifa610 2 years ago +3

    Please, we need more videos on txtai.
    Your channel became my favorite!!! Kudos!!

  • @sarasharick5209
    @sarasharick5209 1 year ago +1

    Great video! I might have an opportunity to use this at work instead of some NER I was doing.

  • @satishchaudhary7875
    @satishchaudhary7875 2 years ago +1

    Such a wonderful tutorial on txtai. Please do more videos on txtai and paperai.

  • @rickyS-D76
    @rickyS-D76 2 years ago +1

    Thanks for the great video on txtai; I just loved the way you explained it. I would like to see more of the txtai + Streamlit app videos that you mentioned at the end of this video.

    • @python-programming
      @python-programming  2 years ago +1

      Thanks! So glad you enjoyed it! I will be doing more in the near future!

  • @umangternate
    @umangternate 5 months ago

    Installing txtai[all] failed for me with errors in the fasttext step (cannot build wheels for fasttext). A plain pip install txtai completed without errors, but VS Code still does not detect txtai when I import it. This is on Windows 11 with Visual Studio (C++ build tools) installed. The Python version is 3.11, and the virtual environment is on drive D: (physically separate from drive C: because I use an SSD for C: and an HDD for D:). What could be the problem? Thank you.
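
One common cause of an "installed but not detected" package is that the editor runs a different interpreter from the one pip installed into. Whether that is the cause here is an assumption. The sketch below uses only the standard library to report which interpreter is active and whether it can see a package. The helper name `describe_environment` is hypothetical.

```python
import importlib.util
import sys


def describe_environment(package: str = "txtai") -> dict:
    """Report the running interpreter and whether it can locate `package`."""
    # find_spec returns None when a top-level package is not importable.
    spec = importlib.util.find_spec(package)
    return {
        "interpreter": sys.executable,                   # path of the Python in use
        "in_virtualenv": sys.prefix != sys.base_prefix,  # True inside a venv
        "package": package,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),       # file the package loads from
    }


# Print the report for txtai in whichever interpreter runs this script.
for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

Running this from the editor's terminal and again from the activated virtual environment shows whether both point at the same `interpreter` path. If they differ, selecting the virtual environment's interpreter in the editor is the usual next step.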

  • @debgandharghosh3981
    @debgandharghosh3981 7 months ago

    This video is so helpful for people like me who are taking baby steps towards NLP. I would really love to see how to update a txtai model. The GitHub code for untitled.ipynb might be corrupt, as I couldn't see the code. It wasn't a big issue, though: I could write the code for drawing inferences myself after watching your video.

  • @theh1ve
    @theh1ve 2 years ago +2

    Hi, another awesome little video that shows not only a great use case but also how to get up and running with the code. Thank you. In answer to your questions: yes to all! Integration with Streamlit, absolutely, as this is how I would apply it. And understanding how to update the model would be great. Also, say I had two written texts broken down into smaller documents: could I return a tag with the results to see which text each document came from?

    • @python-programming
      @python-programming  2 years ago +1

      Thanks! Awesome! I will put that video together that shows how to do that. As luck would have it, I have just done this for another project. The nested index will also be included.
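
The question above about tagging results with their source text can be sketched without txtai at all. If a search call returns document ids with scores, a lookup table keyed by id is enough to attach a source tag to each hit. Everything below is illustrative: the corpus, the `source` field, and `toy_search` (a word-overlap scorer standing in for a real semantic search) are assumptions, not the video's method.

```python
import re

# Hypothetical corpus: two texts, each already split into smaller documents.
documents = [
    {"id": 0, "source": "text_a", "text": "The harbour was crowded with fishing boats."},
    {"id": 1, "source": "text_a", "text": "Rain fell on the market square all morning."},
    {"id": 2, "source": "text_b", "text": "The council voted to repair the old bridge."},
    {"id": 3, "source": "text_b", "text": "Fishing boats returned to the harbour at dusk."},
]

# Lookup table from document id back to the full record, including its source tag.
by_id = {doc["id"]: doc for doc in documents}


def tokens(text):
    """Lowercase word set used by the toy scorer."""
    return set(re.findall(r"[a-z]+", text.lower()))


def toy_search(query, limit=2):
    """Stand-in for a semantic search: (id, score) pairs ranked by Jaccard word overlap."""
    q = tokens(query)
    scored = []
    for doc in documents:
        d = tokens(doc["text"])
        union = q | d
        scored.append((doc["id"], len(q & d) / len(union) if union else 0.0))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


def search_with_source(query, limit=2):
    """Attach the source tag to each hit by looking its id up in by_id."""
    return [
        {"source": by_id[i]["source"], "text": by_id[i]["text"], "score": round(s, 3)}
        for i, s in toy_search(query, limit)
    ]


for hit in search_with_source("fishing boats harbour"):
    print(hit["source"], "|", hit["text"])
```

Swapping `toy_search` for a real embeddings search leaves `search_with_source` unchanged, as long as the search still returns ids that exist in `by_id`.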

  • @Frank97006
    @Frank97006 1 year ago +2

    At 2:40 it says txtai needs Python 3.7. What is meant is Python 3.7 or higher.
    So there is no need to install Python 3.7.
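
The "3.7 or higher" reading amounts to a minimum-version check, which can be written as a tuple comparison. This is a minimal sketch: the `(3, 7)` floor is taken from the comment above, and newer txtai releases may require a later Python, so treat the constant as an assumption to verify against the current install notes.

```python
import sys

# Minimum version quoted in the comment above (an assumption for newer releases).
MINIMUM = (3, 7)


def meets_minimum(version=sys.version_info, minimum=MINIMUM):
    """Return True if (major, minor) of `version` is at least `minimum`."""
    # Tuples compare element by element, so (3, 11) >= (3, 7) is True.
    return tuple(version[:2]) >= minimum


print("Running Python", ".".join(map(str, sys.version_info[:3])))
print("Meets minimum:", meets_minimum())
```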

  • @JOHNSMITH-ve3rq
    @JOHNSMITH-ve3rq 1 year ago

    The next big thing is an appropriate UI for users, including showing them the source of each finding. Also, using a vector database that doesn’t rely on memory, since an in-memory one won’t work at scale. This is very cool though!

  • @sacred1profane
    @sacred1profane 2 years ago +1

    Thanks for introducing txtai.
    BTW, the model works with Python 3.9.9 on Mac.

    • @python-programming
      @python-programming  2 years ago

      No problem! Thanks for that update on Mac! I only have Linux and Windows machines. Purchasing a Mac is on my to-do list.

    • @ojaskulkarni8138
      @ojaskulkarni8138 4 months ago

      @@python-programming How did you get the data set?

  • @Kalks95
    @Kalks95 6 months ago

    How did you generate the data file? Would I have to build my own data sets for this?

  • @AndrewPeverells
    @AndrewPeverells 2 years ago +1

    Great video as always, thank you for the tutorial! :)
    Just one quick question: I don't know anything about semantic search or txtai, but is it language independent? Or has it been trained on English?

    • @python-programming
      @python-programming  2 years ago +1

      Thanks! Oh, that is a great question. I did not speak about that. You will want to select a sentence transformer model for your language.

    • @python-programming
      @python-programming  2 years ago +1

      But as a workflow, it is language agnostic.

    • @AndrewPeverells
      @AndrewPeverells 2 years ago +1

      @@python-programming cool, thank you very much! Do you know how many languages these models cover? Is there one for classical languages also?

    • @python-programming
      @python-programming  2 years ago +1

      @@AndrewPeverells a good deal! What language(s) are you looking for? I can do some research and maybe demo that in a video for you.

    • @AndrewPeverells
      @AndrewPeverells 2 years ago +1

      @@python-programming oh well, that would be awesome! You don't have to though, it's fine :) I was just wondering whether there was a transformer model for Latin!

  • @hunaydahsaeid1609
    @hunaydahsaeid1609 2 years ago +2

    If I want to build a system over a database of research papers to determine the originality of new research, what is the best file type, CSV or JSON, for applying machine learning algorithms to it?

    • @python-programming
      @python-programming  2 years ago +2

      Great question. The file really is not important. It comes down to preference and use case. Will it sit on the web? JSON may be better.

    • @hunaydahsaeid1609
      @hunaydahsaeid1609 2 years ago +2

      @@python-programming thank you.. 😊

    • @hunaydahsaeid1609
      @hunaydahsaeid1609 2 years ago +1

      @@python-programming yes, I want to put it on the web... It will contain abstracts and titles for the research papers, and it is supposed to help students find the research most similar to their proposed research. I am applying your lessons on the South Africa data set. You used JSON in the beginning. I'm still learning and haven't completed all your lessons yet.
      Your lessons helped me so much. I'm grateful for it 🙏
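
The point in this thread that the file type is not important can be illustrated with the standard library: the same titles and abstracts load into identical Python records from either CSV or JSON. The two sample papers and the helper names `load_csv` and `load_json` are made up for this sketch.

```python
import csv
import io
import json

# Hypothetical records stored as CSV (note the quoted field containing a comma).
CSV_DATA = """title,abstract
Paper A,"Semantic search over abstracts, with examples."
Paper B,Detecting overlap between research proposals.
"""

# The same hypothetical records stored as JSON.
JSON_DATA = json.dumps([
    {"title": "Paper A", "abstract": "Semantic search over abstracts, with examples."},
    {"title": "Paper B", "abstract": "Detecting overlap between research proposals."},
])


def load_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def load_json(text):
    """Parse a JSON array of objects into a list of dicts."""
    return json.loads(text)


# Both formats yield the same in-memory records for downstream indexing.
print(load_csv(CSV_DATA) == load_json(JSON_DATA))
```

Once loaded, the abstracts can be passed to whatever indexing step follows, so the choice between the two formats can rest on where the data will live, as the reply above suggests.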

  • @ameybikram5781
    @ameybikram5781 1 year ago

    Is this library safe in terms of data breaches?

  • @MonoJunkie
    @MonoJunkie 1 year ago

    I don't think you mentioned whether txtAI sends my/your data off to the cloud somewhere for analysis or refers to any external APIs or providers which would be important if dealing with sensitive information? Or who/what is behind it and whether it is legitimate? Paid/free? Or if there are licensing restrictions?

    • @python-programming
      @python-programming  1 year ago +1

      Last I checked, it is all local. The creator is very pro open source.

    • @neuml
      @neuml 1 year ago +1

      Confirming that txtai is all local and doesn't send your data off to the cloud. You can download a model, disconnect your internet and everything will still work.

    • @python-programming
      @python-programming  1 year ago +1

      @@neuml thanks for responding!

  • @venkatesanr9455
    @venkatesanr9455 2 years ago

    Thanks for the valuable videos. I am working on semantic search that maps a text query to images or other PDF documents as output. For unstructured images, PDFs, and other file types, I have tried the following approaches: OCR-based text extraction from images, text extraction from PDF files, BERT embeddings, and clustering the images.
    Any other input or approaches using libraries would be helpful for semantic search on unstructured data, images, and PDFs. Is txtai open source, and is it helpful for QA between images and text?
    Kindly reply.

  • @techdiyer5290
    @techdiyer5290 11 months ago +1

    I'm looking to create a web scraper in Python that basically helps me find what I'm actually searching for. I want it to include something I've named table search. If anyone cares to ask, I'll explain what that is and how I'm thinking of making it work.

  • @Superdooperhero
    @Superdooperhero 2 years ago +1

    You're in South Africa? If you're in Cape Town I can show you around.

    • @python-programming
      @python-programming  2 years ago

      Thanks! I am actually and would have totally taken you up on that but we are leaving soon. Coming back next winter though!

  • @PabloPazosGutierrez
    @PabloPazosGutierrez 9 months ago

    It would be nice to search by phrases, not just a single word.

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 1 year ago

    What makes txtai different from the alternatives?

    • @neuml
      @neuml 1 year ago

      There are a lot of great vector database options available. txtai strives to make it easy to get up and running fast. It has built-in vectorization, vector storage, hybrid search and an LLM workflow framework for retrieval augmented generation (RAG). Everything runs locally; no external APIs are required.

  • @ravinkponjg
    @ravinkponjg 1 year ago +2

    Please make more interesting videos on txtai.

  • @kosemekars
    @kosemekars 2 years ago +2

    The GitHub link returns a 404.

  • @hunaydahsaeid1609
    @hunaydahsaeid1609 2 years ago +1

    👋

    • @khalifakhalifa610
      @khalifakhalifa610 2 years ago

      Please more videos. You’re my favorite channel now!!!