NEW AI Framework - Steerable Chatbots with Semantic Router

  • Published 15 Sep 2024

COMMENTS • 164

  • @Darthus
    @Darthus 8 months ago +18

    This may have been among your suggestions, but I can also see this being used to route queries to different RAG models. For example, if someone is asking about the rules of a game versus asking to play the game, you could use separate models tuned specifically for information retrieval vs. creativity.

    • @jamesbriggs
      @jamesbriggs  8 months ago +10

      should be doable with the current version of the library - putting together some examples over the coming weeks and will include something like this, thanks!

    • @kavian4249
      @kavian4249 3 months ago

      @@jamesbriggs Any updates for this?

  • @broomva
    @broomva 8 months ago +4

    This is great for creating fuzzy if/else-like statements, where the condition refers to a cloud of options - really cool. It's like filtering based on the embedding space.

  • @benjaminrigby877
    @benjaminrigby877 8 months ago +10

    This is, of course, intent detection under a different name. The complexity it brings is that you now have to monitor and update two different systems - so somebody starts asking for a "route" in a slightly different way, or with a different accent that gets transcribed differently, and you've got to be able to catch it and fix it across both the generative and intent systems.

    • @thebozbloxbla2020
      @thebozbloxbla2020 8 months ago +2

      I guess one way to solve that would be to have all routes defined in something like a dict, and then have another LLM check, asynchronously, whether the correct semantic route was chosen. If no useful results come back from, say, a vector search in the specific category that was defined as a route and chosen by the route selector, then we can fall back to the route suggested by the LLM (which could be done async).
      In the end we can compare results and send whichever is better as the system response.
      Obviously I haven't looked too deeply into time optimisation, but I have a feeling it could still be faster (rough sketch of the flow below).
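
      A minimal sketch of that fallback flow, purely illustrative - the helper functions here are hypothetical stand-ins, not semantic-router APIs:

        import asyncio

        def semantic_pick_route(query: str) -> str:
            return "rules"          # imagine: fast embedding-based route selection

        async def llm_pick_route(query: str) -> str:
            await asyncio.sleep(1)  # imagine: slower LLM classification running in parallel
            return "play"

        async def vector_search(route: str, query: str) -> list[str]:
            return []               # imagine: search the index tied to the chosen route

        async def answer(query: str) -> list[str]:
            llm_task = asyncio.create_task(llm_pick_route(query))      # start LLM check async
            results = await vector_search(semantic_pick_route(query), query)
            if results:                                                # fast route was useful
                llm_task.cancel()
                return results
            return await vector_search(await llm_task, query)          # fall back to LLM's route

        asyncio.run(answer("how does castling work?"))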

    • @larsbell1569
      @larsbell1569 6 months ago +1

      Ya but won’t this be an order of magnitude cheaper and faster than an intent system?

  • @itsjustmeemman
    @itsjustmeemman 8 months ago +3

    I have a production LLM endpoint and I realized that 70% of the total processing time is spent on the different guardrails, intent identification, and other classifiers in the sequence for a simple RAG 🥲. Thank you for this video - I'll implement this and hopefully give some feedback or share what I've learned.

  • @truehighs7845
    @truehighs7845 8 months ago +1

    This is pretty much exactly what Langchain was missing, especially combined with actions and API calls! Well done!

  • @RichardGetzPhotography
    @RichardGetzPhotography 8 months ago +2

    Building colabs to make James laugh during filming..... priceless!!

    • @jamesbriggs
      @jamesbriggs  8 months ago

      yes need more of these haha

  • @dusanbosnjakovic6588
    @dusanbosnjakovic6588 8 months ago +2

    I love your work and this video. However, I have implemented something like this in heavy production and faced a lot of issues. Chat history, complex multi-intent queries, and just overall accuracy prevented us from moving forward. Surprised you think this is such a slam dunk.

  • @albertgao7256
    @albertgao7256 2 months ago

    Simple idea, but it solves real problems, and I would really love to see more about the semantic usage. LLMs have been studied too much, but the underlying embedding models could be used for much more than just providing search for RAG. Love your videos mate, always high quality!

  • @concretec0w
    @concretec0w 8 months ago +1

    Sooooo soooo coool :) This is going straight into my voice assistant so that i don't need multiple keyboard shortcuts to handle different tasks :D

    • @jamesbriggs
      @jamesbriggs  8 months ago

      haha sounds awesome, hope it goes well :)

  • @Sarah_ai_student
    @Sarah_ai_student 3 months ago

    Your videos are fantastic, offering excellent content. They are probably the best on YouTube for beginners in AI and generative AI.
    I am a student and my school hasn't quite caught up to speed on this subject yet. Could you create a video on how to develop a full-stack chatbot application using Python with Django or another Python framework, incorporating a vector database like Milvus (or another), a retrieval-augmented generation (RAG) approach, and a locally hosted language model such as Mistral (or another)?
    A Q&A-style format would be greatly appreciated.

  • @scharlesworth93
    @scharlesworth93 8 months ago +1

    Cool I gotta wait two days to watch

    • @jamesbriggs
      @jamesbriggs  8 months ago

      I need some time to get everything ready 😅 will not make you wait for following videos :)

  • @tonyrungeetech
    @tonyrungeetech 8 months ago +8

    Really looking forward to seeing it applied to RAG! I've been thinking about some sort of process where we search the top K results, then do a simple semantic evaluation of 'yes I found it' or 'not enough info' to trigger a more in-depth search - is it something along those lines?

    • @megamehdi89
      @megamehdi89 8 months ago +1

      Great idea

    • @jamesbriggs
      @jamesbriggs  8 months ago +2

      yeah it feels like it would be a similar process when using the default RouteLayer + static Routes

  • @plashless3406
    @plashless3406 8 months ago +1

    This is really interesting. Will definitely try it out. Great job, James and team.

  • @jdray
    @jdray 8 months ago

    Watching this with interest. A few months ago I identified a hole in the general field of AI stacks that I conceptually filled with a tool called BRAD (though by now I've forgotten what it stood for). In essence the idea was the same as here, except with a tiny, fast LLM deciding on the route rather than routing logic like you describe here. So thank you for (sort of) implementing my vision. Now I can take this item off my list of things to work on. 😁

    • @jamesbriggs
      @jamesbriggs  8 months ago +1

      haha I'm glad we were able to accidentally help out with your vision of BRAD!

  • @carterjames199
    @carterjames199 8 months ago

    I love how similar it is to guardrails; the semantic search based on preset utterances was amazing when I first saw it in your video last year.

    • @jamesbriggs
      @jamesbriggs  8 months ago

      yeah I do love guardrails, it's a great library and of course it inspired what we're building here

  • @ahmadzaimhilmi
    @ahmadzaimhilmi 8 months ago

    This has a good use case in research papers if it can be restructured to categorize sentences into types such as results, challenges, methods, etc.

  • @nicolaithomsen7005
    @nicolaithomsen7005 7 months ago

    As always, mind-blowing. Great video, James. My favorite GenAI educator!

  • @deemo16
    @deemo16 8 months ago

    Awesome work, and great contribution! I have a clunky solution using langchain LLMRouter (which gets me there, but as you pointed out, slow, and a bit awkward). I look forward to implementing this (I can already tell it will be a much cleaner solution than the convoluted logic I've been fumbling with)! Very cool project 🙂

  • @user-fs5lb3ce3b
    @user-fs5lb3ce3b 8 months ago

    oh, I've been dreaming about this possibility; a kind of dynamic workflow engine that allows the LLM to choose amongst plausible decision paths... will watch

  • @nathancanbereached
    @nathancanbereached 4 months ago

    Have you tested precision in situations where there are multiple routing options that have multiple criteria and are mostly similar? I wonder if there would be benefit to layering 2 or more routing layers in a branching decision-tree format. Would need to test, but theoretically you could have the quality of tree-of-thought decisioning at lower cost and higher speed, and each decision point could be human-reviewed before pushing to production. I'm glad I found your channel again - definitely worth looking into.

  • @berdeter
    @berdeter 8 months ago +2

    Very interesting. A few questions: how does it compare to the NeMo Guardrails you've covered earlier? Have you considered the case where a user's question would trigger several semantic routes?
    Let's say you make a bot for candidates going to a job search event in a big company.
    User says "I love AI. Can you propose something for me?"
    Route 1: go to the IT desk
    Route 2: attend the conference about digital transformation at 11am
    Route 3: check this job offer that is made for you
    So the bot should be able to answer with all 3 suggestions.

    • @jamesbriggs
      @jamesbriggs  8 months ago +1

      we want to support multiple routes, but it isn't implemented yet - there's nothing in the current methodology that would prevent this though

    • @berdeter
      @berdeter 8 months ago

      @@jamesbriggs I have implemented multiple semantic routes on a project. I couldn't find a good framework so it's all hand-made.
      I work with a list of intentions such as "I want to attend a conference about digital transformation".
      I perform a semantic search and I have 3 thresholds:
      Never more than 3 intentions selected.
      Never select an intention under a cosine similarity of 82%.
      If I find an intention with cosine similarity above 91%, I keep only that one and disregard the rest.
      The percentages come from experimentation.
      I think that would be a good starting point to implement something generic in your framework (a sketch of this selection logic follows below).
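
      That three-threshold selection rule is easy to express in code. A minimal sketch, with the route names and scores invented for illustration:

        def select_routes(similarities, max_routes=3, floor=0.82, dominant=0.91):
            """Pick routes from a {route_name: cosine_similarity} dict using the
            thresholds described in the comment above."""
            ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
            if ranked and ranked[0][1] >= dominant:   # a single very strong match wins outright
                return [ranked[0][0]]
            return [name for name, score in ranked[:max_routes] if score >= floor]

        print(select_routes({"it_desk": 0.88, "conference": 0.85, "job_offer": 0.60}))
        # -> ['it_desk', 'conference']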

  • @dusanbosnjakovic6588
    @dusanbosnjakovic6588 8 months ago +3

    Do you have any stats on accuracy differences between this and an LLM? Especially on longer queries.

  • @BradleyKieser
    @BradleyKieser 8 months ago

    Very interesting idea. I have been using a similar idea but with a lightweight, fast, locally hosted LLM for the routing decision (your "RAG" technology adds a very good layer of speed and precise control). Weird hearing "route" ("root") pronounced as "rowt". Very Game of Thrones. Guess it's for American viewers. Excellent presentation, clear explanation and a very well thought out overview.

  • @mustafadut8430
    @mustafadut8430 7 months ago

    Man in nice shirt. I can't wait to see an example of it collaborating with a Langchain chatbot.

  • @alivecoding4995
    @alivecoding4995 5 months ago

    I love your work and content! Thanks so much, James. :)

  • @jakobkristensen2390
    @jakobkristensen2390 1 month ago

    Super cool project

  • @danielvalentine132
    @danielvalentine132 8 months ago

    Brilliant. Simple and elegant solution.

  • @gfertt13
    @gfertt13 8 months ago +3

    I have been using nemo guardrails for work, but exclusively for the "input rails", which seem to be very similar to this library, with the important difference being that it includes an LLM call to make the final decision of classifying into a route/"canonical form". I'm curious, have you run any tests against a benchmark similar to the way the nemo guardrails paper shows tests against NLU++ benchmark?

    • @gfertt13
      @gfertt13 8 months ago

      From reading through some of the code it seems like you handle the KNN search with plain numpy and linear search. Any reason for not using a package like FAISS here?
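
      For reference, the FAISS variant the comment asks about might look roughly like this (illustrative only - not how the library is implemented; it assumes normalized embeddings so inner product equals cosine similarity):

        import numpy as np
        import faiss

        utterances = np.random.rand(500, 384).astype("float32")   # toy utterance embeddings
        faiss.normalize_L2(utterances)                             # normalize so IP == cosine
        index = faiss.IndexFlatIP(utterances.shape[1])             # exact inner-product index
        index.add(utterances)

        query = np.random.rand(1, 384).astype("float32")
        faiss.normalize_L2(query)
        scores, ids = index.search(query, 5)                       # top-5 most similar utterances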

  • @storytimewithme2
    @storytimewithme2 5 months ago

    Getting Thailand vibez off this guy

  • @awakenwithoutcoffee
    @awakenwithoutcoffee 2 months ago

    This is incredible James. I have implemented a similar (but less powerful) system inside Botpress and was looking for LangChain intent classification/routes, which this seems to cover very well. Are you still using this in production or has LangChain come out with an alternative that you prefer? Keep going my man.

  • @KarlJuhl
    @KarlJuhl 8 months ago

    Great development, thanks for sharing.
    Personally I have been using LLMs for an intent detection step, to route a user query to a given prompt that has the relevant context to answer a user question.
    I'm interested to see the latency of this in the same setup. Makes total sense to use vector space to cluster a user query and route to the most similar intent in the space.

  • @mooksha
    @mooksha 8 months ago +1

    Very useful, looking forward to more videos on this! Will explore the repo as well. How much work has gone into optimising and testing for a large number of semantic routes? Is it still fast if there are 200 routes with 5 utterances each? Also how does it fare if routes have many utterances, like 100 each?

  • @avg_ape
    @avg_ape 6 months ago

    Fantastic contribution. Thank you.

  • @tajwarakmal
    @tajwarakmal 8 months ago

    This is fantastic! already have a few use cases in mind.

  • @SimonMariusGalyan
    @SimonMariusGalyan 7 months ago

    Great work which can be integrated into apps speeding up data processing and reducing hallucinations… 🎉

  • @yoshkebab
    @yoshkebab 8 months ago +3

    Very interesting. How do you handle history and context? A lot of the time a single prompt can't be categorized on its own, and routing will change according to context. I'm curious about your approach to this.

    • @jamesbriggs
      @jamesbriggs  8 months ago +1

      for now, when using the suggestive method I demo in the video, the LLM makes the final decision on what to do, so we get around the issue of implementing support for chat history and just use a single query
      In any case, I 100% agree, and adding support for chat history is our next priority; we have already built the methodology for how we will handle it, which you can see here github.com/aurelio-labs/cookbook/blob/main/semantic-analysis/semantic-topic-change.ipynb

  • @narutocole
      @narutocole 8 months ago

    Super excited to try this!

    • @jamesbriggs
      @jamesbriggs  8 months ago

      hey Jordan! Let me know how it goes!

  • @nikosterizakis
    @nikosterizakis 6 months ago

    That is a great piece of development James. Certainly a very useful 'brick in the wall' of the LLM ecosystem and something we will be using in future projects. Out of interest, are you guys going to branch out to the other two gen AI areas, namely video and audio?

    • @jamesbriggs
      @jamesbriggs  6 months ago +1

      For sure, we're already taking some steps into multi-modal, for example using semantic router we're already doing:
      - Image detection ua-cam.com/video/EqKjaLrpeI4/v-deo.html
      - Video processing github.com/aurelio-labs/semantic-router/blob/main/docs/examples/video-splitter.ipynb

    • @nikosterizakis
      @nikosterizakis 6 months ago

      Great stuff, with permission I am going to test drive that python code!

  • @marcomorales9417
    @marcomorales9417 6 months ago

    Hey! I've found that using rules within the initial system message works great for filtering out questions on topics unrelated to the domain the RAG chatbot should answer; maybe the win with this approach is reducing cost and time? Also I've been playing around with having one chatbot that can have different phases within a conversation, set by different system messages. For example, stage 1 is to extract customer information to obtain a lead and then help them with their question; this information can be output by the LLM through a JSON response. Quite interesting.

  • @avatarelemental
    @avatarelemental 8 months ago

    This sounds great ! I will give it a try

  • @MidtownAI-qi1yq
    @MidtownAI-qi1yq 8 months ago +1

    With ReAct prompting, you can trigger a sequence of actions. How would semantic router handle the same processing? It appears to me that everything needs to be programmed in and anticipated, instead of expecting a "reasoning engine" which can in theory build any workflow based on the available tools. Just thinking about the limitations of this approach. Very creative though!

    • @jamesbriggs
      @jamesbriggs  8 months ago

      yes it would require a lot of logic built around it to support multiple steps, I think it would be doable, especially if you have differing sets of routes that could enter the "decision space", but it would be nontrivial with the current lib - I think we could build some interesting tooling to attempt to solve for this use-case

  • @lorenzospataro26
    @lorenzospataro26 5 months ago

    Have you thought about using SetFit as an alternative to pure semantic matching? If you have a small dataset to train the intent classifier, it would probably perform better for some use cases (at the cost of being a bit slower, but still faster than an LLM).

  • @johnny017
    @johnny017 8 months ago

    Very cool! I built something similar for a project, but much simpler. I also agree that, for now, it doesn't make much sense to use agent reasoning in production. It takes too much time to output, and it consumes too many tokens. I will keep an eye on the repo and try to contribute!

    • @jamesbriggs
      @jamesbriggs  8 months ago

      I think there are times for agent reasoning, but it is certainly overused - and I believe we will improve semantic-router to replace more of those more expensive+slow reasoning steps with more efficient methods
      We'd love to have you contribute!

  • @GiovanneAfonso
    @GiovanneAfonso 8 months ago

    Incredible work! Thank you for sharing, I'm really excited right now, I'll give it a try.
    Could you answer some questions?
    - Is it possible to have 1000 different routes? (I'm just curious)
    - Is it possible to cache / reuse embeddings across multiple services?
    - Is it using an AI model under the hood for choosing the right route?

    • @jamesbriggs
      @jamesbriggs  8 months ago +3

      Great to hear!
      - yes it could have 1000 different routes in theory, I have not tested yet
      - we don’t cache the embedding yet but we are adding it to the RouteLayer.to_file method
      - the default RouteLayer uses vector space and embeddings to create what is essentially an inherent classification model

    • @GiovanneAfonso
      @GiovanneAfonso 8 months ago

      @@jamesbriggs you guys are making the future more shiny. Thanks

  • @mrchongnoi
    @mrchongnoi 8 months ago

    Thank you for the video. Very useful

  • @socalledtwin
    @socalledtwin 8 months ago

    For now, it could be useful for selecting the best RAG pipeline or collection to search through, though it could be the case that future models like GPT-5 are so good at determining the correct function to call or collection to search, it won't be needed. Either way, nice work.

  • @cuburtrivera1167
    @cuburtrivera1167 7 months ago

    I think this can be replicated by just using an output parser, but this abstraction layer makes it easier.

  • @codingcrashcourses8533
    @codingcrashcourses8533 8 months ago

    I don't really like the LangChain integration, accessing attributes and overwriting prompts like this: agent.agent.llm_chain.prompt = new_prompt :(. I still hope for built-in LangChain functionality for something like this. Thanks for the demonstration. It is still a small and young project.

  • @micbab-vg2mu
    @micbab-vg2mu 8 months ago

    Great video - thank you. I will try it.

  • @mathiasschrooten903
    @mathiasschrooten903 2 months ago

    Could this be integrated with LangGraph? If so, it seems like the perfect hybrid solution!

  • @musifmuzammir354
    @musifmuzammir354 8 months ago +3

    Isn't this how the RASA framework works?

    • @jamesbriggs
      @jamesbriggs  8 months ago +1

      RASA does intent detection, and that is to some degree what we're doing here - to be honest I have not used RASA for years so I don't know where they are now, but it began to get quite dated - I unfortunately don't know enough to compare their recent versions to this, I will investigate though

    • @musifmuzammir354
      @musifmuzammir354 8 months ago

      @@jamesbriggs It still does the intent classification only. But this seems like a faster approach, finding the correct intent using embedding search.

    • @familyaccount-eb7cb
      @familyaccount-eb7cb 5 months ago

      Rasa started moving away from intent-based routing because it doesn't handle follow-up inputs well (input that needs the context of previous messages). I do wonder how semantic router handles these follow-up inputs.

  • @dawid_dahl
    @dawid_dahl 8 months ago

    Just so I can understand... do you mean that just by 1) providing some examples of those various sentences ("isn't politics the best thing ever", etc.) and 2) the user's embedded query, it will give back a route without ever going to an actual LLM to make the routing decision?

  • @JanVansteenlandt
    @JanVansteenlandt 5 months ago

    What I'm wondering is how best to use this when dealing with chat history. For example, if you ask a political question followed up by "please elaborate", the message itself does not mean anything, but taking the previous question into account it does... Is it as simple as concatenating the last X user questions and using that as the basis for the routing input? Giving the routing input a limited "memory", but memory nonetheless (see the sketch below).
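
    One minimal way to try that "limited memory" idea - assuming route_layer is a callable RouteLayer like the one built in the video:

      def route_with_history(route_layer, user_turns, window=3):
          """Route on the last few user messages joined together, not just the latest one."""
          routing_input = " ".join(user_turns[-window:])
          return route_layer(routing_input)

      # e.g. user_turns = ["who is the prime minister?", "please elaborate"]
      # routing on the joined text keeps the political context that
      # "please elaborate" lacks on its own.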

  • @robcz3926
    @robcz3926 4 months ago

    man this is so much easier than langgraph routing, and it works with smaller models too

    • @jamesbriggs
      @jamesbriggs  4 months ago +1

      yeah we use both in projects, but I'm planning to try building agents that rely wholly on semantic routes soon

    • @robcz3926
      @robcz3926 3 months ago

      @@jamesbriggs looking forward to that mate🤘

  • @crotonium
    @crotonium 8 months ago

    Isn't this just calculating the cosine similarity between the input query and the mean of each route class to categorize which route it belongs to?
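
    If the mechanism is as the comment describes, a toy version looks like this (illustrative only; the library may score individual utterance embeddings rather than class means):

      import numpy as np

      def cosine(a, b):
          return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

      def pick_route(query_vec, route_centroids, threshold=0.8):
          scores = {name: cosine(query_vec, c) for name, c in route_centroids.items()}
          best = max(scores, key=scores.get)
          return best if scores[best] >= threshold else None   # None -> no route triggered

      # route_centroids could be built as {"politics": politics_vecs.mean(axis=0), ...}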

  •  8 months ago

    Thanks James, this is a very good solution for us moving forward to using agents instead of chains.
    How do you see this fitting with more advanced RAGs that have query preprocessing, reranking, etc.?

  • @marktucker8537
    @marktucker8537 6 months ago +1

    How does Semantic Router work under the hood?

  • @megamehdi89
    @megamehdi89 8 months ago

    Key Insights by TubeOnAI:
    1. Semantic Router Introduction: The Semantic Router is presented as a crucial layer for achieving control and determinacy in AI dialogue. It is described as a fast decision-making layer for natural language processing, enabling instant triggering of specific responses based on predefined queries. The speaker emphasizes its significance in refining the behavior of AI assistants and chatbots, stressing its necessity for deploying such systems.
    2. Library Setup and Integration: The video provides a step-by-step guide on setting up and using the Semantic Router library. It introduces the installation process and demonstrates how to define routes and test their interaction. The integration with an AI assistant is showcased, illustrating how the Semantic Router augments user queries and influences the agent's responses based on predefined routes.
    3. Enhancing AI Dialogue and Control: The speaker highlights the capabilities of the Semantic Router in influencing AI behavior, protecting against unwanted queries, and suggesting specific actions or information to the AI assistant. The framework is portrayed as a tool for not only steering dialogue but also enhancing the AI's decision-making process and overall functionality.
    4. Future Developments and Open Source Collaboration: The video concludes with an outlook on future developments, expressing the speaker's excitement about the potential of the Semantic Router framework. The open-source nature of the project is emphasized, encouraging contributions and promising further insights into advanced features such as dynamic routing and the hybrid layer.

  • @TommyJefferson1801
    @TommyJefferson1801 8 months ago

    So basically, this avoids additional time during inference, if I'm not wrong.
    Also, why not fine-tune an LLM and use function calling? I mean, yes, it can take some time, but how well does this approach compare to that in production scenarios? Do we have benchmarks on this?

  • @llaaoopp
    @llaaoopp 8 months ago +1

    How does this differ from NeMo Guardrails? I thought this pre-check of your query against a database of semantically embedded utterances is at the heart of how Guardrails functions.

    • @jamesbriggs
      @jamesbriggs  8 months ago +4

      Yes, I'm very familiar with the lib - it is what we used before semantic router and I and others from aurelioAI made a few contributions to make the lib easier to develop with and deploy in projects, but ultimately it was limiting + overly complex.
      To add things like dynamic routing, hybrid layers, etc (I will talk about these soon) was too difficult. We also have other upcoming features such as a topics-based conversation splitter that we believe will be key to getting the next level of performance from this type of approach. Again, implementing in guardrails would have been more complex than developing it independently, and possibly even out of scope for what they're building.
      Although I think guardrails is awesome, and it served as the starting point for what we're building here, I ultimately felt it better to move away from the library, that may change if nvidia decide to put more resources into it, but I haven't seen this happen yet.

    • @llaaoopp
      @llaaoopp 8 months ago

      @@jamesbriggs
      I completely agree. Guardrails felt a bit clunky if you wanted to actually code around it, and honestly it felt like it was not really "built for the job", but more like a PoC. I love the direction you guys are going with this and can't wait to see what other concepts you come up with around semantic routing!
      Cheers :)

  • @DemetrioFilocamo
    @DemetrioFilocamo 8 months ago

    Great project and thanks for open sourcing it! What’s the difference with Nemo Guardrails?

  • @plashless3406
    @plashless3406 8 months ago +1

    I have one question though: in order to make the most out of a route, do we need many utterances to cover the whole use case of a route? I mean, what if a user query belongs to a route but it returns no matching route?

    • @jamesbriggs
      @jamesbriggs  8 months ago

      You need to cover the routes well. If I see a query miss a route, I usually just add it directly to the route and do that iteratively, adding queries that were missed

    • @plashless3406
      @plashless3406 8 months ago

      @@jamesbriggs this really is interesting and promising.

  • @gabrieleguo
    @gabrieleguo 8 months ago

    Very interesting framework. What if the utterances are in another language? Does it still work well? I guess it depends on what the encoder supports - is that correct?

  • @Truzian
    @Truzian 8 months ago +1

    Would this ever be supported in TS/JS?

    • @jamesbriggs
      @jamesbriggs  8 months ago

      would love to build it, we do have some great TS/JS devs, so it's quite likely

  • @dgroechel
    @dgroechel 8 months ago

    Great video. With function calling, how do you generate the arguments and keep the speed?

  • @thebozbloxbla2020
    @thebozbloxbla2020 8 months ago

    I was wondering if we could take the semantic route titles and use them for index searching in vector databases, to return similarity results more related to the topic...

  • @parchamgupta8417
    @parchamgupta8417 8 months ago

    Hi, just had a thought: don't you think Dialogflow already does this kind of task, although at a much more basic level?

  • @thebozbloxbla2020
    @thebozbloxbla2020 8 months ago

    Hey man, really weird question but I hope you respond. I was wondering if the routes we created get saved in some sort of database, i.e. all the utterances and whatnot, or are you just vectorising the utterances each time you open a new instance?
    Can I define a route for the long term?

  • @wolpumba4099
    @wolpumba4099 8 months ago

    *Summary*
    *Introduction to Semantic Router*
    - 0:00 - Introduction to the concept of a semantic router as a key component in building AI assistants.
    - 0:24 - Definition and purpose of a semantic router in AI dialogue systems.
    *Working Mechanism of Semantic Router*
    - 0:42 - Semantic router acts as a fast decision-making layer for language models.
    - 1:00 - The deterministic setup of the semantic router through query-response mapping.
    - 1:33 - Personal experience of using semantic routers in chatbots and agents.
    *Setting Up the Semantic Router*
    - 2:04 - Guide to accessing and installing the semantic router library.
    - 2:32 - Explanation of the library installation process and version details.
    - 3:00 - Steps to restart session post-installation in Google Colab.
    - 3:11 - Creating and testing sample routes for the semantic router.
    *Practical Examples and Usage*
    - 4:07 - Demonstration of initializing embedding models for the router.
    - 4:52 - Introduction to different types of route layers in the library.
    - 5:46 - Testing and interpreting the output of the semantic router with various queries.
    - 7:00 - Example of using the semantic router to control dialogue topics (e.g., politics).
    *Integration with AI Agents*
    - 7:42 - Demonstrating the integration of the semantic router with an AI agent.
    - 8:03 - Enhancing agent responses using semantic router augmented queries.
    - 10:07 - Various applications of the semantic router in customizing agent interactions.
    *Conclusion and Future Developments*
    - 12:43 - Reflections on the implementation and effectiveness of the semantic router in projects.
    - 13:01 - Acknowledging the early stage of the semantic router but emphasizing its effectiveness.
    - 13:30 - Invitation for community involvement and future instructional content on advanced features.
    *Closing Remarks*
    - 14:12 - Concluding thoughts and anticipation for future developments and community engagement.

  • @franciscocaruso4458
    @franciscocaruso4458 8 months ago

    Hey, great video and a very interesting tool!! This looks very similar to NeMo Guardrails. What is the difference in this case?

  • @caiyu538
    @caiyu538 8 months ago

    Great great. Great

  • @naromsky
    @naromsky 8 months ago

    The name is a banger.

    • @jamesbriggs
      @jamesbriggs  8 months ago

      I wish I could take credit, but it's all the aurelioAI team, my ideas were terrible and fortunately not chosen 😅

  • @carterjames199
    @carterjames199 8 months ago

    This is really cool. Can the semantic router use open-source embedding models, or do you have plans to allow that, maybe with local MiniLM-style models?

    • @jamesbriggs
      @jamesbriggs  8 months ago

      yes you can already, we added it this weekend - see here github.com/aurelio-labs/semantic-router/blob/main/docs/05-local-execution.ipynb
      it works incredibly well - using Mistral 7B we get better performance than GPT-3.5 on the few tests I did

  • @andydataguy
    @andydataguy 8 months ago

    BigAI p100 whey protein 😂 love to see the sense of humor!

  • @roberth8737
    @roberth8737 8 months ago

    Passing in just the query could often lack context - requiring the usual "create a standalone question.. etc etc" so that statements are correctly interpreted. Although this could be done with a quick 3.5 query, could we somehow combine the past X queries semantically without resorting to LLMs to solve for those cases?

    • @jamesbriggs
      @jamesbriggs  8 months ago

      We’re working on it, expecting to have a v1 feature for identifying most relevant messages required for a query by next week

  • @eyemazed
    @eyemazed 8 months ago

    Interesting. How do you determine the threshold for whether a user query belongs to a certain route?

    • @jamesbriggs
      @jamesbriggs  8 months ago +1

      A default value is set based on the encoder being used at the moment, I want to add route specific thresholds and auto optimization of those soon

  • @JulianHarris
    @JulianHarris 8 months ago

    It’d be quite fun to integrate this with the ollama web ui project.

  • @luisliz
    @luisliz 8 months ago

    This is awesome. Is this similar to guidance-ai? Sorry if I'm confusing concepts.

  • @zongguixie3407
    @zongguixie3407 3 months ago

    Uvula oum😊

  • @merefield2585
    @merefield2585 7 months ago

    OK, but isn't this theoretically equivalent to feeding function prompts to catch certain user behaviour? I've already written an escalation function on my chatbot that responds to anger or frustration and is written as a "pure" function prompt. The advantage of that approach is that I don't have to maintain another layer of logic. Presumably OpenAI is using semantic similarity thresholds to trigger functions, so the approach here is not a lot different? Could you elaborate on the advantages and any disadvantages of doing this locally? On the politics issue, why not just add something like "you must NEVER discuss politics and must decline to do so politely" to the system prompt?

    • @varunmehra5
      @varunmehra5 3 months ago

      Do you mind sharing it?

  • @areebtariq6755
    @areebtariq6755 5 months ago

    If we are using embeddings for this, where does the system store them?
    Or are they generated at runtime?

    • @jamesbriggs
      @jamesbriggs  5 months ago

      Using the local index it will be rebuilt with each session - however we support Pinecone and Qdrant indexes, which will maintain the embeddings; video on that here ua-cam.com/video/qjRrMxT20T0/v-deo.html

  • @georgegowers4037
    @georgegowers4037 8 months ago

    Is this the same as the planner in Semantic Kernel SDK?

  • @onufriienko
    @onufriienko 8 months ago

    Thanks for sharing James,
    Are there any examples with LlamaIndex? Thanks 😊

    • @jamesbriggs
      @jamesbriggs  8 months ago

      not yet, will be working on creating examples w/ different libs over the coming weeks

  • @scharlesworth93
    @scharlesworth93 8 months ago

    This reminds me of that Nemo rails thing you were talking about a while back, is this the preferred tool?

    • @jamesbriggs
      @jamesbriggs  8 months ago

      for me yes, we built this due to limitations we were seeing with nemo (although, nemo is still a great tool)

  • @avidlearner8117
    @avidlearner8117 8 months ago

    Isn't that close to what Constitutional AI, by Anthropic, does? I can see your solution has layers of functionality that the other method doesn't have...

  • @eyemazed
    @eyemazed 8 months ago

    I built a RAG for our in-house project management system. It works fine when users write a prompt like "when was X topic opened and who started it", because it performs a vector search on "X" and then creates a context around it. However, it does not work for a prompt like "who's the newest registered user?", because that cannot be retrieved by vector search; it can, however, be retrieved via the database. Does anyone know how to solve this? Given that LLMs can in fact write DB queries... I'm thinking some sort of routing should be implemented for this as well.

    • @towards_agi
      @towards_agi 7 months ago

      Built something similar but was using a Formula 1 dataset. Used a graph DB to do multi-hop queries.
      E.g. I asked it when a particular driver won their first race. Based on the graph schema, the LLM would generate a query that linked the driver and all their races, then filtered to races where the driver got 1st and returned them in ascending order.
      Let me know if you need assistance.

  • @Bubbalubagus
    @Bubbalubagus 8 months ago

    How does this differ from NLU inference architecture?

  • @sangyeonlee5417
    @sangyeonlee5417 8 months ago

    Wow, super excited. What about using different languages like CJK?

    • @jamesbriggs
      @jamesbriggs  8 months ago +1

      we support Cohere embedding models, and they do have multilingual support, so you would initialize our CohereEncoder using `CohereEncoder(name="embed-multilingual-v3.0")` and that comes with support for CJK languages as far as I know :)
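
      For anyone wanting to try that, a minimal sketch based on the setup shown in the video (import paths and class names are assumed from the library as it was at the time, and may differ in later versions; requires a Cohere API key):

        from semantic_router import Route
        from semantic_router.encoders import CohereEncoder
        from semantic_router.layer import RouteLayer

        encoder = CohereEncoder(name="embed-multilingual-v3.0")   # multilingual embeddings
        chitchat = Route(name="chitchat", utterances=["안녕하세요", "こんにちは", "你好"])
        rl = RouteLayer(encoder=encoder, routes=[chitchat])
        print(rl("今日は天気がいいですね").name)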

    • @sangyeonlee5417
      @sangyeonlee5417 8 months ago

      Thanks a lot, I will try it. @@jamesbriggs

  • @pascalshehata7648
    @pascalshehata7648 8 months ago

    Hey, isn't it just another name for intents?

  • @adsk2050
    @adsk2050 2 months ago

    How is this different from langgraph?

  • @Kalebryee
    @Kalebryee 8 months ago

    How is this different from guardrails?

  • @shaunpx1
    @shaunpx1 7 months ago

    Can this work for triggering function calls when some response or similar response (utterance) is detected?

    • @jamesbriggs
      @jamesbriggs  7 months ago

      yeah 100%, I do exactly that here ua-cam.com/video/NGCtBFjzndc/v-deo.html

  • @smoq20
    @smoq20 8 months ago

    Is there a way to make it 100% local without relying on OpenAI or Cohere? ChromaDB?

    • @jamesbriggs
      @jamesbriggs  8 months ago +1

      Fully local support in progress, doesn’t need chroma

  • @larsbell1569
    @larsbell1569 8 months ago

    What if it happens to contain both?
    dl("The rainy weather we are having today reminds me of the Prime Minister who is a damp dull man.")

  • @landerstandaert6649
    @landerstandaert6649 8 months ago

    Similar to Microsoft Copilot Studio

  • @Mr_Arun_Raj
    @Mr_Arun_Raj 8 months ago

    Can I use Hugging Face embeddings?

    • @jamesbriggs
      @jamesbriggs  8 months ago

      yes they were added a couple days ago github.com/aurelio-labs/semantic-router/pull/90

    • @jamesbriggs
      @jamesbriggs  8 months ago

      can see an example notebook here github.com/aurelio-labs/semantic-router/blob/main/docs/encoders/huggingface.ipynb

  • @seel1823
    @seel1823 8 months ago

    Does it work with GPT-4? - I know I'm asking before it even starts lol

    • @jamesbriggs
      @jamesbriggs  8 months ago +1

      yes it can haha, it's a separate layer, so it works with anything you want it to work with :)

    • @seel1823
      @seel1823 8 months ago +1

      One last question @@jamesbriggs can you use it with RAG?

  • @altered.thought
    @altered.thought 7 months ago

    Are there plans to support JavaScript in the future? 🙃

    • @jamesbriggs
      @jamesbriggs  7 months ago +2

      not planned, but would love to

  • @whiskeycalculus
    @whiskeycalculus 8 months ago

    Accelerate

  • @chillydoog
    @chillydoog 8 months ago

    No offense, but I don't really see the point of this semantic routing technique? I also don't think I fully understand what it is. Why wouldn't you just use a ChatGPT assistant? It seems like it's doing the same thing, but probably better. IDK, feel free to correct me or help me understand what the value of this is.