Chapters
00:00:00 - Introduction
00:00:18 - Overview of the Azure OpenAI service
00:01:23 - Applying ChatGPT to enterprise-grade applications on the Azure service
00:02:29 - Retrieval Augmented Generation
00:03:06 - Private Knowledge
00:03:32 - Using ChatGPT in an App
00:04:25 - Asking Questions in the App
00:05:49 - Exposing Details of Conversation Turns
00:06:31 - Injecting fragments of documents
00:06:46 - Different approaches for generating responses
00:08:14 - Adapting style of response
00:09:41 - How Information Protection Works
00:10:02 - Demonstration of Document-Level Granular Access Control
00:11:00 - Adding New Information into Search
00:11:20 - Running Scripts to Add New Information
00:12:04 - Code Behind Sample App
00:12:50 - Overview of ChatGPT
00:13:31 - Using Azure OpenAI Studio Playground
00:14:30 - Building Your Own Enterprise-grade ChatGPT-enabled App
How do you ensure the Cognitive Search results don't exceed the 4,096-token limit for ChatGPT? And if they do exceed it (entirely possible with a large amount of corporate data), how do you chunk the content for ChatGPT?
That problem seems to be partially solved now with the token limit increase.
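Even with larger limits, the chunking question above is usually handled by budgeting tokens before the prompt is built: keep adding retrieved chunks until the remaining budget runs out. A minimal sketch, assuming a crude words-based token estimate (real apps would use the model's tokenizer, such as tiktoken) and with all names illustrative:

```python
def fit_chunks_to_budget(chunks, max_tokens=4096, reserved=1500):
    """Return the prefix of `chunks` that fits in the remaining token budget.

    `reserved` leaves room for the system prompt, the question, and the
    completion itself. The tokens-per-word factor is a rough assumption.
    """
    budget = max_tokens - reserved
    selected, used = [], 0
    for chunk in chunks:
        est = int(len(chunk.split()) * 1.3)  # crude tokens-per-word estimate
        if used + est > budget:
            break  # stop before the prompt would overflow
        selected.append(chunk)
        used += est
    return selected

chunks = ["alpha " * 500, "beta " * 500, "gamma " * 5000]
kept = fit_chunks_to_budget(chunks)
print(len(kept))  # the oversized third chunk is dropped
```

In practice you would also rank chunks by relevance first, so the budget is spent on the best candidates rather than the first ones returned.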
This is pure gold.
Would it not be reasonable to expect that, in the next few years, every major MSFT cloud storage and development tool (Azure SQL, SharePoint, Dataverse, Power Apps, Power BI) will offer this feature automatically?
Only a matter of time.
Finally, a video about this. Cannot wait to dive deeper into this subject.
Glad you liked it. The aha moment for us was that search helps create a giant prompt.
@@MSFTMechanics I was wondering how you use that giant prompt, passing all the history to the Completion API. I thought there is a limit on the number of tokens the Completion API can digest?
Honestly, the more I hear about this tech, the more I want to use it for worldbuilding and lore for games and storytelling because wooooow that seems like a good way to prevent ever making another characterization flub or timeline mistake ever again.
Great application! However, in my experience you would not be able to rely on the current generation of models to avoid flubs or continuity errors. Have a play - see what you find - but I have found that while the response always makes grammatical sense, it doesn't always make logical sense. All it does is estimate a plausible set of words to complete the meaning of your prompt. Most of the time this makes logical sense, but there's nothing forcing it to. So while it would generate a plausible, immersive world which mostly worked, I'm sure every now and again you would still get characters coming back from the dead, or teleporting from one place to another, or whatever...
@@JonTaylor-pp4hl What are the downsides in this process? Which part of the process concerns you? May I ask?
You can go even further. How about AI characters that you can interact with and can interact with each other, and who build the lore themselves
I like how with every release there more enticing improvements. Game changer.
Thank you! Exciting times
I have been playing around with this solution and it's amazing! Nice work from Pablo and the rest of the team! One thing that I still don't get is when we should use Cognitive Search to index the content for later text-based retrieval, versus using embeddings to capture the semantics of each document and storing them in a vector store for later similarity search (with cosine similarity, for example).
I'm interested in this difference as well
Thanks for watching and checking out the demo. It's correct that vector stores can often be useful for this, and it's an active area of research for us. However, from what we have seen, although embeddings (for vector search) are generally quite good at recalling candidate content, they are not necessarily as good at relevancy. The research we have seen suggests that a hybrid approach (vector search combined with traditional linguistic search such as BM25) generally provides the best results. In this demo, you might have noticed that we leverage our semantic search capability, which first uses linguistic search (BM25) to find good candidate content (L1). Then, as a second stage (L2), this content is automatically passed to an ML model (the same family of models that powers Bing.com, by the way) to re-rank the results. Hopefully you will find as you test this demo that it performs quite well. The other advantage is that you do not have to do your own vectorization of content, which can be both time consuming and expensive. That said, as mentioned earlier, we are continuing to research how vectorization can play a part here.
@@MSFTMechanics thank you so much! This is really helpful!
@@RichardsonNascimento I am a complete novice when it comes to these development issues. I am looking for someone to create such a chatbot for my website which uses my own organisation's data / information. Can you maybe help me with this?
@@RichardsonNascimento glad it helped. Thanks for taking the time to comment. 🙂
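The two-stage shape described in that reply (lexical recall first, then re-ranking) can be sketched with toy stand-ins. The scoring functions below are illustrative placeholders only, not Azure Cognitive Search's actual BM25 implementation or semantic ranker; only the L1/L2 structure matches the description above.

```python
import math

def lexical_score(query, doc):
    """Stage 1 (L1): crude term-overlap score standing in for BM25."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d)

def embed(text):
    """Toy letter-frequency 'embedding' standing in for a real model."""
    return [text.lower().count(c) for c in "abcdefghijklmnopqrstuvwxyz"]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_search(query, docs, top_k=3):
    # L1: recall a wider candidate set by lexical overlap
    candidates = sorted(docs, key=lambda d: lexical_score(query, d),
                        reverse=True)[:top_k * 2]
    # L2: re-rank those candidates by embedding similarity
    qv = embed(query)
    return sorted(candidates, key=lambda d: cosine(qv, embed(d)),
                  reverse=True)[:top_k]

docs = ["employee health plan overview",
        "health plan costs and copays",
        "quarterly sales report"]
results = hybrid_search("health plan", docs, top_k=2)
print(results)
```

The point of the sketch is the pipeline shape: a cheap, broad first stage feeds a more expensive second stage that only sees a handful of candidates.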
Could you share the script to update the data and explain how it works?
Do you have an automatic way to update the application, e.g. by running azd deploy, or something else?
We are trying to build a similar solution to enable conversational Q&A, but using Elasticsearch for indexing with embeddings.
1. How do you decide on the chunk size before indexing?
2. How different would the retrieved chunks based on cosine similarity be when compared with cognitive search?
Ask ChatGPT
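More seriously, on question 1: a common starting point is fixed-size chunks with some overlap, so sentences cut at a boundary still appear intact in the neighboring chunk. A hedged sketch, with word counts standing in for token counts and all parameters illustrative:

```python
def chunk_text(text, chunk_size=200, overlap=40):
    """Split text into word-based chunks of `chunk_size` words with
    `overlap` words shared between consecutive chunks.

    Counting words keeps the sketch dependency-free; token-based counts
    (via the model's tokenizer) are the more common choice in practice.
    """
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last chunk reached the end of the document
    return chunks

doc = ("word " * 500).strip()
pieces = chunk_text(doc)
print(len(pieces))
```

Tuning chunk size is mostly empirical: smaller chunks give more precise retrieval hits, larger chunks give the model more surrounding context per hit.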
OMG, I've been looking for this information for three weeks. Thank you. I saw it before on another channel, but it was very confusing compared to how this video explained things.
Glad this helped
So did you implement this solution?
@@RajaSekharaReddyKaluri Not yet. I am going to propose a POC for my company to use this to help newly onboarded developers code to our standards.
Thank you! Really well explained. I'm eager to get started with some prototyping. Will do that soon!
It looks really interesting. What about using it to gather insights from structured data? Say, for a set of headlines: what is the top-performing headline (based on summary data), what is the CTA, and how far above or below a benchmark is it? Basically, gathering insights, through a guided process, from structured data?
Hi David, I am looking for insights from structured data as well. Let me know if you already figured it out. Thanks!
How do we connect this to Sharepoint?
Very impressive. Looking forward to seeing how it could work with technical documents.
Glad you see the potential. Thanks for taking the time to comment
Thank you for this video; it shows an understanding of companies' fears of data loss. I have to test it now :-)
Thanks for taking the time to comment and glad you liked it.
What do you mean by data loss?
Question. If we have all our data on remote servers or through AWS, would it be difficult to use those data sources?
Sorry, I didn't get it. Do I understand correctly that if we want to keep our data private, we need to keep it separate from the model, and only add pieces of information during the response generation process? If we start to teach the model our data, will it become public?
Great questions. Short answer is no. The search is retrieving additional information to add to the prompt. Information from prompts is not stored in the large language model. Also, there are multiple instances of the model running, and the ones used for the Azure OpenAI Service are not public instances.
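That separation is the essence of the pattern: private content stays in the search index and is only pasted into the prompt per request, never into the model's weights. A minimal sketch, with `search` as a stand-in for the real retrieval call and all names assumed for illustration:

```python
# Private data is injected at request time; nothing here trains the model.
SYSTEM = (
    "Answer ONLY using the sources below. "
    "Cite the source name for each fact. "
    "If the answer is not in the sources, say you don't know."
)

def build_prompt(question, sources):
    """Assemble a grounded prompt from retrieved (name, text) pairs."""
    source_block = "\n".join(f"[{name}]: {text}" for name, text in sources)
    return f"{SYSTEM}\n\nSources:\n{source_block}\n\nQuestion: {question}"

def search(question):
    # Stand-in for the retrieval step (e.g. a search index query).
    return [("benefits.pdf", "The health plan covers annual eye exams.")]

prompt = build_prompt("Does the plan cover eye exams?", search("eye exams"))
print(prompt)
```

The prompt is then sent to the completion endpoint; when the session ends, the private content exists only in the index, exactly as before the request.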
Amazing! I love it; there are so many nuances in this - it's not just simple retrieve-and-generate. Sudden curiosity: if the LLM has so much reasoning and language understanding, why can't we ask it directly to rank the documents and filter out the unnecessary ones via an in-context learning prompt? Why do we need a separate re-ranker component?
Can I get it to point to a sql table?
They have a video for that using Azure SQL
Please help! What advice do you have for where to store the data used to tune the GPT? We have a complex data set in Azure Storage tables and are wondering about the best database. Is it Azure SQL? Access? Azure Blob Storage? Something else?
Thanks for this great video, really exciting! I have one question: Are prompts (and thus company information) processed exclusively in Azure OpenAI Service, and NOT through OpenAI's API?
Yes, it's a separate instance of the LLM.
@@MSFTMechanics Great, thank you!
Stunning 🤩
I have been waiting for this, This will be a game changer!
It will be. Retrieval Augmented Generation with search is a big deal for generating informed responses.
I want to integrate this with Microsoft teams across my enterprise. Is this able to be done and if so what is recommended? PVAT?
Great presentation! Now that this information is public knowledge, I need to come up with something more creative when communicating with clients who are interested in LLMs : )
That's why we do what we do on Mechanics. Thank you!
What is the power consumption of these services?
Fantastic! I'm using it for a legal database and research bot!
How does ChatGPT take care of data security? For example, how does ChatGPT or Cognitive Search restrict documents/content that a user does not have access to?
Very interesting! Would it also be possible to integrate GPT with SAP or MS Dynamics? I am an SAP FI consultant handling incidents and changes submitted by the finance departments. Would it be possible to make a private model in which GPT can read through the SAP system and give instructions on how to solve certain incidents? For example, if a user gets a certain error when performing a payment run, would GPT be able to analyse where in the system this error is coming from and how to solve it? Not just giving recommendations as it does now, when anonymizing the data and submitting it in the public GPT environment. And of course, all without sharing any information with the outside world.
Can you share a link to learn more, through videos like these, about the features offered by Azure AI Studio?
When will Azure OpenAI Service be available to the general public to experiment with? Currently you need to submit a request form and be approved.
Can we integrate the Azure Bot Framework? And what if I want to add an action to promote results? Also, is there any way we can do content moderation?
Hi, do you have details on the type of RBAC role required to deploy the demo? I am getting 'the client does not have the necessary permissions to perform the specified action', and I have Cognitive Services Contributor access.
Not sure why it's all based on PDFs... regular companies have data in SQL. Why not show some examples there, e.g. how to query relational data?
You can also query relational data. We only demonstrated documents because that was the primary form of the data in the open-source sample app.
Can anyone help me with a robust strategy for handling dependent and independent questions during a conversation, including generating a standalone question to provide additional context for dependent questions?
Is the strategy used here, augmenting the user's latest question with prior conversation history, robust for all kinds of scenarios?
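One common approach to dependent questions is to first ask the model itself to rewrite the latest turn as a standalone query, then use that rewrite for retrieval. A sketch of just the prompt-construction step, with all template wording assumed for illustration (the actual completion call is omitted):

```python
# Turn the chat history plus a possibly dependent question ("Does it
# include dental?") into a rewrite prompt; the model's answer would then
# serve as the search query.
REWRITE_TEMPLATE = (
    "Given the conversation below, rewrite the last user question as a "
    "single standalone search query. Resolve pronouns and references.\n\n"
    "{history}\nUser: {question}\n\nStandalone query:"
)

def make_rewrite_prompt(history, question):
    """history is a list of (role, text) pairs in chronological order."""
    lines = "\n".join(f"{role}: {text}" for role, text in history)
    return REWRITE_TEMPLATE.format(history=lines, question=question)

history = [
    ("User", "What does the health plan cover?"),
    ("Assistant", "It covers doctor visits, hospital stays, and eye exams."),
]
rewrite_prompt = make_rewrite_prompt(history, "Does it include dental?")
print(rewrite_prompt)
```

This is exactly the kind of step where robustness varies: if the rewrite model mis-resolves a pronoun, retrieval fetches the wrong documents, so the rewrite output is worth logging and inspecting.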
I'm assuming this works with a code repository hosted on Azure as well right?
Is there a C# port of this project?
Can this use data from SharePoint or does it need to be stored in Azure storage?
Are source citations accurate? Because in Bing they are often wrong: when you click on a citation, you realize the referenced website is not the actual source of the information provided.
Can we train Azure GPT on our Confluence wiki?
I have the same question.
Hi! I've been trying to recreate this project on my machine and I'm getting an error I don't quite understand. I've found a workaround, but I feel like it is reducing the performance of the assistant. I'm using an Azure OpenAI service based on gpt-35-turbo, and when I try to ask a question using RRR or RDA I get an exception saying that gpt-35-turbo does not support the parameters "logprobs, best_of and echo". I've disabled them to make the project work, but as I said, it feels like the quality of the responses has diminished.
Did anybody else encounter this problem?
Hi, I have not. Actually, I'm using only the chat version; I've removed the ask feature.
How do you use the same AI response in MS Teams?
How do we make sure that Microsoft is not taking our own proprietary code/data via Cognitive Search?
What is the chat UI? Something custom they made or is it available for enterprise customers?
This is the sample app available on GitHub at aka.ms/EntGPTSearch
Are there limitations in Azure Cognitive Search regarding language? Is German worse than English? Thanks!
Where someone has a long session, how does the Azure OpenAI service deal with token limits, given it has to provide the whole context, especially where previous responses are long?
It's a different API call in every turn.
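Since each turn is a fresh API call, a common mitigation is a sliding window over the history: keep the newest turns and drop the oldest once an approximate budget is exceeded. A hedged sketch with a crude words-based token estimate (a real implementation would count with the model's tokenizer, and might summarize dropped turns instead of discarding them):

```python
def trim_history(turns, budget=3000):
    """Keep the most recent (role, text) turns whose combined estimated
    token count fits within `budget`, preserving chronological order."""
    kept, used = [], 0
    for role, text in reversed(turns):  # walk newest-first
        est = int(len(text.split()) * 1.3)  # rough tokens-per-word guess
        if used + est > budget:
            break  # everything older than this is dropped
        kept.append((role, text))
        used += est
    return list(reversed(kept))

turns = [
    ("user", "old " * 3000),          # huge early turn
    ("assistant", "reply " * 100),
    ("user", "new question"),
]
window = trim_history(turns)
print(len(window))  # the oversized oldest turn is gone
```

The trade-off is that dropped turns are simply forgotten, which is why some apps summarize the truncated history into a single synthetic turn instead.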
Is our data kept private, or is it shared with OpenAI or used for research?
We cover that in the video. Your data is not used for training the large language model, it's only part of the prompt for inference as demonstrated in the example.
@@MSFTMechanics Noted. Thanks for responding!
So if that's the case, can an organization be HIPAA-compliant (with respect to not exposing PII and PHI)? I want to make sure that the in-context learning (or 'RAG') paradigm doesn't expose our customers' data to OpenAI / Azure OpenAI or anyone else. That's probably the biggest blocker to implementing any production-grade app for our team. Thanks in advance for a thorough answer.
The question I would have is this one: is the 'private data' actually 'protected'? In a chat, ChatGPT said that I should not share private information with it, because it cannot guarantee that the data 'will not be used / made public or something'.
This is a separate instance running in the Azure OpenAI Service and designed to maintain privacy.
Impressive; the one I was looking for for a long while.
Can anyone suggest which language model in Azure OpenAI I can use to compare two PDF documents, to check whether the information is available in both documents or not?
Thanks for sharing these invaluable tips and the source code. Any experiment results on which approach (e.g. read-decompose-ask) to use where?
So this is LangChain + the OpenAI API? Nice
How is the solution implemented so that information is protected at the user level?
You would instrument the same access controls and permissions as you would now for implementing Azure Cognitive Search. We demonstrate that in the video. The information used to augment the prompt is retrieved based on the individual's permissions.
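The document-level trimming described above can be illustrated with an in-memory toy: each indexed document carries the group IDs allowed to see it, and retrieval filters on the caller's groups before anything reaches the prompt. Field names and data here are assumptions for the sketch; Azure Cognitive Search would express this as a filter on the query rather than a Python loop.

```python
# Toy "index" where every document records which groups may read it.
INDEX = [
    {"id": "doc1", "content": "Benefits overview", "groups": {"hr", "all-staff"}},
    {"id": "doc2", "content": "Executive comp plan", "groups": {"execs"}},
    {"id": "doc3", "content": "Travel policy", "groups": {"all-staff"}},
]

def secure_search(user_groups, keyword):
    """Return only matching documents the user is allowed to read."""
    return [
        d for d in INDEX
        if d["groups"] & user_groups               # security trimming
        and keyword.lower() in d["content"].lower()  # toy relevance match
    ]

visible = secure_search({"all-staff"}, "policy")
print([d["id"] for d in visible])  # execs-only content never surfaces
```

The key property is that trimming happens at retrieval time, so a user's prompt can never be augmented with content their permissions would not let them open directly.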
It looks great!! Thanks.
Is there a video explaining the coding step by step, from start to end? It would be great for those of us who are starting out in AI and Azure.
It's very good for OpenAI and Cognitive Search.
Super!
I'm so lost on how to tailor this code to my needs. Is anyone here a software developer who can help me use this code? Thank you.
I can help
What a shame that individuals cannot use the Azure OpenAI services.
You can sign up for it as an individual developer, but you do need an Azure subscription. For a "free to use" option, you can also try Bing Chat if you're looking for alternatives to OpenAI chat.
How do you update data after adding or removing docs?
We show the manual, on-demand process for updating the search index at 11:27, but normally these types of updates would run on a schedule or based on eventing logic.
I got here from langchain and custom embedding for openai
Looks like this is just a LangChain + OpenAI API wrapper.
😍😍😍😍😍😍😍😍😍😍😍
I tried it and it's not working.
It does not work on Mac.
It will work on a Mac; it is a web service.
Is this app using the OpenAI ChatGPT app, or is it using the Azure OpenAI GPT model API?
Why call every LLM ChatGPT? Do you call every car a Ford Model T?
This is using GPT-3.5 Turbo, with GPT-4 support coming soon. It is a separate instance of the LLM, which runs in the Azure OpenAI Service.
I don’t trust anything with the word “Microsoft” in it, but I still end up using their software.
I think I work with you 😂
ChatGPT makes some math mistakes; not all the answers are correct!