Deploy LLMs (Large Language Models) on AWS SageMaker using DLC
- Published 30 Jun 2023
- In this comprehensive video tutorial, I will show you how to effortlessly deploy large language models (LLMs) on AWS SageMaker using Deep Learning Containers (DLCs). With the ability to deploy models like Falcon 7B and MPT-7B, you can quickly configure endpoints and create the necessary infrastructure for seamless deployment.
I will guide you through the entire process, starting from the initial setup of SageMaker to the configuration of LLM model endpoints. You'll learn how to leverage the power of AWS Lambda to trigger events and generate responses using a function URL. This integration enables you to seamlessly incorporate LLM models into your applications and services.
By following this step-by-step guide, you'll gain the confidence and knowledge to deploy LLMs on AWS SageMaker with ease. Subscribe now to embark on your journey towards harnessing the full potential of large language models for your projects. Join me in this video as we explore the fascinating world of LLM deployment on AWS SageMaker using DLC.
AWS Sagemaker: aws.amazon.com/sagemaker/
Falcon 40B Huggingface: huggingface.co/tiiuae/falcon-40b
AI Anytime's Github: github.com/AIAnytime
#sagemaker #aws #ai - Science & Technology
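The deployment flow described above can be sketched with the SageMaker Python SDK. This is a minimal sketch assuming the Hugging Face TGI DLC; the model ID, instance type, and environment values are illustrative choices, not taken from the video:

```python
# Minimal sketch: deploy an LLM on SageMaker via a Hugging Face Deep Learning
# Container (DLC). Model ID, instance type, and token limits are assumptions.

def build_tgi_env(model_id, num_gpus=1, max_input_length=1024, max_total_tokens=2048):
    """Environment variables consumed by the TGI serving container (all strings)."""
    return {
        "HF_MODEL_ID": model_id,
        "SM_NUM_GPUS": str(num_gpus),
        "MAX_INPUT_LENGTH": str(max_input_length),
        "MAX_TOTAL_TOKENS": str(max_total_tokens),
    }

def deploy(model_id="tiiuae/falcon-7b", instance_type="ml.g5.2xlarge"):
    # Imported lazily so the sketch can be read without the SageMaker SDK installed.
    import sagemaker
    from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

    role = sagemaker.get_execution_role()  # works inside a SageMaker notebook
    model = HuggingFaceModel(
        role=role,
        image_uri=get_huggingface_llm_image_uri("huggingface"),  # the TGI DLC image
        env=build_tgi_env(model_id),
    )
    # Returns a Predictor bound to the new real-time endpoint.
    return model.deploy(initial_instance_count=1, instance_type=instance_type)
```

`deploy()` returns a predictor; `predictor.predict({"inputs": "..."})` sends a prompt to the endpoint. Remember to delete the endpoint afterwards to stop billing.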
Great video, brother. I was looking exactly for this stuff and luckily landed on your channel. Keep up the good work!
Really love to see this. I will definitely follow this video and get this done today!
Thanks Vivek for your kind words.....
Great video, sir. I was looking for something exactly like this for my client POC and fortunately landed on your videos. Thanks a ton for your efforts.
Glad to hear that... keep learning and growing.
Can anyone please say how much money it will cost me to do all this, or is it free?
Great video Sonu. Thanks for sharing 🙏
My pleasure 😊
I learned a lot here, thanks a lot! 🙌
Glad it was helpful!
Great content and excellent tutorial! thank you
Glad it was helpful!
Excellent tutorial. Could you please give a tutorial how to feed a document and extract the answers from it and then deployment it. Thank you in advance❤
Excellent tutorial
Thank you! Cheers!
Thanks a lot brother! Means a lot
No problem
Great video ❤
Glad you liked it!!
Thanks for sharing, Sonu!
My pleasure!
Thanks, got to know how to increase the output length using hyperparameters.
Glad it helped
Great tutorial, thanks. The only bit I got lost with was creating the policy to let Lambda call the SageMaker endpoint. GPT-4 helped :)
Glad to hear that ..... Thanks!
@@AIAnytime It would have been better if you had shown how to create that policy.
Thank you!
Thank you so much for the support.
just great .....
Thank you!
A very nice video.
A small suggestion - Maybe after the video, you should also show how to terminate all the 3 things - notebook, model and endpoint, so that people don't incur a lot of cost.
Keep up the good work!
Noted
If I do all of the things shown in the video, will it cost me for all 24 hours? If yes, how can I save cost by only triggering it when sending a request and then terminating it?
@@sravantipris3544 If you use the initial free credits, it will not cost you any money. However, make sure to disable all the services immediately, else it could go to USD 600+.
That's a great video, thanks!
You're welcome!
@@AIAnytime Can you deploy a Hugging Face model on Azure, please? 🙏
Very soon. I am working on it.
Was the API Gateway used for anything? Thank you for the video again! Very useful!
Great video. I followed along until 48:46. Please go into depth on the policies error and how you fixed it. I have no experience with AWS and got the same error, but you skipped over why the error occurred and detailed instructions on how to solve it.
Very informative video... I have a query: there will be some provision to disable the notebook and endpoint when not in use, right?
Yes, correct, Ashwani. You can control it completely: stop the endpoint, delete the endpoint, etc. You can also set limits on budget.
Very instructive video. I would like to know if it is possible to upload the model directly to AWS without going through HF. Thank you in advance.
Absolutely, you can do that, but it will be a more manual deployment: you push the model weights to S3 and deploy via SageMaker using a script. The easier way is to deploy through the DLC images.
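That manual route can be sketched roughly as follows; the bucket name, archive path, and framework versions are assumptions for illustration only:

```python
# Sketch of the manual route: push packaged model weights to S3, then point a
# HuggingFaceModel at the archive instead of pulling from the Hub.
# Bucket name, key, and framework versions below are illustrative assumptions.

def s3_model_uri(bucket, key="models/falcon-7b/model.tar.gz"):
    """Build the S3 URI SageMaker expects for model_data."""
    return f"s3://{bucket}/{key}"

def deploy_from_s3(bucket, role, instance_type="ml.g5.2xlarge"):
    # Imported lazily so the sketch can be read without the AWS SDKs installed.
    import boto3
    from sagemaker.huggingface import HuggingFaceModel

    # Upload the locally packaged weights (a model.tar.gz built from the model dir).
    boto3.client("s3").upload_file(
        "model.tar.gz", bucket, "models/falcon-7b/model.tar.gz"
    )

    model = HuggingFaceModel(
        model_data=s3_model_uri(bucket),
        role=role,
        transformers_version="4.28",  # assumption: versions matching an existing DLC
        pytorch_version="2.0",
        py_version="py310",
    )
    return model.deploy(initial_instance_count=1, instance_type=instance_type)
```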
I think I almost missed where you referred to "AWS SageMaker DLCs"; maybe emphasize the DLCs more.
How are you querying the model in Jupyter Lab before you ever deploy the model? I am confused by that. Amazing video, just want some clarification if possible. In addition, why is the instance type configured in the deployment code different from the T5.2xlarge you configured in SageMaker?
He first configured the notebook instance that was used to run the Jupyter notebook code; later in the video he configures the predictor (i.e., the inference endpoint) that will host the model and can be called from AWS Lambda.
I fine-tuned TinyLlama on my own dataset. Can I deploy my fine-tuned model with the steps you mentioned in this video?
Absolutely....
How does it compare with hosting on a cheaper cloud provider or GPU such as Lambda Labs?
Depends. AWS is the primary cloud provider: if you work in IT you will probably work with AWS, Azure, or GCP. AWS provides different ways of deploying these models, like one-click deployment using DLCs, and the hourly pay-as-you-go rate is quite affordable. But yes, you have to select your options based on many things: data protection, privacy, governance, scaling, etc.
how much is the monthly cost of keeping the service up?
It could go up to $500 or more per month. You need to terminate the endpoint if you don't want to incur this cost.
An error shows when choosing the 70B model (jumpstart-dft-meta-textgenerationneuron-llama-2-70b-f). How do I fix it? The error says:
Something went wrong
We encountered an error while preparing to deploy your endpoint. You can get more details below.
operation deployAsync failed: handler error
In the video I have 2 doubts:
1) At 48:48 you created some IAM policies like AWSLambdaBasicExecutionRole-30e....... and AmazonSageMaker-ExecutionPolicy... How did you do that?
2) At 46:40, what is that "path" : "\example"? Can you please explain?
1. You can create policies in IAM. Search for IAM in the search box, open IAM, look for Policies on the left-hand side, go inside, and add policies.
2. The path is typically related to the URL path of the incoming HTTP request, specifically when working with API Gateway. Mainly, you would configure API Gateway after the Lambda function; that's why it is there, but you can ignore it. You can just define queryStringParameters, e.g. param1=query or something, depending on how you write your Lambda code.
@@AIAnytime thankyou!!
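A minimal sketch of such a Lambda handler, reading `query` from `queryStringParameters` and forwarding it to the endpoint; the endpoint name here is a placeholder assumption, not the one from the video:

```python
# Sketch of a Lambda handler that reads `query` from queryStringParameters and
# forwards it to a SageMaker endpoint. The endpoint name is a placeholder.
import json

ENDPOINT_NAME = "huggingface-pytorch-tgi-inference-example"  # hypothetical

def extract_query(event):
    # Function-URL and console test events may omit queryStringParameters
    # entirely, so use .get() instead of indexing to avoid a KeyError.
    params = event.get("queryStringParameters") or {}
    return params.get("query", "")

def lambda_handler(event, context):
    import boto3  # available by default in the Lambda Python runtime
    prompt = extract_query(event)
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=json.dumps({"inputs": prompt}),
    )
    result = json.loads(response["Body"].read())
    return {"statusCode": 200, "body": json.dumps(result)}
```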
I am stuck on policy creation. Can anybody help or share a guide on how to create that policy?
Hi! I am trying to deploy Llama 2 in SageMaker. Not sure how to use the HF tokens. The endpoint is failing, saying that the repo is gated.
Maybe you have to log in to the Hugging Face Hub using your access token. Just do a login from a notebook cell in SageMaker; then you can deploy. FYI, the DLC for official Llama 2 is still not available for deployment. You can deploy manually or from JumpStart.
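For gated repos, one common alternative to an interactive login is passing the access token to the serving container through its environment. A sketch, where the model ID and the token-sourcing logic are assumptions:

```python
# Sketch: for a gated repo such as the official Llama 2 weights, the TGI
# container needs a Hugging Face access token passed as an environment variable.
import os

def gated_model_env(model_id, token=None):
    """Build the env dict for a HuggingFaceModel serving a gated repo."""
    token = token or os.environ.get("HF_TOKEN", "")  # assumed token source
    return {
        "HF_MODEL_ID": model_id,
        "HUGGING_FACE_HUB_TOKEN": token,  # lets the container pull gated weights
        "SM_NUM_GPUS": "1",
    }
```

This dict would be passed as `env=` when constructing the `HuggingFaceModel`, the same way as any other TGI configuration.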
How do I fix the permissions error problem?
Go to IAM > Access management > Policies, create a new policy granting access for S3, Lambda, and SageMaker, and save. Then link the policy to your SageMaker project: under Access management > Policies, select the policy you created, go to the entities-attached tab, attach it to your SageMaker role, and save. Problem fixed.
Great. Thanks for the detailed steps.
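As a hedged illustration of the kind of statement such a policy needs, a minimal IAM policy document that only allows invoking SageMaker endpoints might look like this (scope the Resource to your own endpoint ARN, and add S3/Lambda statements as the steps above describe):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["sagemaker:InvokeEndpoint"],
      "Resource": "arn:aws:sagemaker:*:*:endpoint/*"
    }
  ]
}
```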
Hi, getting the below error when creating an endpoint. Can anyone help, please? Error message: "UnexpectedStatusException: Error hosting endpoint huggingface-pytorch-tgi-inference-2023-09-04-16-49-09-918: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint."
Can you check if you are using the right model? Do you need to authenticate with the Hugging Face model repo? Please look at the logs in CloudWatch in the AWS console.
checkpoint = "MBZUAI/LaMini-T5-738M"
How can you fine-tune this model with your own data?
1. Prepare the data in Alpaca format. 2. Spin up a machine like g5.2xlarge or above. 3. Fine-tune using PEFT and QLoRA.
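Steps 1 and 3 above can be sketched as follows; the prompt template is the standard Alpaca layout, and the LoRA hyperparameters are typical values, not ones confirmed in the video:

```python
# Sketch of the Alpaca data prep (step 1) and a typical QLoRA adapter config
# (step 3). Field names and hyperparameters are illustrative assumptions.

_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n### Instruction:\n{instruction}\n\n### Input:\n{input}"
    "\n\n### Response:\n{output}"
)
_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n### Instruction:\n{instruction}"
    "\n\n### Response:\n{output}"
)

def format_alpaca_record(instruction, output, input_text=""):
    """Render one training record in the Alpaca prompt format."""
    if input_text:
        return _WITH_INPUT.format(
            instruction=instruction, input=input_text, output=output
        )
    return _NO_INPUT.format(instruction=instruction, output=output)

def build_lora_config():
    """A typical QLoRA adapter config (requires the peft library)."""
    from peft import LoraConfig  # lazy import; not needed to read the sketch
    return LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")
```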
Hi, thanks for the valuable video. My doubts are:
1. How did you handle the error on Lambda related to the IAM policy? Is it specifically for accessing SageMaker endpoints here?
2. For getting the API response, do we not need any Flask or FastAPI implementation?
Can you guide me on this? Waiting for your responses and videos.
Hi Venkatesan, you have to attach the policies in IAM for Lambda, S3, SageMaker, etc. For getting an API response, you can deploy a microservice as well. I created a function URL from Lambda that I can use in any of my apps through a backend like FastAPI, Flask, Streamlit, etc.
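Calling such a function URL from a Python backend can be sketched with the standard library alone; the URL below is a placeholder, and the `query` parameter name assumes the Lambda reads `queryStringParameters['query']`:

```python
# Sketch of calling the Lambda function URL from any Python backend
# (FastAPI, Flask, Streamlit, ...). The URL is a placeholder assumption.
import json
import urllib.parse
import urllib.request

FUNCTION_URL = "https://example-id.lambda-url.us-east-1.on.aws/"  # placeholder

def build_request_url(base_url, query):
    """Append the prompt as a URL-encoded `query` string parameter."""
    return base_url + "?" + urllib.parse.urlencode({"query": query})

def ask_llm(query):
    """Send the prompt to the function URL and decode the JSON response."""
    with urllib.request.urlopen(build_request_url(FUNCTION_URL, query)) as resp:
        return json.loads(resp.read())
```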
@@AIAnytime Thanks for the kind response. Is Lambda mandatory here, or can I use the inference endpoint from AWS SageMaker directly in FastAPI?
I am getting error here-
model = llm_pipeline()
generated_text = model(input_prompt)
print(generated_text)
ValueError: The following `model_kwargs` are not used by the model: ['return_full_text'] (note: typos in the generate arguments will also show up in this list)
I am also getting an error here.
where is the code?
Bro I am not able to add the policy, can you help?
Why are you not able to attach policies? Can you open an issue on GitHub repo of this video and put some screenshots so I can help you debug?
How to fix "Internal Server Error" ?
Can you paste the complete error trace?
How about on Google Cloud
Very soon.....
Where is the code repo on your GitHub?
This should be
Is this serverless?
Yes it is.....
Unable to understand clearly because of the video quality; please provide a higher-quality video.
Sure... Thanks for the feedback
When will you begin to look like your Avatar photo? 😝
Haha... I usually keep like that. Let's c 🔜
Hi, thanks for the video, it teaches a lot. I just want to know, what is the ideal notebook instance to load and deploy the StarCoder 15B model? At first, I tried with an ml.g4dn.xlarge instance but got an "out of memory" error.
I've got this error. How do I solve it?
Test Event Name
generateTestResponse
Response
{
    "errorMessage": "'queryStringParameters'",
    "errorType": "KeyError",
    "requestId": "4196dde2-b2e7-4863-afa7-f2a67129021b",
    "stackTrace": [
        "  File \"/var/task/lambda_function.py\", line 10, in lambda_handler\n    query_params = event['queryStringParameters']\n"
    ]
}
I'm getting the same error, did you find anything? @AIAnytime can you please check?
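A likely cause, offered as a guess: the default Lambda console test event has no `queryStringParameters` key, so indexing the event dict raises a KeyError. A defensive lookup avoids it:

```python
# Defensive fix for the KeyError above: console test events (and some
# invocations) carry no queryStringParameters key, so don't index directly.
def get_query_params(event):
    """Return the query-string parameters, or an empty dict if absent/None."""
    return event.get("queryStringParameters") or {}
```

Using `get_query_params(event)` in place of `event['queryStringParameters']` lets the handler run with any test event, and missing parameters can then be handled explicitly.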