Deploying Hugging Face models with Amazon SageMaker and AWS Inferentia2
- Published Oct 3, 2024
- In this video, I walk you through the simple process of deploying a Hugging Face large language model on AWS, with Amazon SageMaker and the AWS Inferentia2 accelerator.
⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos. Follow me on Medium at / julsimon or Substack at julsimon.subst.... ⭐️⭐️⭐️
Notebook:
gitlab.com/jul...
Deep Dive: Hugging Face models on AWS AI Accelerators
• Deep Dive: Hugging Fac...
Blog posts:
huggingface.co...
aws.amazon.com...
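As a rough sketch of the deployment flow shown in the video: with the SageMaker Python SDK, you select the Hugging Face Neuron (TGI) container, pass the model and Neuron settings as environment variables, and deploy to an `ml.inf2.*` instance. The model ID, core count, and sequence length below are illustrative assumptions, not values from the video; check the current Hugging Face / AWS docs for your model.

```python
# Sketch (hedged): deploy a Hugging Face LLM on a SageMaker Inferentia2 endpoint.
# Assumes the sagemaker>=2.x SDK, an AWS execution role, and the
# "huggingface-neuronx" container; all values below are example assumptions.
import os

# Environment for the Neuron-backed container. Batch size, sequence length,
# and core count must match how the model was compiled for Neuron.
hub_config = {
    "HF_MODEL_ID": "meta-llama/Llama-2-7b-chat-hf",  # example model, not from the video
    "HF_NUM_CORES": "2",               # inf2.xlarge exposes 2 Neuron cores
    "HF_BATCH_SIZE": "1",
    "HF_SEQUENCE_LENGTH": "4096",
    "HF_AUTO_CAST_TYPE": "fp16",
}

# Guarded so the script only touches AWS when explicitly asked to.
if __name__ == "__main__" and os.environ.get("RUN_DEPLOY"):
    import sagemaker
    from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

    role = sagemaker.get_execution_role()
    image_uri = get_huggingface_llm_image_uri("huggingface-neuronx")

    model = HuggingFaceModel(env=hub_config, role=role, image_uri=image_uri)
    predictor = model.deploy(
        initial_instance_count=1,
        instance_type="ml.inf2.xlarge",
        volume_size=64,
    )
    print(predictor.predict({"inputs": "Hello"}))
```

The deploy call blocks until the endpoint is in service; remember to delete the endpoint afterwards to stop billing.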
Great video!
Glad you enjoyed it
Great
Thank you
Great video Julien, thank you! Does the model have to be pre-compiled to run on AWS (EC2 or SageMaker)?
Thank you. If you're going to deploy on SageMaker, yes. At the moment, our container won't compile the model on the fly. On EC2, the model will be compiled on the fly if needed.
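For the pre-compilation step mentioned above, one common route is the `optimum-neuron` export CLI, which compiles a Hugging Face checkpoint for Neuron ahead of time. This is a hedged sketch: the model ID and flag values are illustrative assumptions, and exact flags can vary between `optimum-neuron` versions.

```shell
# Sketch (hedged): pre-compile a model for Inferentia2 with optimum-neuron,
# then upload the output directory for SageMaker deployment.
# Model ID and settings are example assumptions; verify against current docs.
optimum-cli export neuron \
  --model meta-llama/Llama-2-7b-chat-hf \
  --batch_size 1 \
  --sequence_length 4096 \
  --num_cores 2 \
  --auto_cast_type fp16 \
  llama2-7b-neuron/
```

Compilation settings (batch size, sequence length, cores) are baked into the artifact, so they must match the values you later configure on the endpoint.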
Hi Julien! Do you have any tips on how I can convert a ComfyUI SD1.5-based workflow model to 🤗, or run it directly on Inf2?
I am going to use inf2 to run a fine-tuned Llama 3 70B, which should be great. I am curious about token generation speed on the different inf2 sizes; if you can, mention that as a side note in your next video, e.g. "this generated at x tokens/s".
You'll find benchmarks in the Neuron SDK documentation: awsdocs-neuron.readthedocs-hosted.com/en/latest/general/benchmarks/index.html
Great video! However, when I try to deploy Llama 2 7B on an inf2.xlarge instance, I get an out-of-memory error. I have seen posts about people deploying Llama 2 7B on an inf2.xlarge instance. How can this be?
Please post details and logs at discuss.huggingface.co/c/aws-inferentia-trainium/66