Very cool channel, being so responsive to the comments. Going to check this out in more detail in the coming days.
Is there any way to run the transcription with diarization locally?
If you have a powerful GPU machine with plenty of memory, you can clone the model to your local machine, follow its README to set it up, and run it with Python's subprocess module instead of calling Replicate. Replicate is basically a cloud API that "lends" you GPU compute, for those who don't have the budget to buy such a machine.
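A rough sketch of the subprocess approach, in case it helps. The function name `run_local_script` and the script and file names in the usage note are made up for illustration; the real entry point and arguments depend on the README of the model you clone.

```python
import subprocess
import sys
from typing import List


def run_local_script(script_path: str, args: List[str], timeout_s: float = 3600.0) -> str:
    """Run a local Python script in a child process and return its stdout.

    script_path and args are placeholders: use whatever entry point and
    arguments the cloned model's README documents.
    """
    result = subprocess.run(
        [sys.executable, script_path, *args],  # same interpreter as the caller
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output as str rather than bytes
        check=True,           # raise CalledProcessError on a non-zero exit code
        timeout=timeout_s,    # stop waiting if the run hangs
    )
    return result.stdout
```

Usage would look something like `run_local_script("transcribe.py", ["meeting.mp3"])`, where both names are hypothetical. If the model lives in its own virtual environment, you would pass that environment's Python executable instead of `sys.executable`.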
Yes, you can find the models used on huggingface.co, including instructions on how to run them locally.
Great video, thanks for the tutorial! I just subscribed to your channel. Great times we live in!
🙌
Thank you for the impressive video. It would be even better if there were an on-premise solution.
This vid is awesome! thanks :) I just subbed
This is great. Did you automate the final version of the meeting notes as well, or did you clean it up yourself? If automated, please show that as well.
This is done in part 2 of the course (see course preview here: ua-cam.com/video/_C-boIci0C8/v-deo.html). If you are interested, please put yourself on the waiting list at ai-for-dev.com
What would you do if the transcript is past the token limit for the LLM?
If the transcript exceeds the token limit for the LLM, I would break it into smaller, manageable chunks and process each one sequentially.
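A minimal sketch of that chunking idea, under a few assumptions: word count stands in for a real token count, consecutive chunks overlap slightly so sentences cut at a boundary are not lost, and `summarize` is a placeholder for whatever LLM call you use. The function names and default sizes are illustrative, not taken from the video.

```python
from typing import Callable, List


def chunk_transcript(text: str, max_words: int = 1500, overlap: int = 100) -> List[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this chunk already reached the end of the transcript
    return chunks


def summarize_long_transcript(
    text: str,
    summarize: Callable[[str], str],  # placeholder for the actual LLM call
    max_words: int = 1500,
    overlap: int = 100,
) -> str:
    """Summarize each chunk in order, then summarize the joined partial summaries."""
    partial = [summarize(chunk) for chunk in chunk_transcript(text, max_words, overlap)]
    if not partial:
        return ""
    if len(partial) == 1:
        return partial[0]
    return summarize("\n".join(partial))
```

Words are only a rough proxy for tokens, so pick `max_words` well below the model's limit or swap in the model's own tokenizer. For very long transcripts the joined partial summaries can themselves exceed the limit, in which case the final step would need to be repeated.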
The presentation was great; however, instead of sending data off to AWS, I would rather process most if not all queries in-house on my own data servers. Privacy is paramount.
I was thinking the same
Absolutely, prioritizing privacy by processing data in-house is a smart move. Leveraging open-source solutions and hosting LLMs on your own servers offers both control and security.