I purchased full access to your repo because I love and want to support the work you are doing. Some of the clearest and most articulate explanations of embeddings, fine-tuning, supervised vs. unsupervised methods, and data prep. Keep it up!
Appreciate that! Many thanks
Great video! How are you chunking the videos: by paragraph, sentence, word, char, etc.? Are you using any overlap in the chunks? Have you tested your system with a smaller Llama 2 model? What type of results would one get from maybe a Llama 2 13B, or even a 7B that could possibly be run from home?
Howdy!
Here, I chunk into 500- or 750-token chunks. If the chunks are too small, the cropped sentence at the end of each chunk has too much effect and you get hallucination. If the chunks are too big, you get too many questions per chunk (and LLMs often aren't able to respond consistently with very long lists of questions).
Check out my supervised fine-tuning video, which is done on 13B. With enough data, you can get to reasonable quality. 7B is tough unless you have a lot of data (or are fine-tuning for structured responses).
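A minimal sketch of fixed-size chunking as described above, in Python. This is not the code from the repo: whitespace-separated words stand in for model tokens (a real pipeline would count tokens with the model's tokenizer), and the `overlap` parameter is a hypothetical extra, since the reply does not say whether overlap is used.

```python
def chunk_tokens(tokens, chunk_size=500, overlap=0):
    """Split a list of tokens into chunks of at most chunk_size tokens.

    overlap is an assumed, optional number of tokens shared between
    consecutive chunks; 0 gives plain back-to-back chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks


def chunk_text(text, chunk_size=500, overlap=0):
    """Chunk raw text, using whitespace words as a stand-in for tokens."""
    words = text.split()
    return [" ".join(chunk) for chunk in chunk_tokens(words, chunk_size, overlap)]
```

For example, `chunk_text(transcript, chunk_size=750)` would give the larger of the two chunk sizes mentioned, where `transcript` is whatever text you are preparing.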
@TrelisResearch Thanks for the reply. I watched the whole series after I posted this. Very good series! :) What are your thoughts on using a 7B model just for the Q&A creation, and then fine-tuning the larger 70B model on that? Is there any benefit to using such a large model for the Q&A creation step?
@unshadowlabs yeah, I think you need to use a big model for Q&A because you don't want hallucination in the Q&A set. Data quality is crucial and 7B hallucinates too much.
Is there a way to do the training locally on NVLink-paired RTX 4090 GPUs, starting from raw data (multimodal PDFs), for a LLaVA 13B?
Yes! But it's probably better to use Qwen VL 7B. It’s more powerful.
@TrelisResearch Even for dealing with multimodal PDFs in high volume?
@TrelisResearch Why do you think it will be more powerful?
Hi, thanks!! A question about a model for which I have more than 2,000 PDFs. Do you recommend improving the handling of vector databases? When do you recommend fine-tuning and when do you recommend a vector database?
Start with a vector database unless a) you need low latency and short prompts, or b) you want to do structured generation. Fine-tuning may give a small boost, but embeddings will be best.
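To illustrate what "start with a vector database" means in practice, here is a toy retrieval sketch in Python. Everything in it is an assumption for illustration: the bag-of-words `embed` function stands in for a real embedding model, and the in-memory list stands in for an actual vector database. The idea is only that chunks and the query are embedded, and the most similar chunks are put into the prompt.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real setup would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b.get(word, 0) for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, chunks, k=3):
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

The retrieved chunks would then be pasted into the prompt as context, which is why this approach tends to mean longer prompts than a fine-tuned model needs.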
Is it actually possible to do it locally on an RTX 4090 machine, without using any cloud API or cloud GPU provider, and using multimodal PDFs as your input data?
Yes, definitely!
You used plain text for the dataset. Is it better than the JSON format? When would you choose one over the other? Thanks for the video!
Well, if you have JSON available to start with, that’s going to be even easier to process and modify to meet your needs. Plain text is hardest as there is no structure to go on.
On Runpod, how do I get (or amend) the Llama 70B API by TrelisResearch template to work with an exposed TCP port?
The connection is refused both in the terminal and in VS Code (preferred).
Other templates work fine.
Doesn't work: SSH over exposed TCP (supports SCP & SFTP).
Works: the basic SSH terminal (no support for SCP & SFTP).
The basic SSH terminal is not going to work with VS Code, to my knowledge.
Perhaps there is a way to edit the templates for these containers so they can work with VS Code?
I'm really looking forward to digging into your tutorials :)
Hello @GrahamAndersonis,
Out of the box, Debian Linux does not come with SSH installed.
1. In the Runpod image, you have to pass the public_key and expose TCP port 22.
2. Run the following commands in the basic command prompt:
####
# Update package lists for upgrades and new package installations
apt update;
# Install OpenSSH server in a non-interactive mode to avoid prompts and questions during installation
DEBIAN_FRONTEND=noninteractive apt-get install openssh-server -y;
# Start the SSH service to enable remote connections
service ssh start;
####
3. After this, the pod will have SSH available to connect to.
4. Use VS Code's Remote - SSH extension to connect to the pod as a remote server.
5. This connection will have SCP and SFTP enabled.
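If the connection is still refused after the steps above, a quick check of whether anything is listening on the exposed port can narrow things down. The sketch below is a generic TCP check in Python, not anything Runpod-specific; the host and port you pass in are placeholders for whatever public IP and mapped port your pod shows.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

A `False` result before running `service ssh start` and `True` afterwards would suggest the SSH server was the missing piece rather than the port mapping.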
Hi Graham, yeah had this issue too and will post shortly with a workaround. Ultimately the image would need to be updated for a permanent fix (but I don't control that image).
@TrelisResearch fantastic work
I want to fine-tune on my code. Each project has multiple folders and files that I want to fine-tune on. Can this private repo work for that? Basically, I want to fine-tune on my coding projects.
Yes, this can work. If dealing with a file structure, you may want to decide which files to include and then flatten them into one single .txt file. It can also help to include the directory structure within that .txt file so the LLM knows what it's looking at.
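A minimal sketch of that flattening step in Python, under stated assumptions: the extension filter, the skipped directory names, and the `===== FILE: ... =====` header format are all illustrative choices, not something taken from the repo.

```python
from pathlib import Path


def flatten_project(root, extensions=(".py", ".md"),
                    skip_dirs=(".git", "__pycache__", "node_modules")):
    """Return one text blob: a directory listing followed by each file's contents."""
    root = Path(root)
    files = sorted(
        path for path in root.rglob("*")
        if path.is_file()
        and path.suffix in extensions
        and not any(part in skip_dirs for part in path.relative_to(root).parts)
    )

    # List the included files first so the model sees the project layout.
    lines = ["DIRECTORY STRUCTURE", ""]
    lines += [path.relative_to(root).as_posix() for path in files]

    # Then append each file under a header naming its relative path.
    for path in files:
        rel = path.relative_to(root).as_posix()
        lines += ["", f"===== FILE: {rel} =====",
                  path.read_text(encoding="utf-8", errors="replace")]
    return "\n".join(lines) + "\n"
```

Writing the result out is then one line, e.g. `Path("project.txt").write_text(flatten_project("my_project"), encoding="utf-8")`, where both names are placeholders.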
Hi, I just paid for access to the repo for this video, but I wasn't aware of the option to buy access to all projects in the repo. Is there any way to pay the difference and upgrade? How can I get in touch with you about that? Love the work btw!
Howdy, everyone gets emailed a receipt, so you can just respond to that email!
Is "Context" a keyword that this specific model knows? How would it notice it after the blob of text?
It should know "Context" like any other English word, and it will also have seen training data showing what that refers to.
Hi Ronan. Where is the code relevant to this video as of June 2024? In the Adv. FT repo, there is no trace of it AFAIK. Thanks.
Howdy, code is in the supervised-fine-tuning branch
@TrelisResearch Thanks!
Thank you very much
Great 🤠
cheeeeez u give it to me man !
😂❤️