Hi, what is the reason you first convert it to FP16 GGUF and not directly to 8-bit?
The conversion doesn't go through unless we first convert to GGUF. At least that was the case for me when I did the work. Maybe some recent commits to the library have eased the process and skipped that step?
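For reference, this is roughly the two-step flow I mean. It's only a sketch, assuming a recent llama.cpp checkout where the converter is convert_hf_to_gguf.py and the quantizer binary is llama-quantize (older checkouts name them convert-hf-to-gguf.py and quantize), and ./my-hf-model is a placeholder path to the downloaded Hugging Face model:

# 1) Convert the Hugging Face model to an FP16 GGUF file
python convert_hf_to_gguf.py ./my-hf-model --outtype f16 --outfile model-f16.gguf
# 2) Quantize the FP16 GGUF down to 8-bit (or Q4_K_M, etc.)
./llama-quantize model-f16.gguf model-q8_0.gguf Q8_0

I believe newer versions of the convert script also accept --outtype q8_0 directly, which may be the "eased process" that skips the separate FP16 step, but I haven't re-checked that recently.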
Good one
Thank you 🙂
Can't we convert the base model to GGUF format and quantize it with llama.cpp? Can't we apply the LoRA after that?
Yes, that could also be one of the routes. But I am not sure if the LoRA can be applied after that. Did you try it out at all?
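If anyone wants to try it, my rough understanding (not verified end to end) is that recent llama.cpp can convert a LoRA adapter to GGUF and load it on top of a quantized base at runtime. Script and flag names may differ between versions, and ./lora_dir / ./my-hf-model are placeholder paths, so treat this as a sketch:

# Convert the PEFT LoRA adapter itself to GGUF, pointing at the original base model
python convert_lora_to_gguf.py ./lora_dir --base ./my-hf-model --outfile lora-adapter.gguf
# Load the adapter on top of the quantized base model at inference time
./llama-cli -m model-q4_k_m.gguf --lora lora-adapter.gguf -p "Hello" -n 64

Whether the quality holds up with the adapter applied over an already-quantized base is something I haven't tested myself.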
What about the Raspberry Pi? Can this be applied to it as well?
Yes. I feel models up to 7B parameters quantized to 4 bits should fit on a Raspberry Pi. Anything larger might run out of memory. Try Mistral 7B or Llama 2 7B, please.
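As a rough back-of-envelope estimate (my own numbers, not measured on a Pi): 7B parameters at roughly 4.5 bits per weight, which is about what Q4_K_M averages, comes to around 4 GB of weights, plus the KV cache and runtime overhead on top. So an 8 GB Pi should be fine, while a 4 GB board would be very tight.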
Did you try running it on a Raspberry Pi and face any issues? I am curious now :)
@AIBites I haven't tried running it as of now; I'm still looking into which model would be best.
I'm not able to convert the OpenELM model into GGUF format! I need help with that.
Sorry about the late reply, but did you manage to convert it now, or is it still a problem?
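In case it's still an issue: if I remember right, OpenELM support was only added to llama.cpp's convert_hf_to_gguf.py relatively recently, so an older checkout can fail on that architecture. Pulling the latest llama.cpp and retrying something like the following (the model path is a placeholder) might be worth a shot:

python convert_hf_to_gguf.py ./OpenELM-270M-Instruct --outtype f16 --outfile openelm-f16.gguf
./llama-quantize openelm-f16.gguf openelm-q4_k_m.gguf Q4_K_M

If it still errors, sharing the exact error message would help narrow it down.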