@ 55:12 : Wouldn't it be more appropriate to utilize (or whatever the instruction format of the underlying LLM) instead of relying on a customized instruction format? You can use the same prompt but format should be followed depending on underlying LLM
This is great! Just the slide comparing base performance vs performance after fine-tuning makes this exercise worthwhile: proves that differences between foundation models are not *that* large, and that pure prompting is not sufficient to reach good performance (and once you do that, most differences in base models disappear ; though mistral models do seem to be significantly ahead!) Thanks for putting this together! If you're considering a similar comparison in the future, I'd be curious to see the effect of int4 quantization (with and without Quantization Aware Training) on prediction quality. Hard to find proper experiments testing this, mostly seeing evals with latency alone without a proper analysis of the quality cost (and how to reduce it, e.g. with QAT).
@ 55:12 : Wouldn't it be more appropriate to utilize (or whatever the instruction format of the underlying LLM) instead of relying on a customized instruction format? You can use the same prompt but format should be followed depending on underlying LLM
This is great! Just the slide comparing base performance vs performance after fine-tuning makes this exercise worthwhile: proves that differences between foundation models are not *that* large, and that pure prompting is not sufficient to reach good performance (and once you do that, most differences in base models disappear ; though mistral models do seem to be significantly ahead!)
Thanks for putting this together! If you're considering a similar comparison in the future, I'd be curious to see the effect of int4 quantization (with and without Quantization Aware Training) on prediction quality. Hard to find proper experiments testing this, mostly seeing evals with latency alone without a proper analysis of the quality cost (and how to reduce it, e.g. with QAT).
@5:08 - 😂😂😂