Simplifying On-Device AI for Developers with Siddhika Nevrekar - 697

  • Published Sep 9, 2024

COMMENTS • 4

  • @johnkintree763 · 21 days ago

    The smaller the model, the quicker the first token, the more tokens per second, and the less electricity used. According to Andrew Ng, using an LLM in an agentic workflow can improve performance more than increasing the size of the LLM. Also, retrieving a previous response that has been fact-checked can be faster than generating a new, possibly hallucinated, response.
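The caching idea in this comment can be sketched in a few lines. Everything below is illustrative: the class, the exact-match keying, and the fallback behavior are assumptions, not any particular library's API; a real system might match prompts by embedding similarity rather than normalized text.

```python
from typing import Optional

# Minimal sketch: reuse a previously fact-checked answer instead of
# regenerating (and possibly hallucinating) a new one. All names here
# are hypothetical, for illustration only.

class VerifiedResponseCache:
    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def _key(self, prompt: str) -> str:
        # Exact-match key after light normalization; a production
        # system might use embedding similarity instead.
        return " ".join(prompt.lower().split())

    def put(self, prompt: str, response: str) -> None:
        """Store a response that has already been fact-checked."""
        self._store[self._key(prompt)] = response

    def get(self, prompt: str) -> Optional[str]:
        """Return a cached response, or None to signal a model call."""
        return self._store.get(self._key(prompt))

cache = VerifiedResponseCache()
cache.put("What is the capital of France?", "Paris")

# Cache hit: returned instantly, no generation and no hallucination risk.
print(cache.get("what is the capital of  France?"))  # Paris
# Cache miss: the caller falls back to generating with the model.
print(cache.get("What is the capital of Peru?"))  # None
```

A cache hit skips the model entirely, which is why it can beat even a small model on both latency and energy.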

  • @johnkintree763 · 21 days ago

    Qualcomm AI Hub should measure the watt-hours of electricity used in a typical hour of operation, in addition to the time to first token and the tokens per second of output, for each optimized model.
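The two latency metrics this comment mentions can be measured around any streaming generator. This is a sketch under stated assumptions: `stream_tokens()` is a stand-in for a real on-device runtime, and energy (watt-hours) is deliberately left out because it requires platform power counters rather than wall-clock timing.

```python
import time
from typing import Iterable, Iterator

def stream_tokens(n: int = 5, delay: float = 0.01) -> Iterator[str]:
    """Stand-in for a real model's token stream (hypothetical)."""
    for i in range(n):
        time.sleep(delay)  # simulate per-token generation latency
        yield f"tok{i}"

def measure(stream: Iterable[str]) -> dict:
    """Time to first token and tokens/sec for any token stream.
    Watt-hours would need OS/hardware power counters, not covered here."""
    start = time.perf_counter()
    first = None
    count = 0
    for _ in stream:
        if first is None:
            first = time.perf_counter() - start  # time to first token
        count += 1
    total = time.perf_counter() - start
    return {
        "time_to_first_token_s": first,
        "tokens_per_second": count / total if total > 0 else 0.0,
    }

stats = measure(stream_tokens())
print(stats)
```

Smaller models shrink both numbers at once, which is the commenter's point: first-token latency falls and throughput rises together.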

  • @johnkintree763 · 21 days ago

    The next thing I would like to see is a digital agent running on my smartphone that is part of an open source and decentralized global platform for collective human and digital intelligence.

  • @johnkintree763 · 21 days ago

    Once an application works at all, it can typically be optimized to run faster and on less powerful hardware, such as smartphones. Hardware developers such as Qualcomm should provide the means of optimizing software for their hardware.