Developing for Indic languages | Gemma and Navarasa

  • Published May 13, 2024
  • While many early large language models were predominantly trained on English language data, the field is rapidly evolving. Newer models are increasingly being trained on multilingual datasets, and there's a growing focus on developing models specifically for the world’s languages. However, challenges remain in ensuring equitable representation and performance across diverse languages, particularly those with less available data and computational resources.
    Gemma, Google's family of open models, is designed to address these challenges by enabling the development of projects in non-Germanic languages. Its tokenizer and large token vocabulary make it particularly well-suited for handling diverse languages. Watch how developers in India used Gemma to create Navarasa - a fine-tuned Gemma model for Indic languages.
    Watch the full keynote: ua-cam.com/users/liveXEzRZ35u...
    To watch this keynote with American Sign Language (ASL) interpretation, please click here: ua-cam.com/users/live6rP2rEWs...
    #GoogleIO #GoogleIO2024
    Subscribe to our Channel: / google
    Find us on X: / google
    Watch us on TikTok: / google
    Follow us on Instagram: / google
    Join us on Facebook: / google
  • Science & Technology
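
The description's point about tokenizers and vocabulary size can be illustrated with a rough sketch. It does not load Gemma's tokenizer or any real model. It only measures how many UTF-8 bytes each character takes, on the assumption that a tokenizer with no vocabulary entries for a script falls back to byte-sized pieces in the worst case. The function name and the sample words are illustrative choices, not taken from the video.

```python
def utf8_bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per non-space Unicode code point."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(len(c.encode("utf-8")) for c in chars) / len(chars)


# Illustrative samples: the word for "language" in three scripts.
samples = {
    "English": "language",
    "Hindi": "भाषा",
    "Telugu": "భాష",
}

for name, word in samples.items():
    # Higher values mean more pieces per character under a pure byte fallback.
    print(f"{name}: {utf8_bytes_per_char(word):.1f} bytes per character")
```

Under this assumption, text in Devanagari or Telugu script would cost about three times as many pieces per character as ASCII English, which is the gap a larger vocabulary with dedicated entries for those scripts is meant to narrow.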

COMMENTS • 45

  • @qirimca
    @qirimca 25 days ago +6

    Let's ensure support for the preservation of the endangered Crimean Tatar language. Our NGO is ready to help you

  • @JohnnyB.
    @JohnnyB. 25 days ago +5

    This is a necessary step in this process of evolution. Being able to communicate but not forcing people to learn a specific language. Keep the cultural norms of their own society but still being able to communicate will be amazing :)

    • @Google
      @Google 25 days ago

      Pumped for you to try it

  • @Mahesh-ij7os
    @Mahesh-ij7os 24 days ago +1

    Awesome, Superb, excellent!! Excited to explore this, Thanks Google!

    • @Google
      @Google 23 days ago

      Ready to make magic happen together! ✨

  • @gamarsh1960
    @gamarsh1960 26 days ago +15

    Another excellent step. Thank you !!

    • @Google
      @Google 26 days ago +1

      We can't wait to see everything you create ✨

  • @santoshsantosh30
    @santoshsantosh30 25 days ago +1

    Superb folks….many many congratulations

    • @Google
      @Google 25 days ago

      Can't wait to hear what you think

  • @AmrinderSingh-rq8nw
    @AmrinderSingh-rq8nw 26 days ago +3

    Congrats to Navarasa and excellent initiative by Google in showing LLM innovation from India.
    Following this space in detail, I must say that Indic LLMs are becoming a big deal now, and some even surpass GPT-4.
    Socket AI Labs recently unveiled an LLM called Pragna 1B, which has a more efficient tokenizer than GPT-4 for Indic languages.
    GenVR Research unveiled AryaBhatta Gemma LLM, a Gemma model trained on more than 6 million items of Indic cultural data (10x more SFT data than most Indic LLMs), which currently leads the Indic LLM leaderboard and the Microsoft Pratiksha leaderboard. It also became the first Indic LLM to surpass GPT-4 on human evals in the Microsoft Pratiksha study. A Gemma finetune again.
    OpenBioLLM70B, the current leader on the Medical LLM leaderboard, is finetuned on llama-3-70B and was created in India.
    These three models (one made from scratch, one built on Gemma and one on llama-3) show that Indians can surpass GPT-4 despite our funding crunch.

    • @AmrinderSingh-rq8nw
      @AmrinderSingh-rq8nw 26 days ago

      Microsoft Pariksha study *

    • @techxting
      @techxting 25 days ago

      Can't wait to tell Sam that yes, we Indians can do it.

  • @pmishraofficial
    @pmishraofficial 26 days ago +2

    Superb!

  • @sdd201
    @sdd201 25 days ago

    That's really great to see all languages and cultures, whether followed by thousands or even millions, treated equally.
    Thank you Google for this initiative, I am proud to be in such a world.
    Thank you Google!!!!

  • @jagdishnigam5
    @jagdishnigam5 26 days ago +9

    Rest in Peace OLA Krutrim.

  • @hkmstreams
    @hkmstreams 23 days ago

    Really needed. Long due!

  • @Mr_Battlefield
    @Mr_Battlefield 25 days ago +1

    Next is to ensure every language in Indonesia is covered as well. There are so many different languages here too. Please include Indonesia in the many Google projects that are typically launched only in the United States first.

  • @i.dragons
    @i.dragons 25 days ago

    Amazing!

  • @wojownicza12
    @wojownicza12 17 days ago

    nice, thanksss

  • @kingki1953
    @kingki1953 25 days ago +1

    I wonder if I could use this to maintain the Javanese language

  • @maverick.gaurav
    @maverick.gaurav 26 days ago +18

    Google is upping the game every day. This will be really helpful to Indians if it actually delivers.

    • @Google
      @Google 26 days ago +2

      Excited for you to try it

    • @warpdrive9229
      @warpdrive9229 25 days ago

      Damn! Google replied to your comment XD

    • @htmlfortomorrow
      @htmlfortomorrow 6 hours ago

      Try what? Google is doomed, check the comments. I'll make sure Google isn't there in 2060, 6th generation computer. Your AI is stupid, I understand, but sorry @Google

  • @Ari.xyzefg
    @Ari.xyzefg 24 days ago

    This is awesome. Thank you Google.
    Please include Bengali too 🥺

  • @trenfa4371
    @trenfa4371 25 days ago

    Thank you 🙏 Google...

  • @legendgaming8877
    @legendgaming8877 25 days ago

    Hope it works well

  • @suvamkeshari
    @suvamkeshari 25 days ago

    This will definitely help understand Sanskrit and will help to learn the language as well.

    • @Google
      @Google 23 days ago +1

      Thrilled you're ready to play around with it!

  • @balajilaveti
    @balajilaveti 25 days ago

    Thank you, Google
    .
    .
    #teampixel

  • @BeyondImaginationzz
    @BeyondImaginationzz 25 days ago +1

    Hope this encourages big companies to translate their content into Indian languages 😅

  • @user-sz5vx8lz9f
    @user-sz5vx8lz9f 25 days ago

    In the chords of the Universe/
    Nature listens/
    In order to go on/
    Being, without ruffling the Earth!//

  • @Yuvraj.Agarwal
    @Yuvraj.Agarwal 26 days ago +5

    Great initiative!

  • @FONK6969
    @FONK6969 25 days ago

    Can we do it for Nepalese too please?

    • @tui3264
      @tui3264 25 days ago +1

      Small market, I think. How much data does the Nepalese language generate? I recently made one for Sanskrit, using Llama 3, to understand old books, so try it if you are a developer. You are from the land where Panini (the father of linguistics) did his research.

  • @h1mangshu
    @h1mangshu 25 days ago

    Google, dhanyawaad (thank you) from all of India.

  • @chatoanil
    @chatoanil 24 days ago

    Now my mom can also search in google using her native language Telugu.. Thanks TeluGoogu.. :)

  • @luizgeraldo3412
    @luizgeraldo3412 20 days ago

    Single, I live in Brazil

  • @I_am_who_I_am_who_I_am
    @I_am_who_I_am_who_I_am 26 days ago +6

    I sure hope you don't bring India's religious and class bigotry to the entire world.

    • @jayasimhanmasilamani9078
      @jayasimhanmasilamani9078 26 days ago

      Insightful comment!

    • @Science-vt4vg
      @Science-vt4vg 25 days ago

      Meanwhile Muslims demanding Sharia in UK, Germany and creating chaos in Europe

    • @tvm73836
      @tvm73836 24 days ago

      If you think bigotry is the exclusive domain of India and Indians it reveals both your bigotry and ignorance. Are you not even following what’s happening on college campuses in the US?

  • @ThatPJboy
    @ThatPJboy 26 days ago

    This io was ass as always