If anyone from Google Cloud or Microsoft Azure are watching this 👀 The official SSML (Speech Synthesis Markup Language) spec actually does have an attribute called "duration" for the "prosody" tag to say exactly how long the synthesized speech should be, BUT no services seem to support it. (UPDATE: Azure now does support a duration tag!) 😔 (Amazon Polly actually does but only for non-neural voices) Please add support for this tag, because it would make it WAY simpler to accomplish all this! It would eliminate the need to stretch the audio or do multiple passes. ------------Some Additional Notes for everyone else: • Update: The app now supports DeepL! • Obviously it's not anywhere near perfect, so it is probably best suited for viewers who wouldn't be able to watch the video in English at all. To avoid UA-cam automatically switching to the dubbed track for people who have a very high likelihood of understanding English anyway, I might consider just doing it for languages with fewer bilingual speakers.
Even though translations are great, I hate the implementation of it, in the sense that there is no setting to enable or disable whether it automatically selects the language your app is set to. The same for video title translations.
@@lajawi. Exactly! UA-cam should either implement a switch to turn it off completely or ideally let us choose which languages we understand natively and only switches to a native language if the video is non-native.
You should use the API of DeepL, it is much faster and more accurate than the Google translator, in fact, I don't even speak English, all this text is completely translated with DeepL. (I say this because the Spanish audio, although understandable, is poorly translated).
As an Indian, I can confirm that the Hindi dub sounds ridiculously good. It even uses less popular words from Hindi which aren't very common in day-to-day conversations, but are an essential backbone of Hindi. Impressive stuff!
@@kindofanmol yes some words rarely used in daily Life,and lack of human emotions, only straits reading Hindi me bole toh- fill nahi aa raha hai, But ye kaam karega Main start kar raha affiliate me logo ke hindi videos ko English me dub karna Bahut paisa Hai
Having listened to the Italian version, I have to say, I am very impressed with the accuracy of the translation and the voice generation, up until you start talking about technical stuff, where terms like host, play, run, are translated out of context and make it more difficult to understand what you say. Still, I’m very impressed and can’t wait to see how this evolves!
Your work is simply amazing! A doubt about the function of audio tracks on UA-cam. Are you able to add audio tracks to old videos, or is this limited to future videos only, and can only be added at upload time?
Yes, you can easily add them to videos you already uploaded. As example: Mr. Beast, he used this option in some old videos, for example: Mr Beast "Golden pizza" video were created so much before after this function. I hope it helped you :D
@@Fiaspo Makes Sense, but alot of other youtubers do this Fiaspo, JuckSucksAtLife did this for one of his old old videos for spanish for texting, since he were thinking of removing the JackSucksAtSpanol channel, and theres some mini youtubers that did that, i remember people with like, 5k, 60k subs doing that, and since you are a Very big youtuber, i guess you could actually do It, and Ive been studying alot about this feature, and if you want i can contact you things like how to apply and things like that.
Add: Mr Beast were only the 192nd youtuber to have this added to their Channel, and this function is already quite old, and UA-cam probably have added more features to it.
In the Russian Yandex browser, there is a similar function on all UA-cam videos, but the list of languages is limited. There are only the most popular languages. The voice in Yandex is much better and more human, and the translation quality is good. And he also gives each person in the video their own voice. If two men speak, then they will have different voices, and women will have a female voice.
Let me tell you, I tried your video translated to Spanish and even though is not 100% perfect it is very very good. The translation and synchronization was almost perfect. Great job. 👍
Man what's up with people low standards? do we Spanish speaking people are so desperate to get attention from the people at the north? Lmao That dub was trash, the sync was terrible, no attempt to match lips, generic monotone voice not even close to Thio's You can get dramatically better results asking an amateur south american fandubber on fiverr to do the dub for pennies lmao Impressive work, there's no doubt about that, great job indeed. But all that effort for a less than mediocre result? the least we can do is be honest with Thio about the results so things can improve in the future
@@GianniLeonhart I could report your comment as racist or discriminatory but, them your comment would be deleted and other people won’t be able to see what kind of poor mentality you have. Sorry for you. 👎
@@SurmenianSoldier I said that the translation was not perfect but, it is the beginning of something that eventually is going to improve. Everyone just wants to criticize and complain about the efforts done by others but, the same people complaining and criticizing do nothing. Always is like that.
Actually yandex can translate videos. It has been working for a year. Now it can translate only into russian from english, german, french, spanish and italian. Also translation from chinese into russian will be available soon. It can't translate into other languages, but it translates much better. It detects voices of people, and gives each person his own voice (yandex has 6 male and female voices). It can translate any video with supported languages, it uses voice to text (It does not use youtube automatic subtitles). Also stream translation is on beta, but now we can use it only on limited amout of channels like NASA, english speaches.
@@danko5866 I think it's not Yandex Zen. Yandex Browser - chromium clone with yandex services. yandex zen is another service. And it's not part of yandex now.
Love this so much! I don't think all channels have the ability to upload videos with multiple audio tracks yet (only my big channel has an audio track thing next to where you add subtitles) but this will be really useful in the future. One feature I would love to see in this is the ability to set multiple speakers.
@Zaydan Alfariz idk I have the option on channel that has less than 10k subscribers but I did submit my passport for identifying verification for using advance features. Also you can't add additional audio tracks once your video has gained 100k views, it greys out the option after that.
Вау. Хоть и звучит русская дорожка не идеально, но это всё равно очень круто. Наверное, пока что я бы использовала английскую дорожку, так как в целом понимаю всё, что говорят, но с видео из других, не англоговорящих стран так сделать не выйдет и данное приложение будет очень полезно.
@@shikajf у Яндекса буквально так же :/ я смотрел несколько обучающих английских видео на русском и во всех них диктор либо быстро либо медленно говорил
@@rodion_runchev Я думаю у яндекса работает лучше именно для перевода на русский. Я прочитал в другом комменте, что он умеет давать разным людям разные голоса (всего есть 6 мужских и 6 женских). Но автор тоже сделал реально круто, плюс работает переводит на многие языки, так что более универсально.
6:17 Note, the feature to speed up or slow down audio is actually just a built in browser feature, not a youtube-specific thing. UA-cam just uses the browser's integrated video playback feature, which supports playback rate. It also doesn't sound all that great. But Audacity can do something similar if you disable "high-quality stretching". And yes, for some kinds of audio (notably speech, but not music) it does sound better.
На самом деле это хорошо, что появился такой инструмент. Хоть русский перевод и не такой идеальный как в проприетарном решение от Яндекса в их браузере, но все равно это очень не плохо
@@ЕвгенийДемьянов-х2ь понятное дело, что там используется полностью другой механизм перевода, но идея коммента была в том, чтобы сравнить итоговый вариант
En español hay veces en que acelera la voz y también hay veces donde la desacelera, pero la voz se escucha muy natural no parece uña Inteligencia artificial
Однако в яндексе голос просто перекрывает видео со всеми остальными звуками (ну вроде так было). А тут же можно иметь отдельный файл со всеми звуками, которые будут применены ко всем языкам.
Good stuff! Do you also see an impact on your metrics like retention and average view duration by dubbing over a video into multiple languages? I've found that by including high-quality captions, videos see higher average view duration and retention with manual captions vs. auto generated captions. I would hypothesize that the impact is even more profound when you dub over a video. There's a reason Mr. Beast is hiring a crew to dub over all his videos into multiple languages to improve the accessibility.
Here's the transalation: Until now, when watching videos that were automatically translated overseas, I was desperately chasing the subtitles, so I couldn't watch the video, but now that I have the audio, I can easily concentrate on watching the video! ! it's amazing! ! I want it as an official function of UA-cam!
Também acho incrível poder ler seu comentário traduzindo automaticamente pelo Google. A internet está ficando incrível. Conhecimento ultrapassando as fronteiras. 🥳👏🇧🇷
I love that you so heavily add comments in your code. I know a lot of experienced devs in the industry prefer not to add comments on code that is "self-evident", but what is self evident for them isn't necessary so for me.
This is very wonderful. I could hardly feel any discomfort when listening to the Japanese audio tracks. However, I think Deepl is more accurate than Google Translate when it comes to translating between Japanese and English. This text was also translated by Deepl!
Let me know if there is a difference in this. I'll write this in DeepL and next one above it is gonna be from Papago. I use that to translate Japanese to English . So do tell me. この違いがあれば、教えてください。 これと異なる点があれば教えてください。これはDeepLで書いて、その上の次のものはPapagoから書くつもりです。日本語から英語への翻訳に使っています。だから、私に教えてください。 This is Papago one. Let me know how accurate it is compare to the DeepL . I use this Papago to translate(right now it's with Honorifics) and Romaji to translate from Japanese to English. It should be more "accurate" since its a Korean company and Japanese and Korean have similarity. これはPapagooneです。 DeepLと比較してどれくらい正確なのか教えてください。私はこのPapagoを使って翻訳しています(今は敬語と一緒です)。ローマ字を使って日本語から英語に翻訳しています。 韓国の会社と日本語と韓国語は似ているので、もっと「正確」でなければなりません。 korewa poppagūn'desu dīparuto hikakushite dorekurai sēkakunanoka oshietekudasai watashiwa kono papagōo tsukatte hon'yakushiteimasu imawa kēgoto ishodesu rōmajio tsukatte nihon'gokara ēgoni hon'yakushiteimasu kan'kokuno kaishato nihon'goto kan'kokugowa niteirunode motto sēkaku de nakereba narimasen'
@@Enforcedcraft I found DeepL more accurate than Papago. DeepL has a fairly natural way of translating Japanese grammar and words, making the text very readable.
hmm, the sentence "I could hardly feel any discomfort when listening to the Japanese audio tracks" seems like a bit of a weird way to phrase that thought. Is that exactly what you meant, or was Deepl doing something weird?
@@vibaj16 there are a lot of similar things even in human translated text. You would probably have to sacrifice some accuracy in order to avoid this kind of thing.
I know this can be helpful for a lot of people, but I don't like that for me as a German it starts the video with German audio track and I manually have to switch to the English track, although English is one of the preferred languages of my Google account. The same problem actually happens with subtitles always showing even for languages you know, but you can turn off available subtitles always showing (of course they then also don't show for languages you don't know). I hope UA-cam will add a setting for not using alternate audio tracks by default (so far there isn't one), before this feature gets widely used. If you know the original language well enough, the original will always be better than a translation (even if the translation and voice acting is done by humans). (This is because some things like puns don't always work in the other language.) I think this is a problem that many sites from the US (and other mostly monolingual countries) have, that they don't really let you set multiple preferred languages (or in Googles case don't consistently use that information). (The worst case I have seen is on WattPad, where you have to change your language settings to find books/stories in a different language.) PS: The German audio track sounds pretty good, but it has a few weird quirks: At one point there was a significant jump in voice speed to a very slow speed that sounded unnatural (maybe you should cap the slowdown to some reasonable value and just pad with silence in that case). Also the AI didn't know how to translate 'you' and sometimes went with the honorific 'Sie' and sometimes the more personal 'du' (on UA-cam it should always be the personal, unless your demographic is over 50 and you want to treat them as strangers), sometimes it should have translated 'you' with 'man' when talking about general instructions (in sentences like 'So how do you create them?') and instead went with 'du'.
I'm a German too, and Google just doesn't care what your preferred languages are, it will always show any translations that are in your native language: titles, descriptions subtitles, and now audio tracks too. It's been like this forever and I doubt they're going to do anytging about it even tho it's super annoying :/
As a Spanish speaker I could say that it is a great tool, it is very interesting and capable to do great stuffs, but it has their flaws. Listening this video in Spanish I could say that it feels unnatural. I think that almost everything is properly translated, but the problem with that is on English, the speech has their own rhythm, but on Spanish because of the words and structures, there is some phrases that are really slowed down and the next one is speed up, and you have this some sort of roller coaster that can easily confuse and annoys you. I think, on companies that dubbs content, this issue is resolved by re translating the phrases to make it ended with the same length as the original speech, e.g. Spanish 10:17 "Entonces de esta manera, todo esta cubierto" English 10:17 "So this way, everything is covered" The speed on English is almost double as the Spanish translation, so on the Spanish audio it sounds really fast, but to make it sound at the same speed as the English audio, we could say and use instead: Spanish 10:17 "Entonces todo esta cubierto" This is just an example, I am pretty sure that there are better ways to translate these as a context. Also, the language that you use on your original audio should be Neutral, because when you started to talk about prices, everything was "converted" to the target language, e.g. "$20" does not sounds like "$20 dollars", instead sounds like "$20 pesos" which is a lot cheaper of what you think, and could confuse people; maybe next time say literally "20 dollars". Also, for example, there are "North American Expressions" that cannot be translated and made it understandable by other languages and cultures. As I said at the beginning, is a great tool, but needs some tweaks (and also consider taking advices from the subtitles community, which most of them translate subtitles and knows better how to do it).
Рік тому+18
Si, exacto, claramente esto se puede solucionar con una revisión del subtitulo traducido usado para generar el audio, pero ya involucra intervención humana.
i think the same mimic of lips dont sinc with sound. and some time the voice go fast or to slow. That is because the sound have to much the second on srt subtitles. Vamos que no me gusta. Prefiero subtitulos bien hechos que abarcar de todo y que esten a medias.
Рік тому+7
@@slevinshafel9395 Yes, a way for the program to read the original audio and try to match vowels and silences would be cool to see, but again, dub studios do this manipulating the script to match it better. Maybe we will have ai that does this at some point.
This will certainly be improved over time with A.I. However I think the options should be always available. I like watching media in it's original language, whatever is it, and with subtitles. So, I don't like that the app chooses automatically to play the dubbed version, even if I have my settings in general for not doing so. But as there is the option to change it, it's all Ok. I hope that option still be there in the future...
As an Arab, Dubbed version in Arabic is way better than what I expected. Still needs some edits here and there but overall good enough to understand. Great job as always ❤
I have created my self some scripts to aid me in editing videos, I have one that detects the intensity of the voice to make smaller or bigger text for overthevideo subtitles, so I already do make srt files for all my videos, this sounds like a great addition to add more ways to make content avariable for everyone.
I immediately checked the German dub (my native language), even before the involvement of AI was revealed. It was instantly apparent from the abysmal grammar. But fair enough, it's better than nothing I should mention: as a computational linguist o adore this project from a technical standpoint! Never think the effort isn't appreciated
I had the same feelings on the German dub. Terrible grammar in lots of places, but better than nothing. I'm not sure if I prefer it over just watching in English tbh.
Tested the Hindi one, would be very cool if youtube inplements that natively. Definitely would remove the language barrier significantly. Really appreciate your initiative.
As hindi speaker, it's very good features it was easy for me to understand content But, it's very pure Hindi translation you have to turn on english subtitles to understand them
The Hindi translation is absolutely spot on but it is a bit too pure when compared to the Hindi that is commonly spoken. It does throw around a healthy amount of English words but a few more here and there would make it perfect "Hinglish"
Speaking two languages i can say the dubbed version works fine, it's got the issues you'd expect from machine translation and speech synthesis but i can understand what's being said, so good job!
Are the subtitles machine translated? I was under the impression that they are actual translations, otherwise there isn't much point as you can auto-translate youtube subtitles anyway.
@@Leonhart_93 UA-cam auto translation is a one-size-fit-all solution, so it still make sense to run machine translation manually. That's said, in its current state the difference in quality probably isn't that good yet, especially that technical words are just as easy to fail.
Watching the german dub I have to say, I'm quite impressed with the result. I think someone hearing just the dub would be able to understand pretty much everything. However there are a lot of weird phrases. The vibe of the german dub is very different. I think most of the problems are because translating the trascript snippet by snippet, there is a lot of context lost and it shows in the translated text. Another thing is that the translator tends to use very formal speech, so it sounds more like a coporate presentation than you just explaining your program to a casual audience. Sometimes the AI voice is unusually slow/fast and it sound like a bit like those shady ads (think: "This self taught kid genius build a simple device to cut your energy consumption in half. Scientists hate him for it"). All in all still a huge leap in accessibility (even if I doubt most people know where/how to switch audio tracks).
Honestly its kinda crazy youtube waited so long to add it. UA-cam already automatically translates the title into your language. I remember all the way back in like early 2015 i kept clicking on Spanish Fnaf UA-camrs videos and being mad they were in Spanish, despite being no indication of it. People were also already making dub channels, such as El Smosh.
Watching the Japanese dub, you might want to allow some sentences to have a maximum slow speed (so short sentences don't get stretched over originally long sentences), or merge sentences that have a completely different duration than their original sentences to prevent rapid fire sentences being immediately followed by drunkenly slow sentences. For the second one, it is also needed when sentences are fragmented over multiple subtitles and the translator bugs out thinking they are competely separate sentences.
I feel like this is useful for monolingual audiences, but I'm bilingual and prefer to watch videos in their original languages without being automatically translated. I have my UA-cam set up in Japanese because it kept translating video titles from Japanese to English. It usually doesn't happen the other way around. I don't need the translations and it often ruins the original meaning of the title. When I clicked on this video it started playing in Japanese. No thank you. This video was originally recorded in English, I want to watch it in English. UA-cam really needs to work on making options to disable certain translations for multilingual people, or at least provide an option to not translate at all.
@@lucadoesthings Right, but UA-cam's default behavior is to translate video titles and descriptions to your interface's language with no regard for whether or not you can speak the original language of the video in question. There's no way to disable this behavior, which is fine for most people, but not for me, and I'd imagine most other multilinguals. Remember, UA-cam serves hundreds of millions of people, and one percent of 100 million is still a million. Even if only one percent of UA-cam's users are multilingual, that's still millions of users being done a disservice. It's be a shame to see the audio go the same route with no way to disable or change this behavior.
@@mrcryptozoic817 Murrica is not the entire world. There are 5 whole other continents filled with people who learn English (or some other language, German is common here too) as their secondary language.
its so weird remembering you as the guy who made troll tech tutorials like 7 years ago and seeing you do wonders for the platform now. not complaining tho, you're amazing
I'm Brazilian and I'm in love with this resource, I found out about it in a video on how to make a blog from scratch, it was an American video, I was scared by the guy speaking my language! I jumped and went to the UA-cam player and saw an additional feature of the UA-cam player! Dubbing!! This is so perfect, you really seem to speak Brazilian Portuguese NATIVELY! WTF
It was one of the best experiences I've ever had watching a video on UA-cam! I played around and changed the audio track a bunch of times, turning on and off the captions, it was amazing seeing how it works, nice work!
Такая функция уже давно есть в Яндекс браузере. Там если ты смотришь видео на английском (даже прямую трансляцию) оно будет переводится на русский голосом ассистента Алисы (аналог Сири в России)
When the video suddenly stopped for a brief moment to load because it hadn't buffered enough it reverted back to English, so maybe there's a bug that automatically changing the quality reverts the audio language back to the original?
Eu ao contrário do que vcs estão dizendo, achei fascinante isso, e ajuda muito. A voz está bem natural, logicamente que não será igual uma fala natural, masssss é só o começo, e para um começo está muito bom de verdade. Isso me deixa muito empolgado pois facilita ver qlqr áudio. E ao contrário que alguns estão dizendo, essa ferramenta possui função para facilitar o entendimento de um vídeo e não para substituir dubladores de filmes ou afins. O programa pensa em transmitir a informação do áudio em qualquer idioma e não passar emoções pela voz igual um dublador faz.
Even though translations are great, I hate the implementation of it, in the sense that there is no setting to enable or disable whether it automatically selects the language your app is set to. The same for video title translations.
I'm not gonna lie, I'm very surprised and delighted that you chose to do Indonesian audio. It's very rare to come across non-Indonesian content creators who even has Indonesian subtitles for their content, let alone dubbing. Many kudos to you 👍👍 Also, the translations is very good, just that it's very rigid. Indonesian uses the passive voice much more often than English, so some sentences seem rather superfluous. Still, it's very good for AI!👍
Oh my gosh, you literally made UA-cam better for other people who speak another language. I would like to add that maybe you could also do Lithuanian? For my fellow Lithuanian speakers. Thanks for your work we really appreciate it. Keep it up. (Every time you make a new video I'm very excited)
mabey but mabey not too i remember when i was like 6 or 7 and didn't know how to talk with english while playing crash so i wanted to know how to talk with it no matter what and this is probably why i become good at it but if everything will be just translated in the next some years there will be 0 point on trying to learn a new language because you don't need to
Listening to the Japanese version - I am surprised how accurate it is, although the way the speed changes constantly reminds me of how you change the rpm on a record player.
I'm an Arabic speaker, and honestly other than a few hitches, it works really well! I still needed to get used to the constant slowing down of the voice to match your speech, but I think that's a good compromise for how revolutionary this tech is for youtube videos!
It was really cool watching your video in Portuguese! It felt like low budget dubs from TV shows that do not care about lip sync like documentaries from Discovery Channel for example, with the added detail that the narrator speaks suddenly fast or slowly to keep up with the original video. Personally, I would prefer the awkward pause to the guy going Turtle Mode every once in a while, but what you did is really cool! You make an amazing software developer and I can't wait to share your future videos to my Brazilian friends. Cheers!
Yes, the software should run one pass, and look at how much each clip needs to be sped up and slowed down. Then it should try to even out the differences, so you don't have a fast clip between two slow ones.
The two pass system is really apparent in the german version, some parts are quite quick and some are insanely slow. Additionally there are some grammatical faults which aren't hard to spot, and it is in general very weird.
Error with prices 11:07. In english the symbol "$" means USD, but audio track in spanish says "pesos", maybe mexican "pesos". In this case, $ is "dólares americanos" (in spanish).
I wish more people made use of the secondary audio tracks, there are some neat things you could do. For example, there are a lot of ASMR videos that specify a specific pronoun for the viewer, or have background music that not everyone might like. If a creator wants to provide options currently they can only upload a second version of the video, which is likely to do bad things to their engagement ratings since the viewership would be split up. Having separate audio tracks could really help channels willing to put in that extra effort.
"that specify a specific pronoun of the viewer, or have background music that not everyone might like" uh some feature for that already exists right here on youtube, it is called 'exit the video and watch other one which u like' and for the pronoun thing, first does it matter that a random youtuber who is barely aware of your existence doesn't get your pronoun right? and second most youtuber refer their audience as YOU or some cringy name what do you wana fix on that? different people have different preference and nowadays different pronouns too, its hard to include all people in a ~10 min video.
@@trashpanda4 I don't care whether a UA-camr *does* it or not. My point is that the ones who do are *punished* for it, and that sucks. I'm not sure if you watch ASMR or not, but part of the process involves being relaxed and comfortable and some people are not comfortable being referred to by pronouns they don't like (or just don't like music with their sounds). Typically they'll just skip the vid, but sometimes there are creators who are awesome and literally make different versions with different pronouns (or with and without music, ect.) and then publish both even though it's going to do bad things to their overall metrics because very few people will watch both versions. I'm not suggesting we force creators to do this, I'm saying that there's already people who do, and this would be a system that would allow their hard work to not actively harm them.
@@MudakTheMultiplier UA-cam algorithm favours quantity over quality and if its a well established creator then the *getting punished* part is making the exact same video with slight variations. all things besides, idk what asmr people do over there but isn't there only one 2nd person pronoun which is 'you' (in English) unless its a group conversation simulator asmr how does pronouns enter the game?
@@trashpanda4 you greatly overestimate the viewership of most ASMRtists. I would say the bulk of the mid to large scale ASMRtists get 30-60k views on most videos, which is absolutely in range where getting one video that doesn't do as well will drop overall metrics. And several of the people I've seen do this actually only have one public facing video. And then private the other one and link to it in the description to combat this, which will result in some nontrivial fraction of your viewers clicking through to a separate video which will not contribute to algorithm weight. there are other gender defining terms (boyfriend/girlfriend, brother/sister, and so on) and it's not uncommon to refer to the listener in third person if only to cut down on saying "you" 50 times in one script. Unfortunately the alternatives are usually a bit clunky. Often a creator will pick the pronouns that match the bulk of their viewers, which then means that anyone who feels uncomfortable just loses a channel that they might love otherwise. That's why this feature is so cool because it's literally designed for accessibility, even if the original designers in 1990 would have never expected it to work like this.
For German: - A few grammar/translation mistakes - Audio speed changes are noticeable - It switches between the impolite "du" and polite "Sie" for addressing the audience - At 15:00 there is some weird overlapping of the dubs But: this is truly amazing work. The fact that these are the only things I've noticed given that this is all an artificial translation is mind-blowing. I'm so stunned right now 🤯
Actually there is also DownSub which allows you to download the titles of the different languages, you will have to upload an SRT file of a language to the video which in turn translates it DownSub to detected subtitles :)
1. Checked out the Russian automatic dub, it's not bad. There are some mistakes in translation and the voice's speed sometimes almost matches the speed of a machine gun. Deffinetly this needs to be a semiautomatic tool. 2. If you continue using this tool for your videos, why not turn your voice into AI generated one? 3. This tool reminds me of Yandex Browser's realtime video translator.
This sounds pretty good. There are some flaws (emphasis, pronunciation of some words in English, but these are the problems of Azure voice itself), but in 90% of cases everything is just fine. Of course, it does not reach dubbing, but we remember how it was in The Witcher 3 with shrinking and stretching audio tracks.
It's really impressive that all of this can now be automated. I've tried several languages I'm familiar with, there's often a significant difference in speed between each sentence as expected, but the only way around this would be to change the translated text to reduce these differences (perhaps approximating a proportion of the original number of syllables , though not necessarily). Future work. But that alone meets many dubbing needs.
German has a little problem with pronouns. In the second person singular we have "Du" and "Sie". "Sie" (not to be confused with "sie" without a capital s, the feminine third person singular) is the formal form, used for example in official letters. "You," on the other hand, is the more personal form, used in conversations with friends and family, for example. Google Translate uses "you" because it translates formally, but it felt a little strange to be called "you" in the video. "You" would be a bit better, but I don't think it's the best solution either, and I don't know why I feel that way. Translating other languages into German is apparently a bit difficult for Google Translate, or Google Translate is just overall bad, idk. Still, it was very interesting to see how far we've come with our technology when it comes to translating audio tracks, and that a script by just one person can do the whole thing relatively well.
LTT is doing something similar. They have made a demo for Spanish for an English video. I can understand Urdu so I tried the hindi language track in this video as it is closest to Urdu. What I found was at some places the voice was too fast. Then it went to normal. At some internvals the speed of the translated voice was too fast to understand properly.
That's coz the ai uses a lot of words to translate what Thio said. And it uses pure Hindi and not the modern Hindi language (which is an amalgamation of Hindi, Urdu and English) plus efficiency is something that can only be achieved with context awareness
Yeah I remember that, I have my youtube in Spanish because that's the country I'm on and youtube wont let me change it, got into LTT's video and thought it was just some joke until I was 6 minutes in thinking "alright when are they going to switch back into English?"
No me lo creo, está herramienta está increíble, ojalá pudiera ser más fácil el entrenamiento y transcripción... En un futuro seguro será algo como sube tu vídeo solo con el audio de voz sin efectos y verifica que la transcripción sea correcta y el programa se encarge de lo demás. Increíble trabajo! Es muy inspirador ver cómo usas diferentes herramientas ya disponibles para crear un servicio más complejo. Un saludo!
This feature has been in the Yandex browser for six months. But there is one caveat. It translates from different languages into Russian. And this works with streams as well.
@@xenondestiny Yeah, but it could be fixed with some manual editing, and instead of hiring an entire dubbing team, a creator can just look for multilingual editors that modify the audio file and it's timings. This is the greatest use of AI I've seen so far.
Hello from Russia, the Russian company Yandex has done what it says in this video, but better) There is a Yandex Browser where the video translation function is available, it translates it by pressing the button at the top of the video within a couple of minutes, selects the timbre of the voice being played suitable for the timbre of the person from the video (distinguishes female/male voice as well) And watched this video through it, and works much better
Es un excelente desarrollo! La voz y textos en español son casi perfectos, ¡felicidades! Sería estupendo si esta aplicación pudiera aplicarse cuando somos sólo usuarios y no creadores de contenidos, ¡sería un hitazo! Me apunto para la beta! Por lo pronto, suscrita y lo mando a mis creadores favoritos!
0:52 I’m basically working on the same thing but for sign language (English to ASL I’m not doing SLR) and that translation part is so hard, like I set up whisper (it also works with UA-cam videos, I actually set it up for UA-cam first and then modified it) in maybe 20 minutes, the playing signs for each word with a transcript took like 80 hours (which is probably a lie I worked on it in my free time way too much)but that translation is such a behemoth to tackle for me because surprise there aren’t that many tools that do that, so I have to make it myself kinda. There is someone who has a parallel corpus for Asl and he says it’s on his website but that English and ASL gloss side are just the same text file which I know isn’t right because he has a sample corpus with 80000 sentence pairs which is right but it wasn’t enough. I tried it and it worked for the test file but falls apart when used else where because of the lack of training data
I have been wanting just the multi-channel audio feature to be rolled out publicly for a while now. Language tracks are the obvious use, but imagine all the times the music in a video was too loud or just annoying. Imagine being able to mute just the music. Or with sports broadcasts, being able to turn off the crowd sounds or the commentators individually. One day, bandwidth might even be good enough for live content to have multiple channels that you can individually adjust. No more trying to tell a streamer that their music is too loud or the game is too quiet.
Translation to Arabic is Ok..as a matter of fact it's better than expected..but the disparity of timing between paragraphs is odd and can sometimes be confusing.. however , this step in itself is very promising..thank you for your effort
As a modest linguist, Thio, I found this utterly fascinating - and brilliant! In my late 70's, oh to be in my teens again studying languages, when in those far-off days so much of it was a book-driven drudgery! I have reached the conclusion that today's youngsters now have no excuses for failing an exam, given the astonishingly wide range of 'aids' available.
As an argentinian, this is very a very good and revolutionizing tools, specially since it's just begining, but it has some flaws and i want to point out a major one for me: For some reason, the program that translated the subtitles to other languajes took every set of subtitles as a different sentence, which if the other languajes has different rules to english it will probably screw things up. For example(screw ups in spanish only): -In english, adjectives go before nouns. While in spanish, adjectives go after nouns. This can cause that words are put out of order, and worse, if the adjective in english is also a noun. For example, the sentence "i like strawberry flavoured gum", which should be translated as "i like gum with flavour of strawberry", is translated as "i like the strawberry. Gum with flavour" -In english, objects are not gendered. In spanish however, they are(they shouldn't be and their gender is arbitrary, but that doesn't matter), if the pronoun and the noun are in different subtitle sets, the program may use a different pronoun than the one it should. -In english, verbs are usually conjugated by putting another word before it. If the word before the verb and the verb and in different sets, the verb is conjugated incorrectly and the sentence doesn't make sense.
adobe premiere has some tools for isolating audio down to the word level with an automatic transcript, I think. I wonder if it could be possible to have some auto translation plug-in over there? Feed it an audio track in english and it will spit out a 12 language audio track. this could become some kind of new standard in media.
This was really weird watching you in german (especially because of those small errors which are to be expected) Edit: I also really want a french translation
It sounds super easy, but it really took me long to realize that at the end, you have to pay a lot to be able to make this voice over with either google cloud or azure, and then if you have to pay, maybe it's better to pay a voiceover actor it will be cheaper, easyer, and more natural. Maybe it's me that i can't put this python to work, but at the end everyone is stuck at a paywall.
If anyone from Google Cloud or Microsoft Azure are watching this 👀 The official SSML (Speech Synthesis Markup Language) spec actually does have an attribute called "duration" for the "prosody" tag to say exactly how long the synthesized speech should be, BUT no services seem to support it. (UPDATE: Azure now does support a duration tag!) 😔 (Amazon Polly actually does but only for non-neural voices) Please add support for this tag, because it would make it WAY simpler to accomplish all this! It would eliminate the need to stretch the audio or do multiple passes.
------------Some Additional Notes for everyone else:
• Update: The app now supports DeepL!
• Obviously it's not anywhere near perfect, so it is probably best suited for viewers who wouldn't be able to watch the video in English at all. To avoid UA-cam automatically switching to the dubbed track for people who have a very high likelihood of understanding English anyway, I might consider just doing it for languages with fewer bilingual speakers.
Like this! Please make romanian
second
I assume it's because they're going to do the same as you and generate the dialogue twice.
Even though translations are great, I hate the implementation of it, in the sense that there is no setting to enable or disable whether it automatically selects the language your app is set to.
The same for video title translations.
@@lajawi. Exactly! UA-cam should either implement a switch to turn it off completely or ideally let us choose which languages we understand natively and only switches to a native language if the video is non-native.
as a Brazilian, watching you dubbed in Portuguese feels like I'm watching a documentary on a cable channel xD
Yep, I agree
me assustou pra poha
@@torpedo_ tbm, com essa dublagem estranha do nada kkkk
God, yeah, it feels so weird.
Agreed
You should use the API of DeepL, it is much faster and more accurate than the Google translator, in fact, I don't even speak English, all this text is completely translated with DeepL.
(I say this because the Spanish audio, although understandable, is poorly translated).
I'll have to look into it
I came here to suggest the same thing, DeepL is much more accurate than Google!
Sora the Troll even recommends DeepL when it comes to English Japanese translations.
Yup, as a German student i can say DeepL is much more accurate for french and English than Google Translate
Somewhat this is what it's lacking for me because some wording use for other words is not refined to native speaker
As an Indian, I can confirm that the Hindi dub sounds ridiculously good. It even uses less popular words from Hindi which aren't very common in day-to-day conversations, but are an essential backbone of Hindi. Impressive stuff!
the voice sounded very fast and very slow different times. it even overlapped in some times. its good but it still is very flawed.
Ya it uses some 'less popular words'
It's the only problem
I hope there's some Hinglish engine which uses words we are more familiar with!!
@@lucky_lol I've been using shuddh unadulterated Hindi for a while now and its such a flex tbh. So many people are losing the essence of the language.
@@kindofanmol yes some words rarely used in daily Life,and lack of human emotions, only straits reading
Hindi me bole toh- fill nahi aa raha hai,
But ye kaam karega
Main start kar raha affiliate me logo ke hindi videos ko English me dub karna
Bahut paisa Hai
@@Explorerladka2001 oh dubbing as in subtitles? kidhar?
Having listened to the Italian version, I have to say, I am very impressed with the accuracy of the translation and the voice generation, up until you start talking about technical stuff, where terms like host, play, run, are translated out of context and make it more difficult to understand what you say. Still, I’m very impressed and can’t wait to see how this evolves!
im italian too!
@@Dmbniancsa buonasera signori!
I’m also listening to the Italian, and the voice is pretty concise but at times I get hung up on it
Tachipirina 500 if you take 2 it becomes 1000
Le file però lol
He is literally *REVOLUTIONIZING* UA-cam
Its funny how he used to be a troll and disrespected in the beginning
@@ali99_82 and now he is so wholesome
@@ali99_82 I still remember that video where he said if you wrap your phone in tinfoil and charge it, it will charge faster
Russians have same fuction at least one for year. ^_^.
@@KingLarbear no
From being a parody channel to revolutionizing UA-cam, ThioJoe came a long way!
When he said the new settings option I proceeded to listen to the whole thing in German. I don't speak German.
@@jimschips254 I do speak German.
The german audio track sounds robotic to a native speaker.
@@jannikheidemann3805same thing with english, and every other language
i still sometimes have hard time believing what he says
@@poopiecon1489 lol
Your work is simply amazing! A doubt about the function of audio tracks on UA-cam. Are you able to add audio tracks to old videos, or is this limited to future videos only, and can only be added at upload time?
slv fiaspo
Yes, you can easily add them to videos you already uploaded. As example: Mr. Beast, he used this option in some old videos, for example: Mr Beast "Golden pizza" video were created so much before after this function. I hope it helped you :D
@@Loneslol Mr. Beast is a guy that UA-cam could easily make an exception for modifying old videos. So my doubt about this dubbing option still exists.
@@Fiaspo Makes Sense, but alot of other youtubers do this Fiaspo, JuckSucksAtLife did this for one of his old old videos for spanish for texting, since he were thinking of removing the JackSucksAtSpanol channel, and theres some mini youtubers that did that, i remember people with like, 5k, 60k subs doing that, and since you are a Very big youtuber, i guess you could actually do It, and Ive been studying alot about this feature, and if you want i can contact you things like how to apply and things like that.
Add: Mr Beast were only the 192nd youtuber to have this added to their Channel, and this function is already quite old, and UA-cam probably have added more features to it.
I'm japanese and so superied at how natural the translation is. I never saw anything like this!
Yandex Browser
Si, se escucha muy natural, parece una persona real que lo tradujera
The speed up and speed down is a bit odd at times, but it's actually surprisingly good!
最高
日本語で言っちゃうけどこれは作者に脱帽するレベル
字幕だけ日本語対応してる場合はラジオ感覚でざっくり理解しながら聞くっていうのが難しかったけど、音声も他言語に対応するようになってUA-camの利便性がめちゃくちゃ向上した 感謝感謝
In the Russian Yandex browser, there is a similar function on all UA-cam videos, but the list of languages is limited. There are only the most popular languages. The voice in Yandex is much better and more human, and the translation quality is good. And he also gives each person in the video their own voice. If two men speak, then they will have different voices, and women will have a female voice.
Эль Примо
Да нах он нужен
@@damedane5981 чтобы видео смотреть на иностранном языке.
@@costyapetrov1067 ну а какой смысл? язык лучше знать. единственное что на ютюбе появиться больше возможностей для заработка, яндекс тут не нужен
Рил?Я его не скачивал из-за того что он жрет много
Let me tell you, I tried your video translated to Spanish and even though is not 100% perfect it is very very good. The translation and synchronization was almost perfect. Great job. 👍
Yep from my understanding, Google Translate is really good
Man what's up with people low standards? do we Spanish speaking people are so desperate to get attention from the people at the north? Lmao
That dub was trash, the sync was terrible, no attempt to match lips, generic monotone voice not even close to Thio's
You can get dramatically better results asking an amateur south american fandubber on fiverr to do the dub for pennies lmao
Impressive work, there's no doubt about that, great job indeed. But all that effort for a less than mediocre result? the least we can do is be honest with Thio about the results so things can improve in the future
@@GianniLeonhart I could report your comment as racist or discriminatory but, them your comment would be deleted and other people won’t be able to see what kind of poor mentality you have. Sorry for you. 👎
@@rafaeltorovip lmao
even i am a spanish speaking person myself yet i think the translation is crap
@@SurmenianSoldier I said that the translation was not perfect but, it is the beginning of something that eventually is going to improve. Everyone just wants to criticize and complain about the efforts done by others but, the same people complaining and criticizing do nothing. Always is like that.
This is revolutionary! I thought Mr. beast hired an interpreter/translator to dub his voice in some of his video for Indonesian audience.
Actually yandex can translate videos. It has been working for a year. Now it can translate only into russian from english, german, french, spanish and italian. Also translation from chinese into russian will be available soon. It can't translate into other languages, but it translates much better. It detects voices of people, and gives each person his own voice (yandex has 6 male and female voices). It can translate any video with supported languages, it uses voice to text (It does not use youtube automatic subtitles). Also stream translation is on beta, but now we can use it only on limited amout of channels like NASA, english speaches.
huh interesting, ill have to try it out thank you :)
Oh wow this must be really cool for Russian speakers
Was just about to mention this, heard this is a very useful feature from some of my Russian friends (that only speak Russian)
Yandex has a youtube competitor?
Edit: I searched up, it's called "Yandex Zen"
@@danko5866 I think it's not Yandex Zen. Yandex Browser - chromium clone with yandex services. yandex zen is another service. And it's not part of yandex now.
Love this so much! I don't think all channels have the ability to upload videos with multiple audio tracks yet (only my big channel has an audio track thing next to where you add subtitles) but this will be really useful in the future. One feature I would love to see in this is the ability to set multiple speakers.
@Zaydan Alfariz idk I have the option on channel that has less than 10k subscribers but I did submit my passport for identifying verification for using advance features. Also you can't add additional audio tracks once your video has gained 100k views, it greys out the option after that.
Вау. Хоть и звучит русская дорожка не идеально, но это всё равно очень круто. Наверное, пока что я бы использовала английскую дорожку, так как в целом понимаю всё, что говорят, но с видео из других, не англоговорящих стран так сделать не выйдет и данное приложение будет очень полезно.
у Яндекса лучше
@@shikajf нет
@@shikajf у Яндекса буквально так же :/ я смотрел несколько обучающих английских видео на русском и во всех них диктор либо быстро либо медленно говорил
@@rodion_runchev Я думаю у яндекса работает лучше именно для перевода на русский. Я прочитал в другом комменте, что он умеет давать разным людям разные голоса (всего есть 6 мужских и 6 женских). Но автор тоже сделал реально круто, плюс работает переводит на многие языки, так что более универсально.
Я просто хочу функцию добавления аудиодорожек к оригинальному видео, чтобы можно было смотреть оригинал в фанатском переводе
6:17 Note, the feature to speed up or slow down audio is actually just a built in browser feature, not a youtube-specific thing. UA-cam just uses the browser's integrated video playback feature, which supports playback rate. It also doesn't sound all that great. But Audacity can do something similar if you disable "high-quality stretching". And yes, for some kinds of audio (notably speech, but not music) it does sound better.
yep. just about to write that
На самом деле это хорошо, что появился такой инструмент. Хоть русский перевод и не такой идеальный как в проприетарном решение от Яндекса в их браузере, но все равно это очень не плохо
Как раз таки решение от Яндекса, имеет абсолютно другой механизм перевода. Поэтому в данном случае, очень глупо их как либо сравнивать.
@@ЕвгенийДемьянов-х2ь понятное дело, что там используется полностью другой механизм перевода, но идея коммента была в том, чтобы сравнить итоговый вариант
En español hay veces en que acelera la voz y también hay veces donde la desacelera, pero la voz se escucha muy natural no parece uña Inteligencia artificial
Однако в яндексе голос просто перекрывает видео со всеми остальными звуками (ну вроде так было). А тут же можно иметь отдельный файл со всеми звуками, которые будут применены ко всем языкам.
@@nutsalhan8729 la polla esto la verda y tormundo hablandon su idioma
Good stuff! Do you also see an impact on your metrics like retention and average view duration by dubbing over a video into multiple languages? I've found that by including high-quality captions, videos see higher average view duration and retention with manual captions vs. auto generated captions. I would hypothesize that the impact is even more profound when you dub over a video. There's a reason Mr. Beast is hiring a crew to dub over all his videos into multiple languages to improve the accessibility.
Wow Kevin ❤️
Imagine merging this script with Magenta synthesis project where sounds will sound natural and musical instead of robotic.
Wow, One of my favorite creator commenting on a video of another one of my favorite creators.
@@just.nobody same. I didn't know he watches ThioJoe
Same
今までは海外の自動翻訳された動画を見るときは字幕を追うので必死で映像を見れませんでしたが、音声があることで楽に映像を集中して見ることができるようになりました!!
これはすごい!!
UA-camの公式機能として欲しいですね!
数年前に自動翻訳の字幕で海外の動画も見れるスゲーってなってたのにもう吹き替えが来るとは!
そうですね
Here's the transalation:
Until now, when watching videos that were automatically translated overseas, I was desperately chasing the subtitles, so I couldn't watch the video, but now that I have the audio, I can easily concentrate on watching the video! !
it's amazing! !
I want it as an official function of UA-cam!
Também acho incrível poder ler seu comentário traduzindo automaticamente pelo Google. A internet está ficando incrível. Conhecimento ultrapassando as fronteiras. 🥳👏🇧🇷
@@adonesjb eu tbm rsrs
I love that you so heavily add comments in your code. I know a lot of experienced devs in the industry prefer not to add comments on code that is "self-evident", but what is self evident for them isn't necessary so for me.
This is very wonderful.
I could hardly feel any discomfort when listening to the Japanese audio tracks. However, I think Deepl is more accurate than Google Translate when it comes to translating between Japanese and English. This text was also translated by Deepl!
Let me know if there is a difference in this. I'll write this in DeepL and next one above it is gonna be from Papago. I use that to translate Japanese to English . So do tell me.
この違いがあれば、教えてください。 これと異なる点があれば教えてください。これはDeepLで書いて、その上の次のものはPapagoから書くつもりです。日本語から英語への翻訳に使っています。だから、私に教えてください。
This is Papago one. Let me know how accurate it is compare to the DeepL . I use this Papago to translate(right now it's with Honorifics) and Romaji to translate from Japanese to English. It should be more "accurate" since its a Korean company and Japanese and Korean have similarity.
これはPapagooneです。 DeepLと比較してどれくらい正確なのか教えてください。私はこのPapagoを使って翻訳しています(今は敬語と一緒です)。ローマ字を使って日本語から英語に翻訳しています。 韓国の会社と日本語と韓国語は似ているので、もっと「正確」でなければなりません。
korewa poppagūn'desu dīparuto hikakushite dorekurai sēkakunanoka oshietekudasai watashiwa kono papagōo tsukatte hon'yakushiteimasu imawa kēgoto ishodesu rōmajio tsukatte nihon'gokara ēgoni hon'yakushiteimasu kan'kokuno kaishato nihon'goto kan'kokugowa niteirunode motto sēkaku de nakereba narimasen'
@@Enforcedcraft I found DeepL more accurate than Papago.
DeepL has a fairly natural way of translating Japanese grammar and words, making the text very readable.
@@s0haku I agree, DeepL is my go-to for Japanese and Papago is for Korean
hmm, the sentence "I could hardly feel any discomfort when listening to the Japanese audio tracks" seems like a bit of a weird way to phrase that thought. Is that exactly what you meant, or was Deepl doing something weird?
@@vibaj16 there are a lot of similar things even in human translated text. You would probably have to sacrifice some accuracy in order to avoid this kind of thing.
Now I'll wait for Linus tech tips to review this app.
Great work!
BRO IS MAKING UA-cam BETTER ALONE FOR FREE
"alone for free" - like literally any other open-source programme?
I know this can be helpful for a lot of people, but I don't like that for me as a German it starts the video with German audio track and I manually have to switch to the English track, although English is one of the preferred languages of my Google account. The same problem actually happens with subtitles always showing even for languages you know, but you can turn off available subtitles always showing (of course they then also don't show for languages you don't know). I hope UA-cam will add a setting for not using alternate audio tracks by default (so far there isn't one), before this feature gets widely used.
If you know the original language well enough, the original will always be better than a translation (even if the translation and voice acting is done by humans). (This is because some things like puns don't always work in the other language.)
I think this is a problem that many sites from the US (and other mostly monolingual countries) have, that they don't really let you set multiple preferred languages (or in Googles case don't consistently use that information).
(The worst case I have seen is on WattPad, where you have to change your language settings to find books/stories in a different language.)
PS: The German audio track sounds pretty good, but it has a few weird quirks:
At one point there was a significant jump in voice speed to a very slow speed that sounded unnatural (maybe you should cap the slowdown to some reasonable value and just pad with silence in that case).
Also the AI didn't know how to translate 'you' and sometimes went with the honorific 'Sie' and sometimes the more personal 'du' (on UA-cam it should always be the personal, unless your demographic is over 50 and you want to treat them as strangers), sometimes it should have translated 'you' with 'man' when talking about general instructions (in sentences like 'So how do you create them?') and instead went with 'du'.
I'm a German too, and Google just doesn't care what your preferred languages are, it will always show any translations that are in your native language: titles, descriptions subtitles, and now audio tracks too. It's been like this forever and I doubt they're going to do anytging about it even tho it's super annoying :/
This is quite amazing. Good job Joe. Appreciate all you do around UA-cam.
As a Spanish speaker I could say that it is a great tool, it is very interesting and capable to do great stuffs, but it has their flaws. Listening this video in Spanish I could say that it feels unnatural.
I think that almost everything is properly translated, but the problem with that is on English, the speech has their own rhythm, but on Spanish because of the words and structures, there is some phrases that are really slowed down and the next one is speed up, and you have this some sort of roller coaster that can easily confuse and annoys you.
I think, on companies that dubbs content, this issue is resolved by re translating the phrases to make it ended with the same length as the original speech, e.g.
Spanish
10:17 "Entonces de esta manera, todo esta cubierto"
English
10:17 "So this way, everything is covered"
The speed on English is almost double as the Spanish translation, so on the Spanish audio it sounds really fast, but to make it sound at the same speed as the English audio, we could say and use instead:
Spanish
10:17 "Entonces todo esta cubierto"
This is just an example, I am pretty sure that there are better ways to translate these as a context.
Also, the language that you use on your original audio should be Neutral, because when you started to talk about prices, everything was "converted" to the target language, e.g. "$20" does not sounds like "$20 dollars", instead sounds like "$20 pesos" which is a lot cheaper of what you think, and could confuse people; maybe next time say literally "20 dollars". Also, for example, there are "North American Expressions" that cannot be translated and made it understandable by other languages and cultures.
As I said at the beginning, is a great tool, but needs some tweaks (and also consider taking advices from the subtitles community, which most of them translate subtitles and knows better how to do it).
Si, exacto, claramente esto se puede solucionar con una revisión del subtitulo traducido usado para generar el audio, pero ya involucra intervención humana.
i think the same mimic of lips dont sinc with sound. and some time the voice go fast or to slow. That is because the sound have to much the second on srt subtitles.
Vamos que no me gusta. Prefiero subtitulos bien hechos que abarcar de todo y que esten a medias.
@@slevinshafel9395 Yes, a way for the program to read the original audio and try to match vowels and silences would be cool to see, but again, dub studios do this manipulating the script to match it better. Maybe we will have ai that does this at some point.
This will certainly be improved over time with A.I. However I think the options should be always available. I like watching media in it's original language, whatever is it, and with subtitles. So, I don't like that the app chooses automatically to play the dubbed version, even if I have my settings in general for not doing so. But as there is the option to change it, it's all Ok. I hope that option still be there in the future...
Aún así está bastante bien
As an Arab, Dubbed version in Arabic is way better than what I expected. Still needs some edits here and there but overall good enough to understand. Great job as always ❤
Yes
fr fr
Reminds me of spacetoon dubs ☺️
But also that it pronounced j in Arabic as g
I have created my self some scripts to aid me in editing videos, I have one that detects the intensity of the voice to make smaller or bigger text for overthevideo subtitles, so I already do make srt files for all my videos, this sounds like a great addition to add more ways to make content avariable for everyone.
Empresas: es caro y mucho tiempo e imposible hacerlo fácil para que parezca traducción oficial
Este sujeto : gratis y casi automatizado
aun le sigo buscando la explicacion de porque para una empresa es caro pero para alguien independiente es barato
@@coca7895 eso mismo... Muchas veces no es solo caro del todo, sino sacar mas ganancia
@@santiagocabascango6514 Exactamente. También es por la "mentalidad de empresa".
@@coca7895 Me imagino porque buscan perfeccionarlo... creo.
I immediately checked the German dub (my native language), even before the involvement of AI was revealed. It was instantly apparent from the abysmal grammar. But fair enough, it's better than nothing
I should mention: as a computational linguist o adore this project from a technical standpoint! Never think the effort isn't appreciated
I had the same feelings on the German dub. Terrible grammar in lots of places, but better than nothing. I'm not sure if I prefer it over just watching in English tbh.
@@DigitalJedi I was so damn confused when the bad german started... I switched back to English so fast.
Tested the Hindi one, would be very cool if youtube inplements that natively. Definitely would remove the language barrier
significantly. Really appreciate your initiative.
Yes, the translations are pretty accurate and will definitely help people connecting all over even more.
As hindi speaker, it's very good features it was easy for me to understand content
But, it's very pure Hindi translation you have to turn on english subtitles to understand them
The Hindi translation is absolutely spot on but it is a bit too pure when compared to the Hindi that is commonly spoken. It does throw around a healthy amount of English words but a few more here and there would make it perfect "Hinglish"
Considering India is such a massive market for UA-cam, I feel like this can be such a game changer
I'm Pakistani, and Hindi sounds similar to Urdu to me. Both languages have many words that are the same, but some can be different.
Jeez, it frightened me when the video started. At first I thought it was an ad, but then i realized it was the actual video playing. Cool function
Speaking two languages i can say the dubbed version works fine, it's got the issues you'd expect from machine translation and speech synthesis but i can understand what's being said, so good job!
Are the subtitles machine translated? I was under the impression that they are actual translations, otherwise there isn't much point as you can auto-translate youtube subtitles anyway.
@@Leonhart_93 UA-cam auto translation is a one-size-fit-all solution, so it still make sense to run machine translation manually. That's said, in its current state the difference in quality probably isn't that good yet, especially that technical words are just as easy to fail.
From making troll videos to actually programming things that improve UA-cam experience. YT should just hire you at this point.
Watching the german dub I have to say, I'm quite impressed with the result. I think someone hearing just the dub would be able to understand pretty much everything. However there are a lot of weird phrases. The vibe of the german dub is very different. I think most of the problems are because translating the trascript snippet by snippet, there is a lot of context lost and it shows in the translated text. Another thing is that the translator tends to use very formal speech, so it sounds more like a coporate presentation than you just explaining your program to a casual audience. Sometimes the AI voice is unusually slow/fast and it sound like a bit like those shady ads (think: "This self taught kid genius build a simple device to cut your energy consumption in half. Scientists hate him for it"). All in all still a huge leap in accessibility (even if I doubt most people know where/how to switch audio tracks).
Exactly, 100 % agree
Honestly its kinda crazy youtube waited so long to add it. UA-cam already automatically translates the title into your language. I remember all the way back in like early 2015 i kept clicking on Spanish Fnaf UA-camrs videos and being mad they were in Spanish, despite being no indication of it. People were also already making dub channels, such as El Smosh.
Watching the Japanese dub, you might want to allow some sentences to have a maximum slow speed (so short sentences don't get stretched over originally long sentences), or merge sentences that have a completely different duration than their original sentences to prevent rapid fire sentences being immediately followed by drunkenly slow sentences.
For the second one, it is also needed when sentences are fragmented over multiple subtitles and the translator bugs out thinking they are competely separate sentences.
Speaking Indonesian here, it has the same problem
I feel like this is useful for monolingual audiences, but I'm bilingual and prefer to watch videos in their original languages without being automatically translated. I have my UA-cam set up in Japanese because it kept translating video titles from Japanese to English. It usually doesn't happen the other way around. I don't need the translations and it often ruins the original meaning of the title. When I clicked on this video it started playing in Japanese. No thank you. This video was originally recorded in English, I want to watch it in English. UA-cam really needs to work on making options to disable certain translations for multilingual people, or at least provide an option to not translate at all.
Most people aren't multilinugual and this is what it's for.
@@lucadoesthings Right, but UA-cam's default behavior is to translate video titles and descriptions to your interface's language with no regard for whether or not you can speak the original language of the video in question. There's no way to disable this behavior, which is fine for most people, but not for me, and I'd imagine most other multilinguals. Remember, UA-cam serves hundreds of millions of people, and one percent of 100 million is still a million. Even if only one percent of UA-cam's users are multilingual, that's still millions of users being done a disservice. It's be a shame to see the audio go the same route with no way to disable or change this behavior.
@@lucadoesthings Most people _are_ multilingual. Very common that people speak two or three languages.
@@Liggliluff Prove it. I don't think so. And knowing 20 words in another language doesn't make one multilingual. At least in the USA.
@@mrcryptozoic817 Murrica is not the entire world. There are 5 whole other continents filled with people who learn English (or some other language, German is common here too) as their secondary language.
its so weird remembering you as the guy who made troll tech tutorials like 7 years ago and seeing you do wonders for the platform now. not complaining tho, you're amazing
I'm Brazilian and I'm in love with this resource, I found out about it in a video on how to make a blog from scratch, it was an American video, I was scared by the guy speaking my language! I jumped and went to the UA-cam player and saw an additional feature of the UA-cam player! Dubbing!! This is so perfect, you really seem to speak Brazilian Portuguese NATIVELY! WTF
It was one of the best experiences I've ever had watching a video on UA-cam! I played around and changed the audio track a bunch of times, turning on and off the captions, it was amazing seeing how it works, nice work!
I cannot express how cool i think this is.
I would never have imagined that dubs like this would be possible!
Такая функция уже давно есть в Яндекс браузере. Там если ты смотришь видео на английском (даже прямую трансляцию) оно будет переводится на русский голосом ассистента Алисы (аналог Сири в России)
Но иногда на английском могут слова некоторые зажеваться, из-за ветра или ещё каких-либо посторонних шумов, которые Яндекс уже не переведёт
+Яндекс доступен не всем и не везде. И браузер, и весь остальной сервис.
@Mister Chuvacok Алекса от Amazon
@@MrBrigadierArchived Ютуб не доступен в Китае(как минимум), а гуглосервисы ограниены.
@@Orakcool учитывая политический характер, то проще говорить что вообще доступно в Китае, чем то, что там запрещено)
日本人ですが、話し方は驚くほど自然ですが、字幕の表示されている長さに合わせて話し方がゆっくりになったり早くなったりするのが面白いと思いました。
Cool!
Btw, Yandex had already made fully dubbed multi-voice UA-cam video translations and now finishing work under the Streaming dubbing )
Would be interesting if LTT would be able to integrate this with their new ai voice Spanish workflow
finally i can watch thiojoe subbed in japanese the way god intended
More like thio jo jo 😎😎🥶
You mean "Dub" ; "sub" is subtitle
HOLY SHIT THIS IS AMAZING
I am watching in Japanese with English subtitles. My dose of anime for the day 😂😂😂
ThioJoe is the best anime
When the video suddenly stopped for a brief moment to load because it hadn't buffered enough it reverted back to English, so maybe there's a bug that automatically changing the quality reverts the audio language back to the original?
Eu ao contrário do que vcs estão dizendo, achei fascinante isso, e ajuda muito. A voz está bem natural, logicamente que não será igual uma fala natural, masssss é só o começo, e para um começo está muito bom de verdade. Isso me deixa muito empolgado pois facilita ver qlqr áudio. E ao contrário que alguns estão dizendo, essa ferramenta possui função para facilitar o entendimento de um vídeo e não para substituir dubladores de filmes ou afins. O programa pensa em transmitir a informação do áudio em qualquer idioma e não passar emoções pela voz igual um dublador faz.
Even though translations are great, I hate the implementation of it, in the sense that there is no setting to enable or disable whether it automatically selects the language your app is set to.
The same for video title translations.
i agree, the only way is to change the system language on android
Bruh that dub feature is SO AWESOME
I think this is soo cool, I think in the future aswell we could use this for audio description too to make this more accessible too, love this!!
I'm not gonna lie, I'm very surprised and delighted that you chose to do Indonesian audio. It's very rare to come across non-Indonesian content creators who even has Indonesian subtitles for their content, let alone dubbing. Many kudos to you 👍👍
Also, the translations is very good, just that it's very rigid. Indonesian uses the passive voice much more often than English, so some sentences seem rather superfluous. Still, it's very good for AI!👍
This is definitely one of my favorite channels on UA-cam 🤩
Oh my gosh, you literally made UA-cam better for other people who speak another language. I would like to add that maybe you could also do Lithuanian? For my fellow Lithuanian speakers. Thanks for your work we really appreciate it. Keep it up. (Every time you make a new video I'm very excited)
mabey but mabey not too i remember when i was like 6 or 7 and didn't know how to talk with english while playing crash so i wanted to know how to talk with it no matter what and this is probably why i become good at it
but if everything will be just translated in the next some years there will be 0 point on trying to learn a new language because you don't need to
@@alhusseinhd2095 true
Hey dude, how did you get access to the feature? Tried contacting YT via live chat but it went nowhere lol
If he doesn't make a video about it, i also want to know Lol
Oh Hello Weegeepie!
What u doin' in here Weegepie?? xDD love your videos
I want to do it too - damn UA-cam srupid chat !
the spanish version sounds good, few grammatical errors and verbal conjugation, especially the times, but the translation is good
Listening to the Japanese version - I am surprised how accurate it is, although the way the speed changes constantly reminds me of how you change the rpm on a record player.
CARAMBA QUE INCRÍVEL!!!!!
I'm an Arabic speaker, and honestly other than a few hitches, it works really well! I still needed to get used to the constant slowing down of the voice to match your speech, but I think that's a good compromise for how revolutionary this tech is for youtube videos!
It was really cool watching your video in Portuguese! It felt like low budget dubs from TV shows that do not care about lip sync like documentaries from Discovery Channel for example, with the added detail that the narrator speaks suddenly fast or slowly to keep up with the original video. Personally, I would prefer the awkward pause to the guy going Turtle Mode every once in a while, but what you did is really cool! You make an amazing software developer and I can't wait to share your future videos to my Brazilian friends. Cheers!
Yes, the software should run one pass, and look at how much each clip needs to be sped up and slowed down. Then it should try to even out the differences, so you don't have a fast clip between two slow ones.
You should have an option to manually edit the written translation before creating the voice, to catch any translation errors made by the program
The two pass system is really apparent in the german version, some parts are quite quick and some are insanely slow. Additionally there are some grammatical faults which aren't hard to spot, and it is in general very weird.
Error with prices 11:07. In english the symbol "$" means USD, but audio track in spanish says "pesos", maybe mexican "pesos". In this case, $ is "dólares americanos" (in spanish).
ThioJoe: "No app with these features? Fine I'll do it myself!"
Instead there actually is
I wish more people made use of the secondary audio tracks, there are some neat things you could do. For example, there are a lot of ASMR videos that specify a specific pronoun for the viewer, or have background music that not everyone might like. If a creator wants to provide options currently they can only upload a second version of the video, which is likely to do bad things to their engagement ratings since the viewership would be split up. Having separate audio tracks could really help channels willing to put in that extra effort.
"that specify a specific pronoun of the viewer, or have background music that not everyone might like"
uh some feature for that already exists right here on youtube, it is called 'exit the video and watch other one which u like' and for the pronoun thing, first does it matter that a random youtuber who is barely aware of your existence doesn't get your pronoun right? and second most youtuber refer their audience as YOU or some cringy name what do you wana fix on that?
different people have different preference and nowadays different pronouns too, its hard to include all people in a ~10 min video.
@@trashpanda4 I don't care whether a UA-camr *does* it or not. My point is that the ones who do are *punished* for it, and that sucks. I'm not sure if you watch ASMR or not, but part of the process involves being relaxed and comfortable and some people are not comfortable being referred to by pronouns they don't like (or just don't like music with their sounds). Typically they'll just skip the vid, but sometimes there are creators who are awesome and literally make different versions with different pronouns (or with and without music, ect.) and then publish both even though it's going to do bad things to their overall metrics because very few people will watch both versions. I'm not suggesting we force creators to do this, I'm saying that there's already people who do, and this would be a system that would allow their hard work to not actively harm them.
@@MudakTheMultiplier UA-cam algorithm favours quantity over quality and if its a well established creator then the *getting punished* part is making the exact same video with slight variations.
all things besides, idk what asmr people do over there but isn't there only one 2nd person pronoun which is 'you' (in English) unless its a group conversation simulator asmr how does pronouns enter the game?
@@trashpanda4 you greatly overestimate the viewership of most ASMRtists. I would say the bulk of the mid to large scale ASMRtists get 30-60k views on most videos, which is absolutely in range where getting one video that doesn't do as well will drop overall metrics. And several of the people I've seen do this actually only have one public facing video. And then private the other one and link to it in the description to combat this, which will result in some nontrivial fraction of your viewers clicking through to a separate video which will not contribute to algorithm weight.
there are other gender defining terms (boyfriend/girlfriend, brother/sister, and so on) and it's not uncommon to refer to the listener in third person if only to cut down on saying "you" 50 times in one script. Unfortunately the alternatives are usually a bit clunky. Often a creator will pick the pronouns that match the bulk of their viewers, which then means that anyone who feels uncomfortable just loses a channel that they might love otherwise. That's why this feature is so cool because it's literally designed for accessibility, even if the original designers in 1990 would have never expected it to work like this.
@@MudakTheMultiplier oh that makes sense, i personally don't not watch asmr content so i have no idea what happens there
3:56 whoaa, thank you for including indonesian dub
For German:
- A few grammar/translation mistakes
- Audio speed changes are noticeable
- It switches between the impolite "du" and polite "Sie" for addressing the audience
- At 15:00 there is some weird overlapping of the dubs
But: this is truly amazing work. The fact that these are the only things I've noticed given that this is all an artificial translation is mind-blowing. I'm so stunned right now 🤯
Actually there is also DownSub which allows you to download the titles of the different languages, you will have to upload an SRT file of a language to the video which in turn translates it DownSub to detected subtitles :)
1. Checked out the Russian automatic dub, it's not bad. There are some mistakes in translation and the voice's speed sometimes almost matches the speed of a machine gun. Deffinetly this needs to be a semiautomatic tool.
2. If you continue using this tool for your videos, why not turn your voice into AI generated one?
3. This tool reminds me of Yandex Browser's realtime video translator.
he talked in the video about how doing #2 is incredibly exprensive
This sounds pretty good. There are some flaws (emphasis, pronunciation of some words in English, but these are the problems of Azure voice itself), but in 90% of cases everything is just fine. Of course, it does not reach dubbing, but we remember how it was in The Witcher 3 with shrinking and stretching audio tracks.
I tried Arabic. In reality, spelling the word and letters and also punctuation are so weird
also the dude speaking is the same voice in the national geographic documentaries lol
It's really impressive that all of this can now be automated. I've tried several languages I'm familiar with, there's often a significant difference in speed between each sentence as expected, but the only way around this would be to change the translated text to reduce these differences (perhaps approximating a proportion of the original number of syllables , though not necessarily). Future work. But that alone meets many dubbing needs.
German has a little problem with pronouns.
In the second person singular we have "Du" and "Sie". "Sie" (not to be confused with "sie" without a capital s, the feminine third person singular) is the formal form, used for example in official letters. "You," on the other hand, is the more personal form, used in conversations with friends and family, for example.
Google Translate uses "you" because it translates formally, but it felt a little strange to be called "you" in the video. "You" would be a bit better, but I don't think it's the best solution either, and I don't know why I feel that way. Translating other languages into German is apparently a bit difficult for Google Translate, or Google Translate is just overall bad, idk.
Still, it was very interesting to see how far we've come with our technology when it comes to translating audio tracks, and that a script by just one person can do the whole thing relatively well.
LTT is doing something similar. They have made a demo for Spanish for an English video.
I can understand Urdu so I tried the hindi language track in this video as it is closest to Urdu. What I found was at some places the voice was too fast. Then it went to normal. At some internvals the speed of the translated voice was too fast to understand properly.
Indeed, it speeds up and slows down way too much
Yandex made it intergrated in its browser at least one year ago. Russians can watch translated youtube videos in 1 click.
That's coz the ai uses a lot of words to translate what Thio said.
And it uses pure Hindi and not the modern Hindi language (which is an amalgamation of Hindi, Urdu and English)
plus efficiency is something that can only be achieved with context awareness
yea and I felt like I was watching some Hindi dubbed documentary on Nat Geo 😂
Yeah I remember that, I have my youtube in Spanish because that's the country I'm on and youtube wont let me change it, got into LTT's video and thought it was just some joke until I was 6 minutes in thinking "alright when are they going to switch back into English?"
This is going to be a blessing and a curse for multilingual people. Neat
As an english speaker, I'm very surprised from the quality of the voice translation! It feels almost life-like!
No me lo creo, está herramienta está increíble, ojalá pudiera ser más fácil el entrenamiento y transcripción... En un futuro seguro será algo como sube tu vídeo solo con el audio de voz sin efectos y verifica que la transcripción sea correcta y el programa se encarge de lo demás.
Increíble trabajo! Es muy inspirador ver cómo usas diferentes herramientas ya disponibles para crear un servicio más complejo.
Un saludo!
ThioJoe working on adding features to UA-cam meanwhile UA-cam thinking of removing the next feature
This feature has been in the Yandex browser for six months. But there is one caveat. It translates from different languages into Russian. And this works with streams as well.
I know a Brazilian channel with that feature and it is called Manual do Mundo.
Holy crap, the Spanish dub is surprisingly good.
ikr wtf
Language stretching kinda just ruins it for me, I'd rather just use subtitles
@@xenondestiny Yeah, but it could be fixed with some manual editing, and instead of hiring an entire dubbing team, a creator can just look for multilingual editors that modify the audio file and it's timings. This is the greatest use of AI I've seen so far.
yea
Cambiar dolares a pesos esta mal, ya que 2900 dolares no son 2900 pesos jaja
Wow now that's actually revolutionary. Great work!
I found it very interesting, there are no audio gaps, all the time is well filled in making the video more direct
yes, that is so good
after this dude trolled me and so many other people into messing up their computer i dont trust him w anything anymore
@@mp5thegun what does this have to do with anything?
@@reedomu i’m jus basically sayin idk how people still watch him or maybe they don’t know the content he used to do
@@mp5thegun ???? First of all, What did he do.
Hello from Russia, the Russian company Yandex has done what it says in this video, but better)
There is a Yandex Browser where the video translation function is available, it translates it by pressing the button at the top of the video within a couple of minutes, selects the timbre of the voice being played suitable for the timbre of the person from the video (distinguishes female/male voice as well)
And watched this video through it, and works much better
Very impressive! Thanks for all the work you do for the community!
Es un excelente desarrollo! La voz y textos en español son casi perfectos, ¡felicidades! Sería estupendo si esta aplicación pudiera aplicarse cuando somos sólo usuarios y no creadores de contenidos, ¡sería un hitazo! Me apunto para la beta! Por lo pronto, suscrita y lo mando a mis creadores favoritos!
0:52 I’m basically working on the same thing but for sign language (English to ASL I’m not doing SLR) and that translation part is so hard, like I set up whisper (it also works with UA-cam videos, I actually set it up for UA-cam first and then modified it) in maybe 20 minutes, the playing signs for each word with a transcript took like 80 hours (which is probably a lie I worked on it in my free time way too much)but that translation is such a behemoth to tackle for me because surprise there aren’t that many tools that do that, so I have to make it myself kinda. There is someone who has a parallel corpus for Asl and he says it’s on his website but that English and ASL gloss side are just the same text file which I know isn’t right because he has a sample corpus with 80000 sentence pairs which is right but it wasn’t enough. I tried it and it worked for the test file but falls apart when used else where because of the lack of training data
I have been wanting just the multi-channel audio feature to be rolled out publicly for a while now. Language tracks are the obvious use, but imagine all the times the music in a video was too loud or just annoying. Imagine being able to mute just the music. Or with sports broadcasts, being able to turn off the crowd sounds or the commentators individually. One day, bandwidth might even be good enough for live content to have multiple channels that you can individually adjust. No more trying to tell a streamer that their music is too loud or the game is too quiet.
Translation to Arabic is Ok..as a matter of fact it's better than expected..but the disparity of timing between paragraphs is odd and can sometimes be confusing.. however , this step in itself is very promising..thank you for your effort
As a modest linguist, Thio, I found this utterly fascinating - and brilliant! In my late 70's, oh to be in my teens again studying languages, when in those far-off days so much of it was a book-driven drudgery! I have reached the conclusion that today's youngsters now have no excuses for failing an exam, given the astonishingly wide range of 'aids' available.
우와 미친 한국어가 들려요!! 와 세상에
심지어 대사도 어색하지 않아요...!
As an argentinian, this is very a very good and revolutionizing tools, specially since it's just begining, but it has some flaws and i want to point out a major one for me:
For some reason, the program that translated the subtitles to other languajes took every set of subtitles as a different sentence, which if the other languajes has different rules to english it will probably screw things up. For example(screw ups in spanish only):
-In english, adjectives go before nouns. While in spanish, adjectives go after nouns. This can cause that words are put out of order, and worse, if the adjective in english is also a noun. For example, the sentence "i like strawberry flavoured gum", which should be translated as "i like gum with flavour of strawberry", is translated as "i like the strawberry. Gum with flavour"
-In english, objects are not gendered. In spanish however, they are(they shouldn't be and their gender is arbitrary, but that doesn't matter), if the pronoun and the noun are in different subtitle sets, the program may use a different pronoun than the one it should.
-In english, verbs are usually conjugated by putting another word before it. If the word before the verb and the verb and in different sets, the verb is conjugated incorrectly and the sentence doesn't make sense.
adobe premiere has some tools for isolating audio down to the word level with an automatic transcript, I think. I wonder if it could be possible to have some auto translation plug-in over there? Feed it an audio track in english and it will spit out a 12 language audio track. this could become some kind of new standard in media.
This was really weird watching you in german (especially because of those small errors which are to be expected)
Edit: I also really want a french translation
same
Oui, why no French?!
german gang.
Ja man versteht schon so einiges, aber gerade die Grammatik stört mich schon sehr. Da würde ich das lieber auf english hören.
@@kkndzocker wenn mans versteht
Esto es la HOSTIA, necesitamos audio tracks en los videos y ya llevan como 2 años sin ser implementados
처음 영상을 켰을 때 유투버가 한국어로 말하는 줄 알았다.
전문용어가 많은 영상이지만 몇몇 부분을 제외하면 거의 완벽하다고 본다.
놀라운 기능을 알려줘서 고맙습니다.
Oh!!!! finally!!!! I hope this feature rolls up for more users.
How did you manage to access that option? I have no idea where to submit a request. I actually have a large channel, and I'm quite interested.
Did you figure that out, i just send a feedback message in youtube studio today?
Google please make this thing available on all videos.
Yandex Browesr already have made this 1 year ago. Sad thing - only for Russians.
It sounds super easy, but it really took me long to realize that at the end, you have to pay a lot to be able to make this voice over with either google cloud or azure, and then if you have to pay, maybe it's better to pay a voiceover actor it will be cheaper, easyer, and more natural. Maybe it's me that i can't put this python to work, but at the end everyone is stuck at a paywall.