YouTube Doesn't Render Arabic Properly
- Published 10 Mar 2023
- Special Thanks to Jafar for helping make this video!!
Written and Created by Me
Art by kvd102
Thanks to my patrons!!
Patreon: www.patreon.com/user?u=73482298
Translations:
Leeuwe van den Heuvel - Dutch
#arabic #youtube
I never realised how much this bothered me until watching this video (Hebrew has the same problem). Now I can be fully rather than subconsciously irritated every time it comes up - thank you.
It happens like, everywhere in the internet as well with many other languages but I just live with it at this point
I'd hate to see the nightmare of traditional mongolian (in the traditional columns of text going left to right)
I had to take an exam in computer science on my pc, that used both english and hebrew on the same questions..
Imagine having one more puzzle to solve before you start to do the actual exam. It was horrible.
Arabic has been the cause of a bunch of famous bugs
@@axelanderson2030 like what
As someone who has tried to do multi-language support for some programs, I can say that this is a surprisingly difficult problem to solve. Early computers didn't really think about languages other than English and there's a lot of backwards compatibility shenanigans. That said, this problem is already solved. We can see it working on one platform but not the other. If they hadn't just fired 12,000 employees, they might have had someone around to fix it.
Having done a lot of work on text in CS myself, I agree with your first sentiment. However: This is not caused by YouTube. Text rendering is done by the browser on PC, by the OS or a toolkit on mobile and even if someone did do it "manually", they would use HarfBuzz, DirectWrite, or Core Text, depending on the OS and system for text shaping, and ICU, Core Text, or Uniscribe for general text handling. Some languages also provide native Unicode support, but that's usually limited to semantic understanding of the text, not graphic understanding, such as splitting text into grapheme clusters or interpreting bidirectional text.
@@LetsPlayFolling fascinating, that explains a lot about the youtube studio app lol
@@LetsPlayFolling I agree with you to a point - they’ll certainly be using a 3rd party system to render the text, but that doesn’t mean they don’t have control over it. Part of the problem is not grouping the username together and the comment together separately, so addressing that would surely improve the situation.
@@rainbowevil i mentioned that in other comments on this video as a possible remedy. However, thats still not really youtubes fault. It could have very well been a regression.
YouTube Studio has many issues, one being no support for split screen, for some bizarre reason. Why? Apps have three settings:
- *unset:* can be run in split screen but Android gives a warning that it might not support it well.
- *allowed:* can be run in split screen without warning.
- *disallowed:* cannot be run in split screen.
So YouTube Studio has intentionally disallowed split screen.
Most apps run perfectly well in split screen, because all split screen is, is that Android pretends that the phone screen is physically smaller and lower resolution (but same density). So the 9:16 screen is now 9:8. But since apps should already be built for all kinds of screens: 9:16, 3:4, 4:5, 1:2, being dynamic in size, it should already support 9:8.
There's a way to make Android disregard this setting and force all apps into split screen. I must find a way to do this, and see how perfectly fine the YouTube Studio app runs.
The fun part is that incompatibility actually gets worse for Arabic on some programs.
Like if you try to write Arabic into RPG Maker then it'll write the Arabic characters backwards...not the order, but the actual characters will be mirrored!
I don't even know how to read Arabic and IMMEDIATELY noticed this issue and now it haunts me!
tiktok splits all the arabic letters. if you make a HUGE effort you might understand the text 😂
@@cool_bug_facts hahaha WHY
@@cool_bug_facts Maybe some of your ancestors were Turkish.
او يطلعو كأنهم حروف زي ك ذ ا مرة مستفز (Translation: or the letters come out separated, like t h i s, which is really annoying)
@@lynx3845 Sorry, as I say I can't read Arabic.
Can you translate?
One of the worst experiences ever is having to deal with parentheses in a mixed English Arabic text. Sometimes there’s simply no way to get them right
Finally someone talks about this! No matter how I type I always have to remove and type again to the point it's become muscle memory.
@@xXJ4FARGAMERXx OH MYGOD IT'S JAFAR
@@xXJ4FARGAMERXx Or there’s the technique I use when including Arabic text inside a block of otherwise-Latin text:
Type *everything but the Arabic text* first, then carefully place your cursor at the point where the Arabic should be. Paste the Arabic from another program where it’s already been output correctly.
It should then be treated as a string of Latin text that simply happens to be written as Arabic text (and maintains the directional flow of both without clumsy splitting).
Bit of a hassle, but it’s never failed yet (at least, not by itself; its “failures” have all been attributable to a mistake somewhere else!).
@@-.bella.- real
@@-.bella.- yup, that's why he made the video.
Another problem:
If an Arabic comment is long, when you try to hit the "read more" button (on mobile), it doesn't let you. Instead, it opens up the text box to reply.
Whether you hit the button itself or where it ought to be (it changes place based on the language).
happens to me too!
Sometimes the only way to click read more is to use the translate button and hope it has a read more button after being translated lol
i thought its a general glitch for all languages didnt realize i get that only on arabic
@@Kap01 or use a laptop
قققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققققق
Its actually fairly common i've seen multiple games where if u type in arabic the whole sentence would be backwards like the word u start with would be the ending word it makes it very hard to read tbh
Omg. So you had to type Arabic left to right?
Like, here is an example if I had to type English from right to left, would it be like this?:
What you want to type: “This is an example sentence”
What’s actually being written: “Sentence example an is this”
That explains why Sony gives their games Arabic dubbing but no Arabic subtitles & menus UI.
@@hiiamelecktro4985 no it would be ecnetnes elpmaxe na si siht
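To make the difference between the two guesses above concrete, here is a minimal sketch. The function names are made up for illustration, and it only models the English analogy from this thread, not how any particular game or app actually mangles its text.

```python
def reverse_words(text: str) -> str:
    """Keep every word intact but flip the order of the words."""
    return " ".join(reversed(text.split()))


def reverse_characters(text: str) -> str:
    """Flip the order of every single character in the string."""
    return text[::-1]


sentence = "This is an example sentence"

# Word order flipped, each word still readable.
print(reverse_words(sentence))       # sentence example an is This

# Character order flipped, which is what drawing an RTL script
# strictly left to right amounts to.
print(reverse_characters(sentence))  # ecnetnes elpmaxe na si sihT
```

The second result is the closer analogy to Arabic or Hebrew being laid out left to right: every word has to be read backwards, not just the sentence.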
@@modmaker7617 Arabic UIs are often completely flipped. The items that are on the left in English will be on the right. Like on youtube, the logo and navigation bar will be on the right. The video player is aligned to the right, and recommended videos are to the left.
You can switch your youtube language to Arabic to see it for yourself
This is the case in the game Im playing right now - Dota2
The bigger nightmare is while writing in Arabic the cursor goes left when you press right and the opposite is true, which makes typing a nightmare
I can't believe WORD hasn't fixed this
Pain
@@Pitbullnamedprincess69 🍞?
as a native arabic speaker i can tell you this is only the tip of the iceberg, most apps and websites just straight up panic when i type in arabic, for example they type every letter by its own when its supposed to be connected, and sometimes show arabic from left to right and a lot more bs, but thank you so much for shedding the light on this issue.
Ah yes, the three constants in life: death, taxes, and text rendering bugs.
Oh my God, this happens ALL THE TIME, and I've started being content with that like it's one of the most basic life facts. Like "the sky is blue", "5 comes after 4", and "most social media platforms can't render languages that go in opposite directions".
"most" wrong it's like every damn platform i've seen
As an Arab I agree this is a huge problem.
Sometimes on a platform/online game with a chat filter, it will not allow me to type letters
On some social medias it will make the letters backwards (left to right instead of right to left)
Language fact: in some US states, Arabs (and lots of others) are legally required to change their name to only use the 26 Latin characters when they move here. Just because the poor ancient government computers can't handle anything more.
@@josephwodarczyk977 That's not just the US. Most countries require a transliteration of your name if it's in a different script.
Me too
Yeah
" this is a huge problem ", i am pretty sure we have way bigger problems
Thank God someone addressed this problem. This is a real pain for someone trying to practice the most basic things in Arabic/Farsi/Hebrew/etc on the internet. Easily top 5 of things that frustrate me the most as an adult 😅
Or should I have written "الحمد لله someone addressed this problem"? 😂
@@marolibez ههههههههههه
@@marolibez هههههههههههه! That was good xd
Arabic laughing text looks like a fern or a particularly leafy vine
@@georgiykireev9678 like its flower leaves!
As a developer for English and Hebrew apps you have no idea how infuriating this is. Most apps get away with a basic left to right input system since that's pretty much all they need, but for languages like Hebrew and Arabic which require more support you sometimes need to design two separate layouts just for different languages. Although I am thankful at least that we don't also have to add top to bottom writing systems haha
*ominous aura*
*Mongolian script enters the battlefield*
Imagine having to deal with both Mongolian where vertical lines (or columns) go left to right and traditional Chinese or Japanese where columns are written right to left.
Android has it figured out. You define the layout in rows top-to-bottom, so app bar, app content, app tabs. Then inside each row, you define columns start-to-end. What counts as 'start' and 'end' depends on if the language is LTR or RTL. So with just one layout, you can support LTR and RTL.
You can also support Mongolian script this way as well. If you're holding your phone in portrait, it's commonly 9:16. For Mongolian, the layout is then using the RTL layout renderer in a 16:9 layout, and LTR text is rendered upside down, so Latin text still goes RTL. After it has rendered it all, it rotates the layout so the right edge is at top, so it fits the 9:16 screen. Mongolian text now rendered top-to-bottom left-to-right, and Latin script is also read top-to-bottom, just rotated.
theres a really cool library someone told me about... its a language library that changes all characters, every character is assigned either RtL, LtR or null, and when you type, it places the next character based on itself, so if you switched from Hebrew to English on the fly while typing it wouldn't f up, it would just continue typing like you would expect, space and some other misc characters are assigned null and just take the RtL / LtR value from the previous character... it was so useful and i am so sad i lost it while changing pc...
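What that comment describes is close to the character classes Unicode already assigns. Here is a rough sketch of the idea using Python's standard `unicodedata` module. It is not the lost library, and it is far simpler than the real Unicode bidirectional algorithm; the function names and the "inherit from the previous character" rule are assumptions taken from the description above.

```python
import unicodedata
from typing import List, Optional

STRONG_LTR = {"L"}          # e.g. Latin, Greek, Cyrillic letters
STRONG_RTL = {"R", "AL"}    # Hebrew letters ("R") and Arabic letters ("AL")


def direction_of(ch: str) -> Optional[str]:
    """Return "LtR", "RtL", or None for characters with no strong direction."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class in STRONG_LTR:
        return "LtR"
    if bidi_class in STRONG_RTL:
        return "RtL"
    return None  # spaces, digits, punctuation and so on


def resolve_directions(text: str, default: str = "LtR") -> List[str]:
    """Give every character a direction; neutral ones copy the previous character."""
    resolved = []
    previous = default
    for ch in text:
        previous = direction_of(ch) or previous
        resolved.append(previous)
    return resolved


# Two Latin letters, a space, then the four Hebrew letters of "shalom".
sample = "ab \u05e9\u05dc\u05d5\u05dd"
print(resolve_directions(sample))
```

The space after "ab" copies LtR from the letter before it, and the Hebrew letters switch to RtL on their own, which matches the "it just continues typing like you would expect" behaviour described.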
This problem has been happening for ages basically everywhere on the internet, not only YouTube
i still remember the hard times chatting in arabic with my friends in some game when i was a kid and not only was it rendering the text from left to right, but also separates each letter like this "ص د ي ق ي ي ا م ر ح ب ا" (i said "hello my friend" if you're wondering)
I swear this was extremely annoying and it's still happening in a lot of programs to this day, but most of arabic speakers got used to it anyway lol
This is actually a common problem.
I couldn't get זאב Sarah to work in THAT order on a phone as standalone in a line.
(This was in telegram.)
Btw. This makes me want to link the essay "text rendering HATES you"
@Dominotik Ivan Tulovskiy just write that on the search bar, it's the first result
If it's a text field that determines which order the text should be in based on the first directional character (so punctuation doesn't count), then you can't have it appear as "RTL LTR". This is because:
- If you write LTR RTL, it will render it in LTR, so LTR is first to the left, and RTL is second to the right of it.
- If you write RTL LTR, it will render it in RTL, meaning that RTL is first to the right, and LTR is second to the left of it.
You have to use a Unicode LTR or RTL override to make it work.
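A small sketch of that "first directional character wins" rule, again with Python's `unicodedata`. Two assumptions to flag: the function name is made up, and instead of an override the example uses the invisible LEFT-TO-RIGHT MARK (U+200E), which is a different and gentler character that only supplies a strong LTR character at the front. Real implementations also skip text inside isolates, which this ignores.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK: invisible, counts as a strong LTR character
RLM = "\u200f"  # RIGHT-TO-LEFT MARK: invisible, counts as a strong RTL character


def base_direction(text: str, fallback: str = "ltr") -> str:
    """Guess the paragraph direction from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return fallback  # only digits, spaces, punctuation, or empty text


hebrew_name = "\u05d6\u05d0\u05d1"       # the Hebrew name from the Telegram example
mixed = hebrew_name + " Sarah"

print(base_direction(mixed))             # rtl: the Hebrew letter comes first
print(base_direction("Sarah " + hebrew_name))  # ltr
print(base_direction(LRM + mixed))       # ltr: the invisible mark is seen first
```

With the mark in front, a field that follows this rule should lay the line out as LTR, so the Hebrew word sits on the left and "Sarah" on the right, while the Hebrew letters themselves still run right to left.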
This is a big issue in Hebrew as well. I'm a native Hebrew speaker and yet I set all of my apps to English because the formatting of text that has both Hebrew and English in it (which is common in western-centric apps with Hebrew UI) is literally unreadable.
Same
I'm learning Yiddish currently and it's such a pain
@@Plopi96ILuvPigeons Hello, what a pleasant surprise
@@divergentclouds Oy vey
I'm a Hebrew speaker too and it's really annoying, most of all on Roblox
Arabic rendering is very complex in that, on top of being an RTL language, the letters aren't separate: each one can take up to 4 different glyphs (isolated, initial, medial, final) based on its position in the word. Add to that 5 types of diacritics (99% of the time unused but sometimes mandatory. Go figure) that must "optionally" be positioned either on top of or under one or several letters, and it becomes nightmarish.
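For anyone curious what the positional glyphs look like at the code point level, here is a small sketch with Python's `unicodedata`. Unicode keeps legacy "presentation form" code points for the isolated, final, initial and medial shapes of the letter beh; normal text stores only the base letter and leaves picking the shape to the text shaper. The variable and function names are just for this example.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH, the code point normal text actually stores

# Legacy compatibility code points, one per positional shape of beh.
BEH_PRESENTATION_FORMS = ["\ufe8f", "\ufe90", "\ufe91", "\ufe92"]


def describe_forms(forms):
    """Return (code point, Unicode name, folds-back-to-beh) for each form."""
    rows = []
    for form in forms:
        folds_to_base = unicodedata.normalize("NFKC", form) == BEH
        rows.append((f"U+{ord(form):04X}", unicodedata.name(form), folds_to_base))
    return rows


for code_point, name, folds_to_base in describe_forms(BEH_PRESENTATION_FORMS):
    print(code_point, name, folds_to_base)
```

Software that shows every letter disconnected is effectively drawing only the isolated shape of each letter, because the shaping step that would swap in the initial, medial and final shapes never runs.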
There’s always the technique I use when including Arabic text inside a block of otherwise-Latin text:
Type *everything but the Arabic text* first, then carefully place your cursor at the point where the Arabic should be. Paste the Arabic from another program where it’s already been output correctly.
It should then be treated as a string of Latin text that simply happens to be written as Arabic text (and maintains the directional flow of both without clumsy splitting).
Bit of a hassle, but it’s never failed yet (at least, not by itself; its “failures” have all been attributable to a mistake somewhere else!).
(Failing that, you could just string a run of Braille blank characters on the end of your text so the “predominant direction” becomes LtR.)
It shouldn't be any problem for you to just type out the Latin text until you want Arabic, then type Arabic text until you want Latin, then type Latin text.
In the end, all the computer sees is the order of the characters.
@@Liggliluff I wish I could, but the program I use unfortunately decides the overall direction of text based on the most-prominent one in the string. This necessitates padding the string so that the LTR direction of Latin text overrides the RTL of the Arabic.
@@Liggliluff YouTube probably inserts the Right to Left Unicode character automatically whenever it sees arabic. This right to left character is invisible, but it switches the text to Right to Left. Similar characters exist for superscript and subscript. Stopping the automatic RtL character would be just as bad if not worse, because it would mean the user would either have to type it in themselves (Good luck finding it in this: en.wikipedia.org/wiki/List_of_Unicode_characters) or they just have left to right arabic siht sa dab sa tuoba eb dluow hcihw. Basically, text formatting is difficult and because Unicode is based off of ASCII, which is American, it is first and foremost English-based, which makes other languages difficult, even if their characters are representable.
@@DrWhoFanJ Ah, I understand, so it's a completely different issue. Most software only cares about the first directional character, or have a pre-determined writing order.
Yes but it'd just be so much better if they fixed it for real
Us Arabic speakers have just dealt with this for so long, in so many different devices, websites, programs, games. And more than just this bug, plenty more. We've just gotten used to it.
I'd have never thought I'd see the day where someone would directly go and ask this gets fixed for us. It feels weirdly empowering.
Thank you.
Unicode, CLDR, HTML and Android should have a fix for this. You should be able to mark a text as being "together". So even though it would see the first as an Arabic letter and render it right-to-left, it will still consider the "Jafar" part being part of the Arabic text and render that together.
But regardless, Android notifications and Android apps should really keep things left-to-right for left-to-right languages.
@@gregoryford2532 no, it will render normally. But it will be put next to the Arabic text
YouTube has little to no influence over this, neither on PC nor on mobile. Same for Discord. On PC YouTube is just a web-app, so it's the browsers' rendering engines that implement this feature. Chrome, Vivaldi, Brave, Edge, and Chromium all have one rendering engine, and Firefox has another. Note that Discord, Spotify, etc. are all just a Chrome browser on your PC, so it's the same there. You can fiddle a bit with the way the rendering engine interprets text, but that's it. "Fixing" would need to be done by people outside of YouTube.
On mobile there's a few options. Either you use the OS specific controls, which are provided by Apple or Android (for the most part), or you use a custom rendering engine provided by any number of providers. Although those often incorporate native OS controls, especially for text.
I can't be 100% sure, but I'm fairly certain that YouTube on Android just uses the OS capabilities, given that both are owned by Alphabet. On iOS it seems that the controls are native too, at least the look and feel is much more akin to Swift UI controls than to something like flutter. So if my assumption is correct, this wouldn't be for YouTube to fix either.
What they could do is provide a workaround where they segment the notification into multiple pieces of text, which would work better. Still not a bug caused by YouTube though.
Surely YouTube's developers (Alphabet) have some sway with the developers of Chrome's browser engine (also Alphabet).
Both Chromium and Firefox currently use skia so it's the same bad renderer :/
@@jhgvvetyjj6589 Oh, I wasn't aware FireFox switched to Skia, learn something new every day.
@@trickvro From my experience with large companies the amount of control should be minimal. You would need to escalate to some common denominator or have a friend in a position high enough to propose these changes.
I'm not actually sure if Chrome's Text rendering engine, Skia, implements CTL and Shaping by itself or uses HarfBuzz/ICU or the OS native solutions. In that case it'd be up to the OSS community or Microsoft/Apple to fix it.
@@LetsPlayFolling Firefox dropped cairo support after 52 or so
In the library in Göteborg they wrote the sign for the Arabic book section with disconnected letters and going left to right 😭 like
ةيبرعلا
Göteborg be using the Minecraft engine for their printer 💀
We have the same problem with Hebrew too, also being right to left
Ah, nice surprise seeing you here
@@CheLanguages no way, didn't know you watched this channel!
@@AvrahamYairStern I don't, this video just came up on my feed
@@CheLanguages I understand. I recommend his other videos
This issue is overall not just youtube, but EVERYTHING.
I speak Hebrew as my second native, and I was learning computer science at school. Hebrew is also written from right to left, but in computer science you work a lot with english, cuz programming languages are written using english words, and latin alphabet.
During the lockdowns we had to take the exams from home, and my OS is obv on english im not insane person to use OS inverted or whatever.
And I just couldnt read the questions, they were so messed up because of using english and hebrew at the same textbox in different places.. Imagine I had to solve a second puzzle during the exam to solve the exam..
It was just so bad. Dear people, stop using any other language than English when writing sites names etc.
Or find Bill gates, steal his company and make it smarter using AI or whatever.
Its pure pain and it begins a lot deeper than youtube.
Thx for pointing this out!
True story: back in 2005, I decided to go to college in Israel. My college (now known as Reichman University) had both English and Hebrew versions of their websites because the school was split into an English and Hebrew curriculum. In certain spots of the English website, you could find places where it said things like "the school reserves the left to use this information how it sees fit." Because, apparently, someone building the English website just did a "find and replace" on the code and switched all the rights to lefts when they needed to convert all the text alignments.
This is why modern CSS has "start" and "end", which correspond to left and right depending on the direction of the text.
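The same idea, sketched outside CSS: store layout intent as "start"/"end" and only turn it into a physical side at render time, so nothing ever needs a find-and-replace of "left" and "right". The function name and the "ltr"/"rtl" strings are made up for this illustration and are not any framework's API.

```python
def resolve_side(logical_side: str, direction: str) -> str:
    """Map a logical side ("start" or "end") to "left" or "right" for a text direction."""
    if logical_side not in ("start", "end"):
        raise ValueError(f"unknown logical side: {logical_side!r}")
    if direction not in ("ltr", "rtl"):
        raise ValueError(f"unknown direction: {direction!r}")

    if direction == "ltr":
        return "left" if logical_side == "start" else "right"
    return "right" if logical_side == "start" else "left"


# One layout description, resolved differently per language.
print(resolve_side("start", "ltr"))  # left
print(resolve_side("start", "rtl"))  # right
```

Because the layout never contains the words "left" or "right", sentences like "the school reserves the right" are never at risk from a conversion script.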
@@SolomonUcko Oh they don't even know the difference between string and value
I had to use "port" and "starboard" for value names before
Lol, why would you study at Reichman? Tel Aviv University is better smh🤣
@@amOhad131 In 2005 there was no track that suited me at Tel Aviv. I'll always prefer the school that exists over the one that doesn't.
شكرا جزيلا لك اخوي thank you so much brotha
thats the reason why i have my channel name in english and my video titles in arabic and english even tho the videos are arabic, youtube really needs to fix this and i hope they do
great video klein keep it up brother
i never thought about that, but now that you’ve pointed it out, i can see the problems it has. great video keep up the work with interesting topics such as this one
I noticed that too! Thanks for figuring out why it happens
IIRC, YT doesn't control how the notifications on your phone are rendered, this is either Apple's or Google's fault. Maybe the YT team could still fix it by messing with the text, possibly by inserting invisible bidi override characters like LRO.
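A rough sketch of the "insert invisible characters" idea. To be clear about what is swapped in: this uses directional isolates (FIRST STRONG ISOLATE and POP DIRECTIONAL ISOLATE) rather than LRO, since an override around Arabic text would force its letters into left-to-right order. The function names and the notification wording are hypothetical, and whether a given notification renderer honours isolates is not something this example can confirm.

```python
FSI = "\u2068"  # FIRST STRONG ISOLATE: direction comes from the wrapped text itself
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE: closes the isolate


def isolate(text: str) -> str:
    """Wrap text so its direction cannot leak into, or be changed by, its neighbours."""
    return f"{FSI}{text}{PDI}"


def build_notification(username: str, comment: str) -> str:
    """Hypothetical notification line with the username and comment isolated separately."""
    return f"{isolate(username)} commented: {isolate(comment)}"


arabic_comment = "\u0645\u0631\u062d\u0628\u0627"  # "marhaban" (hello), as escapes
line = build_notification("Jafar", arabic_comment)

# The control characters are invisible, so show the code points instead.
print([f"U+{ord(ch):04X}" for ch in line if ch in (FSI, PDI)])
```

The visible characters are unchanged; removing the two control characters gives back the plain "Jafar commented: ..." string.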
Yeah, but the screenshot shown seems to be from the notification tab in the YouTube mobile app.
it's from the yt mobile app
@@joshuahillerup4290 Bro look at the video it's from the YouTube mobile app
@@kklein Even then, YouTube doesn't control beyond what Jasmijn mentioned. Text rendering is handled by the underlying OS or toolkit in most cases and not implemented separately by each application.
@@LetsPlayFolling ok ok gotcha!
At least it's an upgrade on where they render Arabic like this / ه ب ي ر ع ل ا
Then you have to figure out left to right while looking right to left, and side to side.
It's like crossing the streets.
This problem is universal with every language that goes right to left (arabic hebrew and persian)
But also in games (i know this cuz I am arab but I am pretty sure it works the same with persian)
Text gets cutoff
Because in arabic and persian (idk if hebrew has the same thing) there are different states to letters and letters are connected, like for example this is baa "ب" and this is bee "بي", you noticed the "ب" got connected with "ي" the ee, but in games that have not got a system to make sure arabic is rendered properly
The text is written the wrong way (left to right and separated) which makes it very hard for speakers of those 2 languages to easily understand what the other player has said in the chat
This is (thankfully) not the case in Hebrew...
There are two modes for how a program renders hebrew: logical hebrew, which is how normal people write, and visual hebrew, in which the text is rendered correctly, but the actual flow of the text is reversed. You can tell when you highlight the text as you read, when you go to the next line, it marks the first part of the first line and the last part of the last line, instead of the last part of the first line and the first part of the last line...
@@adrianblake8876 I don't know what you're talking about. Hebrew is RTL, and as you type Hebrew in a RTL field, it writes RTL as normal. If you type Hebrew in a LTR field, it has the cursor static as it pushes the Hebrew text to the right.
When you highlight the Hebrew text, from the middle of the first line, to the middle of the second line. It will highlight from the middle to the end of the first line, and from the start to the middle on the second line. It does this in Hebrew, Arabic, Latin, Greek, ...
I think you're confusing start/end with left/right. In Latin script, this selection goes middle-right on line 1, then left-middle on line 2. In Hebrew script, this selection goes middle-left on line 1, then right-middle on line 2.
But in these situations Feras RE talks about, when it renders the text LTR regardless of the script, both Arabic and Hebrew will be backwards. But at least in Hebrew you can manually fix this by typing backwards; annoying, but it works. In Arabic there's no proper solution since the letters won't connect.
@@Liggliluff Read your last paragraph again, and imagine the text being rendered on screen. Now, with that in mind, read what I wrote again...
Kind of experiencing the Baader-Meinhof phenomenon right now. I've just looked up how bidirectional text is handled in Unicode yesterday.
Yeah same, just finished implementing that very algorithm for a toy GUI project, though this is probably just google noticing me searching up things for the past few days and recommending a video on the subject.
With 20 000 users who have watched this video, someone has likely dealt with RTL in software.
It makes no sense why youtube doesn’t just fix this, it’s a multi billion dollar company
To provide three "senses":
1. The bug isn't caused by YouTube
2. It's not the only thing that YouTube has to work on. There's probably thousands of bugs on the ticket lists at YouTube.
3. Text handling is hard, one of the hardest things in Computer Science. I'm not saying there isn't a fix/workaround, but assuming the triviality of a problem as an outsider is never a good thing to do. Especially if you're untrained in the underlying speciality.
@@LetsPlayFolling thousands of bugs? they have more than enough money to hire the programmers/developers/engineers to fix the issues
@@LetsPlayFolling
4. YT simply does not care anymore.
Also they aren't looking at it as outsider expecting triviality from computer science. They are looking at it from the pov that comments are one of the most basic, long time and og features of YT, naturally a non computer scientist would think that they are trivial
@@cerebrummaximus3762 But this bug isn't caused by YouTube. They could provide a workaround, sure, but saying that youtube messed up here is simply wrong. Text rendering isn't done by apps but rather by pre-provided functionality, usually by the underlying OS or Browser.
@@ASocialistTransGirl To avoid drifting too far into speculation since we both know neither the budget left for YouTube nor the manpower available in the sector (in CS it is notoriously hard to hire qualified people), the main point is that YouTube doesn't cause this bug, so they cannot fix it. They could potentially provide a workaround, but putting the blame on YouTube here for something that is most likely a bug in the underlying OS or toolkit is unfair.
I dislike YouTube as much as the next person, but that doesn't mean we should blindly bandwagon onto any feasible criticism.
Thank you for caring
Problems like this are less about the devs not caring, but computers just not being very good at rendering certain scripts. Unicode has made it a bit easier, but having to make cases for every possible combination of scripts and texts is really hard to get right. It's even harder making it for languages you don't speak so it's hard to know if you made a mistake.
Nice channel, subbed.
thanks!!
I noticed this on a few videos that had both Latin and Hebrew characters in the title, it struck me as odd because I swear that this wasn’t a problem a few years ago
The man has spoken!
really interesting and YouTube needs to fix it
The reason it works with comments is because the username tag is independent from the comment's content which is stored in a . Basically just as you said, its divvied up and rendered in blocks. I think that is a browser thing.
Notifications treat the username as part of the hence the problem.
I imagine this would be relatively easy to fix frontend wise, the question is do these notifications have access to links that the usernames use?
It would also help make the notifications look better too because man are they ugly as it is right now.
So as opposed to
Jafar commented: I say if...
It should be
Jafar/... commented: I say if...
If they dont have access to the user uri, they could also just do
Jafar/... commented: I say if...
and that'd format it correctly.
Very easy fix, but theres probably some weird way they generate this thatd put a damper on things
There is a little concept in cost management called "marketing allowable" or "allowable", which in this context means "the amount of spending allowed per action without exceeding the 0+ revenue value". Simply, its about understanding the cost of each action and acting if taking that action is beneficial overall or by itself.
This connects to a weird issue when the marginal cost of each issue solved does not provide enough increase in revenue. If the cost of doing an action is more costly compared to doing nothing, then there is no point in taking that action in the first place. An example would be to say that "there is no point in running a business if your revenue stream is on the negative"
What I'm getting at is the cost per each action taken and the return that action provides. If an action does not provide at least the same amount of value it takes to go through it, then it is not a desirable action.
In this context the action is to change the system so it renders Arabic or another language properly. And the cost of doing that is potentially not as simple as we may think it is (just a note here, im not saying it is a good decision to not improve it, I'm just explaining how they probably look at this situation)
And since the return value of an action is close to zero, any amount of effort required in such an action will make it undesirable. Meaning that unless this gets personal, it's likely not going to be implemented any time soon.
Finally someone talked about that topic.
The problem is fundamentally that the Unicode Consortium failed to recognize a distinction between a piece of left-to-right text that starts with an embedded piece of right-to-left text, and a piece of right-to-left text which ends with a piece of embedded left-to-right text. I'd guess the mobile app is probably using the system's built-in text rendering engine, which has no way of sensibly making such a distinction when fed mixed-direction text.
Unicode declares special characters to be used in exactly those situations, so I think what we're really saying here is that YouTube failed to put in the appropriate characters. Put them in, and it will fix it. Unicode is the solution here, not the problem.
@@trejkaz Properly rendering text would require the ability to nest LTR and RTL contexts, rather than merely mark some pieces of text as LTR and other pieces of text as RTL. Add in Unicode's implicit direction rules and there's no good way of ensuring that things will always be rendered correctly in all contexts.
@@flatfingertuning727 Unicode literally supports nesting contexts. That's why PDI is called PDI, **POP** DIRECTIONAL ISOLATE. There are a stack of contexts.
Really, to me, it just sounds like you haven't read the specification.
Some examples, where I use the standard names for the formatting code points as stand-ins so that it's easier to see the difference (these two examples would render in essentially the same way if true text were inserted):
1. Left-to-right text that starts with an embedded piece of right-to-left text: [LRE] [RLE] text here is embedded rtl [PDF] text here is ltr [PDF]
2. Right-to-left text which ends with a piece of embedded left-to-right text: [RLE] text here is rtl [LRE] text here is embedded ltr [PDF] [PDF]
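As a rough sketch of the two cases above in Python, with the real code points substituted for the bracketed names. The `max_depth` helper is made up for illustration and is not part of any library; it only counts how deep the embeddings nest, it does not render anything.

```python
# Explicit directional formatting characters defined by Unicode.
LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

# 1. Left-to-right text that starts with an embedded piece of right-to-left text.
ltr_with_rtl_start = (
    LRE + RLE + "text here is embedded rtl" + PDF + " text here is ltr" + PDF
)

# 2. Right-to-left text which ends with a piece of embedded left-to-right text.
rtl_with_ltr_end = (
    RLE + "text here is rtl " + LRE + "text here is embedded ltr" + PDF + PDF
)


def max_depth(text: str) -> int:
    """Hypothetical helper: deepest nesting level of LRE/RLE embeddings."""
    depth = 0
    deepest = 0
    for ch in text:
        if ch in (LRE, RLE):
            depth += 1  # an embedding opens: push onto the stack
            deepest = max(deepest, depth)
        elif ch == PDF and depth > 0:
            depth -= 1  # PDF closes the most recent embedding: pop
    return deepest


print(max_depth(ltr_with_rtl_start), max_depth(rtl_with_ltr_end))
```

Both strings nest two levels deep, and the only difference between them is which embedding is the outer one.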
@@trejkaz I haven't read the whole spec, but I've never heard of that construct. What's the best way to find an explanation of everything having to do with LTR vs RTL text?
As an Arabic speaker I thank you for this video. Trying to do anything in both languages is a pain: for example, sometimes I write something in Arabic and then use a term that doesn't exist in Arabic, and YouTube just freaks out and puts it anywhere except where I actually put it. This problem isn't only on YouTube, but I'm just hoping that if YouTube fixes this, more apps will follow.
Live chats show the username at the end it's so weird. Good video!
Wait, isn't it natural for RTL speakers to expect "headers" (like the username) to be on the right side of the "content" (like the message) too?
@@erikkonstas oh ok
@@MattTacc TBF I'm not one either, I've just seen all apps do the same thing in this regard, it would surprise me if everyone was so wrong.
Yeah, this is something I could never miss as a Hebrew speaker. It's even worse when you have things like numbers and periods, which have no inherent "direction" to them. That can sometimes cause the same thing you described, except with two right-to-left bits of text, while the overall direction is left-to-right. Seen it often on Instagram and I suspect it happens here too
I don't know if there even is a genuinely good way to go around this, but it would be great if people gave it a go.
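Actually, digits always run left to right within a number. [Strictly, "Latn" digits are the weak type EN (European Number) in the bidi algorithm and "Arab" digits are the weak type AN (Arabic Number), so neither is strongly directional. Don't confuse "Arab" digits with what most of the world calls "Arabic" digits.]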
But yes, the behaviour for all this is documented in UAX #9: Unicode Bidirectional Algorithm. It's very well-defined, but it's also quite complex, so what apps will generally do is, use a library which lays out the text for them. The trouble usually comes when someone thinks they're good enough to implement text rendering from scratch.
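For anyone curious what UAX #9 starts from, Python's standard `unicodedata` module exposes the bidirectional class assigned to each character. A small sketch (the sample characters and the `bidi_classes` helper are my own choice for illustration):

```python
import unicodedata

# Sample characters, picked only for illustration.
samples = {
    "a": "Latin letter",
    "\u05e9": "Hebrew letter shin",
    "\u0639": "Arabic letter ain",
    "1": "ASCII digit",
    "\u0663": "Arabic-Indic digit three",
    ".": "full stop",
    "!": "exclamation mark",
}


def bidi_classes(chars):
    """Map each character to its Unicode bidirectional class."""
    return {ch: unicodedata.bidirectional(ch) for ch in chars}


classes = bidi_classes(samples)
for ch, label in samples.items():
    print(f"U+{ord(ch):04X} {label}: {classes[ch]}")
```

Letters come out as the strong types L, R, or AL. The digits come out as EN and AN, and the punctuation as CS and ON, none of which are strong types, so where they end up depends on the text around them. That is the part a layout library resolves for you.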
@@trejkaz I see, so there's ways to do it right but almost nobody uses them? Like, how come Instagram don't use one of those libraries if it solves the problem? Are they that dumb?
Also if "Latin" digits are LTR then that kinda explains why any Arabic or Hebrew sentence containing numbers immediately gets mangled
@@eliad6543 I think most people don't encounter this problem until maybe 10-20 years into their career, and the majority of people working on apps don't have the experience. Some of us got burned hard by it before so we're very aware of it.
@@trejkaz Yeah that's fair. I suppose if the team had a speaker of one of those languages it could've helped but obviously that's not that common
Could you mind doing a video on the Various scripts used in South Asia/Indian Subcontinent? They are really interesting and also very Underrated.
Examples: পূর্বী নগরী, देवनगारी, عربی, ଓଡ଼ିଆ, தமிழ், ગુજરાતી. తెలుగు, ಕನ್ನಡ, ꯀꯪꯂꯩ ꯃꯌꯦꯛ, ᱚᱞ ᱪᱤᱠᱤ, ތާނަ, ꠘꠣꠉꠞꠤ, etc
There is another bug on some apps where the Arabic characters get cut into pieces (they're supposed to be connected), which makes it literally impossible to read unless you have free time to actually work out what it's saying
There's also a problem where comments with japanese/chinese characters always have the read more button on the comment, even if there is just a single line of text in the comment. ご覧ください
Actually I tried looking at this in multiple browsers and on mobile and it looks like it's only in Firefox. I guess it's a Firefox problem then. I swear it used to be on all platforms but maybe I'm just getting Mandela-effected
I upvoted this for identifying a problem, making a video about it and not going nuts with clickbait to get attention.
Also "read more" or "اقراء المزيد"
DOESN'T EVEN CLICK NO MATTER HOW FUCKING HARD YOU TRY
Thanks for addressing this. It causes many bugs on the mobile YouTube app.
This problem has also bothered me every time I tried to use Arabic in Roblox. It reverses your sentence, so instead of "hello people"
it writes "people hello".
It's kind of annoying ngl
You have no idea the PAIN and SUFFERING you have to go through to type something in arabic in a search bar it is HELL
As an arabic letter i approve this happens to me everytime i am written on the internet
I really doubt you're an Arabic letter 🧐
@@baibac6065 r is upside down ر
You think this is bad? Try switching your youtube language to arabic and put on notifications for english channels, it is so messed up I almost cried the first time I saw it.
we stan jafar
I think I’ve seen this before and always wondered what was up. It explains why usernames might’ve been converted back to white text 🙃
i also encounter that on like facebook, especially when i am trying to get some english words between arabic ones
When it comes to regional settings, sometimes the number format will remain the same. For example, someone in Sweden chooses to display Google (or any other website) in English, even if Swedish is available. Sometimes, it still renders numbers as if the language was set to Swedish, so this means that 1.234,56 (for 1,234.56 'one thousand two hundred (and) thirty-four point five six') will be displayed even when set to English. Both , and . are acceptable as decimal separators when communication is done internationally, such as in nutrition information. This means, for example, you will see 2,4g salt even when English is being displayed, as it is giving information in many other languages.
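To make the two number styles concrete, here is a minimal Python sketch. The `format_number` function and its `style` names are hypothetical, and a real app would normally rely on a proper locale library rather than swapping separators by hand.

```python
def format_number(value: float, style: str = "english") -> str:
    """Hypothetical helper: format with two decimals in one of two styles."""
    english = f"{value:,.2f}"  # comma for thousands, point for decimals
    if style == "english":
        return english
    if style == "swapped":
        # Swap the roles of "," and "." to get the 1.234,56 style.
        return english.translate(str.maketrans(",.", ".,"))
    raise ValueError(f"unknown style: {style}")


print(format_number(1234.56))             # 1,234.56
print(format_number(1234.56, "swapped"))  # 1.234,56
```

The bug described above is what you get when the choice of `style` follows the user's region while the rest of the page follows the chosen display language.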
And what about vertical writing systems like Mongolian? 🤣
it's been like this with right to left languages for years everywhere. when you try to have both english and hebrew in a word document you are going to smash your head into your desk many times before making the formatting coherent
Also bring back sorting by oldest to newest!!
Good rules to follow with this stuff:
- If you have a template of some sort where you're inserting a user variable, _assume that this variable may contain text which flows in a direction different to the surrounding text_.
- Properly wrap all such variables in _either_:
(a) a FSI-PDI pair, _or_
(b) an LRI-PDI (if the user's native language is LTR) / RLI-PDI pair (if the user's native language is RTL).
- Which of these is appropriate will often depend on what the variable contains.
- For instance, for usernames, you might decide that all users must see a given username in the same way, which means you _must_ use something like FSI. (Or, you could use LRI for all users, but that seems hostile towards RTL users.) But for titles, you might decide that it should always match the user's text direction.
- Once you decide which isolate you're going to use, standardise on it and update your style guides so that everyone else in the organisation does the same thing.
- Go to other companies who are getting this wrong and hand them the same guidelines.
- Complain to people who implement templating systems and don't handle this stuff automatically, out of the box. (Which is most of them.)
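A minimal sketch of the wrapping rule above, in Python. The `isolate` and `render_comment_header` functions and the header layout are hypothetical, made up for illustration; only the four code points come from Unicode.

```python
FSI = "\u2068"  # FIRST STRONG ISOLATE: direction taken from the content
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE: closes any of the three above


def isolate(value: str, initiator: str = FSI) -> str:
    """Wrap a user-supplied value so its direction cannot leak into the template."""
    return f"{initiator}{value}{PDI}"


def render_comment_header(username: str, timestamp: str) -> str:
    """Hypothetical template: the username is isolated, the rest is template text."""
    return f"{isolate(username)} - {timestamp}"


header = render_comment_header("\u05e9\u05dc\u05d5\u05dd", "2 days ago")
print(repr(header))
```

Using FSI here means every viewer sees the same username the same way, which matches the username case described above; passing `LRI` or `RLI` as `initiator` would cover the title case.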
شكرا!
additional issue: the arabic was rtl, and the ! mark was at the end of the sentence, on pc it's rendering it like this RTL! instead of !RTL
I have this healthcare/doctor’s website I used that can be translated into many languages. One is Arabic. It actually fixes this problem, and when it translates English-Arabic, it puts the English number/catalog “last” and Arabic name of the disease/disorder “first”
Example: DA90 Nonstructural developmental anomalies of small intestine (English version) and (the Arabic Version that Google might stupidly change): شذوذات نمائية غير بنيوية في الأمعاء الدقيقة DA90
Btw, Google DID try to change it when I typed it, so I retyped the English part at the end instead of the front lol. Hopefully one day YouTube can learn how to accept it properly when I type it.
Your comment is in LTR so it will render it as LTR. It's intentional design. You can add an RTL embedding character to fix it:
DA90 شذوذات نمائية غير بنيوية في الأمعاء الدقيقة
This text now actually has the "DA90" at the front. But that website forces the whole page to be RTL, making "DA90" appear at the front as well.
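A rough Python sketch of why the comment gets treated as LTR in the first place, assuming the renderer picks the base direction from the first strongly directional character. The `first_strong_direction` function is a simplification I made up (it ignores isolates), and it uses the RIGHT-TO-LEFT MARK (U+200F) instead of the embedding character mentioned above, because a mark is enough to change the first strong character.

```python
import unicodedata

RLM = "\u200f"  # RIGHT-TO-LEFT MARK: invisible, but counts as a strong RTL character


def first_strong_direction(text: str) -> str:
    """Simplified guess at the base direction: first L, R, or AL character wins."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return "ltr"  # assumed default when there is no strong character at all


arabic_name = "\u0634\u0630\u0648\u0630\u0627\u062a \u0646\u0645\u0627\u0626\u064a\u0629"
print(first_strong_direction("DA90 " + arabic_name))        # ltr
print(first_strong_direction(arabic_name + " DA90"))        # rtl
print(first_strong_direction(RLM + "DA90 " + arabic_name))  # rtl
```

Whether a given site keeps or strips such invisible characters is not something this sketch can tell you.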
@@Liggliluff yes I agree, I know my app is LTR, I just wanted to show people who haven't done it before how it becomes problematic. Some websites that are set to LTR still register the languages, whereas YouTube's app just doesn't at all.
The directional difference seriously screws me over as a student taking Arabic in school 😆 trying to flip between English & Arabic on note taking apps like Notability is even more awkward than YT ☠️ و ما بعرف ليش
If I start to write something on my phone with a script like Arabic or Hebrew and then switch to Roman letters (or any other left-to-right script) it considers the words as individual bits and places them right-to-left, so the English text (or whatever language) reads weird, as the words are written normally but are ordered backwards.
This has existed since like forever on not only YouTube but the whole Internet
I just live with it at this point but it's so annoying trying to write arabic words in english sentences since sometimes the brackets are reversed and some other stuff that bugs me
This happens to me a lot when I put Arabic and hebrew texts in my names on discord, and the fact I know some people that can't read latin too... It's really annoying
i have a really personal problem with this system when i type an arabic+english comment, and it's really really painful
جَزَاكَ اللهُ خَيْرًا for this. For far too long we have been in this struggle, it is now high time to.....wait why is the text....aghhhhh
A lot of mixed English and Arabic script on websites is difficult to work with, as I often copy terms, but highlighting takes the direction into account and words get missed in between the Latin and Arabic script
اختبار/test
This is a test of the YouTube problem, let me know if you are facing any problems
Hey this is totally unrelated to this video but I have a question for you , Mr. Klein.
Is there any research on the specific type of language that humans use when talking to themselves? Like whenever you talk to yourself when alone in the bathroom or in the car or even just in your head.
How is the grammar different? I, for example, notice myself using the pronoun “we” when talking to myself all the time.
So are there any works on this part of language? How would one even study this?
ur weird lol
we are legion we are venom
@@someguyontheinternet4277 How did you know that they are western, educated, industrialized, rich and democratic?
Cool. BTW, how is the Pirahã part two episode going?
This problem of Arabic messing up the system is not on YouTube alone.
It also happens in Roblox, but there it is much worse.
In Roblox, when I type (Come here) in Arabic, Roblox sends it from left to right instead of right to left, so it comes out as (here come). Because of this, everyone who reads Arabic in Roblox has to read from left to right instead of right to left (because my example is in English, you would read it from right to left instead of left to right).
As a language fan, I hate right-to-left texts on a computer as it messes up my left-to-right text. It's a computer text issue. I'm fine with whatever direction. Even up-to-down like East Asia languages.
As someone with 0 experience in anything related to this video, I can tell you UA-cam won't fix this
This has been a big problem for me when I do translation work and have to use an English word in the middle of a sentence so usually I try to romanise English words into hebrew
I will add to that: the read more button on arabic is so irritating, not working most of the time.
it's easier to click translate comment to english then read more then translate to original text again.
You are underrated...
I think this problem might be more to do with mobile phones themselves, since when I try sending messages to my grandma in English and Hebrew, the same thing happens. But when I send messages via WhatsApp on my laptop, they’re fine.
It's not the mobile phones, just the apps being improperly coded. There's no difference between a phone and a computer, it's hardware running software.
This is what Unicode codepoints U+2066 left-to-right isolate, U+2067 right-to-left isolate, and U+2068 first-strong isolate are for! (I think)
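A quick way to check those code points is Python's standard `unicodedata` module. A minimal sketch; the `describe` helper is my own name, and U+2069 is included as well because it is the character that closes all three isolates.

```python
import unicodedata


def describe(code_point: int) -> tuple[str, str]:
    """Return the official Unicode name and bidi class for a code point."""
    ch = chr(code_point)
    return unicodedata.name(ch), unicodedata.bidirectional(ch)


for cp in (0x2066, 0x2067, 0x2068, 0x2069):
    name, bidi_class = describe(cp)
    print(f"U+{cp:04X} {name} ({bidi_class})")
```

The names printed for U+2066 to U+2068 match the ones given above.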
YOU FORGOT TO MENTION IN ARABIC COMMENTS YOU CANNOT PRESS THE “READ MORE” BUTTON IT DOESNT WORK PLEASE MAKE THEM FIX THIS OR SMTHN
This used to be a really really annoying bug in Instagram for around 8 years and finally got fixed just half a year ago. You couldn't ever read comments with the main rtl language mixed with some English words.
Commenting to boost the algorithm
you think that's bad? some programs do both of these
1.flip the arabic so it's left to right
2. NOT connect the letters together
an example for this is goat simulator 1 on mobile
And on some mobile apps the entire interface goes from right to left lmao
I would be happy if YT studio could just have the same video management functionality of the desktop browser 😭😭😭. After that not mangling comments is just icing.
They really do need to fix this, with any text written right to left, since I have the same issue with Urdu as well as just Arabic ;-;
It's infuriating
And don't get me started on video games' chat making the Arabic letters disconnected from each other and reversed, forcing me to read the words letter by letter and from left to right...
It's nearly impossible to ping people with Arabic usernames on discord without clicking on the profile in specific because the entire thing renders backwards.
This can also be observed if you try to type in the same message as the ping, some words move before the @, and some move after the @, making everything a jumbled mess.
As an Arab, I can say that it's really frustrating when I use Arabic and English in the same comment. It makes it unreadable, especially when I use English numbers with Arabic text. YouTube just pushes the numbers to one side and the text to the other, making it hard to tell which sentence the number belongs to.
I'd like to thank you for bringing that up
I have the problem here, I have to spell the number in its own language when I actually want to write the number to save time and space. It happens on Facebook and Telegram more likely.
That's why I honestly prefer to write my comments in English or German (since I can also speak German)
That doesn't mean that I'm not proud of being an Arab though
As an Arab I actually didn’t realise this problem until I watched the video…
You can use the # symbol and insert the arabic/latin words between the #s
this is also a problem with Hebrew as it is also written right to left
test: שלום/hello
test: hello/שלום
in the first test I wrote in Hebrew first then English in the second I wrote with English first and Hebrew second
That appears correctly on English (US) Windows 10.
I'm just thankful for Canva