Followed you here from TikTok. I really like you, teacher; you're incredibly good at languages. The lesson on Khom Thai was exactly what I wanted.
Wow, wish that tool existed when I was living in Thailand. I agree with you that once one advances in the Thai language the spaces aren't necessary. But when you are learning a new language like Thai, where you have to learn a new (and somewhat large) alphabet, anything to get you over the initial hump of being able to communicate is welcome.
🙏 We have now an illuminated teacher. Thanks so much! 🙏
Thanks for the free tools!
Rooting for you. Khun Raj, you are a person of quality, more so than some of the YouTubers whose channels criticize you. Keep it up!
Linguaphile Jay has done it again! 👏👏👏
This should be helpful for a lot of beginners.
As he pointed out in the last video, Thai children are first taught to read with spaces between the words.
And now that I think of it, the first book I used, The Fundamentals of the Thai Language, I learned to read by looking at individual words.
Once I understood those, I could read short sentences. By the end of the book I was able to recognize where letters separated into words in complete sentences.
It's a shame nobody has done an updated version of that book.
Looks like you've set up an endpoint that handles the delimitation process, but you can do it much more efficiently on the client side with Intl.Segmenter (and it works for other non-spaced languages too). It's a newer feature so it wouldn't work on outdated browsers, but it might be worth checking out as it would save you some bandwidth. You could check if the browser supports it and fall back to the endpoint in case it doesn't.
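For anyone curious, the check-then-fall-back idea above could look roughly like this. It's only a sketch: `segmentWords` and `addSpaces` are made-up names, not anything from the actual tool, and how the segments compare with the endpoint's output is something you'd have to test yourself. `Intl.Segmenter` is reached through `any` here purely so the snippet compiles against older TypeScript lib targets.

```typescript
// Minimal sketch of client-side segmentation with feature detection.
// `segmentWords` and `addSpaces` are hypothetical helper names.

interface SegmentLike {
  segment: string;
}

// Returns the text split into segments, or null when the browser has no
// Intl.Segmenter, in which case the caller should fall back to the endpoint.
function segmentWords(text: string, locale: string = "th"): string[] | null {
  const SegmenterCtor = (Intl as any).Segmenter;
  if (typeof SegmenterCtor !== "function") {
    return null;
  }
  const segmenter = new SegmenterCtor(locale, { granularity: "word" });
  const parts: SegmentLike[] = Array.from(segmenter.segment(text));
  return parts.map((part: SegmentLike) => part.segment);
}

// Joins the segments with single spaces, dropping whitespace-only segments
// so that text which already contains spaces does not get doubled gaps.
function addSpaces(text: string, locale: string = "th"): string | null {
  const pieces = segmentWords(text, locale);
  if (pieces === null) {
    return null;
  }
  return pieces.filter((piece) => piece.trim() !== "").join(" ");
}
```

On the page you'd call `addSpaces(input)` first and only hit the endpoint when it returns `null`, which is where the bandwidth saving would come from.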
In Vietnamese I had/have the opposite issue. Each syllable is written separately, and generally each one has a meaning of its own, BUT many times IN THE SENTENCE it only makes sense as part of a two-syllable (sometimes longer) word. Something that would GROUP them together in Vietnamese, maybe with a color-coding system (two colors would be enough), would be great. For Japanese I also think color coding would be better. But many people may prefer spaces. So maybe having both options, color coding and spaces, would be best.
That's really amazing and what I need to start learning Thai. Too bad I'm still improving my Vietnamese and just started Mandarin. I wouldn't have the time now.
It's a really neat free tool, the Word Spacer. It would be great to have it with audio at some point.
Click on the open-in-Google link and you get that straight from Google.
Great addition Stuart! Super useful for the initial learning phase. Slightly off topic.... (I asked on another video) but I'm wondering if I could ask how to get the Google Sheet of the David Martin sentences you have in "10 sentences" videos... I couldn't find the resource anywhere. (fully ok if it's in one of the courses) I just cannot find it listed anywhere. Thanks again :)
I have all the sentences there available in the mindkraft git repo...link in the description. Go to resources>Thai
As for the sheet, for security reasons I can't fully share it, since to use it you need to allow the script on your machine. This will be another tool I'll build out for here, though, to make it accessible to everyone.
Which parser did you end up using with Node? I've attempted separating Thai words in sentences previously, but, like you, with variable results!
After trying many... for this, I use TNThai because I coded it in Svelte, so I wanted a Node wrapper. If it were just Python, there would be many more to choose from.
@StuartJayRaj What Python libraries do you suggest looking at?
I have thought about this process, mainly for Khmer, but also about how it would work with any language. Shorter words that sit inside longer words can get matched while a potentially longer word is still being scanned. Words not in the lookup list come out as separate letters until a new starting point and a matching word are found.
right. tested a bunch of parsers. there are much better and more flexible ones if I did this solely in python, but for this in node, I didn't want to have to build a wrapper library from scratch between python and node
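To make the lookup-list behaviour described above concrete, here is a toy greedy longest-match segmenter. To be clear, this is an illustration only and not how TNThai or any of the parsers mentioned in this thread actually work; the function name and the tiny English dictionary are invented for the example. Trying the longest candidate first keeps a short word from being caught inside a longer one, and anything not in the list falls out one character at a time until a match starts again.

```typescript
// Toy greedy longest-match segmentation over a lookup list.
// `longestMatchSegment` is a hypothetical name used only for this sketch.
function longestMatchSegment(text: string, dictionary: Set<string>): string[] {
  // Work on code points so multi-unit characters are not cut in half.
  const chars: string[] = Array.from(text);

  // The longest dictionary entry bounds how far ahead we need to look.
  let maxLen = 1;
  dictionary.forEach((word) => {
    maxLen = Math.max(maxLen, Array.from(word).length);
  });

  const out: string[] = [];
  let i = 0;
  while (i < chars.length) {
    let matchedLen = 0;
    // Try the longest candidate first, then shrink.
    for (let len = Math.min(maxLen, chars.length - i); len >= 1; len--) {
      const candidate = chars.slice(i, i + len).join("");
      if (dictionary.has(candidate)) {
        matchedLen = len;
        break;
      }
    }
    if (matchedLen === 0) {
      // Unknown text: emit a single character and try again from the next one.
      out.push(chars[i]);
      i += 1;
    } else {
      out.push(chars.slice(i, i + matchedLen).join(""));
      i += matchedLen;
    }
  }
  return out;
}

const toyDictionary = new Set<string>(["the", "there", "is", "in"]);
const toyResult = longestMatchSegment("thereisxyzin", toyDictionary);
// toyResult is ["there", "is", "x", "y", "z", "in"]
```

Greedy matching still goes wrong when the longest match at one position steals letters the next word needed, which is probably part of why results vary so much between parsers.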
The ฆวย Thai script is very beautiful.
U r so clever, Jay. ❤❤❤
I'd like to know whether this is true: the story about the ceramic plates bearing an inscription in an ancient language, which a Thai monk says is ancient Thai script nearly 2,000 years old.
youtube.com/@malamokitchen?si=g5VAG9uDtLrtUwNP
I'd like a foreigner to go and verify it. This channel often brings the topic up, but some of its videos have almost no credibility; it's like they just ramble on without any supporting evidence.
The ceramic plates... are no more than 2,000 years old... end of story.