
Automatic Transliterations

lt
Administrator
<a href="/en/translator/lt" class="userpopupinfo username" rel="user1">lt</a>
Joined: 27.05.2008

Automatic transliterations have been implemented for some languages. You'll see "Romanization" links above song lyrics, near the song language.

At the moment, transliteration is available for the following languages:
Belarusian
Korean
Macedonian
Mongolian
Russian
Serbian
Ukrainian

Do these automatic transliterations make sense? Are they accurate enough?

For some languages, there are a few transliteration options, e.g. Russian -> Japanese, Russian -> Chinese.

Please suggest languages that could be transliterated. Options other than Romanization are welcome too.

Editor
<a href="/en/translator/ahmetyal" class="userpopupinfo username" rel="user1256330">Kurdê Dîn</a>
Joined: 17.08.2015

Great addition, but some letters are capitalized when transliterated (it seems to be only "j") --> https://lyricstranslate.com/en/nokaut-%D0%BD%D0%BE%D0%B5%D0%BC%D0%B2%D1%...

Editor
<a href="/en/translator/nicholasovaloff" class="userpopupinfo username" rel="user1229450">Ondagordanto</a>
Joined: 19.12.2014

Great feature, thank you for implementing it! As Kurdê Dîn said, some letters appear weirdly: I saw Я appearing as YA instead of Ya. There may be others, but if I see them, I'll write another comment.

As for suggesting other languages, Bulgarian would definitely need to be added to the list. However, there is more than one system for transliterating it, so maybe adding two or possibly three ways would be best. Should I write a comment here or send you a PM explaining my suggestions in detail?

Editor in search of Anningan & Malina
<a href="/en/translator/sydney-lover" class="userpopupinfo username" rel="user1112972">DarkJoshua</a>
Joined: 10.05.2012

Considering there's more than one way of transliterating, it's hard to say what makes sense and what doesn't. That said, I noticed some inconsistencies for Russian: apart from "Я" being "YA" instead of "Ya", there's also "ё", which sometimes appears as "yë" and sometimes as "ë". The best way to transliterate it would be "yo", consistent with the way the rest of the iotised vowels are transliterated. Instances where "ё" is written "е" (which is common practice in Russian) have also been erroneously transliterated as "ye".
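The capitalization issue reported above ("Я" coming out as "YA" rather than "Ya") can be illustrated with a minimal sketch. The table excerpt and the function name are hypothetical and are not the site's actual code. The assumed rule is that a multi-letter replacement is fully uppercased only when a neighbouring letter shows that the whole word is in capitals.

```python
# Hypothetical excerpt of a Russian romanization table (lowercase only).
TABLE = {"я": "ya", "ю": "yu", "ё": "yo", "а": "a", "л": "l", "т": "t"}

def romanize(text: str) -> str:
    """Romanize text, title-casing digraphs unless the word is all caps."""
    out = []
    for i, ch in enumerate(text):
        latin = TABLE.get(ch.lower())
        if latin is None:
            out.append(ch)  # leave unknown characters untouched
            continue
        if ch.isupper():
            nxt = text[i + 1] if i + 1 < len(text) else ""
            prev = text[i - 1] if i > 0 else ""
            # Treat the letter as part of an all-caps word if the next letter
            # is uppercase, or if it ends a word whose previous letter is.
            if nxt.isupper() or (not nxt.isalpha() and prev.isupper()):
                latin = latin.upper()
            else:
                latin = latin.capitalize()
        out.append(latin)
    return "".join(out)

print(romanize("Ялта"))  # Yalta
print(romanize("ЯЛТА"))  # YALTA
```

A standalone capital "Я" has no uppercase neighbour, so this sketch renders it as "Ya".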

Greek should also be added.

Editor
<a href="/en/translator/joyce-su" class="userpopupinfo username" rel="user1375920">Joyce Su</a>
Joined: 17.03.2018

There is no standard for transliterating foreign languages using Chinese characters. The result could "sound" right but have no meaning at all.

In my experience of learning foreign languages, IPA is my first choice, and Romanization is good enough for me to pronounce the foreign language. When I use Chinese characters to pronounce a foreign language (I actually tried reading them for Russian lyrics), it just confuses me more. So I prefer to read the Romanization.

A question from a user: if the automatic transliterations are credited to the lyrics submitters, do they get the points too?

Editor
<a href="/en/translator/floppylou" class="userpopupinfo username" rel="user1336490">Floppylou</a>
Joined: 29.04.2017

Thank you for this new feature!

"Please suggest languages that could be transliterated" > Maybe Hebrew? Greek?

lt
Administrator
<a href="/en/translator/lt" class="userpopupinfo username" rel="user1">lt</a>
Joined: 27.05.2008

Bulgarian, Greek and Georgian transliterations have been added.

Russian->Japanese and Russian->Chinese have been removed.

Kurdê Dîn wrote:

some letters are capitalized when transliterated (it seems to be only "j")

fixed

Ondagordanto wrote:

Я appearing as YA instead of Ya.

fixed

DarkJoshua wrote:

"ё", which sometimes appears as "yë" and sometimes as "ë".

fixed

DarkJoshua wrote:

Instances where "ё" is written "е" (which is common practice in Russian) have also been erroneously transliterated as "ye".

It seems impossible to determine automatically whether the letter should be е or ё.

Joyce Su wrote:

A question from a user: if the automatic transliterations are credited to the lyrics submitters, do they get the points too?

no

Floppylou wrote:

Maybe Hebrew?

Is it possible to automatically transliterate Hebrew?

Thanks for your suggestions and corrections!

Editor
<a href="/en/translator/floppylou" class="userpopupinfo username" rel="user1336490">Floppylou</a>
Joined: 29.04.2017
lt wrote:
Floppylou wrote:

Maybe Hebrew?

Is it possible to automatically transliterate Hebrew?

Thanks for your suggestions and corrections!

If the vowel points (niqqud) are written, I guess it's possible to transliterate automatically. If they are not written (as in a lot of the Hebrew lyrics on LT), it's impossible.
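A rough way to tell the two cases apart is to check whether a text contains any Hebrew vowel points at all. The sketch below is only an illustration under that assumption. The function name is hypothetical, and it relies on the Hebrew point characters sitting in the Unicode range U+05B0 to U+05C7 with the general category "Mn" (nonspacing mark).

```python
import unicodedata

def has_niqqud(text: str) -> bool:
    """Return True if the text contains at least one Hebrew point (niqqud)."""
    return any(
        # Restrict to the Hebrew points range and skip punctuation such as
        # maqaf (U+05BE), which lives in the same range but is not a mark.
        0x05B0 <= ord(ch) <= 0x05C7 and unicodedata.category(ch) == "Mn"
        for ch in text
    )

pointed = "\u05E9\u05B8\u05DC\u05D5\u05B9\u05DD"  # "shalom" written with points
unpointed = "\u05E9\u05DC\u05D5\u05DD"            # the same word without points

print(has_niqqud(pointed))    # True
print(has_niqqud(unpointed))  # False
```

A check like this only says whether points are present. It does not say whether every word is fully pointed, so it would be a first filter at best.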

Moderator sapiens sapiens
<a href="/en/translator/knee427" class="userpopupinfo username" rel="user1110108">Alma Barroca</a>
Joined: 05.04.2012

I have to say that today, on my 7th anniversary here, reading this makes me happy.

It's amazing to see how much LT has grown and how many new features it has gained over time. It's a pleasure to be a part of this community!

On the way to our millionth translation!

Editor in search of Anningan & Malina
<a href="/en/translator/sydney-lover" class="userpopupinfo username" rel="user1112972">DarkJoshua</a>
Joined: 10.05.2012

Wow, never realised you had been here for so long. We joined LT one month and five days apart.

Moderator sapiens sapiens
<a href="/en/translator/knee427" class="userpopupinfo username" rel="user1110108">Alma Barroca</a>
Joined: 05.04.2012

We're getting old, pal, but let's not talk about our ages here 😂

Moderator of Romance Languages
<a href="/en/translator/carnivorouslamb" class="userpopupinfo username" rel="user1109697">phantasmagoria</a>
Joined: 31.03.2012

I've been here one month longer than you, Juan, and am thus older than both of you.

Question: Is Hebrew part of this list?

lt
Administrator
<a href="/en/translator/lt" class="userpopupinfo username" rel="user1">lt</a>
Joined: 27.05.2008

Based on https://lyricstranslate.com/en/comment/555068#comment-555068 , we assume Hebrew is not an option.

Moderator
<a href="/en/translator/thomas222" class="userpopupinfo username" rel="user1310118">Thomas222</a>
Joined: 06.10.2016

Please don't add it; it is much better to transliterate Hebrew by hand, or by ear for that matter. Different singers can sing the same word in different ways, and an automatic transliteration won't even notice.

lt
Administrator
<a href="/en/translator/lt" class="userpopupinfo username" rel="user1">lt</a>
Joined: 27.05.2008

A few more languages added. Full list:

Amharic
Armenian
Belarusian
Bengali
Bulgarian
Chinese
Chinese (Cantonese)
Georgian
Greek
Hindi
Japanese
Kazakh
Korean
Macedonian
Mongolian
Russian
Serbian
Tamil
Telugu
Ukrainian
Uzbek

Editor
<a href="/en/translator/joyce-su" class="userpopupinfo username" rel="user1375920">Joyce Su</a>
Joined: 17.03.2018

I just added lyrics in Chinese (Cantonese), and the Romanization was generated automatically. However, this Romanization follows the Mandarin Chinese pronunciation, not the Cantonese one. Both languages use the same Chinese characters, but their pronunciations are different.

https://lyricstranslate.com/en/wakin-chau-%E5%88%80%E5%89%91%E8%8B%A5%E6...

lt
Administrator
<a href="/en/translator/lt" class="userpopupinfo username" rel="user1">lt</a>
Joined: 27.05.2008
Joyce Su wrote:

I just added lyrics in Chinese (Cantonese), and the Romanization was generated automatically. However, this Romanization follows the Mandarin Chinese pronunciation, not the Cantonese one. Both languages use the same Chinese characters, but their pronunciations are different.

Automatic romanization for Chinese (Cantonese) has been disabled, thank you.

Editor ♥
<a href="/en/translator/swedens0ur" class="userpopupinfo username" rel="user1334503">swedensour</a>
Joined: 09.04.2017

I just noticed that it has been implemented for Tamil. The problem is that it uses the same system that GT uses, which is inaccurate. There are 3 l sounds and 3 n sounds that are very confusing when transliterated in general, but the GT system uses diacritics, which doesn't work because the resulting transliteration can't easily be read by those who speak Tamil. It also renders the word சீரணி as chiirani, when it should be seerani, so it's a much harder system for learners. But the IPA is great, so keep that part, I guess. Tamil transliteration is a mess and no system is good, tbh.

lt
Administrator
<a href="/en/translator/lt" class="userpopupinfo username" rel="user1">lt</a>
Joined: 27.05.2008
swedensour wrote:

I just noticed that it has been implemented for Tamil. The problem is that it uses the same system that GT uses, which is inaccurate. There are 3 l sounds and 3 n sounds that are very confusing when transliterated in general, but the GT system uses diacritics, which doesn't work because the resulting transliteration can't easily be read by those who speak Tamil. It also renders the word சீரணி as chiirani, when it should be seerani, so it's a much harder system for learners. But the IPA is great, so keep that part, I guess. Tamil transliteration is a mess and no system is good, tbh.

Thanks for the observation, this option for Tamil has been removed.

Editor ♥
<a href="/en/translator/swedens0ur" class="userpopupinfo username" rel="user1334503">swedensour</a>
Joined: 09.04.2017

Thank you.

Editor
<a href="/en/translator/nicholasovaloff" class="userpopupinfo username" rel="user1229450">Ondagordanto</a>
Joined: 19.12.2014

After observing the romanisation systems for Bulgarian for a while, I've noticed some issues. I'll group them by system:

First system (closer to the official one):

  1. Хх currently appears as Kh/kh – it would be much better for it to appear simply as Hh, because there is no k sound in it.
  2. I noticed Цц romanised as Ts/ts, whereas тс appears as t·s, with a separating dot in the middle to distinguish it from the other one. I understand why it's done like this, but the difference, in my opinion, is barely noticeable (unless bolded: t·s/t·s) and therefore misleading. Moreover, although similar, Цц is a single sound, whereas тс consists of two separate sounds, so they're not the same. I'd therefore suggest that Цц be Cc (as is done in other Slavic languages using the Latin script, e.g. Serbian, Czech, Slovak), and that тс become ts without a separating dot.
  3. Ъъ is currently Ŭŭ, but I think it will be better to appear as Ǎǎ, because as a sound, ъ is closer to а than to у.
  4. Юю and Яя appear as Yu/yu and Ya/ya respectively: nothing wrong with that. Йй, however, appears as Ĭĭ, which isn't that wrong, but Yy would be way better because the letters ю and я represent й+у and й+а respectively, and in this way they will be consistently transliterated.

Second system (more phonetic one, closer to other Slavic languages using the Latin script):

  1. Ъъ currently appears as ", which is very, very unusual. I've never seen it transliterated like this anywhere, plus it doesn't give a clue how one should pronounce it, so given the more phonetic nature of this romanisation, the best way for it would be Ǎǎ, just like in the other system.
  2. Шш appears as Šš – nothing wrong with that. Щщ, however, appears as Ŝŝ, which I've never seen before. I also doubt anyone would notice the difference between the caron and the circumflex, which, being so small, would be very misleading. In Bulgarian, Щщ is ш+т, so the best way is to transliterate it as Št/št.
  3. Юю and Яя appearing as Ûû and Ââ respectively: it's not that wrong, but it's pretty unusual, to say the least. Given that Йй is Jj in this system, Юю and Яя should be rather Ju/ju and Ja/ja respectively.

And lastly, I noticed both systems lack a transliterated ѝ (и with a grave accent) – it should get transliterated as ì in both systems (You may add, just in case, the uppercase Ѝ (Ì) as well, but it's never actually used).

EDIT: I also just saw ь transliterated as an apostrophe, which is also strange for Bulgarian. Since it represents the same sound as й does, ь should be transliterated as y in the first system and j in the second one.

EDIT 2: A few more vowels, just in case, would be better to have their transliterations too: А̀а̀ (Àà), О̀о̀ (Òò), У̀у̀ (Ùù), Ѐѐ (Èè), Ю̀ю̀ (1: /; 2: /), Я̀я̀ (1: /; 2: /).
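To make the suggestions above easier to compare, here is a minimal sketch of the two systems as lookup tables. It is not the site's implementation. The names are hypothetical, only the letters discussed in this post come from the suggestions, and the remaining mappings (the shared plain letters, ч, ш and щ in the first system, and ж, ч, ц and х in the second) are assumptions added so that the example runs.

```python
# Plain letters assumed to be identical in both systems (not from the thread).
COMMON = dict(zip("абвгдезиклмнопрстуф", "abvgdeziklmnoprstuf"))

# First system (closer to the official one), following the suggestions above.
SYSTEM_1 = {**COMMON, "ж": "zh", "й": "y", "х": "h", "ц": "c", "ч": "ch",
            "ш": "sh", "щ": "sht", "ъ": "ǎ", "ь": "y", "ю": "yu", "я": "ya",
            "ѝ": "ì"}

# Second system (more phonetic), following the suggestions above.
SYSTEM_2 = {**COMMON, "ж": "ž", "й": "j", "х": "h", "ц": "c", "ч": "č",
            "ш": "š", "щ": "št", "ъ": "ǎ", "ь": "j", "ю": "ju", "я": "ja",
            "ѝ": "ì"}

def transliterate(text: str, table: dict) -> str:
    """Map each Cyrillic letter through the table, keeping initial capitals."""
    out = []
    for ch in text:
        latin = table.get(ch.lower(), ch)  # unknown characters pass through
        out.append(latin.capitalize() if ch.isupper() else latin)
    return "".join(out)

print(transliterate("Христо", SYSTEM_1))  # Hristo
print(transliterate("юли", SYSTEM_1))     # yuli
print(transliterate("юли", SYSTEM_2))     # juli
```

Because there is no separating dot, тс and a single-letter ц stay distinct here only because ц is mapped to c rather than ts.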

Super Member
<a href="/en/translator/jadis" class="userpopupinfo username" rel="user1387945">Jadis</a>
Joined: 01.07.2018
Ondagordanto wrote:
  1. Хх currently appears as Kh/kh – it would be way better to appear simply as Hh because it has no k in it as sound.

In French, we currently use "Kh" to represent the Russian (or Bulgarian) "Хх": for instance, Nikita Khrouchtchev. This is nearly impossible for French people to pronounce, so they usually say "Kroutchev" (Kruchev), although there should be no "K" in it (and anyway, it should rather be something like Xrushchof). The "scientific" representation uses an "X", just like in Russian. I doubt that a Russian would understand that "Kruchev" means "Хрущёв"...
 

Editor
<a href="/en/translator/nicholasovaloff" class="userpopupinfo username" rel="user1229450">Ondagordanto</a>
Joined: 19.12.2014
Jadis wrote:

In French, we currently use "Kh" to represent the Russian (or Bulgarian) "Хх": for instance, Nikita Khrouchtchev. This is nearly impossible for French people to pronounce, so they usually say "Kroutchev" (Kruchev), although there should be no "K" in it (and anyway, it should rather be something like Xrushchof). The "scientific" representation uses an "X", just like in Russian. I doubt that a Russian would understand that "Kruchev" means "Хрущёв"...
 

I can see your point and I'm aware that the same would apply for other Romance languages, like Italian or Portuguese for instance. At the same time, however, not many languages lack the H sound and its allophones; in fact, Romance languages are well-known for lacking it (with the exception of Spanish [Jj] and Romanian [Hh]), but compared to many other languages from different groups, they appear to be the minority and shouldn't therefore have much of an impact on transliterating languages that feature this sound. All Slavic languages share it, English, German, Arabic, Finnish, Hungarian, Mandarin, Japanese, etc, do so too.

French (and probably other languages) may transliterate lacking sounds in different ways, but that doesn't mean Bulgarian specifically does it the same way. Kh is kind of outdated and it would be strange for a native speaker to type it like this. It's also seen from a very Romance-based point of view, which doesn't have much to do with representing a Slavic language's phonetic system, as I already said. Plus, a transliteration should also look as natural as possible to a native speaker – despite the fact it's usually meant to serve as a guide for non-natives.

Apart from that, if the first system is meant to be closer to the official one for Bulgarian (the current one is from 2006), then Хх should be Hh (Hristo Botev, Harmanli). An ideal romanisation system is difficult to achieve, but I'm rather prone to think that whoever is interested in learning how to pronounce this or that letter/sound in a certain language, can always look it up on Google or Wikipedia where IPA guides with audio examples are to be found.

lt
Administrator
<a href="/en/translator/lt" class="userpopupinfo username" rel="user1">lt</a>
Joined: 27.05.2008

[@Ondagordanto] Thanks for your suggestions, we tried to implement them. Could you now check the result?
Not sure about "EDIT 2" - are these Bulgarian vowels?

Editor
<a href="/en/translator/nicholasovaloff" class="userpopupinfo username" rel="user1229450">Ondagordanto</a>
Joined: 19.12.2014

Thank you very much! I just checked them and everything's perfect, with just a few small things you seem to have missed, they're all in the first system:

  • Capital Х appears as Kh instead of H
  • ѝ doesn't appear as ì, but just ѝ
  • Йй still appears as Ĭĭ instead of Yy
  • тс appears as t·s instead of ts

Meanwhile, I was wondering if the letter Жж (in the first system again) should get transliterated as Jj rather than Zh/zh.

The combination zh can be understood as зх, which is a common combination of letters in frequently used words like разходка (razhodka), разхубавил (razhubavil), разхайтен (razhayten). If put in the opposite direction, zh can appear ambiguous and therefore words like the ones I gave could be understood as ражодка, ражубавил and ражайтен instead, making no sense at all. Another example would be the word изживявам, which currently will appear as izzhivyavam, not making it clear for people who can't read the Cyrillic script whether the zzh should be pronounced as зж, ззх or even жж.

If you decide so, you can leave it the way it is currently, but in my opinion, Jj would make this ambiguity disappear; plus, the majority of Bulgarians do use Jj for Жж when typing in Latin script, so it's not uncommon practice at all.
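The ambiguity described above can be reproduced in a few lines. This is only a sketch with a hypothetical table excerpt covering the letters of the example words. It compares the real word разходка with the nonsense spelling ражодка under both choices for Жж.

```python
# Hypothetical excerpt covering only the letters of the example words.
BASE = {"р": "r", "а": "a", "з": "z", "х": "h", "о": "o", "д": "d", "к": "k"}
WITH_ZH = {**BASE, "ж": "zh"}  # current behaviour described in the thread
WITH_J = {**BASE, "ж": "j"}    # the alternative proposed above

def romanize(word: str, table: dict) -> str:
    """Letter-by-letter romanization; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in word)

real, nonsense = "разходка", "ражодка"

# With zh, two different Cyrillic spellings collapse into one Latin string.
print(romanize(real, WITH_ZH), romanize(nonsense, WITH_ZH))  # razhodka razhodka

# With j, the two spellings stay distinct.
print(romanize(real, WITH_J), romanize(nonsense, WITH_J))    # razhodka rajodka
```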

lt wrote:

Not sure about "EDIT 2" - are these Bulgarian vowels?

Yes, these are Bulgarian vowels with grave accents (for comparison, Russian uses acutes, e.g. и́). They're used on rare occasions (with the exception of ѝ, which is a word in itself) to mark where the stress falls in words whose pronunciation you can't tell for sure, to distinguish words whose written forms are the same but whose pronunciation differs (e.g. напра̀ви (He/she did) – направѝ (Do!)), or when a word is not pronounced the standard way (common practice in poetry and lyrics). That's why it would be best for these vowels to be transliterated as well, just in case.
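One possible way to handle these stressed vowels without listing each of them in the table is to carry the grave accent over to the Latin letter. The sketch below assumes Unicode normalization is available and uses a hypothetical lowercase-only table excerpt. It decomposes one character at a time, because decomposing the whole text would also split й into и plus a breve.

```python
import unicodedata

GRAVE = "\u0300"  # combining grave accent

# Hypothetical lowercase-only excerpt, enough for the example words.
BASE = {"н": "n", "а": "a", "п": "p", "р": "r", "в": "v", "и": "i",
        "о": "o", "у": "u", "е": "e"}

def transliterate_stressed(text: str, table: dict) -> str:
    """Transliterate while keeping grave accents on the resulting vowels."""
    out = []
    for ch in text:
        decomposed = unicodedata.normalize("NFD", ch)
        if ch == GRAVE:
            out.append(ch)  # accent typed as a separate combining mark
        elif len(decomposed) == 2 and decomposed[1] == GRAVE:
            # Precomposed letters such as ѝ: map the base, re-attach the accent.
            out.append(table.get(decomposed[0], decomposed[0]) + GRAVE)
        else:
            out.append(table.get(ch, ch))
    # Recompose so that "a" + grave becomes the single character "à".
    return unicodedata.normalize("NFC", "".join(out))

print(transliterate_stressed("напра\u0300ви", BASE))  # napràvi
print(transliterate_stressed("направ\u045d", BASE))   # napravì
```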

(Sorry for my extreme perfectionism and detailed explanations, I'm just trying to make things clear and as unambiguous as possible.)

lt
Administrator
<a href="/en/translator/lt" class="userpopupinfo username" rel="user1">lt</a>
Joined: 27.05.2008
Ondagordanto wrote:

just a few small things you seem to have missed, they're all in the first system

Everything is done except zh. Thank you!

Editor
<a href="/en/translator/nicholasovaloff" class="userpopupinfo username" rel="user1229450">Ondagordanto</a>
Joined: 19.12.2014

Excellent, just checked it and everything's flawless. Thank you very much again.

Editor
<a href="/en/translator/floppylou" class="userpopupinfo username" rel="user1336490">Floppylou</a>
Joined: 29.04.2017

Hi [@lt], I saw that Amharic is available for this new option. Maybe you can add Tigrinya (Eritrea's official language), which uses the same alphabet (the Geʽez script)?

lt
Administrator
<a href="/en/translator/lt" class="userpopupinfo username" rel="user1">lt</a>
Joined: 27.05.2008
Floppylou wrote:

Hi [@lt], I saw that Amharic is available for this new option. Maybe you can add Tigrinya (Eritrea's official language), which uses the same alphabet (the Geʽez script)?

If the transliteration rules are the same, we will add Tigrinya.

Super Member
<a href="/en/translator/tonyl" class="userpopupinfo username" rel="user1204550">tonyl</a>
Joined: 07.04.2014

I noticed that now there is an automatic transliteration.

And I want to say that I'm sure there's at least one mistake in every Japanese song.
Here are just three that I looked at now (and I've seen the same in every Japanese song that I've looked at since, especially older ones):

https://lyricstranslate.com/en/%E9%85%92%E5%AD%A3%E3%81%AE%E6%AD%8C-shuk...
1. "shu" → "zake"
2. は can be "ha" but also "wa"
3. "tokkuri" → "tokuri" (both are valid, the second one is chosen in the song to fit the 7 5)
4. "Shō imo no ni koro ga shi" → "koimo no nikkorogashi"
5. "ashita no katari kusa" → "asu no katarigusa"

https://lyricstranslate.com/en/%E5%8F%A4%E5%9F%8E-kojou-old-castle.html
1. "aogeba bishi" → "aogeba wabishi"
2. "Yadan" → "yadama"
3. "ōko" → "mukashi"
4. "sora iku" → "sora yuku"

https://lyricstranslate.com/en/%E3%81%BB%E3%82%8D%E9%85%94%E3%81%84-tips...
(and even in more modern songs:)
1. "idakitai" → "dakitai"
2. "Itakatta rōshin kizutsuite" → "itakattarou kokoro kizutsuite"

I think that these examples are nice, since they feature both simple mistakes that maybe a slightly better automatic transliterator could fix,
but also occurrences where the choice of how to read the kanji is purely up to who wrote the song,
(Like asu - ashita, both mean 'tomorrow' and are written the same)
and in some cases even who sings it.
(there are examples of songs covered by different singers where they use a completely different pronunciation of some characters.)
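The second kind of problem can be sketched as a lookup in which one written form has several valid readings. This is a toy illustration, not how the site works. The names are hypothetical, the romanized readings come from the examples in this thread, and the kanji spellings used as keys are my assumption.

```python
# One written form, several valid readings (first entry = most common guess).
READINGS = {
    "明日": ["ashita", "asu"],    # both mean "tomorrow"
    "去年": ["kyonen", "kozo"],   # "kozo" is the out-dated usage
    "一人": ["hitori", "ichinin"],
}

def candidate_readings(word: str) -> list:
    """Return every reading known for a written form (empty if unknown)."""
    return READINGS.get(word, [])

def needs_human_check(word: str) -> bool:
    """True when the text alone cannot decide which reading is sung."""
    return len(candidate_readings(word)) > 1

for word in READINGS:
    print(word, candidate_readings(word), needs_human_check(word))
```

An automatic transliterator can only pick one candidate, so for entries like these the result has to be checked by listening to the song.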

Super Member
<a href="/en/translator/tonyl" class="userpopupinfo username" rel="user1204550">tonyl</a>
Joined: 07.04.2014

Another example:

https://lyricstranslate.com/en/%E3%81%95%E3%81%8F%E3%82%89%E8%B2%9D%E3%8...

Sari ikeru kimi ni sasagen [Sariyukeru] (could be both 'i' and 'yu', no way to tell but to listen)
Kono kai wa kyonen no hamabe ni [kyonen --> kozo] (no way to tell, out-dated usage)
Ware ichi nin hiroi shi kai yo [ichi nin --> hitori] (this one is usually hitori)

Honobono to usu beni shimu ru wa [shimu ru --> somuru] (classical Japanese)
Waga yuru samishi chishio yo [yuru --> moyuru] (classical Japanese)
Wa roba ro to kayou kaori wa [wa roba --> harobaro]
Kimi koi uru mune no sa zanami [koi uru --> kouru] (classical Japanese) [sazanami - one word, could be fixed]

lt
Administrator
<a href="/en/translator/lt" class="userpopupinfo username" rel="user1">lt</a>
Joined: 27.05.2008
tonyl wrote:

no way to tell but to listen

In such cases, automatic transliteration cannot help.

Some things are fixed. Unfortunately, we are not able to fix the rest now.

Do you think we should get rid of automatic transliteration for Japanese, or is it still better with it than without it?

fixed

tonyl wrote:

[wa roba --> harobaro]

fixed

Editor in search of Anningan & Malina
<a href="/en/translator/sydney-lover" class="userpopupinfo username" rel="user1112972">DarkJoshua</a>
Joined: 10.05.2012

I think it would be helpful, as long as it's possible, to add names to the automatic transliterations. Calling them "Romanisation 1" and "Romanisation 2" doesn't really help. I'd suggest showing the name of the transliteration system in question. It gets more technical, but anyone who's specifically looking for transliterations would find it useful.

lt
Administrator
<a href="/en/translator/lt" class="userpopupinfo username" rel="user1">lt</a>
Joined: 27.05.2008
DarkJoshua wrote:

I'd suggest showing the name of the transliteration system in question.

Could you provide some examples, please?

In most cases, we use romanization (conversion to the Roman (Latin) script).

Fonipa is used for several languages, and Cyrillic is used for one language (Uzbek).

Editor in search of Anningan & Malina
<a href="/en/translator/sydney-lover" class="userpopupinfo username" rel="user1112972">DarkJoshua</a>
Joined: 10.05.2012

Different transliteration systems have different names. As you can see from other comments, transliteration is mainly based on choices: should a letter be transliterated following its pronunciation? Or would it be better to choose one unambiguous letter for each letter of the other language's alphabet? Some choices are also made for historical reasons.
For instance, here you can see a table of different systems of transliteration for Russian, the way they differ and their names.
This is an example for Greek.
As I said, a downside is that this would get too technical, but at least everything would be categorised and well ordered.

lt
Administrator
<a href="/en/translator/lt" class="userpopupinfo username" rel="user1">lt</a>
Joined: 27.05.2008

It really would be too technical for most users. In addition, after the edits to the transliteration tables, the Russian and Bulgarian versions can no longer claim to follow a standard exactly.

In general, transliterations of most languages use the BGN system.
