Chinese Dictionaries for OmegaT

Thanks for sharing this, Weedy! Our discussions, along with James, have been really constructive for me. Thanks, guys! I’m learning what school doesn’t seem up to date on teaching. OmegaT for freelance translators! Cheers

People, Places, and Food

by Weedy Tan on January 25, 2014

When I first started using OmegaT, I couldn’t figure out how to find and install a suitable Chinese dictionary. I didn’t mind much, since I could use online Chinese <> English dictionaries while still learning OmegaT. However, after reading some old posts and discussions in the OmegaT Yahoo support group, I decided to research this and find a way to install Chinese dictionaries.

There are, in fact, many Chinese <> English dictionaries available in both Traditional and Simplified Chinese. However, as a novice OmegaT user, I couldn’t understand the differences among them. According to the OmegaT manual, I needed to find a packed or zipped file with the *.tar.bz2 file extension. When unzipped, it should contain 3 files with the following file extensions:

    1. *
    2. *.idx
    3. *.ifo

And all the above should have…

View original post 353 more words
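
The excerpt above describes a *.tar.bz2 archive that should unpack to three files, two of which are named as *.idx and *.ifo. Here is a minimal Python sketch of how one might check a downloaded archive before installing it. The function names are hypothetical, and because the first entry in the list is truncated in the excerpt, the sketch leaves that file unchecked rather than guessing its extension.

```python
import tarfile

# The two extensions the excerpt names explicitly. The first list entry is
# truncated in the excerpt, so it is deliberately not checked here.
REQUIRED_SUFFIXES = (".idx", ".ifo")


def missing_suffixes(member_names, required=REQUIRED_SUFFIXES):
    """Return the required suffixes that no name in member_names ends with."""
    lowered = [name.lower() for name in member_names]
    return [s for s in required if not any(n.endswith(s) for n in lowered)]


def check_archive(path):
    """Return (file names inside the .tar.bz2 archive, missing suffixes)."""
    with tarfile.open(path, "r:bz2") as archive:
        names = [m.name for m in archive.getmembers() if m.isfile()]
    return names, missing_suffixes(names)
```

Calling `check_archive` on a downloaded file returns the file names found in the archive and whichever of the two extensions are missing. An empty second value means both were found.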


Sentence Segmentation Rules for Ancient Chinese Buddhist Text

Nicely written post on how to set up Chinese language segmentation rules for the OmegaT translation software. It’s open source, quick, and nimble.

People, Places, and Food


by Weedy Tan on January 14, 2014

After I came out with a set of OmegaT sentence segmentation rules for typical Chinese text, there was a request from someone in the Yahoo support group to add segmentation rules for certain non-standard punctuation marks found in ancient Chinese Buddhist text.

Since this request applies only to this type of ancient Chinese Buddhist text (and possibly some ancient “Classical Chinese” text as well) and not to the punctuation marks mandated by the present governments, I suggested that these should not be included in the typical Chinese segmentation rules. (Both the Chinese and Taiwanese governments came out with their own sets of punctuation marks, though they are very much the same in practical usage. Also, Traditional Chinese is used in Taiwan, while Simplified Chinese is used in China.)

Instead, I volunteered to make a different set of segmentation rules for this purpose. Whether…

View original post 297 more words
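
To give a rough idea of what keeping a separate rule set for non-standard punctuation means, here is a small Python sketch. It is not OmegaT's rule format and not the rules from the original post: the function name, the default set of sentence-ending marks, and the extra mark used in the example are all assumptions made for illustration.

```python
import re

# Assumption: these full-width marks end a sentence in typical modern text.
DEFAULT_ENDINGS = "。！？"


def segment(text, extra_endings=""):
    """Split text into segments, breaking after each run of ending marks.

    extra_endings holds additional marks that should also end a segment,
    for example marks that only matter for one special kind of text.
    """
    marks = re.escape(DEFAULT_ENDINGS + extra_endings)
    # A segment is either non-mark characters followed by any run of marks,
    # or a run of marks on its own (e.g. at the very start of the text).
    pattern = re.compile(f"[^{marks}]+[{marks}]*|[{marks}]+")
    pieces = (piece.strip() for piece in pattern.findall(text))
    return [piece for piece in pieces if piece]


# Typical text uses the default marks only.
print(segment("你好！今天天气很好。你呢？"))

# A special text type passes its own extra marks (here "；" is only a stand-in).
print(segment("甲；乙。", extra_endings="；"))
```

Keeping the extra marks in a separate argument mirrors the idea in the excerpt: the typical rules stay untouched, and the special marks apply only when a text calls for them.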

Basic ideas on the Chinese language and writing forms



This post was inspired by questions from a fellow translator on Twitter. Thanks to Eva Hussain, aka @Eva_Polaron, for inquiring about my language pairs. I’m trained to translate French to English, but I’m interested in translating Mandarin Chinese to English, since I’ve spent quite a few years working and traveling in China.

What are the different languages and writing forms in China? It’s a simple question for a Chinese person who has grown up there, or for someone like myself who tries to put effort into learning local languages when traveling abroad. A diversity of local languages and dialects is found in most parts of the world. This is true in China too, where crossing a river to another city or village can give one the impression of having entered another language realm. I got that feeling the first time I went to Wenzhou after having spent a few years in the North. Again, it all depends on the places you visit. My experience is mostly in the North, but I did visit Guangdong, or the Canton region, and was pleasantly surprised by the variety of languages in the parts I visited. Most people spoke Cantonese, but I visited an area where the Kejia (Hakka) dialect / language was also common. Again, I use the terms dialect and language a bit loosely here. Wiki has a map that explains this. Check it out here.

So what are the differences between Traditional and Simplified characters? First off, the names in Chinese are 繁体 fanti (said “fan tea”) and 简体 jianti (pronounced “gee’N tea”). I’m currently learning Simplified, but I always try to catch a glimpse of the Traditional character when looking up words / characters. Generally speaking, one can say that a common worker in China can read and understand about 3000-5000 characters. A university-educated person perhaps has a larger mastery and can read and understand somewhere up to 10,000 characters.

When using social networking and whatnot, most people try to use Mandarin, but at times, when discussing things with people who come from the same location, people can get creative with Chinese and adapt it to their dialect or language. Just think of the expression “How y’all doing” or something similar, and it could perhaps give you an idea of how creative people can get. I first noticed this when visiting Shanghai. I’d picked up an English / Shanghainese phrase book and glanced at local expressions and words. For example, the word “teacher” is generally written as 老师, but in Shanghai it can be written as 老死, since the pronunciation there is a bit lighter on the “R” and “H” sounds. Again, I only have a superficial knowledge of Shanghainese, since I only met a couple of people there who had the patience to try teaching me this local language / dialect. It’s a pity, but Shanghainese is slowly fading away, since most kids are educated in Mandarin and are actively learning English.

What are the differences between Traditional and Simplified writing forms? First off, Traditional Chinese is seen as more complicated to write, since it uses more brush strokes to produce each character. Simplified Chinese was introduced sometime around the 1950s in order to promote literacy. I have a list showing 2473 characters that were simplified from their original Traditional form. I don’t claim to understand them all, since some are somewhat obscure to me. Most online dictionaries will provide you with both forms. Some of the 2473 characters somehow don’t show up, so I’m not quite sure of the validity of my list.

If you are interested in learning more about characters and getting a feel for how to write them, I highly recommend the following website. I’ve taken the liberty of using her photo in my post. The person who created it really has a sense that education should be entertaining. Edutainment, so to speak.