Friday, December 23, 2011

Context Service

The three pillars of the Chinese language are character, context, and syllable. I first finished the Cantonese syllables, then finished importing CEDICT. I have now modified the original context structure to match CEDICT, and extracted from CEDICT every single-character entry, which comes to just over 10,000 distinct characters. Some characters are still left out of the Cantonese syllables.
Here is the URL for the context service:
http://pcwong.org/cgi-bin/cantoneseContext.cgi
Only SINGLE Chinese characters can be looked up.
There are a few services available:

  1. look up a character (word) by its Cantonese syllable/sound
  2. look up the context meaning of a character (word)
  3. search for a record by either Chinese character or English word
Of course, besides those there are also ADD and MODIFY, but they require a permitted user, which will hopefully keep the spammers out.
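
For anyone curious how single-character entries might be pulled out of a CEDICT-style file, here is a minimal sketch in Python. It is not the code behind the service. It assumes the CC-CEDICT line layout (traditional form, simplified form, pinyin in square brackets, then slash-separated glosses), counts both the traditional and simplified forms, and uses a hypothetical function name and hand-written sample lines rather than lines quoted from the real file.

```python
def single_character_headwords(lines):
    """Collect the distinct one-character headwords from CEDICT-style lines."""
    seen = set()
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumed layout: TRADITIONAL SIMPLIFIED [pinyin] /gloss/gloss/
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        traditional, simplified = fields[0], fields[1]
        for headword in (traditional, simplified):
            # In Python 3, len() counts code points, so one character has length 1.
            if len(headword) == 1:
                seen.add(headword)
    return seen


# Hand-written sample lines in the assumed layout (illustrative only).
sample = [
    "# sample data, not taken from the real file",
    "中 中 [zhong1] /middle/center/",
    "中國 中国 [Zhong1 guo2] /China/",
    "國 国 [guo2] /country/nation/",
]
chars = single_character_headwords(sample)
print(sorted(chars))
```

Run against the full file, the size of the returned set would be the count of distinct single characters, though the exact number depends on whether both forms are counted.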

Thursday, December 15, 2011

CEDICT

CEDICT stands for Chinese English Dictionary. The original website is at:
ftp://ftp.monash.edu.au/pub/nihongo/cedict.html
The most current version is at:
http://www.mdbg.net/chindict/chindict.php?page=cedict
Some information on the continuation of the project is at:
http://www.cs.cmu.edu/~eepeter/cedict_readme.txt
There are a lot of mirrors and websites implementing the dictionary using the CEDICT data.
I always misspell CEDICT as CEDIT, somehow thinking of EDITOR.
My implementation is now at:
http://pcwong.org/cgi-bin/wordlookDB.pl
The data has been loaded into a database and combined with the earlier set. Hopefully I will add some more current data in the future to improve the set. Enjoy the convenience and the sharing of the data!
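
As a rough illustration of what loading the data involves, here is a small Python sketch that splits one dictionary line into fields before it goes into a database. It assumes the CC-CEDICT line layout (traditional form, simplified form, pinyin in square brackets, slash-separated glosses). The function name, the field names, and the sample line are my own for illustration and are not taken from the implementation above.

```python
import re

# Assumed layout: TRADITIONAL SIMPLIFIED [pinyin] /gloss 1/gloss 2/
CEDICT_LINE = re.compile(r"^(\S+)\s+(\S+)\s+\[([^\]]*)\]\s+/(.*)/\s*$")


def parse_cedict_line(line):
    """Return a dict for one entry line, or None for comments and unparseable lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = CEDICT_LINE.match(line)
    if match is None:
        return None
    traditional, simplified, pinyin, glosses = match.groups()
    return {
        "traditional": traditional,
        "simplified": simplified,
        "pinyin": pinyin,
        # Glosses are separated by '/' inside the outer pair of slashes.
        "glosses": glosses.split("/"),
    }


# Hand-written sample line in the assumed layout (illustrative only).
entry = parse_cedict_line("中國 中国 [Zhong1 guo2] /China/Middle Kingdom/")
print(entry)
```

Each returned dict maps naturally onto one database row, with the glosses either joined back into one column or stored in a separate table.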