Kana Character Conversion in JavaScript (Part 1)

When building web applications for enterprise customers in Japan, we often have to integrate with legacy systems that impose specific character requirements:

  1. Legacy systems require half-width kana, but most browsers enter full-width kana by default.
  2. Legacy systems do not support JIS Level 2 (第二水準) kanji in Shift_JIS (SJIS), but most browsers accept these characters.

The best place to address these character issues is usually at the point of entry in the browser. Restricting and converting characters there typically means less work for the integration systems downstream.

Part 1 of this article focuses on the first issue: half-width kana characters.

Half-width Kana Character Conversion Code Sample:
To set up our example, we first need to add the requisite libraries. To keep things simple, we will use only jQuery.
Next, we need to create the HTML for kana entry:
The following code will be used for replacement of full-width kana characters with half-width kana characters:
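A minimal sketch of this replacement step, under stated assumptions: the tables `FULL_KANA` and `HALF_KANA` and the helper `convertFullToHalf` are illustrative names for this example, not taken from the original sample, and the tables cover only a handful of characters. The idea is to look up each input character in the full-width table and emit the entry at the same index of the half-width table.

```javascript
// Illustrative tables (assumption): a tiny, index-aligned subset.
// A real table would cover every kana the legacy system accepts.
var FULL_KANA = ["\u30AC", "\u30D1", "\u30AB", "\u30CF", "\u30FC"]; // ガ パ カ ハ ー
var HALF_KANA = ["\uFF76\uFF9E", "\uFF8A\uFF9F", "\uFF76", "\uFF8A", "\uFF70"]; // ｶﾞ ﾊﾟ ｶ ﾊ ｰ

// Hypothetical helper: replaces every character found in `fullWidth`
// with the entry at the same index in `halfWidth`.
function convertFullToHalf(text, fullWidth, halfWidth) {
  var result = "";
  for (var i = 0; i < text.length; i++) {
    var ch = text.charAt(i);
    var index = fullWidth.indexOf(ch);
    // Characters without a table entry pass through unchanged.
    result += index === -1 ? ch : halfWidth[index];
  }
  return result;
}

// prints the half-width form: ｶﾞｰｶ
console.log(convertFullToHalf("\u30AC\u30FC\u30AB", FULL_KANA, HALF_KANA));
```

Note that a voiced full-width character such as ガ becomes two half-width characters (the base kana plus the voiced sound mark), so the converted string can be longer than the input. Downstream length limits may need to account for that.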
Finally, we will add the conversion characters. Note that any replacement characters can be added or removed from this list.
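As a hedged illustration of adding and removing entries, the sketch below keeps the replacements in a plain object and extends it with full-width punctuation. The names `KANA_TABLE`, `addReplacement`, `removeReplacement`, and `applyTable` are hypothetical and chosen for this example only; the code points are the standard Unicode half-width forms.

```javascript
// Hypothetical lookup table keyed by full-width character (small subset).
var KANA_TABLE = {
  "\u30A2": "\uFF71",        // ア -> ｱ
  "\u30AC": "\uFF76\uFF9E"   // ガ -> ｶ + voiced sound mark
};

function addReplacement(table, fullWidth, halfWidth) {
  table[fullWidth] = halfWidth;
}

function removeReplacement(table, fullWidth) {
  delete table[fullWidth];
}

// Extend the table with punctuation that also has half-width forms.
addReplacement(KANA_TABLE, "\u3002", "\uFF61"); // ideographic full stop
addReplacement(KANA_TABLE, "\u3001", "\uFF64"); // ideographic comma
addReplacement(KANA_TABLE, "\u300C", "\uFF62"); // left corner bracket
addReplacement(KANA_TABLE, "\u300D", "\uFF63"); // right corner bracket
addReplacement(KANA_TABLE, "\u30FB", "\uFF65"); // katakana middle dot

// Applies the table one character at a time; unknown characters are kept.
function applyTable(text, table) {
  return text.split("").map(function (ch) {
    return Object.prototype.hasOwnProperty.call(table, ch) ? table[ch] : ch;
  }).join("");
}

console.log(applyTable("\u300C\u30A2\u300D\u3002", KANA_TABLE));
```

Whether punctuation should be converted at all depends on what the downstream system accepts, so entries like these are best treated as optional.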
When the "convert Full to Half" button is clicked, all characters that were entered as full-width kana are converted to half-width kana.

The conversion code can also be wired into KnockoutJS or AngularJS so that the character conversion happens without having to click a button. However, that will be the topic of the next blog post.

A working example can be seen on jsFiddle.
http://jsfiddle.net/mkgo69gk/3/

About the Author
Kent Horng worked in Silicon Valley for 6 years at various startup companies before moving to Japan and later joining Enrapt. He has spent the last 8 years working on various web/cloud/tablet applications and is currently obsessed with iOS, AngularJS, and HTML5.

2 responses to “Kana Character Conversion in JavaScript (Part 1)”

  1. Since it's not good practice to leave Japanese characters inside JavaScript source, I am also providing the same sample with the characters written as Unicode escapes.

    The changed code is below:

    // Half-width kana, index-aligned with WKANA.
    var HKANA = [
    "\uFF76\uFF9E", "\uFF77\uFF9E", "\uFF78\uFF9E", "\uFF79\uFF9E", "\uFF7A\uFF9E", "\uFF7B\uFF9E", "\uFF7C\uFF9E", "\uFF7D\uFF9E", "\uFF7E\uFF9E", "\uFF7F\uFF9E",
    "\uFF80\uFF9E", "\uFF81\uFF9E", "\uFF82\uFF9E", "\uFF83\uFF9E", "\uFF84\uFF9E", "\uFF8A\uFF9E", "\uFF8B\uFF9E", "\uFF8C\uFF9E", "\uFF8D\uFF9E", "\uFF8E\uFF9E", "\uFF73\uFF9E", // voiced (dakuon)
    "\uFF8A\uFF9F", "\uFF8B\uFF9F", "\uFF8C\uFF9F", "\uFF8D\uFF9F", "\uFF8E\uFF9F", // semi-voiced (handakuon)
    "\uFF67", "\uFF68", "\uFF69", "\uFF6A", "\uFF6B", "\uFF6C", "\uFF6D", "\uFF6E", "\uFF6F", "\uFF70", "", // small kana and long-vowel mark; "" where no half-width form exists
    "\uFF71", "\uFF72", "\uFF73", "\uFF74", "\uFF75", "\uFF76", "\uFF77", "\uFF78", "\uFF79", "\uFF7A", // gojuon (basic kana) start
    "\uFF7B", "\uFF7C", "\uFF7D", "\uFF7E", "\uFF7F", "\uFF80", "\uFF81", "\uFF82", "\uFF83", "\uFF84", "\uFF85", "\uFF86", "\uFF87", "\uFF88", "\uFF89", "\uFF8A", "\uFF8B", "\uFF8C", "\uFF8D", "\uFF8E", "\uFF8F", "\uFF90", "\uFF91", "\uFF92", "\uFF93", "\uFF94", "\uFF95", "\uFF96", "\uFF97", "\uFF98", "\uFF99", "\uFF9A", "\uFF9B", "\uFF9C", "", "\uFF66", "", "\uFF9D" // gojuon end
    ];

    // Full-width kana, index-aligned with HKANA.
    var WKANA = [
    "\u30AC", "\u30AE", "\u30B0", "\u30B2", "\u30B4", "\u30B6", "\u30B8", "\u30BA", "\u30BC", "\u30BE", "\u30C0", "\u30C2", "\u30C5", "\u30C7", "\u30C9", "\u30D0", "\u30D3", "\u30D6", "\u30D9", "\u30DC", "\u30F4", // voiced (dakuon)
    "\u30D1", "\u30D4", "\u30D7", "\u30DA", "\u30DD", // semi-voiced (handakuon)
    "\u30A1", "\u30A3", "\u30A5", "\u30A7", "\u30A9", "\u30E3", "\u30E5", "\u30E7", "\u30C3", "\u30FC", "\u30EE", // small kana and long-vowel mark
    "\u30A2", "\u30A4", "\u30A6", "\u30A8", "\u30AA", "\u30AB", "\u30AD", "\u30AF", "\u30B1", "\u30B3", // gojuon (basic kana) start
    "\u30B5", "\u30B7", "\u30B9", "\u30BB", "\u30BD", "\u30BF", "\u30C1", "\u30C4", "\u30C6", "\u30C8", "\u30CA", "\u30CB", "\u30CC", "\u30CD", "\u30CE", "\u30CF", "\u30D2", "\u30D5", "\u30D8", "\u30DB", "\u30DE", "\u30DF", "\u30E0", "\u30E1", "\u30E2", "\u30E4", "\u30E6", "\u30E8", "\u30E9", "\u30EA", "\u30EB", "\u30EC", "\u30ED", "\u30EF", "\u30F0", "\u30F2", "\u30F1", "\u30F3" // gojuon end
    ];

    I have also created a new jsFiddle with the changes.
    http://jsfiddle.net/mkgo69gk/5/
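Hand-written escape tables like the ones in the first response are easy to get subtly wrong, so it can help to generate and check them with a small script. The following is only a sketch: `toUnicodeEscapes` and `findTableProblems` are hypothetical helper names introduced here, not part of the sample or of any library.

```javascript
// Hypothetical helper: writes each UTF-16 code unit as a \uXXXX escape,
// so Japanese literals never have to appear in the source file.
function toUnicodeEscapes(text) {
  var out = "";
  for (var i = 0; i < text.length; i++) {
    var hex = text.charCodeAt(i).toString(16).toUpperCase();
    out += "\\u" + ("0000" + hex).slice(-4);
  }
  return out;
}

// Hypothetical check: the two tables must have the same length, and every
// full-width entry must be exactly one character, or lookups will misalign.
function findTableProblems(fullWidth, halfWidth) {
  var problems = [];
  if (fullWidth.length !== halfWidth.length) {
    problems.push("length mismatch: " + fullWidth.length + " vs " + halfWidth.length);
  }
  fullWidth.forEach(function (entry, index) {
    if (entry.length !== 1) {
      problems.push("entry " + index + " is not a single character: " + entry);
    }
  });
  return problems;
}

console.log(toUnicodeEscapes("\u30AC")); // prints \u30AC
console.log(findTableProblems(["\u30E0", "u30E1"], ["\uFF91", "\uFF92"]));
```

An entry that has lost its backslash, such as "u30E1", is a five-character string rather than a single kana, so the second call reports it.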

  2. Dear Mr. Kent Horng,

    Thanks for your code.
    I would like to apply your code to convert half-width Japanese to full-width Japanese. However, it does not seem to work as expected.

    Could you please give me a hint on how to do that?
