
IDNA Demo

*Results of Operation*
Mode	Text	Code Points
Input		(empty)
ToASCII(input)
ToUnicode(ToASCII(input))
ToUnicode(input)
ToASCII(ToUnicode(input))

About this demo

This CGI program demonstrates ICU's IDNA implementation. The IDNA specification (RFC 3490) defines two operations: ToASCII and ToUnicode. Domain labels containing non-ASCII code points must be processed by the ToASCII operation before they are passed to resolver libraries. Domain names obtained from resolver libraries must be processed by the ToUnicode operation before they are displayed to the user. IDNA requires that implementations process input strings with Nameprep, which is a profile of Stringprep, and then with Punycode.
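As a rough illustration of the two operations, the sketch below uses Python's built-in `idna` codec rather than ICU's own API. That codec follows the same RFC 3490 pipeline (Nameprep, then Punycode) and works label by label. The helper names `to_ascii` and `to_unicode` and the sample domain are assumptions made up for this example.

```python
def to_ascii(domain):
    """ToASCII on every label: Nameprep, then Punycode plus the 'xn--' prefix."""
    return domain.encode("idna").decode("ascii")


def to_unicode(ace_domain):
    """ToUnicode on every label of an ASCII-compatible (ACE) domain name."""
    return ace_domain.encode("ascii").decode("idna")


name = "b\u00fccher.example"   # hypothetical input: "bücher.example"
ace = to_ascii(name)           # the form handed to a resolver library
print(ace)
print(to_unicode(ace) == name) # the round trip restores the displayable form

# The bare Punycode step, without Nameprep or the ACE prefix:
print("b\u00fccher".encode("punycode"))
```

Labels that are already plain ASCII pass through `to_ascii` unchanged, so the helper can be applied to any domain name before a lookup.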

In the demo above, different combinations of ToASCII and ToUnicode are applied to the input. The demo also gives a simple illustration of how a GUI can visually indicate boundaries between different scripts, to help avoid spoofing. The code is rough and meant only for illustration. It could certainly be refined to call out more characters that are visually confusable. For example, many CJK Radicals are identical in appearance to CJK Ideographs. Mixtures of simplified and traditional characters can also be visually highlighted, to help signal possible user errors.
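The script-boundary idea can be sketched in a few lines of Python. This is not the demo's code: Python's standard library does not expose the Unicode Script property, so the sketch takes the first word of each character's Unicode name as a crude stand-in. The function names `rough_script`, `script_runs`, and `mark_boundaries` and the `|` marker are hypothetical.

```python
import unicodedata
from itertools import groupby


def rough_script(ch):
    # Assumption: the first word of the character name approximates the script.
    # Digits, hyphens, and other non-letters are lumped together as "COMMON".
    if not ch.isalpha():
        return "COMMON"
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def script_runs(label):
    """Split a label into (script, text) runs of consecutive same-script characters."""
    return [(script, "".join(chars))
            for script, chars in groupby(label, key=rough_script)]


def mark_boundaries(label, marker="|"):
    """Insert a visible marker wherever the rough script changes."""
    return marker.join(text for _, text in script_runs(label))


# "paypal" with CYRILLIC SMALL LETTER A (U+0430) in place of the first Latin "a"
suspicious = "p\u0430ypal"
print([script for script, _ in script_runs(suspicious)])
```

A real user interface would more likely use colour or underlining than an inserted character, and would need a proper script lookup plus a table of confusable characters to catch cases such as CJK Radicals.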

Examples
You can either paste Unicode text into the box above or use Unicode escapes. For example, you can enter "ä" directly, the escape "\u00E4", or the decomposed form "a\u0308". You can also copy interesting Unicode text samples from other pages.
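The three ways of entering "ä" mentioned above end up as the same label once Nameprep has normalized the input. The Python sketch below illustrates this with the built-in `idna` codec. The `expand_escapes` helper is a hypothetical stand-in for whatever escape handling the demo performs, and it only covers four-digit `\uXXXX` escapes.

```python
import re
import unicodedata


def expand_escapes(text):
    # Replace each literal backslash-u escape (four hex digits) with its code point.
    return re.sub(r"\\u([0-9A-Fa-f]{4})",
                  lambda m: chr(int(m.group(1), 16)),
                  text)


typed = "\u00e4"                          # the character pasted directly
escaped = expand_escapes(r"\u00E4")       # the escape as typed into the box
decomposed = expand_escapes(r"a\u0308")   # "a" followed by COMBINING DIAERESIS

print(typed == escaped)                   # same single code point
print(typed == decomposed)                # different code point sequences
# Nameprep applies NFKC normalization, which composes the pair into U+00E4.
print(unicodedata.normalize("NFKC", decomposed) == typed)
# Both forms therefore map to the same ASCII label.
print(typed.encode("idna") == decomposed.encode("idna"))
```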

International Components for Unicode
