HOW TO USE UNICODE NCRs IN A WEB PAGE (HTML File)

NCR stands for Numeric Character Reference. All of the characters in Unicode can be included as "text" in any HTML file which can be read by all modern web browsers including Opera, Netscape, and Internet Explorer.

Microsoft® offers many free Unicode-based fonts covering many different scripts, including Chinese.

Once you've made sure that the latest version of your favorite browser is installed on your system, you can visit: http://www.microsoft.com/windows/ie/downloads/recommended/ime/default.asp
This is Microsoft's IME page, which offers language packs including fonts and input methods for various writing systems.

There are also several fine Unicode-based commercial fonts available on the World Wide Web.

Mr. Ronald Ogawa has a very nice font currently available as beta-test freeware which includes Latin, Cyrillic, Greek and the UCAS (Unified Canadian Aboriginal Syllabics). The UCAS are used for writing languages such as Cree, Naskapi, Ojibwe, and Inuktitut. The font is called "Ballymun RO" and is available at: http://nexus.brocku.ca/rogawa/ucas

How To Add Special Characters (to HTML):



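The web browser substitutes special characters from fonts whenever it finds this sequence: the ampersand (&), the number sign (#), a decimal number, and the semicolon (;). Written out, the pattern is &#99999; where 99999 stands for the character's decimal Unicode code point, which can be any number up to 1114111.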

So, &#65; will produce the capital letter "A" because the number 65 is the decimal code point assigned in Unicode for "A".

Of course, most of us would simply type the letter "A".

If the trademark symbol "™" is needed, however, it doesn't appear on most keyboards.

&#8482; will produce the trademark symbol.
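One way to check which character a given NCR stands for is to decode it outside the browser. The following is a minimal sketch in Python, assuming the standard library's html.unescape is an acceptable stand-in for what the browser does; the sample list is only for illustration.

```python
import html

# NCRs written as plain ASCII text, exactly as they would be typed into an HTML file.
samples = ["&#65;", "&#8482;"]

# html.unescape resolves numeric character references to the characters they name.
decoded = [html.unescape(s) for s in samples]

for ncr, ch in zip(samples, decoded):
    print(f"{ncr} -> {ch} (decimal code point {ord(ch)})")
```

The number printed at the end of each line is the same number that sits between the number sign and the semicolon.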

&#1044;&#1086;&#1083;&#1080;&#1085;&#1072; &#1050;&#1091;&#1082;&#1086;&#1083; will produce "Dolina Kukol" in the Cyrillic script. This is the book title "Valley of the Dolls" in Russian: Долина Кукол.

(Many people know that Долина Кукол was written by Жаклин Сьюзан.)
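Looking up the number for every letter by hand gets tedious for a whole title. As a rough sketch, and not the only way to do it, Python can generate the decimal NCRs for any text; the helper name to_ncr is made up for this example.

```python
def to_ncr(text: str) -> str:
    """Replace every non-ASCII character with its decimal NCR (hypothetical helper)."""
    # The "xmlcharrefreplace" error handler writes &#NNNN; for anything ASCII cannot hold.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# "Dolina Kukol" in Cyrillic, spelled with escapes so this file stays plain ASCII.
title = "\u0414\u043e\u043b\u0438\u043d\u0430 \u041a\u0443\u043a\u043e\u043b"

print(to_ncr(title))
print(to_ncr("Trademark\u2122"))
```

Plain ASCII letters and spaces pass through unchanged, so the result can be pasted straight into an HTML file.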

Although Unicode is comprehensive, it isn't yet complete. The Unicode Consortium welcomes input from users of the world's various scripts.

It is possible to represent any of the letter-plus-diacritic combinations found in Vietnamese with a single Unicode NCR (at least, as far as I can tell...). But some languages have combinations which aren't included in Unicode as "precomposed" characters. One example is the Guarani language, which uses the letter g combined with the tilde. When a precomposed form isn't directly encoded in Unicode, it is necessary to use one of the characters found in the combining diacritic range. So, to get the Latin letter 'g with tilde', use the letter g followed by the NCR for the tilde as a combining diacritic (decimal 771). Thus, "g&#771;" should produce the symbol "g̃".
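The difference between a precomposed letter and a base letter plus combining diacritic can be seen with Python's unicodedata module. This is only an illustrative sketch; the variable names are my own, and the letter a with tilde is used as the contrasting case because it does have a precomposed form.

```python
import unicodedata

COMBINING_TILDE = "\u0303"  # combining tilde, decimal 771

a_tilde = "a" + COMBINING_TILDE
g_tilde = "g" + COMBINING_TILDE

# NFC normalization merges a base letter and a combining mark into one
# code point whenever Unicode encodes a precomposed form for the pair.
composed_a = unicodedata.normalize("NFC", a_tilde)
composed_g = unicodedata.normalize("NFC", g_tilde)

# For g with tilde no precomposed form exists, so the HTML needs the
# base letter followed by the NCR for the combining tilde.
g_tilde_html = "g" + f"&#{ord(COMBINING_TILDE)};"

print(len(composed_a), len(composed_g), g_tilde_html)
```

The a with tilde collapses to a single code point, while the g with tilde stays as two, which is why it needs the combining NCR.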

Many of the newer e-mail programs can be set to handle HTML, so multilingual e-mail is possible.

Several word processors allow "global search and replace", which means that the word processor can substitute the NCR macro any time it finds a certain letter or combination of letters. For example, if the HTML page needs a lot of trademark symbols, the author could type any keyboard symbol which isn't needed in the document and then "Find and Replace" every appearance of that symbol with the desired NCR macro.

(I use the ` and I could replace all the ` signs with &#8482;)
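The same placeholder trick works outside a word processor. Here is a small sketch in Python, assuming the backtick really never appears in the document for any other purpose; the function name and the sample sentence are invented for the example.

```python
PLACEHOLDER = "`"            # a keyboard symbol not otherwise needed in the document
TRADEMARK_NCR = "&#8482;"    # decimal NCR for the trademark symbol


def expand_placeholder(source: str) -> str:
    """Swap every placeholder for the trademark NCR (hypothetical helper)."""
    return source.replace(PLACEHOLDER, TRADEMARK_NCR)


draft = "<p>Widget` and Gadget` are sold separately.</p>"
print(expand_placeholder(draft))
```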

The more sophisticated word processors allow for a series of "global search and replace" operations to be "programmed".

So, someone wishing to set type in the Cherokee script would be able to type phonetically using the Latin script and then use the pre-programmed series to convert the Latin script file into Cherokee Unicode NCRs:

for example:
FIND AND REPLACE ALL te WITH &#5078;
FIND AND REPLACE ALL di WITH &#5079;
FIND AND REPLACE ALL ti WITH &#5080;
FIND AND REPLACE ALL do WITH &#5081;
FIND AND REPLACE ALL du WITH &#5082;
et cetera.
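A "programmed" series of replacements like this can also be sketched as a short Python script. The rule list below holds only the five syllables shown above, not a full Cherokee table, and the names CHEROKEE_RULES and apply_rules are made up for the illustration.

```python
import html

# Each rule is (Latin spelling to find, decimal NCR to substitute).
# Only the five syllables from the example; a real table would be much longer.
CHEROKEE_RULES = [
    ("te", "&#5078;"),
    ("di", "&#5079;"),
    ("ti", "&#5080;"),
    ("do", "&#5081;"),
    ("du", "&#5082;"),
]


def apply_rules(text: str, rules=CHEROKEE_RULES) -> str:
    """Run each find-and-replace in order, like a recorded series of operations."""
    for find, replace_with in rules:
        text = text.replace(find, replace_with)
    return text


converted = apply_rules("dite dodu")
print(converted)
# Decoding the NCRs shows the Cherokee letters a browser would display.
print(html.unescape(converted))
```

Because the rules run one after another, order can matter once the table contains spellings that overlap, so longer spellings would normally need to come before shorter ones that they contain.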

Folks used to have to make gifs or bmps (picture files) of any special symbol or script and then insert the gifs into the document. Picture files take up a lot of room and sometimes take forever to load. Using the font(s) that are already installed on your web page reader's computer saves time and storage space.

Sometimes, it is necessary to make a picture file of an unusual script. The HTML author may wish to display a specific type face of a script (as one example) because the author believes the reader's computer lacks the proper font(s).

I do this by creating my HTML document, calling it up on my web browser (off-line, because it is on my hard drive), using the "Screen Capture" feature in my registered copy of IrfanView32, saving the "capture" as a Windows BMP (bitmap) file, modifying the bitmap in Windows Paint (if any modification is necessary, like trimming off the explorer bar), opening the modified bitmap in IrfanView32, then finally saving the bitmap as a gif. (Gifs are much smaller than bmps, and thus take up less space and load much faster.)
