TechWhirl (TECHWR-L) is a resource for technical writing and technical communications professionals of all experience levels and in all industries to share their experiences and acquire information.
Subject: Re: Unicode/ISO 10646
From: Laura Lemay <lemay -at- DEATH -dot- KALEIDA -dot- COM>
Date: Fri, 25 Feb 1994 10:05:15 +0800
>I just got a flyer about a Unicode conference coming up in Tokyo. The
>flyer says, "Technical writers may also benefit," from attending. So
>far, I know that Unicode is a proposed 16-bit character encoding
>standard for information interchange. It would make communication/
>data exchange between countries, particularly the Asian and
>Pacific-Rim countries, much easier.
>Since this is the first that I have heard about it, I thought I would
>ask my fellow writer type folks, what do you know about Unicode or
I've done some minor work exploring Unicode as part of the product my
company is producing -- here's what I know:
Western computer systems use the 7-bit ASCII character set to
represent characters. ASCII defines 128 characters, and the common
8-bit extensions raise that to 256 -- fine for the English language,
since that provides slots for the entire alphabet plus a whole range
of special characters.
The problem arises when the ASCII standard is applied to other languages.
Many languages need far more than 256 characters. The solution that
seems to have evolved is to keep the 256-position character set but
remap each slot to an appropriate character in the local language.
It works, sort of. If you've ever read Asian or Nordic ASCII messages on
the net, what you see if you're using a standard Western ASCII terminal
is a lot of garbage. Remapping the ASCII character set between languages
is fine when you're communicating homogeneously in that one language, but
when a body of ASCII text in one mapping shows up in someone else's mapping,
weird characters result. All this makes digital communication between
languages very difficult.
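The remapping problem described above can be seen directly in modern Python, which still ships tables for many of these old 8-bit character sets (the specific code pages below are my own illustrative picks, not ones named in the post):

```python
# One raw byte, three different "remapped" 256-slot character sets:
# the same value 0xE4 decodes to a different character under each,
# which is exactly why text in one mapping looks like garbage in another.
raw = bytes([0xE4])

print(raw.decode("latin-1"))  # Western European mapping: 'ä'
print(raw.decode("koi8-r"))   # Russian mapping: 'Д'
print(raw.decode("cp437"))    # original IBM PC mapping: 'Σ'
```

Unless the sender and receiver agree out-of-band on which mapping is in use, the byte stream alone can't tell you which character was meant.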
Unicode was proposed to solve the problems with the limited size of the
ASCII standard. Unicode provides a 16-bit character set, which means
that instead of having only 256 slots for available characters, Unicode
provides over 65,000 slots -- enough to encode most of the characters of
many of the world's languages, as well as commonly recognized mathematical
symbols, punctuation, phonetics, and dingbats. The current Unicode
standard includes 28,000 available characters, in a book 700 pages
long (and just try finding the code sequence for a bullet in there! :)
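A sketch of what the single shared mapping buys you, using modern Python (which adopted Unicode long after this post was written): every character, whatever the language, gets one agreed-upon number, so no out-of-band code-page agreement is needed.

```python
# ord() returns a character's Unicode code point; chr() inverts it.
print(hex(ord("A")))   # 0x41 -- same slot as in ASCII
print(hex(ord("ä")))   # 0xe4 -- one unambiguous number, no code page needed
print(hex(ord("語")))  # 0x8a9e -- a CJK ideograph, same numbering scheme
print(chr(0x2022))     # the bullet the post jokes about: '•'
```

(And for the record, the bullet is code point U+2022, so the 700-page lookup is now a one-liner.)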
Unfortunately, Unicode is something that everyone seems to agree is
important, but no one is implementing. I don't know of any major operating
system manufacturer that supports Unicode (if anyone does know of one,
please set me right).
As to your comment about the conference -- although as a part-time nerd I
think Unicode is way cool, as a full-time tech writer I'm not positive that
I'd get anything out of a conference on Unicode. Unicode does not
provide any insight into translation issues, or even into typography, fonts,
or character design -- it's just a character mapping standard.