Good Old ASCII?

der.hans plug-discuss@lists.plug.phoenix.az.us
Fri, 6 Jun 2003 01:59:43 -0700 (MST)


On 01 Jun 2003, Thanasis Kinias chattered:

> > UTF8 isn't the fix, but it's a great step in the right direction.
>
> I'm curious why you say that.  Other than the fact that many things
> (like Gnumeric, for example) barf on UTF8 text still, what would `the
> fix' do that UTF8 doesn't?

I can't find the main article I read. It was pretty interesting if you like
the history of Asian character sets :).

The problem, according to the article, is that the 65k characters of UNICODE
( got bumped to 128k at some point ) aren't enough to house all the necessary
Asian characters. UNICODE needs to work for current, future and *past*
character sets. It also needs to work for non-real char sets such as
Tolkien's languages, Klingon, etc.

According to the article, westerners often lump together characters that
look the same to us but really are different.

Something I can use as a comparison is the German S-tzet ( ß, or &szlig; in
HTML ) and the Greek Beta.

Both 'look' the same, but they are very much different characters. I'm not
certain of the origins of the Beta, but S-tzet is an "s-z" ( ess-zee ).
Imagine a tall letter 's' ( like what was often used in American script at the
time of the Revolutionary War ) with a letter 'z' ( similar to American cursive
lower case z ) glued on the back.

The S-tzet ends up looking very much like a Beta and in its modern form is
actually drawn very similarly to the Beta. They are still different
characters, though, so it's like substituting 1 for l or 0 for O. Maybe
it's the i18nal form of 133t speak, but for the rest of us it just won't do.
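
To make the "different characters" point concrete, here is a minimal Python
sketch that looks up the two look-alikes. The helper name describe() is my
own invention for illustration, not something from the article; it only
relies on the standard unicodedata module.

```python
import unicodedata

# Illustrative only: the two look-alike characters discussed above.
ESZETT = "\u00df"  # German sharp s
BETA = "\u03b2"    # Greek small beta

def describe(ch):
    """Return (code point, Unicode name, UTF-8 bytes) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), ch.encode("utf-8"))

for ch in (ESZETT, BETA):
    print(describe(ch))
```

The two come back with different code points, different names and
different UTF-8 byte sequences, so swapping one for the other really is a
substitution, however alike the glyphs look in a given font.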

The short of it is that UNICODE isn't big enough to handle Asian char sets,
much less all the char sets we have. It's progress, but not a destination.

ciao,

der.hans
-- 
#  https://www.LuftHans.com/    http://www.TOLISGroup.com/
#  When you are tired of choosing the lesser of two evils,
#  Vote Cthulhu for President!