On 01 Jun 2003, Thanasis Kinias wrote:
> > UTF8 isn't the fix, but it's a great step in the right direction.
>
> I'm curious why you say that. Other than the fact that many things
> (like Gnumeric, for example) barf on UTF8 text still, what would `the
> fix' do that UTF8 doesn't?
I can't find the main article I read. It was pretty interesting if you like
the history of Asian character sets :).
The problem, according to the article, is that the 65k characters of UNICODE
( got bumped to 128k at some point ) aren't enough to house all the necessary
Asian characters. UNICODE needs to work for current, future and *past*
character sets. It also needs to work for non-real char sets such as
Tolkien's languages, Klingon, etc.
According to the article, Westerners often lump together characters that
look the same to us, but really are different.
Something I can use as a comparison is the German Eszett ( ß, written
&szlig; in HTML ) and the Greek beta.
Both 'look' the same, but they are very much different characters. I'm not
certain of the origins of the beta, but the Eszett is an "s-z" ( ess-zett ).
Imagine a tall letter 's' ( the long s, like what was often used in American
script at the time of the Revolutionary War ) with a letter 'z' ( similar to
American cursive lower-case z ) glued on the back.
The Eszett ends up looking very much like a beta and in its modern form is
actually drawn very similarly to the beta. They are still different
characters, though, so it's like substituting 1 for l or 0 for O. Maybe
it's the i18n'l form of 133t speak, but for the rest of us it just won't do.
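A quick way to see that lookalike glyphs are still distinct characters is to
compare them at the code-point level. This little sketch ( Python, not from
the original post ) shows that ß and β have different code points and
different UTF-8 byte sequences, so no amount of visual similarity makes them
interchangeable:

```python
import unicodedata

eszett = "\u00df"  # LATIN SMALL LETTER SHARP S (German Eszett)
beta = "\u03b2"    # GREEK SMALL LETTER BETA

# Different code points entirely.
print(ord(eszett), ord(beta))        # 223 946

# Different UTF-8 encodings on the wire.
print(eszett.encode("utf-8"))        # b'\xc3\x9f'
print(beta.encode("utf-8"))          # b'\xce\xb2'

# And the standard names them as separate characters.
print(unicodedata.name(eszett))      # LATIN SMALL LETTER SHARP S
print(unicodedata.name(beta))        # GREEK SMALL LETTER BETA

print(eszett == beta)                # False
```

Swapping one for the other is exactly the "1 for l" substitution described
above: it may fool the eye, but never a string comparison.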
The short of it is that UNICODE isn't big enough to handle Asian char sets,
much less all the char sets we have. It's progress, but not a destination.
ciao,
der.hans
--
# https://www.LuftHans.com/ http://www.TOLISGroup.com/
# When you are tired of choosing the lesser of two evils,
# Vote Cthulhu for President!