On Mon, 30 Jun 2003, Matt Alexander wrote:
> Matt Alexander said:
> > Does anyone know how I would take an HTML unicode character and convert it
> > to the actual unicode character in a text file using Perl? For example,
> > let's say I have L&#243;pez. I'd like the &#243; to be converted to the
> > character with the o and the accent over it and saved to a plain text
> > file.
#!/usr/bin/perl
# Replace each decimal numeric character reference (e.g. &#243;)
# with the character it names.
while ($line = <>) {
    $line =~ s/(&#)([0-9]+)(;)/ chr($2) /eg;
    print $line;
}
(I use the reverse of this to encode text for web pages.)
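
For the reverse direction mentioned above, here is a minimal sketch. It is not necessarily the script I use; encode_refs is a hypothetical name, and it assumes the input is already a string of characters rather than raw UTF-8 bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every character above 127 with a
# decimal numeric character reference such as &#243;.
sub encode_refs {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;
    return $text;
}

print encode_refs("L\x{F3}pez"), "\n";   # prints L&#243;pez
```

Plain ASCII text passes through unchanged, so it should be safe to run over a whole page of text.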
> I figured it out. Browsers use decimal for unicode. So I would convert
> 243 in decimal to F3 in hex and then I can print the character:
>
> perl -e 'print "\x{F3}\n"'
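
The manual conversion to hex is not strictly needed, since chr() takes the decimal value directly. A small sketch of the equivalence, with the variable names chosen only for illustration:

```perl
use strict;
use warnings;

my $decimal = 243;                      # the number from &#243;
my $hex     = sprintf('%X', $decimal);  # "F3"

# chr() accepts the decimal code point, so chr(243) and "\x{F3}"
# name the same character.
print "decimal $decimal is hex $hex\n";
print "same character\n" if chr($decimal) eq "\x{F3}";
```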
$ echo 'L&#243;pez' | ~/scripts/iso-to-ascii
López
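
The script above only matches decimal references and prints values below 256 as single bytes. If you also meet hexadecimal references like &#xF3;, or want UTF-8 output, something along these lines might do. It is a sketch only: decode_refs is a hypothetical name, and UTF-8 output is my assumption about what the text file should contain.

```perl
use strict;
use warnings;

# Hypothetical helper: decode decimal (&#243;) and hexadecimal (&#xF3;)
# numeric character references into characters.
sub decode_refs {
    my ($text) = @_;
    $text =~ s/&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));/chr(defined $1 ? hex($1) : $2)/ge;
    return $text;
}

binmode(STDOUT, ':encoding(UTF-8)');   # assumption: UTF-8 output is wanted
print decode_refs('L&#243;pez and L&#xF3;pez'), "\n";
```

Named entities such as &oacute; are not handled here; those need a lookup table or a module.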
Jeremy C. Reed
http://bsd.reedmedia.net/