Summary:
UTF-8 and Unicode "knowledge" has created a wealth of DoS and, ahem, other insecurity "tricks".

UTF8 Cert Security

Overview:
Here's a paper all about what and why: 

Utf8-survival


1) Mounts

For Samba to retain the encoding, verify that you have mounted the share with a UTF-8 character set option: nls=utf8 for NTFS partitions, or iocharset=utf8 for smbfs/cifs shares.

http://wiki.vidalinux.com/index.php/...ows_Partitions


2) Linux Conversion Packages

Gnome

cxplorer
http://cxplorer.sourceforge.jp/

pcmanfm
http://pcmanfm.sourceforge.net/

KDE user solutions:

changeFilenameCode
http://kubuntu.free.fr/servicemenu/

ToUTF-8
http://www.kde-apps.org/index.php?xcontentmode=287

Big list of KDE service menus:
http://www.kde-apps.org/index.php?xcontentmode=287

Use the convmv command to convert file names that are in the wrong encoding:

Code:
sudo apt-get install convmv
cd /<path to>
# "." with -r covers the whole tree; "*.*" would skip names that contain no dot
convmv -f utf-8 -t windows-1252 -r --notest .

Switches:
-f utf-8 Initial encoding
-t windows-1252 Final encoding
-r Convert subfolders
--notest Commit the conversion, this is not a test!
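
To make the -f and -t switches concrete, here is a minimal Python sketch of what the conversion amounts to for a single file name: decode the raw bytes with the initial encoding, then encode them with the final one. It is an illustration only, not convmv's actual code, and the helper name reencode_name is made up for this example.

```python
def reencode_name(raw: bytes, src: str, dst: str) -> bytes:
    """Re-encode one raw file name from the src encoding to the dst encoding.

    Raises UnicodeDecodeError if raw is not valid in src, and
    UnicodeEncodeError if a character has no representation in dst.
    """
    return raw.decode(src).encode(dst)


if __name__ == "__main__":
    # Hypothetical sample name, stored as UTF-8 bytes.
    utf8_name = "café.txt".encode("utf-8")
    print(utf8_name)                                    # b'caf\xc3\xa9.txt'
    print(reencode_name(utf8_name, "utf-8", "cp1252"))  # b'caf\xe9.txt'
```

A name containing characters outside Windows-1252 (Japanese text, for example) raises UnicodeEncodeError here, a reminder that the utf-8 to windows-1252 direction cannot represent every name.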

See more about Unicode text conversion from the command line:
http://www.linux.com/howtos/Unicode-HOWTO-3.shtml

3) Windows Unicode formats and convmv

Re: Special Characters / Shares / Ubuntu 5.10 Breezy / Gnome2 / UTF-8

Two conversions are necessary after transferring files from a Windows share to a UTF-8 encoded ext3 file system. You need to convert your file names from CP850 to UTF-8, and then convert the names of any files extracted from ISO-8859-1 encoded archives to UTF-8 as required. These are two different operations.

I should also note that Windows-1252 (also called CP1252), CP850, and ISO-8859-1 are different encodings. According to general documentation, Windows machines have Windows-1252 (CP1252) encoded file names, but they might actually be CP850 encoded, which is an MS-DOS encoding.

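Note that Windows-1252 (i.e. CP1252) is essentially the same as ISO-8859-1 for the purpose of converting file names, although the two differ in the 0x80-0x9F range, where Windows-1252 places printable characters such as the euro sign and curly quotes and ISO-8859-1 places control codes.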

http://j3e.de/linux/convmv/man/
http://ppewww.ph.gla.ac.uk/~flavell/iso8859/iso8859-pointers.html
http://en.wikipedia.org/wiki/Windows-1252
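
The difference between these code pages is easiest to see at the byte level. The following Python sketch uses only the standard codecs; the helper names encode_all and misread are invented for this illustration. It shows that accented letters sit at different byte values in CP850 than in Windows-1252 or ISO-8859-1, and what a CP850 name looks like when it is read as Windows-1252.

```python
CODECS = ("cp850", "cp1252", "iso-8859-1")


def encode_all(ch: str) -> dict:
    """Map each codec name to the hex bytes it uses for the character ch."""
    return {codec: ch.encode(codec).hex() for codec in CODECS}


def misread(text: str, written_as: str, read_as: str) -> str:
    """Show what text looks like when written in one codec and read in another."""
    return text.encode(written_as).decode(read_as)


if __name__ == "__main__":
    print(encode_all("é"))  # {'cp850': '82', 'cp1252': 'e9', 'iso-8859-1': 'e9'}
    print(encode_all("ñ"))  # {'cp850': 'a4', 'cp1252': 'f1', 'iso-8859-1': 'f1'}
    # A CP850 name read as Windows-1252 comes out garbled:
    # byte 0xA4 becomes U+00A4, the currency sign.
    print(ascii(misread("niño", "cp850", "cp1252")))  # 'ni\xa4o'
    # Byte 0x80 is the euro sign in Windows-1252 but a control code in ISO-8859-1.
    print(ascii(b"\x80".decode("cp1252")), ascii(b"\x80".decode("iso-8859-1")))
```

This is why picking the wrong -f value for convmv yields names that are valid UTF-8 but still show the wrong characters.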

The other way around: extracting from an archive versus Windows, as an example:

* There is a third case, where you have to convert the file name of an archive, and also convert the file names of the archived files.

If you are converting files that have been transferred from a Windows share to a UTF-8 ext3 file system, then

Code:
cd /<path to>
# "." rather than "*.*", which misses names without a dot
convmv -f cp850 -t utf-8 -r --notest .

If you are converting files that have been extracted from an archive, then

Code:
cd /<path to>
# "." rather than "*.*", which misses names without a dot
convmv -f iso-8859-1 -t utf-8 -r --notest .
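
Before committing either conversion, it can help to list which names would change. Below is a rough Python sketch of such a dry run. It is not how convmv is implemented: the helpers convert_name and plan_renames are hypothetical, and leaving alone any name that already decodes as UTF-8 is a design choice of this sketch. It only builds a list of old and new paths and never renames anything.

```python
import os
from typing import List, Optional, Tuple


def convert_name(raw: bytes, src_encoding: str) -> Optional[bytes]:
    """Return the UTF-8 form of a raw name, or None if it needs no change."""
    try:
        raw.decode("utf-8")
        return None  # already valid UTF-8 (this includes plain ASCII)
    except UnicodeDecodeError:
        pass
    return raw.decode(src_encoding).encode("utf-8")


def plan_renames(root: str, src_encoding: str) -> List[Tuple[bytes, bytes]]:
    """List (old_path, new_path) pairs without touching the file system."""
    plan = []
    # A bytes path makes os.walk return raw, undecoded names.
    # topdown=False lists children before their parent directory.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root), topdown=False):
        for name in dirnames + filenames:
            new_name = convert_name(name, src_encoding)
            if new_name is not None:
                plan.append((os.path.join(dirpath, name),
                             os.path.join(dirpath, new_name)))
    return plan


if __name__ == "__main__":
    # "cp850" matches the Windows share case; use "iso-8859-1" for archives.
    for old, new in plan_renames(".", "cp850"):
        print(old, "->", new)
```

Running convmv without --notest gives a comparable preview, since it only reports what it would do until that switch is added.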

4) Known Issues

There is a known issue and patch with utf8/unicode under Rational CQ 7.0: http://www-01.ibm.com/support/docview.wss?rs=0&uid=swg24017915


--- excerpt ---
K47070 (RATLC01023607) - Users were unable to open some exported RequisiteWeb views as CSV files in Microsoft Excel, depending on the default character encoding for the RequisiteWeb server locale. The following encoding is used by default for the indicated locales. All are supported by Microsoft Excel except UTF-8:

Windows-1252 - English, French, Italian, German, Swedish, Dutch
Shift_JIS - Japanese
GB2312-80 - Simplified Chinese
Windows-1255 - Hebrew (see the known problem for this APAR below)
EUC-KR - Korean
UTF-8 - all others

If the RequisiteWeb server locale defaults to UTF-8, RequisiteWeb administrators must specify the character encoding for the exported CSV files so that users can view them in Excel. Set the following parameter in the config.txt file on the RequisiteWeb server:

CsvExportCharacterEncoding=

The default location of the configuration file is: C:\Program Files\Rational\Common\rwp\EmbeddedExpress\profiles\profile2\installedApps\DefaultNode\ReqWeb.ear\ReqWeb.war\WEB-INF\classes\config.txt

When specifying this value, use one of the supported character codes that are shown above with their various language locales; for example: CsvExportCharacterEncoding=Windows-1252
--- end excerpt ---
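
Where a CSV file itself has to be delivered in Windows-1252 rather than UTF-8, the file contents (as opposed to the file name) can be transcoded directly. This is a minimal Python sketch of the same conversion that iconv -f UTF-8 -t WINDOWS-1252 performs. The function name transcode_file and the file names in the usage comment are placeholders. It works on bytes so line endings are left untouched, and it encodes strictly, so a character that Windows-1252 cannot represent raises an error instead of being silently dropped.

```python
from pathlib import Path


def transcode_file(src_path, dst_path,
                   src_encoding="utf-8", dst_encoding="cp1252"):
    """Rewrite src_path into dst_path in another encoding; return bytes written."""
    raw = Path(src_path).read_bytes()
    text = raw.decode(src_encoding)
    # Encode before opening the destination, so a failure leaves no partial file.
    data = text.encode(dst_encoding)  # raises UnicodeEncodeError if unrepresentable
    Path(dst_path).write_bytes(data)
    return len(data)


# Example call (hypothetical file names):
# transcode_file("export_utf8.csv", "export_cp1252.csv")
```

If the result still looks wrong on the Windows side, it is worth checking which encoding the viewing program assumes, since correctly converted bytes can still be displayed with the wrong code page.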

There is a known issue with Perl to contend with, although you don't seem to be affected by it.



Date: Fri, 27 Feb 2009 05:10:36 -0700
Subject: Re: language differences between linux and windows
From: jmcphe@gmail.com
To: plug-discuss@lists.plug.phoenix.az.us

:)  Yeah, you could convert it to ascii and then ftp it to the windows box in ascii mode

On Fri, Feb 27, 2009 at 12:07 AM, Joshua Zeidner <jjzeidner@gmail.com> wrote:
 also, if you're using FTP the ascii/binary setting may make a
difference, so I would try that route if all else fails.

 -jmz

On Thu, Feb 26, 2009 at 10:07 PM, Jerry Davis <jdawgaz@cox.net> wrote:
> I am using Rational ClearQuest, and am using the CQ Import process to import
> from a csv file which I produced on a linux box.
>
> The linux backend database is Oracle, and is a utf8 database.
> The windows box I am using to do the import (there is no import binary on
> linux), thinks it is WINDOWS-1252
>
> My csv file has some spanish characters. I can see them in the csv file i
> produce. However, when I copy from linux (any means like scp, or a samba mount
> point), the file characters actually change.
>
> I have tried installing iconv (using cygwin), and tried:
> $ iconv -s -c -f UTF-8 -t WINDOWS-1252 file > newfile
>
> It does not look the same as when it was on linux.
>
> Does anyone know what else I could do, to copy a utf8 file to a windows-1252
> file?
>
>
>
> --
> Happy Trails!
> Jerry (K7AZJ)
> ---------------------------------------------------
> PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
> To subscribe, unsubscribe, or to change your mail settings:
> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
>



--
James McPhee
jmcphe@gmail.com

