language differences between linux and windows

Lisa Kachold lisakachold at obnosis.com
Fri Feb 27 08:42:01 MST 2009


Summary:
UTF-8 and Unicode "knowledge" has created a wealth of DoS and, ahem, other insecurity "tricks".

UTF8 Cert Security

Overview:
Here's a paper all about what and why:  

Utf8-survival


1) Mounts

For file names to keep their encoding, verify that the Windows partition or share is mounted with a UTF-8 charset option: nls=utf8 for an NTFS partition, iocharset=utf8 for a Samba (smbfs/cifs) share.

http://wiki.vidalinux.com/index.php/...ows_Partitions


2) Linux Conversion Packages

Gnome

cxplorer
http://cxplorer.sourceforge.jp/

pcmanfm
http://pcmanfm.sourceforge.net/

KDE user solutions:

changeFilenameCode
http://kubuntu.free.fr/servicemenu/

ToUTF-8
http://www.kde-apps.org/index.php?xcontentmode=287

Big list of KDE service menus:
http://www.kde-apps.org/index.php?xcontentmode=287

Use the convmv command to convert file names that are in the wrong encoding:

Code:
    sudo apt-get install convmv
    cd /<path to>
    convmv -f utf-8 -t windows-1252 -r --notest *

Switches:
    -f utf-8           Initial encoding
    -t windows-1252    Final encoding
    -r                 Convert subfolders
    --notest           Commit the conversion, this is not a test! Run once without it to preview the renames.

See more about Unicode text conversion from the command line:
http://www.linux.com/howtos/Unicode-HOWTO-3.shtml

3) Windows Unicode formats and convmv

Re: Special Characters / Shares / Ubuntu 5.10 Breezy / Gnome2 / UTF-8

There are two conversions necessary after transferring files from a Windows share to a UTF-8 encoded ext3 file system. First, convert the file names that came across the share from CP850 to UTF-8. Second, convert the names of any files extracted from archives from ISO-8859-1 to UTF-8 as required. These are two different operations.

I should also note there is a difference between Windows-1252 (also called CP1252), CP850, and ISO-8859-1. According to general documentation, Windows machines have Windows-1252 encoded file names, but they might actually be CP850 encoded, which is an MS-DOS encoding.

Note that Windows-1252 is essentially the same as ISO-8859-1 for the purpose of converting file names. They differ only in the byte range 0x80-0x9F, where Windows-1252 puts printable characters such as the euro sign and curly quotes and ISO-8859-1 has control codes.

http://j3e.de/linux/convmv/man/
http://ppewww.ph.gla.ac.uk/~flavell/iso8859/iso8859-pointers.html
http://en.wikipedia.org/wiki/Windows-1252
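
To make the difference between these code pages concrete, here is a small Python sketch (not from the thread; the helper name decode_all is made up for illustration). It decodes the same raw byte under each encoding, using only the codecs that ship with Python:

```python
# Decode one raw byte under each of the three encodings discussed above.
# decode_all is a hypothetical helper name, used only for this illustration.
CODECS = ("cp1252", "latin-1", "cp850")  # latin-1 is Python's name for ISO-8859-1


def decode_all(raw: bytes) -> dict:
    """Return {codec: decoded text} for the same bytes."""
    # errors="replace" avoids an exception for the few bytes cp1252 leaves undefined
    return {codec: raw.decode(codec, errors="replace") for codec in CODECS}


if __name__ == "__main__":
    for raw in (b"\xe9", b"\x82", b"\x80"):
        print(raw, {codec: repr(text) for codec, text in decode_all(raw).items()})
```

Byte 0xE9 is "é" in both Windows-1252 and ISO-8859-1, but "é" is byte 0x82 in CP850. Byte 0x80 is the euro sign in Windows-1252 and a control code in ISO-8859-1. That is why guessing the wrong source encoding gives you file names that look almost right.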

The other way around: extracting from an archive versus copying from a Windows share, as an example:

* There is a third case, where you have to convert the file name of an archive, and also convert the file names of the archived files.

If you are converting files that have been transferred from a Windows share to a UTF-8 ext3 file system, then

Code:
    cd /<path to>
    convmv -f cp850 -t utf-8 -r --notest *

If you are converting files that have been extracted from an archive, then

Code:
    cd /<path to>
    convmv -f iso-8859-1 -t utf-8 -r --notest *

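
If convmv is not available, the same idea can be sketched in Python. This is only an illustration under my own assumptions, not how convmv is implemented: the function names recode_name and plan_renames are made up, and names that already decode as UTF-8 are left alone. It only builds a list of renames, so nothing on disk changes until you act on that list yourself.

```python
import os


def recode_name(raw: bytes, src: str = "cp850") -> bytes:
    """Return the UTF-8 form of a file name whose bytes are encoded in `src`."""
    try:
        raw.decode("utf-8")
        return raw  # plain ASCII or already valid UTF-8: leave untouched
    except UnicodeDecodeError:
        return raw.decode(src).encode("utf-8")


def plan_renames(directory: str, src: str = "cp850") -> list:
    """Collect (old_path, new_path) byte pairs without touching the disk."""
    plan = []
    root = os.fsencode(directory)  # walk with bytes so undecodable names survive
    # Bottom-up, so entries inside a directory are listed before the directory itself
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            new = recode_name(name, src)
            if new != name:
                plan.append((os.path.join(dirpath, name), os.path.join(dirpath, new)))
    return plan


if __name__ == "__main__":
    for old, new in plan_renames("."):
        print(old, "->", new)  # preview only; call os.rename(old, new) to commit
```

Pass src="iso-8859-1" for the archive case. Printing the plan first plays the same role as running convmv without --notest.
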
4) Known Issues

There is a known issue and patch with utf8/unicode under Rational CQ 7.0: http://www-01.ibm.com/support/docview.wss?rs=0&uid=swg24017915


--- excerpt ---
K47070 (RATLC01023607) - Users were unable to open some exported RequisiteWeb views as CSV files in Microsoft Excel, depending on the default character encoding for the RequisiteWeb server locale. The following encoding is used by default for the indicated locales. All are supported by Microsoft Excel except UTF-8:

   Windows-1252    - English, French, Italian, German, Swedish, Dutch
   Shift_JIS       - Japanese
   GB2312-80       - Simplified Chinese
   Windows-1255    - Hebrew (see the known problem for this APAR below)
   EUC-KR          - Korean
   UTF-8           - all others

If the RequisiteWeb server locale defaults to UTF-8, RequisiteWeb administrators must specify the character encoding for the exported CSV files so that users can view them in Excel. Set the following parameter in the config.txt file on the RequisiteWeb server:

   CsvExportCharacterEncoding=

The default location of the configuration file is: C:\Program Files\Rational\Common\rwp\EmbeddedExpress\profiles\profile2\installedApps\DefaultNode\ReqWeb.ear\ReqWeb.war\WEB-INF\classes\config.txt

When specifying this value, use one of the supported character codes that are shown above with their various language locales; for example: CsvExportCharacterEncoding=Windows-1252
---end excerpt ---
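
As for the original question (getting a UTF-8 CSV onto a box that expects Windows-1252), one thing to watch is that iconv with -c silently drops any character it cannot convert. Here is a minimal Python sketch that does the same conversion but reports what would be lost. It is not from the thread, and the function name utf8_to_cp1252 is made up for illustration:

```python
def utf8_to_cp1252(data: bytes):
    """Convert UTF-8 bytes to Windows-1252 and report characters that do not fit.

    Returns (converted_bytes, unmappable_characters). Unmappable characters
    are written as "?" rather than being dropped silently.
    """
    text = data.decode("utf-8")
    unmappable = set()
    for ch in text:
        try:
            ch.encode("cp1252")
        except UnicodeEncodeError:
            unmappable.add(ch)
    return text.encode("cp1252", errors="replace"), sorted(unmappable)


if __name__ == "__main__":
    converted, lost = utf8_to_cp1252("a\u00f1o,ni\u00f1o\n".encode("utf-8"))
    print(converted, lost)
```

Spanish characters such as ñ and the accented vowels all exist in Windows-1252, so the second return value should come back empty for that kind of data. If it does not, the file probably contains something other than what you think it does, or it is not UTF-8 to begin with.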

There is also a known issue with Perl to contend with, although you don't seem to be affected by it.

obnosis.com | wiki.obnosis.com| (503)754-4452
PLUG HACKFESTS 2nd Saturday Each Month at Noon - 3PM

Date: Fri, 27 Feb 2009 05:10:36 -0700
Subject: Re: language differences between linux and windows
From: jmcphe at gmail.com
To: plug-discuss at lists.plug.phoenix.az.us

:)  Yeah, you could convert it to ascii and then ftp it to the windows box in ascii mode

On Fri, Feb 27, 2009 at 12:07 AM, Joshua Zeidner <jjzeidner at gmail.com> wrote:
 also, if you're using FTP the ascii/binary setting may make a
difference, so I would try that route if all else fails.

  -jmz

On Thu, Feb 26, 2009 at 10:07 PM, Jerry Davis <jdawgaz at cox.net> wrote:
> I am using Rational ClearQuest, and am using the CQ Import process to import
> from a csv file which I produced on a linux box.
>
> The linux backend database is Oracle, and is a utf8 database.
> The windows box I am using to do the import (there is no import binary on
> linux), thinks it is WINDOWS-1252
>
> My csv file has some spanish characters. I can see them in the csv file i
> produce. However, when I copy from linux (any means like scp, or a samba mount
> point), the file characters actually change.
>
> I have tried installing iconv (using cygwin), and tried:
> $ iconv -s -c -f UTF-8 -t WINDOWS-1252 file > newfile
>
> It does not look the same as when it was on linux.
>
> Does anyone know what else I could do, to copy a utf8 file to a windows-1252
> file?
>
>
>
> --
> Happy Trails!
> Jerry (K7AZJ)
> Hobbit Name: Pimpernel Loamsdown
> Registered Linux User: 275424
> This email's random fortune:
> Small things make base men proud.
>                -- William Shakespeare, "Henry VI"
> ---------------------------------------------------
> PLUG-discuss mailing list - PLUG-discuss at lists.plug.phoenix.az.us
> To subscribe, unsubscribe, or to change your mail settings:
> http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
>


-- 
James McPhee
jmcphe at gmail.com
