Filesystem Optimization
Michael Butash
michael at butash.net
Tue Feb 11 10:36:06 MST 2020
I've been doing something similar for years, using unison instead of
rsync to sync changes between a laptop, a desktop, and two filers (which
sync between themselves), and I've never seen huge performance issues like
that. Mine is some 100 GB of files, from 30 MB visio/ppt files to tens or
hundreds of thousands of .txt files of network configs from myself and
customers I work for, and I've never had any issues on a local LAN with this.
I use all SSDs on my workstation/laptops, and my filer is ext4 on a Synology
with 4x HGST 6 TB spinners. A rescan of the local and remote share with
unison tends to take around 10 min on my laptop via cifs, and is much faster,
maybe 3-4 min, on my desktop via nfs. I did this not long ago; the timing is
not precise, just what I remember of doing it. This is also with ext4, lvm,
and luks encryption, my laptop with an m.2 toshiba disk and my desktop with a
pair of 960pro samsung m.2's in mdraid 1. I've not really run into slow-down
or fragmentation issues of any sort in linux, and both my laptop and
desktop are some 3+ years old now.
I know a full resync between my two synologies, some 15 TB of data, takes
roughly a week, and both have 4x 1GbE links in port channels worth of
bandwidth through my arista switches...
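
As a rough sanity check on those resync numbers, here is a minimal Python sketch of the arithmetic. It assumes decimal units (1 TB = 10**12 bytes) and a full seven days of continuous transfer, neither of which was measured; the function name is made up for illustration.

```python
# Back-of-the-envelope throughput for a 15 TB resync that takes about a week.
# Assumptions (not measured): decimal units and exactly 7 days of transfer.

def average_throughput_mbit(total_bytes: float, seconds: float) -> float:
    """Return the average transfer rate in megabits per second."""
    return total_bytes * 8 / seconds / 1e6

TOTAL_BYTES = 15 * 10**12          # "some 15 TB"
WEEK_SECONDS = 7 * 24 * 60 * 60    # "roughly a week"
LINK_MBIT = 1000                   # one 1GbE link
LINKS = 4                          # links per port channel

rate = average_throughput_mbit(TOTAL_BYTES, WEEK_SECONDS)
print(f"average rate: {rate:.0f} Mbit/s")
print(f"share of one link: {rate / LINK_MBIT:.0%}")
print(f"share of the {LINKS}-link aggregate: {rate / (LINK_MBIT * LINKS):.0%}")
```

Under those assumptions the average works out to roughly 200 Mbit/s, well under even a single 1GbE link, so the network is probably not what limits a resync like this. Note that a port channel usually keeps a single flow on one member link, so the one-link figure is likely the fairer comparison.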
Furthermore, because of encryption I don't run trim or any real ssd wear
leveling features; rather, I've found Samsung's built-in wear leveling to be
the best, with no need to interact with it. My toshiba m.2 in my xps15 has
been solid so far too - normally I kill ssd's in 3-12mo without fail. I
expected to see performance issues with them over the years, but never have
since going ssd; they either work, or they die quickly.
Have you run iotop or anything else to look at disk usage during the
transfer, to see if it's really an i/o issue? You should see an abundance of
waits on the disk if so, as well as what is accessing your disks at that
moment. Maybe try Unison instead of rsync as a test over nfs or cifs; I'm not
sure if something weird is going on with rsync too. Worst case, buy a cheap
couple-TB usb spinner drive to offload to, zero the drive, reformat (cool
kids seem to like xfs or zfs these days), and try again on clean disks.
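
If iotop isn't handy, the same "is it waiting on disk" question can be approximated from /proc/stat. The following is a minimal Python sketch, not a replacement for iotop: it only reports the system-wide iowait share between two samples, not which process is responsible. The function names and the 1-second interval are arbitrary choices for illustration, and it is Linux-only.

```python
import os
import time
from typing import List

def parse_cpu_line(line: str) -> List[int]:
    """Parse the aggregate 'cpu' line of /proc/stat into integer tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(v) for v in fields[1:]]

def iowait_percent(before: List[int], after: List[int]) -> float:
    """Share of CPU time spent in iowait between two samples, in percent."""
    # Fields: user nice system idle iowait irq softirq steal [guest guest_nice].
    # Guest time is already counted in user, so only the first eight are summed.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total else 0.0

def read_cpu_sample(path: str = "/proc/stat") -> List[int]:
    """Read the current aggregate CPU counters (Linux only)."""
    with open(path) as f:
        return parse_cpu_line(f.readline())

if __name__ == "__main__" and os.path.exists("/proc/stat"):
    first = read_cpu_sample()
    time.sleep(1)  # sampling interval is arbitrary; lengthen it during a real transfer
    second = read_cpu_sample()
    print(f"iowait over the interval: {iowait_percent(first, second):.1f}%")
```

Run it while the rsync is going; a consistently high percentage would point at the disks rather than the network.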
-mb
On Thu, Feb 6, 2020 at 7:43 PM Nathan (PLUGAZ) <plugaz at codezilla.xyz> wrote:
>
> When I ran the initial test that I stopped after 75 minutes, I should have
> noted, I pushed the files to a new directory, so there was no comparison
> performed. Nevertheless, the comparison part is rather quick. The files are
> transferring at kilobits per second for some reason. I watched my netdata
> output while running the transfer, along with other data transfers, and it
> shows an iowait of well over 25% min during any network transfer.
>
> I backed everything up and wiped the drives. I created a software raid 1
> and then luksEncrypted the raid device, formatted and mounted. I skipped
> the LVM this time around. I want to test with a little less overhead and
> see how it goes. If this performs better I might leave it. Otherwise, if it
> still gets slow, I'm going to replace this beloved old A6 with a shiny new
> Ryzen 3.
>
> On 2020-02-06 19:52, Bob Elzer wrote:
>
> Well if you create a new filesystem and do an rsync then there is nothing
> to compare so the copy should go fast.
>
> If you have 46k files and they need to be compared before overwriting then
> that may take a little longer.
>
> Try copying a 2gb file across your nfs and see how long that takes. I once
> had a config error that caused my network copies to run slower than they
> should.
>
> Also run your rsync a second time to a full tmpfs and check the timing; I
> suspect it will take longer. Not sure how many of your files change, but
> you might have to let some change to get a better reading.
>
> On Thu, Feb 6, 2020, 12:35 PM Nathan (PLUGAZ) <plugaz at codezilla.xyz>
> wrote:
>
> I realize ext4 does not easily fragment, but when you have a large
> volume with lots of files of differing size, how can you optimize it?
>
> I have a 2TB mirrored array that has hundreds of thousands of files smaller
> than 12KB, hundreds of files larger than 1MB, and of course
> lots of movies and such, which can be 1 to 4GB. Over the years it has
> gotten really slow.
>
> I have a shell script that basically runs rsync against my home
> directory and pushes it to a specific folder on my file server (part of
> this 2TB array).
>
> Typically the script runs in the wee hours when I'm asleep. But the
> other day I decided to run it just to watch it and see what happens. It
> was horrendously slow!
> I tried timing it. I ran time { rsync -av /home/myuser/.cache/
> remote:/backup/dir/.cache/; } and after 75 minutes I cancelled it. There
> are 46k files in that folder and it is roughly 2GB... after 75 minutes it
> still wasn't finished. Now this is running over an NFS link, just FYI.
>
> So I created a 4GB tmpfs and mounted it where I needed and ran my timed
> backup again and it took 2 minutes and 6 seconds. Obviously my network
> is not the issue.
>
> So today I'm trying to find places to store 2TB of data so I can
> rearrange things, but I'm wondering...
>
> Is there a program that watches and optimizes placement of files on a
> hard drive? I know these exist for windows, but linux?
> ---------------------------------------------------
> PLUG-discuss mailing list - PLUG-discuss at lists.phxlinux.org
> To subscribe, unsubscribe, or to change your mail settings:
> https://lists.phxlinux.org/mailman/listinfo/plug-discuss