Incremental sync speed for large files on an ARM NAS

Hm, OK. What about more user options… perhaps too many… just trying to brainstorm. Option 1 - fearless, i.e. "an I/O error doesn't concern me / is rare / …". If it happens, discard the file and do a full sync. That would mean a lot of GBs over the network.

Option 2 - make a local copy of the file and perform in-place updates on the temp file. In case of an I/O error only the temp file is corrupted. I think copying a file is fairly quick, 20 GB in ~10 min @ 50 MB/s. Probably still a lot faster than the current state, right? (At least for me a 6 GB file takes ~2 hrs.) How CPU-hungry might the journal file process be? Faster/less demanding than the current process?

This is exactly what we do today. Are you saying Syncthing’s file copy is much slower than if you just run cp directly on the box (followed by openssl dgst -sha on the file, as we do that too on the fly)?

Well I just run this:

-rw-r--r--    1 syncthin users    1962558464 Jul 13 12:05 IOM_Projects.pst

Jamaica> time cp IOM_Projects.pst test && time openssl dgst -sha test
real    0m 49.39s
user    0m 0.01s
sys     0m 25.95s
SHA(test)= 900efd28b61e9f90c9e90787604f7da279ed40db
real    0m 50.22s
user    0m 35.63s
sys     0m 13.15s
Jamaica>

2 GB file, very quick. Normally syncing my pst files takes hours. Is the above command correct?


This might be due to the PST format. I am not familiar with it, but if it inserts data at the beginning or middle, Syncthing will have to redownload the whole thing, as the block boundaries have moved around.

That’s odd. It could be what Audrius says, but I hope not because that would be a nail in the coffin for my theory that rolling checksums are unnecessary in the modern world…

Could you let syncthing start syncing a change to that file, and run vmstat 1 and iostat -dx 1 for about a minute (preferably the same minute, i.e. in parallel)? Also check what syncthing is reporting for transfer stats (i.e. is it downloading stuff)?

(If iostat isn’t available on that NAS it’s not a disaster, vmstat is more important.)

Ok, I’ve done the test. This post is going to be rather large, with a few screenshots; I apologize. Please trim this post if necessary.

So this is exactly what I’ve done. First I upgraded Syncthing to v0.11.16 (was on 0.11.13 before), on both the NAS and the PC. Next, I unshared all folders but 20_Outlook. Why? Because I hadn’t been syncing for a week due to adding ~500 GB of photos and giving the NAS time to index, but I kept working on my PC. I sync all of my virtual machines (~250 GB) and I didn’t want them to start syncing during this test, so I decided to sync only one folder. After running the test I think this may have contributed a lot to the overall result. For a start, the initial index exchange was quick, only ~4 MB exchanged; normally it is a lot more.

Now, log files. The Synology NAS doesn’t have vmstat so I used dstat, I hope that will do. At the beginning of the sync (17 min after restart) the GUI shows 4 items, ~10.6 GB not synced; at this time the pop-up window shows 4 files but no progress (no colors shown).

Pop-Up:

GUI:

At this time dstat and iostat files were taken. dstat file: [http://6b64f54dfd098fb3.paste.se/][1] iostat file: [http://fcd05774d464f10a.paste.se/][2]

I have to say that this time the progress was reasonably quick: a 4 GB file synced in 30 min. You can see in this screenshot that only 4.64 MB were transferred, so I’d say pst files are good for incremental sync, i.e. no inserts at the beginning/middle. You can also see that RAM usage goes down at some point after restart, from 279 MB to 107 MB.

I took more stat files while syncing the last file:

dstat file 02: [http://0248fdc05fa4a256.paste.se/][3] iostat file 02: [http://b9aab00abd99ee89.paste.se/][4]

Then, at some point around 47 min, I lost the pop-up info. I’m not sure why, or if this is of any relevance, but it happens to me quite frequently.

The GUI was then frozen for at least 10 min. How do I know? The uptime was frozen at 47 min. So I closed the GUI and opened it again, but by the time all folders were scanned (so that folders wouldn’t show “Unknown” status) the sync had finished.

All in all, 10 GB synced in 1 hr 20 min. At this point RAM went up again to 279 MB. Now as I write it is down to 60 MB, CPU @ 0.1%.

Still the question remains: why does it take 30 min to copy 4 GB locally, when running the same command (or is it the same?) on the box takes only 4 min? (I just ran the above cp/openssl command on the 4 GB HIMA_Projects.pst.) I also assume that when I now enable all folders and start syncing 1105 GB, the situation will be a bit different.

One note though. I use the NAS only as a mirror, so all scan intervals on the NAS are set to 43200 s to take it easy on the NAS. I don’t have a lot of files changing during one day. The Outlook pst changes daily, the virtual machines not really, and I add/modify only a handful of small files a day. So what I’m saying is that except for Outlook the system is nearly always in sync. I will try another test once all my folders are in sync; then I will run the above with all shares enabled to see if there is a difference, but I assume there will be, because that has been my experience so far… and now I’ve even added 500 GB of photos :)

[1]: http://6b64f54dfd098fb3.paste.se/
[2]: http://fcd05774d464f10a.paste.se/
[3]: http://0248fdc05fa4a256.paste.se/
[4]: http://b9aab00abd99ee89.paste.se/


So, here’s some speculation. In the first run I’m not fully sure what’s going on, really. Something is doing only reads and using a lot of CPU. Idle CPU is zero per dstat; iostat shows a lot of reads and that the md devices are at 100% utilization (i.e. doing as much I/O as they can). The physical disks aren’t that loaded (lower %util), so I’m guessing the limiting factor there is again CPU, in the RAID layer…

Trace number two looks more like a copy. Again, you’re at the limit CPU- and I/O-wise: the CPU is doing work half the time and waiting for disks the other half, and the md arrays are maxed out performance-wise.

Unfortunately this doesn’t tell us much, I was hoping there would be something more clear cut (swapping, being just CPU limited, …).

If you go to “Actions” -> “Settings” in Syncthing, hit “Preview” next to “Anonymous Usage Reporting” and dig out the part that says "sha256Perf": 122.36, what do you have there? That’s how much data we can hash per second, in MiB. I’m guessing there may be something suboptimal about how we read a block, hash it, then read the next that interacts badly with this box (and maybe others of the same class…).

{
  "folderMaxFiles": 81681,
  "folderMaxMiB": 541750,
  "longVersion": "syncthing v0.11.16 (go1.4.2 linux-arm default) unknown-user@syncthing-builder 2015-07-19 11:34:11 UTC",
  "memorySize": 499,
  "memoryUsageMiB": 560,
  "numDevices": 2,
  "numFolders": 19,
  "platform": "linux-arm",
  "sha256Perf": 1.8,
  "totFiles": 190257,
  "totMiB": 1131536,
  "uniqueID": "",
  "version": "v0.11.16"
}

I’m not sure if this is relevant (if the above changes), but at the moment I’m running a full sync on all folders; hopefully it will be finished overnight.

Ouch!

"sha256Perf": 1.8

That’s one point eight megabytes per second of hash performance, which means almost 40 minutes of just hashing for a 4 GB file. Even assuming that your box was loaded with other stuff when the test was run just now, that’s still very, very slow, and this is for sure the root cause…


Well, the in-place updates stuff would still help, as we would presumably not bother hashing the stuff we’re not changing.

Well yes, I understand… this is slow. But my question is: why do we hash something that has already been hashed? With my engineer’s eye I’d expect something like this: say a file is hashed and has 1000 blocks, so 1000 hashes. Now that file is changed; for instance block 1000 is changed and an extra 10 blocks are added. Why rehash those previous 999 blocks? They are the same, right? I know they are the same because the 4 GB file was recreated locally with just a minimal data transfer over the LAN.

Hmm, ok. So I see we are hashing it now… why? Isn’t this just a waste of time and CPU power? I’m also guessing that openssl dgst -sha is not exactly the same as that sha256? Because the above command took just 2 min on that 4 GB file.

How do you know they are the same without rehashing the whole file? All you know is that the mtime changed; you don’t know which part of the file changed. You don’t have a copy of the file to check for content changes, so you are left with comparing hashes.

What happens, technically, when we sync a file is something like the following:

  • Grab the list of block hashes we need for the new version of the file. For each of the blocks:
    • Does it match the hash of a block we already have on disk? That can be in the old version of the file we’re syncing, or another file altogether.
      • If there is a match for a block we already have, read that block. Verify that the hash is in fact correct. Write the block.
      • If there is no match, request the block from the network. Verify that the hash is in fact correct. Write the block.

The verify steps are the ones that are expensive for you, particularly in the first case. But here, try this build and see what happens: http://build.syncthing.net/job/syncthing-pr/750/

There’s a new advanced option, per folder, skipHashOnCopy. It’s probably not a good idea to run with it permanently, but it could confirm the guess.

OpenSSL has platform-optimized assembly using CPU-specific instruction sets on different models. Go, on the other hand, has a generic implementation which works on most platforms, though it is slow on some of them.

@AudriusButkevicius in this case I think the NAS only gets changes from the outside, so it doesn’t do this rehashing locally.


On AMD64 I think it’s fairly fast; at least it’s written in assembler with that intention. On ARM, as this is, it’s probably not very optimized at all, no… :grin:

Yeah, and I don’t think we rehash existing content, only what arrives over the wire?

No, we do. :) I seem to remember there being some cases where the block finder wasn’t fully up to date, hence why we need to verify that we actually got what we expected (apart from local modifications we haven’t discovered yet)…

Edit: Prior to the block finder I think we just pieced together what we could from the original and network blocks, then hashed the entire file at the end to verify success. So this is the same, just done during the copying…


We could reuse the source file descriptor in that loop as well, just by checking whether this block comes from the same file as the previous one and in that case not closing and reopening, but that didn’t actually make a difference in a benchmark I just ran, so I’m not sure it’s worth it.

Barring bugs and irregularities in our index, I think the best answer to this is that you may have changed the file we are about to reuse, one millisecond before we grab data from it. Then we would do the wrong thing, if we didn’t double check the data we read.

We could also check mtime after reading the chunk, and if it’s the same as the old file, not verify the buffer. Though that probably means blockmap needs to store mtimes.
