Slow sync sending files from Android

@mraneri Did caseSensitiveFS (see the Syncthing documentation) make a performance difference with your “Periodic transfer bursts, then no traffic” problem?

If you were confused about why I suggested enabling that option, this performance issue on Android is why. I was wondering if it was the same problem.

Yes - this is the same case.

I did some tests, and even with this option enabled, Syncthing is killing I/O on my Android phone, but at least there’s enough left for it to sync.

So another guess: how many threads are doing directory scanning? If more than one, maybe that’s the issue - polluting and thrashing the cache, or hammering SQLite?

As for the versions, actually, GH issues 1787 and 2739 say it was implemented in 1.9.0. And the “fixer” was already discussed here /t/periodic-transfer-bursts-then-no-traffic/25184 [only 2 links allowed for new users].

I didn’t have a large transfer to do until about a week later. I did set case sensitive FS on the Linux side but both endpoints happened to be rebooted for other reasons. The subsequent transfer was faster and seemed not to suffer from the start/stop nature of the original problem. I can’t say for sure it was the setting change vs the reboot though.

The “folder stuck at scanning” issue has been related to case insensitivity for a very long time.

Directory listing on my Android is slow - I’ve always seen that using adb, as busybox ls seems to always fetch the attributes:

lisa:/storage/emulated/0/DCIM/Camera $ strace -efile ls
openat(AT_FDCWD, ".", O_RDONLY)         = 3
newfstatat(4, "Raw", {st_mode=S_IFDIR|S_ISGID|0770, st_size=3452, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(4, "thumbnails", {st_mode=S_IFDIR|S_ISGID|0770, st_size=3452, ...}, AT_SYMLINK_NOFOLLOW) = 0

– these newfstatat() calls are not necessary for a short (filename-only) listing (not to mention the use of file descriptors…); a single getdents64() loop is enough. On Linux there’s also the newer statx() call for better performance (well, newer compared to stat() - it’s 8 years old now, but kernel 4.11 is still new for some embedded hardware).

At first glance, reading file attributes shouldn’t be required for case-checking - only a pure directory read. And that should be fast even without an application-level cache…

So I wonder: doesn’t that ReadDir() mentioned by @calmh end up stat()ing all the files? Can anyone familiar with the codebase share some insights?

No, it opens the directory as a file descriptor and calls Readdirnames, which is the quick “I don’t care about file attributes or sorting” variant.
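
For illustration, a minimal self-contained sketch of that kind of names-only listing (not the actual Syncthing code; the path is just an example):

    // Names-only directory listing: on Linux/Android, Readdirnames is
    // served by getdents64 and never stat()s the individual entries.
    package main

    import (
        "fmt"
        "os"
    )

    func listNames(dir string) ([]string, error) {
        f, err := os.Open(dir) // open the directory as a file descriptor
        if err != nil {
            return nil, err
        }
        defer f.Close()
        return f.Readdirnames(-1) // -1 = all entries; unsorted, no attributes
    }

    func main() {
        names, err := listNames("/storage/emulated/0/DCIM/Camera")
        if err != nil {
            fmt.Println(err)
            return
        }
        fmt.Println(len(names), "entries")
    }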

Can you provide a build with android:debuggable set?

Alternatively, one with simply an increased caseCacheTimeout.

I see it’s been like that since the beginning; however, a 10-second value was introduced at some point as well. It’s not in lib/fs/folding.go currently - maybe it was somehow merged/reused along the way, and that decrease caused the effect we see. Also, a 1-second threshold might explain the per-device/per-directory difference in behavior.

I’ve verified my DCIM/Camera listing timing using Termux (to overcome the busybox issue): it takes 1.4 s, entirely spent in an openat() syscall followed by a series of getdents64() calls, with no newfstatat(). Yes, it’s weird that busybox, with all those excessive calls, finishes in just 1 second; this might be related to permission checks somehow - adb shell is an internal app, while I have to grant storage access to Termux (and I can’t run the embedded busybox from within it, no access to /bin). However, assuming a similar enough codebase, Termux-installed busybox ls takes 1.6 seconds (spent mostly in newfstatat()), which confirms that getdents64() alone is faster, yet not fast enough, and that syscalls are clearly slower for an installed app.

Therefore even the “just read the contents” cache, as presented by @calmh, would already be invalid by the next loop.

I’ve got 7930 files there, so at 1.4 s per listing it’s going to take 7930 × 1.4 ≈ 11102 seconds (3 hours) just to scan this directory, assuming it’s not being retried by some recursion. And I’ve got more large directories on my phone…

Having a 10-second cache would save at most a factor of 10/1.4 ≈ 7 (in practice probably less; there are other things being done in each loop), finishing somewhere above 1554 seconds, which is about half an hour. Possibly manageable.
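
Spelling that estimate out (the “one case check per file, each uncached check relists the whole directory” model is my assumption):

    // Back-of-the-envelope model: each of the 7930 files triggers a case
    // check, an uncached check relists the directory (1.4 s measured),
    // and one cached listing is shared by all checks within the TTL.
    package main

    import "fmt"

    func main() {
        const (
            files       = 7930 // entries in DCIM/Camera
            listSeconds = 1.4  // measured names-only listing time
            cacheTTL    = 10.0 // proposed cache lifetime, seconds
        )
        uncached := files * listSeconds            // 11102 s, about 3 h
        checksPerListing := cacheTTL / listSeconds // about 7 checks per listing
        cached := uncached / checksPerListing      // about 1554 s, about 26 min
        fmt.Printf("uncached: %.0f s, cached: %.0f s\n", uncached, cached)
    }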

So I’d like to have this cache valid for 20 seconds. If such a value can’t be upstreamed (RAM considerations?), it should either be UI-configurable or possibly increased just for the Android build (a Syncthing-Fork local change).

Debug builds are mentioned there. Feel free to make a PR with the changes you’d like to try and measure; it will then be built by CI. Or just fork the repo to get your own builds ready. :slightly_smiling_face:

If possible: put something like os.Getenv() in place of the current value in Syncthing’s Go code, and then use the wrapper’s troubleshooting menu to set the env var.
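
A minimal sketch of what that could look like (not the actual Syncthing code; the STCASECACHETIMEOUT name is made up, and I’m assuming the current value is the 1 second mentioned above):

    // Sketch only: use the hard-coded timeout unless a (hypothetical)
    // environment variable supplies a valid override such as "20s".
    package main

    import (
        "fmt"
        "os"
        "time"
    )

    const defaultCaseCacheTimeout = time.Second // assumed current value

    func caseCacheTimeout() time.Duration {
        if v := os.Getenv("STCASECACHETIMEOUT"); v != "" {
            if d, err := time.ParseDuration(v); err == nil && d > 0 {
                return d
            }
        }
        return defaultCaseCacheTimeout
    }

    func main() {
        fmt.Println("case cache timeout:", caseCacheTimeout())
    }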

Very interested in the results of any experimentation!
In principle, though, I expect no effect of cache duration on scanning: the case check only happens on “input paths”, not on paths coming from walking the filesystem. So for a scan it will always check whatever directory/path is being scanned (we even ignore anything cached at that point, as a scan is meant to pick up changes), but when walking the tree none of that is subject to the case checks. Basically I expect all of this to only matter on the pull/sync side of things, not scanning.

Just adding that I think this is my issue too. Not sure if there’s anything I can provide, besides that it’s a Samsung S24 Ultra (using the Syncthing fork from F-Droid). It will do massive amounts of transferring for a few seconds, then die for many minutes, then do it all over again. It actually reads very much like this thread: DCIM/Camera won't sync - #9 by cwilmot, except it’s not exclusive to DCIM (not sure if that person tried other stuff; I did - in general it’ll go firehose mode and then not). The logs don’t appear to be reporting any errors. Folders are stuck syncing with no transfer going on, while the remote devices section of the web UI says it’s up to date.

Just thought I’d add: I tried the Play Store app yesterday. Unlike the F-Droid app it never stops, but it goes monstrously slowly - syncing photos at like 100-400 kB/s with a short break between each photo, though it does work on each photo consistently. Meanwhile the F-Droid app will do nothing for hours and hours, then all of a sudden toss hundreds of MB/s until it stops out of nowhere and won’t start again. I wouldn’t be surprised if the Proton VPN doesn’t help with some of the speeds here, but it’s weird that the apps behave differently lol. (The folders the two apps are working on are different, but it’s the same phone running both apps at the same time, just to see how each behaves.)

Alright, I’m back, hopefully one last time - I believe my issue is resolved, and here are the things I think mattered:

The F-Droid app (which is the base of the Google Play app, so credit is due) appears to just stop*, so I switched to the Google Play variant of the fork. Things stopped completely stopping after that: the DCIM folder was still really slow, but OTHER folders had a reasonable speed, given that everything is behind Proton VPNs (phone and PC).

Yet… the DCIM folder was insaaaneeelly slow. So I thought maybe I could just manually pull the files out through USB.

LOL. No. The PC and phone had a joint heart attack and refused to look at the folder, frequently claiming it was empty. Looking into it, apparently 24,000 photos create a lot of problems… which I had never noticed, because the gallery and similar Android apps I normally use didn’t care about this quantity.

So.. how to take the photos off without waiting 15 years? I used the Android Debug Bridge to pull the files from the phone to the PC. I did have to tinker with the commands, as ADB was initially also leery of the folder size. However, in the end it worked, and it worked way faster thanks to ADB’s raw stream (it’s one of the fastest ways to get photos off). The command initially failed when I tried to pull the whole DCIM folder, but when I picked only the Camera folder (where the 24k were stored lol), it moved them all in an hour (which is pretty good for that many photos).
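
For reference, the basic command shape is adb pull /storage/emulated/0/DCIM/Camera <destination on the PC> - the exact source path can differ per device.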

After that I used the in-app Android file manager to purge thousands of photos (just make sure the photos on your PC aren’t going to get purged by this - that’d be a Pikachu-face moment). Now Syncthing works on photos as expected too :wink:

*Now that I don’t have a massive DCIM folder, I’m not sure whether the F-Droid version would still have issues (maybe that was the reason it was hanging), but given that I’ve already switched all the folders to Google Play… I’ll probably not switch again.

The above is supposedly fixed with fix(fs): store `getExpireAdd` mutex in `caseCache` (fixes #9836) by marbens-arch · Pull Request #10430 · syncthing/syncthing · GitHub

Just released with v2.0.11.
I’ll give it a try after Syncthing-Fork has updated as well!

Now Syncthing-Fork has been updated to the release containing the fix. cc @here @a24 @thepartisan @davidvpe @AleksiDj73 @1AiiA1 @pikac @26kick @Banko

It’s more complicated than that. The slowness is caused by Android filesystem slowness, which can be amplified by hundreds or thousands of concurrent directory-listing calls, which in turn are caused by a buggy mutex (that part is a Syncthing bug).

At least if we’re talking about 1.28.0 (the first version that has the problem, according to @zzz-io’s, @BForestRunner’s, and my testing) relative to 1.27.12.

There are also other factors on the Syncthing side that could amplify the slowness (e.g. people have said that pulling gets slower in 1.10.0 compared to 1.9.0). I don’t think it’s entirely on the Android side.
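
To illustrate the amplification (a minimal sketch of the general failure mode, not Syncthing’s actual code): when one mutex is stored in the shared cache, concurrent callers wait for a single expensive listing instead of each doing their own.

    // Sketch only: a shared mutex in the cache ensures the expensive
    // directory listing runs once, no matter how many goroutines ask.
    package main

    import (
        "fmt"
        "sync"
    )

    type caseCache struct {
        mut     sync.Mutex // shared: serializes cache refreshes
        entries map[string][]string
    }

    func (c *caseCache) names(dir string, list func(string) []string) []string {
        c.mut.Lock()
        defer c.mut.Unlock()
        if names, ok := c.entries[dir]; ok {
            return names // cache hit: no listing at all
        }
        names := list(dir) // expensive on Android; runs at most once per dir
        c.entries[dir] = names
        return names
    }

    func main() {
        c := &caseCache{entries: make(map[string][]string)}
        listings := 0
        var wg sync.WaitGroup
        for i := 0; i < 100; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                c.names("/storage/emulated/0/DCIM/Camera", func(string) []string {
                    listings++ // safe: only reached while holding c.mut
                    return []string{"IMG_0001.jpg"}
                })
            }()
        }
        wg.Wait()
        fmt.Println("expensive listings:", listings) // 1, not 100
    }

If each caller effectively got its own mutex instead, the counter would reach 100 - one full 1.4 s listing per goroutine.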

I think attributing Android slowness fully to that mutex issue is a bit much - it was slow before that regression was introduced, and I have gotten significant speedups by increasing Syncthing’s sync I/O parallelism (the irony) with that regression in place. I’m not saying I don’t expect or hope for a speedup beyond lower memory usage from the mutex fix, just that there are a lot of other, unrelated reasons for filesystem access on Android to still be slow.

I agree. That’s why the second-to-last paragraph is there, but I’m not exactly great at wording things, so I’ve tried to improve it.