I have configured a new Linux device in my Syncthing environment and shared some newly added directories immediately from Windows target. There was a huge queue of folders waiting to be scanned (approx. 10, ~900GB), but in the end everything seemed to complete: all green, synced within my LAN.
But that is not the case. I checked the checksums of my directories and one directory was surprisingly off; the reason was missing files. What happened is that half of my precious family pictures and videos did not sync at all.
Source has ~300GB of files, but the target shows sync completed status and the size of the target directory was only ~110GB. Restarting the nodes didn’t change anything. After two more restarts the source finally showed that it was trying to sync some files, but it got stuck, and the newly created target was still showing as 100% synced.
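A check like the one described above (comparing directory contents by checksum to find missing files) can be scripted. The following is only a minimal Python sketch, not the tool actually used in this thread: the function names are made up for illustration, and it assumes that Syncthing's own `.stfolder`, `.stversions` and `.stignore` entries should be left out of the comparison. Each manifest can be built on its own machine and dumped to JSON, which is why file entries are stored as lists rather than tuples.

```python
import hashlib
import os

# Syncthing bookkeeping entries, skipped so they do not show up as differences
# (assumption: nothing else in the folder uses these names).
SKIP_NAMES = {".stfolder", ".stversions", ".stignore"}


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large videos do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each relative path (with '/' separators) to [size, sha256]."""
    manifest = {}
    for current, dirs, files in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirs[:] = [d for d in dirs if d not in SKIP_NAMES]
        for name in files:
            if name in SKIP_NAMES:
                continue
            full = os.path.join(current, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            manifest[rel] = [os.path.getsize(full), file_sha256(full)]
    return manifest


def compare_manifests(a, b):
    """Return (paths missing in b, paths missing in a, paths whose content differs)."""
    missing_in_b = sorted(set(a) - set(b))
    missing_in_a = sorted(set(b) - set(a))
    different = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return missing_in_b, missing_in_a, different
```

A non-empty "missing" list on one side, while Syncthing reports the folder as up to date, would be the situation described in this post.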
For now I have completely removed the directory from the source Syncthing node and re-added it (I also re-added it on the target, including a data wipe; I want to see what happens).
Anyway, there is some kind of issue here, and a missing consistency check on what exactly is synced and how.
BTW, I have no exclusions, just the defaults with Alphabetic download order. But in the middle of the process my target node went offline (I stopped Syncthing and rearranged some of the target directories by moving them around and editing the paths in config.xml). After that, syncing resumed as usual, so I don’t think this has anything to do with the issue.
Should I file a bug report for this or something? I don’t even know where to start. I don’t yet have good knowledge of how Syncthing works, mainly how it tracks the sync status of individual files and directories.
I probably made a mistake by not investigating the issue immediately. Is there a chance that I can look over the logs and see if there is something interesting there? Hmmm… let me try…
Device XXXXXX folder “Photo” (aaaa-bbbbb) has a new index ID (0xwhatever…)
Maybe this is actually my fault and I somehow damaged the target’s config.xml file, or made an unwanted edit for that target directory when I was changing the target’s paths?
But still, the directories are way out of sync, so there should be some indication of that.
Sorry, I’m not entirely clear on which device has the ~300GB of files and which one has only ~110GB. This sounds like Windows is sending files to Linux, because of the “immediately from Windows target”…
I have configured a new Linux device in my Syncthing environment and shared some newly added directories immediately from Windows target.
… but then this makes it sound like the reverse:
Source has ~300GB of files, but the target shows sync completed status and the size of the target directory was only ~110GB.
(Because Syncthing is bidirectional sync, really all devices are both source and target. So it’ll just be easier to follow if we just refer to the devices as “Linux” and “Windows”.)
Additional details about your setup would be very helpful:
How is Syncthing installed and running on the Linux and Windows devices? (e.g. standalone executable, Docker,…)
On Windows, is it the official Syncthing package or one of the community bundles? (e.g. SyncTrayzor)
If there are no obvious errors, please post screenshots of Syncthing’s web GUI with the folder panel expanded to show the sync status details on both the Linux and Windows devices.
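In addition to screenshots, the same numbers can be read from Syncthing's REST API. The sketch below is only an illustration under a few assumptions: the GUI listens on the default `http://127.0.0.1:8384`, the API key is copied from the GUI settings, and the helper names and the "suspicious" rule are made up here. It calls `GET /rest/db/status` for one folder and flags the case from this thread, where nothing is reported as needed although the local size is well below the global size. That condition can also occur legitimately (for example with ignore patterns), so it is a hint to look closer, not proof of a bug.

```python
import json
import urllib.parse
import urllib.request


def fetch_folder_status(folder_id, api_key, base_url="http://127.0.0.1:8384"):
    """Query GET /rest/db/status for one folder ID (e.g. the 'aaaa-bbbbb' style ID)."""
    query = urllib.parse.urlencode({"folder": folder_id})
    request = urllib.request.Request(
        f"{base_url}/rest/db/status?{query}",
        headers={"X-API-Key": api_key},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def summarize(status):
    """Reduce the status document to the few numbers worth comparing across devices."""
    global_bytes = status.get("globalBytes", 0)
    local_bytes = status.get("localBytes", 0)
    need_bytes = status.get("needBytes", 0)
    return {
        "state": status.get("state", "unknown"),
        "global_gb": round(global_bytes / 1e9, 1),
        "local_gb": round(local_bytes / 1e9, 1),
        "need_gb": round(need_bytes / 1e9, 1),
        # Hypothetical rule of thumb: nothing needed, yet less data locally than globally.
        "suspicious": need_bytes == 0 and local_bytes < global_bytes,
    }
```

Running `summarize(fetch_folder_status(folder_id, api_key))` on both the Linux and the Windows device and posting the two results side by side would show which side believes what.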
Thanks for the response. So Windows was basically uploading files to Linux. Linux ended up with 110GB of data synced and got stuck; Windows also got stuck.
Linux install is from official apt repository (https://apt.syncthing.net/).
Windows has been running SyncTrayzor for months and never had any issues syncing with other devices before.
Both are at version 1.27.4.
Currently all instances work fine. And I’m super surprised how few resources they use; excellent piece of software.
The only suspicious thing I found in the logs is the directory index recreation, which I already posted; it happened during synchronization for the folder ID that failed. Maybe I was not paying enough attention to the UI itself, because I didn’t find any issues when I was looking at it. Unfortunately I don’t remember the device statuses now.
As I wrote, after recreating that folder pair the process went smoothly and without any issues.
I forgot to mention that I also pressed the recheck buttons on both instances and waited, but it actually did nothing.
I have now been syncing another device with approximately the same 900 GB of data, and yet again I have an issue.
So the sync completed and the checksums match, but the new device now says that my remote (the one from which it was downloading the files) is out of sync at 26%, stuck with 365GB of data out of sync, in different folders than the one I was describing before! I checked the checksums after the sync and all is good, but Syncthing says that one of the devices is out of sync!
I should mention that this time I was connecting the new device to two others.
What is also funny: the first and second devices show that they are up to date with everything.
CHECKSUM:
A == B == C (all ok)
SYNCING STATUS:
THIS DEVICE => REMOTE DEVICE STATUS
Honestly, I don’t think the bug report is going to help you much, at least as long as the issue can’t be easily reproduced. Can you post some screenshots from the Syncthing Web GUI from both sides? Please make sure that all folder and device information is unfolded and visible.
In the bug report I provided one, with log entries that I think have something to do with it. I’m currently running the same sync routine for the third time, on a somewhat slower machine, just out of curiosity.
And because the error was not handled properly (the logical state of the “device” is impossible), this is already a bug in itself. It is not possible for another device to have the data with correct checksums if that data does not exist. Please see the log attached there.
Yeah, the question isn’t whether it’s a bug or not, but rather whether the developers can reproduce the problem in order to fix it. Log files can be helpful, but you should at least provide full debug logs with db and model enabled. They also need to include the actual corruption when it was happening, not the state after that. They also should be as short as possible (e.g. all unrelated folders and devices paused beforehand, etc.), as otherwise analysing super long log files is a major pain.
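As a rough illustration of the above: debug facilities can be switched on through the `STTRACE` environment variable when starting Syncthing, and a captured log can then be trimmed to the lines that mention one folder. This is only a sketch; the helper names are invented, and filtering by folder ID is a crude assumption that will drop related lines which do not contain the ID.

```python
import os


def debug_environment(facilities=("db", "model"), base_env=None):
    """Return a copy of the environment with STTRACE set to the given facilities."""
    env = dict(os.environ if base_env is None else base_env)
    env["STTRACE"] = ",".join(facilities)
    return env


def lines_for_folder(log_lines, folder_id):
    """Keep only the log lines that mention the given folder ID."""
    return [line for line in log_lines if folder_id in line]
```

Assuming the `syncthing` binary is on the PATH, something like `subprocess.run(["syncthing"], env=debug_environment())` would start it with both facilities enabled. With SyncTrayzor the variable has to be set in its own configuration instead.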
The best scenario, and the one most likely to get fixed, is to provide a short list of steps that anyone can follow to reproduce the issue.
I’m not a Go programmer, meh. But I guess I can try to be one for some time… It looks like I have to dig through the documentation and try to debug this solution.
Can you quickly guide me on how to get started? I have never done anything with Go; I’m a .NET dev though, and that world is super easy in terms of debugging…
But first I will take a look at the logs.
I need to find some time to work on this. Stay tuned.