I updated a client’s monthly server backup a day or so ago. The sending end is all scanned and ready to send…
The sending end is also set to a specific ip:port.
I restarted Syncthing this morning as it hadn’t done anything for a few days, but virtually nothing is coming in.
Other folders related to this sending end have been downloading, so I know the issue isn’t between the two ends but specifically with this sync folder. The logs reference this folder up until I restarted Syncthing this morning; after that there are no further log entries for it. Other folders from the sending end are included in the log, and there are no errors or panics. It can’t be permissions, as other files have already been written to the same folder.
The file is on a 14 TB SATA drive.
So is there anything that can be done to get the file to start syncing? There’s no Syncthing-related disk activity against this file. In fact, the date and time stamps are
and on another folder that’s happily syncing I get tonight’s SQL backups.
Could it be related to “Distributed deadlock on request” #6583?
Maybe. You can pause and unpause the folder to see if that gets it moving.
You could also follow the steps explained in the other recent threads debugging this sort of issue on how to get the stacktraces.
You could also run the stidxchk tool to check the database for consistency problems.
Pausing both ends made no change, so I turned on model logging and paused all the other folders and all but the affected remote devices, but it kept syncing. I even tried to shut down, and that was also ignored twice. But I got all this racing up the screen
This is probably irrelevant.
I looked at the SIGQUIT signal, but I assume that’s a Linux command? I’m pure Windows.
Anything else I can do to debug? I don’t know what stidxchk is and nothing is in the documentation for reference. Sorry.
It looks like it’s requesting stuff, so it’s probably moving.
The explanation for Windows is in the first post here:
I use SyncTrayzor, and whilst its GUI was showing everything paused, when I opened an external browser the folders and devices hadn’t changed (no pauses), so that explains why I’m still getting activity. I was also getting the momentary GUI pauses that have been reported elsewhere.
I tried http://127.0.0.1:8384/debug/pprof/goroutine?debug=2 but I get a 404 page not found. Debugging is enabled, and I recall I have to do something to get this to work, but I can’t remember or find what it is. I still have STHEAPPROFILE=1 enabled and it is making files, if that’s of any help, but that is probably not the trace you need.
I seem to recall I had to change the variable to STPROFILER=1. I then got panic-20200430-002945.reported.log, but I have restarted SyncTrayzor and paused everything but the one folder. It’s gone to Preparing to Sync.
http://127.0.0.1:8384/debug/pprof/goroutine?debug=2 still isn’t working and I need to call it a night, so will report back in the morning with any progress on the folder.
Read the docs: the value is not 1, and it starts a second HTTP server on a different port.
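For anyone following along, here is a minimal sketch of grabbing the goroutine dump from that second server rather than from the GUI port. It assumes STPROFILER was set to a listen address; `127.0.0.1:9090` below is only an example address and not something stated in this thread, and the function names are made up for illustration. The `/debug/pprof/goroutine?debug=2` path is the same one tried above against port 8384.

```python
from urllib.parse import urlunsplit
from urllib.request import urlopen


def goroutine_dump_url(profiler_addr: str) -> str:
    """Build the goroutine dump URL for a profiler listening on host:port."""
    return urlunsplit(
        ("http", profiler_addr, "/debug/pprof/goroutine", "debug=2", "")
    )


def save_goroutine_dump(
    profiler_addr: str, out_path: str = "trace.txt", timeout: float = 10.0
) -> int:
    """Download the dump to out_path and return the number of bytes written."""
    with urlopen(goroutine_dump_url(profiler_addr), timeout=timeout) as resp:
        data = resp.read()
    with open(out_path, "wb") as fh:
        fh.write(data)
    return len(data)


# Example address only; use whatever STPROFILER is actually set to.
print(goroutine_dump_url("127.0.0.1:9090"))
```

Calling `save_goroutine_dump("127.0.0.1:9090")` while a folder looks stalled would write the dump to `trace.txt`, ready to attach to a post.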
My apologies for the last post. It was very late (close to 1am I think), so I was very tired and not thinking straight. Thanks to tomasz86, I have now got it all appearing on the screen. However, after adding the variable and restarting the PC, Syncthing is now syncing, so if any folders stall I can send a trace.
I don’t know if this will be of any help…
trace.txt (783.4 KB)
What I’m finding is that if all but the one affected folder are paused and Syncthing is restarted, then the one folder will run and download pretty fast. I then think it’s all fine and resume all the others after a few hours have passed. But at some point later the affected folder will stop working. It’s as if Syncthing is overloaded with requests and everything grinds to a crawl (cpu threads = 93). Download speeds on any folder drop to low Mb/s, with the affected one at 0 other than an occasional 6 bps.
Also, trying to pause the folders no longer seems to work. They say they are paused, but a few minutes later the GUI refreshes and there’s been no change.
The trace does contain lots of syncing folders so hopefully there might be something that looks unexpected.
I’m running 39 folders / 14 remote devices, concurrency -1
There is nothing stuck in the trace.
If the trace was captured when everything was OK, then I guess that makes sense.
Try capturing it when you think you are experiencing a problem.
Thanks for having a look. I’ve managed to pause all but one folder, so I will see if that resumes syncing or just sits there. I sent the trace because I thought the folder in question hadn’t updated in hours.
We limit how many requests are sent out over the network across all folders, so you probably have contention between folders.
It could be that smaller folders look like they are making progress while the bigger ones don’t, as the same 4 MB is 0.5% of progress for one vs 0.005% for the other.
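To put numbers on that, here is a small sketch of the arithmetic. The folder sizes (800 MB and 80 GB) are hypothetical, picked only because they reproduce the 0.5% and 0.005% figures; they are not the sizes of any folder in this thread.

```python
MB = 1000 ** 2
GB = 1000 ** 3


def progress_percent(transferred_bytes: int, folder_bytes: int) -> float:
    """Return how much of a folder a given transfer represents, in percent."""
    if folder_bytes <= 0:
        raise ValueError("folder_bytes must be positive")
    return 100.0 * transferred_bytes / folder_bytes


# Same 4 MB transferred into two folders of very different (hypothetical) size.
small = progress_percent(4 * MB, 800 * MB)
large = progress_percent(4 * MB, 80 * GB)

print(f"small folder: {small:.3f}%")
print(f"large folder: {large:.3f}%")
```

So when many folders share the same request budget, the big folder can be receiving data at the same rate as the small ones and still appear stuck in the GUI.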
I tend to look at this
to judge if the job is stalled. At the moment it is still Preparing to Sync, so the above is just for example.
This probably isn’t helping
( myfolder id) isn’t making sync progress - retrying in 12h20m35.4700432s
Is there a way to cap the retry time so it doesn’t exceed, say, 1 hour?
It’s back up to the usual speed when all the other folders are paused…
Well you should fix whatever errors there are before spending time trying to understand why it does not sync.
I have. I’ve gone back to 1.3.4 to see how that compares. I’m trying to constructively explain what I’m seeing and the issues I’m getting, and I feel that I’m being disregarded. Looking through other threads from, say, the last month, others are getting similar problems but worded their issue in a different way.
I don’t wish the tone of this reply to sound bitter, just exasperated.
This means there are errors while syncing, which are logged and displayed in the web UI together with the actual error message. This error message is what Audrius refers to, i.e. what needs fixing. If it isn’t evident from the error message what to do, please post it here. If there are new things to sync, Syncthing will do that regardless of the delay mentioned; this delay is just for retrying the existing items that failed if there are no other changes.
There weren’t any errors. Everything was running as expected, and all the other folders were up to date, syncing or scanning.
I just need to try 1.3.4 to make sure that it is a 1.5.0 rc2 issue and not a ‘number of folders / IO / something else’ issue, as I know that 1.3.4 worked well for me in the past.