I am running a distribution system from a sendonly master to 11 sendreceive devices. The sendonly master is the only device that creates and alters files, about 300 000 of them over a 6-10h timeframe. At the end of this timeframe almost 60 000 files are deleted. I check whether the final scan on the master has finished and then check the Folder Completion on the remote devices. Usually only deletions are still missing at that point, as I would expect.
Sometimes the process gets stuck on a few deletions. Last night none of my 11 receivers reached a Completion of 100%, because the master (FolderCompletion events) as well as the slaves (FolderSummary events) reported that each slave was missing 33 deletions (needDeletes = 33). About 1h passed between the first report of only 33 needDeletes and the time the folder was unshared, and during that hour no other changes were synced (or had to be synced). This means that the process was stuck on these files.
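For anyone who wants to check for the same symptom, here is a rough sketch of how the "stuck on deletions" condition could be detected from polled completion data. It assumes the `/rest/db/completion` REST endpoint with an `X-API-Key` header and the `needDeletes` / `needBytes` fields in its response. The function names, the one-hour threshold and the sample format are my own illustration, not part of my actual tooling.

```python
import json
import urllib.parse
import urllib.request


def fetch_completion(base_url, api_key, folder, device):
    """Fetch one completion sample for a folder/device pair.

    Assumes GET /rest/db/completion; base_url is something like
    "http://127.0.0.1:8384" (hypothetical here).
    """
    query = urllib.parse.urlencode({"folder": folder, "device": device})
    request = urllib.request.Request(
        f"{base_url}/rest/db/completion?{query}",
        headers={"X-API-Key": api_key},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def stalled_on_deletes(samples, min_stall_seconds=3600.0):
    """Return True if the newest sample still needs deletes and neither
    needDeletes nor needBytes has changed for at least min_stall_seconds.

    samples: list of (unix_time, completion_dict), oldest first.
    """
    if not samples:
        return False
    last_time, last = samples[-1]
    pending = last.get("needDeletes", 0)
    if pending <= 0:
        return False
    stall_start = last_time
    # Walk backwards until the counters differ from the newest sample.
    for timestamp, sample in reversed(samples):
        if (sample.get("needDeletes", 0) != pending
                or sample.get("needBytes", 0) != last.get("needBytes", 0)):
            break
        stall_start = timestamp
    return last_time - stall_start >= min_stall_seconds
```

Polling `fetch_completion` every few minutes per device and feeding the collected samples to `stalled_on_deletes` would flag a situation like the one above, where needDeletes stayed at 33 for about an hour.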
I then used the script I wrote for checking leftover Syncthing temp files to compare the folder on the master with the folder on each of the slaves.
I found that each of the slaves has exactly 17 files too many compared to the master (the same 17 on every slave). Those files were created at different times and deleted in the same timeframe as the rest of the 60 000 successfully propagated deletions. For a very few of these files, on only a few slaves, I also found temporary files belonging to them.
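The comparison itself is simple. This is not my actual script, just a minimal sketch of the idea: collect the relative paths on both sides, skip Syncthing temp files, and report what exists on the receiver but not on the master. The temp file naming (`.syncthing.<name>.tmp`, or `~syncthing~<name>.tmp` on Windows) is an assumption here, and the function names are made up for illustration.

```python
import os

# Assumed Syncthing temp file prefixes (Unix-like and Windows naming).
TEMP_PREFIXES = (".syncthing.", "~syncthing~")


def list_files(root):
    """Return the set of relative file paths under root, using '/' as
    separator and skipping files that look like Syncthing temp files."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith(TEMP_PREFIXES) and name.endswith(".tmp"):
                continue
            relative = os.path.relpath(os.path.join(dirpath, name), root)
            found.add(relative.replace(os.sep, "/"))
    return found


def extra_on_receiver(master_files, receiver_files):
    """Files present on the receiver but missing on the master, i.e.
    deletions that apparently never arrived."""
    return sorted(receiver_files - master_files)
```

Since master and receivers are different machines, `list_files` would be run on each one and the resulting listings compared in one place. Intersecting the results across all receivers shows whether the leftover files are the same everywhere.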
I hoped these errors would be fixed with 0.14.50, but it appears that it might be a different issue.
It is very hard for me to reproduce this reliably, because the process takes a lot of time and resources. I also cannot enable debug logging, because with this amount of changes a single syncthing.log can easily grow to 10+GB.
I hope this gives you some clues as to whether this is related to issue #5149 or has another cause.
Maybe someone notices a similar problem on a smaller setup and is able to reproduce it.