Syncthing nightmare sync: 330GB deleted down to 220GB

Can somebody help me with the following nightmare sync scenario? If I can’t understand what went wrong, I’ll have to stop using Syncthing, as the risk of data loss is too dangerous.

I’ve installed Syncthing on two machines, both running Manjaro Linux.

I did a test setup with folder A on both machines. I used it for a few weeks and everything seemed reliable. I set up simple versioning with one level only.

For my real use, I copied about 330GB of data from a USB hard drive onto both machines 1 and 2. I then set up folder B on machine 1 and let it create all the hash data etc. When that completed, I set up folder B on machine 2 and let it create its hash data and sync with machine 1.

This took a long time, but in the end everything was fine, with no data transferred, since both sides already held identical copies. Both folders had simple versioning with one version kept.

Nightmare preface.

On machine 2, I started restructuring the layout by moving directories around inside Syncthing folder B for a more organised structure.

I thought machine 1 would simply move everything relatively quickly, but it took hours on end, both re-hashing and showing that it was ‘downloading’ files (even though it should just have moved them instead).

Because machine 1 was taking so long to sync, I also deleted a few folders on machine 1, amounting to roughly 1GB.

At some point machine 1’s disk became full, as files weren’t being moved; they were being deleted into the .stversions versioning folder. I had to delete the files in .stversions to create more space.

At the same time, new files were being downloaded from machine 2 to machine 1 as part of the sync.

NIGHTMARE RESULTS.

Here’s the crazy result at the end of the full sync.

1. The global state has gone from about 330GB to 220GB!!! I don’t know why it’s all been deleted.

2. The local state on machine 2 has slowly shrunk from 330GB to 220GB to match the global state. Same with machine 1, although I think a physical check shows it holds more.

3. The folder structure on machine 1 partially matches the newly created structure on machine 2, but about 50% of the old structure still remains.

4. There are loads and loads of syncthing.*.tmp files everywhere on machine 1. I haven’t checked machine 2.

5. Many files on machine 1 now also show as 0 bytes.

Luckily I have a third backup on a USB disk, so I will have to restore both computers to their full 330GB of data.

This has really scared me off using Syncthing in future, as I don’t know why it behaved so strangely and:

1. Deleted data.

2. Downloaded large amounts of data instead of moving it.

3. Made many files 0 bytes.

4. Created loads of syncthing.*.tmp files.

Please advise if this is a known issue, as it seems unworkable for large data storage.

I can’t see this being normal, so I need to figure out what went wrong and how to make sure it doesn’t happen again. Otherwise I need to look for a more reliable solution.

Please advise.

Syncthing can’t always reliably detect moves, because it acts on changes as soon as it sees them, which potentially means it sees the deletes before the additions. Renames are not handled in any special way; they are seen as a delete plus a create somewhere else, which means all of the data you moved would be considered deleted, and therefore versioned.
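
To illustrate the ordering problem, here is a minimal Go sketch of a hypothetical event stream (made-up types and paths, not Syncthing’s actual code):

```go
package main

import "fmt"

// Event is a hypothetical filesystem change notification.
type Event struct {
	Kind string // "delete" or "create"
	Path string
}

func main() {
	// A rename may surface as two independent events. If the delete is
	// seen first, the syncer has no way to know a matching create is
	// about to follow, so it can't treat the pair as a move.
	events := []Event{
		{"delete", "big/movie.mkv"},
		{"create", "archive/big/movie.mkv"},
	}
	for _, ev := range events {
		switch ev.Kind {
		case "delete":
			// Acted on immediately: the file is announced as deleted,
			// so remote devices version their copy into .stversions.
			fmt.Println("announce delete:", ev.Path)
		case "create":
			// By now the old entry is gone from the index, so the new
			// path looks like brand-new data that must be transferred.
			fmt.Println("announce new file:", ev.Path)
		}
	}
}
```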

If your disk was full at some point, that would explain why you have a lot of 0-byte files. Temporary files get removed after 24 hours by default.
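
For what it’s worth, a 0-byte file is the classic footprint of a write that hit a full disk: creating the file can still succeed when there is no room left for its data. A rough illustration in Go (hypothetical path):

```go
package main

import (
	"fmt"
	"os"
)

func main() {
	// Creating a file only needs metadata, which can still succeed on
	// a nearly full filesystem.
	f, err := os.Create("/mnt/nearly-full/example.bin")
	if err != nil {
		fmt.Println("create failed:", err)
		return
	}
	defer f.Close()

	// The data write is what fails with ENOSPC, leaving behind the
	// 0-byte file that was created a moment earlier.
	if _, err := f.Write(make([]byte, 1<<20)); err != nil {
		fmt.Println("write failed, 0-byte file remains:", err)
	}
}
```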

If you want any kind of debugging on this, you’d have to provide logs from both sides.

It’s a real pity Syncthing can’t deal with ‘move’ properly.

I thought this had been resolved, but apparently not.

How does anyone rely on Syncthing at all for large amounts of data if it can misinterpret a ‘move’ as ‘delete/create’?

Why did it delete without creating, though, given the total size has reduced by 100GB?

Perhaps it has something to do with the disk filling up and there being no error protection for such an event?

  1. Do you know of an alternative solution that can handle ‘move’ properly?

  2. Where do I find the log files to upload?

Thanks

Well, data.syncthing.net says that there are 30k+ people relying on Syncthing, and one of them has 100TB of data.

Syncthing logs to stdout by default, so the logs should be there.

There are protections against filling up the disk, and they did work. Essentially they said: we can’t download more data, so stop the creates; but there are deletes we still need to process, so go ahead with those, as they free space.
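
As a hedged sketch of that decision logic (made-up types and thresholds, not the real Syncthing source), it amounts to something like:

```go
package main

import "fmt"

// Change is a hypothetical pending sync operation.
type Change struct {
	Kind string // "download" or "delete"
	Path string
	Size int64
}

// plan defers downloads that would push free space below the floor,
// but lets deletes through, since processing them frees space.
func plan(pending []Change, free, minFree int64) []Change {
	var runnable []Change
	for _, c := range pending {
		if c.Kind == "download" && free-c.Size < minFree {
			fmt.Println("deferring download (disk almost full):", c.Path)
			continue
		}
		runnable = append(runnable, c)
	}
	return runnable
}

func main() {
	pending := []Change{
		{"download", "archive/big/movie.mkv", 8 << 30}, // 8 GiB
		{"delete", "old/movie.mkv", 0},
	}
	// 1 GiB free, 256 MiB floor: the download waits, the delete runs.
	for _, c := range plan(pending, 1<<30, 256<<20) {
		fmt.Println("running:", c.Kind, c.Path)
	}
}
```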

The files that were renamed should not have been removed on the source, especially if the destination didn’t have space to download them, unless you had a send-only folder or something along those lines, which would then have enforced that state on the remote devices.

Ideally, we’d need a reproducer for this case to understand the issue better.

I know this is a silly question, but where do I find that in a file to upload to you?

I’ve searched stdout and can’t see anything of relevance to syncthing data.

I’m sure it’s working well for others, and it had been for me too during my one-month test with small data.

However, this experience has really shaken my trust until I can understand how to avoid it in the future.

Also, quoting Jakob Borg (calmh, Syncthing maintainer) from June 2015:

“A move/rename should be handled as a rename on other devices as well. Technically what happens is that the update is sent as a pair of entries; one describing the new file, one saying that the other file has been deleted. We notice that the block checksums are identical and optimize it by ‘reusing’ the file we were going to delete.”
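
As I read it, that optimization amounts to something like the following sketch (illustrative Go with made-up names, not Syncthing’s actual implementation):

```go
package main

import "fmt"

// FileEntry is a hypothetical index entry carrying per-block hashes.
type FileEntry struct {
	Path   string
	Blocks []string
}

func blocksEqual(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	// The update arrives as a pair: one entry deleting the old path,
	// one describing the new file.
	deleted := FileEntry{Path: "old/movie.mkv", Blocks: []string{"aaa", "bbb", "ccc"}}
	created := FileEntry{Path: "new/movie.mkv", Blocks: []string{"aaa", "bbb", "ccc"}}

	// If both entries are considered together and the block checksums
	// match, the receiver can rename its local copy instead of
	// downloading the data again.
	if blocksEqual(deleted.Blocks, created.Blocks) {
		fmt.Printf("rename %s -> %s (reuse existing blocks)\n", deleted.Path, created.Path)
	} else {
		fmt.Println("download", created.Path)
	}
}
```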

I thought the above meant that moving files on machine 1 should result in them being moved on machine 2 as well, rather than deleted and re-downloaded?

Usually there is no file; it logs to the console it was started in. Anyway, I suspect that even if you find the logs, they won’t be immediately useful. What we really need is a reproducer.

Moves/renames work only for a small set of files, not for renaming large directory trees.

If disk space fills up, the creates will stop.
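
For context, a gate like that boils down to checking free space before pulling data; a rough Linux-only sketch with a hypothetical threshold and folder path (not Syncthing’s code):

```go
package main

import (
	"fmt"
	"syscall"
)

// freeBytes returns the space available to unprivileged users on the
// filesystem containing path (Linux-specific).
func freeBytes(path string) (uint64, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return 0, err
	}
	return st.Bavail * uint64(st.Bsize), nil
}

func main() {
	const minFree = 1 << 30 // hypothetical 1 GiB floor
	free, err := freeBytes("/data/folderB") // hypothetical folder path
	if err != nil {
		fmt.Println(err)
		return
	}
	if free < minFree {
		fmt.Println("disk almost full: stop creating/downloading files")
	} else {
		fmt.Println("ok to pull, bytes free:", free)
	}
}
```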

Can you stop the deletes as well? That seems to have created the biggest problem.

Also, can you protect against:

  1. Files of 0 length being created when there is clearly no space left.
  2. Tonnes of syncthing.*.tmp files being created, as these also eat up space.

I understand there are 100TB+ setups being synced, but I can’t see how to be confident when the full-disk protection hasn’t worked properly, as above.

thanks

What you suggest is not always the right thing to do, because deleting files is a way to recover from a full disk and keep the sync operation going. In your case this would have resulted in re-downloads, which is not ideal; and since you had versioning enabled, the deletes freed no space at all, because a versioned delete just moves the file into .stversions on the same disk.
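
To spell out why a versioned delete frees nothing, here is a minimal sketch of what simple versioning amounts to, assuming it just renames the file into .stversions (hypothetical paths, not Syncthing’s actual code):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// versionedDelete mimics simple versioning: instead of removing the
// file, it is renamed into .stversions on the same filesystem, so the
// data blocks stay allocated and no space is reclaimed.
func versionedDelete(folder, rel string) error {
	dst := filepath.Join(folder, ".stversions", rel)
	if err := os.MkdirAll(filepath.Dir(dst), 0755); err != nil {
		return err
	}
	return os.Rename(filepath.Join(folder, rel), dst)
}

func main() {
	// Hypothetical folder and file; the "delete" is really a move.
	if err := versionedDelete("/data/folderB", "old/movie.mkv"); err != nil {
		fmt.Println(err)
	}
}
```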

The temp files get cleaned up automatically after 24 hours, so that’s a non-issue.
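
That said, if the leftovers bother you before the cleanup timer fires, a manual sweep is simple enough. An illustrative sketch that matches the syncthing.*.tmp pattern reported above (hypothetical folder path; double-check the matches before deleting anything for real):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"time"
)

func main() {
	root := "/data/folderB" // hypothetical Syncthing folder path
	cutoff := time.Now().Add(-24 * time.Hour)
	filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return nil // skip entries we can't stat
		}
		name := info.Name()
		// Match the temp-file pattern reported in this thread and only
		// touch files older than the default 24-hour retention window.
		if !info.IsDir() && strings.Contains(name, "syncthing.") &&
			strings.HasSuffix(name, ".tmp") && info.ModTime().Before(cutoff) {
			fmt.Println("removing", path)
			os.Remove(path)
		}
		return nil
	})
}
```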

There should not have been any actual 0-byte files (only temp files).

Did your files actually get deleted, or just zeroed (i.e., the number of files stayed the same)?

Anyway, we need a reproducer to look into this. Talking about your lack of confidence in Syncthing, and your surprise that something didn’t work as you expected, yields nothing that actually helps us understand or resolve it.

I guess people who sync 100TB have backups and don’t run with full disks.

I can’t check the number of 0-byte files now, as I deleted the whole folder and am restoring from the USB backup.

I wasn’t trying to cause offence, so apologies for that.

thanks
