Since 0.14, lots of 'sync-conflict' files appearing

Been running syncthing for a long time now. Everything has worked well.

However, since upgrading to 0.14 I’m finding lots of ‘sync-conflict’ files are appearing while I’m working on code in IntelliJ. E.g. I’ll be halfway through editing a function and a new ‘.sync-conflict-20160723-151306.java’ will appear in the editor.

Logging into the web interface on both machines I don’t see any errors, no banners about conflicts or anything else.

Two active clients, both on 0.14. One is OSX, my ‘source’. The other is FreeBSD, my ‘backup’. It’s not running any other apps, just a machine that has syncthing installed so that I get a second copy of everything elsewhere.

Reviewed the syncthing logs for the OSX (source) machine, nothing in either stdout or stderr. Logs on BSD show lots of ‘pull: peers who had this file went away, or the file has changed while syncing. will retry later’ at about 3 minutes before the conflict file’s timestamp.

Any suggestions on next steps to figure out what the cause of the conflicts is?

It’s potentially because you haven’t allowed all peers to finish scanning and settle syncing before starting to modify files.

Ok, but I can’t control (and it’s not very visible) if/when syncthing disconnects if both machines have been up and running for 6+ hours, or there is a network glitch, or whatever.

It’s not clear why it’s a conflict though. If files are only being modified on one machine then there should be no conflicts caused regardless of what actually happens.

In 0.14, the database was reset, hence if two devices advertise a different file from the very beginning, it’s a conflict by definition.

This has nothing to do with disconnecting; it is to do with allowing them to reconcile the global state at least once before starting to modify files.

Aha. Well, good (bad?) news is that I upgraded all my machines yesterday at 9am, they were all processed and got to ‘Up To Date’ pretty quickly (e.g. an hour or so later).

Here is the state info for the particular folder I’m noticing it with right now: 22587 items, ~11.3 GiB

So I’d expect the database reset to not be the cause.

BTW - thanks for continuing to think about potential causes, it’s appreciated.

Additionally, files that I’m working on get altered externally (I assume, by syncthing) and have nulls (hex zero) appended to the end of them. Sometimes a single one, sometimes many.

Happened probably 8-9 times today, with the various files that I’m working on.
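
One way to confirm which files have picked up trailing nulls is to count the NUL bytes at the end of each file. The following is a minimal sketch, not something from the thread: the function name `count_trailing_nuls` is made up for illustration, and it relies only on standard `od`, `tr` and `awk`.

```shell
# Hypothetical helper: print how many NUL (0x00) bytes sit at the very end of a file.
count_trailing_nuls() {
    # od dumps every byte as two hex digits; tr puts each byte on its own line;
    # awk tracks the length of the current run of "00" bytes and prints the final run.
    od -An -v -tx1 "$1" | tr -s ' ' '\n' |
        awk 'NF { if ($1 == "00") run++; else run = 0 } END { print run + 0 }'
}

# Example use: count_trailing_nuls Main.java
```

A result of 0 means the file ends with ordinary content; anything higher is the number of nulls that were appended.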

Will disable syncthing for a day and see if that makes a difference. If it doesn’t then will close this as I’ll have something else to go figure out locally.

That would be exceptionally odd, and to my knowledge never previously reported. I hope it’s something entirely different.

I agree, though the 0.14 upgrade is the only change I made in the last few weeks to the BSD side of things since I only do software updates (pkg upgrade etc) monthly.

I noticed a lot of messages like this are being logged around the time the problem occurs:

      Jul 25 09:17:15 syncthing: [UZP5V] 09:17:15 INFO: Puller (folder "testfolder", file "syncthingcheck/myfile"): pull: peers who had this file went away, or the file has changed while syncing. will retry later

The files are changing, but that’s because they’re being worked on and the app (IntelliJ) is writing to the file pretty much on every keypress.
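
To line these log entries up against the conflicts, the timestamp embedded in each conflict file name can be turned into a readable date and time. This is a rough sketch that assumes the `sync-conflict-YYYYMMDD-HHMMSS` naming seen in this thread; the function name `conflict_time` is hypothetical.

```shell
# Hypothetical helper: extract the timestamp from a conflict file name,
# e.g. myfile.sync-conflict-20160725-092755 -> 2016-07-25 09:27:55
conflict_time() {
    printf '%s\n' "$1" | sed -n \
        's/.*sync-conflict-\([0-9]\{4\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)-\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\).*/\1-\2-\3 \4:\5:\6/p'
}

# Example use: conflict_time myfile.sync-conflict-20160725-092755
```

Names that do not contain a conflict timestamp produce no output, so the helper can be run over a whole directory listing.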

So threw together something that does horrid things to files - basically create a file, rename it, append something, rename back. Does this a lot, pausing infrequently, and checks constantly.

Fails 100% of the time in my environment after 10-15 minutes.

$ ls
myfile                    myfile.sync-conflict-20160725-092755
myfile.sync-conflict-20160725-092420    syncthingcheck.sh
myfile.sync-conflict-20160725-092535

Enjoy. Happy to run debug builds or whatever to figure out what the actual cause is.

#!/usr/bin/env bash
# bash rather than sh: the script relies on $RANDOM, which POSIX sh does not define
n=1
rm -f myfile myfile.tmp
touch myfile
echo "Start"

while [ $n -gt 0 ]; do
    mv -n myfile myfile.tmp
    if [ -f myfile ]; then
        echo "ERROR! myfile still exists"
        ls -la myfile
        exit 1
    fi

    touch myfile
    date >> myfile.tmp
    echo $n >> myfile.tmp

    n=$(( n + 1 ))
    if [ -s myfile ]; then
        echo "ERROR: myfile has size!"
        ls -la myfile
        exit 1
    fi

    rm myfile
    mv -n myfile.tmp myfile
    if [ -s myfile.tmp ]; then
        echo "ERROR: myfile.tmp exists"
        ls -la myfile.tmp
        exit 1
    fi

    if [ $RANDOM -gt 32764 ]; then
        echo "Zzzzz...."
        sleep 61
        echo "Resume..."
    fi

    conflicts="$(ls | grep sync-conflict-)"
    if [ ! -z "${conflicts}" ]; then
        echo CONFLICT!
        echo "${conflicts}"
        exit 1
    fi
done
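
The script above only checks the current directory. For a real folder with many subdirectories, a recursive check along these lines may be more useful. It is only a sketch: the function name `list_conflicts` is made up, and it assumes conflict copies always contain `sync-conflict-` in their name, as they do in this thread.

```shell
# Hypothetical helper: list every conflict copy under a directory (default: current directory).
list_conflicts() {
    find "${1:-.}" -type f -name '*sync-conflict-*' | sort
}

# Example use: list_conflicts ~/Sync/testfolder
```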

Conflicts are only caused if two different sides are modifying files.

I have a macos + freebsd setup at home, I’ll give your script a spin later to see what’s going on. Certainly it’ll struggle to sync a file that changes constantly, but it shouldn’t cause a conflict.

My BSD instance is a dedicated VM with the single Syncthing app on it. It doesn’t run any other apps and I don’t have ‘users’ who log into it.

The only unusual thing about this VM is that it mounts its syncthing storage via NFS to a ZFS pool sitting on another VM running on the same machine. That’s been working fine, forever, though.

I’m trying to eliminate as many things as possible.

Next step is to totally skip that machine and see if the same problem seems to occur OSX->Windows.

So the full setup is

        OSX
(HFS+ local filesystem)
     Syncthing
         |
         |
         v
     Syncthing
      FreeBSD
    (NFS mount)
         |
         |
         v
     NFS Server
      FreeBSD
       (ZFS)

?

With lots of modifications happening on the OSX side only?


Edit: Yep, reproducible with that script, so I’ll do some detective work and see what’s going on.

Your diagram is correct. And yes, modifications only on OSX, not the BSD side.

(Sort of) glad you can reproduce, at least I can assume I’ve not got a funky hardware issue.

I also eliminated the BSD / NFS server as I can also reproduce syncing to a folder on Windows (Surface Pro - totally idle machine).

Again, happy to help debug however I can.

Well, your test case is frankly bizarre as I have no real expectation of Syncthing coping gracefully with a file oscillating between existing and not existing many times per second. The only thing that makes this work even slightly (apart from the occasional long sleep) is that the content doesn’t change much so it doesn’t always have to notice that the file has been removed and recreated five times between two requests…

That said, there shouldn’t be any conflict files so that’s what I’m looking into. :)

I’ll see if I can do a simpler example :) But, regardless, somehow a change is being detected at the destination side even though one never happens.

Do you think it would matter if the system times on both machines are about 1 second adrift?

(edit, I’ve got a real world app causing it - IntelliJ’s development tool, but no idea if that does a similar thing to office and writes new file to tmp → rename)

In short, something happens and we forget that we just updated the file when updating it again, thus seeing a size mismatch which becomes a conflict.

I can’t reproduce this at all locally on my Mac, but quite easily locally on a FreeBSD box. Not sure about the relevance of that.

Another aside, I’ve just sat and monitored IntelliJ using another hack (basically, while [ true ]; do ls >> x -lart; done), and see this behaviour:

-rw-r--r--   1 me  staff     9748 25 Jul 17:37 Main.java

next loop (i.e immediately)

-rw-r--r--   1 me  staff     9748 25 Jul 17:37 Main.java
-rw-r--r--   1 me  staff     9765 25 Jul 17:38 Main.java___jb_tmp___

next loop (i.e immediately)

-rw-r--r--   1 me  staff     9765 25 Jul 17:38 Main.java

So it’s not updating the file in-place but is doing a write/delete/rename. And I can observe it’s doing it several times a minute as well.

e.g.

$ grep __ x
-rw-r--r--   1 me  staff     9712 25 Jul 17:51 Main.java___jb_tmp___
-rw-r--r--   1 me  staff     9696 25 Jul 17:51 Main.java___jb_old___
-rw-r--r--   1 me  staff     9712 25 Jul 17:51 Main.java___jb_old___
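
For anyone wanting to reproduce that save pattern without IntelliJ, the sequence can be approximated in shell. This is a sketch under assumptions: the `___jb_tmp___` and `___jb_old___` suffixes come from the listings above, but the exact order of steps is inferred from them rather than taken from IntelliJ itself, and the function name `safe_write` is made up.

```shell
# Hypothetical imitation of the observed save: new content is read from stdin.
safe_write() {
    target=$1
    # 1. Write the new content to a temporary file next to the target.
    cat > "${target}___jb_tmp___"
    # 2. Move the current version aside, if there is one.
    if [ -e "$target" ]; then
        mv "$target" "${target}___jb_old___"
    fi
    # 3. Rename the temporary file into place.
    mv "${target}___jb_tmp___" "$target"
    # 4. Remove the old version.
    rm -f "${target}___jb_old___"
}

# Example use: date | safe_write Main.java
```

Between steps 2 and 3 the target briefly does not exist, which matches the file never being updated in place.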

Yep and that’s fine and reasonable behavior that I wouldn’t expect to cause any issues at all. It’s what we do with temp files as well.

Last one from me unless there is something else you need.

#!/usr/bin/env bash
# bash rather than sh: the script relies on $RANDOM, which POSIX sh does not define
n=1
rm -f myfile2
touch myfile2
echo "Start"

while [ $n -gt 0 ]; do
    date >> myfile2
    echo $n >> myfile2
    sleep 0.$RANDOM

    if [ $RANDOM -gt 32764 ]; then
        echo "Zzzzz...."
        sleep 61
        echo "Resume..."
    fi

    conflicts="$(ls | grep sync-conflict-)"
    if [ ! -z "${conflicts}" ]; then
         echo CONFLICT
         echo "${conflicts}"
         exit 1
    fi
    n=$(( n + 1 ))
done

This version doesn’t move files around, and has more pausing as if new characters are being typed (and constantly saved), with the more occasional ‘thinking’ time.

This also shows conflicts when syncing between OSX and Windows, but takes longer than the first script to cause a conflict.

[Edit: yes, will grab a test version]

@Uniquenospacesshort

Try this build on for size:

https://build.syncthing.net/job/syncthing-pr/2610/artifact/

On the sending side, or both.