Folders out of sync, even after rebuilding indexes

Yet looking at this:

[R7DNA] 2016/10/30 08:51:07.957963 sync.go:111: DEBUG: RWMutex took 1m31.8756011s to lock. Locked at model\model.go:1329. RUnlockers while locking: model\model.go:1080, model\model.go:2006, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:2006, model\rwfolder.go:381, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\rwfolder.go:210, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:1080, model\model.go:2006, model\model.go:2006, model\model.go:2195, model\rwfolder.go:210, model\model.go:2006, model\rwfolder.go:210, model\model.go:2006, model\model.go:2006, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, model\model.go:591, 
model\model.go:591, model\model.go:591, model\model.go:591

It looks like it’s mostly contention?

Actually, the code that is supposed to capture the issue is not good enough, as it seems some rlocker() is stuck somewhere.

Here’s a Node B panic log from the latest build

syncthing.log (17.1 MB)

Can you try:

https://build.syncthing.net/job/syncthing/2335/

with STTRACE=sync STDEADLOCKTIMEOUT=300 thanks.

I need a Windows build so I’ll wait for that to complete tomorrow

https://build.syncthing.net/job/syncthing-windows/285/ I’ve kicked it off.

Here’s the log from the latest build. This is with Node B running and Node A shut down. It stopped shortly after scanning was complete

syncthing_NodeB_solo.log (5.3 MB)

Can you try latest build from: https://build.syncthing.net/

And also set STDEADLOCK=250 env var together with the others.

I now know exactly where it happens and which commit introduced it, yet I can’t explain why it would happen: both mutexes are held in read mode, and each holder is trying to acquire the other’s lock in read mode too.

Here you go. It stopped almost immediately so I did another log but the same thing happened: syncthing_NodeB_solo2.log (15.0 KB) syncthing_NodeB_solo3.log (16.0 KB)

Is there anything else you need or are these logs enough?

No I think we now know why this happens.

Please let me know if you need me to test any fixes :)

A fix was committed a while ago, and should be in the latest version. If you are still getting deadlock panics, please provide new logs.

Here’s another panic log from the latest debug build (#307)

I’m not sure if it’s the exact same conditions - I managed to get the two nodes synced up (with several panics / restarts). I’ve now introduced Node C and it seems to be flaky again (the log is from Node B)

syncthing_14.12.log (15.6 KB)

Is this supposed to be fixed now? If so, I’m still getting the error in 0.14.14.

Panics are supposed to be fixed. If you still get panics, please provide logs with the debug options requested above.

OK, here’s another panic log then. This time generated by 0.14.14 release.

syncthing.log (20.1 KB)

Sorry, can you get output without the STDEADLOCK env var set? That will fall back to using our lock detector, which I find a bit more useful. Thanks.

Actually, try the build from here:

https://build.syncthing.net/job/syncthing-pr/3120/

Another issue cropped up.

Sure.

Is that the equivalent of this Windows build? https://build.syncthing.net/job/syncthing-pr-windows/1795/