A little help perhaps? ...I think i almost have it

So I’m basically trying to add a feature that simply inserts into the debug log weather we have won a file conflict or lost and (if lost) the computer that won was < device ID >.

I feel like I almost have it but cant seem to figure out for the life of me what’s not working right. I feel like I just don’t quite know whats happening with the structure of the []fileVersion Maybe someone can help me with the last piece.

My code is here (I hope you can see it): https://github.com/syncthing/syncthing/compare/master...nrm21:conflict_log?diff=unified&expand=1&name=conflict_log

Tests show that when the folder rescan happens on a remote client everything works correctly 100% of the time. But when the rescan is done first from the local computer (doing the logging) everything is switched and 100% wrong all the time. It shows we won the conflict when we lost and vice versa. It seems to exhibit these behaviors consistently so if I can just find a way to check if the scan is coming from local behave one way and if not the other.

Anyone have any ideas? I’m hoping it’s an easy fix.

At a glance it looks reasonable to me, I’m not sure why the scan order would affect things. However I’d suggest not doing that logging there. In the general case you can’t be sure that the device that sent the update is responsible for the conflict (in the sense that it was they who changed the file), and you can’t be sure to even know or hear about conflicts that you win - they may just be resolved somewhere else.

I’d suggest hooking into where the actual syncing happens, and just log the losses somewhere around here:

Well yeah that’s true but really the feature was going to say what computer we conflicted with in the immediate transfer (and let the user trace it back, client by client if really needed).

But now that you said that I got an idea for improvement. I was thinking it would not be possible to know the ultimate source of the conflict unless some remnant is recorded in memory by the original computer that can be passed on until all computers have it and know who the original offender was (especially in a swarm where some are going on and offline frequently). But hey, we don’t even need to do that… we can just stamp it on to the filename right? It’s getting changed anyway. But hang on… it’s the loser file that gets it’s name changed. We need to know who the winner was right?

So what… we stamp the winner client onto the loser filename? Wouldn’t that just confuse people? Hmm, I gotta think about this.

And of course the loser file will eventually be propagated to everyone else with filename.sync-conflict.ext… so why not this instead:

filename.sync-conflict-AAAAAA.ext (the A’s being the start of the offending devID)

If you view the loser as being the offender, all you need to do is name the file / logs after the local device ID when it does the rename. That’s probably the simplest.

If a device creates a conflict file then it knows the ID of the device it’s pulling from at the point that the conflict occurred, surely?

So the conflict file can be named after one of the devices on which it last existed (before being turned into a conflict file), if that makes sense…?

Yeah and that’s what I did.

Actually, no. :wink: We just have a list of blocks, and these get distributed as requests on any peers that may have them. It doesn’t need to be a block from the same version or even the same file, it just needs to be the same contents…

So after looking at this for a day or 2 I have concluded this is not going to be easy.

The client can of course calculate the latest modification time of a file to determine if it wins or loses the conflict but unless a “LastModifiedBy” <string (or maybe []byte)> variable is added to the protocol.FileInfo struct, I don’t see how the loser client (who is responsible for renaming his own file to something else) can know WHO the winner was.

IF a “LastModifiedBy” variable were to be added to the protocol.FileInfo struct then I would guess you could have the winner stamp his device ID in this method/function:

message.go:74: func (f FileInfo) WinsConflict(other FileInfo)

But then the other problem would be… can we guarantee that the info on who last modified it would even be available and have been passed across the network yet from winner to loser? Also it would require another protocol change too wouldn’t it? :frowning:

I think your analysis is spot on. We’ve been thinking along these lines before, but a simple changed-by attribute doesn’t add enough value to justify it, in my opinion. The “last change” may be just a simple update to the modification time, while the actual change to cause the conflict was by someone else entirely slightly earlier.

That said, the version vector holds some relevant info. The keys are the first 64 bits of the device ID, and those entries with higher values than in your existing version are at least partly guilty for the conflict.

@AudriusButkevicius’s comment from the PR:

As for which name there should be, I think when we move the file for conflict we have to take the first 5 characters of the device ID who has last modified the file which we are now moving for conflict.

The way you’d get that is check the version which is concurrent and loosing, and then match the first 63 bits of the version ID with the device ID’s available in the model.

I haven’t dug into that bit of the code, but is he mistaken somehow?

The version struct contains key value pairs, the key is the first 63 bits of the device ID, the value is the version is the version number whih can be identified who has produced the conflicting modification.

Well I still don’t understand the version vector enough to know how this works, I guess I’ll keep looking in that area.

Part of why I posted this was to hope you guys would point me closer in the right direction. :slight_smile:

But yeah, if I can figure that part out… maybe I can crack this thing. Maybe even make a “func GetDeviceIDFromVersionVector(Vector) []byte {}” for future use as well if any later code needed it.

Yeah… For this purpose you could probably just add a base32 formatting of those first 63 bits, as that will be a unique prefix of the device ID.

Ok I’ve just about got this feature but this code is causing the occasional characters to come out wrong/scrambled and I cant figure out why:

bytes := make([]byte, 8)
binary.BigEndian.PutUint64(bytes, newestClientID)
remoteID := base32.StdEncoding.EncodeToString(bytes)
return remoteID[:7]

For example for an ID of “XKX75BA” it will return XKX754BA or sometimes XKX753BA (notice a char inserted after the 5th char, sometimes it’s a diff spot). I’m guessing it’s a minor thing I’m not seeing.

Also when I write the byte array to debug line it LOOKS right when I convert the decimal numbers to hex that is. And when I convert the hex to b32 using online websites, it checks out ok. Dunno what it is.

I see no reason that would happen… The short ID is what it is and base32 should be deterministic enough.

Ok after some more testing it looks like the problem is with the time.format command changing the string not the function as that seems to be returning the right data.

Ah. Yeah, the time.Format thing can bite unexpectedly. :expressionless:

Ok so I relied to your replies in the PR request. What do you want me to change (if anything) for this PR to go through?