Unexpected (?) proto unmarshal behaviour

I am tracking down an issue on the overflow PR, and encountered a behaviour that I did not expect. When I do v.Unmarshal(d) I’d expect v to be completely “overwritten”, but that isn’t the case:

package main

import (
	"fmt"

	"github.com/syncthing/syncthing/lib/protocol"
)

var myID protocol.DeviceID

func init() {
	myID, _ = protocol.DeviceIDFromString("ZNWFSWE-RWRV2BD-45BLMCV-LTDE2UR-4LJDW6J-R5BPWEB-TXD27XJ-IZF5RA4")
}

func main() {
	v := protocol.Vector{}
	v = v.Update(myID.Short())
	fmt.Println(v)

	d, err := v.Marshal()
	if err != nil {
		panic(err)
	}

	err = v.Unmarshal(d)
	if err != nil {
		panic(err)
	}
	fmt.Println(v)

	nv := protocol.Vector{}
	err = nv.Unmarshal(d)
	if err != nil {
		panic(err)
	}
	fmt.Println(nv)
}

Results in

{[{ZNWFSWE 1}]}
{[{ZNWFSWE 1} {ZNWFSWE 1}]}
{[{ZNWFSWE 1}]}

When I would have expected always just one version item.

Where is the mistake in my expectation?

Unmarshal() just loads the wire format data, leaving existing attributes that are not overwritten as they were, and also appends to slice attributes. You want Reset() before unmarshal.

Documentation for this isn’t obvious, it’s in the code generation plugin that makes the unmarshal methods, at the far bottom; https://godoc.org/github.com/gogo/protobuf/plugin/unmarshal

Remember when using this code to call proto.Unmarshal. This will call m.Reset and invoke the generated Unmarshal method for you. If you call m.Unmarshal without m.Reset you could be merging protocol buffers.

It’s trying to say you should use

proto.Unmarshal(bs, &v)

(similar to json/xml unmarshal) to achieve

v.Reset()
v.Unmarshal(bs)

but :man_shrugging: I don’t think we do that anywhere really, preferring to cut out the middle man and call the unmarshal method on the type directly instead.

Thanks a lot, that was it!
Maybe I should have, but I definitely didn’t expect this - and it still seems somewhat counterintuitive and misleading to me.

I’ll have a look, but I think we always use Umarshal on a newly created, empty variable - so not a problem in existing code. However I can’t do that with the disk overflow stuff, as I am dealing with interfaces there and thus can’t declare variables - that’s why it’s relevant.

Right, so the Message interface might be the right one so you can do Reset()+Unmarshal(). We do that in the with... loops in the db as well, I think.

I checked and while we don’t use Reset we only ever unmarshal newly created variables, so existing code is fine regarding this (unsurprisingly, I’d have expected havoc otherwise).

1 Like