[Poll] The sending of crash data

We will be including crash reporting soon. The way it will work is that when a crash (“panic” usually, in Go parlance) is detected, the Syncthing version and backtrace are reported to a server, where they are stored and aggregated. The data that will be sent looks like this:

07:48:24 INFO: syncthing v1.1.4 "Erbium Earthworm" (go1.12.5 darwin-amd64) teamcity@build.sycnthing 2019-05-21 20:36:38 UTC
Panic at 2019-05-22T07:48:25+02:00
panic: interface conversion: *pfilter.FilteredConn is not net.Conn: missing method Read

goroutine 106 [running]:
github.com/syncthing/syncthing/lib/connections.(*quicListener).Serve(0xc000158000)
        github.com/syncthing/syncthing/lib/connections/quic_listen.go:74 +0x41b
github.com/thejerf/suture.(*Supervisor).runService.func1(0xc0001c6690, 0xc000000000, 0x54b4728, 0xc000158000)
        github.com/thejerf/suture@v3.0.2+incompatible/supervisor.go:600 +0x47
created by github.com/thejerf/suture.(*Supervisor).runService
        github.com/thejerf/suture@v3.0.2+incompatible/supervisor.go:588 +0x5b
... more of same gibberish

That is, it does not include any log data, user or other metadata, nor the ID of the sending machine. Excluding that data makes it somewhat less valuable for us, but I think we’ll be OK.

All in all this means sending us less data than global discovery does, which is a feature that is enabled by default. Given that, how do you reason about having this enabled by default or not? Not having it enabled by default means we will lose lots of panic reports, so we will be less likely to fix those bugs.

  • Enabled by default is fine; the very privacy conscious can disable it, as they must with global discovery
  • No, this must be explicitly opt-in, because …


Feel free to explain your thinking below. 🙂

3 Likes

I think it is best to ask the user (as with usage reporting) during the next update.

Why?

I have no objection to its being enabled by default. It’s the sort of thing which can lead to misunderstandings with people who are, shall we say, abundantly cautious, so it and its rationale should probably be clearly documented, though.

I think we will have a one-off notification, like we had with filesystem watching, that has to be dismissed.

I would ask the user after the first crash, just before sending the first report. At that point they might be a bit annoyed, and offering to help diagnose the issue will probably go down well.

It also means you can preview exactly what will be sent.

Even entities like Mozilla and Microsoft will ask before sending crash data, I believe.

4 Likes

Firefox sends everything by default and opens a tab on first start (in the background, so you have to notice it yourself) explaining their privacy focus.

Chrome on the other hand actually asks on first start, but abstracts it all away to a single checkbox, which covers the full spectrum from crash reporting to keylogging.

Edge Canary didn’t say anything and just sends it by default. But it’s the Canary build, and the real thing might act differently; we also enable this stuff in our candidate builds.

But! That doesn’t mean we can’t do better of course. I think we already do better in that the stuff we do send (after asking) is much less invasive than what browsers send, both by the nature of the program and by decision. The crash reports are specifically designed to include zero personal data.

I would like to not have to ask about it, because that makes it more likely we’ll actually get the report, and there are situations where it’s not possible to ask the user (headless, or crash on startup). But if we do ask, then we could add in a lot more data on the other hand… At minimum I covet those log entries we have in the on-disk crash report…

Privacy is a right of every person. That others don’t respect it does not mean that it is okay. Please, show a pop-up and let users decide. Thank you very much.

There is literally no privacy concern here that is not completely overshadowed by normal usage of the app.

This actually includes less privacy-sensitive data than the discovery connections that are needed for correct operation. And requiring the user to opt in to discovery would make this a much less useful product. Making the user explicitly OK error reporting would be deceptive.

I would vote enabling by default, but have a clear statement in the documentation. You already have something like that: https://docs.syncthing.net/users/security.html#information-leakage

It might be good to have a section describing how to configure Syncthing to share data between machines on the same local network (or with hardcoded addresses), where you wouldn’t expect anything to leak to the outside world. I assume that is possible, but I didn’t see a definitive list.

3 Likes

Thanks for the thoughts on the documentation; we should do that. I think we will default to on for the reasons described, and in the migration (existing users) do something smarter. We should show the popup Audrius mentions, and default to following, for example, global discovery and anonymous usage reporting. If either of those is enabled, crash reporting will probably also be fine. If global discovery is disabled we can probably assume the user won’t like to be opted in to crash reports for now.
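One way to read that migration rule, sketched in Go. The `settings` type and `defaultCrashReporting` function are hypothetical, not Syncthing’s real configuration code. The sketch assumes that a disabled global discovery wins over an enabled usage reporting, which the post leaves open.

```go
package main

// settings is a hypothetical view of the two existing options mentioned
// above; it is not Syncthing's actual configuration struct.
type settings struct {
	GlobalDiscovery bool
	UsageReporting  bool
}

// defaultCrashReporting picks the initial value of the crash reporting
// option. New installs default to on. For existing users the sketch follows
// global discovery: if the user turned that off, assume they would not want
// to be opted in to crash reports either. (Assumption: usage reporting is
// only a secondary hint and does not override disabled global discovery.)
func defaultCrashReporting(newInstall bool, s settings) bool {
	if newInstall {
		return true
	}
	return s.GlobalDiscovery
}
```

Under this reading, an upgraded client with global discovery on gets crash reporting enabled, and one with global discovery off gets it disabled. In both cases the one-off popup would tell the user what was chosen.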

1 Like

Enabled by default with the user informed upon first run. I don’t see any issue with non-sensitive data and if the user is informed at the start (or upgrade) there’s no ‘hidden agenda’.

Google, Facebook, Amazon and numerous other companies, for example, track user data and present ads. I am not saying you should follow them, but there is no point going overboard on “privacy” either.

The developers are doing a great public service by providing this software for free, so in return it ought to be completely acceptable to get some data to make the product better. (I don’t even think any permission is necessary for the kind of data that was shown by Jakob above; it should be silently collected. Just put a statement to this effect in the “Agreement” during install like other companies do.)

These companies make profit on public data, and that is totally unforgivable.

“Privacy” like any other socially relevant word, needs a lot of analytical discussion to understand how much of it is justifiable and where to draw the line. For example, one can argue that collection of anonymous data for public benefit is the collective right of society (and there are numerous laws that use this in several countries.)

1 Like

Yes, and privacy is looked upon very differently in different countries all around the world. That’s why it should be left to every user to decide. I understand your intentions and I am sure that you will get enough information if you respect the decision of every user.

It’s already somewhat agreed that if global discovery is disabled, we’d disable crash reports. Past that point (of leaking IPs, which we will not collect anyway), I don’t see anything that undermines privacy.

Let’s cut all the “privacy is important in the modern society” stuff, and focus on the proposed implementation at hand.

If you have objections to that, please state how you believe your privacy is undermined (pointing at parts of the log that was provided or scenarios or whatever), so we can understand something we don’t currently understand.

If you just have a general feeling towards privacy, this is not the thread to argue about it.

On the other hand, global discovery is enabled out of the box, so perhaps this should be enabled too, as there is no personally identifiable information in the payload apart from the IP address.

I would default it on, without warning, because of how little information you’re actually collecting. If the amount of information reported changes, please reauthorize the sending of the data (via the must-be-dismissed-to-work dialog). I would put a checkbox in the webui with a nice, short explanation of what’s being reported and an example which can be shown for interested users.
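The “reauthorize when the reported data changes” idea could be modelled by versioning the report format. This is only a sketch of that suggestion; `reportFormatVersion` and the `consent` type are invented for the example and nothing in the thread says Syncthing does it this way.

```go
package main

// reportFormatVersion is a hypothetical counter that would be bumped each
// time the crash report starts including more information.
const reportFormatVersion = 2

// consent records whether reporting is on and which report format the user
// last acknowledged (for example by dismissing the dialog).
type consent struct {
	Enabled         bool
	AcceptedVersion int
}

// needsPrompt reports whether the must-be-dismissed dialog should be shown
// again because the format grew since the user last acknowledged it.
func needsPrompt(c consent) bool {
	return c.Enabled && c.AcceptedVersion < reportFormatVersion
}

// maySend reports whether a crash report may be sent right now.
func maySend(c consent) bool {
	return c.Enabled && c.AcceptedVersion >= reportFormatVersion
}
```

With this shape, a user who acknowledged format 1 keeps the checkbox ticked after an upgrade, but nothing is sent until the dialog for format 2 has been dismissed.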

I also agree that it’s a sane thing to disable reporting for clients who already have global discovery off. If nothing else, so syncthing does not violate the principle of least surprise with those users.

1. PURPOSE OF PRIVACY: Public outcry over privacy has generally been due to:

Govt tracking – to discriminate based on religion, race, etc., to take political advantage from citizen data, etc.

Private company tracking – to bombard people with ads to increase revenue and sales, to gain advantage over competitors, increase prices, etc.

In all cases, there was some damaging effect felt by the citizens so privacy was justified.

Here there is absolutely NO such undesirable effect. Just wanting privacy without a purpose has no meaning.

2. INDIVIDUAL vs COLLECTIVE: On one side is a collective benefit of making the software more robust for all. And to support the developers who are creating a viable competition against profit-hungry corporations that want money for such software.

On the other side is individual “privacy” (I don’t even know what is private in such data) demanded rhetorically by some. Which side should we take?

And IMO, it is not the users who should have a greater say in decision-making but the developers, particularly those who are putting in substantial time for all of us (although it’s awfully nice of them to ask us).

Thank you for this great software. (Sorry for the long note; I won’t say anything further on this.)

Very simply phrased: I want to know about any data going out anywhere for any purpose, and I want to allow it, and unless I do I would like no data getting out anywhere for any purpose.

That’s the theory of control over my activities.

And, indeed, I would enable this reporting after reading why it is sent, where it is sent and what it contains. (For example, including specific path elements could cause a privacy problem, as could any data about files, including hashes, and lots of other possible problems I’ve seen with similar features.)

It’s not that I don’t trust every developer on Planet Earth, but… I don’t. This is rather a policy than a personal mistrust.

I see no problem to connect it to global discovery as long as it is obvious that it means “global discovery and crash reports with the following content”.

I’ve tried this in 1.2.0 rc.1 and I like how it is solved now. One client updated and had crash reporting enabled, and the other Syncthing told me “comfortingly” that it had disabled crash reporting because of how my other settings looked. So thumbs up :-).

2 Likes