A Syncthing indicator

sil · April 21, 2014, 9:26am

I’ve now got syncthing all rigged up how I want it (and thank you @calmh for all the help!). As promised, I’d like to put together a simple status menu for Syncthing to live in my Ubuntu panel so I can see whether it’s doing things without having to open up the web GUI in a tab. This should be fine using the API (which I admit isn’t documented, but it’s not hard to see how it works by looking at the GUI’s js code). However, in thinking about this, I’ve hit an issue.

Syncthing’s GUI does stuff by polling. The steady state of syncthing is that everything is synced. This means that almost all the time, when you ask for the current state of syncthing, it’ll be “everything is 100% synced”. If I, for example, drop a small new file into one of my repositories at time 0, I likely won’t notice anything in my indicator because that file will be detected by the syncthing polling interval at some point later (at time 60, say), but my indicator will poll syncthing’s API for interesting changes at time 37 and then at time 97, and at both of those times we’ll be 100% synced (we assume the file is small). So I can’t actually tell whether the file I’ve just added has been synced or not by polling the API; I don’t know whether we’re at 100% synced because it’s been noticed and synced, or because it hasn’t been noticed yet.

The obvious way to solve this would be to keep a record of the last few files that were synced; then the indicator can show that information. As far as I can tell, I can’t get this information out of the API. Is there a way to do that, or is there a better way to tell the difference between “100% synced because we haven’t noticed the new stuff” and “100% synced because we have noticed it and have synced it”? Another alternative might be an API endpoint listing a stream of “events”, or similar…?

calmh · April 21, 2014, 12:08pm

Indeed. So, if you didn’t already find it, the best “documentation” of the current available requests is in https://github.com/calmh/syncthing/blob/master/cmd/syncthing/gui.go#L36-L50. The current GUI calculates the percentage on the repo (indicating how in sync this node is) by using the /rest/model/<somereponame> call and comparing globalBytes to insyncBytes. For data on other nodes the /rest/connections call returns the in sync percentage directly.
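A rough sketch of that repo calculation in Python, for anyone following along. The key names are taken from the description above; their exact casing in the real JSON response is an assumption, and `in_sync_percent` is just an illustrative helper name, not something in syncthing.

```python
def in_sync_percent(model, global_key="globalBytes", in_sync_key="insyncBytes"):
    """Return how much of the repo this node has, as a percentage (0-100).

    `model` is the decoded JSON object from /rest/model/<somereponame>.
    The key names are assumptions based on the forum description.
    """
    total = model.get(global_key, 0)
    have = model.get(in_sync_key, 0)
    if total <= 0:
        # An empty repo has nothing missing, so treat it as fully in sync.
        return 100.0
    # Clamp in case the two counters are read at slightly different moments.
    return min(100.0, 100.0 * have / total)
```

With small, infrequent changes this will return 100.0 on nearly every call, which is exactly the problem raised at the top of the thread.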

As you say, both are likely to be at 100% at any given time if only small changes happen occasionally. I could see it being useful to have the event log you mention and I think that wouldn’t be very tricky to implement. Off hand, I think we could log events for:

Detected change in local file x
Started / completed syncing file y from cluster
Connected / disconnected to node n
Node n requested some blocks from file x
In sync percentage for repo r changed to p
In sync percentage for node n changed to p

Obviously, polling (as opposed to streaming) sucks in the general case, increasing load unnecessarily etc. In this specific case we’re only likely to have at most a handful of users connected to the admin interface at any given time so simplicity wins over efficiency…

Still, it would be cool to expose this live over a streaming websocket, for example. I might look into that.

sil · April 21, 2014, 12:49pm

All that seems good to me, and that would give a third-party app enough knowledge to be able to know whether something has changed since the last time it looked. It would be more convenient to do that if the log were in timestamp order and each event had a timestamp, because then when I hit the log I can remember the timestamp of the most recent item and then just ignore everything before that. (Of course, this could be an increasing number rather than a timestamp if that’s more convenient.) Agreed completely that polling sucks for efficiency, but we’re not serving a million users, so keeping it simple is much better; the issue, as noted, is that when things start and then complete entirely between two polling intervals, you’ve got no idea that they happened at all.

Streaming websocket would be nice, indeed, although reading a websocket if you’re not actually a web browser can be a bit fiddly; this is where “the API is basically for the web GUI” and “the API is for any client, where the web GUI is just one example” come slightly into conflict. A long-poll JSON-style thing or a server-sent-event thing might be better in this case?

calmh · April 22, 2014, 7:56am

Yep. I have a prototype of this cooking. The interface is doing a GET of /rest/events/<lastSeenId> where <lastSeenId> is the ID of the last event you’ve already seen, or zero. You get back an array of events that have happened since that ID (up to a point; currently it keeps the last 100 events so if you’ve missed more than that there will be a gap). If there are no new events it blocks for up to 300 seconds waiting for an event, then returns a timeout event with ID -1 if nothing interesting happened. Every event has id, type, timestamp and params fields. The results of a typical polling loop might look something like this:

jb@jborg-mbp:~ $ curl -s -u st:test http://localhost:8080/rest/events/0 | jsonpp
[
  {
    "id": 1,
    "timestamp": "2014-04-22T09:03:11.274499963+02:00",
    "type": "NODE_CONNECTED",
    "params": {
      "node": "LGFPDIT7SKNNJVJZA4FC7QNCRKCE753K72BW5QD2FOZ7FRFEP57Q"
    }
  },
  {
    "id": 2,
    "timestamp": "2014-04-22T09:03:11.335947892+02:00",
    "type": "NODE_CONNECTED",
    "params": {
      "node": "ME6QVQK2B4BFYWIANFJCSN76Q2GMH3NZISD6LAYME6CSDSCPE47Q"
    }
  },
  {
    "id": 3,
    "timestamp": "2014-04-22T09:03:13.128464+02:00",
    "type": "NODE_INDEX",
    "params": {
      "hasBytes": 7229008723,
      "node": "ME6QVQK2B4BFYWIANFJCSN76Q2GMH3NZISD6LAYME6CSDSCPE47Q",
      "repo": "lightroom",
      "totalBytes": 7229008723
    }
  }
]
jb@jborg-mbp:~ $ curl -s -u st:test http://localhost:8080/rest/events/3 | jsonpp
## blocks for a while ... ##
[
  {
    "id": 4,
    "timestamp": "2014-04-22T09:04:33.220233006+02:00",
    "type": "NODE_INDEX",
    "params": {
      "hasBytes": 7229008723,
      "node": "LGFPDIT7SKNNJVJZA4FC7QNCRKCE753K72BW5QD2FOZ7FRFEP57Q",
      "repo": "lightroom",
      "totalBytes": 7229008723
    }
  }
]
jb@jborg-mbp:~ $ curl -s -u st:test http://localhost:8080/rest/events/4 | jsonpp
## blocks for five minutes ... ##
[
  {
    "id": -1,
    "timestamp": "2014-04-22T09:09:37.526111988+02:00",
    "type": "TIMEOUT",
    "params": null
  }
]

That seem OK? I’ll write something up on the event types and their expected params as I get along.
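For a non-browser client, the loop shown with curl above might be sketched in Python roughly as follows. This is only an illustration of the prototype as described in this post: `fetch_events`, `process_batch` and `watch` are made-up names, the base URL is copied from the curl example, HTTP authentication is left out, and the gap check assumes event IDs increase by one.

```python
import json
import urllib.request


def fetch_events(last_seen_id, base_url="http://localhost:8080"):
    """GET /rest/events/<lastSeenId>; may block for minutes (long poll)."""
    url = "%s/rest/events/%d" % (base_url, last_seen_id)
    # Client timeout is a bit longer than the 300 second server-side wait.
    with urllib.request.urlopen(url, timeout=330) as resp:
        return json.loads(resp.read().decode("utf-8"))


def process_batch(events, last_seen_id):
    """Return (new_events, new_last_seen_id, gap_detected) for one response."""
    # The TIMEOUT event has id -1, so this filter drops it as well.
    new_events = sorted(
        (e for e in events if e["id"] > last_seen_id), key=lambda e: e["id"]
    )
    if not new_events:
        return [], last_seen_id, False
    # Only the last 100 events are kept, so a slow client can miss some.
    gap = last_seen_id > 0 and new_events[0]["id"] > last_seen_id + 1
    return new_events, new_events[-1]["id"], gap


def watch(handle, fetch=fetch_events, on_gap=None, max_polls=None):
    """Poll forever (or max_polls times), passing each new event to handle()."""
    last_seen = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        new_events, last_seen, gap = process_batch(fetch(last_seen), last_seen)
        if gap and on_gap is not None:
            on_gap()  # e.g. refresh the full state from the other REST calls
        for event in new_events:
            handle(event)
        polls += 1
    return last_seen
```

Calling `watch(print)` would print every event as it arrives. The `fetch` parameter exists so the loop can be pointed at a fake server while the real interface is still in progress.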

calmh · April 22, 2014, 7:35pm

https://github.com/calmh/syncthing/wiki/Event-Interface (work in progress)

sil · April 23, 2014, 1:02am

PULL_START: Generated when syncthing begins synchronizing a file to a newer version.

If I’m hitting the /rest/events/ endpoint on machine THIS, which shares a repo with machine OTHER, and I edit a file on OTHER, syncthing will then pull those changes over to THIS, and presumably fire a PULL_START event. If I edit a file on THIS and syncthing starts sending it over to OTHER, will that also fire PULL_START? Or is there a “PUSH_START” in planning? I may be looking at that page when it’s only half finished.

calmh · April 23, 2014, 6:25am

Yeah, it’s half finished. The “pull_*” stuff is all about getting files from OTHER to THIS.

For the other direction, I’ll expose “the local repo changed: the file f is new/changed/deleted” and maybe “the remote node OTHER requested blocks for file f”. The “maybe” on the latter is just because that is slightly more involved in that we need to rate limit it (i.e. keep state so we don’t generate an event for every block requested).
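The rate limiting mentioned here could look something like the sketch below. It is written in Python purely for illustration (it is not syncthing's code); the class name, the five second interval and the per-(node, file) key are all assumptions.

```python
import time


class RequestEventLimiter:
    """Allow at most one "blocks requested" event per (node, file) per interval."""

    def __init__(self, min_interval=5.0, clock=time.monotonic):
        self.min_interval = min_interval  # seconds; an arbitrary example value
        self.clock = clock                # injectable so it can be tested
        self._last_emitted = {}           # (node, filename) -> time of last event

    def should_emit(self, node, filename):
        """Return True if an event should be generated for this block request."""
        now = self.clock()
        key = (node, filename)
        last = self._last_emitted.get(key)
        if last is not None and now - last < self.min_interval:
            return False  # an event for this node and file was emitted recently
        self._last_emitted[key] = now
        return True
```

The state kept is one timestamp per node and file, so a large file pulled block by block yields one event per interval rather than one per block.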

sil · April 23, 2014, 7:41am

Makes sense. One of the big end goals here, the questions that I think a user will want to ask and have answered, is: I have just edited file f on my computer. Has that file f been successfully synced to another node?

I don’t know whether it’s actually possible to answer that question from syncthing’s PoV; syncthing on THIS may only know that OTHER has requested blocks for file f, and not know whether OTHER now has an up-to-date copy of file f. However, I think that that’s what people actually care about; if it’s not possible to answer that question, maybe a brief protocol augmentation might make it so?

calmh · April 24, 2014, 7:06am

Well, it could be answered, as in the local syncthing node knows. What actually happens protocol wise is the following:

1. nodeA discovers a change to the file F, rescans it and gets a new set of block hashes + metadata.
2. nodeA sends an “index update” to its peer, nodeB (and any other peers; we’ll do the exercise for one only) containing the new file data and a bumped version number. At this point nodeA knows that nodeB doesn’t have the updated file, because it hasn’t seen an index update for that file with the new version number and hashes etc. In the current GUI, nodeB’s “in sync percentage” will be dropped from 100% to something lower since it does not have all the bytes that make up the latest version of all files.
3. nodeB receives the index update and notices the local file is out of date. It is scheduled for “pulling”.
4. nodeB figures out which blocks have changed compared to the old file, requests those from its peers and puzzles together the new version of the file.
5. Assuming no problems happen during #4, the new version of the file is OK and nodeB sends an index update to its peers announcing that it has the file.
6. nodeA sees the index update from above and knows that nodeB has successfully updated the file. nodeB is shown at 100% again.
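The bookkeeping on nodeA in the steps above can be modelled with a toy Python class. This is a deliberately simplified sketch: versions are plain integers, block hashes are ignored, nodeA's own version is assumed to be the newest, and `PeerView` and its methods are invented names.

```python
class PeerView:
    """What one node knows: its own file versions plus what each peer announced."""

    def __init__(self):
        self.local = {}  # file name -> version this node has
        self.peers = {}  # peer name -> {file name -> version the peer announced}

    def local_change(self, name):
        """Steps 1-2: a local change bumps the version that gets announced."""
        self.local[name] = self.local.get(name, 0) + 1
        return self.local[name]

    def index_update(self, peer, name, version):
        """Steps 5-6: record the version a peer says it now has."""
        self.peers.setdefault(peer, {})[name] = version

    def peer_has_latest(self, peer, name):
        """True once the peer has announced our current version of the file."""
        announced = self.peers.get(peer, {}).get(name, 0)
        return announced >= self.local.get(name, 0)
```

Between `local_change("F")` and the matching `index_update("nodeB", "F", ...)`, `peer_has_latest("nodeB", "F")` is False, which corresponds to nodeB being shown below 100%.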

From there it’s “just” a question of what events to expose. Technically we could generate events for each file that is changed for each connected peer.

sil · April 24, 2014, 10:04am

Aha. Does nodeB send an index update per file updated, or does it send one at the end of the whole process? An event for “file f up-to-date on node b” would exactly cover the use case from above, I think.

calmh · April 24, 2014, 10:09am

The index update is sent every five seconds, if there are changes to send. The update then includes all files changed since last update. So there is a delay but not very long.
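As an illustration of that batching (again a Python sketch with invented names, not the actual implementation), changes can be collected in a set and flushed by a timer:

```python
class IndexUpdateBatcher:
    """Collect changed file names and send them as one update per tick."""

    def __init__(self, send):
        self.send = send       # callable taking a list of file names
        self._pending = set()  # files changed since the last update was sent

    def file_changed(self, name):
        """Record a change; repeated changes to one file collapse into one entry."""
        self._pending.add(name)

    def tick(self):
        """Call this every five seconds. Returns True if an update was sent."""
        if not self._pending:
            return False  # nothing changed, so nothing is sent
        self.send(sorted(self._pending))
        self._pending.clear()
        return True
```

Each call to `send` then carries every file changed since the previous update, so the delay before a peer hears about a change is at most one interval.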

sil · April 24, 2014, 3:36pm

That’d be perfect, then: have a node fire an event for “files updated on node B: f1, f2, f3, f4, f5”?

sil · April 29, 2014, 7:40pm

Just to keep people in the loop, https://github.com/stuartlangridge/syncthing-ubuntu-indicator. Since the event interface isn’t actually in syncthing yet, I threw in a tiny test server which pretends to be the syncthing events API, and you can generate fake pull events and so on. So, a work in progress, and it’ll need extra work once it gets hooked up to the real syncthing, but progress is made.

calmh · April 30, 2014, 11:20am

Cool! I haven’t had time to finish the event interface, turned out to be a bunch of code in between where the events need to be fired and where they are to be reported… But I’ll get there.

icaruseffect · August 5, 2014, 10:00am

Hi, I have taken your code and adapted it to the new event interface (see https://github.com/icaruseffect/syncthing-ubuntu-indicator). A pull request on GitHub is out.

So I have a question: are you still working on the project? I’m asking so that no work gets done twice. Thank you for the good code base anyway. It was much better than what I’d been coding before I found yours.

icaruseffect · August 5, 2014, 10:24am

@calmh Until now I’ve never received an “ItemStarted” event from the event interface. Could it be that this one is missing?

calmh · August 5, 2014, 10:57am

ItemStarted should be emitted when a file begins syncing, looks like it should work at least.

icaruseffect · August 5, 2014, 11:00am

Hmm, then I’ve missed something. Thank you for checking.

calmh · August 5, 2014, 11:03am

Ah, no. The event is emitted, but the /rest/events endpoint doesn’t actually expose ItemStarted and ItemCompleted. The reason was that the number of events can be overwhelming and they’re not used by the current GUI. I was intending to add a separate endpoint for them when it was actually needed.

Edit: I exposed them on the same event interface now, it’s not that expensive (I think). Will be in next release.

icaruseffect · August 5, 2014, 1:19pm

Perfect. So in the next release ItemStarted & ItemCompleted will be exposed?