Delay in requested events

schneemensch · August 20, 2018, 4:13pm

I am working on a script that detects when the master in my setup has finished scanning and then logs completion on the other devices. To do this, I frequently check the master status, and when it is “idle” I request the lastSeenId to start polling FolderCompletion events.

The problem is that my lastSeenId comes out at about 150, but the scanning process actually generates about 77000 events. Because of this, handling the events is really slow, and it takes my script at least 5 minutes to receive the first FolderCompletion event even though the events are generated much earlier.

I tried adding a short waiting period between the “idle” state and the polling of the lastSeenId, but that did not help. The lastSeenId does not correspond to anything special, just an ordinary LocalChangeDetected among many others.

While the delay of 5 minutes on my test setup is manageable, I have delays of over an hour on my production setup.

calmh · August 20, 2018, 4:16pm

Sorry, it’s totally not clear to me what delay you are talking about. The events endpoint either returns immediately (there is an event with an ID greater than what you asked for) or waits for the next event to happen.

schneemensch · August 20, 2018, 4:20pm

The problem is that the reported ID does not correspond to the latest event ID. I checked this by comparing timestamps and by the logic of the ordering: I only start checking the ID after the master has gone idle, but the reported ID is from far before that state (about 77000 IDs earlier).

calmh · August 20, 2018, 4:42pm

What reported ID? Show some examples or code, I think.

schneemensch · August 20, 2018, 4:56pm

I think I found a fix for the problem in my code. It seems like the IDs in the audit.log do not correspond to the IDs I receive with /rest/events?since=0&limit=1

This could be because the LocalChangeDetected events are found in /rest/events/disk. Because of this discrepancy between the audit file and the events I actually see, my last seen ID was too high, and polling only worked once the events surpassed my wrong value.

I will look further into this and report back if I have further questions.

calmh · August 20, 2018, 5:17pm

Yeah, so, the ID is local to the subscription channel, and the audit log has its own. The “global ID” is unique and shared, but that’s not what you ask for in the since=… parameter.

schneemensch · August 20, 2018, 5:27pm

This is really confusing and not mentioned in the documentation as far as I can see.

Now I do at least know what the problem is, even though I do not have a solution to fix it yet.

AudriusButkevicius · August 20, 2018, 5:32pm

You can also request a subscription to a single type of event, though I am not sure that is documented either.

calmh · August 20, 2018, 5:42pm

I think both of these things are at least mentioned at https://docs.syncthing.net/dev/events.html and https://docs.syncthing.net/rest/events-get.html#events-get

And yes, as Audrius says a subscription for the event type that you actually need is probably the best thing.

schneemensch · August 21, 2018, 7:33am

I am already doing that, and that was causing the problem. Before polling, I made an initial request with /rest/events?since=0&limit=1 to get my starting ID. But I did not specify the subscription channel there, so the starting ID was higher than it should be for the specific subscription channel.
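
A minimal sketch of the fix described above: the request that fetches the starting ID carries the same event filter as the later polling requests, so both read the same subscription's IDs. The query parameters (events, since, limit, timeout) and the X-API-Key header are taken from the Syncthing REST documentation as I understand it; the function names, base URL and API key are placeholders, not anything from the original script.

```python
import json
import urllib.parse
import urllib.request


def events_url(base_url, event_types, since=0, limit=None, timeout=None):
    """Build a /rest/events URL that always carries the same event filter."""
    params = {"events": ",".join(event_types), "since": since}
    if limit is not None:
        params["limit"] = limit
    if timeout is not None:
        params["timeout"] = timeout
    return base_url.rstrip("/") + "/rest/events?" + urllib.parse.urlencode(params)


def fetch_events(url, api_key):
    """GET the URL with the API key header and decode the JSON event list."""
    request = urllib.request.Request(url, headers={"X-API-Key": api_key})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def starting_id(base_url, api_key, event_types, timeout=5):
    """Return the newest per-subscription ID for this filter, or 0 if none yet.

    Assumption: limit=1 returns only the most recent event, and the call
    may wait up to `timeout` seconds if the subscription has no events.
    """
    url = events_url(base_url, event_types, since=0, limit=1, timeout=timeout)
    events = fetch_events(url, api_key)
    return events[-1]["id"] if events else 0
```

With this, the script would call starting_id(base, key, ["FolderCompletion"]) once and then poll events_url(base, ["FolderCompletion"], since=last_id) in a loop, instead of mixing an unfiltered request with a filtered one.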

I was additionally confused because the audit.log logs /rest/events/ as well as /rest/events/disk.

I have one more question: If I am polling for two types of events in one http request, are the IDs sequential or are there individual IDs for each event type?

AudriusButkevicius · August 21, 2018, 7:41am

There are two IDs, a global ID and a per-subscription ID. The latter is monotonically increasing for all events within the subscription; the former is monotonically increasing for all events, including some you don’t subscribe to.
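
To illustrate the difference, here is a small sketch that advances the polling cursor from the per-subscription id field and ignores globalID. The field names id and globalID are those of the event objects returned by /rest/events; the sample events and the helper name next_since are made up for the example.

```python
def next_since(events, since):
    """Return the cursor for the next poll, based on the per-subscription id."""
    return max([since] + [event["id"] for event in events])


# Made-up sample: one subscription covering two event types. The id values
# are consecutive within the subscription, while the globalID values have
# gaps because other event types were generated in between.
sample = [
    {"id": 1, "globalID": 150, "type": "FolderSummary"},
    {"id": 2, "globalID": 152, "type": "FolderCompletion"},
    {"id": 3, "globalID": 77010, "type": "FolderCompletion"},
]

cursor = next_since(sample, 0)  # use this as since=... in the next request
```

Passing a globalID such as 77010 as since=... here would skip every event in this subscription until its own id counter caught up, which matches the delay described at the top of the thread.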

schneemensch · August 21, 2018, 7:43am

Ok, thanks. Now I know how it is supposed to work and will try to implement it. I will ask again if it still does not work as I expect.

I do not think the documentation clearly describes that the ID is subscription-based rather than machine- or event-based.

calmh · August 21, 2018, 7:49am

File a clarification.

This distinction wasn’t there prior to the existence of multiple event subscriptions, so it wasn’t written into the original docs.

schneemensch · August 21, 2018, 8:17am

I will try to update it this evening. I cannot do it from work.

schneemensch · August 21, 2018, 2:46pm

I have one further problem with the IDs of events:

The global ID of my master, which is sendonly, is lower than the global IDs of the other devices. It seems like only IDs from the master host itself are counted and no event IDs from the rest of the cluster. Is this part of the sendonly option, and is there any way to turn it off?

calmh · August 21, 2018, 3:15pm

Events (and event IDs) are always local to the process that generates them. There is no sending of events between devices.

schneemensch · August 21, 2018, 3:24pm

OK, thanks. I see it now. It breaks the basic principle of our script, but I will see how I can work with it.

calmh · August 21, 2018, 3:27pm

What’s your script do? As a point of comparison, Arigi subscribes to events from all configured devices and writes that to, for example, an elasticsearch database that can be queried and stuff. But that requires actually getting events from all devices if you want the full picture.

schneemensch · August 21, 2018, 3:34pm

The script reports the completion progress on all devices in the cluster. At first only the master collects the information, but once a remote device is at 100% completion it also fetches completion events in order to reach devices which do not have a direct connection to the master.

It receives the FolderCompletion events from all devices and currently uses the event with the highest globalID as the current status. This has to be changed to a comparison by time, I guess. But that is vulnerable to differences in local time between the machines.
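
A rough sketch of what the comparison by time could look like: keep, per (device, folder) pair, the FolderCompletion event with the latest timestamp, regardless of which process reported it or what its globalID was. It assumes the time field is an RFC 3339 string with a UTC offset and that the event data carries device, folder and completion; the helper names and sample values are invented. It does nothing about clock skew between machines, so that concern still stands.

```python
import re
from datetime import datetime

_FRACTION = re.compile(r"\.(\d+)")


def parse_event_time(value):
    """Parse an RFC 3339 timestamp into a timezone-aware datetime."""
    # Normalise fractional seconds to exactly 6 digits, since older Python
    # versions reject nanosecond precision in fromisoformat().
    value = _FRACTION.sub(lambda m: "." + m.group(1)[:6].ljust(6, "0"), value, count=1)
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def latest_completion(events):
    """Map (device, folder) to the completion of the newest event by time."""
    latest = {}
    for event in events:
        if event["type"] != "FolderCompletion":
            continue
        data = event["data"]
        key = (data["device"], data["folder"])
        when = parse_event_time(event["time"])
        if key not in latest or when > latest[key][0]:
            latest[key] = (when, data["completion"])
    return {key: completion for key, (when, completion) in latest.items()}


# Made-up events for the same device and folder, reported by two processes.
# The first has the higher globalID, but the second is later in real time.
events = [
    {"type": "FolderCompletion", "globalID": 77000,
     "time": "2018-08-21T15:34:00.000000001+02:00",
     "data": {"device": "DEVICE-A", "folder": "default", "completion": 80}},
    {"type": "FolderCompletion", "globalID": 150,
     "time": "2018-08-21T13:40:00.5Z",
     "data": {"device": "DEVICE-A", "folder": "default", "completion": 100}},
]
status = latest_completion(events)
```

Because the parsed datetimes carry their UTC offsets, events from machines in different time zones compare correctly; only actual clock drift remains a source of error.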

Another option might be to request FolderSummary events on all devices and calculate the completion out of the neededBytes or something, but I do not know how reliable the data is.
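
As a sketch of that calculation, assuming the summary object in a FolderSummary event exposes globalBytes and needBytes (the field names as I understand them from the event documentation; worth checking against real output), the percentage could be derived like this. The function name is hypothetical, and this says nothing about how reliable the underlying numbers are.

```python
def completion_from_summary(summary):
    """Return completion in percent from a FolderSummary-style summary dict."""
    global_bytes = summary.get("globalBytes", 0)
    need_bytes = summary.get("needBytes", 0)
    if global_bytes <= 0:
        # Nothing to sync in this folder, so treat it as complete.
        return 100.0
    return 100.0 * (1 - need_bytes / global_bytes)


# Made-up numbers: 50 of 200 bytes still needed.
percent = completion_from_summary({"globalBytes": 200, "needBytes": 50})
```

Note that a FolderSummary describes the folder as seen by the device that emitted it, so it would have to be requested on every device separately.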

schneemensch · August 21, 2018, 3:36pm

Normally the difference between the devices should be negligible, but once the master changes we reset the trusted devices and run into an issue with the mismatched globalIDs.