Keeping versions at only one endpoint

greggman · June 12, 2019, 9:29am

I’m trying to keep 3 machines in sync using a 4th machine as the base. Think of it as clientA, clientB, clientC, and backup. My plan is to configure clientA, B, C to each sync with the backup. Only the backup needs to keep versions.

If I understand correctly, to make that work the backup machine would need to initiate contact with the clients. It would see which files are newer, copy them to itself, and keep versions.

Am I correct?

I think I’d prefer not to keep the 3 clients open to contact all the time. I’m guessing that to do that I’d have to set up something on each client that started its Syncthing service, then pinged the backup machine and said “hey backup machine! sync with me now, I’m open”, and then somehow knew when the backup machine was done.

Am I on the right track? Am I overthinking things, and is there a simpler solution?

imsodin · June 12, 2019, 9:46am

Just run it all of the time. Reason: why not? And it’s not trivial to automate detecting when the backup client is in sync.

Also if you are not concerned about syncing between clients A, B and C, using a backup software directly or a one-time-sync solution (e.g. rsync) would probably be simpler. Regardless, please use a dedicated backup program on your backup client to keep backups, not just Syncthing’s versioning.

greggman · June 12, 2019, 10:00am

I’m already backing up (thanks for the advice).

I’m interested in keeping A, B, and C in sync, but I don’t have room for versions on A, B, C, so I need backup to keep the versions. I was hoping to sync each of A, B, and C to backup. That will indirectly keep A, B, and C in sync, and with versions enabled on backup I’ll also have versions.

“Just run it all of the time. Reason: why not? And it’s not trivial to automate detecting when the backup client is in sync.”

Why not: because it’s bad security to run with ports listening all the time, especially if they are not in use 99.99% of the time. It also takes memory and CPU to keep a service running that I’m not actually using except once a day at 4am.

imsodin · June 12, 2019, 11:09am

I probably still don’t exactly get what you want to do: What if B changes something at 9am, wouldn’t you want that to sync to A and C before 4am the next day?

I am definitely not an authority on security, but I wouldn’t trust a service that has access to all my data to be online for, let’s say, 10 min/day if I don’t trust it to be online 24/7. And just for reference: idle Syncthing uses next to no CPU and some memory, depending on your setup (~140MB with ~10 folders, 3 active devices and ~500GiB total data here).

AudriusButkevicius · June 12, 2019, 11:38am

This is silly, just run rsync on a cron. Why do you need syncthing for this?

imsodin · June 12, 2019, 12:44pm

Keeping 4 devices in sync with cron and rsync - rather you than me…

greggman · June 12, 2019, 1:29pm

Syncing multiple machines (A, B, C) to some other machine (D), with versions kept only on D, is a pretty common feature of commercial sync services like Google Drive, Dropbox, or OneDrive, so I didn’t think I was asking how to do something out of the ordinary. Each machine only talks to D, and indirectly that keeps all the machines in sync, with versions stored on D.

So, good to know Syncthing doesn’t do this. Thanks for the clarification.

acolomb · June 12, 2019, 1:50pm

Actually it does, if I understand your needs correctly. Regardless of whether the backup machine is online or whatever, Syncthing will only keep versions of files on those devices where it is configured to.

So enable Versioning on the backup device. Disable it for the same folders on devices A, B, C. Done.

Every time the backup device pulls an updated file from any of the other three, it will keep a backup of its previous local state as configured in the Versioning settings.
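
As a rough illustration of that setup, the sketch below edits a folder entry shaped like the JSON form of Syncthing's configuration. It assumes the folder object carries a `versioning` field with `type` and `params`, as in recent Syncthing releases. The folder ID `shared-docs`, the path, and the choice of simple versioning with five kept copies are made up for the example. It only builds the dictionaries; applying them through the GUI or the config file is left out.

```python
import copy


def with_versioning(folder, keep_versions=None):
    """Return a copy of a folder config with versioning enabled or disabled.

    keep_versions=None disables versioning (empty type); an integer enables
    "simple" versioning that keeps that many old copies of each file.
    """
    updated = copy.deepcopy(folder)
    versioning = updated.setdefault("versioning", {})
    if keep_versions is None:
        versioning["type"] = ""
        versioning["params"] = {}
    else:
        versioning["type"] = "simple"
        # Versioning parameters are stored as strings in the config.
        versioning["params"] = {"keep": str(keep_versions)}
    return updated


# Hypothetical folder shared by all four devices.
folder = {"id": "shared-docs", "path": "~/Sync", "versioning": {"type": "", "params": {}}}

backup_folder = with_versioning(folder, keep_versions=5)  # device D keeps versions
client_folder = with_versioning(folder)                   # A, B, C keep none
```

The same folder ID is used on every device; only the `versioning` part differs between the backup device and the clients.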

acolomb · June 12, 2019, 1:55pm

Your only real problem could be if your backup device is not online 24/7, then it will be rather hard to keep A, B, C in sync. Remember that your references (Dropbox etc.) do have always-online central servers. What they cannot do is sync between A, B, C when D (backup) is not online.

Syncthing can, so it’s actually more powerful, and you should consider linking the devices directly if backup should really be offline most of the time. But then nothing will be there to record new file versions while backup is offline. So keeping backup online and connected to the others as much as possible is really the better solution.

imsodin · June 12, 2019, 2:22pm

As already stated by others: it very much can do that and is suited for doing that. The only thing that makes your described use-case special is that you only want to run Syncthing at a specified time. That means the devices you want to keep in sync will be out of sync for up to 24h. There’s no reason not to connect these devices to each other as well. If you don’t want to run the server at all times, you can start it whenever you need; you just need to be sure that at least one of the other devices is reachable at the same time.

greggman · June 12, 2019, 2:40pm

If I understand correctly, the issue is that the clients need to be open 24/7. With the other services I mentioned, only D needs to be open for connections. The clients A, B, C are never open. They contact D. Whereas with Syncthing, D needs to contact A, B, and C to do what I want.

acolomb · June 12, 2019, 3:24pm

No, your devices A, B, C do not need to listen permanently on ports open to the outside Internet. They could be behind a firewall, and Syncthing has several mechanisms built in to deal with e.g. NAT or other situations where no inbound connection is possible. As long as D has its Syncthing port open, synchronization can happen when another device comes online. If D is also behind a NAT, the connection could still work through the relay network, probably with reduced performance. Device D would need to have global discovery enabled, though.

AudriusButkevicius · June 12, 2019, 3:37pm

Sorry, why can’t you run an rsync from each of these machines into the backup machine?

If A, B and C need to be in sync always, sure, use syncthing, but if you need A, B and C backing up to D at some magical point in time, and don’t want open ports, use rsync.
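
For comparison, a minimal sketch of the rsync route: a helper that builds the command each client could run from cron to push its files to D, moving replaced or deleted files into a dated directory on D. The host name `backup`, all paths, and the dated-directory layout are assumptions for illustration. Note this only backs each client up to D; it does not keep A, B and C in sync with each other.

```python
import shlex


def rsync_backup_command(source_dir, dest, versions_dir, stamp):
    """Build an rsync argv that mirrors source_dir to dest.

    Files that would be overwritten or deleted on the destination are moved
    into versions_dir/stamp on the receiving side instead of being lost.
    """
    return [
        "rsync",
        "--archive",   # recurse and preserve permissions, times, symlinks
        "--delete",    # remove files on dest that no longer exist locally
        "--backup",    # keep a copy of anything replaced or deleted...
        f"--backup-dir={versions_dir.rstrip('/')}/{stamp}",  # ...in this directory
        source_dir.rstrip("/") + "/",  # trailing slash: copy the contents
        dest,
    ]


# Hypothetical values for clientA pushing to the backup machine.
cmd = rsync_backup_command(
    "/home/me/Sync",
    "backup:/srv/backup/current/clientA/",
    "/srv/backup/versions/clientA",
    "2019-06-12",
)
command_line = shlex.join(cmd)  # printable form, e.g. for a cron entry
# To actually run it: subprocess.run(cmd, check=True)
```

Because the clients initiate the transfer, only D needs to accept connections (over SSH in this sketch).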

calmh · June 12, 2019, 5:10pm

You can configure syncthing on the clients A/B/C to have no listening port at all and just connect to the server D. It is not necessary to be able to open connections in both directions.
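
One way this could look, sketched against the JSON form of Syncthing's options and device entries. The key names (`listenAddresses`, `natEnabled`, `localAnnounceEnabled`, `globalAnnounceEnabled`, `addresses`) are taken from recent Syncthing releases. Binding the listener to loopback as a stand-in for "no listening port" is an assumption of this sketch, not something stated in the thread, and the device ID and the address of D are placeholders.

```python
def outbound_only_options(options):
    """Return a copy of the options with inbound connectivity switched off."""
    updated = dict(options)
    # Assumption: a loopback-only listener is unreachable from other machines.
    updated["listenAddresses"] = ["tcp://127.0.0.1:22000"]
    updated["natEnabled"] = False             # no UPnP/NAT-PMP port mappings
    updated["localAnnounceEnabled"] = False   # no announcements on the LAN
    updated["globalAnnounceEnabled"] = False  # no global discovery
    return updated


def pin_device_address(device, address):
    """Return a copy of a device entry that is dialed at one fixed address."""
    updated = dict(device)
    # With discovery off, the client must be told where D lives.
    updated["addresses"] = [address]
    return updated


# Hypothetical starting values on a client (A, B or C).
client_options = outbound_only_options({"listenAddresses": ["default"], "natEnabled": True})
device_d = pin_device_address(
    {"deviceID": "DEVICE-ID-OF-D", "addresses": ["dynamic"]},
    "tcp://backup.example.net:22000",
)
```

With settings along these lines the client only dials out to D, so D is the one machine that has to accept incoming connections.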

ellnic · June 18, 2019, 11:47am

Also, have I misunderstood or does there appear to be some confusion over open ports here? The data port is not the same as the GUI port. You could leave the data ports open all the time. Without an attacker having your node ID AND you clicking accept to the request, they aren’t going to be able to access the cluster. Leave 22000 open. No harm doing that.

greggman · June 20, 2019, 2:56am

“Without an attacker having your node ID AND you clicking accept to the request, they aren’t going to be able to access the cluster. Leave 22000 open. No harm doing that.”

Said every piece of network software ever, until the hacks are found.

Open ports are always risky.

Even if you think your code is bug-free, any library you’re using could also have a bug. Compilers change, libraries change, runtime environments change, and all of them could introduce new vulnerabilities.

I’d prefer if my laptops didn’t have any open ports. My home PC behind the router can be open only to things on my home network.

AudriusButkevicius · June 20, 2019, 6:11am

Sure, but network software needs open ports to work. If you don’t want ports, don’t run software.

ellnic · June 20, 2019, 6:31am

You are going to make life very hard for yourself then.

LE0N · June 23, 2019, 10:45pm

That is interesting - is it possible to adjust the interval at which the client (which does not have a listening port) tries to connect to the peer?

AudriusButkevicius · June 23, 2019, 11:20pm

It’s somewhere in advanced config.
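
The option in question is probably `reconnectionIntervalS` (seconds between attempts to connect to devices that are currently disconnected); treat that name as an assumption to check against the documentation for your version. In the GUI it would be changed under the advanced configuration. The sketch below instead builds, without sending, a request to the `/rest/config/options` REST endpoint, which exists in Syncthing releases newer than this thread. The API key is a placeholder and the base URL is the default GUI address.

```python
import json
import urllib.request


def reconnect_interval_request(seconds, api_key, base_url="http://127.0.0.1:8384"):
    """Build (but do not send) a request that changes the reconnection interval."""
    if seconds < 1:
        raise ValueError("interval must be at least one second")
    body = json.dumps({"reconnectionIntervalS": int(seconds)}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/rest/config/options",
        data=body,
        method="PATCH",  # change only the listed option, keep the rest
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
    )


# Hypothetical: retry every five minutes instead of the default.
request = reconnect_interval_request(300, api_key="hypothetical-api-key")
# To apply it against a running instance: urllib.request.urlopen(request)
```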