Sync via SFTP

Harvie · November 5, 2021, 4:17pm

I know this will probably be considered an offence to syncthing’s strict p2p ideology, but I will tell you my story anyway.

I am using syncthing to sync my phone, laptops and PCs. It works surprisingly well. That automagic NAT traversal absolutely changed the way I think about my devices and the internet. I like this p2p nature, but I think I sometimes need a central node running on a server to cover the downtime of my consumer devices. E.g.: I shut down my PC, take my laptop and go out. They were never online at the same time, so they had no chance to sync. This can be easily solved with a server running syncthing 24/7.

Unfortunately servers and their operation are not free, so it would make sense to share such a server with my friends to split the cost. But syncthing is not really designed for such a setup. Yes, it can be done, but you need a rather complex setup with multiple instances. In my opinion it would be smart to run each instance sealed in some kind of secure container to make sure individual users cannot access each other’s data. But that just adds to the complexity of running such a setup.

I am a lucky person, because thanks to my job I have access to a relatively large number of Linux servers with free storage, which I already use over SSHFS. But I don’t like the idea of running syncthing on them. Less lucky people might also find that commercial cloud services which allow you to run custom software are more expensive and require more setup than simple storage services which provide SFTP access and nothing more. The great thing about SFTP is that you can achieve reasonably secure isolation between SFTP users without having to constantly run a separate service/container for each of them. You configure OpenSSH with an SFTP chroot once and you’re done.

It would really make sense to me to be able to add “dumb SFTP nodes” to my syncthing mesh.

Don’t get me wrong, I am well aware that such an approach might get quite challenging. After all, we can see that from the experience of the authors of https://www.syncany.org/ and https://csync.org/ (both of these aimed to provide a syncthing-like experience with all logic running exclusively on the client side, while the server is dumb arbitrary storage without any sync logic, like SFTP).

On the other hand, I think syncthing might be able to leverage its existing synchronization logic to enable this kind of operation: using additional SFTP storage to make my syncthing mesh even more available, robust and resilient.

AudriusButkevicius · November 5, 2021, 4:57pm

I don’t really follow what you are trying to do.

If you have servers with storage, just run syncthing. If you want to run multiple instances, just run multiple docker containers with some directory mounted into the container, giving you the beloved chroot.

You still have to run syncthing in two places to get it to sync, so you still have to run them on your linux machines to utilise the storage that they have.

Are you suggesting you will run it twice on your machine, with one of the instances pointing at the sshfs-mounted storage and writing data to your linux servers? Sounds mad, but knock yourself out.

Syncthing has a filesystem abstraction, so you could go and add your own filesystem that does things via sshfs, but that doesn’t fix the fact that you’ll still have to run it twice.

To me, it sounds like you are looking for a backup solution, and not a sync solution, for which rsync/scp on a cron to your chrooted sftp server sounds like it would be good enough.

With syncthing your angry coworker deleting the files in your sftp share would cause a deletion of your files on all devices.

Syncthing will never be a “run this application, it copies files to a remote sftp share as they change” application, because that has very little to do with syncthing.

Harvie · November 5, 2021, 5:13pm

I don’t want syncthing to be a backup solution. My way of reasoning is this:

I use syncthing because I like my infrastructure to be independent of cloud providers.
Sometimes I would like to get a little help from the cloud without committing to it, and without having to set up software on that cloud. That way I can easily swap to a different SFTP provider without having to migrate the syncthing setup on the server side.

Catfriend1 · November 5, 2021, 5:24pm

What about Duplicati? It does SFTP pretty well.

Harvie · November 5, 2021, 5:27pm

Duplicati is a backup solution. I am looking for continuous synchronization through an SFTP share, not backup (e.g. multiple devices syncing with each other using SFTP as a central hub). I already use “duplicity” for backups.

AudriusButkevicius · November 5, 2021, 5:40pm

I don’t buy the whole “syncthing is hard to set up, sftp is not” argument. It’s a single statically linked binary, i.e., a single command. Sftp is adding users, folders, config etc. etc. Equally hard or even harder.

Just mount the sftp thing as local storage and off you go.

Or as I said, write some code to implement a custom filesystem interface in syncthing.

As I suggested, you will still have to run two instances.

mdell-seradex · November 10, 2021, 1:42am

It sounds odd to me that he would need to run two instances. By that logic, I would think that he would need to mount an sftp drive on each client and run two instances at each client. This would allow every client to keep the files in sync such that if the only available “node” is the sftp server, then a client still has an available “node” to sync with. This is based on my understanding of what he wants. This sounds like a very complex setup, if it would work at all.

If Syncthing can be customized like you say, then it sounds like it would be better for him to essentially add the node functionality himself, if he has the ability to code this. Hopefully that could be done via some plug-in, so that he would only need to distribute the plug-in and not a whole custom version of Syncthing, but I have no clue if Syncthing supports plug-ins in that manner.

If I understand correctly, Syncthing is not actually passive. It has to have a sender and a receiver (server and client). It also has to have something that keeps track of the files and creates a kind of change log, or tracks the current status. While it would be a poor imitation, and subject to failure due to manual manipulation, this “status log” could perhaps be stored in a flat file on the sftp node, and really only maintained by the nodes that send updates to it. Possibly, the sftp node could be set up as a fallback node used only when the primary server is down. The primary server would update the sftp node as if it were a normal Syncthing client, albeit connecting to it over sftp and reading the flat file as if it were a record of the current status of that node.
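To make the flat-file “status log” idea a bit more concrete, here is a minimal Python sketch. It is not how Syncthing tracks state, and the names `build_manifest` and `plan_upload` are made up for illustration. It assumes the flat file is a JSON map from relative path to SHA-256 hash, and it uses a temporary local directory in place of a real sftp share.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its content."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            manifest[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def plan_upload(local: dict[str, str], remote: dict[str, str]) -> list[str]:
    """Paths that are new or changed locally compared with the remote manifest."""
    return sorted(p for p, digest in local.items() if remote.get(p) != digest)


# Demo: a temporary directory stands in for the synced folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes.txt").write_text("v1")
    (root / "todo.txt").write_text("buy milk")

    # Pretend this JSON round-trip is the flat file stored on the sftp node.
    remote_manifest = json.loads(json.dumps(build_manifest(root)))

    (root / "notes.txt").write_text("v2")   # local edit
    (root / "new.txt").write_text("hello")  # local addition

    changed = plan_upload(build_manifest(root), remote_manifest)
    print(changed)  # ['new.txt', 'notes.txt']
```

A client would only need to download the small manifest to decide what to send, instead of scanning the whole sftp share. The sketch ignores deletions, conflicts and concurrent writers, which is where most of the real difficulty lies.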

It seems to me that this is perhaps the kind of functionality that he is looking for. Perhaps he was thinking that the Syncthing server would not read a flat file to determine the server’s status and instead scan the sftp server, but I think that would add significant extra time to the synchronization process.

AudriusButkevicius · November 10, 2021, 4:24pm

Syncthing doesn’t do

[Storage location A] <-> Syncthing <-> [Storage location B]

It does

[Storage location A] <-> Syncthing <-> Another syncthing <-> [Storage location B]

So for you to sync your local files to some sftp site you have to run two syncthings.

Syncthing doesn’t support talking to itself and pretending it’s not itself.

Harvie · November 10, 2021, 6:18pm

What I originally meant was a bit different. I meant that two (or more) instances of syncthing would exchange data through an SFTP server to which all of them have access. That way the SFTP server would help syncthing get in sync if there is no other instance online.

Eg: Notebook running syncthing, PC running syncthing, cheap SFTP folder in cloud (not running syncthing).

Any of these three would be able to sync with each other. E.g. I do work on the PC, it syncs to SFTP, I then shut down the PC, open the laptop, it syncs from SFTP, I do more work, it syncs back to SFTP. Then I power up the PC again and all three devices get in sync with each other, both through p2p and SFTP.

I fully understand that this is extremely different from what syncthing does right now. But it is not impossible. The Syncany project does this: https://www.syncany.org/ but it is not maintained any more and it does not have the excellent p2p sync that syncthing can do.
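The PC, laptop and SFTP scenario can be sketched as a toy simulation in Python. This is purely illustrative and not something syncthing implements: `Node` and `sync` are hypothetical names, each file carries a simple version counter, and the “SFTP hub” is just passive storage, with all sync logic running on the client that calls `sync`.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A folder replica: path -> (version counter, content)."""
    name: str
    files: dict[str, tuple[int, str]] = field(default_factory=dict)

    def edit(self, path: str, content: str) -> None:
        # Each local edit bumps the file's version by one.
        version = self.files.get(path, (0, ""))[0] + 1
        self.files[path] = (version, content)


def sync(a: Node, b: Node) -> None:
    """Two-way exchange: for every path, the higher version wins on both sides.

    Naive on purpose: no deletions, and equal versions with different
    content are not treated as a conflict.
    """
    for path in set(a.files) | set(b.files):
        newest = max(a.files.get(path, (0, "")), b.files.get(path, (0, "")))
        a.files[path] = newest
        b.files[path] = newest


pc, laptop, hub = Node("pc"), Node("laptop"), Node("sftp-hub")

pc.edit("report.txt", "draft from PC")
sync(pc, hub)        # PC pushes its work, then is shut down
sync(laptop, hub)    # laptop pulls while the PC is offline
laptop.edit("report.txt", "draft from PC + laptop edits")
sync(laptop, hub)    # laptop pushes back
sync(pc, hub)        # PC comes back online and catches up

print(pc.files == laptop.files == hub.files)  # True
```

The hard parts the toy skips are exactly the ones a real implementation would have to solve: two clients editing while both are offline, deletions, and clients writing to the hub at the same time.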

AudriusButkevicius · November 10, 2021, 10:11pm

It sounds to me like:

local storage <-> syncthing <-> sftp storage <-> syncthing <-> local storage

Which syncthing doesn’t and has no plans to do, as syncthing only talks to syncthing for the purpose of syncing.

Harvie · November 10, 2021, 10:25pm

I expected this. But I still wanted to share my utopian dreams.

zendnez · November 12, 2021, 3:36pm

Another way to describe what @Harvie is asking for is:

              [---- syncthing-2 ----]
[---- syncthing-1 ----]
storage-A <-> storage-B <-> storage-C

It’s Syncthing syncing A<->B and then another instance syncing from B<->C. And, of course, the implication is that all of this can happen concurrently.

Personally, I don’t get why you don’t just run compute in the cloud. That seems to be more of an aesthetic preference than anything else. Given my experience with Syncthing, there’s a real chance you’d be absolutely delighted by the flexibility and capabilities that running it in the cloud opens up.

vgivanovic · November 12, 2021, 7:05pm

I just got a notification of Amazon’s Lightsail: “Amazon Lightsail - Powerful virtual servers built for reliability & performance”.

"Lightsail gets you started quickly with preconfigured Linux and Windows application stacks and an intuitive management console."

From $3.50/mo to $160/mo.

Disclaimers:

I have not tried Lightsail.
I’m not associated with Amazon in any way.
You may have (many) perfectly valid reasons not to use Amazon AWS or Lightsail.
There are other options.

mdell-seradex · December 1, 2021, 6:53pm

Yes, basically, it is my understanding that we were looking at the “sftp storage” as if it would be a 3rd party syncthing agent which required a specialized module to talk to. It sounds like there would be no method to create a specialized module like this, even if a custom one. I can understand that, but as @Harvie said, one can dream.

I understand his desire because he can get inexpensive (free?) sftp access, but cannot deploy custom software to those systems.

Anyway, it is what it is. I appreciate the feedback you have provided for this.