Is syncthing + syncthing-inotify enterprise ready?

is syncthing + syncthing-inotify ready for enterprise-class linux clusters? i’m talking about thousands of files on high availability clusters across multiple datacenters… starting around 20 servers and potentially scaling up to a few hundred (depending how the project goes).

i’m also looking at software like lsyncd over gluster/hadoop/drbd because 1) the more nodes you add the slower they get and 2) many of the other solutions only support two nodes before it gets messy.

it seems like the main focus of the project is for home users syncing their laptop with their desktop(s) or sharing files with a few colleagues or friends. note this is not what i’m looking to do… it would be for in-house file syncing between clusters and would not function as a cloud service. a centralized management console will probably need to be written as well (i can probably do that). i’m also concerned about stability and qa/testing of new builds.

i was wondering if syncthing-inotify will eventually be a part of syncthing itself so you can choose polling or inotify.

sorry for all the questions but i’m interested in this project for a new environment i’m setting up that could grow much larger.

thanks in advance!

https://data.syncthing.net/ speaks for itself. Though there definitely are some stability issues which contribute to the churn we are currently seeing.

Plus, RAM/disk usage per device grows linearly with the number of connected devices; with 300 machines, each one stores 300 copies of the index.


I can’t speak for inotify unfortunately, but otherwise I’d suggest yes (but of course you need to test for your use cases). When talking hundreds of servers you’ll want to apply some design - don’t go for a full mesh of connections, but some sort of tiered setup.

“Thousands” of files, if that is in opposition to “hundreds of thousands” or “millions”, is a good thing, as that’s the scale we’re used to operating on. If they’re each a terabyte, you may want to look into other options though.

Yeah. You should be able to roll out configs and certificates with Ansible, Puppet, or whatever your preferred tool is.
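For example, a minimal Ansible task list for that rollout might look something like the sketch below. The `deploy` user, the `files/<hostname>/` layout, and the service name are illustrative assumptions, not anything Syncthing prescribes:

```yaml
# Hypothetical Ansible tasks: push a pre-generated Syncthing identity
# and config to each node, then make sure the service is up. All
# paths, the "deploy" user, and the handler name are assumptions.
- name: Install Syncthing identity and config
  copy:
    src: "files/{{ inventory_hostname }}/{{ item }}"
    dest: "/home/deploy/.config/syncthing/{{ item }}"
    owner: deploy
    mode: "0600"
  loop:
    - cert.pem
    - key.pem
    - config.xml
  notify: restart syncthing

- name: Ensure Syncthing is running
  service:
    name: syncthing@deploy
    state: started
    enabled: true
```

The certs and config.xml would be pre-generated on the control machine, so every node comes up with a known device ID and you never have to touch the GUI.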

Consider it all beta until it hits 1.0. We test what we can, obviously.

Well… We just started graphing user churn, and I’m not certain how the negative axis graph is going to look in a few weeks. We have quite a lot of users that are not visible for quite a while but then do return to the network. Also, of course, a larger user base means a larger number of people leaving per unit of time.

don't go for a full mesh of connections, but some sort of tiered setup.

Could you please explain why? I was under the impression the larger the mesh the better. Does that fall down at a certain point then?

There’s a certain amount of overhead since each device needs to exchange index information with and keep track of the contents of each connected device. Hence there’s bound to be a sweet spot somewhere, and intuitively I suspect it’s below hundreds. If nothing else, configs with that number of devices aren’t well tested so expect to be a pioneer in finding possible scalability issues there. :slight_smile:
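To put rough numbers on that overhead (back-of-envelope only; the real per-connection cost depends on index size): in a full mesh of n devices every box connects to, and keeps an index copy for, every other box, while a tiered hub-and-spoke layout keeps the per-leaf count constant. The hub count below is an arbitrary example.

```shell
# Back-of-envelope: each function prints "total-connections per-device-peers".
# Full mesh: every device links to every other device.
full_mesh() { echo "$(( $1 * ($1 - 1) / 2 )) $(( $1 - 1 ))"; }

# Hub-and-spoke: leaves connect only to the hubs; hubs mesh among themselves.
hub_and_spoke() {
  local n=$1 hubs=$2
  echo "$(( hubs * (hubs - 1) / 2 + (n - hubs) * hubs )) $hubs"
}

full_mesh 300        # prints "44850 299": 299 index copies on every box
hub_and_spoke 300 3  # prints "894 3": each leaf tracks only 3 hubs
```

The per-device figure is what matters for RAM/disk on each node; the total connection count is what matters for cluster-wide churn when devices come and go.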

see Scaling to hundreds of users

thanks for the response. active community + irc = really huge pro.

that’s one thing i’ll need to test: performance degradation as the quantity of nodes increases. if i use a per-datacenter shared file system on our SAN i could probably delegate one (or more for HA) host per file system to keep each filesystem in sync.

i use puppet but we’re a full rhel/centos shop; looks like the only puppet module out there is for debian, however that’s easy enough to fix. there are a number of concerns brought up on the project’s github repo. do you know if an “official” init script for debian and rhel/centos/fedora is in the works?

https://forge.puppetlabs.com/whefter/syncthing/readme

fair enough. do you know if a timeline for an RC has been established or are things still in a state of flux right now?

I suggest not using an init script and instead using a service manager like runit, systemd, SMF, etc. There are examples of this in https://github.com/syncthing/syncthing/tree/master/etc that should be easily adaptable to your environment.
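For reference, a minimal instanced systemd unit in the spirit of the ones in that directory could look like this. This is a simplified sketch, not a verbatim copy of the repo file, and the binary path assumes a standard package install:

```ini
# syncthing@.service -- simplified sketch of an instanced unit; the
# real file in syncthing/etc adds hardening and restart tuning on top.
[Unit]
Description=Syncthing - Open Source Continuous File Synchronization for %i
After=network.target

[Service]
User=%i
ExecStart=/usr/bin/syncthing -no-browser -no-restart -logflags=0
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

You would enable it per service account with `systemctl enable syncthing@deploy` (substituting your own user for the hypothetical `deploy`).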

There’s no timeline. There are some things that must be fixed before 1.0, when all those are done it may be time for a 1.0.

Wow, that repo… :grin: I have no idea what most of that stuff is accomplishing, but it sure is complex and enterprisey. As above, I suggest going with one of the five line startup scripts that sets a couple of variables and runs syncthing, rather than the five hundred line monstrosity there.

Although the stuff to change config on the fly etc. is sort of neat, I’d probably have gone with pushing out ready-made certs and configs directly instead.

In fact, if I were enterprisey enough I’d probably hack Syncthing to auto accept devices with certificates signed by a local CA and set some boxes as introducers, thus getting rid of most config wrangling when adding devices.
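The check such a hack would boil down to is an ordinary chain verification against the in-house CA, which you can prototype with openssl. All file names and CN values below are made up for illustration, and this only sketches the verification step, not the Syncthing patch itself:

```shell
# One-time: create the in-house CA (key + self-signed cert).
openssl req -x509 -newkey rsa:2048 -keyout ca-key.pem -out ca.pem \
  -days 3650 -nodes -subj "/CN=in-house-ca"

# Per device: generate a key and CSR, then have the CA sign it.
openssl req -newkey rsa:2048 -keyout device-key.pem -out device.csr \
  -nodes -subj "/CN=node01"
openssl x509 -req -in device.csr -CA ca.pem -CAkey ca-key.pem \
  -CAcreateserial -out device.pem -days 365

# What the hypothetical auto-accept check would amount to:
openssl verify -CAfile ca.pem device.pem   # prints: device.pem: OK
```

Any device presenting a cert that fails this chain check would simply be rejected, so adding a node reduces to signing one CSR and pointing it at an introducer.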


sorry, by init script i meant anything that ties into a service manager. last night i looked at the systemd config but it seemed broken on centos7; i’ll check it again later today, i probably missed something. we also deploy monit, but only use it as an actual service manager in a few unique situations.

cool, not sure i could justify implementing until 1.0 unless any known bugs aren’t relevant to our installation.

we’d want to deploy via puppet but i think we could write a simpler module for our purposes. the actual puppet manifests are pretty standard, and the module looks way more complex than it actually is; however, it needs updating to support other operating systems.

Can you give any more information about how it failed?

My answer to the original question is - definitely not, Syncthing is not ready for enterprise.

It might perform well while it is working. But pretty often (look at the forum) it gets out of sync, and there are evidently no good means of resolving such situations. Syncthing might work well in your business for years, and then in a moment it can break and you are unable to repair it. In my opinion, this is unacceptable for enterprise usage.

Good job on finding a five year old thread to vent in. :+1:


fast forward five years, i’ve been using it in production on linux hosts with both sendreceive and receiveonly folders with no major issues.


I’ve been using Syncthing for more than two years, with no issues at all. And if I hadn’t touched it, it could have run without any problems for another several years. But recently I changed something in the configuration, and things started to go wrong. In an attempt to fix it, things went even more wrong.

So I’m saying that Syncthing is a good program; I like it and use it for my personal purposes. But as documented recovery procedures and tools are missing, I wouldn’t dare to use it for any enterprise/business-critical purposes.

@calmh About venting. If this thread were buried in the history, there would indeed be no point in reviving it. But it is not buried; it shows up at the top when searching for enterprise topics.