A few questions about send only and receive only folders

Okay so I read the docs and understand the send only and receive only settings but have a few more questions.

Let’s say I have a cluster of 6 devices…. Let’s say those devices are all on their own private networks and thousands of miles apart and spread across three continents. Connectivity isn’t a given at any point in time.

Consider a “master” server in the cluster as a “send only” type device. All folders are send only. This part really isn’t in question.

However, for the other 5 devices, let’s call them clients (I know Syncthing doesn’t make a distinction, but for clarity “clients” seems appropriate), I’m considering whether their folders should be set up as “send & receive” or “receive only”. If the files never change then presumably it doesn’t really matter. But let’s say occasionally one of the clients inadvertently modifies a file.

If they’re set as “receive only”: then the client with the modified file doesn’t advertise that file as modified to the cluster and the global state doesn’t change. Right? In this case, the “master” doesn’t know one of the clients is out of sync. The other clients also don’t know about any change and they still show “in sync” and match the master. The difference can only be seen on the one client that has the modified file. And I can “revert local changes” on that client. All correct?

If the client devices are all set to “send/receive” then the modified file is pushed to all the other clients in the cluster… and the master device with the send only folder becomes out of sync and I get the option on the server to “override changes” and push the master state back to all clients. Is this correct?

One last question. If the client folders are set as “receive only”, is it still possible that they are sources and will push their files to an out-of-date client? Remember my earlier comment about devices on multiple continents? If the master device with the send only folder updates its files and needs to push them to all the other locations, can that master in the USA push to the client in Europe, and then that client in Europe push to the client in Asia, even if the client in Asia never has connectivity with the server in the USA? Or do receive only folders not “send”? Also, to confirm: even with no connectivity directly to the server, the “global state” will propagate from client to client as long as there is at least one link with the server somewhere in the cluster of clients?

Am I making any sense??? (It’s very clear in my head! Haha.)

Your assumptions and guesses are correct.

Wow, I’m shocked. Thanks for the confirmation. I guess my preference is to see on the “master” that a client has an “unauthorized” change… But I think in my case having the change visible on the master Syncthing UI isn’t worth having that unauthorized change pushed to other clients. So “send only” on the master and “receive only” on the clients seems to make sense, with the only negative side effect being that we need to check each individual Syncthing instance on the clients to see if someone’s modified their files.

Of course there are permissions we can set on those machines to control changes directly in the file system but as always it’s a complicated discussion when we don’t fully control all systems.

Thanks for the info. I do think this tool is going to work well for us.

I think all devices that are connected to the device that is receive only and has made local modifications will see it as out of sync. (However, you can’t distinguish between “it’s still downloading data” and “it has local modifications”, I believe.)

Whether that is your hub device or not, that is for you to decide.

Thanks for all the inputs. I think generally I understand how it works. Will try a few tests on local machines before we make a final decision on which way to go.

Appreciate the responses.

Just a thought, as you seem to be trying to build a somewhat centralized solution (control several clients from one server) with a totally decentralized tool. Maybe you should just use Syncthing for transfer and keep a kind of “data pool” folder synchronized. But don’t give the clients direct access to that folder, instead have some local component copy the relevant files to where they will be accessed. Overriding unauthorized changes then becomes the responsibility of this local software. Of course this requires more storage space.

Hope this may give you some fresh ideas. It’s easy to see all problems as nails when you’ve found a neat hammer named Syncthing… :wink:

Actually this is basically the current solution. The problem is the connections are sometimes very unreliable. Our longest link is about 9000 miles and many, many hops: very slow transfer rates and many broken connections. When we push new code we are pushing as many as 1000 files (mostly very small ones), and if there’s any failure the whole thing fails. We also find that links between the USA and a site in Vietnam, for instance, sometimes struggle because of a misbehaving ISP somewhere along the way, while there is no problem from Thailand to Vietnam. So the idea is we push from the USA to everywhere, and let all the different nodes help get the full dataset where it belongs. Also, each new release is usually very similar to the old release, with only 5-10% of file contents changing. Currently the transfer of so many files over very-long-latency connections that drop somewhat regularly takes a REAL long time. Hours… so the goal is to use the hash-matching copy capability which is integral to Syncthing to substantially reduce the amount of data we have to push over a 9000 mile link.

And of course if one site is down for a few hours, our current process just fails. We get an error report and have to push again… Syncthing would connect at some point in the future, and if the misbehaving site didn’t already get all the parts from another client, it would start recovery automatically and bring that site up to date.

Anyway I hope it makes sense…

Well my suggestion was to use Syncthing for data distribution, with all the benefits you describe. Just separate that shared folder from the folder that the client actually has direct access to. Make it mostly impossible for them to modify (or see) the distributed files under Syncthing’s control. Have something like rsync running locally on each client to apply changes from the synced folder to the one used live. This is where your overrides of local modifications are applied then.
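For illustration, that local apply step could be as small as a one-line rsync run from cron or a systemd timer. A sketch only, with /srv/pool and /srv/live as placeholder paths:

# Mirror the Syncthing-managed pool into the live directory (placeholder paths);
# --delete discards anything the client added or changed locally.
rsync -a --delete /srv/pool/ /srv/live/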

That’s not a bad idea. I think our initial strategy is simply to lock down the Syncthing destination folder and not have an extra copy. If we find valid reasons why they have to modify those files for their specific location, then we may have to come up with something like what you describe. The goal for now, though, is to keep it as simple as possible with as few points of failure as possible. What happens if rsync fails? Then we don’t know if the code that’s running on that machine is actually up to date. And then it’s 6 rsync instances on 6 different machines we have to check logs for to verify success… The ideal situation is that I can check the Syncthing UI on the master and see immediately whether all the remotes are up to date (if they’re connected at that moment), and maybe even create notifications if all remotes don’t get the full update within, say, one hour. (REST API I guess, but I haven’t gotten that far yet…)

(It may be a feature request at some point, and maybe one I can take a crack at now that I have my dev environment set up. But it would be great to see on that “master” machine not only the “last seen” time for all of the client machines, but also a “last up to date” time indicating the last known time when that client was successfully in sync… Right now we see last seen, but that doesn’t imply it’s up to date.)
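For the notification idea, a rough starting point would probably be the REST completion endpoint. Untested, and the API key, folder ID and device ID below are just placeholders:

#!/bin/sh
# Ask the local Syncthing instance how complete a remote device is for a folder.
API_KEY="abc123"
FOLDER="default"
DEVICE="AAAAAAA-BBBBBBB-CCCCCCC-DDDDDDD-EEEEEEE-FFFFFFF-GGGGGGG-HHHHHHH"

# Returns JSON including a "completion" percentage; alert if it stays
# below 100 for longer than, say, an hour.
curl -s -H "X-API-Key: ${API_KEY}" \
  "http://127.0.0.1:8384/rest/db/completion?folder=${FOLDER}&device=${DEVICE}"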

Have you thought about having the RO servers be capable of syncing with each other? That way not all of the data has to make it from that one server to all of the rest. The remote servers might have a more reliable connection with each other than with the send-only server, so not every remote server needs to be populated 100 percent from the one send-only server. Country B might get 50% from one server and the rest from each of the others. It should make the data populate more quickly globally.

That was part of my original question in the first post and I believe based on the answers, that’s already how it works.


Yes, sorry I re-read your original.

Do you have a way to view each/any of the foreign GUIs? They will each need to locally “Accept” the sync requests for the folders you are going to share. It’s also good for checking on local errors; there’s always the possibility that one of those servers is going to toss out an error every now and then.

Given how distributed your system is, and how unreliable the connections are, it might be difficult to set up. One way that works for me, and I think for others, is SSH tunneling. You’d need IP information that may be difficult to get, but the syntax is simple.

I’m just throwing out the idea in case it’s a possibility.

Yeah, we will have the ability to check each server. We have team members in all of those locations (except one). The beauty would have been being able to see in a single dashboard that everything is current. But the reality is it will be the responsibility of the teams in each location to NOT modify the source.

Anyway, this is going to be WAAAY better than what we have now, which I’m embarrassed to say is FTP. :frowning:

We will also have to see how the firewalls are going to work. We already know that UDP hole punching isn’t working for two of our devices. We may have to set up a firewall port forward for one or two machines. We’ll see once we get the software installed on all the systems.

Really appreciate the ideas and discussion from everyone.

Well,

So long as you need to make local firewall rules to ensure that Syncthing can fully work, see if you can get a dedicated inbound port that forwards all traffic to port 22 on the local server (with sshd running). You can use that to SSH tunnel in so you can bring up the remote GUI on your own computer. Perhaps you can’t get that to work 100% everywhere, but it will be better than nothing. That’s the closest you’ll come to a single, central GUI admin tool.

You’d be able to run something like this:

ssh -L 6999:127.0.0.1:8384 -p 2222 user@ip_of_firewall

(Use 2222 or whatever port they give you, so long as it forwards to port 22, the SSH port, on the Syncthing server.)

Then, you open a new browser tab and point it to 127.0.0.1:6999 and hopefully you’ll see the remote GUI.

I have a port forwarded on my home router in this manner so I can get to my server at home.
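If you end up doing this regularly, the same tunnel can also be kept in ~/.ssh/config so it’s just “ssh asia-site”. The alias, address and port numbers below are placeholders:

Host asia-site
    HostName ip_of_firewall
    Port 2222
    User user
    LocalForward 6999 127.0.0.1:8384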

I’d considered setting up a dashboard, but most of the time things are running fine, so what I really wanted was to know whenever there was an error.

A simple cross-platform solution that’s been working great for me is combining STC with grep and a notification tool.

I use msmtp to send an email depending on the output from STC, but other push notification tools such as Healthchecks, Pushover, Pushbullet, etc. can also be used instead (the Swiss Army Knife of push clients is the open-source Apprise).

A scheduled task checks for errors every 15 minutes (I didn’t want to be bombarded with notifications for intermittent network hiccups).
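On Linux that’s just a crontab entry along these lines (the path is a placeholder for wherever you save the check script shown further down):

# Run the Syncthing check script every 15 minutes.
*/15 * * * * /usr/local/bin/check_syncthing.sh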


This is good info. Will check these out.

Just downloaded and tested the STC software. Seems to work perfectly. It’s not 100% of the info you can get from the GUI, but it provides very useful information that might alert you to a problem.

My problem is going to be getting it to work with msmtp. Seems complicated, but I’ll give it a go. STC is most useful if it can send email.

Thanks.

The status column shows several states including:

syncing
error
offline
LocAdds

It’s actually much easier than it initially seems. :wink:

The complicated part of msmtp is getting the command syntax right. Familiarity with formatting email header fields is also helpful.

For Linux, here’s a slightly modified version of my shell script that emails the output from stc when any device is in an error or offline state.

#!/bin/sh

# Locations and mail settings – adjust to your environment.
STC=/usr/local/bin/stc
EMAIL_SMTP=smtp.company.com
EMAIL_FROM=donotreply@company.com
EMAIL_TO=syncthing@company.com
EML=/dev/shm/syncthing.eml

# Check whether stc reports any device in an error or offline state.
$STC | grep -i -e error -e offline > /dev/null

if [ $? -eq 0 ]
then
    # Build a minimal email (headers, blank line, then the full stc output)...
    printf 'To: %s\n' "${EMAIL_TO}" > $EML
    printf 'Subject: Syncthing error on %s\n\n' "${HOSTNAME}" >> $EML
    $STC >> $EML
    # ...and hand it off to the SMTP relay.
    msmtp --host=${EMAIL_SMTP} --from=${EMAIL_FROM} ${EMAIL_TO} < $EML
    rm -f $EML
fi

With some minor adjustments, the script above should also work with the Homebrew package of msmtp for macOS and the Cygwin package for Windows.

If using apprise instead of msmtp, the following example emails the output from stc:

stc | apprise -vvvv --dry-run --title='Syncthing Status' 'mailto://smtp.company.com?from=donotreply@company.com&to=syncthing@company.com'

(“-vvvv --dry-run” cranks up the verbosity and suppresses actually sending to help with debugging. Leave off for production use.)

At work I have an SMTP server that can be used without user credentials via the LAN, but msmtp and Apprise also support authentication.


Thanks for the examples. Shell scripting has always been hard for me to learn. I’ve been using Linux for over 20 years but am not a programmer.

What do you use for an SMTP server that doesn’t require authentication? Doesn’t your SMTP server ultimately need to connect to another SMTP server that will be able to deliver the message, or is your credential-less SMTP server able to deliver messages directly? I was thinking of loading Mailman but I’ve never tried it before. I’m guessing that trying to get it to authenticate to a Google SMTP server, for example, might be difficult, especially since my accounts are set up for two-factor authentication.

Unlike the gentleman who started this thread I do not have servers located around the world that I need to monitor. Everything is at home. This is going to be practice for me in case I do need to use it someday. It’s primarily going to help me check the status of my four servers with one simple command and confirm that everything is okay.

I always like a good challenge.

I’ve also been using Linux for about the same time. Started because I needed a C compiler for a project (too poor at the time to buy a commercial compiler after spending a small mint on a custom home PC :grin:).

Shell scripting in some ways can be harder to learn than traditional programming languages because with C, Perl, Python, Go, etc. – exceptions being the occasional exec() and system() calls – you’re coding inside a curated environment. But with shell scripts, it’s a mashup of internal and external system commands, which to a more casual user doesn’t always have a clear separation, making it more difficult to fully understand (not all GNU/Linux commands are available on every Un*x platform).

At work, I’ve got multiple Postfix servers. One server only accepts connections for outbound email via a private LAN. It’s used for automated alerts from a monitoring system, backup software, etc., so it’s configured to allow unauthenticated access (a hardware firewall provides an additional layer of security).

In general, an SMTP server can hand off email to another SMTP server for delivery without authentication. It’s a bit like a post office transporting snail mail to another post office for last-mile delivery. A postal carrier can then drop off the snail mail in a recipient’s mailbox. It all happens without the post office needing specific permissions.
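As a concrete example, pointing msmtp at an internal relay like that needs no credentials at all. A minimal ~/.msmtprc sketch, with placeholder hostname and addresses:

# ~/.msmtprc – unauthenticated relay on a trusted LAN (placeholder values)
defaults
syslog on

account internal
host smtp.company.com
port 25
auth off
tls off
from donotreply@company.com

account default : internal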

Mailman is useful for email notifications, but it still requires access to an MTA such as Postfix, Sendmail, etc.

For Gmail, if you’re still able to create app passwords, that’s one option (although Google has been slowly deprecating their use). Another Gmail option is OAuth (supported by msmtp and Apprise).

OAuth tokens work with 2FA-enabled accounts and usually have an expiration date, along with possibly other security measures. If you don’t have a way to automate token renewals, it might be inconvenient.
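If app passwords are still available on the account, an msmtp account for Gmail might look something like this (the address and password file are placeholders; for OAuth you’d switch the auth method to one of msmtp’s OAuth mechanisms and supply a token instead):

# Gmail over the submission port with an app password (placeholder values)
account gmail
host smtp.gmail.com
port 587
tls on
tls_starttls on
auth on
user someone@gmail.com
passwordeval "cat ~/.msmtp-gmail-app-password"
from someone@gmail.com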

Because of excessive email spam, most home ISPs block ports 25 and 587, so running your own SMTP relay at home might not be practical.

So another option is to use a 3rd-party push notification service such as Healthchecks.

At my parents’ home, a Linux server is plugged into a UPS unit that has a USB interface for monitoring the electrical service and battery backup. Once a day, the server runs a cron job like this:

curl --silent --max-time 10 --retry 5 --output /dev/null https://hc-ping.com/${PING_KEY}/ups

It acts like a “dead man’s switch”: if Healthchecks doesn’t receive an HTTP GET request from my server at least once per day, Healthchecks emails me.

I can also download a JSON blob for a quick check as needed (e.g., after a big storm) like this:

curl -s -H "X-Api-Key: ${API_KEY}" https://healthchecks.io/api/v1/checks/00000000-0000-0000-0000-000000000000 | json_pp