i have six linux hosts running syncthing along with syncthing-inotify to keep various directories in sync. i'd had no issues for months, but recently i started syncing a directory containing between 500 and 2000 small files, which serve as the cache for a wordpress caching plugin.
shortly after bringing these directories into sync for the first time, i noticed that when wordpress deletes many files at once (removing expired cache objects), which frequently happens on multiple servers within a few seconds of one another, syncthing starts to misbehave in various ways. the most common is as follows:
files are removed on some hosts but not others, which leaves directories that can't be deleted because they aren't empty. here's an example (host and paths substituted):
Feb 1 23:42:33 myhost.com syncthing[9181]: [4U7MM] INFO: Puller (folder "cache", dir "to/cache"): delete: remove /path/to/cache: directory not empty
Feb 1 23:42:34 myhost.com syncthing[9181]: [4U7MM] INFO: Puller (folder "cache", dir "to/cache"): delete: remove /path/to/cache: directory not empty
Feb 1 23:42:34 myhost.com syncthing[9181]: [4U7MM] INFO: Puller (folder "cache", dir "to/cache"): delete: remove /path/to/cache: directory not empty
Feb 1 23:42:34 myhost.com syncthing[9181]: [4U7MM] INFO: Puller (folder "cache", dir "to/cache"): delete: remove /path/to/cache: directory not empty
Feb 1 23:42:34 myhost.com syncthing[9181]: [4U7MM] INFO: Puller (folder "cache", dir "to/cache"): delete: remove /path/to/cache: directory not empty
Feb 1 23:42:35 myhost.com syncthing[9181]: [4U7MM] INFO: Puller (folder "cache", dir "to/cache"): delete: remove /path/to/cache: directory not empty
Feb 1 23:42:35 myhost.com syncthing[9181]: [4U7MM] INFO: Puller (folder "cache", dir "to/cache"): delete: remove /path/to/cache: directory not empty
Feb 1 23:42:35 myhost.com syncthing[9181]: [4U7MM] INFO: Puller (folder "cache", dir "to/cache"): delete: remove /path/to/cache: directory not empty
Feb 1 23:42:35 myhost.com syncthing[9181]: [4U7MM] INFO: Puller (folder "cache", dir "to/cache"): delete: remove /path/to/cache: directory not empty
Feb 1 23:42:36 myhost.com syncthing[9181]: [4U7MM] INFO: Puller (folder "cache", dir "to/cache"): delete: remove /path/to/cache: directory not empty
Feb 1 23:42:36 myhost.com syncthing[9181]: [4U7MM] INFO: Puller (folder "cache", dir "to/cache"): delete: remove /path/to/cache: directory not empty
Feb 1 23:42:36 myhost.com syncthing[9181]: [4U7MM] INFO: Folder "cache" isn't making progress. Pausing puller for 1m0s.
this repeats over and over. sometimes it stops on its own because of another cache purge; other times i need to remove the files manually. either way, this does not seem ideal. at first i blamed inotify, but the behavior persisted even when using the built-in polling interval.
i’ve also tried the (undocumented) fsync option, which seems to help with files being written but does nothing for deletes… which i believe are the main issue here.
is there a best practice for configuring a directory that sees rapid deletes or large numbers of changes on multiple hosts in a short period of time? i really want to sync caches across these hosts, but i'm afraid syncthing may not be up to it.
This is probably down to how the plugin stores its caches, which is not obvious from the above. But the error you see would be caused by something like this:
Two devices have a directory structure which is in sync:
path/
  subdir/
    fileA
    fileB
Now device A deletes fileA, fileB and the now-empty subdir. Meanwhile, device B creates fileC in subdir.
Device A:
path/
  (subdir was deleted)
Device B, applying the changes from Device A as best it can:
path/
  subdir/ (Device A says this should be deleted?!)
    fileC
This is a bug, of sorts, since it should probably realize that there is a sync conflict on subdir here. But instead it diligently tries to apply the delete, which is not possible because there are files in it.
At some point fileC gets synced back to Device A, and subdir resurrected.
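The race above is easy to reproduce outside Syncthing entirely. Here is a minimal, hypothetical sketch (device names and paths are illustrative) using two local directories as stand-ins for the two devices: both start in sync, "A" deletes its subdir, "B" creates a new file in its copy, and then applying A's delete on B fails exactly as in the log output above.

```python
import os
import tempfile

# Two directories stand in for the two synced devices.
root = tempfile.mkdtemp()
for device in ("A", "B"):
    os.makedirs(os.path.join(root, device, "subdir"))
    for name in ("fileA", "fileB"):
        open(os.path.join(root, device, "subdir", name), "w").close()

# Device A deletes fileA, fileB and the now-empty subdir.
for name in ("fileA", "fileB"):
    os.remove(os.path.join(root, "A", "subdir", name))
os.rmdir(os.path.join(root, "A", "subdir"))

# Meanwhile device B creates fileC in its copy of subdir.
open(os.path.join(root, "B", "subdir", "fileC"), "w").close()

# B now applies A's changes: delete fileA, fileB, then subdir.
for name in ("fileA", "fileB"):
    os.remove(os.path.join(root, "B", "subdir", name))
try:
    os.rmdir(os.path.join(root, "B", "subdir"))  # fails: fileC is in the way
except OSError as e:
    print(e)  # on Linux, "Directory not empty"
```

Retrying the delete changes nothing until fileC itself is removed or synced back, which is why the puller loops.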
But really - why are you even syncing these cache directories between your wordpress instances? It doesn’t seem like something I’d expect to work.
[quote=“calmh, post:2, topic:9140”]
This is a bug, of sorts, since it should probably realize that there is a sync conflict on subdir here. But instead it diligently tries to apply the delete, which is not possible because there are files in it.
At some point fileC gets synced back to Device A, and subdir resurrected.[/quote]
this is exactly the behavior. under most circumstances everything does its job, but when a large number of operations all happen at once, this condition seems to occur.
the way caching plugins work is this: when a page is hit, the plugin checks the local filesystem for a static version of the page. if one exists, it's served (quick, no database or cpu usage). if not, the page is generated and saved to disk. after a configurable number of minutes or hours these files are removed.
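That read-through logic can be sketched roughly as below. This is illustrative only, not the actual layout or API of any wordpress plugin; the cache directory, filename scheme, and TTL are all made up.

```python
import os
import tempfile
import time

# Hypothetical cache location and expiry; real plugins make these configurable.
CACHE_DIR = os.path.join(tempfile.gettempdir(), "page-cache")
TTL_SECONDS = 600

def generate_page(slug):
    # Stand-in for the expensive database-query + rendering step.
    return f"<html><body>content for {slug}</body></html>"

def serve_page(slug):
    """Serve a cached copy if present and fresh; otherwise generate and cache it."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    path = os.path.join(CACHE_DIR, slug + ".html")
    if os.path.exists(path) and time.time() - os.path.getmtime(path) < TTL_SECONDS:
        with open(path) as f:       # cache hit: no database or cpu cost
            return f.read()
    html = generate_page(slug)      # cache miss: the expensive path
    with open(path, "w") as f:
        f.write(html)
    return html
```

The expiry step (removing stale files in bulk) is what triggers the mass deletes described at the top of the thread.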
let's say a post is hit on one of my six servers and the html is generated on disk. when another client makes a request, they could land on a different server where the same thing happens, causing the database queries and cpu usage that caching is meant to avoid. when i sync the cached files across all instances, we only have to generate the file on one server instead of all six. it works very well so far, at least when it's not stuck in an endless loop of delays.
you said it's potentially a bug; do you think it's something that could be looked at? we deploy syncthing for some of our largest sites, but nothing with the frequency of changes of this cache directory. i'd be very happy to contribute in one way or another!
I have not done any testing, but I would be very surprised if the database query and HTML generation were more resource-intensive than Syncthing.
Syncthing does multiple database queries, cryptographic hashing and transmission encryption in order to synchronise the files.
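To give a rough sense of one part of that per-file cost, here is an illustrative sketch (not Syncthing's actual code) of the block-hashing step alone that block-based sync implies for every new cache file:

```python
import hashlib

def block_hashes(data: bytes, block_size: int = 128 * 1024):
    """Hash content in fixed-size blocks, as block-based sync tools do
    before deciding what to transfer. Block size here is arbitrary."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]

page = b"<html>" + b"x" * 20_000 + b"</html>"  # a typical small cached page
hashes = block_hashes(page)
```

And that is before any database bookkeeping or encrypted transfer to the five other devices.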
Bug or no bug, the only ill effect here is a few log entries and retries. That could be cleaned up a little, but the effect will always be roughly the same when one side is deleting a directory and the other side is creating new files in it.
To be honest, you shouldn't be using Syncthing for this. Your cached content should go on a shared memcached or redis server, so I suggest you start there.
syncing a file once vs. generating it six times reduces load on both the db and httpd servers (we've proven this in practice).
yeah, that's what i've been afraid of; i'm not sure this will be a solution for us then. i suppose i was hoping it was less a technical roadblock and more me missing something.
the main devs for this project have used wp-rocket on their previous high-traffic sites with very good results and are pushing for it here. i originally asked about keeping the cache in memory but that was dismissed, so i'm going to disable syncing caches on disk for now and see what else i can find. any recommendations would be helpful.
You're on the wrong forum for wordpress support, but it definitely feels like you want memcached here, or the same NFS mount on all boxes, and definitely not Syncthing.