Filesystem Watcher Error - No such file or directory

FreeNAS Plugin Install (Running in a BSD jail) Version v1.8.0, FreeBSD (64 bit)

Synced OK, but now shows an error in the GUI:

Filesystem Watcher Error: For the following folders an error occurred while starting to watch for changes. It will be retried every minute, so the errors might go away soon. If they persist, try to fix the underlying issue and ask for help if you can’t.

Documents error while traversing /mnt/documents/30-39_Hacking/32_Hacking_Projects/32.14_X230/roms/hack-x230: no such file or directory

I have checked that the folder is still on disk, it has the correct permissions for the syncthing user:

ls -la /mnt/documents/30-39_Hacking/32_Hacking_Projects/32.14_X230/roms/hack-x230
total 12880
drwxr-xr-x  2 syncthing  nogroup        9 Sep  7 12:40 .
drwxr-xr-x  4 syncthing  nogroup        4 Sep  7 12:19 ..

There are also files in the directory (not shown).

I thought it might be something similar to inotify limits on Linux, but there is no mention in your documentation of similar issues on BSD, so I guess not?

Any ideas?

The corresponding limit on BSDs is file descriptors, but I don’t know if that results in “no such file or directory”.

sysctl -a | grep kern.maxfiles
kern.maxfiles: 2095415
kern.maxfilesperproc: 1885869
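To cross-check those limits against actual usage, FreeBSD’s procstat can count what the process really has open. This is a sketch assuming a single syncthing process is running in the jail:

```shell
# procstat -f lists one line per open descriptor, plus a header line;
# drop the header to get the count of descriptors held by syncthing.
procstat -f "$(pgrep -n syncthing)" | tail -n +2 | wc -l
```

If that number is nowhere near kern.maxfilesperproc, descriptor exhaustion is unlikely to be the cause.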

The number of documents I am currently syncing is 72,589, which is well below these limits, so I suspect this is not the issue.

Doesn’t look like it, no.

I’m having the same issue with the same setup (FreeNAS 11.2, will update soon to see if that resolves it)

The warning goes away when I disable the “Watch For Changes” mode as you’d expect (but I’d rather have this enabled as I imagine it is far less resource intensive for a 1TB folder with 56k files).

Initial permissions:

root@syncthing:/mnt # ls -al /mnt
total 20
drwxr-xr-x   3 root       wheel   3 Aug  8 00:35 .
drwxr-xr-x  19 root       wheel  23 Aug  8 00:27 ..
drwxr-xr-x   3 syncthing  1004    3 Aug  8 10:40 SyncThingData

I couldn’t find a group 1004 so I set group to syncthing, first for the top level, then recursively, no change in error (when disabling and re-enabling Watch)

sysctl -a | grep kern.maxfiles shows roughly 12M:

kern.maxfiles: 12580523
kern.maxfilesperproc: 11322468

Any other ideas about where I can look for the cause of the issue?

Sorry, but I think running in the jail might have something to do with this.

I don’t think any of us do that, so there is no advice we can give.

Best I can suggest is to download the binaries from github, run them outside of the jail and see if that helps.

I have re-checked permissions and there is no difference between the folder which is mentioned in the error and any of its parents.

I wondered if it could be the file path length: the path ‘/mnt/documents/30-39_Hacking/32_Hacking_Projects/32.14_X230/roms’ is 64 bytes long, which would overflow a 64-byte buffer once you add a trailing NULL, but there are many longer file paths that work fine. I may be guessing incorrectly here, but I can’t really think of anything else.

I don’t have a spare machine to try running the BSD executable outside a jail.

Anybody have any other suggestions that I can try to narrow down this problem?

Did you try what I suggested, running outside of the jail with our binaries?

I don’t have another machine on which I can test running outside a jail, so I am continuing to try to identify what is causing this:

I tested opening 1 million files concurrently using these python scripts from inside the jail and succeeded without problem. So I really do think we can discount any file handle limits.

The error says “No such file or directory”, but doing an ls on the path works and shows the contents. I did wonder whether this could be case sensitivity, but I can see no differences between the reported path and the actual path. There is a discussion about case sensitivity within ZFS on freenas. My pool and dataset hosting syncthing are both set to “Sensitive”. I have not done anything to try and use the “in development” case insensitivity in syncthing.
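If anyone wants to double-check that setting, the case sensitivity can be read straight off the dataset; the dataset name below is a placeholder for whatever hosts the synced folder:

```shell
# Print the casesensitivity property of the dataset hosting the folder.
# "sensitive" is the ZFS default; "insensitive" or "mixed" would be the
# values worth investigating here.
zfs get -H -o value casesensitivity tank/documents
```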

Any more ideas other than “try it outside the jail”?

You could run it via strace (or whatever the equivalent on your OS is) and see which syscall returns the nonexistence error, but I guess you have to know what error code to look for, etc.
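On FreeBSD the rough equivalent of strace is truss. As a sketch, assuming the jail has truss available and a single syncthing process, you could attach to the running process and filter for ENOENT, which is the errno behind “no such file or directory”:

```shell
# Attach to the running syncthing process, follow child processes,
# and show only syscalls that fail with ENOENT (errno 2).
# truss writes its trace to stderr, hence the redirection.
truss -f -p "$(pgrep -n syncthing)" 2>&1 | grep ENOENT
```

The path argument in the failing syscall should point at whatever the watcher is actually tripping over.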

Vance, have you noticed this causing any issues beyond the warning? Mine seems to sync and stay in sync fine.

I’m in the same boat in that on this machine I’d rather not run it outside the jail. Maybe someday I’ll set up a testbench and see if I can repro and mitigate it; or possibly look into strace/source and see if I can debug that way.

The problem is that I can’t be sure. I have over 75,000 files and I can’t check whether each of them is up to date. If syncthing is giving me a warning then I don’t want to rely on it.

I don’t tend to edit files directly on this filesystem, so it is possible that things will remain in sync: all changes happen remotely, so syncthing will be aware of the incoming changes and update the local filesystem. But again, this is fragile if I suddenly decide to share the filesystem through CIFS/SMB or NFS.

As the error is removed if you turn off “File Watching” and switch to regular scanning you could run it in that mode, but that would be a bit of a resource hog. I am also considering running it on a linux VM on the freenas box, but again this seems overkill if I could just get it working natively in the jail.

While it’s not a solution/explanation, if what you want is “just” to ensure sync still works, disable filesystem watching. That gets rid of the error and works. It means more filesystem scanning, as everything needs to be rescanned periodically (and more often than with watching enabled), but it works.

Okay, I think, but not quite sure yet, that I’ve made some progress on this problem.

While adding a new share that will ultimately be with a different machine (not yet finished), I noticed that chmod wasn’t working and found out that’s because I have Windows/SMB ACLs enabled on my network shares/mount points/pools/something.

I had trouble figuring out setfacl, but with help from here:

I found these commands, which basically grant full access to the current directory and all subdirectories:

[root@freenas01] /mnt/dlvol1/dldataset1# find . -type d -exec setfacl -m everyone@:full_set:fd:allow {} \;
[root@freenas01] /mnt/dlvol1/dldataset1# find . -type f -exec setfacl -m everyone@:full_set::allow {} \;

After running those (took a while, naturally) I found that the FS Watch Error is pointing at a much deeper directory (one in a path with a space, and with some files that didn’t get fixed by the setfacl). Seems like the above fails on symlinks, at least that’s my first guess. Not caring too much about that file I removed it and the error seemed to move along to a different directory. I cleaned up a couple more (wait a couple minutes and refresh) and then get an error about too many levels of symbolic link.

So I’m fairly confident with a bit more research into permissions (obviously 777 isn’t going to work for everyone…) and cleaning up some of the cruft in my backups that have these symlinks I can probably get the error to go away.

It seems there is a deeper root cause as this started as a fresh sync to the FreeNAS/TrueNAS/BSD server (in the jail), from a Mac host, so it seems like upon creation it shouldn’t have written files/dirs in a way where it didn’t have permission to modify (read?) them. In my other share point they were existing directories I had rsync’d a while back and I’m not really surprised there are permissions issues as I bring them into the fold.

Will update again later once I’ve dug a bit more.

I’m not really sure what rwxpDdaARWcCos is, but if anyone knows the key bits, that might help avoid some guess-and-checking and spraying everything with full permissions to start with.

2 more *.app files removed from an old downloads directory, and I’m clean! No more File System Watcher error or anything else showing on my GUI dashboard. (Other than the failed items in the new sync point.)

Granted, my permissions are wide open, so I need to investigate that, but at least from here it should be possible to narrow down.

@owenfi - Thanks for sharing your investigation. Using the getfacl command in the problem directory I was able to get a no such file or directory error for a symlink that was pointing at a non-existing file. Removing the symlink moved the error on to a different location, so further cleaning of my filesystem is required.
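For anyone else cleaning up after this, dangling symlinks can be enumerated in one pass rather than fixing them one error at a time. The find expression below prints any symlink whose target fails `test -e`; the real invocation would be something like `find /mnt/documents -type l ! -exec test -e {} \; -print`, and what follows is a self-contained demo in a scratch directory:

```shell
# Create a scratch directory with one valid and one dangling symlink,
# then list only the symlinks whose targets no longer exist.
dir=$(mktemp -d)
touch "$dir/present"
ln -s "$dir/present" "$dir/good-link"      # target exists: not printed
ln -s "$dir/missing" "$dir/dangling-link"  # target missing: printed
find "$dir" -type l ! -exec test -e {} \; -print
```

`test -e` follows the link, so it fails exactly when the link target is gone, which is what the `!` negation turns into a match.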

I think this must be stricter handling of permissions on filesystems with ACLs applied. On FreeNAS, even non-CIFS shares still have ACLs applied. They are fairly open and equivalent to 755 Unix permissions (or perhaps even updated to reflect the Unix permissions if you use the GUI to change them), but they are still there and therefore produce the error for a missing symlink, whereas filesystems without the ACL don’t give the same error. I think this explains why my Linux machines don’t complain whilst the FreeNAS BSD box does.

Ah yes, symlinks pointing at nothing causing problems for the watcher is actually a known bug: https://github.com/syncthing/syncthing/issues/5360

What is more concerning though is that you get no such... errors when in fact you just don’t have access. Is that really the case? That would be terrible behaviour by the system, which can cause data loss, because it means syncthing considers those files as deleted. While in fact they are there, just not accessible. In any sane system the error would be something like “permission error”, in which case Syncthing just shows that error to the user, but definitely does not mark the files as deleted.

@imsodin, I can’t speak for @owenfi, but my errors are all caused by dangling symlinks, rather than permission errors, which I hope would give a different error message? It may be related to file permissions, but only by the fact that the filesystem uses ACLs on top of the unix permissions and those libraries may provide different error messages, which is why I don’t see this error on my linux machines.

The error I get is:

error while traversing <The path to directory containing the link>: no such file or directory

It would be of more use if the error identified the actual file (link) that caused the problem, rather than the parent directory.

It would be better still if the watcher maintained a list of problem files, but continued to watch the rest. Users could then investigate the problem files, whilst the remainder of them were synced normally.

In this particular case the watcher should just not care that the symlink is dangling honestly. If anyone can give me ssh access to a system showing the problem, I’d be willing to investigate.

Having error handling in the watcher other than failing completely is a whole different topic, requiring significant changes to a 3rd party library (likely a fork and adopting the puppy). And that’s not on my roadmap.