Infinite filesystem recursion detected

Hi,

I’m getting issues with Syncthing 1.7.0 under Windows; the binary is the official build.

I get many messages similar to:

2020/07/08 16:54:06.737843 walkfs.go:93: WARNING: Infinite filesystem recursion detected on path ‘it\openvpn\win10’, not walking further down

Some points:

  1. The file system resides on a virtualbox share under Windows: the share is on the Linux host; the directory/folder is shared with a Windows guest; Windows is a guest system under Linux.

  2. The directory/folders in question are shared with multiple machines running multiple operating systems in multiple locations.

  3. The Linux directory on the Linux host has no symlinks in it.

  4. The same directory shared with a Linux virtual box guest is scanned fine.

  5. All was (or seemed) fine under previous versions of the official syncthing binary.

Here’s some log outputs with:

STTRACE=walkfs,fs,scanner

[3Y3FL] 2020/07/08 16:54:06.736757 logfs.go:61: DEBUG: walkfs.go:104 basic X:\dept-share Lstat it\openvpn\.DS_Store {0xc001d3d730} <nil>
[3Y3FL] 2020/07/08 16:54:06.736757 walkfs.go:71: DEBUG: walk: path=it\openvpn\.DS_Store
[3Y3FL] 2020/07/08 16:54:06.736757 logfs.go:61: DEBUG: walkfs.go:104 basic X:\dept-share Lstat it\openvpn\win10 {0xc001d3d810} <nil>
[3Y3FL] 2020/07/08 16:54:06.736757 walkfs.go:71: DEBUG: walk: path=it\openvpn\win10
[3Y3FL] 2020/07/08 16:54:06.736757 walkfs.go:36: DEBUG: ancestorDirList: Contains 'win10'
[3Y3FL] 2020/07/08 16:54:06.737843 walkfs.go:93: WARNING: Infinite filesystem recursion detected on path 'it\openvpn\win10', not walking further down

And another:

[3Y3FL] 2020/07/08 18:39:52.125176 walkfs.go:30: DEBUG: ancestorDirList: Pop 'mac'
[3Y3FL] 2020/07/08 18:39:52.126171 logfs.go:61: DEBUG: walkfs.go:104 basic X:\dept-share Lstat it\vnc\win10 {0xc002c061c0} <nil>
[3Y3FL] 2020/07/08 18:39:52.126171 walkfs.go:71: DEBUG: walk: path=it\vnc\win10
[3Y3FL] 2020/07/08 18:39:52.126171 walkfs.go:36: DEBUG: ancestorDirList: Contains 'win10'
[3Y3FL] 2020/07/08 18:39:52.126171 walkfs.go:23: DEBUG: ancestorDirList: Push 'win10'
[3Y3FL] 2020/07/08 18:39:52.127218 logfs.go:55: DEBUG: walkfs.go:97 basic X:\dept-share DirNames it\vnc\win10 [vncviewer-1.10.80.exe sha1s.txt nightly.html nightly_files tigervnc64-1.10.80.exe vncviewer64-1.10.80.exe sha1s.txt.asc] <nil>
[3Y3FL] 2020/07/08 18:39:52.127218 logfs.go:61: DEBUG: walkfs.go:104 basic X:\dept-share Lstat it\vnc\win10\vncviewer-1.10.80.exe {0xc002c06690} <nil>
[3Y3FL] 2020/07/08 18:39:52.127218 walkfs.go:71: DEBUG: walk: path=it\vnc\win10\vncviewer-1.10.80.exe
[3Y3FL] 2020/07/08 18:39:52.127218 logfs.go:61: DEBUG: walkfs.go:104 basic X:\dept-share Lstat it\vnc\win10\sha1s.txt {0xc002c06770} <nil>
[3Y3FL] 2020/07/08 18:39:52.127749 walkfs.go:71: DEBUG: walk: path=it\vnc\win10\sha1s.txt
[3Y3FL] 2020/07/08 18:39:52.127749 logfs.go:61: DEBUG: walkfs.go:104 basic X:\dept-share Lstat it\vnc\win10\nightly.html {0xc002c06850} <nil>
[3Y3FL] 2020/07/08 18:39:52.127749 walkfs.go:71: DEBUG: walk: path=it\vnc\win10\nightly.html
[3Y3FL] 2020/07/08 18:39:52.139362 logfs.go:61: DEBUG: walkfs.go:104 basic X:\dept-share Lstat it\vnc\win10\nightly_files {0xc002c06930} <nil>
[3Y3FL] 2020/07/08 18:39:52.139362 walkfs.go:71: DEBUG: walk: path=it\vnc\win10\nightly_files
[3Y3FL] 2020/07/08 18:39:52.139362 walkfs.go:36: DEBUG: ancestorDirList: Contains 'nightly_files'
[3Y3FL] 2020/07/08 18:39:52.139929 walkfs.go:93: WARNING: Infinite filesystem recursion detected on path 'it\vnc\win10\nightly_files', not walking further down

Any ideas? Apologies if this is an obvious user error!

Thanks again for a fab program!

All the best,

===Rich

It seems that whatever mechanism you use to expose those things to Windows reuses inodes/unique file identifiers, so Syncthing thinks it’s going into an infinite recursion and stops delving deeper into the directory tree.

I suspect this is specific to your virtualbox share stuff, not a general problem, and sadly there is no flag to disable this.
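For readers following along, here is a minimal sketch of the kind of ancestor check being described. It is not Syncthing's actual walkfs.go code: the names walk, sameFunc and demo are hypothetical, and the "always the same" comparison is only an assumed stand-in for a filesystem that hands out reused file IDs.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// sameFunc reports whether two FileInfos describe the same file.
// os.SameFile has this shape; the demo also plugs in a broken variant.
type sameFunc func(a, b os.FileInfo) bool

// walk descends from path while keeping the FileInfos of the directories it
// is currently inside. A directory that compares equal to one of its
// ancestors is recorded in skipped and not entered.
func walk(path string, ancestors []os.FileInfo, same sameFunc, skipped *[]string) error {
	info, err := os.Lstat(path)
	if err != nil {
		return err
	}
	if !info.IsDir() {
		return nil
	}
	for _, a := range ancestors {
		if same(a, info) {
			// Looks like a cycle: stop descending, but keep scanning elsewhere.
			*skipped = append(*skipped, path)
			return nil
		}
	}
	entries, err := os.ReadDir(path)
	if err != nil {
		return err
	}
	ancestors = append(ancestors, info) // push; the caller's slice keeps its length
	for _, e := range entries {
		if err := walk(filepath.Join(path, e.Name()), ancestors, same, skipped); err != nil {
			return err
		}
	}
	return nil
}

// demo walks a small temporary tree twice: once with os.SameFile and once
// with a comparison that always says "same", imitating reused file IDs.
func demo() (okSkipped, badSkipped []string, err error) {
	root, err := os.MkdirTemp("", "walkdemo")
	if err != nil {
		return nil, nil, err
	}
	defer os.RemoveAll(root)
	if err = os.MkdirAll(filepath.Join(root, "it", "vnc", "win10", "nightly_files"), 0o755); err != nil {
		return nil, nil, err
	}
	if err = walk(root, nil, os.SameFile, &okSkipped); err != nil {
		return nil, nil, err
	}
	alwaysSame := func(a, b os.FileInfo) bool { return true }
	err = walk(root, nil, alwaysSame, &badSkipped)
	return okSkipped, badSkipped, err
}

func main() {
	okSkipped, badSkipped, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("skipped with os.SameFile:", len(okSkipped))
	fmt.Println("skipped with reused IDs: ", len(badSkipped))
}
```

With a well-behaved filesystem nothing is skipped; with the broken comparison the walk gives up at the first subdirectory, which matches the symptom in the logs above.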

Hi Audrius,

Is there an easy way to dump the relevant data to test this? This didn’t seem a problem in earlier incarnations of st.

The commit that introduced the error message was:

ee445e35a

which was a pretty big splodge of code.

This change in basicfs.go:

-       return os.SameFile(f1.FileInfo, f2.FileInfo)
+       return os.SameFile(f1.osFileInfo(), f2.osFileInfo())

makes me wonder if the new tests are sufficient. My go-fu isn’t that great, sadly, and the changes in lstat_windows.go intimidate me a tad!

Thanks again,

===Rich

This was added in 1.7.0, so yes, you did not have the issue before: previously we did not delve into reparse points, and now we do. Because we do, we need to protect ourselves from cycles, hence these checks were introduced.

I don’t think there is any issue with the code, I think the issue is with the fake filesystem virtual box is exposing.

If you run:

fsutil file queryfileid <filename>

on win10 and nightly_files, I suspect they will return the same file id, implying it’s the same directory just in a different location (which it’s clearly not) which trips the recursion check.

Yup, you are bang on the money! Ran this under cygwin:

find . -type d -exec c:/windows/system32/fsutil file queryfileid {} \; |sort |uniq -c

24 File ID is 0xffffe78e24786a20
 2 File ID is 0xffffe78e24ad89e0
 1 File ID is 0xffffe78e2663d290
 2 File ID is 0xffffe78e2686e5a0
 1 File ID is 0xffffe78e268f9310
 1 File ID is 0xffffe78e26c17010
 1 File ID is 0xffffe78e27036010
 1 File ID is 0xffffe78e282e8010
 7 File ID is 0xffffe78e28d49010
18 File ID is 0xffffe78e2b0985a0
 9 File ID is 0xffffe78e2b6e8010
 2 File ID is 0xffffe78e2b780720
 2 File ID is 0xffffe78e2b783010
 2 File ID is 0xffffe78e2bf54010

Many duplicates! Not only that, but the results were different with each invocation. The IDs of files (-type f) also differed on each invocation. Neither was true for files/dirs on native NTFS.

Don’t suppose there’s another way to do it? Or, perhaps, a switch to turn this code off including completely ignoring Windows junctions? It blows virtualbox shared file systems completely.

Just seen:

https://forum.syncthing.net/t/option-to-follow-directory-junctions-symbolic-links/14750

Do not mean any disrespect to the amazing work @xarx put into this.

Thanks for your help with, and your time spent in giving me a handle on this.

===Rich

Hi,

Just an observation I’ve made across years of using VirtualBox under Windows. On Windows 7 and higher, \vboxsvr\shared has some glitches in file access: it may be slower than attached virtual hard disks, it didn’t play well with archiving and editing zip archives from Windows Explorer, and it had trouble running some installers directly off the share. E.g. installers reported “the msi/exe is corrupted”, yet when I copied the same file over to the local VM’s desktop and ran it there, it was fine. Since then, I’ve thought of the feature as useful but unstable. Just FYI. I don’t think Syncthing can do the fixing that needs to be done in the vbox IFS filesystem driver. I always wondered why Oracle didn’t fix this.

What might help, if an alternative is OK: mount the contents in the VM via iSCSI or NFS. But yes, the shared folders are easiest.

No, there is no option to switch this off.

Hello all, I’m using Windows Server 2016 Essentials with StableBit DrivePool (over 20TB). Drives are NTFS formatted. The issue came up after the 1.7 upgrade. Until 1.6, ST worked without problems (yes, it kept 20TB+ in sync).

Is this recursion check/watch really necessary? It would be nice to have at least an option to disable it.

(Sorry no time for a more substantial contribution :slight_smile: ).

Thank you @imsodin for your contribution. Time to downgrade …

" Just to avoid any confusion: The functionality added by @xarx is not following symlinks, but treating directory junctions like directories and detecting path loops."

So, when will the adverse effects of this newly arrived (non-symlink) “feature” be addressed?

I hope this feature will not kill the soul of ST. I also thought that usage patterns would matter, as well as the known limitations.

A good day,

You should explain your setup and problem. It seems different than the OP’s.

Hi There,

Thanks for your comments (and for your brilliant android port). Yes, mounting with samba or some such thing would probably work. But convenience wins out for me, and I am using vb for simplicity and a (tiny) bit of privacy/security. So, whilst my go-fu is not at all profound, I can rise to a comment or two to kill the return nil, change the logging, and let st die out of memory if it goes infinite. I promise not to whinge if I lose my files!

It’s brilliant to have the source in git, your whole dev process exposed, modular and clean design, and, above all, the devs on hand to tell you wtf’s going on! Thank you on all those counts.

I’d defo vote for a switch to turn off infinity checking, but in the mean time, I guess I’ll hack the code which would be way more preferable (for me, at least) to spinning up samba, changing firewalls, and other hacks.

I’ll try to report this issue upstream to vb, as well.

===Rich

Hi @calmh,

You should explain your setup and problem. It seems different than the OP’s

Thank you for stepping in and asking

I have two geographically separate storage setups that are kept in sync with the help of ST. Both use the hard-drive pooling software StableBit DrivePool. Both have more than 8 NTFS HDDs in perfect shape under the virtual filesystem.

You can read a quick info about the Drivepool software in the links below (5 min. read)

–Features: Drivepool Features

–Limitations: Drivepool-known-issues-and-limitations

I’ve been using this setup for about a year, and it has worked like a charm. I think v1.7 is the first time I have seen a notice/warning. I received the notice “infinite-filesystem-recursion-detected” for some folder paths. I deleted one path (folder) involved in the notice, and after a re-scan the notice didn’t reappear. There were no shortcuts or any kind of link to a different path in that folder.

All drive volumes are reported clean by “fsutil repair state”.

This is the fsutil reported Drivepool volume output:

fsutil.drivepool.txt (488 Bytes)

I would also like to ask whether this warning stops the scan only for the path where the infinite recursion was detected, or for the entire Syncthing root folder path?

Thank you,

I assume the software you are referring to implements a virtual filesystem, in which case it depends on how it implements it, and whether it does the right thing with file ids.

It prints a warning and stops descending, but does not abort the scan. However, if you have the same issue as the OP, I suspect it might result in deletions, which, thinking about it now, is probably not ideal.

Hi Audrius,

The DrivePool filesystem software is advertised like this (from the developer’s Features section):

Advanced File System

StableBit DrivePool features CoveFS, an optimized file system specifically designed for disk pooling.
It has a virtually unlimited pool size (many Petabytes).
It's compatible with existing applications*, and is designed to function like NTFS.
It's a 100% kernel mode implementation. No user mode service dependencies or any such hacks are involved. It works like a real file system.
Advanced features:
    Alternate stream support and extended attributes.
    Full NTFS security.
    Full Windows disk caching support. Read-ahead and lazy writer caching supported along with memory mapped files.
    Full oplock support. Oplocks improve network share performance by allowing a network client to cache files on their end.
    File change notifications, for applications that watch directories for changed files.
    Sparse files.
Completely parallelized:
    Reads and writes to duplicated files happen in parallel.
    An optimized fast directory listing algorithm queries all the disks at the same time and combines the results in memory to return the list of files and directories, in real-time, as they come in from all the disks.
Zero dependencies on any external metadata:
    Plug in any pooled disk to any system running StableBit DrivePool and it is instantly visible on the pool.
    No special RAID-like format, no "tombstones" and no SQL-lite databases are involved. Everything is just plain old files.
Always shows the actual free space on the pool**. No need to reserve imaginary free space that you can't use.

“*” Some disk imaging applications may not be compatible.

“**” Some space may not be usable for file duplication, depending on drive layout

Data and file management is done, from the user’s side, entirely through the pool (e.g. the B:<yourfiles&folder> structure path). The DrivePool driver logic engine takes care of everything, like data balancing, folder duplication etc. (works like art, flawlessly).

Indeed, fsutil reports the pool (virtual filesystem) as NTFS, so I expect it to behave the same from a Windows OS point of view:

C:\Windows\system32>fsutil fsinfo volumeinfo B:
Volume Serial Number : 0xddb5743c
Volume Name : DrivePool
Max Component Length : 255
File System Name : NTFS
Is ReadWrite
Supports Case-sensitive filenames
Preserves Case of filenames
Supports Unicode in filenames
Preserves & Enforces ACL's
Supports file-based Compression
Supports Sparse files
Supports Reparse Points
Supports Object Identifiers
Supports Named Streams
Supports Extended Attributes
Supports Open By FileID 

Thank you for explaining the behavior when the recursion warning occurs. Strangely, my detected paths are reported without the root folder:

17:32:40 WARNING: Infinite filesystem recursion detected on path ‘PHONES DATA\FULL N73 card copy\Resco\Viewer\Images\200810’, not walking further down

The full path should be: “b:\LIBRARY\PHONES DATA\FULL N73 card copy\Resco\Viewer\Images\200810”. I don’t know if that matters.

fsutil file query:

fsutil file queryfileid “b:\LIBRARY\PHONES DATA\FULL N73 card copy\Resco\Viewer\Images\200810”
File ID is 0x00000000000000000000000000005200

All other files under “LIBRARY” are scanned and synced without problems.

As a precautionary measure, I had made the setup one-way only (Send->Receive) with file versioning on the target system, and just yesterday wanted to change to “Send&Receive” for all the root directories on both sides :roll_eyes:. But the recursion warnings came up, and I have rolled back to ST 1.6.1.

The error only crops up if two folders report the same file id, so if you run the query against each directory, then group by value, and if you end up with 2 files/directories with the same id, then the filesystem doesn’t respect basic principles that applications are relying on.

The marketing material or the technical specs are moot at that point.
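The "group by value" step can be sketched in Go as well. This is only an illustration under assumptions: it expects raw fsutil output with one "File ID is 0x..." line per directory (the format shown earlier in this thread, before any sort/uniq), and countFileIDs and duplicated are hypothetical helper names.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// prefix is the text fsutil printed before each ID in the output quoted above.
const prefix = "File ID is "

// countFileIDs tallies how often each ID appears in fsutil-style output.
// Lines that do not start with the expected prefix are ignored.
func countFileIDs(output string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, prefix) {
			counts[strings.TrimPrefix(line, prefix)]++
		}
	}
	return counts
}

// duplicated returns, in sorted order, the IDs reported more than once.
func duplicated(counts map[string]int) []string {
	var ids []string
	for id, n := range counts {
		if n > 1 {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids)
	return ids
}

func main() {
	// Three directories, two of which report the same ID.
	sample := "File ID is 0xffffe78e24786a20\n" +
		"File ID is 0xffffe78e24ad89e0\n" +
		"File ID is 0xffffe78e24786a20\n"
	counts := countFileIDs(sample)
	for _, id := range duplicated(counts) {
		fmt.Printf("%s reported %d times\n", id, counts[id])
	}
}
```

Any ID that shows up here for two different directories is enough to trip the recursion check if one of those directories is inside the other.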

I’ve started with the help of CYGWIN and the above @rahrah “find” command, to look for the same file-id, and the command does indeed return a file system loop on the similar ST paths:

find: File system loop detected; ‘./LIBRARY/PHONES DATA/FULL N73 card copy/Resco/Viewer/Images/200810’ is part of the same file system loop as ‘./LIBRARY/PHONES DATA/FULL N73 card copy’.

Now, how this came about, I’m not sure. This is a really old (cold) directory.

As for DrivePool’s implementation of reparse points, junction points and mount points, a lot was developed in 2013, and it supports the following:

The architecture has these positive aspects to it:

It supports file / directory symbolic links, directory junction points, mount points, and 3rd party reparse points on the pool.
 
It is a 100% native kernel implementation, with no dependence on the user mode service.
 
It follows the 0 local metadata approach of storing everything needed on the pool itself and does not rely on something like the registry. This means that your reparse points will work when moving pools between machines (provided that you didn't link to something off of the pool that no longer exists on the new machine).

Some of my previous attempts had these limitations:

Requires the user mode service to run additional periodic maintenance tasks on the pool.
 
No support for directory reparse points, only file ones.
 
Adding a drive to the pool would require a somewhat lengthy reparse point pass.

The new architecture that I came up with has none of these limitations. All it requires is NTFS and Windows.

Implementation challenges: Stablebit-drivepool-reparse-points

Here’s the full explanation on Release 2.1.0.432 BETA https://blog.covecube.com/2013/11/stablebit-drivepool-2-1-0-432-beta-reparse-points/ . I’m on StableBit DrivePool 2.2.3.1019, so the challenges have been resolved and shouldn’t be a relevant issue in current DrivePool.

Since the reported loops are few (4 directory paths, actually), I’m looking to modify (move/re-copy) the data involved, and leave the task of updating the file-id to the filesystem.

I will update the post with the relevant info.

Hello,

Just checked under both Linux and Windows, and os.Getwd() returns a full path, including, under windows, the drive.

Wouldn’t changing your push/pop stack to having full paths, rather than these file/inode IDs be a better check for cycles before you walked into a dir? Am I missing a huge elephant here?

My vote would now be for a folder level switch for infinity checking.

Sorry to belabour this!

Thanks again,

===Rich

The recursion is not represented in the path.
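A toy model may make this concrete. It is not Syncthing code: the dir type and both functions are hypothetical, and the "filesystem" is just an in-memory graph in which a junction-like entry named loop points back at the root.

```go
package main

import "fmt"

// dir is a toy directory: an identity plus named children. A child may point
// back at an ancestor, the way a junction can.
type dir struct {
	id       int
	children map[string]*dir
}

// pathRepeats walks up to maxDepth levels and reports whether any path string
// was ever seen twice. In a loop every step yields a new, longer path, so a
// path-based check never fires.
func pathRepeats(root *dir, maxDepth int) bool {
	seen := map[string]bool{}
	var rec func(d *dir, path string, depth int) bool
	rec = func(d *dir, path string, depth int) bool {
		if seen[path] {
			return true
		}
		seen[path] = true
		if depth == maxDepth {
			return false
		}
		for name, c := range d.children {
			if rec(c, path+"/"+name, depth+1) {
				return true
			}
		}
		return false
	}
	return rec(root, "", 0)
}

// firstIDCycle returns the first path whose directory identity equals that of
// one of its ancestors, or "" if there is none.
func firstIDCycle(d *dir, path string, ancestors []int) string {
	for _, a := range ancestors {
		if a == d.id {
			return path
		}
	}
	ancestors = append(ancestors, d.id)
	for name, c := range d.children {
		if p := firstIDCycle(c, path+"/"+name, ancestors); p != "" {
			return p
		}
	}
	return ""
}

// loopTree builds root -> a -> loop, where loop is the root again.
func loopTree() *dir {
	root := &dir{id: 1, children: map[string]*dir{}}
	a := &dir{id: 2, children: map[string]*dir{}}
	root.children["a"] = a
	a.children["loop"] = root
	return root
}

func main() {
	root := loopTree()
	fmt.Println("path seen twice within 50 levels:", pathRepeats(root, 50))
	fmt.Println("identity check stops at:", firstIDCycle(root, "", nil))
}
```

The paths grow as /a, /a/loop, /a/loop/a and so on without ever repeating, while the identity comparison catches the loop on its first lap. That is also why the check is only as reliable as the identities the filesystem reports.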

Yup, I spoke too quickly, sorry. All the unix code I looked at, like realpath(3), expects to know when a file in a path is a symlink. Sorry.

===R