will hard links ever be supported?

travisfw · June 10, 2025, 7:02pm

There is one significant use case that I cannot yet use Syncthing for: keeping an off-site backup of my on-site backup. My backup is a simple filesystem sync made with rsync, and between snapshots, there are a lot of hard links. Using Syncthing as-is would multiply the size of the backup repository by … I don’t know … a lot. Of course, I could “just” do another rsync offsite, but … Any hope hard links could be a feature in a future release?

If it isn’t obvious: I won’t move off of rsync to be compatible with volumes that don’t support hard links.

I realize that adding a single volume that does not support hard links to the peers would break everything. I also realize I’m wishing, but I would like Syncthing to check if the local volume supports hard links and, if not, simply refuse to synchronize with a shared folder that has hard links enabled (so yeah, there would hypothetically be two kinds of folders now, with and without hard links). I’d even settle for running a hardlink-supporting-volume-only fork of syncthing. Maybe I’ll try vibe coding that someday.

calmh · June 10, 2025, 7:38pm

Very unlikely given all the potential complications and the niche nature of it. Maybe a copy-on-write filesystem, with the corresponding options enabled in Syncthing, would be a valid workaround.

psiberfunk · October 10, 2025, 5:57pm

I’m slightly confused by this, because in theory you ought to be able to map a hard-linked file to an already existing/hashed file path, right? Is there something preventing a hard link lookup table that avoids re-hashing a known hard-linked file?

At least on Unix-like filesystems, each file has an inode number and a link count, yes? If two paths point to the same inode (i.e., hard links), you can store the hash once keyed by inode, and re-use it for all hard links. That would let Syncthing avoid re-hashing unchanged hard-linked files, saving CPU and I/O. You can optimize db storage by only ever storing inodes that have more than one reference, so you don’t store extra inode data in the db.
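A minimal sketch of that idea, in Python for brevity. The class name `InodeHashCache` and its methods are hypothetical and have nothing to do with Syncthing's actual database; the sketch also assumes the filesystem reports stable `st_dev`/`st_ino` values, which is not guaranteed everywhere.

```python
import hashlib
import os


class InodeHashCache:
    """Content hashes keyed by inode identity (hypothetical, not Syncthing code)."""

    def __init__(self):
        self._by_inode = {}
        self.hash_calls = 0  # how many times file contents were actually read

    def _hash_file(self, path):
        self.hash_calls += 1
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
        return digest.hexdigest()

    def hash_for(self, path):
        st = os.stat(path)
        if st.st_nlink < 2:
            # Only one name: nothing to share, so keep it out of the table.
            return self._hash_file(path)
        # Size and mtime are part of the key, so a modified file (and most
        # recycled inode numbers) will miss the cache and be re-hashed.
        key = (st.st_dev, st.st_ino, st.st_size, st.st_mtime_ns)
        if key not in self._by_inode:
            self._by_inode[key] = self._hash_file(path)
        return self._by_inode[key]
```

Calling `hash_for()` on two names of the same inode reads the file contents only once; the second call is answered from the table.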

I’m sure I’m missing implementation details, but I’m not sure why it couldn’t be handled the same way rsync solves the problem.

P.S. I also got led here after realizing Syncthing is failing me for more or less the same use case the OP pointed out.

Nummer378 · October 10, 2025, 11:09pm

This is far more complex than it might sound at first. Some optimizations are certainly possible, but not everything and it gets complicated really fast.

For starters, there’s the obvious cross-platform problem: Is this solution Linux-only, or should/can it also be supported on other platforms? If we leave that aside, we quickly realize that userspace has very little inode information in Linux.

The inode information reported by e.g. stat in Linux is passed through as reported by the filesystem. This has several implications:

Firstly, inode numbers are only guaranteed to be unique per-filesystem: So when talking about comparing inode equality, you always scope yourself to the same filesystem: Figuring out if two paths are on the same filesystem is in itself not entirely trivial, and it is something that can change at any moment due to path mounting. You always open yourself up to TOCTOU races that cannot be fixed from userland.
Next, there is no requirement for filesystems to report unchanging/consistent inode numbers: A filesystem may recycle inode numbers as it pleases. For example, if a file (and its associated inode) has been deleted, the filesystem is free to re-use that inode number for a new file. So just because two files are on the same filesystem with the same inode number at different points in time that doesn’t mean that they’re the same file now: This is only valid if you’ve re-checked that the inode numbers are still the same at this point in time, which again, is subject to TOCTOU races.
Likewise, a filesystem may choose to be stateless, and report different inodes every time for the same file (some network filesystems do this). The Linux kernel itself is protected against inode confusion by using internal inode ids (assigned by the kernel) for which the filesystem is required to provide consistency as long as the kernel has caches for that file. However, these “internal inode ids” are not exposed to userland, and are in-memory only: They change on every reboot.

You can use inode ids for short-term sanity checks: If you stat two files within a very short timeframe, and both report the same metadata (inode id, file size, timestamp) you can assume that they’re the same file with a reasonable probability (still possible TOCTOU, but small). This is what’s typically done. What you cannot do reliably is use the inode id for any kind of “this is still the same file” long-term: That doesn’t fly from userland, too many things can change without the software knowing about it.
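As a rough illustration of such a short-term check (the helper names `identity` and `probably_same_file` are made up for this sketch):

```python
import os


def identity(path):
    """Return the stat fields used for a short-term 'same file?' comparison."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino, st.st_size, st.st_mtime_ns)


def probably_same_file(path_a, path_b):
    # Two stat calls back to back. They are not atomic, so a small TOCTOU
    # window remains between them; the result is a probability, not a proof.
    return identity(path_a) == identity(path_b)
```

The result is only meaningful at the moment of the call. Storing the tuple and comparing it hours later runs into exactly the recycling and remounting problems listed above.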

What you may be able to do is scan a file, remember its inode metadata, and then if you see the exact same inode metadata shortly thereafter there’s a certain probability that you’re looking at the same file (though again, never guaranteed). The longer the time between the comparison, the higher the uncertainty though.

ttf · October 11, 2025, 7:31pm

I’m not involved here, but very thankful for this explanation.

I knew about the first paragraph (file-system uniqueness), but not about the other two.

You can learn an awful lot here.

BTW, I come from the mainframe arena (USS), where each mounted filesystem has a unique device number. Is that the same in the Linux world?

Nummer378 · October 11, 2025, 8:43pm

Basically yes, but also “it’s complicated”. Each mountpoint (more specifically, the mountpoint’s mount) is associated with one device, but it’s not necessarily 1:1 - there can be multiple mountpoints for the same mount (e.g. think bind mounts - the same file system can be available under different paths or namespaces. This is also how container filesystems work).

ttf · October 12, 2025, 4:38am

Thanks!

psiberfunk · October 12, 2025, 6:30pm

I’ll admit I did not consider this level of complexity at all. I am curious, therefore: how does rsync solve this problem, and why is it inapplicable to how Syncthing works?

Rsync nominally seems to handle hard links with “--hard-links”, but perhaps I’m being naive here… I’d love to be educated about the differences.

Nummer378 · October 12, 2025, 7:07pm

Rsync does the thing I described above: When it sees a file that could be a hardlink, it stats both filenames (the original or “first” link) as quickly as possible (not atomic though) and checks if device and inode information is the same. If so, it assumes that it’s a hardlink. This isn’t free, as it requires re-querying the original path every time a “candidate” is found. It is technically possible to race this and confuse rsync - this is likely acceptable behaviour for them*.
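To make that concrete, here is a simplified sketch of the "candidate, then re-check the first link" pattern. It is not rsync's actual code; `find_hardlink_groups` is a hypothetical helper, and it only looks at one machine's local paths.

```python
import os
import stat


def find_hardlink_groups(paths):
    """Group paths that appear to be hard links of each other (hypothetical helper)."""
    first_seen = {}  # (st_dev, st_ino) -> first path seen with that identity
    groups = {}      # first path -> additional paths that link to the same inode
    for path in paths:
        st = os.lstat(path)
        if not stat.S_ISREG(st.st_mode) or st.st_nlink < 2:
            continue  # only regular files with more than one name are candidates
        key = (st.st_dev, st.st_ino)
        first = first_seen.get(key)
        if first is None:
            first_seen[key] = path
            groups[path] = []
            continue
        # Re-query the first path: it may have been replaced or deleted since
        # it was recorded. This costs one extra stat per candidate.
        try:
            first_st = os.lstat(first)
        except FileNotFoundError:
            first_st = None
        if first_st is not None and (first_st.st_dev, first_st.st_ino) == key:
            groups[first].append(path)
        else:
            # The recorded entry went stale; treat this path as the new first link.
            first_seen[key] = path
            groups[path] = []
    return groups
```

The re-check narrows the race window but does not close it: the file can still change between the second `lstat` and whatever is done with the result.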

I never said that this approach isn’t applicable to syncthing, I just said that it isn’t easy to get it right. If one can accept the drawbacks of the above approach, you can definitely do this.

*They do have a bunch of “abort if file changed” logic in the code, so if rsync sees an inode timestamp changing during the operation it will invalidate (parts of) the transfer and abort or retry. This likely catches the common case where an inode num is re-allocated with a different timestamp at an unfortunate time during the transfer. This isn’t hardlink specific though, this is a common problem when transferring files (which syncthing also has).

ttf · October 13, 2025, 4:30am

I always thought that every file is represented as a hard link - there is just a “first” one and then there are additional ones. Aren’t directory entries just filename and inode #, with the file attributes being drawn from the inode entry? All directory entries being equivalent: How does Syncthing decide whether a filename is a “first” or an “additional” hard link?

calmh · October 13, 2025, 5:22am

Apart from everything else, rsync does its thing one time in one direction for two devices. This is somewhat simpler than doing it continuously in both directions for n devices.

Yes. The inode has a link counter so you can know when a file has more than one name / hard link.
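A small demonstration of the link counter, assuming a filesystem that supports hard links (file names are arbitrary):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    first = os.path.join(d, "first")
    second = os.path.join(d, "second")
    with open(first, "w") as f:
        f.write("data")
    links_before = os.stat(first).st_nlink  # one name so far
    os.link(first, second)                  # add a second name for the same inode
    # The counter lives on the inode, so either name reports the same value.
    links_after = os.stat(second).st_nlink
    same_inode = os.stat(first).st_ino == os.stat(second).st_ino
    os.unlink(first)                        # drop the name that was created first
    with open(second) as f:
        survived = f.read()                 # the data is still reachable
```

The counter says how many names exist, not which one came first. Both directory entries are equivalent, so "first" only ever means "the one the scanner happened to see first".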