Very wrong folder size in BOTH global and local state

Hi, I have just set up Syncthing on a new computer and added a new folder. However, the local state (and the global state, though no syncing has been done so far) shows almost 400 GiB as the size of this folder, while the real folder is about 20 GiB. I have checked the folder size multiple times, there are (as far as I can tell) no weird symlink things going on, and as a funny bonus I know the shown folder size cannot be correct because my hard drive is not big enough for it.

What is going on? And importantly, how can I fix this? Cheers!

Update: It seems that looking at the file size through a graphical file manager (nautilus) also gives a wrong file size, while utilities like ‘du’ show the correct size. So I’m guessing this is a bug, but not one native to Syncthing; rather, it comes from the different methods used to calculate the total folder/file size?

Any suggestions?

Screenshots will be helpful here, including both Syncthing, the file manager, and the command line utility.

Also, what kind of files are we dealing with here specifically? Are these some kind of compressed, deduplicated or sparse files?

[Two screenshots, both untitled]

And ‘du -h my_folder’ outputs 9.2G.

I have narrowed down the sub-folders which seem to be problematic. They contain netkit VMs, which show the same folder size problem when compared in the exact same way (terminal vs Syncthing/nautilus file size). I’m not quite sure what type of file it really is; the VM filesystem folders are as far as I can narrow this down.

Yeah, virtual disk files can be tricky.

Actually, now that I’m thinking about this, I’ve got the exact same thing happening on my system. I sync VirtualBox VMs with their size displayed as “25 GB” in Syncthing, while in reality they take ~15 GB on the disk.

Well, that’s it then, glad to know my system is not just crazy! I unfortunately don’t know how Syncthing (or nautilus, as in my example) calculates file sizes, but it doesn’t seem like the sort of issue that would be too hard to resolve? After all, ‘du’ apparently gets the file sizes more accurately (although you can’t just rely on a GNU coreutils tool for a multiplatform program like Syncthing).

EDIT: I can’t be sure, but I suspect this also makes the syncing of these files slower? Or am I just being paranoid now, haha.

“Size of a file [on disk]” is highly relative and depends on how you look at it.

For example, do you include the overhead of a file or not? When on an extX filesystem, do you count bytes that belong to content blocks but are actually unused? Do you count the bytes needed for the inodes? The bytes for the inode bitmaps, directory entries or group descriptor tables?

IIRC, du -h tries to estimate the size of the file as it exists on disk. It also has to make a bunch of assumptions and will never satisfy every perspective. Syncthing on the other hand uses the filesize as reported by the OS (I think).
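To see the two perspectives side by side on your own folder, here is a minimal Python sketch. It assumes a Unix-like system where `os.stat` exposes `st_blocks` (counted in 512-byte units); `folder_sizes` is just a name I made up for illustration, and this is not how Syncthing, nautilus or du are actually implemented. It only counts regular files and does not de-duplicate hard links the way du does.

```python
import os
import stat


def folder_sizes(root):
    """Return (apparent_bytes, allocated_bytes) for regular files under root.

    apparent_bytes  sums st_size, i.e. the file length the OS reports.
    allocated_bytes sums st_blocks * 512, i.e. the space actually allocated,
    which is roughly what du reports by default.
    """
    apparent = 0
    allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, device nodes, ...
            apparent += st.st_size
            allocated += st.st_blocks * 512  # POSIX only; not available on Windows
    return apparent, allocated
```

Calling `folder_sizes("my_folder")` and comparing the two numbers should show whether the gap comes from files whose length is much larger than the space allocated for them.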

The filesize as reported by the OS is usually just the number “how many bytes are in this file”. This can differ hugely from the actual “physical size on disk” if features like sparse blocks are used, which I believe is what’s happening for the virtual disk images. The files are really as huge as reported by Syncthing: that is the length the OS records for the file, and it is the number of bytes that can be read from it.

The actual physical size on disk is lower, probably because a bunch of the bytes that are part of the file are not actually stored on disk. They’re probably compressed, or left out entirely (sparse blocks).
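If you want to reproduce the effect in isolation, something like the following sketch should do it. It assumes a Unix-like OS and a filesystem that supports sparse files (most common Linux filesystems do); `make_sparse_file` is a hypothetical helper name. It extends an empty file to a given length without writing any data, then reports both numbers.

```python
import os


def make_sparse_file(path, length):
    """Create a file of `length` bytes without writing data.

    Returns (logical_bytes, allocated_bytes). On a filesystem with sparse
    file support, allocated_bytes is typically far smaller than logical_bytes.
    """
    with open(path, "wb") as f:
        f.truncate(length)  # extend the file; the gap reads back as zeros
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512  # st_blocks is in 512-byte units
```

For example, `make_sparse_file("/tmp/sparse.img", 100 * 1024 * 1024)` gives a file that tools reading the file length report as 100 MiB, while du will usually show something close to zero.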

For synchronization, the docs state this:

What things are synced?

Sparse file sparseness (will become sparse, when supported by the OS & filesystem)

Just for the record, Windows Explorer, for example, reports both sizes:

[Screenshot: Windows Explorer file properties, showing both “Size” and “Size on disk”]


Correct, but even here we can still argue that it’s “wrong”, because Windows (assuming an NTFS filesystem) does not include the MFT overhead, including most non-data attributes, which is part of the size on disk, just not of the content. It does, however, count the bytes required for partially filled clusters (blocks).

(According to this source)

What I’m trying to say is: filesize is always relative and there’s not really a universal truth that covers everyone’s perspective.

[In addition to that Microsoft has always had trouble with SI vs IEC units, which is highly annoying when working with filesizes as displayed in the Explorer]
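To put a number on the SI vs IEC difference, here is a small sketch (the function names are made up for illustration). The same byte count is shown once with decimal units (1 GB = 10^9 bytes) and once with binary units (1 GiB = 2^30 bytes).

```python
def format_si(num_bytes):
    """Format a byte count in decimal gigabytes (1 GB = 10**9 bytes)."""
    return f"{num_bytes / 10**9:.2f} GB"


def format_iec(num_bytes):
    """Format a byte count in binary gibibytes (1 GiB = 2**30 bytes)."""
    return f"{num_bytes / 2**30:.2f} GiB"


size = 25 * 10**9  # 25 GB in SI terms
print(format_si(size), "vs", format_iec(size))
```

The gap is about 7% at this scale, so it does not explain a 20x discrepancy like the one in this thread, but it is worth keeping in mind when comparing numbers from different tools.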
