A quick idea that just came to mind. Sometimes there are shared folders with overlapping sets of connected devices, for example one shared with my family and one with my in-laws, where my wife and I are in both groups. When I share pictures from an event via both shared folders, my wife also receives them twice on her devices.
When pulling a file from a remote, Syncthing currently notices if all its blocks are the same as those of a file in the other shared folder. In that case it copies the existing data to a temporary file and, when done, renames it to the final path. How about taking a shortcut here and simply hard-linking the final path to the existing file (if it's on the same filesystem, otherwise falling back to copying)? This would save half of the required disk space.
If a remote then changes only one of the two identical files, Syncthing assembles a new temporary file and renames that to the final path, removing the previous hard link. Kind of a copy-on-write behavior.
Of course this should be an opt-in setting per folder, with a clear warning that it creates local files that are possibly linked to others, and that in-place local modifications will affect all linked duplicates. The obvious race condition (the file not yet completed in one folder, so the second folder won't pick up the existing duplicate) could be mitigated by applying the hard-link logic during regular scans as well, deduplicating after the fact.
Of course, the existing workaround is to regularly run a duplicate file scanner with hard-linking capability over all folders. I haven't tried, though, what Syncthing does when a file is replaced with a hard link to another identical file. And it might be appealing to reuse Syncthing's scanning and block hashing to save cycles and I/O instead of doing it independently.
Would that make sense or do you see any obvious problems?