UTF8 encoding conflict with another file

First of all, thanks to the development team for this great piece of software. It has replaced ownCloud for syncing data between my different computers, and works overall much better.

The one problem I haven’t been able to solve yet is that files with non-ASCII file names don’t get synced. That seems to be a widespread issue, but my case looks different from the other ones I see discussed in this forum or in various GitHub issues. The error messages I get look like this:

INFO: File “datei-mit-äöüß-im-namen.txt” path has UTF8 encoding conflict with another file; ignoring.

That particular file is one I freshly created for this test, using Emacs under macOS. It is the only file in its folder, so I don’t see which other file it could conflict with.

The folder containing this file was created with the default options, in particular with autoNormalize=true, so this shouldn’t be a normalization issue.
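For context, the same visible file name can be encoded in more than one way. The following is a minimal Python sketch (not part of Syncthing, using only the standard `unicodedata` module) of how the composed (NFC) and decomposed (NFD) forms of the test file name differ:

```python
import unicodedata

name = "datei-mit-äöüß-im-namen.txt"

# NFC uses one code point per accented letter where possible;
# NFD splits ä, ö, ü into a base letter plus a combining diaeresis.
nfc = unicodedata.normalize("NFC", name)
nfd = unicodedata.normalize("NFD", name)

print(len(nfc), len(nfd))                      # 27 30
print(nfc == nfd)                              # False: different code point sequences
print(unicodedata.normalize("NFC", nfd) == nfc)  # True: same name once normalized
```

Both strings display identically, which is why a tool that compares names without normalizing can treat them as two different files.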

There is one particularity in my setup: my Syncthing-managed folders are kept on a ZFS filesystem in order to make them accessible from Linux as well. I am not aware of any UTF8-related issues with ZFS, but I am not much of an expert on that topic either.

Any ideas what is going wrong here?

Thanks in advance, Konrad.

My guess is that ZFS is doing this:

  1. If a file name isn’t normalized, ZFS won’t normalize it for you, and will report its name as non-normalized.
  2. However, if there is a non-normalized file name and you ask ZFS whether the normalized version of that file name exists, it will answer “yes”. That is, ZFS does normalization when looking up a file by name.

Looking at the code, this will probably trigger the behaviour you’re seeing, but I don’t know ZFS well enough to say whether this is actually how it behaves…
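The guess above can be illustrated with a toy model. This is only a sketch under stated assumptions: `NormalizingLookupFS` and `check_name` are hypothetical names invented here, and neither reproduces actual ZFS or Syncthing code. The model stores names verbatim but compares them after normalization, and the check treats "the normalized name already exists" as a conflict:

```python
import unicodedata


class NormalizingLookupFS:
    """Hypothetical filesystem: stores names as given, compares them in NFD."""

    def __init__(self):
        self._names = []

    def create(self, name):
        self._names.append(name)  # stored verbatim, never normalized

    def exists(self, name):
        key = unicodedata.normalize("NFD", name)
        return any(unicodedata.normalize("NFD", n) == key for n in self._names)


def check_name(fs, name):
    """Hypothetical scanner check, assumed for illustration only."""
    normalized = unicodedata.normalize("NFC", name)
    if normalized == name:
        return "ok"
    if fs.exists(normalized):
        # On this filesystem the lookup matches the file itself.
        return "conflict"
    return "would rename"


fs = NormalizingLookupFS()
stored = unicodedata.normalize("NFD", "datei-mit-äöüß-im-namen.txt")
fs.create(stored)
print(check_name(fs, stored))  # conflict
```

With only one file present, the check still reports a conflict, because the lookup for the normalized name finds the non-normalized file itself.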

Thanks for exploring!

I checked the ZFS documentation and the specific configuration of my ZFS pool. ZFS never normalizes filenames before storing them. Normalization happens only “as part of any comparison process”; the documentation isn’t more explicit about what exactly a “comparison process” is.

My pool is configured to allow only valid UTF8 filenames, and to use a normalization algorithm that ZFS calls “formD” for comparisons.

Is there anything I can configure in Syncthing to disable the check that fails? It looks as if ZFS already guarantees the absence of UTF8-related name conflicts on my system.

Looking at the code, I think the only thing you can do is to normalize the filename yourself, using something like convmv --notest -f utf-8 -t utf-8 --nfc filename (untested).

I suspect this will be picked up and fixed fairly quickly…

There may also be some fun around the fact that you’re using normalization form D, and Syncthing’s assuming you’re using normalization form C (as that’s the standard for non-mac systems). I don’t know what the consequences of this would be, whether you’d need --nfd instead of --nfc above, etc.
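To find out which form the names in a folder are actually stored in before deciding between --nfc and --nfd, a small script can classify them. This is a minimal sketch using only the Python standard library; `classify` and `report` are hypothetical helper names, and the result reflects whatever `os.listdir` returns on the system where it runs:

```python
import os
import unicodedata


def classify(name):
    """Return which normalization form(s) a name is already in."""
    is_nfc = unicodedata.normalize("NFC", name) == name
    is_nfd = unicodedata.normalize("NFD", name) == name
    if is_nfc and is_nfd:
        return "both"     # e.g. pure ASCII names
    if is_nfc:
        return "NFC"
    if is_nfd:
        return "NFD"
    return "neither"      # mixed composed and decomposed characters


def report(directory):
    """List (name, form) pairs for the entries of a directory."""
    return [(name, classify(name)) for name in sorted(os.listdir(directory))]
```

Calling `report(".")` in the synced folder lists each entry with its form; names reported as "NFD" or "neither" are the ones a normalizing tool would rename.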

Mind giving this build a spin, see if it fixes the issue? https://build.syncthing.net/viewLog.html?buildId=15118&buildTypeId=Syncthing_BuildLinuxCross&tab=artifacts

I’m half-expecting it to complain with Error normalizing UTF8 encoding of file "...": ....

I don’t see a macosx build there.

EDIT: I found the Mac builds elsewhere on the server!

Hadn’t realised you were on mac. Try here: https://build.syncthing.net/viewLog.html?buildId=15120&buildTypeId=Syncthing_BuildMac&tab=artifacts

No more error message. On every scan of the folder, I see the message

Normalized UTF8 encoding of file name “datei-mit-äöüß-im-namen.txt”.

The file gets transferred correctly to a Linux system (ext4 filesystem), including the filename. In other words, everything looks as if my problem is solved, although I will do some more experiments to be sure.

Thanks!

You see that on every single scan? Hmm, that’s not right.

Yes, about once per minute, which is the scan frequency for the folder.

Mind trying this one? https://build.syncthing.net/viewLog.html?buildId=15176&buildTypeId=Syncthing_BuildMac&tab=artifacts

Do make sure you’ve backed up any files with unnormalized names, just in case…

It’s back to “UTF8 encoding conflict with another file”.

Damn, sorry. One last time? 🙂 https://build.syncthing.net/viewLog.html?buildId=15186&tab=artifacts&buildTypeId=Syncthing_BuildMac

No more error message or warning. All files get correctly copied to a Linux machine. Files with UTF8 filenames generated on the Linux machine get copied to the Mac correctly as well. In short: no more problem ☀️

BTW, I don’t mind testing builds at all. I can’t remember any other software with such easy deployment. Download, unpack, run.

I’ll give this some more extensive testing over the weekend!

Thanks! I don’t have a ZFS setup here, so I’m somewhat reliant on your help.

Let’s see if I’ve done anything stupid…

I have the same issue on a Linux ext4 filesystem with characters like ä in filenames. Would it help if I tested your build, too? I am on v0.14.43 now.

Try v0.14.44-rc.3: that’s got this fix in.
