Unicode errors while syncing between Mac OS X and Linux

I have been using Syncthing for quite a while between Mac OS X and Linux.

Now I have noticed Unicode errors and duplicate files. I had this problem some years ago, and now it is back. The filenames seem to be identical, but they have different encodings. Special characters show up differently when listed in a shell.

I reported it as a bug which was closed immediately.

Maybe someone can try to reproduce the error?

I am not sure if this behaviour is covered by this post:

It only says that files with combination letters are not sync’ed. In my case the files are duplicated: one with the right encoding, the other with the combination letters.

Syncthing only handles files named using UTF-8, and in the normalization form expected on your operating system. It’s possible to get duplicates with different normalization forms:

  • You copy a file from Windows to Mac (or vice versa, or Linux to/from Mac). The normalization form isn’t changed, so it’s now “wrong” on the target system.
  • Syncthing also syncs the same file. It converts normalization.

You will have to remove the offending one.
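To illustrate what "different normalization forms" means, here is a minimal Python sketch. It assumes the usual situation where the composed form (NFC) is what Linux and Windows tools typically produce and the decomposed form (NFD) is what macOS has traditionally used; the sample filename is just an example, not taken from the reporter's files.

```python
import unicodedata

# "räksmörgås", written with escapes so the source file's own encoding does not matter.
name = "r\u00e4ksm\u00f6rg\u00e5s"

nfc = unicodedata.normalize("NFC", name)  # composed: one code point per accented letter
nfd = unicodedata.normalize("NFD", name)  # decomposed: base letter + combining mark

print(nfc == nfd)          # the two strings are not equal
print(len(nfc), len(nfd))  # NFD is longer by one code point per accented letter
print(nfc.encode("utf-8"))
print(nfd.encode("utf-8"))
```

Both strings render identically in most terminals, but they are different byte sequences, so a filesystem that compares names byte by byte treats them as two separate files.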

Ah ok. This is an annoying limitation, as I get many files from different sources: Windows, Mac, Linux.

It is annoying indeed that the standards differ between systems. Changing the encoding in the transfer is the sane thing to do. Your issue stems from not doing this. This results in files that look the same but are not, and weird stuff happens.

People moving files between systems have historically been stoically accepting of special characters getting mangled. Nobody in Sweden is that surprised when a file räksmörgås ends up being called r‰ksmˆrgÂs. That’s fine. But it doesn’t work when we need to sync changes bidirectionally between systems with different encodings. Thus we need to take more care.

I wonder if there is a script that deletes these duplicate files with the wrong encoding. We could use the Syncthing log as input, as the files are listed there.
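I don't know of an existing script, but a rough sketch in Python could look like the following. Note the assumptions: it does not parse the Syncthing log (I am not sure of its exact format) but walks a folder instead, and it only reports candidates rather than deleting anything. The function names `colliding_names` and `find_normalization_duplicates` are made up for this example.

```python
import os
import unicodedata
from collections import defaultdict


def colliding_names(names):
    """Group names that become identical once normalized to NFC."""
    groups = defaultdict(list)
    for name in names:
        groups[unicodedata.normalize("NFC", name)].append(name)
    # Only groups with more than one member are normalization duplicates.
    return [sorted(group) for group in groups.values() if len(group) > 1]


def find_normalization_duplicates(root):
    """Walk root and return lists of sibling paths that differ only by normalization."""
    duplicates = []
    for dirpath, dirnames, filenames in os.walk(root):
        for group in colliding_names(dirnames + filenames):
            duplicates.append([os.path.join(dirpath, name) for name in group])
    return duplicates


def report(root):
    """Print each group of candidates; repr() makes the combining marks visible."""
    for group in find_normalization_duplicates(root):
        print("Possible duplicates:")
        for path in group:
            print("   ", repr(path))
```

Calling `report(".")` in the synced folder lists the pairs. Which copy is the "offending" one depends on the operating system you run it on, so it seems safer to compare the files by hand before removing anything.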
