RESOLVED: Dropping index entry for … contains invalid path separator


#1

Continuing the discussion from Dropping index entry for … contains invalid path separator:

This is still happening in the latest version 0.14.47, but I figured it out …

First thing I noticed: This is coming from my Android phone …

And finally I checked my “Phone Pictures” folder:

And there it is!

So there are actually files with stupid fully qualified windows-style paths as their file names …
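For anyone hitting the same thing: such files can be located before deleting them by walking the folder and flagging every name that contains a backslash. This is only a sketch. `find_backslash_names` is a made-up helper, not part of Syncthing, and it assumes it runs on a system where a backslash is a legal filename character (Linux, Android).

```python
import os

def find_backslash_names(root):
    """Return paths under root whose file or directory name contains a backslash."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if "\\" in name:
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Calling `find_backslash_names` on the synced folder lists the offending entries so they can be reviewed before removal.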

I will delete them now.

But maybe you can find a way to handle such madness gracefully.

Thanks for the great work you are doing!

Greetings Fino;


#2

Update: Even though I deleted the bad files and added the whole Skype folder to the ignore list as “(?d)Skype”, I am still getting the error message.

Maybe now it is up to you to remove deleted files from the DB, even if they have invalid filenames?!

Here again the error message:

2018-05-26 14:44:47: Dropping index entry for Skype/C:\Users\Jan\AppData\Local\Packages\Microsoft.SkypeApp_kzf8qxf38zg5c\LocalState\1fda1808-31e5-4b2a-95ce-9ee2a63c2c34_20180411_170042865.jpg, contains invalid path separator
2018-05-26 14:44:47: Dropping index entry for Skype/C:\Users\User\AppData\Local\Packages\Microsoft.SkypeApp_kzf8qxf38zg5c\LocalState\94db313a-c5fd-4aac-8ddf-5b02cfb3c76f_20180406_113030260.jpg, contains invalid path separator
2018-05-26 14:44:47: Dropping index entry for Skype/C:\Users\amuel\AppData\Local\Packages\Microsoft.SkypeApp_kzf8qxf38zg5c\LocalState\3695e692-3464-4482-92be-1a74cdbe9440_20180409_161725895.jpg, contains invalid path separator

(Audrius Butkevicius) #3

They will not be deleted; things are preserved in the database indefinitely. You can reset the database on the other side in the hope of getting this fixed, but make a backup first.


#4

I don’t want to reset the database.

I have several devices and a lot of files. Resetting the database means everything must be rescanned. This is a lot of work for me and my devices.

Why are these deleted files kept in the database?

Would this mean the database will grow indefinitely with files being added and deleted again?

My assumption was that deleted files will not persist in the database.

What if I delete the folder and re-add it?


(Kluppy) #5

It will be added again with a new index counter. This is the same as when it is changed or deleted… All of these things are seen as a change to the state and so are stored until another update to the file or directory.

If you meant the same files, no. The latest state is remembered, not every state it has ever been in.

But if you mean new never before seen files added then deleted then yes. Every new file that is deleted is remembered as being deleted.

This is one of the problems with a truly distributed system. When a file was deleted has to be remembered in case a long-dead node comes online again and has to synchronise with the cluster.
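The idea of remembering deletions can be sketched with a toy index that keeps a "tombstone" instead of removing the entry. Everything here (`Entry`, `Index`, the single integer version) is a simplified assumption for illustration and not how Syncthing's database is actually laid out.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    name: str
    version: int          # simplified: one counter instead of a version vector
    deleted: bool = False

class Index:
    def __init__(self):
        self.entries = {}

    def update(self, name):
        prev = self.entries.get(name)
        version = prev.version + 1 if prev else 1
        self.entries[name] = Entry(name, version)

    def delete(self, name):
        prev = self.entries[name]
        # Keep a tombstone rather than removing the key.
        self.entries[name] = Entry(name, prev.version + 1, deleted=True)

    def merge_from_peer(self, peer_entry):
        """Accept the peer's entry only if it is newer than what we hold."""
        mine = self.entries.get(peer_entry.name)
        if mine is None or peer_entry.version > mine.version:
            self.entries[peer_entry.name] = peer_entry
            return "accepted"
        return "ignored"
```

Because the tombstone carries a higher version than the old file, a node that was offline and still announces the old version is ignored instead of resurrecting the file. Only the latest state per name is stored, so the index grows with the number of distinct names, not with the number of changes.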


(Simon) #6

Yes, that works. It drops the data about this folder from the db, and it is probably what we should recommend trying before resetting the entire db if the problem is localized to one folder.


#7

I also added the sub-directory containing the bogus files to the ignore-list on all clients with the “(?d)” option.

This seems to have no effect though.

Wouldn’t it make sense to delete entries from the DB in this case?


(Audrius Butkevicius) #8

No, because we’ve advertised them before to others.


(Simon) #9

In general you can’t drop anything from the db. Simple “explanation” is that dropping an entry would have to be a cluster coordinated action (otherwise there is no difference between a genuinely missing item and a “dropped” one), but there is no such mechanism. However currently (meaning on rethinking this I might well find a catch) I don’t see a reason not to drop a file that only exists locally and is either deleted or invalid. I can’t even find an argument against dropping files that are deleted/invalid on all devices we know. Nobody will ever request an invalid/deleted file. Then again, this “feels” wrong, as if the catch is just waiting at the back of my head :stuck_out_tongue:

In general this is yet another reason why we need a case insensitive mode and extend it to be a “canonical path” mode - i.e. not only normalizing case, but also restricting other path features to the lowest common denominator. Yeii, fun


(Audrius Butkevicius) #10

You still can’t drop it, because if it becomes undeleted or unignored, and there is some offline peer with that file, losing the version history would mean conflicts.


(Simon) #11

The offline device’s item also was invalid/deleted, otherwise you wouldn’t have dropped it. So if it still is that way, your newly created valid item will win the conflict. If the offline device created something valid there too, it is a genuine conflict.


(Audrius Butkevicius) #12

How do you know, what if this magical device is three hops away and you can’t even see it?


(Simon) #13

Ah no you’re right anyway, even without the magical three hops away device. Not because it becomes a conflict (I think that’d be the right thing? Doesn’t matter though xD ), but because it might not become a conflict. If any other device (magical or not) doesn’t drop the item with say version {yourid: 10, theirid: 1} and then they create the item at the same time as you do, they end up with version {yourid: 10, theirid: 2} while you dropped the item, so have version {yourid: 1, theirid: 2}, which will lose without a conflict to the other.
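The version example above can be checked with a small version vector comparison. This is a sketch only: `compare` is a generic dominance check written for this post, and `yours_if_kept` is an assumed value showing what you would have had if the old history had been kept and your counter bumped from 10 to 11.

```python
def compare(a, b):
    """Compare two version vectors given as dicts of device id -> counter.

    Returns "equal", "greater" (a dominates b), "lesser" or "concurrent".
    """
    ids = set(a) | set(b)
    a_ahead = any(a.get(i, 0) > b.get(i, 0) for i in ids)
    b_ahead = any(b.get(i, 0) > a.get(i, 0) for i in ids)
    if a_ahead and b_ahead:
        return "concurrent"
    if a_ahead:
        return "greater"
    if b_ahead:
        return "lesser"
    return "equal"

theirs = {"yourid": 10, "theirid": 2}           # they kept the history
yours_after_drop = {"yourid": 1, "theirid": 2}  # you dropped it and started over
yours_if_kept = {"yourid": 11, "theirid": 1}    # assumption: history kept, counter bumped

print(compare(yours_after_drop, theirs))  # lesser
print(compare(yours_if_kept, theirs))     # concurrent
```

"lesser" means your new item is silently overwritten, while "concurrent" is the case that would have been reported as a conflict.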

Moral of the story: Don’t tamper with that **** if you don’t need to.

And to make it clear that I don’t mean to dismiss @Finomosec’s issue: Dropping warnings all the time due to a deleted, ignored and unluckily invalid filename definitely isn’t cool. Maybe it would be better to just silently drop that? Sure, the user might never notice then, but that wouldn’t be different from e.g. symlinks.


(Audrius Butkevicius) #14

It wouldn’t.

We have a file at version 1. A, B and C all have it. C goes offline. A deletes the file (version 2), B follows and deletes it too, and both then drop the entry along with its version. A creates the file again with different content, at version 1. C comes online and says: hey, I have version 1, all good.
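The same sequence as a toy simulation. This is a deliberately simplified sketch: versions are plain integers, and `needs_sync` is a hypothetical check made up for this example, not a Syncthing function.

```python
# Each device maps file name -> (version, content); a missing key means "no record".
a = {"f": (1, "old")}
b = {"f": (1, "old")}
c = {"f": (1, "old")}   # C goes offline here, holding version 1

# A deletes the file (version 2), B follows; both then drop the record entirely.
del a["f"]
del b["f"]

# A recreates the file with different content; with no history it starts at 1 again.
a["f"] = (1, "new")

def needs_sync(local, remote, name):
    """Hypothetical check: pull only if the remote version is higher than ours."""
    return remote[name][0] > local.get(name, (0, None))[0]

# C comes back online and compares its record with A's announcement.
print(needs_sync(c, a, "f"))   # False
```

C sees the same version number and does nothing, even though the content on A is different.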

It’s more complicated than that, but you get the drill.


(Simon) #15

Wow, I should have just put posting that on standby and rethought it tomorrow or something - so much wrong with it. Thanks for finding the catch(es) :wink:


#16

Wow, this is more complicated than I thought. Thanks for trying to solve it anyway.

I have two ideas about the “common denominator” for filenames …

First: Maybe there is a way to escape the path with the invalid symbols in it. But my guess would be Windows can’t do this.

Second: You could add a mode like “only sync compatible filenames”, either by adding an option to manually enable/disable it (like the sync permissions flag), or by autodetecting the connected devices (which might be a problem if a new Windows device is added later), or by simply hardcoding it to always be enabled (which again might cause problems if you have only Linux devices and want to sync weird filenames).

Anyway what the option does is: only allow filenames that are valid on all types of devices and maybe creating a warning on the SOURCE-device when an invalid filename is detected saying “this file can not be synced”.
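A rough idea of what such a check could look like, using the Windows naming rules as the common denominator. This is a sketch only: `is_portable_name` is a hypothetical helper and nothing like it is claimed to exist in Syncthing. It checks a single path component, not a whole path.

```python
# Characters Windows does not allow in a file name.
WINDOWS_RESERVED_CHARS = set('<>:"/\\|?*')

# Device names Windows reserves, with or without an extension.
WINDOWS_RESERVED_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)

def is_portable_name(name):
    """Return True if a single path component is valid on Windows, Linux and macOS."""
    if not name or name in (".", ".."):
        return False
    # Reserved characters and control characters are rejected.
    if any(ch in WINDOWS_RESERVED_CHARS or ord(ch) < 32 for ch in name):
        return False
    # Windows does not allow a trailing space or dot.
    if name[-1] in " .":
        return False
    stem = name.split(".")[0].upper()
    return stem not in WINDOWS_RESERVED_NAMES
```

The source device could run this per path component while scanning and show the “this file can not be synced” warning for anything that fails.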

I hope this helps.

Greetings Fred;


#17

Oh and one more idea, which should be uncritical:

You could at least suppress the “dropping invalid file from index” warning if the file is being ignored.


(Simon) #18

Conceptually the problem is already well laid out, the problem isn’t there anymore, it’s in the actual implementation (and details/problems that will then come up).

As to suppressing the warning if ignored: That warning happens at the protocol level, where there is no access to ignore rules. I still think demoting this to debug level is fine and consistent.


(Audrius Butkevicius) #19

I am not sure I understand why this happens repeatedly tho. Also, if we drop things, how do delta indexes work?


(Simon) #20

Good question, it should only happen on initial index exchange. I don’t think it should affect the delta index mechanism on the receiver side. If there is any new change with a higher sequence number, that will just become the highest number we have - i.e. the invalid file should just be skipped. It remains a problem on every device (re)connect if the invalid file has the highest sequence number, because then the receiving side will never increment its max sequence number.
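A simplified model of that behaviour, to show why the warning repeats only when the invalid entry has the highest sequence number. The function names and the tuple format are assumptions made up for this sketch, not the actual protocol code.

```python
def entries_since(entries, max_seq):
    """Sender side: only entries newer than what the receiver already has."""
    return [(seq, name) for seq, name in entries if seq > max_seq]

def receive_index(entries, max_seq):
    """Receiver side: returns (accepted names, dropped names, new max sequence)."""
    accepted, dropped = [], []
    for seq, name in entries:
        if "\\" in name:
            # Invalid entry: dropped with a warning, sequence not recorded.
            dropped.append(name)
            continue
        accepted.append(name)
        max_seq = max(max_seq, seq)
    return accepted, dropped, max_seq

# Invalid entry in the middle: a later valid entry moves max_seq past it.
middle = [(1, "a.jpg"), (2, "C:\\bad.jpg"), (3, "b.jpg")]
_, _, seq_middle = receive_index(middle, 0)

# Invalid entry last: max_seq stays at 2, so it is sent again on every reconnect.
last = [(1, "a.jpg"), (2, "b.jpg"), (3, "C:\\bad.jpg")]
_, _, seq_last = receive_index(last, 0)

print(entries_since(middle, seq_middle))  # []
print(entries_since(last, seq_last))      # [(3, 'C:\\bad.jpg')]
```

In the second case every reconnect resends the invalid entry, which matches the repeated log lines.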