Too many Database Files, more than 30GB folder total

Hey guys,

I’m having an issue similar to this one:

where the send/receive boxes start accumulating index .ldb files until Syncthing crashes for lack of disk space (the largest so far was around 30GB). My setup consists of the following:

The server has 11 .ldb files with a total size of ~500MB. I have 4 working clients on this cluster, where the file count and total size are about the same. But the others have reached up to 15000 files, exceeding 30GB in the “index-v0.14.0.db” folder. They are sharing one folder only, where the global state is: Global State 179.846 files 9.451 folders ~694 GiB
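For anyone who wants to compare devices the same way, here is a minimal sketch of how the file count and total size can be measured. `ldb_stats` is a hypothetical helper name, not part of Syncthing, and the path in the comment is only an example.

```python
from pathlib import Path


def ldb_stats(db_dir):
    """Return (file_count, total_bytes) for the .ldb files directly inside db_dir."""
    files = [p for p in Path(db_dir).glob("*.ldb") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


# Example call (the path is an assumption, adjust it to your own setup):
# count, size = ldb_stats(r"C:\Users\me\AppData\Local\Syncthing\index-v0.14.0.db")
# print(f"{count} .ldb files, {size / 1024**3:.2f} GiB")
```

Running it on a working client and on a problematic one should make the difference obvious.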

Since I have to deploy Syncthing on multiple clients, I copy only the config file to each machine, then start Syncthing on them. I left only the server as a device in the config file, so it can introduce the other machines, plus the shared folder, so I don’t have to go to each box accepting the folder. I’m using v1.10.0-rc.3, Windows (64 bit) on all machines.

Here is the log file of one of the problematic clients (the one with 30GB and >15000 files) syncthing.log (57.9 KB)

I have tried deleting the whole db folder on one of the problematic clients, tried recopying the prefilled config file, but it just started creating infinite files again.

What might be happening is that a lot of index data is added (there’s immediately lots of new devices) and the leveldb never compacts, just keeps writing new info while write activity is ongoing.

Some possibly related/interesting info: Do the new devices already have the data, or do they start with empty folders? Do you add multiple devices at the same time to the cluster?

This makes sense. Yes, the shared folders already have files in them, and yes, I added them to the cluster with ~1 minute difference between each device. I added 7 devices this time around.

Three of the four working correctly were previously added ones (probably not with this 1 minute time difference), and the other one kind of switched places with one of the 7 I added this time, so one of the new ones is working (actually verifying files), and one of the old ones started accumulating files.

Do I have to wait for the existing files to be verified before adding another device? Or even shut down the server (or pause the folder, I don’t know if that would be enough) so devices are prevented from being introduced to each other?

edit: (sorry for all the edits, not my native language and missed some details)

Do you have a device that still has the large database and hasn’t panicked/crashed yet? If yes, then please pause all devices. If the write activity was preventing compaction, then after pausing devices compaction should happen and the db size should go down. If that has no effect, please stop Syncthing and run it again with the environment variable STGCINDIRECTEVERY=1m set. This will trigger a gc run 1 minute after starting, which might also improve things. Don’t forget to restart Syncthing without that environment variable set afterwards; it should only be used for debugging. Given how large your database is, in both cases it might take a while to have any effect. And please let me know how it goes.
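If it is easier than setting the variable by hand, the debug run could be sketched like this. Only the variable name STGCINDIRECTEVERY and the value 1m come from this thread; the helper names and the executable path are hypothetical.

```python
import os
import subprocess


def gc_debug_env(interval="1m", base_env=None):
    """Return a copy of the environment with STGCINDIRECTEVERY set for one debug run."""
    env = dict(os.environ if base_env is None else base_env)
    env["STGCINDIRECTEVERY"] = interval
    return env


def run_with_gc_debug(syncthing_exe, interval="1m"):
    """Start Syncthing with the variable set for this process only (path is an assumption)."""
    return subprocess.Popen([syncthing_exe], env=gc_debug_env(interval))


# Example (hypothetical path):
# proc = run_with_gc_debug(r"C:\Tools\syncthing\syncthing.exe")
```

Because the variable is only set on the child process, a normal restart afterwards runs without it.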


Also set the database tuning to large, at first startup. Gets you better throughput during the time when all the info gets received from other devices.
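Since the config file gets copied to every machine anyway, the setting can be pre-filled there. A rough sketch, assuming the option is stored as a `databaseTuning` element under `options` in config.xml (please check against your own config file first); `set_database_tuning` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def set_database_tuning(config_xml, value="large"):
    """Return config XML text with options/databaseTuning set to value.

    Assumes the layout <configuration><options><databaseTuning>.
    """
    root = ET.fromstring(config_xml)
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")
    tuning = options.find("databaseTuning")
    if tuning is None:
        tuning = ET.SubElement(options, "databaseTuning")
    tuning.text = value
    return ET.tostring(root, encoding="unicode")
```

Note that round-tripping through ElementTree drops comments and the XML declaration, so keep a backup of the original file.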


Actually every device with ‘infinite’ .ldb files crashed. I deleted the whole “index-v0.14.0.db” folder on one of them and it keeps throwing this error at me, even with STGCINDIRECTEVERY=1m:

[start] 19:46:04 WARNING: Error opening database: sync C:\Users\Skill Lan House\AppData\Local\Syncthing\index-v0.14.0.db\001102.ldb: Insufficient disk space. (is another instance of Syncthing running?)

I have also tried resetting everything, it stopped creating .ldb files, but I get the same error as above, probably after reintroducing a crashed device.

I noticed that on all machines the following warning appears before the above error and the subsequent 4 restarts and crash:

[ZJJ2V] 09:56:09 WARNING: Fatal error: qhvhs-lvtj6 Update(7777777-777777N-7777777-777777N-7777777-777777N-7777777-77777Q4, [394]): sync C:\Users\Skill Lan House\AppData\Local\Syncthing\index-v0.14.0.db\001540.ldb: Insufficient disk space. panic: sync C:\Users\Skill Lan House\AppData\Local\Syncthing\index-v0.14.0.db\001540.ldb: Insufficient disk space.

Lastly I deleted all the crashed devices from the server and reset the machine again, adding it alone to the cluster. The moment I accepted it on the server, I got the “Error opening database” error as above.

This is curious. Just to be sure: Do you get that error on the same device just after removing index-v0.14.0.db? What is the actual disk space available?

I tried it again on another machine just to be sure and yes, as soon as I hit the accept button on the server, it starts getting the message and crashes. I had previously deleted all crashed devices from the server, that’s why I had to accept it again.

Free space is ~28GB of 150GB total.

Another curious thing is that GC throws errors on every run before the crash:

[VYQYH] 08:15:28 WARNING: Database indirection GC failed: sync C:\Users\Skill Lan House\AppData\Local\Syncthing\index-v0.14.0.db\000008.ldb: Espaço insuficiente no disco.

Here is the log file (Timestamp 08:16:01 is where I accept the device on the server) syncthing (1).log (79.0 KB)

edit: “Espaço insuficiente no disco” means “Insufficient disk space”; the host OS language is Portuguese.

Ok, so these ~28GB get filled up in roughly 2 hours - impressive.

Did you try pausing all devices after the db grew, but before it runs out of disk space? E.g. when it is roughly 10GB. Does the db size go down a while after pausing?

Also did you set the database tuning config to large?

Actually I’m not getting these 28GB filled anymore, because I’m adding a single client to the cluster, but it still throws the Insufficient disk space error even when I have almost all of the 28GB free.

It will probably fill up when I add all the devices together, I’ll try it later on. Actually, in the current state, as soon as I accept the device on the server, it will panic and stop, so I don’t see it filling up anymore.

Also did you set the database tuning config to large?

Yes, I have.

At first I thought it was a 4GB FAT32 limitation, then read through the thread again. Now I wonder if you’re running this on a server which has disk quotas enabled?

I don’t quite understand the steps you are taking. What I thought you did was that on the device where you have the error about disk space, you delete the index-v0.14.0 directory, then start Syncthing again. That’s consistent with the log where the panic happens 2 hours after starting Syncthing. Now you are saying it panics immediately. Could you please make a list of the steps you take and what happens at each one?

No, both the server disk and the other devices have quotas disabled.

  1. Delete panicked device from server.
  2. Delete ‘index-v0.14.0’ folder from the device.
  3. Start syncthing on the device (at this point GC will throw the 'Database indirection GC failed. Insufficient disk space', while I still have more than 28GB available, besides this error it is verifying the local files as expected)
  4. Accept the device on the server (here I get the 'Error opening database. Insufficient disk space' while I still have more than 28GB available)

The 28GB filling up happened only the first time, when I added the 7 devices to the cluster together.

Can you provide logs for these steps please.

[VYQYH] 08:11:28 WARNING: Database indirection GC failed: sync C:\Users\Skill Lan House\AppData\Local\Syncthing\index-v0.14.0.db\000008.ldb: Espaço insuficiente no disco. Line 57

[VYQYH] 08:12:28 WARNING: Database indirection GC failed: sync C:\Users\Skill Lan House\AppData\Local\Syncthing\index-v0.14.0.db\000008.ldb: Espaço insuficiente no disco. Line 68

and goes on

from here, when I accept the device on the server, the first line is:

[VYQYH] 08:16:01 INFO: Established secure connection to RT63NQA-Y54GSV6-BMS6VQA-2UTHMBS-LZKELBV-6ORM5PK-ILVODYU-BIMUPAV at 192.168.1.151:22000-192.168.1.10:22000/tcp-server/TLS1.3-TLS_AES_128_GCM_SHA256 Line 100

and then:

[VYQYH] 08:16:21 WARNING: Fatal error: qhvhs-lvtj6 Update(JXSCZJM-YTNWPJA-DL24SBW-77WZUZE-LQ5QBVN-WXNSPDM-D2F47WO-GR4PKQH, [1000]): sync C:\Users\Skill Lan House\AppData\Local\Syncthing\index-v0.14.0.db\000008.ldb: Espaço insuficiente no disco. Line 131

panic: sync C:\Users\Skill Lan House\AppData\Local\Syncthing\index-v0.14.0.db\000008.ldb: Espaço insuficiente no disco. Line 132

It’s the same log as the last one I sent, I just deleted the previous unrelated runs.

Again, “Espaço insuficiente no disco” means “Insufficient disk space”
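To pull these errors out of a long log without scrolling, something like the sketch below may help. The line format is inferred only from the excerpts quoted in this thread, so treat the regular expression as an assumption; `disk_space_errors` is a hypothetical helper name.

```python
import re

# Assumed line shape, based on the excerpts above: "[TAG] HH:MM:SS LEVEL: message"
LOG_LINE = re.compile(
    r"^\[(?P<tag>\w+)\] (?P<time>\d{2}:\d{2}:\d{2}) (?P<level>[A-Z]+): (?P<msg>.*)$"
)

# The same OS error in English and in Portuguese
DISK_FULL = ("Insufficient disk space", "Espaço insuficiente no disco")


def disk_space_errors(lines):
    """Return (time, level, message) for every log line reporting a full disk.

    Lines without the "[TAG] time LEVEL:" prefix (e.g. "panic: ...") are skipped.
    """
    hits = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and any(phrase in m["msg"] for phrase in DISK_FULL):
            hits.append((m["time"], m["level"], m["msg"]))
    return hits
```

Feeding it the lines of the attached log gives a compact timeline of when the filesystem reported being full.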

syncthing (1) (1).log (23.1 KB)

I am copying directly from the log, not removing anything (just saying because it’s weird):

[start] 08:10:26 INFO: Automatic upgrade is always enabled for candidate releases.
[start] 08:10:26 WARNING: Error opening database: sync C:\Users\Skill Lan House\AppData\Local\Syncthing\index-v0.14.0.db\MANIFEST-000000: Espaço insuficiente no disco. (is another instance of Syncthing running?)
[start] 08:10:27 INFO: syncthing v1.10.0-rc.3 "Fermium Flea" (go1.15.2 windows-amd64) teamcity@build.syncthing.net 2020-09-15 17:38:23 UTC
[start] 08:10:27 INFO: Automatic upgrade is always enabled for candidate releases.
[VYQYH] 08:10:28 INFO: My ID: VYQYHHG-X7J7BD2-GW5WCIW-V5KLFQ2-QEZPIQ2-3MPGZ3Z-XMCZUKK-WXARAQF
[VYQYH] 08:10:29 INFO: Single thread SHA256 performance is 490 MB/s using minio/sha256-simd (428 MB/s using crypto/sha256).
[VYQYH] 08:10:29 INFO: Hashing performance is 409.83 MB/s
[VYQYH] 08:10:29 INFO: Migrating database to schema version 1...
[VYQYH] 08:10:29 INFO: Migrating database to schema version 2...
...

Somehow on first opening it gets an insufficient space error, then directly after restarting it doesn’t any more. A minute later the insufficient disk space error pops up again.

Looks like there’s something wrong with your filesystem. It sometimes reports it is full. No idea why, but that’s a system level error, not one from Syncthing. Before continuing to look into why the db grows so much, the filesystem must “become sanely behaved again”.
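One way to check this outside of Syncthing is to compare what the filesystem reports as free with whether a small write plus sync actually succeeds. This is only a diagnostic sketch with a hypothetical helper name (`probe_disk`), not something Syncthing itself does.

```python
import os
import shutil
import tempfile


def probe_disk(directory, probe_bytes=1024 * 1024):
    """Report free space for directory and try to write and fsync a small temp file there."""
    free = shutil.disk_usage(directory).free
    try:
        with tempfile.NamedTemporaryFile(dir=directory) as f:
            f.write(b"\0" * probe_bytes)
            f.flush()
            os.fsync(f.fileno())  # the failing operation in the logs was a sync
        ok, error = True, None
    except OSError as exc:
        ok, error = False, str(exc)
    return {"free_bytes": free, "write_ok": ok, "error": error}
```

If `write_ok` is false while `free_bytes` is in the gigabytes, the problem sits below Syncthing. A single successful probe does not rule out an intermittent failure, so it may be worth running it repeatedly.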


I may know what is going on, I just hadn’t connected the dots until you mentioned it now. These clients use disk freezing software (Time Freeze); even with the %localappdata%\Syncthing folder in its exceptions, it might have frozen the free space information on the filesystem.

Thanks for the insight so far, and sorry for the drift off the main problem, I’ll investigate further and come back after.
