Syncthing 2.0 resource usage

I see; with this approach, not a lot can be done. Normalizing names in one of the ways described (a separate table, or a parent_id on the main table) is the only meaningful improvement then, for the "lots of files" and "lots of peers" cases. A pity, but priorities are up to the owners, for sure.

(Well, locality fixes will also help anyway, but they are hard to justify in advance.)

EDIT: being an also-coding CTO of a cloud hosting provider, I develop apps with the same degree of responsibility; for example, all of our data is backed up with a deduplicating app of mine. It exists only because Borg-like apps take the "we will keep it self-consistent in a DB" approach and the like, while mine uses self-made C-based indexes and hashes that are several times more efficient in storage and performance, which I consider important (Knuth-books-style ad-hoc data design).

I understand the concerns exactly, having also worried for years that maybe I had just written a "write-only archiver" or the like, which would ruin my business some day. However, things like Hoare-style "proof of correctness", and modern substitutes for it like unit testing and fuzzing, can make such things happen and go live.

I understand, however, the situation when you are moving in the opposite direction, and I respect the rationale for that. My appearance in this thread is just a look from the other side then, but maybe into a void, given the different view of priorities and possibilities. I appreciate the attention of any of you.

I don't think splitting names into path components is very likely to gain much, but the name itself and the version are both long strings, repeated for each row in the table. Moving each of those to a separate table and storing a name index and a version index instead would likely help a fair amount with the size.

It might make things slower though, because queries will be more complex and need to do joins for things that currently don’t.

On the other hand, we'd benefit from smaller index sizes, which might compensate for this to some extent.
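
To make the idea concrete, here is a rough illustrative sketch of such a normalization and the extra joins a plain name lookup would then need. The table and column names are made up for illustration; this is not the actual Syncthing schema.

package main

import (
	"database/sql"
	"log"

	_ "github.com/mattn/go-sqlite3"
)

// Hypothetical, simplified schema: names and versions live in their own
// tables, and per-device file rows only carry small integer references.
const schema = `
CREATE TABLE IF NOT EXISTS file_names (
	idx INTEGER PRIMARY KEY NOT NULL,
	name TEXT NOT NULL UNIQUE COLLATE BINARY
) STRICT;
CREATE TABLE IF NOT EXISTS file_versions (
	idx INTEGER PRIMARY KEY NOT NULL,
	version TEXT NOT NULL UNIQUE COLLATE BINARY
) STRICT;
CREATE TABLE IF NOT EXISTS files (
	device_idx INTEGER NOT NULL,
	name_idx INTEGER NOT NULL REFERENCES file_names(idx),
	version_idx INTEGER NOT NULL REFERENCES file_versions(idx),
	PRIMARY KEY (device_idx, name_idx)
) STRICT;`

// lookupVersion shows the extra joins a lookup-by-name needs once the name
// and version are stored indirectly.
func lookupVersion(db *sql.DB, deviceIdx int64, name string) (string, error) {
	var version string
	err := db.QueryRow(`
		SELECT v.version
		FROM files f
		JOIN file_names n ON n.idx = f.name_idx
		JOIN file_versions v ON v.idx = f.version_idx
		WHERE f.device_idx = ? AND n.name = ?`,
		deviceIdx, name).Scan(&version)
	return version, err
}

func main() {
	db, err := sql.Open("sqlite3", ":memory:")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	if _, err := db.Exec(schema); err != nil {
		log.Fatal(err)
	}
}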

here are the records, binary:

My estimate is around a 50% saving here, in this very case.

UPD: I'd say more than 50%, for sure.

UPD2: just check any of yours; all of mine are such that they will definitely gain 50% here.

UPD3: and imagine them as long Unicode names…

on 4-byte indexes / hashes:

These are index pages; they would be something like 95% smaller.

ADD: in my case, implementing everything I said would take it from roughly 3 GB of data → 1 GB of data, and 3 GB of index → 0.3 GB of index.

Yeah, I don't even know how to make an effective query for a filename split into a tree of levels. I'd focus elsewhere, since there is probably low-hanging fruit in just normalising on full file names and versions.


Doing it in a single query seems best here, so consider this approach: add a level field; then

/dir1/dir2/dir3/file1

yields

WHERE (type='dir' AND name='dir1' AND level=1) OR (type='dir' AND name='dir2' AND level=2) OR (type='dir' AND name='dir3' AND level=3) OR (type='file' AND name='file1' AND level=4) ORDER BY type

this will be 4 index walks, but that is unavoidable if we are storing a tree (I think we should)

Then, if you have fewer than 4 results, the answer is null.

If you have more than 4 results, just start looping to check whether things match; this will be a short loop of options anyway, with 4 iterations over the result set. There will be some obvious bad corner cases for sure, but they are rare.

Maybe 4 separate queries are simply better, but in my solutions I always win with a single query, no matter how strange it is.

I agree it looks cumbersome. But just from my experience: it is fine.

ADD: this all depends a lot on what exactly is happening inside most of the time, which I just don't know. I have not seen the code yet.

ADD2: but instead, imagine that you now have a directory walk in the DB! WHERE parent_id=? and that's all. You can redesign around these batches: just query all files at one level and do a readdir() on the real OS dir, which will greatly improve scanning. This will literally be a tiny fraction of the queries you issue today during a scan of large dirs.

ADD3: I'll put it this way: sometimes you just know the dir from context, or at least you can say "let's cache the last 1000 dirs in an in-memory hash". You can even avoid these strange queries against the tree-like storage, at least most of the time, by just deriving parent_id from some other source.

UPD: this may require additional locking, though
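
To illustrate that caching idea, here is a minimal sketch of my own (not Syncthing code): a bounded parent_id cache guarded by a mutex, which is the extra locking mentioned in the UPD. The names and eviction policy are arbitrary assumptions.

package main

import "sync"

// parentCache remembers the database row ids of recently seen directories,
// so most file queries can supply parent_id directly instead of walking the
// tree. It is bounded to a fixed number of entries; eviction is crude on
// purpose to keep the sketch short.
type parentCache struct {
	mut     sync.Mutex
	max     int
	entries map[string]int64 // directory path -> names.id
}

func newParentCache(max int) *parentCache {
	return &parentCache{max: max, entries: make(map[string]int64, max)}
}

func (c *parentCache) Get(dir string) (int64, bool) {
	c.mut.Lock()
	defer c.mut.Unlock()
	id, ok := c.entries[dir]
	return id, ok
}

func (c *parentCache) Put(dir string, id int64) {
	c.mut.Lock()
	defer c.mut.Unlock()
	if len(c.entries) >= c.max {
		// Simplistic eviction: drop everything when full.
		c.entries = make(map[string]int64, c.max)
	}
	c.entries[dir] = id
}

func main() {
	c := newParentCache(1000)
	c.Put("/dir1/dir2", 42)
	if id, ok := c.Get("/dir1/dir2"); ok {
		_ = id // use id as parent_id in the file queries for this directory
	}
}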

if you are doing a database change anyway, consider this approach -

id PRIMARY KEY INT

name4 INT(4) + non-unique index

name BLOB - NO index here

I understand that you will not be able to enforce a database-side uniqueness constraint on the name, which you seem to be against, but it will be just so much more efficient, like, several times more efficient… we can discuss safety measures around that, if you tell me exactly what your risk estimates are here.

UPD: you have ACID anyway, so what's the deal?

I’m not smart enough or well enough versed in SQLite to understand your sketches. Could you show concretely what the schema would be instead of

and what a simple query that just looks up by name like

would become?

I will write it out in detail. First, to progress in small steps, I will respect 100% in-DB consistency, within your current approach, so without name4 etc. Will do it now.

Let's imagine we moved names to a separate table, as you wish to do, this way. So anyway, we replace

name TEXT NOT NULL COLLATE BINARY,

with

name_id INTEGER NOT NULL

then we have a names table:

CREATE TABLE IF NOT EXISTS names (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
parent_id INTEGER NOT NULL,
name TEXT NOT NULL COLLATE BINARY) STRICT;
CREATE UNIQUE INDEX IF NOT EXISTS names_parent_id_name ON names (parent_id, name);

The question is how to query this table. The easiest way to read it is the following (I don't write Go, so pseudocode):

// Resolve the id of the row for a full path like "dir1/dir2/dir3/file1" by
// joining the names table to itself once per parent directory level.
function get_names_id_by_path (full_name)
{
  path_tokens = full_name.split ('/');
  query = "SELECT id FROM names AS leaf ";
  args = array();
  string reference_table_level = "leaf";
  for (t = path_tokens.count - 2; t >= 0; t--)
  {
    // note: the parameter placeholder must not be quoted
    query += format (" INNER JOIN names AS l{0} ON {1}.parent_id = l{0}.id AND l{0}.name = ?", t, reference_table_level);
    reference_table_level = format ("l{0}", t);
    args.push (path_tokens[t]);
  }
  query += " WHERE leaf.name = ?";
  args.push (path_tokens[path_tokens.count - 1]);
  id = SQL (query, args);
  return id;
}

Initially I proposed a different, more complex way, but this should also be fine, and it is much easier to follow. Maybe I wrote something wrong, since I did not test it; this is what should happen in the query, for dir1/dir2/dir3/file1:

SELECT id
FROM names AS leaf
INNER JOIN names AS l2 ON leaf.parent_id = l2.id AND l2.name = 'dir3'
INNER JOIN names AS l1 ON l2.parent_id = l1.id AND l1.name = 'dir2'
INNER JOIN names AS l0 ON l1.parent_id = l0.id AND l0.name = 'dir1'
WHERE leaf.name = 'file1';

You can integrate this into the initial JOIN query for the final data; that is just a lot more pseudocode, so let me know if I need to clarify more. This is the most naive implementation, but it should work OK.

Then, on name4 (I will add a bit more shortly):

consider this:

CREATE TABLE IF NOT EXISTS names (
id INTEGER PRIMARY KEY NOT NULL,
parent_id INTEGER NOT NULL,
name4 INTEGER NOT NULL,
name TEXT NOT NULL COLLATE BINARY) STRICT;

CREATE INDEX IF NOT EXISTS names_name4 ON names (name4);

this will create two smaller indexes (one for id and one for name4), but no index for the name itself. This will save index space considerably.

name4 is always populated as name4 = first-4-bytes-of-md5(name)

and always queried like this:

WHERE (name4=first-4-bytes-of-md5('file1') AND name='file1')

instead of

WHERE (name='file1')

This just avoids duplicating full file names in the index. Thus, with long names (imagine Unicode), the index will be much smaller; for typical Unicode names, it can be something like 10 times smaller.
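
As an illustrative sketch only (assuming a names table like the one above; the MD5 choice, byte order, and helper names are my own assumptions), the hint and the lookup could look like this:

package main

import (
	"crypto/md5"
	"database/sql"
	"encoding/binary"
)

// name4 returns the first 4 bytes of the MD5 of the name as an int32. It is
// only a narrowing hint for the index, never a substitute for also comparing
// the full name column.
func name4(name string) int32 {
	sum := md5.Sum([]byte(name))
	return int32(binary.BigEndian.Uint32(sum[:4]))
}

// lookupID finds the id for a name: the indexed name4 column narrows the
// candidates, the unindexed full name column confirms the match.
func lookupID(db *sql.DB, name string) (int64, error) {
	var id int64
	err := db.QueryRow(
		`SELECT id FROM names WHERE name4 = ? AND name = ?`,
		name4(name), name,
	).Scan(&id)
	return id, err
}

func main() {} // helpers only, for illustration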

UPD:

The next step is to find out how much is gained and how much is lost. This will all result in slower queries for names, for sure. One benefit is that on some devices you just cannot fit the DB into the RAM + HDD today; all these NAS people crying "cannot run" now.

For more powerful devices, it probably needs a rework of the "DB interface", it seems :frowning:

Consider this:

func (m *model) CurrentFolderFile(folder string, file string) (protocol.FileInfo, bool, error) {
	return m.sdb.GetDeviceFile(folder, protocol.LocalDeviceID, file)
}

Clearly you need to resolve the parent_id once here, and take the simplified query path without joins. Without such optimizations, all of this may hurt standard operation. So I am not sure whether just replacing the backend alone will be of much use.

It still might, because the DB size can be made something like 4 times smaller, which will fix a lot of devices. But in general, who knows.

UPD: I was reading the code, and it seems that batching is mostly done by feeding functions into each other rather than by passing arrays of items; it is scanSubdirs() that seems to need optimizing to benefit from parent_id, but from a non-Go point of view I have no idea what is going on there. However, my naive proposal here is to replace GetDeviceFile(folder, dev, filename) with GetDeviceFile(folder, dev, filename, parent_id_if_i_know_it). Maybe this can be integrated into the existing code easily: when you are iterating over a folder in the FS, just get the ID of that folder once and feed it into all file queries for that folder iteration. This way you can avoid the JOIN for the tree in the FS scan at least, which seems to be the most problematic part. And maybe in other places, using the same approach. Failed to get the ID, or just unsure? Pass null, and it will do a tree walk.
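
Purely to illustrate that proposal, a rough sketch with made-up, hypothetical signatures and a stand-in type; this is not the real Syncthing API:

package main

// FileInfo stands in for protocol.FileInfo in this sketch.
type FileInfo struct{}

// deviceFileStore sketches the proposed lookup: parentID is optional. When
// the caller (for example the scanner, while it iterates one directory)
// already knows the directory's row id, it passes it and the store can do a
// single indexed lookup on (parent_id, name); when parentID is nil, the
// store resolves the full path through the tree as before.
type deviceFileStore interface {
	GetDeviceFile(folder, device, name string, parentID *int64) (FileInfo, bool, error)
}

func main() {}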

Running that for a folder with 100k files nested inside a few levels of directories would produce a gazillion SQL queries.

I think a simple level of indirection is going to save a lot of space without butchering query performance:

CREATE TABLE IF NOT EXISTS file_names (
    idx INTEGER PRIMARY KEY NOT NULL,
    name TEXT NOT NULL UNIQUE COLLATE BINARY
) STRICT, WITHOUT ROWID

This is a "write once, use often" case. Consider the following:

  1. this will be another backward-incompatible migration, so take care to plan it well this time, not in half steps as it seems;

  2. to take full benefit, you should cache parent_id rather than going through a tree walk when you have another tree reference; in a dir walk, when entering a folder, query for the folder node once and use it afterwards; in index merge, a simple cache of ~1000 dirs will eliminate 99% of the problem.

  3. even better, preload all files of a folder when entering the folder during a scan; you can do that now with the (name>='xxx' AND name<'yyy') approach as well, but with parent_id it will be more pronounced (a sketch follows below).

  4. a huge saving on index space and less I/O to the DB files. More chance that you fit the index into the RAM of smaller NAS devices.

Up to you, for sure, but conceptually, in my view, it does not pay to bother with another incompatible migration if you are only doing about 30% of what can be done here.

UPD: to preload a folder efficiently, you can keep an additional "super-huge folder" hint, or just an approximate leaf count, to use as a threshold.
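
A minimal sketch of the preload itself, assuming a names-style table with parent_id as discussed above (the helper name is made up):

package main

import "database/sql"

// preloadDir fetches all direct children of a directory in one query and
// returns them keyed by name, so the per-file lookups during a scan of that
// directory become map hits instead of individual SQL queries.
func preloadDir(db *sql.DB, parentID int64) (map[string]int64, error) {
	rows, err := db.Query(`SELECT id, name FROM names WHERE parent_id = ?`, parentID)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	children := make(map[string]int64)
	for rows.Next() {
		var id int64
		var name string
		if err := rows.Scan(&id, &name); err != nil {
			return nil, err
		}
		children[name] = id
	}
	return children, rows.Err()
}

func main() {}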

As for a hash index (4 or 8 bytes) instead of a full index: another huge space-saving feature, most likely no change at all on large devices, and maybe magnitudes of speedup on smaller devices, which balance around "will fit / will not fit into the cache". It simply takes ~5-10 times more files before they, essentially, stop working with the dataset because of that. Once again, a huge win in my view, but it seems that you are overall skeptical about resource usage improvements, so I will write it one last time to save our time; the positions are clearly stated. Thanks for the attention anyway. (I keep repeating how Syncthing is a breath of fresh air; it was at the start and still is now.)

UPD: this applies to both the hash store and the names store, and in addition, hashes with improved locality (with a prepended 4-byte locality hint) will also lower DB page cache pressure by magnitudes.

UPD2: it will break the name> AND name< approach, and also "in alphabetical order" syncs; additional measures are needed to accommodate this change.

I've made a quick 30-minute sketch to test. Here are my findings. This is C#, straightforward to read.

database 1:

CREATE TABLE IF NOT EXISTS names (
	id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
	name TEXT NOT NULL COLLATE BINARY
)
STRICT;
CREATE UNIQUE INDEX IF NOT EXISTS names_name ON names (name);

database 2:

CREATE TABLE IF NOT EXISTS names (
	id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
	parent_id INTEGER NOT NULL,
	name TEXT NOT NULL COLLATE BINARY
)
STRICT;
CREATE UNIQUE INDEX IF NOT EXISTS names_parent_id_name ON names (parent_id,name);

Both were filled from my projects folder, using a simple dir walk: 180K files, 20K folders. Projects, like the syncthing project itself, roughly the same hierarchy as usual.

Database1: 81 MB, Database2: 20 MB.

Then I did the query code.

database 1:

static object Query1 (string path)
{
	SqliteCommand c;
	c = conn1.CreateCommand ();
	c.CommandText = "SELECT id FROM names WHERE name=@name";
	c.Parameters.AddWithValue ("name", path);
	return c.ExecuteScalar ();
}

database 2:

static object Query2b (string path)
{
	var path_tokens = path.Split ('\\');
	SqliteCommand c = conn2.CreateCommand ();
	StringBuilder ct = new StringBuilder (500);
	var reference_table_level = "ll";
	for (int t = 1; t < path_tokens.Length; t++)
	{
		ct.AppendFormat (" INNER JOIN names AS l{0} ON {1}.id = l{0}.parent_id AND l{0}.name=@name{0}\n", t, reference_table_level);
		c.Parameters.AddWithValue (string.Format ("name{0}", t), path_tokens[t]);
		reference_table_level = string.Format ("l{0}", t);
	}
	ct = new StringBuilder (string.Format ("SELECT {0}.id FROM names AS ll\n{1}", reference_table_level, ct));
	ct.Append ("WHERE ll.name=@name AND ll.parent_id=0");
	c.Parameters.AddWithValue ("name", path_tokens[0]);
	c.CommandText = ct.ToString ();
	return c.ExecuteScalar ();
}

then the benchmark:

started = DateTime.Now;
for (int i = 0; i < 10000; i++)
{
	Query1 (@"C:\Projects -common-\Arduino\DKVM1-STM32F103C6T6\Debug\Drivers\STM32F1xx_HAL_Driver\Src\stm32f1xx_hal_dma.d");
}
Console.WriteLine ("Elasped: {0}", DateTime.Now - started);

started = DateTime.Now;
for (int i = 0; i < 10000; i++)
{
	Query2b (@"Projects -common-\Arduino\DKVM1-STM32F103C6T6\Debug\Drivers\STM32F1xx_HAL_Driver\Src\stm32f1xx_hal_dma.d");
}
Console.WriteLine ("Elasped: {0}", DateTime.Now - started);

and the repeatable results:

Elapsed: 00:00:00.6381574
Elapsed: 00:00:01.4530019

Twice the time. But there was no memory or cache contention at all in this test, while the database that is 2x slower to query is now 4x smaller. Imagine we are working against this index with a limited page cache or limited RAM.

Will it be a benefit or not? Up to you. But things are not "a disaster" going this way, and in a lot of real-world scenarios it will simply be a lot faster with a smaller DB. Just for your reference.

Going to a hash index will additionally make it roughly twice as small again, without any serious change in query performance. I can check that if you are interested.

UPD: this is the query inside:

SELECT l7.id FROM names AS ll
 INNER JOIN names AS l1 ON ll.id = l1.parent_id AND l1.name=@name1
 INNER JOIN names AS l2 ON l1.id = l2.parent_id AND l2.name=@name2
 INNER JOIN names AS l3 ON l2.id = l3.parent_id AND l3.name=@name3
 INNER JOIN names AS l4 ON l3.id = l4.parent_id AND l4.name=@name4
 INNER JOIN names AS l5 ON l4.id = l5.parent_id AND l5.name=@name5
 INNER JOIN names AS l6 ON l5.id = l6.parent_id AND l6.name=@name6
 INNER JOIN names AS l7 ON l6.id = l7.parent_id AND l7.name=@name7
WHERE ll.name=@name AND ll.parent_id=0

UPD: with the name4 approach (and another random dataset):

Full names, unique index, Elapsed: 00:00:00.7169035
Level names, indexed, Elapsed: 00:00:01.7745987
Level names, name4 int32 hash index only, Elapsed: 00:00:02.4254728

name4 is like this:

CREATE TABLE IF NOT EXISTS names (
	id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
	parent_id INTEGER NOT NULL,
	name4 INTEGER NOT NULL,
	name TEXT NOT NULL COLLATE BINARY
)
STRICT;
CREATE UNIQUE INDEX IF NOT EXISTS names_parent_id_name4 ON names (parent_id,name4);

and the query example is

SELECT l7.id FROM names AS ll
 INNER JOIN names AS l1 ON ll.id = l1.parent_id AND l1.name=@name1 AND l1.name4=@nameHash1
 INNER JOIN names AS l2 ON l1.id = l2.parent_id AND l2.name=@name2 AND l2.name4=@nameHash2
 INNER JOIN names AS l3 ON l2.id = l3.parent_id AND l3.name=@name3 AND l3.name4=@nameHash3
 INNER JOIN names AS l4 ON l3.id = l4.parent_id AND l4.name=@name4 AND l4.name4=@nameHash4
 INNER JOIN names AS l5 ON l4.id = l5.parent_id AND l5.name=@name5 AND l5.name4=@nameHash5
 INNER JOIN names AS l6 ON l5.id = l6.parent_id AND l6.name=@name6 AND l6.name4=@nameHash6
 INNER JOIN names AS l7 ON l6.id = l7.parent_id AND l7.name=@name7 AND l7.name4=@nameHash7
WHERE ll.name=@name AND ll.name4=@nameHash AND ll.parent_id=0

and DB indexes stats,

full names -

*** All indices ***************************************************************
Percentage of total database......................  52.7%    
Number of entries................................. 288816    
Bytes of storage consumed......................... 33542144  
Bytes of payload.................................. 28642729    85.4% 
Bytes of metadata................................. 986147       2.9% 

level names -

*** All indices ***************************************************************
Percentage of total database......................  53.1%    
Number of entries................................. 288816    
Bytes of storage consumed......................... 13647872  
Bytes of payload.................................. 11012717    80.7% 
Bytes of metadata................................. 906433       6.6% 

level + name4 hashed -

*** All indices ***************************************************************
Percentage of total database......................  29.1%    
Number of entries................................. 288815    
Bytes of storage consumed......................... 5533696   
Bytes of payload.................................. 3976139     71.9% 
Bytes of metadata................................. 882653      16.0% 

I'm choosing not to do that at this point. The move to having names and versions in a separate table is a clear win, as it changes their cost from O(#files * #devices) to just O(#files). The optimisations on the name table are "just" changing a constant multiplier on the size of the name table, and I'm less worried about that at the moment compared to the mental and code complexity.


As for the database size, I'd recommend waiting for the patch that @calmh is currently crafting.

But there's one thing left that currently might warrant attention and could potentially be the cause of DB-related stalls: transaction handling.

https://kerkour.com/sqlite-for-servers#use-immediate-transactions

tl;dr: split the DB pool into a pool with X reader connections and a pool with a single writer connection which is created with _txlock=immediate.

I see the recommendation, but I think we’ve handled the underlying reason differently. We have a write lock that ensures we never have more than one write transaction at a time and never see SQLITE_BUSY, which seems to me like what using separate pools is intended to accomplish. Is there another reason for a pool split?

That’s 90% of the solution, but we’re still using the default transaction type DEFERRED. A dedicated writer pool can be configured at the connection level to use IMMEDIATE transactions.
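
For reference, a minimal sketch of that split, assuming the mattn/go-sqlite3 driver (where the DSN option is spelled _txlock); the file name, pragmas, and pool sizes are placeholders, not Syncthing's actual settings:

package main

import (
	"database/sql"

	_ "github.com/mattn/go-sqlite3"
)

// openPools opens one pool for reads and one single-connection pool for
// writes; the writer's connections start transactions as BEGIN IMMEDIATE,
// so they take the write lock up front instead of upgrading mid-transaction.
func openPools(path string) (readers, writer *sql.DB, err error) {
	readers, err = sql.Open("sqlite3", "file:"+path+"?_journal_mode=WAL&_txlock=deferred")
	if err != nil {
		return nil, nil, err
	}
	readers.SetMaxOpenConns(4) // the X reader connections

	writer, err = sql.Open("sqlite3", "file:"+path+"?_journal_mode=WAL&_txlock=immediate")
	if err != nil {
		readers.Close()
		return nil, nil, err
	}
	writer.SetMaxOpenConns(1) // the single writer connection

	return readers, writer, nil
}

func main() {}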