Syncthing embedded SDK/API

rmacyn · July 15, 2021, 6:52pm

Hey Guys,

Any plans on creating an embeddable SDK/API (dll, .a or .dylib) version of Syncthing?

Basically, we’d like to embed Syncthing into our user application … without having to perform a separate install for Syncthing.

Just a thought

Thanks in advance,

-Ralph

AudriusButkevicius · July 15, 2021, 7:34pm

No, it’s an application, not a library. I am not even sure it would make sense as a library, as you still have to point it at a folder, so I am not sure what the point of a library would be.

There is a way to build it as a “library”, but it’s only there to make it easier to manage the application’s lifetime while it is embedded in a managing application.

rmacyn · July 15, 2021, 9:28pm

Makes sense … thanks for the quick response.

-R

sybren · September 30, 2023, 10:37am

I’d love to be able to embed Syncthing in our application Flamenco. It’s a render farm management system, which needs to have a way to make files available on the render nodes. Currently it relies on a NAS or other form of shared storage that’s directly accessible by all computers in the farm. It would be amazing if Flamenco itself (with the help of the embedded Syncthing) could take care of making these files available to the render nodes.

So yes, it would still have to be pointed at a directory, but Flamenco can take care of that for users. The way I’d want to do this embedding is in a way that users just experience “it works”, without having to think about how exactly.

Flamenco itself is already built in Go, by the way, so I can just import any Syncthing package directly (instead of having to go the .dll/.so route).

AudriusButkevicius · September 30, 2023, 1:34pm

And why do you need continuous bidirectional sync, with various rename detections, hashing, delta change detection, etc.? I’m not familiar with the project, but it doesn’t sound like you do?

Sounds like you need some sort of fetch-files-on-demand mechanism at best, which can go through a central node, and which you can just implement yourself?

sybren · October 9, 2023, 10:09am

You have some interesting points. I’ll go over them one by one.

continuous bidirectional sync: render farm nodes don’t just need their input, they also need to produce output that has to be sent back. Also, the output of one node (an image) can act as the input of another node (generating a video out of such images).
various rename detections: not necessarily necessary, but duplication detection is important as many render jobs will have overlapping inputs (character files are the same, the animation in each shot file is different, just to name an example)
hashing: not directly necessary
delta change detection: there will be various revisions of the same shot, and minimizing the amount of data that needs to be synced will help a lot.

Sounds like you need some sort of fetch-files-on-demand mechanism at best, which can go through a central node, and which you can just implement yourself?

With Syncthing there’s a decoupling that can happen that’s very interesting. Yes, the central node will be the “ground truth” as to what files are being rendered. But when a few nodes are in a different location, behind a network bottleneck to the central node, having those nodes sync up between each other where possible would certainly be a nice feature.

You’re absolutely right in that our product can exist and function without Syncthing. It has done so for many years already, and it’s been used for many of our productions (we have a playlist on YouTube if you’re curious).

Still, there are many studios who work with tools like Dropbox to sync their files between the artists, and this makes it hard to integrate with Flamenco. Most importantly: without explicitly building in support for each different, proprietary sync tool, Flamenco wouldn’t know when the syncing is done and the set of files should be complete. In other words: when files are missing, do we wait or do we fail?

It is my hope that with Syncthing embedded in the application itself, it’s possible to have a tighter, more robust integration.

AudriusButkevicius · October 9, 2023, 12:42pm

I have worked on render farm software in a past life. We had the same problem you have, and we solved it by writing a small, specialised piece of software that just sent render nodes the assets they needed, kept a shared local cache that nodes could reuse, and retrieved the results.

Continuous bidirectional sync suggests all nodes have all versions of all assets, which I’m certain is not what you want.

You want the render nodes to only have assets they need for the job. You don’t suddenly want to consume bandwidth downloading some 50Gb video that some other node finished producing, if you don’t need it to finish your render job. Syncthing effectively “makes two folders look the same”, and surely to utilise bandwidth effectively and not download stuff you don’t need, you need a more specialized, job aware solution.

Furthermore, media assets can’t benefit from delta change detection, as they are usually compressed/encoded in a non-incremental way (unlike, for example, a log file that only gets appended to).

sybren · October 9, 2023, 4:42pm

You make some good points, again. Thanks for thinking with me about this, much appreciated!