This is a very bad indicator for everyone who is interested in the project, especially as it positions itself as security-centric and open source. 36% means that the developer tried some stuff. There was no attention to detail, despite an attempt at test-driven development. I know that test-driven development is a controversial topic and lots of people hate it. But having 36%, on the other hand, falls under the category of "it kind of works". It gives no reassurance to a contributor that they didn't break anything unintentionally. It also means that on each pull request someone has to do hardcore manual testing.
Very interesting topic.
There is also a bunch of integration tests which check that syncthing actually works end to end. You can only run them on Linux today, but I think every pull request, unless it's a trivial rename, gets the integration tests thrown at it, which checks whether syncthing still does what it says on the box.
In my opinion, having high line coverage means nothing. I've seen projects in my short career where the goal was to have at least 80% of the lines covered, and they would still have such amazing bugs that I couldn't understand how it was possible to make such obvious mistakes. It seemed the code was deliberately written JUST to pass the tests… which sounds a bit like TDD… So high coverage does not prove that your code is bug free.
At least for me, it’s a hobby project, this is how I spend my spare time when I want to do development. That might make me a bad developer, but given it’s my spare time, I do not feel obliged to write tests for everything that I develop just because there is no boss to whip my arse for not doing so.
As I make pull requests, I try to write tests to prove that my code works, or when I feel that my code introduces risks, which gives some reassurance to @calmh and others. Most of the stuff I write goes in as a pull request so that a second pair of eyes can stare at it, and if that second pair of eyes tells me something needs tests, I won't resist writing them.
Though I guess, given it's somebody's spare time that produced the pull request, asking them to write tests means they will have to spend even more spare time on something they might not enjoy, or something that adds no tangible value to the end user (apart from the value of not breaking things), so that might never happen.
I agree though, there should be more tests, and the line coverage should be better.
I guess @calmh should be more strict and just demand tests in pull requests, as otherwise this is how things start going downhill.
I don’t exactly know how to fix the situation we are currently in, I guess someone just needs to be punished and write tests for a whole week.
Another problem I see is that some parts of the code have no tests at all (the JS part, for example), meaning that even if someone wanted to write tests, they'd have to set everything up (the framework, the commands to run them, and so on) from scratch.
Yep, more tests would be good.
Sorry, this deserves a better answer than the slightly flippant one above.
Yeah, the test coverage sucks. That's mostly my fault, being the author of the absolute majority of those untested 64 percent. I'm a lazy programmer and syncthing has definitely been an "it kinda works" kind of project, in that it's a prototype I hacked together to scratch an itch, and because it was fun. Writing tests is slightly less fun than getting something to work. It's a sin we're working on rectifying, slowly: the test coverage is increasing.
The integration tests on the other hand do bring up the effective coverage quite significantly, but it’s not accounted for in the numbers. The integration tests run on Mac and Linux (at least; probably other unixes, and they are being slowly made more Windows friendly) and simply start a bunch of instances of syncthing and pound on it in various ways. Files are synchronized, changed, set read only, removed, removed from read only directories, etc. Many bugs have been caught and reproduced here rather than in the unit tests.
The current unit tests mostly fall into two categories - stuff that’s easy to test (like file hashing; one input, one output, you bet it’s tested) and stuff that was tricky enough to not want to try without tests. But the majority is in between - not so easy to test (perhaps that code should be rearchitected to be more test friendly), and not too impossibly tricky (looks like it should work, so it probably does, right…?). It should be fixed, indeed. Any help in that direction is appreciated.