Assumed Audience: Hackers who like humble brags. 😛
Epistemic Status: You decide.
On Feb 5, 2023, I posted on Hacker News the article that became my most popular one on that site.
It is called, “Why I Use C When I Believe in Memory Safety.”
I will let you judge whether or not I justified the decision to use C.
But one of my programmer idols, Daniel Stenberg, recently wrote about his experience writing C and how well he did.
Well, I do have a C codebase in “wide” use, and while it is not nearly as old or widely used as curl, it is still data.
I have kept careful track of all “C mistakes” (as Daniel put it), and there are six, about 1 per year. (I started bc in January 2018.)
Daniel compares his work against the 70% number, so I’ll do the same.
Those links say 70% of all security bugs, but since my bc/dc does not have the ! command (to spawn processes) and only reads files, not writes them (except stdout), it’s hard to actually turn a memory safety issue into a security bug, even if it’s possible.
And I gotta count something, so I’ll just count all bugs. C is so unsafe that all bugs could be security bugs anyway.
Okay, so we have six C bugs; how many bugs do we have total?
This information is up-to-date as of the publication of this post.
On my personal code forge, the issue count is four, and one of those is not a bug, but a rant from someone who doesn’t like my copyright notice policy.
So three issues (#3, #8 and #15), none of which are any of the six C bugs.
Let’s look at pull requests on my personal code forge.
There are five, two of which are for the same problem: whether to use POSIX 2008 and not use _XOPEN_SOURCE (#1 and #2).
There is also another that is not technically a bug (#6) since it just silences a Clang warning.
Those could be classified as bugs or not, but I won’t count them, so as not to inflate my numbers.
So two pull requests (#5 and #7) are for bugs, neither of which is one of the six C bugs.
So three issues and two pull requests for bugs: five total.
Now let’s look at GitHub.
There are 38 issues.
I classified 17 as not bugs:
- #2 (manual problem)
- #29 (feature request)
- #30 (release problem)
- #32 (license problem)
- #33 (README problem)
- #37 (question)
- #43 (feature request)
- #45 (packaging problem)
- #46 (release problem)
- #47 (feature request)
- #52 (release problem)
- #58 (question)
- #59 (praise and suggestion)
- #61 (bug in downstream test code)
- #65 (question)
- #69 (feature request)
- #70 (feature request)
And I classified 21 as bugs:
- #1 (build bug)
- #3 (build bug)
- #31
- #34 (build bug)
- #35 (build bug)
- #36 (test bug)
- #39
- #40
- #41 (test bug)
- #42
- #48
- #50
- #53 (build bug)
- #55
- #56 (build bug)
- #60 (build bug)
- #62 (test bug)
- #63 (build bug)
- #64
- #67 (build bug)
- #71 (build bug)
If bc failed to build on a platform it previously built on, or had some other unexpected build problem, I usually classify those as bugs.
Those bugs are labelled with “build bug” above.
Same thing with “test bug” items; those are items where the problem was in the test suite.
Of the issues and pull requests on my personal code forge, all of the issues (#3, #8 and #15) are build or test bugs, and both pull requests (#5 and #7) are for real bugs.
I’ll run numbers later with both types removed as bugs.
Now, for GitHub pull requests: there are 34.
I classified 23 as not bugs:
- #5 (build improvement, but not a bug)
- #9 (build improvement, but not a bug)
- #10 (build improvement, but not a bug)
- #11 (build improvement, but not a bug)
- #13 (feature request)
- #15 (feature request)
- #16 (feature request)
- #17 (feature request)
- #18 (feature request)
- #19 (feature request)
- #20 (feature request)
- #21 (feature request)
- #22 (feature request)
- #23 (feature request)
- #24 (style fixes)
- #26 (translation addition)
- #28 (typo fix)
- #38 (refactoring)
- #49 (typo fix)
- #54 (build improvement, but not a bug)
- #57 (documentation fix)
- #68 (feature request)
- #72 (feature request)
And I classified 11 as bugs:
It may seem like #14 was not a bug, but a compiler could have considered the mistake as undefined behavior.
In addition, since my bc is used on FreeBSD, there are five bugs reported there.
I have classified two as not bugs:
And I have classified three as bugs:
268076, in particular, has three bugs, all of which were separate and reported by the same user in the course of trying to fix things. Also, it is not a memory bug because the pointer that caused the crash was explicitly initialized to NULL.
In addition, there is one issue that was reported on my personal code forge that I lost through poor system administration. It was #10, and it was titled:
“scale” not set correctly with -l when first command is a syntax error
and I fixed it in 299a4fd353.
This one was a real bug.
And that takes care of any bugs that I know were reported.
So I have:
- 3 issues from my code forge.
- 2 pull requests from my code forge.
- 21 issues from GitHub.
- 11 pull requests from GitHub.
- 5 bugs from 3 issues from https://bugs.freebsd.org/.
And since the six C bugs were actually reported by nobody, but found by myself, we can add them to the total.
So we have 48 bugs, six of which are C bugs. That’s 12.5%, far better than the average of 70%.
However, that includes build and test bugs, of which there are 21. If we take them out, we have 27 bugs left, and 6 of 27 is 22.2%, still far better than average.
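For anyone who wants to check my math, here is a minimal sketch of the tally in Python. The counts are copied straight from the lists above; the variable and function names are just mine for illustration.

```python
# Reported bug counts, copied from the tallies in this post.
reported = {
    "code forge issues": 3,
    "code forge pull requests": 2,
    "GitHub issues": 21,
    "GitHub pull requests": 11,
    "FreeBSD bug tracker": 5,
}
c_bugs = 6                # found by me, reported by nobody
build_and_test_bugs = 21  # figure stated in this post

total = sum(reported.values()) + c_bugs
without_build_test = total - build_and_test_bugs


def share(part, whole):
    """Return part/whole as a percentage."""
    return 100 * part / whole


print(f"{c_bugs} of {total}: {share(c_bugs, total):.1f}%")
print(f"{c_bugs} of {without_build_test}: {share(c_bugs, without_build_test):.1f}%")
```

That should print 12.5% for the first line and 22.2% for the second.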
And this is where I get to whine!
You see, I’ve only included bugs that were reported on code forges; what about bugs in releases that I found myself? I found all six of the C bugs myself, so why shouldn’t I count the other bugs I found myself?
So let’s do that.
After doing a grep on my Git log for “bug”, “Bug”, “issue”, “Issue”, “crash”, “Crash”, “fix”, and “Fix”, grabbing a rough estimation of what was actually in a release, and then removing duplicates (including for reported bugs), I got this list:
- f3e5367e1f1
- f4b43239652
- ff24cbb8332
- 945873ddc95 (see also 54c0facdce8)
- 54c0facdce8
- 85ac0e9cbed
- 9e5ae8441f7
- bc5d89e41f9
- d1f5f355599
- 2e9146cc105 (build bug)
- 8e787bc56de (see also aa19025bab1)
- ae186e65ad3
- 0c1abbe6dff
- 52670bd9b9a
- 1156cd1bdee
- 77ef954773c (build bug)
- 8850fa7919f (build bug)
- 8a1d001dcfc
- 701068c5a5a
- c82e10fa686
- 4d3505dbcd7
- 1d8f1e5f77d
- f46644247f9
- f22634b3185
- 3d192692a3c (test bug)
- f60ce30fd5c
- 4277d5ad1a3
- 81905ee8e22
- e0fb717308e
- 04ef32e888f
- 3a66b416bd1
- e9ff9a45f5d (test bug)
- 26daf98c4ef
- 2b65eb21cfc (build bug)
- 8e83f103a8d
- f71970ae6a7
- 107c53112d4
- 78bc9d0c742 (see also 2484c9b6001 and 92ce6d79ec9)
- 96f6dc9e539 (test bug)
- 51ca77bd1a1 (build bug)
- 21641355b30 (build bug)
- 37e3d2ad7e9 (test bug)
- 2555a5e66dc
- da4a533759a
- b56c24c20e9
- 61e990a4002
- 2092821da59
- be2f6afca33
It may have some non-release bugs in there, but it’s also probably not complete, so whatever.
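If you want to try the same kind of search on your own repository, here is a rough sketch of the keyword filter in Python. It is not the exact command I ran; it assumes input in the format `git log --oneline` prints (an abbreviated hash, a space, then the subject line), and the sample commits below are made up. It only does the keyword matching and de-duplication; deciding what actually shipped in a release is still manual work.

```python
import re

# Case-insensitive, so it also matches spellings such as "BUG" that the
# eight literal search terms in this post would not.
KEYWORDS = re.compile(r"bug|issue|crash|fix", re.IGNORECASE)


def candidate_commits(oneline_log):
    """Return hashes whose subject mentions a keyword, in order, without duplicates."""
    hashes = []
    for line in oneline_log.splitlines():
        commit, _, subject = line.partition(" ")
        if commit and KEYWORDS.search(subject) and commit not in hashes:
            hashes.append(commit)
    return hashes


# Hypothetical sample input; these are not real commits from bc.
sample_log = """\
1111111 Fix a crash when parsing an empty file
2222222 Update the manuals
3333333 Add a test for issue 42
1111111 Fix a crash when parsing an empty file
"""

print(candidate_commits(sample_log))
```

On the sample input, only the first and third commits survive, and the repeated line is dropped.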
Anyway, there are 48 bugs there, so we now have a total of 96. Six of 96 is 6.25%.
If we take out the build and test bugs, we have 37 of the new bugs plus 27 old ones, which is 64. Six of 64 is 9.375%.
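Here is the same arithmetic as a small Python sketch, in case you want to plug in your own numbers. All of the figures are the ones stated in this post; the names are only for illustration.

```python
c_bugs = 6

reported_total = 48             # reported bugs plus the six C bugs
reported_non_build_test = 27    # the same, minus build and test bugs
self_found_total = 48           # commits in the list above
self_found_non_build_test = 37  # figure stated in this post

scenarios = {
    "all bugs": reported_total + self_found_total,
    "excluding build and test bugs": reported_non_build_test + self_found_non_build_test,
}

# Percentage of bugs that were C bugs in each scenario.
percentages = {label: 100 * c_bugs / whole for label, whole in scenarios.items()}

for label, whole in scenarios.items():
    print(f"{label}: {c_bugs} of {whole} = {percentages[label]}%")
```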
So with those numbers, am I a good C programmer?
I’ll let you decide.
But I’ll just add this: if this is what I can do in my spare time, imagine what I could do if I got paid.
Okay, I am NOT a good C programmer; I just may be better than average.
Betteridge’s Law still holds.