A few billion lines of code later: using static analysis to find bugs in the real world
Citation: Al Bessey, Ken Block, Ben Chelf, Andy Chou, Bryan Fulton, Seth Hallem, Charles Henri-Gros, Asya Kamsky, Scott McPeak, Dawson Engler (2010/02) A few billion lines of code later: using static analysis to find bugs in the real world. Communications of the ACM, Volume 53, Issue 2
DOI (original publisher): 10.1145/1646353.1646374
Semantic Scholar (metadata): 10.1145/1646353.1646374
Sci-Hub (fulltext): 10.1145/1646353.1646374
Internet Archive Scholar (search for fulltext): A few billion lines of code later: using static analysis to find bugs in the real world
Download: https://dl.acm.org/doi/abs/10.1145/1646353.1646374
Tagged: Computer Science, software engineering
Summary
The authors built a static bug-finding tool, Coverity, and applied it to real-world codebases.
- Coverity has false positives (it flags code that is not erroneous) and false negatives (it misses some real errors).
- "Circa 2000, unsoundness [having false-negatives] was controversial in the research community, though it has since become almost a de facto tool bias for commercial products and many research projects."
- Sales strategy: send an engineer and a salesperson to the client and run the tool on the client's codebase; the engineer handles "unique" client configurations and helps educate the client. This is a tough hurdle for the tool, because there is no time to cherry-pick results or massage the configuration.
- Educating users is difficult:
- Initially, the tool used the output of Make to learn how to compile the source code and where the source code was.
- Clients have bespoke build systems and might not even know about Make.
- Later on, the tool intercepted syscalls to learn the compiler invocation and its context. But this requires a build that can be launched from the command line.
- Client developers don't necessarily build from the command line.
- Clients are often averse to change, so you have to work around their broken software instead of fixing it.
- Compilers deviate from the language standard, intentionally and otherwise.
- Clients often want to buy the tool but restrict access to their source code.
- Some clients don't believe that the bugs the tool finds are real bugs. Their code often depends on non-standard behavior.
- Some clients try to argue with you, often emotionally. It's best not to argue; instead, try to arrange a meeting that includes their peers.
- Upgrading the tool to catch more bugs negatively affects managers' metrics, because the reported bug count rises.
- Determinism is more important to users than finding more bugs.
- Deep analysis can catch bugs, but those are hard to explain to users (e.g. races).
- Checking for trivial bugs is still useful. Given enough code, they will occur.
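
The first build-capture approach mentioned above (reading Make's output to learn how each file is compiled) can be illustrated with a minimal sketch. This is not Coverity's implementation; the function name, the list of compiler names, and the sample log are all hypothetical, and the input is assumed to be the commands that Make echoes (for example from a dry run such as `make -n`).

```python
import shlex

# Hypothetical set of compiler names to recognize; real builds may use
# wrappers or entirely different names.
COMPILERS = {"cc", "gcc", "g++", "clang", "clang++"}
SOURCE_SUFFIXES = (".c", ".cc", ".cpp")


def extract_compile_commands(make_output):
    """Return (compiler, source_files, dash_options) for each compile line."""
    commands = []
    for line in make_output.splitlines():
        try:
            tokens = shlex.split(line)
        except ValueError:
            continue  # unbalanced quotes: not a line this sketch can parse
        if not tokens:
            continue
        # Strip any directory prefix, e.g. /usr/bin/clang -> clang.
        name = tokens[0].rsplit("/", 1)[-1]
        if name not in COMPILERS:
            continue
        sources = [t for t in tokens[1:] if t.endswith(SOURCE_SUFFIXES)]
        options = [t for t in tokens[1:] if t.startswith("-")]
        if sources:  # skip link-only invocations that name no source file
            commands.append((name, sources, options))
    return commands


# Hypothetical build log for illustration only.
sample = """\
gcc -O2 -Iinclude -c src/main.c -o build/main.o
echo "linking"
/usr/bin/clang -DNDEBUG -c src/util.c -o build/util.o
gcc build/main.o build/util.o -o app
"""

for compiler, sources, options in extract_compile_commands(sample):
    print(compiler, sources, options)
```

A sketch like this only works when the build actually goes through Make and echoes recognizable compiler commands, which is the limitation the summary describes: clients with bespoke build systems give it nothing to parse.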