Discussion: Counting and discriminating vulnerabilities

From the CHERI use case document, we have the following issue:

The question of how to count vulnerabilities is somewhat nuanced.
Some options to consider:
(a) Raw: count all vulnerabilities relevant to this system or component
(b) Exploitable: only count vulnerabilities for which a known exploit exists
(c) Severity: only count vulnerabilities of a particular CVSS severity score
(d) Incidence: only count vulnerabilities against which incidents have been reported
(e) Actionable: only count vulnerabilities against which action is actively planned or taken
There are issues with each measure.
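As a minimal sketch of how the five options listed above can diverge on the same input, assume each vulnerability is a record with a handful of boolean and numeric fields. The field names (`has_known_exploit`, `cvss_score`, `incident_reported`, `action_planned`), the threshold of 7.0 and the sample records are all assumptions invented for illustration. They are not taken from the CVE schema or from any of the cited analyses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vuln:
    """Hypothetical vulnerability record; field names are illustrative only."""
    ident: str
    has_known_exploit: bool
    cvss_score: float
    incident_reported: bool
    action_planned: bool


def count_vulnerabilities(vulns, min_cvss=7.0):
    """Count the same list under each of the five candidate measures.

    min_cvss is an assumed severity cut-off, not a recommended value.
    """
    return {
        "raw": len(vulns),
        "exploitable": sum(v.has_known_exploit for v in vulns),
        "severity": sum(v.cvss_score >= min_cvss for v in vulns),
        "incidence": sum(v.incident_reported for v in vulns),
        "actionable": sum(v.action_planned for v in vulns),
    }


# Invented sample data, used only to show that the measures disagree.
SAMPLE = [
    Vuln("V-1", True, 9.8, True, True),
    Vuln("V-2", False, 7.5, False, True),
    Vuln("V-3", False, 4.3, False, False),
]

counts = count_vulnerabilities(SAMPLE)
print(counts)
```

Even on three records the five measures give different totals, which is why the choice of measure has to be stated alongside any headline figure.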
The Microsoft “Security Analysis of CHERI ISA” [4] paper, for example, uses (e) implicitly. Their historical vulnerability analysis relied on the following: “In 2019, the MSRC classified 655 vulnerabilities as memory safety issues that were rated severe enough to be fixed in a security update”. Hence an internal severity rating was used, which presumably was contextual and factored in the security considerations of the end-user application. This measure has the benefit that the analysed vulnerabilities reflect the real-world economic cost of managing them. However, if we are to have measures that can be compared across systems and markets, we need a standardised, repeatable methodology.
The second consideration is how to determine automatically whether CHERI offers sufficient protection to render a vulnerability non-exploitable. Ideally we need a deterministic decision tree that can work this out from data present in the CVE or attached CWE. If that is not possible, then we need definitive, consistent guidance on what information is required to make this determination.
For CHERI to have real-world operational value, this decision method needs to be automated. This may mean making recommendations to augment standard CVE/CWE data.
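The shape of such a decision method can be sketched as a small, deterministic function over the CWE identifiers attached to a CVE. The CWE numbers used are real entries (CWE-119, CWE-125 and CWE-787 for out-of-bounds access, CWE-415 and CWE-416 for double free and use after free, CWE-79 and CWE-89 for cross-site scripting and SQL injection). How they are grouped into spatial, temporal and non-memory classes, the `Verdict` names and the `classify` function are assumptions made for illustration, not an agreed CHERI classification.

```python
from enum import Enum


class Verdict(Enum):
    MITIGATED_SPATIAL = "mitigated by spatial memory safety"
    MITIGATED_TEMPORAL = "mitigated by temporal memory safety"
    NOT_MITIGATED = "not mitigated by the assumed configuration"
    NEEDS_MORE_DATA = "CWE data insufficient to decide"


# Assumed groupings, for illustration only.
SPATIAL_CWES = {119, 125, 787}
TEMPORAL_CWES = {415, 416}
NON_MEMORY_CWES = {79, 89}


def classify(cwe_ids, temporal_safety_enabled=False):
    """Walk a fixed decision tree over the CWE ids attached to one CVE.

    Simplification: the first matching branch wins, so a CVE that carries
    several CWEs is classified by its spatial CWE if it has one.
    """
    ids = set(cwe_ids)
    if not ids:
        return Verdict.NEEDS_MORE_DATA
    if ids & SPATIAL_CWES:
        return Verdict.MITIGATED_SPATIAL
    if ids & TEMPORAL_CWES:
        if temporal_safety_enabled:
            return Verdict.MITIGATED_TEMPORAL
        return Verdict.NOT_MITIGATED
    if ids <= NON_MEMORY_CWES:
        return Verdict.NOT_MITIGATED
    # Anything else (for example a generic CWE) cannot be decided from CWE alone.
    return Verdict.NEEDS_MORE_DATA


print(classify([787]).value)
print(classify([416]).value)
print(classify([416], temporal_safety_enabled=True).value)
print(classify([20]).value)
```

The `NEEDS_MORE_DATA` branch is the interesting one: every CVE that lands there is a candidate for the guidance on what additional information the CVE/CWE record would need to carry.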
To make things more complex, it is clear that different CHERI interventions/use cases have different protective properties. The purecap decision method may be tractable. However, mapping the impacts of custom compartments to vulnerability exploitability will be more difficult.

Specifically if we compare

There are at least two questions regarding vulnerability counting, assuming we start with a "big list":

Down-select by category: what method do we use to identify which vulnerabilities are in scope? Are these vulnerabilities mitigated concretely by a specific CHERI release and evidenced (stable build), or are they conceptual? Or do they only apply to a certain use case?
Down-select by severity impact: what method are we using to triage and count vulnerabilities?
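The two down-selection steps can be sketched as a simple two-stage filter. The severity bands follow the CVSS v3.x qualitative rating scale (Low 0.1-3.9, Medium 4.0-6.9, High 7.0-8.9, Critical 9.0-10.0). The record fields (`category`, `cvss_score`), the category labels and the sample entries are hypothetical and stand in for whatever the agreed "big list" actually contains.

```python
RATING_ORDER = ["None", "Low", "Medium", "High", "Critical"]


def cvss_v3_rating(score):
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


def down_select(big_list, in_scope_categories, min_rating="High"):
    """Stage 1: keep in-scope categories. Stage 2: keep min_rating or above."""
    threshold = RATING_ORDER.index(min_rating)
    by_category = [v for v in big_list if v["category"] in in_scope_categories]
    by_severity = [
        v for v in by_category
        if RATING_ORDER.index(cvss_v3_rating(v["cvss_score"])) >= threshold
    ]
    return by_category, by_severity


# Invented sample entries; category labels are placeholders.
BIG_LIST = [
    {"id": "V-1", "category": "spatial", "cvss_score": 9.8},
    {"id": "V-2", "category": "spatial", "cvss_score": 5.3},
    {"id": "V-3", "category": "temporal", "cvss_score": 7.8},
    {"id": "V-4", "category": "injection", "cvss_score": 8.1},
]

in_scope, counted = down_select(BIG_LIST, {"spatial", "temporal"})
print([v["id"] for v in in_scope])
print([v["id"] for v in counted])
```

Keeping the two stages separate makes it possible to report both the in-scope total and the severity-filtered total, so that results from different analyses can be compared at the same stage.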

Microsoft notes

The Microsoft article counts temporal and spatial memory errors. We presume the spatial errors are mitigated by a purecap intervention ("Mitigated by default by CHERI"), but this is not entirely clear and needs checking.

Note that the build incorporating temporal safety is still a work in progress.

The severity threshold is an internal metric.

The result is an estimate that 67% of the counted vulnerabilities are deterministically protected.

Capabilities Ltd notes

The Capabilities Ltd article uses a number of different categories:

Mitigated by memory protection: we assume this means purecap plus temporal safety.
Mitigated by straightforward software compartmentalization: this needs checking. Does this refer to a particular compartmentalisation strategy, or does it presume a sufficiently "prescient" strategy?
Mitigated by memory protection, but will lead to application crash, which in turn may be mitigated by straightforward compartmentalization: this could do with some clarification. Are there examples of how this will work in practice? Presumably it assumes there is already a process/thread boundary in place.

Google notes

TODO

General requirement

Microsoft notes

Capabilities Ltd notes

Google notes