Each guideline has an assigned priority. Priorities are assigned using a metric based on failure mode, effects, and criticality analysis (FMECA) [IEC 60812]. Four questions are answered for each guideline; the first two are rated on a scale of 1 to 3, and the last two are answered yes or no:

  • Severity: How serious are the consequences of the guideline being ignored?
    1 = low (denial-of-service attack, abnormal termination)
    2 = medium (data integrity violation, unintentional information disclosure)
    3 = high (run arbitrary code, privilege escalation)
  • Likelihood: How likely is it that a flaw introduced by ignoring the guideline could lead to an exploitable vulnerability?
    1 = unlikely
    2 = probable
    3 = likely
  • Detectable: Can a static analysis tool automatically determine, with high accuracy and precision, whether code violates this guideline?
  • Repairable: Can an automated repair tool reliably fix an alert by making local changes, and can the repair be guaranteed not to break the code even if the alert is a false positive? (There might exist a small set of cases that the tool cannot repair, but the tool can reliably identify these cases.)

In the context of automated repair, the phrase "break the code" requires more elaboration. We posit that noncompliant, unrepaired code currently works for some subset of inputs, which we deem "expected inputs". For the code to be noncompliant, there must also exist "unexpected inputs" that trigger it to do something unintended. This might be undefined behavior or simply unexpected or counterintuitive behavior, such as producing an inaccurate mathematical result. For a repair not to break the code, the repaired code must exhibit the same behavior for all expected inputs and change behavior only for some or all of the unexpected inputs. The changed behavior could be to signal an error condition, using whatever error-handling mechanism the code has adopted.
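
Consider a minimal sketch in C (the function and the errno-based error convention are invented for illustration, not drawn from any particular guideline):

    #include <errno.h>
    #include <limits.h>

    /* Noncompliant: doubling x is undefined behavior for the unexpected
       inputs where the product does not fit in an int. */
    int scale(int x) {
        return x * 2;
    }

    /* Repaired: identical results for all expected inputs; behavior
       changes only for the unexpected inputs, which now signal an error
       through the assumed convention of setting errno and returning 0. */
    int scale_repaired(int x) {
        if (x > INT_MAX / 2 || x < INT_MIN / 2) {
            errno = ERANGE;
            return 0;
        }
        return x * 2;
    }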

This definition of a repair differs from that of a refactor, which we define as a modification of the code that changes no behavior. That is, the refactored code behaves the same as the unrefactored code on both expected and unexpected inputs. If code can be automatically refactored to comply with a rule without changing its behavior on any input, that rule is automatically repairable (even though any such modification is a refactor rather than a repair).
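
As a concrete sketch (the identifier is invented; the cited rule is CERT C's DCL37-C on reserved identifiers), a rename is a refactor because it changes no behavior on any input, expected or unexpected:

    /* Noncompliant with DCL37-C: file-scope identifiers beginning with
       an underscore are reserved for the implementation. */
    static const unsigned int _buf_size = 128u;

    /* Refactored: renaming the identifier (and its uses) changes no
       behavior on any input, so such violations are automatically
       repairable. */
    static const unsigned int buf_size = 128u;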

An automated repair tool does not need to know the developer's intent behind the lines of code it repairs. It can, however, be informed of idiosyncratic conventions of the source code, such as whether assertions are disabled in production builds. Knowing such details is necessary if the repair tool must make code changes involving assertions.
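
A hedged sketch of why the assertion convention matters (the function and the return-code convention are invented for illustration):

    #include <assert.h>
    #include <stddef.h>

    /* Variant 1: valid only if the project never defines NDEBUG, so the
       assertion remains active in production builds. */
    int first_element_assert(const int *arr) {
        assert(arr != NULL);
        return arr[0];
    }

    /* Variant 2: required if assertions are disabled in production; the
       unexpected input is reported through the assumed convention of
       returning a nonzero error code. */
    int first_element_checked(const int *arr, int *out) {
        if (arr == NULL || out == NULL) {
            return -1;
        }
        *out = arr[0];
        return 0;
    }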

The Detectable and Repairable questions are combined into a single Remediation Cost metric value that ranges from 1 to 3:

                      Not Repairable    Repairable
      Not Detectable        1               2
      Detectable            2               3

This Remediation Cost metric value is then multiplied by the Severity and Likelihood values for each guideline. This product provides a measure that can be used in prioritizing the application of the guidelines. The products range from 1 to 27. Guidelines with a priority in the range of 1 to 4 are level 3 guidelines, those from 6 to 9 are level 2, and those from 12 to 27 are level 1. As a result, it is possible to claim level 1, level 2, or complete compliance (level 3) with a standard by implementing all guidelines in a level.
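
A brief sketch of the computation, using hypothetical ratings for a single guideline:

    #include <stdio.h>

    int main(void) {
        int severity = 3;     /* high: run arbitrary code */
        int likelihood = 2;   /* probable */
        int remediation = 2;  /* detectable but not repairable */

        int priority = severity * likelihood * remediation;  /* 3 * 2 * 2 = 12 */

        /* 1-4 => level 3, 6-9 => level 2, 12-27 => level 1 */
        int level = (priority >= 12) ? 1 : (priority >= 6) ? 2 : 3;

        printf("priority P%d, level L%d\n", priority, level);  /* prints P12, L1 */
        return 0;
    }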

The metric is designed primarily for remediation projects. It is assumed that new development efforts will conform with the entire standard.
