5 SV-AF – A Security Vulnerability Analysis Framework
5.4 Case Studies
5.4.2 Case Studies Results
Case Study 1: Identifying open source components that are directly susceptible to known
security vulnerabilities.
Objective: The goal of this study is to evaluate the performance of our semantic similarity
linking approach used to align two domain specific ontologies.
Approach: To align (link) the two ontologies (SEVONT and SBSON), we use the PSL framework to align project-specific information found in both ontologies. We trained PSL on a corpus of 500 randomly selected project instance pairs for which we manually created links, and then executed our PSL alignment rules on this training dataset. As a result of this training, two concept instances A and B can now be aligned with a degree of certainty if they have the same name, are defined in different ontologies (¬𝑆𝑎𝑚𝑒𝑆𝑜𝑢𝑟𝑐𝑒), and have similar vendors and the same version number. SameName, SimilarVendor, and SameVersion are similarity functions implemented using a Levenshtein distance metric. In this example, the SameProject(A,B) rule is given a weight of 0.9 (Listing 8), which is based on results from the PSL training set. Figure 31 shows the PSL inference results for our training dataset, with the weights for the 𝑆𝑎𝑚𝑒𝑃𝑟𝑜𝑗𝑒𝑐𝑡(𝐴, 𝐵) alignment ranging from a minimum of 0.04 to a maximum of 0.42.
70 https://db.apache.org/derby/
71 http://hibernate.org/validator/
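The Levenshtein-based similarity functions mentioned above can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the normalization by the longer string, the lower-casing, and the function names `levenshtein` and `similarity` are assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Map the distance to [0, 1]; 1.0 means identical strings (assumed scaling)."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


# Illustrative inputs: a short vendor name versus a fully qualified package name.
same_name = similarity("cxf", "cxf")
similar_vendor = similarity("apache", "org.apache.cxf")
same_version = similarity("3.0.1", "3.0.1")
```

With such a scaling, identical names and versions score 1.0, while a short vendor name compared against a fully qualified package name scores well below 1.0 even when both denote the same vendor.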
Using the semantic rule (Listing 8), PSL can now perform most probable explanation (MPE) reasoning [219] to infer the most likely values for a set of propositions, given observed values for the remaining (evidence) propositions.
𝑆𝑜𝑢𝑟𝑐𝑒(𝐴, 𝑆𝑛𝐴) ∧ 𝑆𝑜𝑢𝑟𝑐𝑒(𝐵, 𝑆𝑛𝐵) ∧ ¬𝑆𝑎𝑚𝑒𝑆𝑜𝑢𝑟𝑐𝑒(𝑆𝑛𝐴, 𝑆𝑛𝐵)
∧ 𝑁𝑎𝑚𝑒(𝐴, 𝑋1) ∧ 𝑁𝑎𝑚𝑒(𝐵, 𝑌1) ∧ 𝑆𝑎𝑚𝑒𝑁𝑎𝑚𝑒(𝑋1, 𝑌1)
∧ 𝑉𝑒𝑛𝑑𝑜𝑟(𝐴, 𝑋2) ∧ 𝑉𝑒𝑛𝑑𝑜𝑟(𝐵, 𝑌2) ∧ 𝑆𝑖𝑚𝑖𝑙𝑎𝑟𝑉𝑒𝑛𝑑𝑜𝑟(𝑋2, 𝑌2)
∧ 𝑉𝑒𝑟𝑠𝑖𝑜𝑛(𝐴, 𝑋3) ∧ 𝑉𝑒𝑟𝑠𝑖𝑜𝑛(𝐵, 𝑌3) ∧ 𝑆𝑎𝑚𝑒𝑉𝑒𝑟𝑠𝑖𝑜𝑛(𝑋3, 𝑌3)
⇒ 𝑆𝑎𝑚𝑒𝑃𝑟𝑜𝑗𝑒𝑐𝑡(𝐴, 𝐵)
weight: 0.9
Listing 8: SameProject Rules.
For a full discussion of MPE reasoning, we refer the reader to [219]. The result of the PSL inference is a set of 𝐴 × 𝐵 SameProject weights in the range [0..1], with 0 corresponding to two concept instances having no similarity and 1 corresponding to 100% similarity between the instances.
Figure 31: PSL similarity results.
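To make the role of the rule weight more concrete, the sketch below evaluates one grounding of the rule in Listing 8 under the Lukasiewicz relaxation that PSL uses for soft truth values in [0, 1]. The truth values are invented for illustration, the helper names are hypothetical, and the linear penalty (weight times distance to satisfaction) is only one of the forms PSL supports; MPE inference itself optimizes over all groundings jointly and is not reproduced here.

```python
from functools import reduce


def l_and(a: float, b: float) -> float:
    """Lukasiewicz conjunction of two truth values in [0, 1]."""
    return max(0.0, a + b - 1.0)


def l_not(a: float) -> float:
    """Lukasiewicz negation."""
    return 1.0 - a


def body_truth(atoms) -> float:
    """Truth value of a conjunction of atoms (1.0 for an empty body)."""
    return reduce(l_and, atoms, 1.0)


def distance_to_satisfaction(body: float, head: float) -> float:
    """How far the implication body => head is from being satisfied."""
    return max(0.0, body - head)


# Hypothetical truth values for one grounding of the rule in Listing 8.
atoms = [
    1.0,          # Source(A, SnA), observed
    1.0,          # Source(B, SnB), observed
    l_not(0.0),   # not SameSource(SnA, SnB): the two sources differ
    1.0, 1.0,     # Name(A, X1), Name(B, Y1)
    1.0,          # SameName(X1, Y1)
    1.0, 1.0,     # Vendor(A, X2), Vendor(B, Y2)
    0.43,         # SimilarVendor(X2, Y2), an illustrative value
    1.0, 1.0,     # Version(A, X3), Version(B, Y3)
    1.0,          # SameVersion(X3, Y3)
]
body = body_truth(atoms)
rule_weight = 0.9
penalty_if_unlinked = rule_weight * distance_to_satisfaction(body, head=0.0)
penalty_if_linked = rule_weight * distance_to_satisfaction(body, head=1.0)
```

In this toy grounding the body is only as true as its weakest soft atom, so a low vendor similarity caps how strongly the rule pushes SameProject(A, B) towards 1.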
As part of our knowledge modeling approach, we materialized the inferred semantic instance links (owl:sameAs) between the SEVONT and SBSON ontologies, making this inferred knowledge a persistent part of our knowledge model. We add a weight to each link, based on the inferred similarity value, using the domain-spanning similarity measure (SimilarityMeasure) class in our model (Section 5.2.1).
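One way to picture such a materialized, weighted link is as a handful of N-Triples statements. The sketch below is an assumption-laden illustration: only the owl:sameAs, rdf:type, and xsd:double IRIs are standard, whereas the `http://example.org/` namespace, the property names `hasSource`, `hasTarget`, and `hasWeight`, and the helper `materialize_link` are hypothetical and do not come from the SV-AF model.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"
EX = "http://example.org/svaf#"  # hypothetical namespace, not from the source


def materialize_link(sevont_iri: str, sbson_iri: str, weight: float, link_id: str):
    """Return N-Triples lines for one owl:sameAs link plus its similarity weight."""
    measure = f"{EX}similarity/{link_id}"
    return [
        f"<{sevont_iri}> <{OWL_SAME_AS}> <{sbson_iri}> .",
        f"<{measure}> <{RDF_TYPE}> <{EX}SimilarityMeasure> .",
        f"<{measure}> <{EX}hasSource> <{sevont_iri}> .",   # hypothetical property
        f"<{measure}> <{EX}hasTarget> <{sbson_iri}> .",    # hypothetical property
        f'<{measure}> <{EX}hasWeight> "{weight}"^^<{XSD_DOUBLE}> .',  # hypothetical
    ]


triples = materialize_link(
    "http://example.org/Sevont-securityDB.owl#sonatype:nexus:2.3.1",
    "http://example.org/Sbson-build.owl#org.sonatype.nexus:nexus:2.3.1",
    0.42,
    "link-1",
)
```

Keeping the weight on a separate resource rather than on the owl:sameAs triple itself leaves the link usable by standard reasoners while still recording how confident the alignment was.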
Findings: Our study showed that 0.062% of all Maven projects contain known security vulnerabilities that have been reported in the NVD. An example of such a vulnerability is shown in Table 29.
Table 29: Example of linked vulnerability
SEVONT fact                                  SBSON fact                                       Corresponding Vulnerability
Sevont-securityDB.owl#sonatype:nexus:2.3.1   Sbson-build.owl#org.sonatype.nexus:nexus:2.3.1   Sevont-securityDB.owl#CVE-2014-0792
Further analysis of the results showed that projects often suffer from multiple vulnerabilities: 48.8% of the 750 identified vulnerable project releases are affected by more than one security vulnerability, with PostgreSQL 7.4.1 being the most vulnerable project in our dataset (25 known vulnerabilities). This additional insight can guide system update decisions and help avoid the reuse of APIs/components that have known security vulnerabilities or that might be prone to these types of vulnerabilities.
For example, in December 2010, Google released its Nexus S smartphone⁷³. The phone originally ran Android 2.3.3, an Android version that already contained the security vulnerability listed in Table 30. While the Nexus S received regular Android OS updates up to Android version 4.2, an actual fix for the reported vulnerability (CVE-2013-4787) was only introduced with Android 4.2.2. However, this newer Android version is not supported or distributed for the Nexus S, leaving existing users of the phone susceptible to attacks. Our analysis also showed that the same vulnerability can affect multiple releases of a product. For example, security vulnerability CVE-2013-4787⁷⁴ has been reported for five different Android versions (Table 30). For product maintainers, this information can help to ensure consistent patching and regression testing across product lines or different versions of a product.
Table 30: Critical Vulnerabilities for Android Project
Android Version                          CVE-IDs         # of direct dependencies
SBSON#com.google.android:android:2.2.1   CVE-2013-4787   360
SBSON#com.google.android:android:2.3.1   CVE-2013-4787   176
SBSON#com.google.android:android:2.3.3   CVE-2013-4787   351
SBSON#com.google.android:android:3.0     CVE-2013-4787   34
SBSON#com.google.android:android:4.2     CVE-2013-4787   1
73 https://en.wikipedia.org/wiki/Nexus_S
74 https://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2013-4787
Evaluation: We evaluate the linking accuracy when aligning project instances (owl:sameAs)
between our Maven and NVD ontologies.
In the first step of our evaluation, we compared the impact of the similarity weight thresholds (w = 0.1, w = 0.2, w = 0.3, and w = 0.4) in terms of precision, recall, and F1 measure on the inferred links created by the PSL alignment process. Precision is calculated with true positives being the number of project instance pairs correctly classified as similar, while false positives correspond to the number of non-similar instance pairs that are incorrectly classified as same projects. For recall, false negatives correspond to the number of similar instance pairs that are incorrectly classified as non-similar projects. The F1-score is the harmonic mean of precision and recall, giving equal weight to both measures.
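The three measures can be computed from thresholded link scores as in the following sketch. The pair identifiers and scores are toy values, the function names are hypothetical, and treating a score greater than or equal to w as a predicted link is an assumption about how the threshold is applied.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def evaluate(scores: dict, gold: set, threshold: float):
    """Precision, recall, and F1 for links whose score reaches the threshold.

    scores: candidate pair -> inferred SameProject value
    gold:   pairs manually judged to be the same project
    """
    predicted = {pair for pair, score in scores.items() if score >= threshold}
    tp = len(predicted & gold)   # same-project pairs that were linked
    fp = len(predicted - gold)   # different-project pairs that were linked
    fn = len(gold - predicted)   # same-project pairs that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


# Toy data (hypothetical pairs and scores).
scores = {("a1", "b1"): 0.42, ("a2", "b2"): 0.15,
          ("a3", "b3"): 0.05, ("a4", "b9"): 0.30}
gold = {("a1", "b1"), ("a2", "b2"), ("a3", "b3")}

low = evaluate(scores, gold, threshold=0.1)
high = evaluate(scores, gold, threshold=0.4)

# Arithmetic check with precision 0.88 and recall 0.68.
check = round(f1_score(0.88, 0.68), 2)
```

On the toy data, raising the threshold removes the one false link but also drops true links, which is the same precision and recall trade-off examined in this evaluation.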
Table 31: owl:sameAs link (w) evaluation
Data Size   Measure     w=0.0   w=0.1   w=0.2   w=0.3   w=0.4
500         Precision   0.77    0.88    0.98    0.93    0.75
            Recall      0.77    0.68    0.30    0.03    0.01
            F1-score    0.77    0.77    0.46    0.05    0.01
Our analysis (Table 31) showed that increasing the similarity threshold from 0.1 (low similarity) to 0.4 (higher similarity) had a limited effect on precision (which rose from 0.88 to 0.98 at w = 0.2 before falling to 0.75 at w = 0.4), whereas recall dropped sharply (from 0.68 to 0.01).
A manual inspection of the inferred links showed that the low recall for the higher threshold values is due to the inconsistent capturing of vendor information within the two ontologies. NVD relies on the common name to identify a vendor, whereas Maven uses the fully qualified package name as the vendor name. For example, using w = 0.0, org.apache.cxf:cxf:3.0.1, org.apache.geronimo.configs:cxf:3.0.1, and org.apache.geronimo.plugins:cxf:3.0.1 in SBSON will be considered the same instance as apache:cxf:3.0.1 in SEVONT and therefore correctly linked. However, using a higher similarity threshold, these instances will no longer be linked. We use the similarity weight of w = 0.1 in all subsequent experiments due to its high F1-score.
We further evaluated the link quality by comparing our approach against the OWASP Dependency-Check tool [42], a specialized tool that identifies direct dependencies between projects and publicly disclosed vulnerabilities. For the study, we apply the OWASP Dependency-Check tool as our gold standard and compare the detected dependencies against the links generated by our approach (Table 32). The low recall of OWASP Dependency-Check arises because the tool requires JAR files to be available in order to map the files to the vulnerabilities; however, not all projects hosted in Maven are distributed with their JAR files.
Table 32: SV-AF vs. OWASP Dependency Check tool accuracy evaluation
Data Size   SV-AF (w=0.1)                      OWASP
            Precision   Recall   F1-score      Precision   Recall   F1-score
500         0.88        0.68     0.77          0.81        0.26     0.40
Case Study 2: Identifying open source components that are directly and indirectly dependent on vulnerable components.
Objective: In this study we evaluate how our framework can support the analysis of potential
security vulnerability impacts on dependent software components. Furthermore, the case study illustrates the flexibility of our knowledge modeling approach and highlights how additional knowledge resources can be seamlessly integrated and reasoned upon.
Approach: For this case study, we extend our analysis to include transitive closure dependencies (Figure 32), identifying components that are not only directly but also indirectly affected by known vulnerabilities. For this impact analysis, we selected 5 open source Java projects (Table 28) with known security vulnerabilities. We do not distinguish whether a dependent component actually makes use of (calls) the vulnerable component or not.
[Figure 32 (diagram): Project #1 dependsOn Project #2 dependsOn Project #3 ... Project #n; dependency levels Level #1, Level #2, ..., Level #n; legend: declared relation, inferred relation.]
Figure 32: Inferred project dependencies in SBSON.
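The level-wise impact analysis described in this approach can be approximated outside a reasoner with a breadth-first traversal over the inverted dependsOn relation. The sketch below is an illustration only: the project names are hypothetical, and assigning each project to the level of its shortest dependency path is an assumption about how levels are counted.

```python
from collections import deque


def dependents_by_level(depends_on: dict, vulnerable: str, max_level: int = 3):
    """Group projects that depend on `vulnerable`, directly or transitively.

    depends_on: project -> iterable of projects it declares as dependencies
    Returns {level: set of projects}, where level 1 holds direct dependents.
    """
    # Invert the declared relation: dependency -> projects that use it.
    used_by = {}
    for project, dependencies in depends_on.items():
        for dependency in dependencies:
            used_by.setdefault(dependency, set()).add(project)

    levels = {}
    seen = {vulnerable}
    queue = deque([(vulnerable, 0)])
    while queue:
        node, level = queue.popleft()
        if level == max_level:
            continue  # do not expand beyond the requested transitivity level
        for parent in used_by.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                levels.setdefault(level + 1, set()).add(parent)
                queue.append((parent, level + 1))
    return levels


# Hypothetical dependency graph.
depends_on = {
    "app": ["web"],
    "web": ["lib"],
    "tool": ["lib"],
    "suite": ["app", "tool"],
    "lib": [],
}
impact = dependents_by_level(depends_on, "lib")
```

Here `web` and `tool` are level 1 dependents of `lib`, while `app` and `suite` are reached only through them. In the knowledge model the same result follows from reasoning over the transitive dependsOn relation, without an explicit traversal.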
Findings: In what follows, we summarize the findings from our case study. We report on our transitive dependency analysis, which also highlights a benefit of our knowledge modeling approach: the ability to integrate knowledge resources while taking advantage of inference services provided by the Semantic Web (SW). Given the bi-directional links we established between the NVD and the Maven repository, our analysis is no longer limited to identifying whether a project depends on a vulnerable component. Instead, given a vulnerable component, we can now also provide a more holistic analysis by identifying, in a global context, which other projects potentially depend directly or indirectly on this vulnerable component.
Table 33 provides a summary of our analysis. To keep the results simple and readable, we consider only three levels of transitivity. For example, the vulnerable project Hibernate-validator 4.1.0 (P4) has a potential impact set of 3,805 directly dependent projects (level 1) and 128,109 dependent projects when we consider an additional two levels of transitivity (level 3).
Table 33: Transitive dependencies on vulnerable components
                                                                                                     Number of dependent components
                                                                                                     based on transitivity level (L)
ID   Component Name                  # Vulnerabilities   CVE-IDs                                       L1      L2       L3
P1   Wss4j 1.6.16                    2                   CVE-2015-0227, CVE-2014-3623                  336     639      73
P2   Httpclient 4.1                  2                   CVE-2011-1498, CVE-2014-3577                  685     4,961    41,326
P3   Derby 10.1.1.0                  3                   CVE-2006-7216, CVE-2005-4849, CVE-2006-7217   385     37,999   66,147
P4   Hibernate-validator 4.1.0.Final 1                   CVE-2014-3558                                 3,805   39,295   128,109
P5   Openjpa 1.1.0                   1                   CVE-2013-1768                                 74      49,460   141,303
Figure 33 illustrates a typical usage scenario for our modeling approach. While Geronimo-jetty6-javaee5 (version 2.1.1) itself has no known vulnerabilities reported, the project depends on several components (level 1 dependencies) with known security issues (5 Java projects with a total of 15 known security vulnerabilities), thus potentially making Geronimo-jetty6-javaee5 a highly vulnerable component.
[Figure 33 (diagram): Geronimo-jetty6-javaee5 (version 2.1.1) uses the following external APIs, each affected by the listed CVEs:
Dojo (version 1.0.2): CVE-2010-2276, CVE-2010-2274, CVE-2010-2275, CVE-2010-2273
Openjpa (versions 1.0.2 & 2.1.1): CVE-2013-1768
Myfaces (version 2.1.1): CVE-2011-4367
Cxf (version 2.1.1): CVE-2011-4367
Jetty (version 6.1.7): CVE-2009-4612, CVE-2009-1524, CVE-2009-1523, CVE-2009-4461, CVE-2009-4610, CVE-2009-4609, CVE-2009-4611
Legend: Medium Severity, High Severity, External APIs.]
Figure 33: Geronimo-jetty6-javaee5 uses 5 projects (external APIs) as level 1 dependencies, each of which suffers from security vulnerabilities.
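The scenario in Figure 33, a component with no reported vulnerabilities of its own that inherits exposure from its level 1 dependencies, can be expressed as a simple lookup over the uses and affects relations. The sketch below uses only a subset of the relations from the figure; the helper name and the `clean-lib` dependency are hypothetical.

```python
def inherited_vulnerabilities(component: str, uses: dict, affects: dict) -> dict:
    """Map each direct (level 1) dependency of `component` to its known CVEs."""
    return {
        dependency: sorted(affects[dependency])
        for dependency in uses.get(component, ())
        if affects.get(dependency)
    }


# Subset of the relations drawn in Figure 33; "clean-lib" is a hypothetical
# dependency without reported CVEs, added to show that it is filtered out.
uses = {"geronimo-jetty6-javaee5": ["openjpa", "myfaces", "clean-lib"]}
affects = {
    "openjpa": {"CVE-2013-1768"},
    "myfaces": {"CVE-2011-4367"},
}

own = affects.get("geronimo-jetty6-javaee5", set())
inherited = inherited_vulnerabilities("geronimo-jetty6-javaee5", uses, affects)
```

The component's own CVE set is empty, yet the lookup still returns vulnerabilities reached through the components it uses, which is the situation the figure depicts.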