Vulnerability Management
This page describes how the Ark project handles security vulnerabilities — how they are identified, how risk is rated, how patches are tracked, and how findings are reported. It consolidates practices that already exist across CI workflows, Dependabot configuration, the JFrog Xray whitelist, the auto-CVE-issue action, and the dated security-assurance report pages. It is intended as both an operator/contributor reference and a citable statement of practice for audits.
The broader engineering process this fits into is described in the Secure Software Development Lifecycle page; its Risk Assessment subsection cross-links here for the full treatment.
Overview
Ark identifies vulnerabilities through six channels, applies a baseline/whitelist prioritisation model, and tracks remediation via Dependabot, auto-created CVE issues, and a CVE-fixer agent flow. The Enforcement model below makes explicit which of these steps fail the build, which only surface results, and which are not part of CI at all.
| Source | Type | Where it runs |
|---|---|---|
| JFrog Xray build scan | SCA / supply chain | CI (.github/workflows/cicd.yaml, jfrog-xray-scan job) |
| JFrog Xray container scan | Image vulnerability | CI (xray-container-scan job, matrix over 9 images) |
| SonarQube | SAST / code quality | CI (.github/workflows/sonar_scan.yaml, self-hosted runner) |
| Dependabot | Dependency updates | GitHub (weekly per ecosystem) |
| Gitleaks | Secret scanning | Pre-commit hook only |
| Penetration testing | Manual assessment | External, periodic |
How scans gate merges
The merge gate for main is CODEOWNERS approval, not the security scans themselves. None of the scans below are configured as required status checks in the main-branch ruleset; they run on every PR, surface results visibly, and reviewers withhold approval when they are red. The full ruleset detail is in Secure SDLC → Enforcement model.
Within that context the individual scans differ in how they fail:
- Xray build scan: fails the workflow on new, non-whitelisted violations (diffed against the last successful main scan).
- Xray container scan: report-only; emits per-image severity tables to the GitHub Actions step summary and uploads JSON artifacts, but does not exit non-zero on findings.
- SonarQube: invoked with -Dsonar.qualitygate.wait=true, so the workflow fails if the quality gate fails.
- Dependabot: opens PRs that run through the same CI and review process as any other change.
Scanning methods
Dependency and supply-chain scanning — JFrog Xray (build scan)
Invoked from .github/workflows/cicd.yaml (jfrog-xray-scan job) against a project-specific JFrog Xray watch (a server-side policy binding — see Prioritisation). Scan output is uploaded as the artifact xray-scan-agents-at-scale-{run_number}.json. New violations are detected by diffing the current scan against the last successful main scan; non-whitelisted new findings fail the job.
Container image scanning — JFrog Xray (container scan)
Invoked from the xray-container-scan job (.github/workflows/cicd.yaml) and the composite action .github/actions/jfrog-xray-container-scan/. Runs as a matrix over the following nine images:
ark-controller, ark-completions, ark-api, ark-dashboard, ark-broker, langchain-weather-agent, ark-mcp, ark-cli, ark-tools.
For each image, the action runs jf docker scan, counts findings by severity (Critical / High / Medium / Low / Unknown), writes a markdown table to the GitHub Actions step summary, and uploads the raw JSON. It does not fail the job on findings; it is observability rather than a gate.
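The per-image tally described above can be sketched in Python. This is an illustration only, not the composite action's actual implementation: the finding shape (a list of dicts with a severity key) and the function names count_by_severity, render_summary, and append_to_step_summary are assumptions made for the example, and the real jf docker scan JSON layout may differ.

```python
import os
from collections import Counter

# Severity buckets named in this page, in display order.
SEVERITIES = ["Critical", "High", "Medium", "Low", "Unknown"]


def count_by_severity(findings):
    """Count findings per severity; missing or unrecognised values count as Unknown."""
    counts = Counter({severity: 0 for severity in SEVERITIES})
    for finding in findings:
        severity = str(finding.get("severity", "Unknown")).capitalize()
        counts[severity if severity in SEVERITIES else "Unknown"] += 1
    return counts


def render_summary(image, counts):
    """Render a markdown table of severity counts for one image."""
    lines = [f"### {image}", "", "| Severity | Count |", "|---|---|"]
    lines += [f"| {severity} | {counts[severity]} |" for severity in SEVERITIES]
    lines.append(f"| Total | {sum(counts.values())} |")
    return "\n".join(lines) + "\n"


def append_to_step_summary(markdown, path=None):
    """Append markdown to the step summary file, if one is available.

    Returns False instead of raising when no summary path is set, which
    mirrors the report-only (non-gating) behaviour described above.
    """
    path = path or os.environ.get("GITHUB_STEP_SUMMARY")
    if not path:
        return False
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(markdown)
    return True
```

Nothing in the sketch inspects the counts to decide an exit status, so it stays observability rather than a gate.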
Static analysis — SonarQube
The sonar_scan job in .github/workflows/sonar_scan.yaml runs SonarQube against the codebase and waits on the configured quality gate (-Dsonar.qualitygate.wait=true). The SonarQube server is self-hosted (McKinsey-internal); fork PRs cannot complete this scan, by design. Findings of Security severity fail the workflow; reliability and maintainability findings are tracked under a downward-trend expectation in Code Analysis Reports.
Secret scanning — Gitleaks
gitleaks is configured as a pre-commit hook in .pre-commit-config.yaml and is not invoked in any CI workflow. The contributor-side gap — Ark does not document pre-commit install, so the hook only runs for developers who have set it up themselves — is detailed in Secure SDLC → Error checking procedures. Server-side GitHub secret-scanning alerts are the only mechanical backstop.
Dependency updates — Dependabot
.github/dependabot.yaml opens PRs on a weekly schedule for the following ecosystems and directories:
| Ecosystem | Directories |
|---|---|
| github-actions | .github/workflows |
| terraform | infrastructure/providers/aws, infrastructure/providers/gcp |
| npm | services/ark-broker/ark-broker, services/ark-dashboard, docs |
| pip | services/ark-api |
| docker | services/ark-api, services/ark-broker/ark-broker, services/ark-dashboard, services/ark-mcp |
Each Dependabot PR runs the same CI pipeline and is subject to the same CODEOWNERS approval gate as any other change.
Manual assessment — penetration testing
Periodic third-party penetration tests are summarised in Penetration Testing Reports with risk level, remediation status, and dates. The most recent assessment recorded there is Pentest #2 (DataArt, December 2025; v1.1 retest March 2026).
Prioritisation and risk rating
Ark uses a baseline / whitelist model rather than fixed CVSS thresholds in-repo.
The whitelist .github/actions/jfrog-xray-scan/tolerated_violations.txt lists Xray violation IDs (typically each with one or more CVE references) that are accepted as either patched-pending-rebuild, not exploitable in the Ark context, or unpatched-upstream with a documented compensating control. Each entry includes prose justification: patch status, exploitability analysis, affected components, and mitigation steps. A typical entry looks like:
XRAY-987239 - CVE-2024-34997 joblib NumpyArrayWrapper deserialization vulnerability. Severity: High (CVSS 7.8). Confirmed false positive by joblib maintainers (joblib/joblib#1588). NumpyArrayWrapper only deserializes joblib’s own cached content written to trusted local disk; it never processes externally-supplied data. No fix is planned because no real vulnerability exists. joblib 1.5.3 is already the latest version on PyPI. joblib is a transitive dependency of sentence-transformers in samples/rag-external-vectordb/ingestion/requirements.txt.
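For tooling that needs the identifiers from an entry like the one above, a small Python sketch follows. The exact file grammar of tolerated_violations.txt is not specified on this page, so the sketch assumes only that each entry contains an XRAY-<digits> identifier and zero or more CVE-<year>-<number> references; parse_entry is a hypothetical helper, not part of the repository.

```python
import re

# Assumed identifier patterns; the whitelist's real grammar may be stricter.
XRAY_ID = re.compile(r"XRAY-\d+")
CVE_ID = re.compile(r"CVE-\d{4}-\d{4,}")


def parse_entry(text):
    """Extract the Xray violation ID and any CVE references from one entry."""
    xray = XRAY_ID.search(text)
    return {
        "xray_id": xray.group(0) if xray else None,
        # De-duplicate and sort so repeated mentions do not matter.
        "cves": sorted(set(CVE_ID.findall(text))),
    }
```

The prose justification is deliberately left unparsed: it is written for human reviewers and auditors, not for machines.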
The CI logic (.github/workflows/cicd.yaml jfrog-xray-scan job) computes the set difference between the current scan and the last successful main scan, filters by the whitelist, and fails the workflow run only on the unwhitelisted-new subset.
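The gating rule in the preceding paragraph reduces to two set operations. The Python sketch below shows that logic under stated assumptions: violations are represented as plain Xray ID strings, and new_unwhitelisted and should_fail are hypothetical names, not functions taken from the workflow.

```python
def new_unwhitelisted(current, baseline, tolerated):
    """Return violations that are new since the baseline and not whitelisted.

    current   -- IDs found by the current scan
    baseline  -- IDs found by the last successful main scan
    tolerated -- IDs listed in the whitelist
    """
    new = set(current) - set(baseline)  # set difference against the baseline
    return new - set(tolerated)         # then filter by the whitelist


def should_fail(current, baseline, tolerated):
    """The run fails only if at least one new, unwhitelisted violation exists."""
    return bool(new_unwhitelisted(current, baseline, tolerated))
```

One consequence of this model: a violation already present in the baseline does not fail the run, even if it is not in the whitelist, because only the difference from the last successful main scan is considered.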
The Xray watch is configured server-side in JFrog (its name is referenced from .github/workflows/cicd.yaml) and its policies are not stored in this repository, so its severity and license-policy rules are not documented here.
Action plan and patch management
Ark’s remediation flow has three entry points:
- Routine dependency CVEs: Dependabot. The weekly Dependabot PRs (see Dependency updates) run the full CI pipeline; a maintainer reviews and merges, and the next main scan removes the violation.
- New unwhitelisted Xray violations: auto-issued tracking. When the jfrog-xray-scan job detects new unwhitelisted violations, it invokes the composite action .github/actions/create-cve-issues/. The action extracts CVE IDs from the scan JSON, searches existing open issues labelled security, CVE, and either:
  - Comments on a matching existing issue with scan metadata (build number, commit, branch, violation IDs), or
  - Opens a new issue labelled security, CVE titled with the CVE list.
- Manual CVE workflow / pentest findings: ark-security-patcher agent. Maintainers can drive a structured workflow defined in .claude/agents/ark-security-patcher.md: check for existing issues, classify the finding (CVE / pentest / generic), research and analyse impact, propose mitigation options, implement the fix, and open a fix: PR. The agent leverages the vulnerability-fixer, pentest-issue-resolver, research, analysis, and issues skills.
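The comment-or-open decision in the second entry point above can be sketched in Python. This is a sketch under assumptions, not the create-cve-issues action itself: the matching rule (a CVE ID appearing in an open issue's title or body), the issue dict shape, and the names extract_cves and plan_issue_actions are all hypothetical, and no GitHub API calls are made.

```python
import re

CVE_ID = re.compile(r"CVE-\d{4}-\d{4,}")


def extract_cves(scan_json_text):
    """Collect the distinct CVE IDs mentioned anywhere in the scan output."""
    return sorted(set(CVE_ID.findall(scan_json_text)))


def plan_issue_actions(cves, open_issues):
    """Split CVEs into those to comment on and those needing a new issue.

    open_issues is assumed to be a list of dicts with 'number', 'title',
    and 'body' keys, already filtered to open issues with the right labels.
    Returns (comments, unmatched): comments maps issue number -> CVE IDs,
    unmatched lists the CVE IDs with no existing issue.
    """
    comments, unmatched = {}, []
    for cve in cves:
        match = next(
            (
                issue
                for issue in open_issues
                if cve in (issue.get("title") or "")
                or cve in (issue.get("body") or "")
            ),
            None,
        )
        if match is not None:
            comments.setdefault(match["number"], []).append(cve)
        else:
            unmatched.append(cve)
    return comments, unmatched
```

Matching before creating is what keeps repeated scans from opening duplicate issues for the same CVE; each later run adds a comment with fresh scan metadata instead.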
Compensating controls. Where a finding cannot be remediated immediately (e.g. upstream patch not yet released, or finding is non-exploitable in Ark’s deployment shape), the justification and any compensating control are documented inline in tolerated_violations.txt. The whitelist therefore doubles as the citable record of accepted residual risk.
Reporting
Reporting outputs land in five places:
- JSON scan artifacts uploaded by every Xray job (build scan, container scan per image) with 30-day retention on the workflow run. The jfrog-xray-scan job names its artifact xray-scan-agents-at-scale-{run_number}.json; container scans are named xray-container-scan-{image_name}.json.
- GitHub Actions step summaries. The container scan composite action emits a markdown table per image (totals + per-vulnerability breakdown) to $GITHUB_STEP_SUMMARY, visible on the run page.
- SonarQube dashboard (self-hosted) shows per-PR analysis, quality gate status, and trend data; access is McKinsey-internal.
- Dated security-assurance report pages maintained in this documentation:
  - Penetration Testing Reports: third-party assessment findings, risk level, remediation status, and dates.
  - Code Analysis Reports: per-stack SonarQube issue counts and themes.
  - Artifact Analysis Reports: per-image vulnerability counts.
- Auto-created GitHub issues for new CVE violations (labelled security, CVE), serving as the visible trail of action items.
Risk rating by asset group
Ark’s assets are container images and dependency ecosystems, not traditional network tiers (DMZ / internal / management). Scanning is segmented along those two axes — per image via Container image scanning, and per dependency ecosystem via Dependency updates.
There is no formal asset-criticality ranking in the repository today (for example, ark-controller is not formally tagged as higher-severity than ark-cli); reviewer judgement during CODEOWNERS review is the current mechanism for that weighting.
Vulnerability intelligence sources
Vulnerability information feeding the above flows comes from two upstream sources:
- JFrog Xray CVE database — used for the build scan and the container scan. Defines the Xray issue IDs that appear in the whitelist and the watch.
- GitHub Security Advisory Database — used by Dependabot to surface CVEs in dependency ecosystems.
There is no current integration with Snyk, the NVD directly, the CISA Known Exploited Vulnerabilities (KEV) catalogue, OSV, or commercial threat-intelligence feeds. Findings outside what Xray and the GHSA database expose are picked up through penetration testing and ad-hoc maintainer research using the ark-security-patcher agent.
Known limitations
The gaps in the current process are called out inline in the sections above and indexed here:
- No security scan is a required status check — see How scans gate merges.
- The Xray container scan is report-only — see Container image scanning.
- No scheduled (cron) security workflow; no CodeQL, Trivy, OSV, or OpenSSF Scorecard — see Scan frequency and triggers.
- The JFrog Xray watch policy is not version-controlled in this repository — see Prioritisation and risk rating.
See also
- Secure Software Development Lifecycle — the broader engineering process; its Risk Assessment section cross-links here.
- Build Pipelines — full CI/CD pipeline overview.
- Penetration Testing Reports, Code Analysis Reports, Artifact Analysis Reports — dated assurance evidence.