Scenario 20: Dependency Analysis

Comprehensive dependency analysis: vulnerability checking, outdated package detection, license compliance, health scoring, and SBOM generation.

Quick Start

/select 20

How It Works

Architecture

The dependency analysis module (src/dependencies/) consists of 4 components connected in a pipeline:

Project Source
    |
    v
DependencyGraphBuilder (scan_project, NetworkX graph)
    |
    v
DependencyGraph (files + dependencies + vulnerabilities)
    |
    +---> VulnerabilityChecker (OSV API + GitHub Advisory + NVD)
    |
    +---> DependencyAnalyzer (find_outdated, find_unused, find_duplicates,
    |                         suggest_updates, get_health_score)
    |
    +---> LicenseChecker (SPDX normalization, copyleft detection, SBOM export)
    |
    v
Report / REST API / CLI

| Component | Module | Purpose |
|-----------|--------|---------|
| DependencyGraphBuilder | graph_builder.py | Scan projects, parse dependency files, build NetworkX graph |
| VulnerabilityChecker | vulnerability.py | Check dependencies against OSV, GitHub Advisory, NVD |
| DependencyAnalyzer | analyzer.py | Find outdated/unused/duplicate packages, suggest updates, health score |
| LicenseChecker | license_checker.py | SPDX normalization, copyleft detection, compatibility, SBOM export |

7 parser modules (src/dependencies/parsers/):

| Parser | Files |
|--------|-------|
| Python | pyproject.toml, requirements.txt, Pipfile, setup.py |
| JavaScript | package.json, package-lock.json, yarn.lock, pnpm-lock.yaml |
| Go | go.mod, go.sum |
| Java | pom.xml, build.gradle, build.gradle.kts |
| C# | *.csproj, packages.config, Directory.Packages.props |
| PHP | composer.json, composer.lock |
| C/C++ | conanfile.txt, conanfile.py, vcpkg.json, CMakeLists.txt |
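As a rough illustration of how manifest files could be routed to a parser, the sketch below matches file names against the patterns in the table above. `PARSER_PATTERNS` and `detect_parser` are hypothetical names chosen for this example; the real parsers may register their file patterns differently.

```python
from fnmatch import fnmatchcase
from pathlib import Path
from typing import Optional

# Hypothetical mapping built from the table above (assumption: the parser
# keys and the lookup mechanism are illustrative, not the module's API).
PARSER_PATTERNS = {
    "python": ["pyproject.toml", "requirements.txt", "Pipfile", "setup.py"],
    "javascript": ["package.json", "package-lock.json", "yarn.lock", "pnpm-lock.yaml"],
    "go": ["go.mod", "go.sum"],
    "java": ["pom.xml", "build.gradle", "build.gradle.kts"],
    "csharp": ["*.csproj", "packages.config", "Directory.Packages.props"],
    "php": ["composer.json", "composer.lock"],
    "cpp": ["conanfile.txt", "conanfile.py", "vcpkg.json", "CMakeLists.txt"],
}


def detect_parser(path: str) -> Optional[str]:
    """Return the parser key whose pattern matches the file name, or None."""
    name = Path(path).name
    for parser, patterns in PARSER_PATTERNS.items():
        # fnmatchcase keeps matching case-sensitive on every platform.
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            return parser
    return None
```

For example, `detect_parser("src/App/App.csproj")` selects the C# parser through the `*.csproj` wildcard, while a file such as `README.md` matches nothing.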

Intent Detection

The workflow (src/workflow/scenarios/dependencies_analysis.py) detects dependency-related queries via is_dependency_query() using 36 bilingual keywords:

| Language | Keywords |
|----------|----------|
| English (24) | dependency, dependencies, package, npm, pip, cargo, maven, gradle, go mod, vulnerability, CVE, security audit, outdated, update, license, SBOM, supply chain, dependency graph, package manager, lock file, version conflict, transitive |
| Russian (12) | зависимость, зависимости, пакет, уязвимость, устаревший, обновить, лицензия, граф зависимостей, менеджер пакетов, цепочка поставок |

The dependencies_workflow() function is registered in LangGraph via graph_builder.py, and queries are routed to it through intent_classifier.py and router.py. Optional enrichment via EnrichmentAdapter adds vector context.
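A minimal sketch of keyword-based intent detection, assuming simple case-insensitive substring matching. The keyword tuple is an illustrative subset of the list above, and the real is_dependency_query() may tokenize or weight matches differently.

```python
# Illustrative subset of the bilingual keywords (assumption: the actual
# list and matching rules live in dependencies_analysis.py).
DEPENDENCY_KEYWORDS = (
    "dependency", "dependencies", "vulnerability", "cve", "sbom",
    "outdated", "license", "lock file",
    "зависимость", "зависимости", "уязвимость", "лицензия",
)


def is_dependency_query(query: str) -> bool:
    """Return True if the query contains any dependency-related keyword."""
    text = query.casefold()
    return any(keyword in text for keyword in DEPENDENCY_KEYWORDS)
```

Substring matching is cheap but permissive: a short keyword such as "update" would also match "updated", so a production classifier may want word-boundary checks.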

Supported Package Managers

PackageManager enum (12 values):

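| Value | Manager | Language |
|-------|---------|----------|
| PIP | pip | Python |
| POETRY | Poetry | Python |
| CONDA | Conda | Python |
| NPM | npm | JavaScript |
| YARN | Yarn | JavaScript |
| PNPM | pnpm | JavaScript |
| GO | Go Modules | Go |
| CARGO | Cargo | Rust |
| BUNDLER | Bundler | Ruby |
| MAVEN | Maven | Java |
| GRADLE | Gradle | Java |
| NUGET | NuGet | .NET |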
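The enum could look roughly like the sketch below. The member names come from the table above; the string values and the `LANGUAGE_BY_MANAGER` lookup are assumptions made for illustration.

```python
from enum import Enum


class PackageManager(Enum):
    # Member names follow the table; the string values are assumed.
    PIP = "pip"
    POETRY = "poetry"
    CONDA = "conda"
    NPM = "npm"
    YARN = "yarn"
    PNPM = "pnpm"
    GO = "go"
    CARGO = "cargo"
    BUNDLER = "bundler"
    MAVEN = "maven"
    GRADLE = "gradle"
    NUGET = "nuget"


# Hypothetical helper mapping each manager to its language column.
LANGUAGE_BY_MANAGER = {
    PackageManager.PIP: "Python",
    PackageManager.POETRY: "Python",
    PackageManager.CONDA: "Python",
    PackageManager.NPM: "JavaScript",
    PackageManager.YARN: "JavaScript",
    PackageManager.PNPM: "JavaScript",
    PackageManager.GO: "Go",
    PackageManager.CARGO: "Rust",
    PackageManager.BUNDLER: "Ruby",
    PackageManager.MAVEN: "Java",
    PackageManager.GRADLE: "Java",
    PackageManager.NUGET: ".NET",
}
```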

Data Models

Dependency (14 fields)

| Field | Type | Description |
|-------|------|-------------|
| id | str | Unique identifier |
| name | str | Package name |
| version_spec | str? | Version requirement (e.g., >=1.0.0) |
| resolved_version | str? | Actually installed version |
| is_dev | bool | Development dependency |
| is_optional | bool | Optional dependency |
| is_direct | bool | Direct vs transitive (default: True) |
| license | str? | License identifier |
| repository_url | str? | Source repository URL |
| homepage_url | str? | Project homepage |
| description | str? | Package description |
| file_id | str? | Reference to DependencyFile |
| parent_ids | list[str] | Parent dependency IDs |
| child_ids | list[str] | Child dependency IDs |

DependencyGraph (3 collections + 5 methods)

| Field/Method | Type | Description |
|--------------|------|-------------|
| files | list[DependencyFile] | Parsed manifest files |
| dependencies | dict[str, Dependency] | All dependencies by ID |
| vulnerabilities | list[Vulnerability] | Found vulnerabilities |
| get_dependency(name) | method | Find dependency by name |
| get_direct_dependencies() | method | Filter direct dependencies |
| get_transitive_dependencies() | method | Filter transitive dependencies |
| get_dev_dependencies() | method | Filter dev dependencies |
| to_dict() | method | Serialize to dictionary |
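To make the relationship between the two models concrete, here is a trimmed-down sketch using dataclasses. Only a handful of the documented fields are included, and the method bodies are assumptions about how the filters might work, not the module's actual code.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Dependency:
    # Reduced to the fields the filters below need.
    id: str
    name: str
    resolved_version: Optional[str] = None
    is_dev: bool = False
    is_direct: bool = True


@dataclass
class DependencyGraph:
    dependencies: dict[str, Dependency] = field(default_factory=dict)

    def get_dependency(self, name: str) -> Optional[Dependency]:
        """Return the first dependency with a matching package name."""
        return next((d for d in self.dependencies.values() if d.name == name), None)

    def get_direct_dependencies(self) -> list[Dependency]:
        return [d for d in self.dependencies.values() if d.is_direct]

    def get_transitive_dependencies(self) -> list[Dependency]:
        return [d for d in self.dependencies.values() if not d.is_direct]

    def get_dev_dependencies(self) -> list[Dependency]:
        return [d for d in self.dependencies.values() if d.is_dev]


# Example: one direct, one transitive, one dev dependency.
graph = DependencyGraph(
    dependencies={
        "d1": Dependency(id="d1", name="fastapi", resolved_version="0.104.1"),
        "d2": Dependency(id="d2", name="starlette", is_direct=False),
        "d3": Dependency(id="d3", name="pytest", is_dev=True),
    }
)
```

Keying `dependencies` by ID rather than by name lets the same package appear more than once, which is what makes duplicate detection across files possible.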

Vulnerability (15 fields)

| Field | Type | Description |
|-------|------|-------------|
| id | str | Internal ID |
| dependency_name | str | Affected package |
| dependency_version | str? | Affected version |
| vuln_id | str | CVE or GHSA identifier |
| source | VulnerabilitySource | OSV, GITHUB_ADVISORY, NVD, or SNYK |
| severity | VulnerabilitySeverity | CRITICAL, HIGH, MEDIUM, LOW, UNKNOWN |
| cvss_score | float? | CVSS score (0.0–10.0) |
| title | str | Vulnerability title |
| description | str | Detailed description |
| affected_versions | str? | Affected version range |
| fixed_version | str? | Version with fix |
| references | list[str] | Reference URLs |
| published_at | datetime? | Publication date |
| checked_at | datetime? | When the check was performed |
| metadata | dict | Additional metadata |

LicenseInfo (7 fields)

| Field | Type | Description |
|-------|------|-------------|
| dependency_name | str | Package name |
| license_id | str | SPDX identifier |
| license_name | str | Human-readable name |
| is_osi_approved | bool | OSI-approved license |
| is_copyleft | bool | Copyleft license |
| is_allowed | bool? | Allowed by project policy |
| notes | str? | Additional notes |

DependencyConfig (13 fields)

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| auto_scan_on_import | bool | True | Auto-scan on project import |
| include_dev | bool | True | Include dev dependencies |
| max_depth | int | 10 | Maximum dependency depth |
| vulnerability_enabled | bool | True | Enable vulnerability checking |
| vulnerability_sources | list[str] | ["osv", "github_advisory", "nvd"] | Active vulnerability sources |
| check_interval_hours | int | 24 | Re-check interval |
| severity_threshold | str | "medium" | Minimum severity to report |
| license_check_enabled | bool | True | Enable license checking |
| allowed_licenses | list[str] | ["MIT", "Apache-2.0", ...] | Permitted licenses |
| flagged_licenses | list[str] | ["GPL-3.0", "AGPL-3.0", ...] | Flagged licenses |
| python_files | list[str] | ["pyproject.toml", ...] | Python manifest patterns |
| javascript_files | list[str] | ["package.json", ...] | JavaScript manifest patterns |
| go_files | list[str] | ["go.mod", "go.sum"] | Go manifest patterns |

Vulnerability Checking

VulnerabilityChecker queries 3 vulnerability databases:

| Source | API | Ecosystems |
|--------|-----|------------|
| OSV | api.osv.dev/v1 (REST) | PyPI, npm, Go, crates.io, Maven |
| GitHub Advisory | GitHub GraphQL API | All via GHSA |
| NVD | National Vulnerability Database | All via CPE |

The VulnerabilitySource enum also includes SNYK but it has no active implementation.
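For orientation, a single-package lookup against the public OSV query endpoint might be sketched as follows. The function names are hypothetical and this is not necessarily how VulnerabilityChecker issues its requests; the request body follows the OSV `/v1/query` format (package name, ecosystem, version).

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_query(name: str, version: str, ecosystem: str) -> dict:
    """Build the JSON body for an OSV single-package query."""
    return {"package": {"name": name, "ecosystem": ecosystem}, "version": version}


def query_osv(name: str, version: str, ecosystem: str, timeout: float = 10.0) -> list[dict]:
    """POST the query to OSV and return the list of matching advisories."""
    body = json.dumps(build_osv_query(name, version, ecosystem)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # OSV omits the "vulns" key when nothing matches.
    return payload.get("vulns", [])
```

A call such as `query_osv("requests", "2.31.0", "PyPI")` needs network access; error handling, retries, and batching are left out of this sketch.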

Severity levels (VulnerabilitySeverity enum):

| Severity | CVSS Score |
|----------|------------|
| CRITICAL | 9.0–10.0 |
| HIGH | 7.0–8.9 |
| MEDIUM | 4.0–6.9 |
| LOW | 0.1–3.9 |
| UNKNOWN | Not available |
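The score bands above translate into a small lookup function. This is a sketch: the enum's string values and the name `severity_from_cvss` are assumptions, and a score of 0.0 is treated as UNKNOWN here because the table defines no band for it.

```python
from enum import Enum
from typing import Optional


class VulnerabilitySeverity(Enum):
    # Member names match the documented enum; the values are assumed.
    CRITICAL = "critical"
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"
    UNKNOWN = "unknown"


def severity_from_cvss(score: Optional[float]) -> VulnerabilitySeverity:
    """Map a CVSS score to a severity level using the bands above."""
    if score is None or score <= 0.0:
        return VulnerabilitySeverity.UNKNOWN
    if score >= 9.0:
        return VulnerabilitySeverity.CRITICAL
    if score >= 7.0:
        return VulnerabilitySeverity.HIGH
    if score >= 4.0:
        return VulnerabilitySeverity.MEDIUM
    return VulnerabilitySeverity.LOW
```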

Dependency Analysis

DependencyAnalyzer provides 6 analysis methods:

| Method | Description |
|--------|-------------|
| find_outdated(graph, check_latest) | Find packages with newer versions available |
| find_unused(graph) | Detect potentially unused dependencies |
| find_duplicates(graph) | Find duplicate packages across files |
| suggest_updates(graph, conservative) | Suggest safe updates (conservative: minor/patch only) |
| get_health_score(graph) | Calculate 0–100 health score with A–F rating |
| _score_to_rating(score) | Convert numeric score to letter grade |
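The "minor/patch only" rule behind conservative updates can be sketched as a version comparison that rejects major-version jumps. This assumes plain numeric `major.minor.patch` versions; pre-release tags such as `1.0.0rc1` would need a proper version parser, and both function names are hypothetical.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Parse 'major.minor.patch' into integers, padding missing parts with 0."""
    parts = [int(p) for p in version.split(".")[:3]]
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def is_conservative_update(current: str, latest: str) -> bool:
    """True if latest is newer than current and stays within the same major version."""
    cur = parse_version(current)
    new = parse_version(latest)
    return new > cur and new[0] == cur[0]
```

With this rule, moving from 2.31.0 to 2.32.0 qualifies, while 1.4.2 to 2.0.0 does not because the major version changes.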

License Checking

LicenseChecker handles license compliance with:

  • SPDX normalization — 17 license aliases mapped to standard SPDX identifiers
  • License classification — 10 permissive, 16 copyleft, 4 weak copyleft licenses
  • OSI approval tracking
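Alias normalization and copyleft classification can be illustrated with two lookup tables. The entries below are a small assumed sample, not the checker's actual 17 aliases or its full license sets, and the function names are chosen for this example only.

```python
# Sample alias table (assumption: illustrative entries only).
LICENSE_ALIASES = {
    "mit license": "MIT",
    "apache 2.0": "Apache-2.0",
    "apache license 2.0": "Apache-2.0",
    "gplv3": "GPL-3.0",
    "gnu gpl v3": "GPL-3.0",
}

# Sample classification sets using the identifier style found in this document.
COPYLEFT = {"GPL-2.0", "GPL-3.0", "AGPL-3.0"}
WEAK_COPYLEFT = {"LGPL-2.1", "LGPL-3.0", "MPL-2.0"}


def normalize_license(raw: str) -> str:
    """Map a free-form license string to an SPDX-style identifier when known."""
    cleaned = raw.strip()
    return LICENSE_ALIASES.get(cleaned.casefold(), cleaned)


def classify_license(spdx_id: str) -> str:
    """Return 'copyleft', 'weak-copyleft', or 'other' for an identifier."""
    if spdx_id in COPYLEFT:
        return "copyleft"
    if spdx_id in WEAK_COPYLEFT:
        return "weak-copyleft"
    return "other"
```

Unknown strings pass through unchanged, so an unrecognized license is surfaced as-is rather than silently mapped to something permissive.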

Key methods:

| Method | Description |
|--------|-------------|
| check_license(dep) | Check a single dependency’s license |
| check_graph(graph) | Check all dependencies in the graph |
| find_issues(graph) | Find license compatibility issues |
| find_missing_licenses(graph) | Find dependencies without license info |
| check_compatibility(licenses) | Check inter-license compatibility |
| export_sbom(graph, format) | Export SBOM in SPDX 2.3 or CycloneDX 1.4 format |
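To show what a CycloneDX 1.4 export contains at minimum, the sketch below builds a bare JSON BOM from a list of components. The function name and the input shape (dicts with `name`, `version`, `license`) are assumptions; the output of export_sbom() is likely richer, with metadata, package URLs, and hashes.

```python
import json


def export_cyclonedx(components: list[dict]) -> str:
    """Serialize components into a minimal CycloneDX 1.4 JSON document."""
    bom = {
        "bomFormat": "CycloneDX",
        "specVersion": "1.4",
        "version": 1,
        "components": [
            {
                "type": "library",
                "name": c["name"],
                "version": c["version"],
                "licenses": [{"license": {"id": c["license"]}}],
            }
            for c in components
        ],
    }
    return json.dumps(bom, indent=2)


sbom_json = export_cyclonedx(
    [{"name": "fastapi", "version": "0.104.1", "license": "MIT"}]
)
```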

Health Score

get_health_score() returns a 0–100 score with breakdown:

| Category | Max Penalty | Description |
|----------|-------------|-------------|
| Vulnerabilities | 50 points | Based on count and severity |
| Outdated packages | 30 points | Based on number of outdated deps |
| Duplicates | 10 points | Based on duplicate count |

Rating scale:

| Score | Rating |
|-------|--------|
| 90–100 | A |
| 80–89 | B |
| 70–79 | C |
| 60–69 | D |
| 0–59 | F |
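The capped-penalty scheme and the rating scale can be sketched as below. The caps (50, 30, 10) and grade boundaries come from the tables above; the per-item weights are placeholders invented for this example and will not reproduce the analyzer's real numbers.

```python
# Assumed per-item weights; only the caps are taken from the documentation.
VULNERABILITY_WEIGHTS = {"critical": 20, "high": 10, "medium": 5, "low": 1}
OUTDATED_WEIGHT = 2
DUPLICATE_WEIGHT = 2


def score_to_rating(score: int) -> str:
    """Convert a 0-100 score to a letter grade."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"


def health_score(
    vulnerability_severities: list[str],
    outdated_count: int,
    duplicate_count: int,
) -> tuple[int, str]:
    """Subtract capped penalties from 100 and attach a letter grade."""
    vuln_penalty = min(
        50, sum(VULNERABILITY_WEIGHTS.get(s.lower(), 0) for s in vulnerability_severities)
    )
    outdated_penalty = min(30, outdated_count * OUTDATED_WEIGHT)
    duplicate_penalty = min(10, duplicate_count * DUPLICATE_WEIGHT)
    score = 100 - vuln_penalty - outdated_penalty - duplicate_penalty
    return score, score_to_rating(score)
```

Because the three caps sum to 90, the lowest possible score under this scheme is 10 rather than 0.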

Configuration

DependencyConfig dataclass (13 parameters):

dependencies:
  auto_scan_on_import: true
  include_dev: true
  max_depth: 10

  # Vulnerability checking
  vulnerability_enabled: true
  vulnerability_sources:
    - osv
    - github_advisory
    - nvd
  check_interval_hours: 24
  severity_threshold: medium    # critical, high, medium, low

  # License checking
  license_check_enabled: true
  allowed_licenses:
    - MIT
    - Apache-2.0
    - BSD-3-Clause
    - BSD-2-Clause
    - ISC
  flagged_licenses:
    - GPL-3.0
    - AGPL-3.0
    - GPL-2.0
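One way the `severity_threshold` setting could be applied is as a rank comparison, sketched below. The ordering follows the values listed in the comment above (low, medium, high, critical); reporting findings of unknown severity regardless of the threshold is an assumption of this example, as is the function name.

```python
# Rank of each reportable severity, lowest to highest.
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}


def meets_threshold(severity: str, threshold: str = "medium") -> bool:
    """Return True if a finding is at or above the configured threshold."""
    rank = SEVERITY_RANK.get(severity.lower())
    if rank is None:
        # Assumption: findings with unknown severity are always reported.
        return True
    return rank >= SEVERITY_RANK[threshold.lower()]
```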

CLI Usage

# Scan for dependencies
python -m src.cli.import_commands deps scan [path] --include-dev --max-depth 10

# List dependencies (with filters)
python -m src.cli.import_commands deps list --outdated --dev --direct --format table   # formats: table, json, tree

# Check for vulnerabilities
python -m src.cli.import_commands deps check-vulns --severity medium --fail-on high -o report.json

# Show dependency graph
python -m src.cli.import_commands deps graph --format dot --depth 3 -o graph.dot   # formats: text, dot, json, mermaid

# Check licenses
python -m src.cli.import_commands deps licenses --check-compliance --allow MIT Apache-2.0 --deny GPL-3.0

# Show health score
python -m src.cli.import_commands deps health

# Suggest safe updates
python -m src.cli.import_commands deps update --conservative
python -m src.cli.import_commands deps update --all

REST API

Primary project-scoped endpoints in src/api/routers/dependencies.py:

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/v1/deps/projects/{project_name}/summary | Registered-project SCA summary |
| GET | /api/v1/deps/projects/{project_name}/dependencies | List project dependencies |
| GET | /api/v1/deps/projects/{project_name}/vulnerabilities | Project vulnerability results |
| GET | /api/v1/deps/projects/{project_name}/sbom | Export project SBOM |
| POST | /api/v1/deps/projects/{project_name}/audit | Audit project dependencies |
| GET | /api/v1/deps/projects/{project_name}/gost-report | Generate project GOST 5.16.3 report |

Legacy compatibility endpoints under /api/v1/deps/scan, /list, /graph, /check-vulnerabilities, /licenses, /health-score, /sbom, /audit, and /sync-cache are still available for explicit scan-first workflows, but the Web product path uses the project-scoped routes above.

Example:

# Get the project SCA summary
curl http://localhost:8000/api/v1/deps/projects/codegraph/summary

# List dependencies
curl http://localhost:8000/api/v1/deps/projects/codegraph/dependencies

# Export SBOM
curl "http://localhost:8000/api/v1/deps/projects/codegraph/sbom?format=spdx"
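The same calls can be made from Python with the standard library. This is a sketch: `project_url` and `fetch_json` are hypothetical helpers, the base URL mirrors the curl examples above, and it assumes the endpoints return JSON.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000/api/v1/deps"


def project_url(project_name: str, resource: str, **params: str) -> str:
    """Build a project-scoped endpoint URL with optional query parameters."""
    url = f"{BASE_URL}/projects/{quote(project_name, safe='')}/{resource}"
    if params:
        url += "?" + urlencode(params)
    return url


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the JSON response body (assumes a JSON response)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

For instance, `fetch_json(project_url("codegraph", "sbom", format="spdx"))` requests the SPDX export from a locally running server.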

Use Cases

Scanning Dependencies

> Scan project dependencies

## Dependency Scan Results

**Files analyzed:** 3
**Total dependencies:** 127 (23 direct, 45 dev, 59 transitive)

| Package | Version | Type | License |
|---------|---------|------|---------|
| fastapi | 0.104.1 | direct | MIT |
| pydantic | 2.5.2 | direct | MIT |
| sqlalchemy | 2.0.23 | direct | MIT |
| uvicorn | 0.24.0 | direct | BSD-3-Clause |
| ... | | | |

Checking Vulnerabilities

> Check dependencies for vulnerabilities

## Vulnerability Report

Found 3 vulnerabilities:

| ID | Package | Severity | Fixed In |
|----|---------|----------|----------|
| CVE-2024-12345 | requests | CRITICAL | 2.32.0 |
| CVE-2024-23456 | pillow | HIGH | 10.2.0 |
| CVE-2024-34567 | cryptography | MEDIUM | 41.0.7 |

Sources: OSV, GitHub Advisory, NVD

Health Score

> Show dependency health

## Dependency Health: 72/100 (C)

Breakdown:
- Vulnerabilities: 3 found (-25 pts)
- Outdated: 5 packages (-3 pts)
- Duplicates: 0 (-0 pts)

Example Questions

Scanning:

  • “Scan project for dependencies”
  • “What packages does this project use?”
  • “Show the dependency graph”

Security:

  • “Check for vulnerable packages”
  • “Find packages with critical vulnerabilities”
  • “Are there any CVEs in our dependencies?”

Updates:

  • “Are there any outdated dependencies?”
  • “Suggest safe updates for outdated packages”
  • “Show dependency health score”

Licenses:

  • “Check if all licenses are MIT compatible”
  • “Generate SBOM for the project”
  • “Find dependencies without license information”