LeakGuard Documentation

Project Overview

LeakGuard is a fast, lightweight secret scanner designed to detect accidentally committed credentials, tokens, and sensitive configuration values in codebases. Built in Rust with Python bindings, it provides both a command-line interface and a Python package for integration into development workflows.

Key Features

104 built-in detection rules covering cloud providers, LLM platforms, databases, HTTP authentication, observability tools, and SaaS ecosystems
High-performance scanning optimized for local development and CI environments
Multiple output formats: pretty (default), json, sarif, and markdown
GitHub Actions integration with --github-summary flag for workflow summaries
False positive management:
- Inline ignore markers (# leakguard:ignore)
- Rule-level disabling via configuration
Safe defaults:
- Binary files automatically skipped
- .env files excluded by default
Cross-platform support (Linux, macOS, Windows)
Python package distribution via PyPI

Use Cases

Pre-commit scanning in local development
CI/CD pipeline security checks
Repository audits for exposed secrets
Integration with security monitoring tools

Architecture & Components

LeakGuard follows a modular architecture with clear separation between core scanning logic and interface layers.

Core Components

1. Rule Engine

Located in src/rules/
Contains the 104 built-in detection patterns for common secret formats
Rules are defined as Rust structs with regex patterns and metadata
Supports rule versioning and categorization
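
To make the "regex pattern plus metadata" idea concrete, here is a minimal Python sketch of a rule record. The real rules are Rust structs in src/rules/; the class name `Rule`, its field names, the `find` helper, and the pattern below are illustrative assumptions modeled on the example in the Adding New Rules section, not LeakGuard's actual definitions.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Rule:
    """Hypothetical Python mirror of a detection rule: a regex plus metadata."""
    id: str
    name: str
    pattern: re.Pattern
    severity: str
    tags: List[str] = field(default_factory=list)

    def find(self, text: str) -> List[str]:
        """Return every substring of `text` that the rule's pattern matches."""
        return [m.group(0) for m in self.pattern.finditer(text)]


# Illustrative pattern: "AKIA" followed by 16 uppercase letters or digits.
AWS_ACCESS_KEY_ID = Rule(
    id="aws-access-key-id",
    name="AWS Access Key ID",
    pattern=re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    severity="high",
    tags=["aws", "cloud"],
)

print(AWS_ACCESS_KEY_ID.find('AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"'))
```

Keeping the metadata next to the pattern is what lets a formatter print a severity and a readable name without a second lookup.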

2. Scanner

Implements file traversal using walkdir
Handles file filtering (extensions, paths)
Manages parallel scanning of files
Applies detection rules to file contents
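
The four scanner responsibilities above can be sketched end to end in a few lines of Python. This is only an illustration of the flow (traverse, filter, scan in parallel, apply rules); the actual scanner is written in Rust with walkdir. The function names, the single rule in `RULES`, and the NUL-byte binary check are assumptions made for this sketch.

```python
import os
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterator, List, Tuple

# Hypothetical rule table: rule id -> compiled pattern.
RULES: Dict[str, re.Pattern] = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}
MAX_FILE_SIZE = 1048576  # bytes, matching the documented default


def candidate_files(root: str) -> Iterator[str]:
    """Walk `root`, skipping .git directories, .env files, and oversized files."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if name == ".env" or os.path.getsize(path) > MAX_FILE_SIZE:
                continue
            yield path


def scan_file(path: str) -> List[Tuple[str, int, str]]:
    """Return (path, line number, rule id) for each match in one file."""
    with open(path, "rb") as fh:
        data = fh.read()
    if b"\0" in data:  # crude binary check (assumption): a NUL byte means "not text"
        return []
    findings = []
    for lineno, line in enumerate(data.decode("utf-8", "replace").splitlines(), 1):
        for rule_id, pattern in RULES.items():
            if pattern.search(line):
                findings.append((path, lineno, rule_id))
    return findings


def scan_tree(root: str) -> List[Tuple[str, int, str]]:
    """Scan all candidate files under `root` using a thread pool."""
    with ThreadPoolExecutor() as pool:
        per_file = pool.map(scan_file, candidate_files(root))
        return [finding for findings in per_file for finding in findings]
```

Calling `scan_tree(".")` returns a flat list of findings that an output formatter could then render.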

3. Output Formatters

pretty: Human-readable colored terminal output
json: Machine-readable output for tooling
sarif: Static Analysis Results Interchange Format (for GitHub Advanced Security)
markdown: GitHub-flavored markdown for reporting

4. Configuration System

TOML-based configuration files
Environment variable overrides
Command-line argument parsing

Technical Stack

Component	Technology	Purpose
Core	Rust	High-performance scanning engine
Python Bindings	PyO3 + Maturin	Python package distribution
CLI	Clap	Command-line argument parsing
Configuration	TOML	User settings
Output Formatting	Serde + Custom impl	Multiple output formats
File Traversal	Walkdir	Directory scanning

Build System

LeakGuard uses a hybrid build system:

Rust Toolchain:
- Primary build system for the core scanner
- cargo build for development
- cargo test for unit/integration tests
Python Packaging:
- Maturin for building Python wheels
- pyproject.toml defines build requirements
- Supports both pure Rust and Python extension modules

Getting Started

Prerequisites

System Requirements

Rust 1.70+ (for development)
Python 3.8+ (for Python package)
pip (for Python installation)

Supported Platforms

Linux (x86_64, aarch64)
macOS (x86_64, arm64)
Windows (x86_64)

Installation

From PyPI (Recommended)

pip install leakguard

From Source

# Clone repository
git clone https://github.com/adrian-lorenz/leakguard.git
cd leakguard

# Install Rust toolchain if needed
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Build and install Python package
pip install maturin
maturin develop --release

Pre-built Binaries

Download pre-built binaries from GitHub Releases:

# Example for Linux x86_64
wget https://github.com/adrian-lorenz/leakguard/releases/download/v1.0.4/leakguard-v1.0.4-x86_64-unknown-linux-gnu.tar.gz
tar -xzf leakguard-v1.0.4-x86_64-unknown-linux-gnu.tar.gz
./leakguard --version

Running LeakGuard

Basic Scan

leakguard scan /path/to/codebase

Common Options

# Scan with JSON output
leakguard scan --output json /path/to/codebase

# Scan with SARIF output (for GitHub Advanced Security)
leakguard scan --output sarif /path/to/codebase

# Generate GitHub Actions summary
leakguard scan --github-summary /path/to/codebase

# Scan with custom configuration
leakguard scan --config leakguard.toml /path/to/codebase

Python Usage

from leakguard import scan

results = scan(
    path="/path/to/codebase",
    output_format="json",
    github_summary=True
)
print(results)

Configuration

LeakGuard supports configuration through command-line arguments, environment variables, and configuration files.

Configuration File

Create a leakguard.toml file in your project root or specify a custom path with --config:

# Example leakguard.toml
[scanner]
exclude = ["**/node_modules/**", "**/dist/**"]
include_extensions = [".py", ".js", ".go", ".rs"]
max_file_size = 1048576  # 1MB

[rules]
disable = ["aws-access-key-id", "slack-webhook"]
severity_threshold = "medium"

[output]
format = "json"
github_summary = true

Environment Variables

Variable	Description	Default
`LEAKGUARD_CONFIG`	Path to config file	None
`LEAKGUARD_OUTPUT`	Output format (`pretty`, `json`, `sarif`, `markdown`)	`pretty`
`LEAKGUARD_GITHUB`	Enable GitHub summary	`false`
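
A wrapper script can resolve these variables with the same defaults as the table. The sketch below is illustrative only: the function name `settings_from_env` is hypothetical, and the accepted spellings for a "true" value are an assumption, since this document does not specify them.

```python
import os
from typing import Mapping, Optional

OUTPUT_FORMATS = ("pretty", "json", "sarif", "markdown")


def settings_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the three documented variables, falling back to the table's defaults."""
    env = os.environ if env is None else env
    output = env.get("LEAKGUARD_OUTPUT", "pretty")
    if output not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {output!r}")
    return {
        "config": env.get("LEAKGUARD_CONFIG"),  # None when unset
        "output": output,
        # Assumption: "true" or "1" enables the flag; other values disable it.
        "github_summary": env.get("LEAKGUARD_GITHUB", "false").lower() in ("true", "1"),
    }


print(settings_from_env({"LEAKGUARD_OUTPUT": "sarif"}))
```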

Command-line Arguments

leakguard scan --help

Key arguments:

USAGE:
    leakguard scan [OPTIONS] <PATH>

ARGS:
    <PATH>    Path to scan

OPTIONS:
    -c, --config <CONFIG>          Path to config file
    -o, --output <FORMAT>          Output format [default: pretty] [possible values: pretty, json, sarif, markdown]
        --github-summary           Generate GitHub Actions summary
    -e, --exclude <PATTERN>...     Glob patterns to exclude
    -i, --include <EXTENSION>...   File extensions to include
        --max-file-size <BYTES>    Maximum file size to scan (bytes) [default: 1048576]
        --severity <LEVEL>         Minimum severity to report [default: low] [possible values: low, medium, high, critical]
    -h, --help                     Print help information

Rule Configuration

Disable specific rules in your config file:

[rules]
disable = [
    "aws-access-key-id",
    "slack-webhook",
    "github-pat"
]

Path Exclusion

Exclude paths using glob patterns:

[scanner]
exclude = [
    "**/node_modules/**",
    "**/vendor/**",
    "**/dist/**",
    "**/build/**",
    "**/.git/**",
    "**/.env",
    "**/*.min.js",
    "**/*.lock"
]
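
To preview which paths a set of exclude patterns would cover, the globs can be approximated in Python. This sketch assumes gitignore-like semantics (`**/` matches zero or more directories, `*` stays within one path segment); LeakGuard's exact glob dialect is not specified here, so treat the result as an estimate. The helper names are hypothetical.

```python
import re
from typing import Iterable


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a gitignore-style glob into an anchored regex."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")   # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")         # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")      # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")       # one character within a segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def is_excluded(path: str, patterns: Iterable[str]) -> bool:
    """True if the POSIX-style relative `path` matches any exclude pattern."""
    return any(glob_to_regex(p).match(path) for p in patterns)


patterns = ["**/node_modules/**", "**/.env", "**/*.min.js", "**/*.lock"]
for path in ["web/node_modules/x/y.js", "Cargo.lock", "src/app.py"]:
    print(path, is_excluded(path, patterns))
```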

API / Usage Reference

Command-line Interface

`scan` Command

Primary command for scanning directories:

leakguard scan [OPTIONS] <PATH>

Options

Option	Description	Default
`--config <PATH>`	Path to configuration file	`leakguard.toml`
`--output <FORMAT>`	Output format (`pretty`, `json`, `sarif`, `markdown`)	`pretty`
`--github-summary`	Generate GitHub Actions summary	`false`
`--exclude <PATTERN>`	Glob patterns to exclude (can be specified multiple times)	None
`--include <EXT>`	File extensions to include (e.g., `.py`, `.js`)	All text files
`--max-file-size <BYTES>`	Maximum file size to scan (bytes)	`1048576` (1MB)
`--severity <LEVEL>`	Minimum severity to report (`low`, `medium`, `high`, `critical`)	`low`
`--no-ignore`	Don't respect `.gitignore` files	`false`

Example Commands

Basic scan with default settings:

leakguard scan .

Scan with JSON output and GitHub summary:

leakguard scan --output json --github-summary .

Scan with custom exclusions and severity threshold:

leakguard scan --exclude "**/tests/**" --exclude "**/fixtures/**" --severity high .
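
When the CLI is driven from a script rather than a shell, building the argument list programmatically avoids quoting problems with glob patterns. The sketch below uses only flags from the options table; the helper names `build_scan_command` and `run_scan` are hypothetical, and `run_scan` assumes that `leakguard` is on PATH and writes its JSON report to stdout.

```python
import json
import subprocess
from typing import List, Optional, Sequence


def build_scan_command(
    path: str,
    output: str = "pretty",
    exclude: Sequence[str] = (),
    severity: Optional[str] = None,
    github_summary: bool = False,
) -> List[str]:
    """Assemble an argv list using only the flags from the options table."""
    cmd = ["leakguard", "scan"]
    for pattern in exclude:
        cmd += ["--exclude", pattern]  # the flag may be repeated
    if severity is not None:
        cmd += ["--severity", severity]
    if github_summary:
        cmd.append("--github-summary")
    # --output goes last so an option always separates --exclude from the path.
    cmd += ["--output", output, path]
    return cmd


def run_scan(path: str, **options) -> dict:
    """Run the CLI with JSON output and parse stdout.

    Exit-code semantics are not documented here, so none are checked.
    """
    cmd = build_scan_command(path, output="json", **options)
    completed = subprocess.run(cmd, capture_output=True, text=True)
    return json.loads(completed.stdout)


print(build_scan_command(".", exclude=["**/tests/**"], severity="high"))
```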

Python API

`scan()` Function

Main function for Python integration:

from typing import List, Optional, Union

def scan(
    path: str,
    *,
    config_path: Optional[str] = None,
    output_format: str = "json",
    github_summary: bool = False,
    exclude: Optional[List[str]] = None,
    include_extensions: Optional[List[str]] = None,
    max_file_size: int = 1048576,
    severity_threshold: str = "low",
    no_ignore: bool = False
) -> Union[dict, str]:
    """
    Scan a directory for secrets.

    Args:
        path: Path to scan
        config_path: Path to config file
        output_format: Output format ('json', 'sarif', 'markdown')
        github_summary: Generate GitHub Actions summary
        exclude: List of glob patterns to exclude
        include_extensions: List of file extensions to include
        max_file_size: Maximum file size in bytes
        severity_threshold: Minimum severity to report
        no_ignore: Don't respect .gitignore files

    Returns:
        Scan results in specified format (dict for JSON, str for others)
    """
    pass

Example Usage

from leakguard import scan

# Basic scan
results = scan("/path/to/codebase")
print(results)

# Advanced scan with exclusions and a severity threshold
results = scan(
    path="/path/to/codebase",
    output_format="json",
    github_summary=True,
    exclude=["**/tests/**", "**/fixtures/**"],
    severity_threshold="medium"
)

# Process results (JSON output is returned as a dict; other formats return a str)
if results:
    print(f"Found {len(results['results'])} potential secrets")

Output Formats

Pretty (Default)

Human-readable terminal output with colors:

Found 2 potential secrets in 42 files:

[HIGH] AWS Access Key ID
  → /src/config/prod.py:42
  42 | AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"

[MEDIUM] Slack Webhook
  → /scripts/deploy.sh:15
  15 | SLACK_WEBHOOK="https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX"

JSON

Machine-readable output:

{
  "version": "1.0.4",
  "results": [
    {
      "rule_id": "aws-access-key-id",
      "rule_name": "AWS Access Key ID",
      "severity": "high",
      "file": "/src/config/prod.py",
      "line": 42,
      "match": "AKIAIOSFODNN7EXAMPLE",
      "context": "AWS_ACCESS_KEY_ID = \"AKIAIOSFODNN7EXAMPLE\""
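    },
    {
      "rule_id": "slack-webhook",
      "rule_name": "Slack Webhook",
      "severity": "medium",
      "file": "/scripts/deploy.sh",
      "line": 15,
      "match": "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX",
      "context": "SLACK_WEBHOOK=\"https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX\""
    }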
  ],
  "stats": {
    "files_scanned": 42,
    "files_skipped": 8,
    "secrets_found": 2,
    "duration_ms": 125
  }
}
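
A typical consumer of the JSON report is a CI gate that fails the build only for findings at or above a chosen severity. The following is a minimal sketch of that filter, assuming the field names shown in the sample report and the severity order `low < medium < high < critical` from the `--severity` option. The function name is hypothetical, and the `report` dict is a trimmed hand-written stand-in (its second entry is taken from the pretty output example).

```python
from typing import Dict, List

SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def findings_at_or_above(report: Dict, threshold: str) -> List[Dict]:
    """Return findings whose severity is at least `threshold`."""
    minimum = SEVERITY_ORDER.index(threshold)
    return [
        finding
        for finding in report["results"]
        if SEVERITY_ORDER.index(finding["severity"]) >= minimum
    ]


# Findings shaped like the sample report, with fields trimmed for brevity.
report = {
    "version": "1.0.4",
    "results": [
        {"rule_id": "aws-access-key-id", "severity": "high",
         "file": "/src/config/prod.py", "line": 42},
        {"rule_id": "slack-webhook", "severity": "medium",
         "file": "/scripts/deploy.sh", "line": 15},
    ],
}

for finding in findings_at_or_above(report, "high"):
    print(f'{finding["file"]}:{finding["line"]} {finding["rule_id"]}')
```

In a pipeline, a non-empty result from `findings_at_or_above` would typically be turned into a non-zero exit status.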

SARIF

Static Analysis Results Interchange Format (for GitHub Advanced Security):

{
  "$schema": "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
  "version": "2.1.0",
  "runs": [
    {
      "tool": {
        "driver": {
          "name": "leakguard",
          "version": "1.0.4",
          "informationUri": "https://github.com/adrian-lorenz/leakguard"
        }
      },
      "results": [
        {
          "ruleId": "aws-access-key-id",
          "level": "error",
          "message": {
            "text": "AWS Access Key ID detected"
          },
          "locations": [
            {
              "physicalLocation": {
                "artifactLocation": {
                  "uri": "file:///src/config/prod.py"
                },
                "region": {
                  "startLine": 42,
                  "snippet": {
                    "text": "AWS_ACCESS_KEY_ID = \"AKIAIOSFODNN7EXAMPLE\""
                  }
                }
              }
            }
          ]
        }
      ]
    }
  ]
}
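
SARIF nests each finding several levels deep, so a small helper that flattens it is often useful for custom reporting. This sketch walks the standard SARIF 2.1.0 structure (`runs`, `results`, `locations`, `physicalLocation`); the function name `sarif_locations` is hypothetical, and the `sample` dict is a trimmed copy of the structure shown in this section.

```python
from typing import Dict, Iterator, Tuple


def sarif_locations(sarif: Dict) -> Iterator[Tuple[str, str, int]]:
    """Yield (ruleId, uri, startLine) for every location of every result."""
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            for location in result.get("locations", []):
                physical = location["physicalLocation"]
                yield (
                    result["ruleId"],
                    physical["artifactLocation"]["uri"],
                    physical["region"]["startLine"],
                )


# Trimmed stand-in for a SARIF report with one result.
sample = {
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "leakguard", "version": "1.0.4"}},
        "results": [{
            "ruleId": "aws-access-key-id",
            "level": "error",
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": "file:///src/config/prod.py"},
                    "region": {"startLine": 42},
                }
            }],
        }],
    }],
}

for rule_id, uri, line in sarif_locations(sample):
    print(f"{uri}:{line} {rule_id}")
```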

Markdown

GitHub-flavored markdown for reporting:

# LeakGuard Scan Results

**Version**: 1.0.4
**Files scanned**: 42
**Secrets found**: 2
**Duration**: 125ms

## Findings

### HIGH: AWS Access Key ID
**File**: `/src/config/prod.py:42`
**Rule ID**: aws-access-key-id

```python
42 | AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"
```

### MEDIUM: Slack Webhook
**File**: `/scripts/deploy.sh:15`
**Rule ID**: slack-webhook

```
15 | SLACK_WEBHOOK="https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX"
```


Ignoring Findings

Inline Ignore

Add a comment to ignore specific findings:

# leakguard:ignore aws-access-key-id
AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"  # This is a test key
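
One plausible way to interpret the marker is sketched below in Python. The rules it encodes are assumptions, not confirmed LeakGuard behavior: a marker suppresses findings on its own line and on the line directly after it, a bare `leakguard:ignore` suppresses every rule, and a marker followed by rule IDs suppresses only those rules. The function names are hypothetical.

```python
from typing import List, Optional, Set

MARKER = "leakguard:ignore"


def ignored_rules(line: str) -> Optional[Set[str]]:
    """Return None if the line has no marker, an empty set for a bare
    marker (meaning "all rules"), or the set of rule IDs named after it."""
    _, found, rest = line.partition(MARKER)
    if not found:
        return None
    return set(rest.split())


def is_suppressed(lines: List[str], index: int, rule_id: str) -> bool:
    """Check the line at `index` and the line before it for a matching marker."""
    for candidate in (index, index - 1):
        if candidate < 0:
            continue
        rules = ignored_rules(lines[candidate])
        if rules is not None and (not rules or rule_id in rules):
            return True
    return False


source = [
    "# leakguard:ignore aws-access-key-id",
    'AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"  # This is a test key',
]
print(is_suppressed(source, 1, "aws-access-key-id"))
print(is_suppressed(source, 1, "slack-webhook"))
```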

File-level Ignore

Add a .leakguardignore file to your project root:

# Ignore specific rules
aws-access-key-id
slack-webhook

# Ignore specific files
**/test_keys.py
**/fixtures/*
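
The example mixes rule IDs and path globs in one file, so a reader has to tell the two apart. The Python sketch below shows one way to do that. The classification heuristic (an entry containing `/`, `*`, `?`, or `.` is a path pattern, anything else is a rule ID) is an assumption made for illustration, and `parse_ignore_file` is a hypothetical name.

```python
from typing import List, Tuple


def parse_ignore_file(text: str) -> Tuple[List[str], List[str]]:
    """Split ignore-file text into (rule IDs, path globs)."""
    rule_ids: List[str] = []
    path_globs: List[str] = []
    for raw in text.splitlines():
        entry = raw.strip()
        if not entry or entry.startswith("#"):
            continue  # blank line or comment
        # Assumption: anything with a slash, wildcard, or dot is a path pattern.
        if any(ch in entry for ch in "/*?."):
            path_globs.append(entry)
        else:
            rule_ids.append(entry)
    return rule_ids, path_globs


example = """\
# Ignore specific rules
aws-access-key-id
slack-webhook

# Ignore specific files
**/test_keys.py
**/fixtures/*
"""

rules, paths = parse_ignore_file(example)
print(rules)
print(paths)
```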

Contributing

We welcome contributions to LeakGuard! Here's how you can help:

Development Setup

Clone the repository:

git clone https://github.com/adrian-lorenz/leakguard.git
cd leakguard

Install Rust toolchain:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
rustup update

Install Python dependencies:

pip install maturin pytest

Build the project:

maturin develop --release

Code Structure

.
├── .github/            # GitHub workflows and issue templates
├── src/
│   ├── lib.rs          # Main library code
│   ├── main.rs         # CLI entry point
│   ├── rules/          # Detection rules
│   ├── scanner.rs      # Scanning logic
│   ├── output.rs       # Output formatters
│   └── config.rs       # Configuration handling
├── tests/              # Integration tests
├── Cargo.toml          # Rust manifest
└── pyproject.toml      # Python package configuration

Testing

Rust Tests

cargo test

Python Tests

pytest

Integration Tests

# Run with test fixtures
cargo test -- --ignored

Adding New Rules

Create a new rule in src/rules/:

// Example rule definition
use regex::Regex; // assumes the `regex` crate

pub fn aws_access_key_id() -> Rule {
    Rule {
        id: "aws-access-key-id".to_string(),
        name: "AWS Access Key ID".to_string(),
        // Access key IDs are "AKIA" followed by 16 uppercase letters or digits.
        // Patterns that contain a double quote need the r#"..."# raw-string form.
        pattern: Regex::new(r"\bAKIA[0-9A-Z]{16}\b").unwrap(),
        severity: Severity::High,
        description: "Detects AWS Access Key IDs".to_string(),
        tags: vec!["aws".to_string(), "cloud".to_string()],
    }
}

pub fn all_rules() -> Vec<Rule> {
    vec![
        aws::aws_access_key_id(),
        // ... other rules
        your_new_rule(),
    ]
}

Add test cases in tests/rules.rs:

#[test]
fn test_your_new_rule() {
    let rule = your_new_rule();
    assert!(rule.pattern.is_match("AKIAIOSFODNN7EXAMPLE"));
    assert!(!rule.pattern.is_match("not-a-key"));
}
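
Before porting a new pattern to Rust, it can help to prototype it against positive and negative samples. The Python sketch below does that with a hypothetical helper, `check_pattern`; the pattern and the samples are illustrative. Keep in mind that Rust's `regex` crate does not support look-around or backreferences, so a pattern that relies on those in Python will not carry over.

```python
import re
from typing import Iterable, List


def check_pattern(pattern: str, positives: Iterable[str], negatives: Iterable[str]) -> List[str]:
    """Return a description of every sample the pattern classifies wrongly."""
    compiled = re.compile(pattern)
    problems = []
    for sample in positives:
        # search() is unanchored, like is_match() in the Rust test above
        if not compiled.search(sample):
            problems.append(f"missed: {sample!r}")
    for sample in negatives:
        if compiled.search(sample):
            problems.append(f"false positive: {sample!r}")
    return problems


problems = check_pattern(
    r"\bAKIA[0-9A-Z]{16}\b",
    positives=['AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"'],
    negatives=["not-a-key", "AKIA-too-short"],
)
print(problems or "all samples classified as expected")
```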

Code Style

Rust

Follow Rust's official style guidelines
Use cargo fmt for formatting
Use cargo clippy for linting

cargo fmt
cargo clippy -- -D warnings

Python

Follow PEP 8 guidelines
Use black for formatting
Use flake8 for linting

black .
flake8

Pull Request Process

Fork the repository
Create a feature branch (git checkout -b feature/your-feature)
Commit your changes (git commit -am 'Add some feature')
Push to the branch (git push origin feature/your-feature)
Open a Pull Request

Release Process

Update version in Cargo.toml
Update changelog
Create a tag (git tag -a vX.Y.Z -m "Release X.Y.Z")
Push the tag (git push origin vX.Y.Z)
GitHub Actions will build and publish the release

Community

Issue Tracker: https://github.com/adrian-lorenz/leakguard/issues
Discussions: https://github.com/adrian-lorenz/leakguard/discussions
Security Policy: Report vulnerabilities to security@noa-x.de
