
LeakGuard Documentation

Project Overview

LeakGuard is a fast, lightweight secret scanner designed to detect accidentally committed credentials, tokens, and sensitive configuration values in codebases. Built in Rust with Python bindings, it provides both a command-line interface and a Python package for integration into development workflows.

Key Features

  • 104 built-in detection rules covering cloud providers, LLM platforms, databases, HTTP authentication, observability tools, and SaaS ecosystems
  • High-performance scanning optimized for local development and CI environments
  • Multiple output formats: pretty (default), json, sarif, and markdown
  • GitHub Actions integration with --github-summary flag for workflow summaries
  • False positive management:
    • Inline ignore markers (# leakguard:ignore)
    • Rule-level disabling via configuration
  • Safe defaults:
    • Binary files automatically skipped
    • .env files excluded by default
  • Cross-platform support (Linux, macOS, Windows)
  • Python package distribution via PyPI

Use Cases

  • Pre-commit scanning in local development
  • CI/CD pipeline security checks
  • Repository audits for exposed secrets
  • Integration with security monitoring tools

Architecture & Components

LeakGuard follows a modular architecture with clear separation between core scanning logic and interface layers.

Core Components

1. Rule Engine

  • Located in src/rules/
  • Contains the 104 built-in detection patterns for common secret formats
  • Rules are defined as Rust structs with regex patterns and metadata
  • Supports rule versioning and categorization

2. Scanner

  • Implements file traversal using walkdir
  • Handles file filtering (extensions, paths)
  • Manages parallel scanning of files
  • Applies detection rules to file contents

3. Output Formatters

  • pretty: Human-readable colored terminal output
  • json: Machine-readable output for tooling
  • sarif: Static Analysis Results Interchange Format (for GitHub Advanced Security)
  • markdown: GitHub-flavored markdown for reporting

4. Configuration System

  • TOML-based configuration files
  • Environment variable overrides
  • Command-line argument parsing
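
As a rough illustration of how the scanner stages listed above fit together (traversal, filtering, rule application), here is a minimal Python sketch. It is not LeakGuard's Rust implementation: the `RULES` table, the NUL-byte binary check, and all function names are assumptions made for this example, and it scans files sequentially rather than in parallel.

```python
import re
import tempfile
from pathlib import Path
from typing import Dict, List

# Hypothetical rule table for illustration: rule id -> compiled pattern.
RULES = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def looks_binary(data: bytes) -> bool:
    """Treat any file containing a NUL byte as binary (a common heuristic)."""
    return b"\x00" in data


def scan_tree(root: Path, max_file_size: int = 1_048_576) -> List[Dict[str, object]]:
    """Walk root, skip oversized and binary files, and apply every rule line by line."""
    findings: List[Dict[str, object]] = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        if path.stat().st_size > max_file_size:
            continue
        data = path.read_bytes()
        if looks_binary(data):
            continue
        text = data.decode("utf-8", errors="replace")
        for line_no, line in enumerate(text.splitlines(), start=1):
            for rule_id, pattern in RULES.items():
                if pattern.search(line):
                    findings.append(
                        {"rule_id": rule_id, "file": str(path.relative_to(root)), "line": line_no}
                    )
    return findings


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "prod.py").write_text('AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"\n')
        (root / "logo.bin").write_bytes(b"\x00AKIAIOSFODNN7EXAMPLE")
        print(scan_tree(root))
```

The demo writes one text file and one binary file into a temporary directory; only the text file is inspected, which mirrors the "binary files automatically skipped" default.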

Technical Stack

| Component | Technology | Purpose |
|---|---|---|
| Core | Rust | High-performance scanning engine |
| Python Bindings | PyO3 + Maturin | Python package distribution |
| CLI | Clap | Command-line argument parsing |
| Configuration | TOML | User settings |
| Output Formatting | Serde + Custom impl | Multiple output formats |
| File Traversal | Walkdir | Directory scanning |

Build System

LeakGuard uses a hybrid build system:

  1. Rust Toolchain:

    • Primary build system for the core scanner
    • cargo build for development
    • cargo test for unit/integration tests
  2. Python Packaging:

    • Maturin for building Python wheels
    • pyproject.toml defines build requirements
    • Supports both pure Rust and Python extension modules

Getting Started

Prerequisites

System Requirements

  • Rust 1.70+ (for development)
  • Python 3.8+ (for Python package)
  • pip (for Python installation)

Supported Platforms

  • Linux (x86_64, aarch64)
  • macOS (x86_64, arm64)
  • Windows (x86_64)

Installation

From PyPI (Recommended)

pip install leakguard

From Source

# Clone repository
git clone https://github.com/adrian-lorenz/leakguard.git
cd leakguard

# Install Rust toolchain if needed
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Build and install Python package
pip install maturin
maturin develop --release

Pre-built Binaries

Download pre-built binaries from GitHub Releases:

# Example for Linux x86_64
wget https://github.com/adrian-lorenz/leakguard/releases/download/v1.0.4/leakguard-v1.0.4-x86_64-unknown-linux-gnu.tar.gz
tar -xzf leakguard-v1.0.4-x86_64-unknown-linux-gnu.tar.gz
./leakguard --version

Running LeakGuard

Basic Scan

leakguard scan /path/to/codebase

Common Options

# Scan with JSON output
leakguard scan --output json /path/to/codebase

# Scan with SARIF output (for GitHub Advanced Security)
leakguard scan --output sarif /path/to/codebase

# Generate GitHub Actions summary
leakguard scan --github-summary /path/to/codebase

# Scan with custom configuration
leakguard scan --config leakguard.toml /path/to/codebase
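
When the CLI is driven from a script, it can help to assemble the argument list in one place. The sketch below uses only flags that appear in this document; the helper names `build_scan_command` and `run_scan` are hypothetical, and nothing is assumed about LeakGuard's exit codes.

```python
import subprocess
from typing import List, Optional, Sequence


def build_scan_command(
    path: str,
    output: str = "pretty",
    github_summary: bool = False,
    config: Optional[str] = None,
    exclude: Sequence[str] = (),
) -> List[str]:
    """Assemble an argument list for `leakguard scan` from the documented flags."""
    cmd = ["leakguard", "scan"]
    for pattern in exclude:
        cmd += ["--exclude", pattern]
    if config is not None:
        cmd += ["--config", config]
    if github_summary:
        cmd.append("--github-summary")
    # The single-value --output flag goes last so that the path never
    # directly follows the multi-value --exclude option.
    cmd += ["--output", output, path]
    return cmd


def run_scan(path: str, **options) -> subprocess.CompletedProcess:
    """Run the scan and capture its output; requires the leakguard binary on PATH."""
    return subprocess.run(
        build_scan_command(path, **options), capture_output=True, text=True, check=False
    )


if __name__ == "__main__":
    print(build_scan_command(".", output="json", exclude=["**/tests/**"]))
```

Passing the command as a list avoids shell quoting problems with glob patterns such as `**/tests/**`.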

Python Usage

from leakguard import scan

results = scan(
    path="/path/to/codebase",
    output_format="json",
    github_summary=True
)
print(results)

Configuration

LeakGuard supports configuration through command-line arguments, environment variables, and configuration files.

Configuration File

Create a leakguard.toml file in your project root or specify a custom path with --config:

# Example leakguard.toml
[scanner]
exclude = ["**/node_modules/**", "**/dist/**"]
include_extensions = [".py", ".js", ".go", ".rs"]
max_file_size = 1048576  # 1MB

[rules]
disable = ["aws-access-key-id", "slack-webhook"]
severity_threshold = "medium"

[output]
format = "json"
github_summary = true

Environment Variables

| Variable | Description | Default |
|---|---|---|
| LEAKGUARD_CONFIG | Path to config file | None |
| LEAKGUARD_OUTPUT | Output format (pretty, json, etc) | pretty |
| LEAKGUARD_GITHUB | Enable GitHub summary | false |
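
This document names three configuration sources but does not spell out which one wins when they disagree. The sketch below shows one plausible resolution order for the output format, assuming command-line flag first, then `LEAKGUARD_OUTPUT`, then the `[output]` table of the config file, then the default. That order and the function name `resolve_output_format` are assumptions, not confirmed LeakGuard behavior.

```python
import os
from typing import Mapping, Optional

DEFAULT_OUTPUT = "pretty"
VALID_OUTPUTS = ("pretty", "json", "sarif", "markdown")


def resolve_output_format(
    cli_value: Optional[str],
    file_config: Mapping[str, Mapping[str, object]],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Pick the output format.

    ASSUMED precedence: CLI flag, LEAKGUARD_OUTPUT, config file, built-in default.
    """
    candidates = (
        cli_value,
        env.get("LEAKGUARD_OUTPUT"),
        file_config.get("output", {}).get("format"),
        DEFAULT_OUTPUT,
    )
    # The default is always truthy, so next() cannot run out of candidates.
    chosen = str(next(value for value in candidates if value))
    if chosen not in VALID_OUTPUTS:
        raise ValueError(f"unsupported output format: {chosen!r}")
    return chosen


if __name__ == "__main__":
    print(resolve_output_format(None, {"output": {"format": "json"}}, env={}))
```

Passing `env` explicitly, as the demo does, keeps the function easy to test without touching the real process environment.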

Command-line Arguments

leakguard scan --help

Key arguments:

USAGE:
    leakguard scan [OPTIONS] <PATH>

ARGS:
    <PATH>    Path to scan

OPTIONS:
    -c, --config <CONFIG>          Path to config file
    -o, --output <FORMAT>          Output format [default: pretty] [possible values: pretty, json, sarif, markdown]
        --github-summary           Generate GitHub Actions summary
    -e, --exclude <PATTERN>...     Glob patterns to exclude
    -i, --include <EXTENSION>...   File extensions to include
        --max-file-size <BYTES>    Maximum file size to scan (bytes) [default: 1048576]
        --severity <LEVEL>         Minimum severity to report [default: low] [possible values: low, medium, high, critical]
    -h, --help                     Print help information

Rule Configuration

Disable specific rules in your config file:

[rules]
disable = [
    "aws-access-key-id",
    "slack-webhook",
    "github-pat"
]

Path Exclusion

Exclude paths using glob patterns:

[scanner]
exclude = [
    "**/node_modules/**",
    "**/vendor/**",
    "**/dist/**",
    "**/build/**",
    "**/.git/**",
    "**/.env",
    "**/*.min.js",
    "**/*.lock"
]
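
To get a feel for which paths such patterns cover, the following Python sketch translates a glob into a regular expression and tests a few paths. It assumes gitignore-like semantics (`**/` matches zero or more directories, `*` stays within one path segment). The matching library LeakGuard actually uses is not stated here, so edge cases may differ, and `glob_to_regex` and `is_excluded` are hypothetical helper names.

```python
import re
from typing import Iterable


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a gitignore-style glob into a regex over '/'-separated paths."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")  # anything, including '/'
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")  # anything within a single path segment
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")  # exactly one character, not a separator
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def is_excluded(path: str, patterns: Iterable[str]) -> bool:
    """Return True if the path matches at least one exclusion glob."""
    normalized = path.replace("\\", "/")  # accept Windows-style separators
    return any(glob_to_regex(p).match(normalized) for p in patterns)


if __name__ == "__main__":
    patterns = ["**/node_modules/**", "**/.env", "**/*.min.js", "**/*.lock"]
    for candidate in ["src/app.js", "web/node_modules/x/index.js", ".env", "Cargo.lock"]:
        print(candidate, is_excluded(candidate, patterns))
```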

API / Usage Reference

Command-line Interface

scan Command

Primary command for scanning directories:

leakguard scan [OPTIONS] <PATH>

Options

| Option | Description | Default |
|---|---|---|
| --config <PATH> | Path to configuration file | leakguard.toml |
| --output <FORMAT> | Output format (pretty, json, sarif, markdown) | pretty |
| --github-summary | Generate GitHub Actions summary | false |
| --exclude <PATTERN> | Glob patterns to exclude (can be specified multiple times) | |
| --include <EXT> | File extensions to include (e.g., .py, .js) | All text files |
| --max-file-size <BYTES> | Maximum file size to scan (bytes) | 1048576 (1MB) |
| --severity <LEVEL> | Minimum severity to report (low, medium, high, critical) | low |
| --no-ignore | Don't respect .gitignore files | false |
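
The `--severity` option is described as a minimum level. A small Python sketch of that idea follows, assuming the levels are ordered exactly as listed (low, medium, high, critical) and that findings carry a `severity` field as in the JSON output. The helper names are hypothetical.

```python
from typing import Dict, List

# Ordering taken from the documented list of possible values.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def meets_threshold(severity: str, threshold: str = "low") -> bool:
    """Return True if severity is at or above threshold.

    Raises ValueError for a level that is not in SEVERITY_ORDER.
    """
    return SEVERITY_ORDER.index(severity.lower()) >= SEVERITY_ORDER.index(threshold.lower())


def filter_findings(findings: List[Dict[str, object]], threshold: str = "low") -> List[Dict[str, object]]:
    """Keep only the findings whose severity meets the threshold."""
    return [f for f in findings if meets_threshold(str(f["severity"]), threshold)]


if __name__ == "__main__":
    sample = [{"severity": "low"}, {"severity": "high"}, {"severity": "critical"}]
    print(filter_findings(sample, "high"))
```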

Example Commands

  1. Basic scan with default settings:
leakguard scan .
  2. Scan with JSON output and GitHub summary:
leakguard scan --output json --github-summary .
  3. Scan with custom exclusions and severity threshold:
leakguard scan --exclude "**/tests/**" --exclude "**/fixtures/**" --severity high .

Python API

scan() Function

Main function for Python integration:

from typing import List, Optional, Union

def scan(
    path: str,
    *,
    config_path: Optional[str] = None,
    output_format: str = "json",
    github_summary: bool = False,
    exclude: Optional[List[str]] = None,
    include_extensions: Optional[List[str]] = None,
    max_file_size: int = 1048576,
    severity_threshold: str = "low",
    no_ignore: bool = False
) -> Union[dict, str]:
    """
    Scan a directory for secrets.

    Args:
        path: Path to scan
        config_path: Path to config file
        output_format: Output format ('json', 'sarif', 'markdown')
        github_summary: Generate GitHub Actions summary
        exclude: List of glob patterns to exclude
        include_extensions: List of file extensions to include
        max_file_size: Maximum file size in bytes
        severity_threshold: Minimum severity to report
        no_ignore: Don't respect .gitignore files

    Returns:
        Scan results in specified format (dict for JSON, str for others)
    """
    pass

Example Usage

from leakguard import scan

# Basic scan
results = scan("/path/to/codebase")
print(results)

# Advanced scan with configuration (non-JSON formats are returned as a string)
sarif_report = scan(
    path="/path/to/codebase",
    output_format="sarif",
    github_summary=True,
    exclude=["**/tests/**", "**/fixtures/**"],
    severity_threshold="medium"
)

# Process the JSON results from the basic scan (a dict)
if results:
    print(f"Found {len(results['results'])} potential secrets")

Output Formats

Pretty (Default)

Human-readable terminal output with colors:

Found 2 potential secrets in 42 files:

[HIGH] AWS Access Key ID
  → /src/config/prod.py:42
  42 | AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"

[MEDIUM] Slack Webhook
  → /scripts/deploy.sh:15
  15 | SLACK_WEBHOOK="https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX"

JSON

Machine-readable output:

{
  "version": "1.0.4",
  "results": [
    {
      "rule_id": "aws-access-key-id",
      "rule_name": "AWS Access Key ID",
      "severity": "high",
      "file": "/src/config/prod.py",
      "line": 42,
      "match": "AKIAIOSFODNN7EXAMPLE",
      "context": "AWS_ACCESS_KEY_ID = \"AKIAIOSFODNN7EXAMPLE\""
    }
  ],
  "stats": {
    "files_scanned": 42,
    "files_skipped": 8,
    "secrets_found": 2,
    "duration_ms": 125
  }
}
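
A report in this shape is easy to post-process. The Python sketch below counts findings per severity and prints redacted locations, so that a CI log does not repeat the secret itself. It relies only on the fields shown in the sample above; `redact` and `summarize` are hypothetical helper names, and the sample dict is trimmed to a single finding.

```python
from collections import Counter
from typing import Dict, List

# Sample report in the JSON shape shown above (one finding only).
SAMPLE_REPORT = {
    "version": "1.0.4",
    "results": [
        {
            "rule_id": "aws-access-key-id",
            "rule_name": "AWS Access Key ID",
            "severity": "high",
            "file": "/src/config/prod.py",
            "line": 42,
            "match": "AKIAIOSFODNN7EXAMPLE",
            "context": 'AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"',
        }
    ],
    "stats": {"files_scanned": 42, "files_skipped": 8, "secrets_found": 2, "duration_ms": 125},
}


def redact(secret: str, keep: int = 4) -> str:
    """Keep the first `keep` characters and mask the rest."""
    return secret[:keep] + "*" * max(len(secret) - keep, 0)


def summarize(report: dict) -> Dict[str, object]:
    """Count findings per severity and list redacted locations."""
    findings: List[dict] = report.get("results", [])
    return {
        "by_severity": dict(Counter(f["severity"] for f in findings)),
        "locations": [f'{f["file"]}:{f["line"]} {redact(f["match"])}' for f in findings],
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_REPORT))
```

For CLI output, the same functions would work on `json.loads(...)` of the captured text.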

SARIF

Static Analysis Results Interchange Format (for GitHub Advanced Security):

{
  "$schema": "https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json",
  "version": "2.1.0",
  "runs": [
    {
      "tool": {
        "driver": {
          "name": "leakguard",
          "version": "1.0.4",
          "informationUri": "https://github.com/adrian-lorenz/leakguard"
        }
      },
      "results": [
        {
          "ruleId": "aws-access-key-id",
          "level": "error",
          "message": {
            "text": "AWS Access Key ID detected"
          },
          "locations": [
            {
              "physicalLocation": {
                "artifactLocation": {
                  "uri": "file:///src/config/prod.py"
                },
                "region": {
                  "startLine": 42,
                  "snippet": {
                    "text": "AWS_ACCESS_KEY_ID = \"AKIAIOSFODNN7EXAMPLE\""
                  }
                }
              }
            }
          ]
        }
      ]
    }
  ]
}
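
For tooling that consumes the SARIF report directly, a short Python sketch can flatten the nested structure into one tuple per location. It uses only standard SARIF 2.1.0 fields that appear in the sample above. The function name `iter_sarif_findings` is hypothetical, and the sample dict is a trimmed copy of the example.

```python
from typing import Iterator, Tuple

# Trimmed copy of the SARIF sample above.
SAMPLE_SARIF = {
    "version": "2.1.0",
    "runs": [
        {
            "tool": {"driver": {"name": "leakguard", "version": "1.0.4"}},
            "results": [
                {
                    "ruleId": "aws-access-key-id",
                    "level": "error",
                    "message": {"text": "AWS Access Key ID detected"},
                    "locations": [
                        {
                            "physicalLocation": {
                                "artifactLocation": {"uri": "file:///src/config/prod.py"},
                                "region": {"startLine": 42},
                            }
                        }
                    ],
                }
            ],
        }
    ],
}


def iter_sarif_findings(sarif: dict) -> Iterator[Tuple[str, str, str, int]]:
    """Yield (rule id, level, uri, start line) for every result location."""
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            for location in result.get("locations", []):
                physical = location.get("physicalLocation", {})
                yield (
                    result.get("ruleId", ""),
                    result.get("level", "warning"),  # SARIF treats a missing level as "warning"
                    physical.get("artifactLocation", {}).get("uri", ""),
                    physical.get("region", {}).get("startLine", 0),
                )


if __name__ == "__main__":
    for finding in iter_sarif_findings(SAMPLE_SARIF):
        print(finding)
```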

Markdown

GitHub-flavored markdown for reporting:

# LeakGuard Scan Results

**Version**: 1.0.4
**Files scanned**: 42
**Secrets found**: 2
**Duration**: 125ms

## Findings

### HIGH: AWS Access Key ID
**File**: `/src/config/prod.py:42`
**Rule ID**: aws-access-key-id

```python
42 | AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"
```

### MEDIUM: Slack Webhook
**File**: `/scripts/deploy.sh:15`
**Rule ID**: slack-webhook

```
15 | SLACK_WEBHOOK="https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX"
```

Ignoring Findings

Inline Ignore

Add a comment to ignore specific findings:

# leakguard:ignore aws-access-key-id
AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"  # This is a test key
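
The exact scoping rules of the marker are not described here. The Python sketch below shows one possible interpretation, assuming that a marker on the finding's own line or on the line directly above suppresses it, and that a marker without a rule ID suppresses every rule. Both assumptions and the helper name `is_ignored` are illustrative only.

```python
import re
from typing import List

# Matches "# leakguard:ignore" with an optional rule id after it.
IGNORE_MARKER = re.compile(r"#\s*leakguard:ignore(?:\s+(?P<rule>[\w-]+))?")


def is_ignored(lines: List[str], line_no: int, rule_id: str) -> bool:
    """Return True if the finding at 1-based line_no is suppressed.

    ASSUMPTION: the marker may sit on the same line or on the line above,
    and a bare marker (no rule id) applies to all rules.
    """
    for idx in (line_no - 1, line_no - 2):  # 0-based: the line itself, then the one above
        if 0 <= idx < len(lines):
            marker = IGNORE_MARKER.search(lines[idx])
            if marker and marker.group("rule") in (None, rule_id):
                return True
    return False


if __name__ == "__main__":
    source = [
        "# leakguard:ignore aws-access-key-id",
        'AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"  # This is a test key',
    ]
    print(is_ignored(source, 2, "aws-access-key-id"))
```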

File-level Ignore

Add a .leakguardignore file to your project root:

# Ignore specific rules
aws-access-key-id
slack-webhook

# Ignore specific files
**/test_keys.py
**/fixtures/*
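
Because the ignore file mixes rule IDs and path patterns, a reader has to tell the two apart. The Python sketch below shows one way to do that, assuming that any entry containing `/`, `*`, or `.` is a path glob and everything else is a rule ID. This classification rule is a guess for illustration, and `parse_ignore_file` is a hypothetical helper, not part of the LeakGuard API.

```python
from typing import List, Tuple


def parse_ignore_file(text: str) -> Tuple[List[str], List[str]]:
    """Split ignore entries into (rule_ids, path_patterns).

    ASSUMPTION: entries containing '/', '*' or '.' are path globs;
    every other entry is treated as a rule ID.
    """
    rule_ids: List[str] = []
    path_patterns: List[str] = []
    for raw in text.splitlines():
        entry = raw.strip()
        if not entry or entry.startswith("#"):
            continue  # skip blank lines and comments
        if any(ch in entry for ch in "/*."):
            path_patterns.append(entry)
        else:
            rule_ids.append(entry)
    return rule_ids, path_patterns


if __name__ == "__main__":
    sample = """# Ignore specific rules
aws-access-key-id
slack-webhook

# Ignore specific files
**/test_keys.py
**/fixtures/*
"""
    print(parse_ignore_file(sample))
```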

Contributing

We welcome contributions to LeakGuard! Here's how you can help:

Development Setup

  1. Clone the repository:
git clone https://github.com/adrian-lorenz/leakguard.git
cd leakguard
  2. Install Rust toolchain:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
rustup update
  3. Install Python dependencies:
pip install maturin pytest
  4. Build the project:
maturin develop --release

Code Structure

.
├── .github/            # GitHub workflows and issue templates
├── src/
│   ├── lib.rs          # Main library code
│   ├── main.rs         # CLI entry point
│   ├── rules/          # Detection rules
│   ├── scanner.rs      # Scanning logic
│   ├── output.rs       # Output formatters
│   └── config.rs       # Configuration handling
├── tests/              # Integration tests
├── Cargo.toml          # Rust manifest
└── pyproject.toml      # Python package configuration

Testing

Rust Tests

cargo test

Python Tests

pytest

Integration Tests

# Run with test fixtures
cargo test -- --ignored

Adding New Rules

  1. Create a new rule in src/rules/:
// Example rule definition (simplified, illustrative pattern)
use regex::Regex;

pub fn aws_access_key_id() -> Rule {
    Rule {
        id: "aws-access-key-id".to_string(),
        name: "AWS Access Key ID".to_string(),
        // "AKIA" followed by 16 uppercase letters or digits, e.g. AKIAIOSFODNN7EXAMPLE
        pattern: Regex::new(r"\bAKIA[0-9A-Z]{16}\b").unwrap(),
        severity: Severity::High,
        description: "Detects AWS Access Key IDs".to_string(),
        tags: vec!["aws".to_string(), "cloud".to_string()],
    }
}
  2. Register the rule in src/rules/mod.rs:
pub fn all_rules() -> Vec<Rule> {
    vec![
        aws::aws_access_key_id(),
        // ... other rules
        your_new_rule(),
    ]
}
  3. Add test cases in tests/rules.rs:
#[test]
fn test_your_new_rule() {
    let rule = your_new_rule();
    assert!(rule.pattern.is_match("AKIAIOSFODNN7EXAMPLE"));
    assert!(!rule.pattern.is_match("not-a-key"));
}

Code Style

Rust

  • Follow Rust's official style guidelines
  • Use cargo fmt for formatting
  • Use cargo clippy for linting
cargo fmt
cargo clippy -- -D warnings

Python

  • Follow PEP 8 guidelines
  • Use black for formatting
  • Use flake8 for linting
black .
flake8

Pull Request Process

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/your-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin feature/your-feature)
  5. Open a Pull Request

Release Process

  1. Update version in Cargo.toml
  2. Update changelog
  3. Create a tag (git tag -a vX.Y.Z -m "Release X.Y.Z")
  4. Push the tag (git push origin vX.Y.Z)
  5. GitHub Actions will build and publish the release

Community
