Clue API Development Guide¶
This guide outlines how to set up a development environment for the Clue API.
Prerequisites¶
Before you begin, ensure you have the following installed on your system:
- Python 3.10, 3.11, or 3.12 (3.12 recommended)
- Poetry for dependency management
- Docker and Docker Compose for test dependencies
- Git
Development Environment Setup¶
1. Clone the Repository¶
git clone https://github.com/CybercentreCanada/clue.git
cd clue/api
2. Install Poetry¶
If you don't have Poetry installed, follow the official installation guide.
Verify Poetry installation:
poetry --version
3. Set Up Python Environment¶
Configure Poetry to use Python 3.12 (or your preferred version):
poetry env use 3.12
Verify the environment:
poetry env info
4. Install Dependencies¶
Install all dependencies including development and test dependencies:
poetry install --all-extras --with test,dev,types
This will install:
- Core dependencies
- Server extras (Werkzeug, bcrypt, PyYAML, etc.)
- Test dependencies (pytest, mypy, coverage, etc.)
- Development dependencies (pre-commit, ruff, etc.)
- Type checking dependencies
5. Create Required Directories and Configuration¶
Create the necessary directories for Clue configuration and logs:
sudo mkdir -p /etc/clue/conf/
sudo mkdir -p /etc/clue/lookups/
sudo mkdir -p /var/log/clue
sudo chmod a+rw /etc/clue/conf/
sudo chmod a+rw /etc/clue/lookups/
sudo chmod a+rw /var/log/clue
Copy configuration files:
cp build_scripts/classification.yml /etc/clue/conf/classification.yml
cp test/unit/config.yml /etc/clue/conf/config.yml
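Before moving on, it can help to confirm that the directories and files from this step are in place. The following is a minimal sketch, not part of the Clue codebase: the function name `find_setup_problems` is hypothetical, and the paths are simply the ones used in the commands above.

```python
import os
from pathlib import Path

# Paths taken from the setup commands above.
REQUIRED_DIRS = ["/etc/clue/conf", "/etc/clue/lookups", "/var/log/clue"]
REQUIRED_FILES = [
    "/etc/clue/conf/classification.yml",
    "/etc/clue/conf/config.yml",
]


def find_setup_problems(dirs=REQUIRED_DIRS, files=REQUIRED_FILES):
    """Return a list of human-readable problems; an empty list means all good.

    Hypothetical helper for illustration only.
    """
    problems = []
    for d in dirs:
        path = Path(d)
        if not path.is_dir():
            problems.append(f"missing directory: {d}")
        elif not os.access(path, os.R_OK | os.W_OK):
            # Mirrors the `chmod a+rw` step: the current user needs both bits.
            problems.append(f"directory not readable and writable: {d}")
    for f in files:
        if not Path(f).is_file():
            problems.append(f"missing file: {f}")
    return problems
```

Calling `find_setup_problems()` with no arguments checks the default paths and returns an empty list when the step completed successfully.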
6. Start Test Dependencies¶
Start the required services (Redis and Keycloak) using Docker Compose:
cd dev
docker-compose up --build -d
cd ..
Wait for services to be healthy:
poetry run python build_scripts/keycloak_health.py
Development Workflow¶
Code Quality and Formatting¶
The project uses several tools to maintain code quality:
# Check formatting
poetry run ruff format clue --diff
# Apply formatting
poetry run ruff format clue
# Run linter checks
poetry run ruff check clue
# Fix auto-fixable issues
poetry run ruff check clue --fix
# Run type checking
poetry run type_check
# Run all tests
poetry run test
Test Specific Files or Functions¶
# Run specific test files
poetry run pytest test/unit/test_specific_module.py
# Run specific test functions
poetry run pytest test/unit/test_specific_module.py::test_function_name
Starting the Development Server¶
Start the Clue API server:
poetry run server
The server will start and be available at the configured port, usually 5000 (check your config.yml and environment variables for details).
Testing Enrichment Services¶
For testing connections to plugins from the central API, you may need to start additional test servers:
# Terminal 1
poetry run flask --app test.utils.test_server run --no-reload --port 5008
# Terminal 2
poetry run flask --app test.utils.bad_server run --no-reload --port 5009
# Terminal 3
poetry run flask --app test.utils.slow_server run --no-reload --port 5010
# Terminal 4
poetry run flask --app test.utils.telemetry_server run --no-reload --port 5011
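With four terminals open, it is easy to lose track of which test servers are actually up. The sketch below is an illustrative helper, not something shipped with Clue: it tries a TCP connection to each port from the commands above and reports the ones that do not answer. The names `is_listening` and `missing_servers` are hypothetical.

```python
import socket

# Module name -> port, as used in the commands above.
TEST_SERVER_PORTS = {
    "test_server": 5008,
    "bad_server": 5009,
    "slow_server": 5010,
    "telemetry_server": 5011,
}


def is_listening(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def missing_servers(ports=TEST_SERVER_PORTS):
    """Return the names of the servers whose port is not accepting connections."""
    return [name for name, port in ports.items() if not is_listening(port)]
```

A successful TCP connection only shows that a process is bound to the port; it does not prove that the expected Flask app is the one answering.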
Project Structure¶
api/
├── clue/ # Main application code
│ ├── api/ # API endpoints
│ ├── cache/ # Caching utilities
│ ├── common/ # Common utilities and helpers
│ ├── constants/ # Application constants
│ ├── cronjobs/ # Scheduled tasks
│ ├── extensions/ # Flask extensions
│ ├── helper/ # Helper modules
│ ├── models/ # Data models
│ ├── plugin/ # Plugin system
│ ├── remote/ # Remote service integrations
│ ├── security/ # Security modules
│ └── services/ # Business logic services
├── build_scripts/ # Build and utility scripts
├── dev/ # Development environment files
├── docs/ # Documentation
├── scripts/ # Utility scripts
└── test/ # Test files
    ├── integration/ # Integration tests
    ├── unit/ # Unit tests
    └── utils/ # Test utilities
Available Poetry Scripts¶
The project defines several convenient scripts in pyproject.toml:
- poetry run server - Start the Clue API server
- poetry run test - Run the test suite
- poetry run type_check - Run type checking
- poetry run coverage_report - Generate coverage reports (must be run after test)
- poetry run plugin - Interactive plugin management
Configuration¶
Environment Variables¶
The following environment variables can override configuration settings:
- CLUE_CONFIG_PATH - Path to the main configuration file (default: /etc/clue/conf/config.yml)
- CLUE_CLASSIFICATION_PATH - Path to the classification file (default: /etc/clue/conf/classification.yml)
- CLUE_PLUGIN_DIRECTORY - Path to where Clue extensions to the central API are stored (default: /etc/clue/plugins)
- CLUE_SESSION_COOKIE_SAMESITE - Set the SameSite attribute for session cookies. Must be Strict, Lax, or None
- CLUE_HSTS_MAX_AGE - HTTP Strict Transport Security max-age value in seconds
- FLASK_ENV - Flask environment (development/production)
- FLASK_DEBUG - Enable Flask debug mode
- REDIS_HOST - Override Redis hostname
- REDIS_PORT - Override Redis port
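To make the override behaviour concrete, here is a minimal sketch of how such variables could be resolved against their documented defaults. It is not Clue's actual loading code: `resolve_settings` is a hypothetical name, and treating the SameSite value as case-sensitive is an assumption.

```python
import os

# Defaults as documented in the list above.
DEFAULTS = {
    "CLUE_CONFIG_PATH": "/etc/clue/conf/config.yml",
    "CLUE_CLASSIFICATION_PATH": "/etc/clue/conf/classification.yml",
    "CLUE_PLUGIN_DIRECTORY": "/etc/clue/plugins",
}
ALLOWED_SAMESITE = {"Strict", "Lax", "None"}


def resolve_settings(environ=None):
    """Return effective settings: the environment value if set, else the default.

    Hypothetical helper; pass a dict as `environ` to test without touching
    the real process environment.
    """
    environ = os.environ if environ is None else environ
    settings = {name: environ.get(name, default) for name, default in DEFAULTS.items()}

    samesite = environ.get("CLUE_SESSION_COOKIE_SAMESITE")
    # Assumption: the value is compared case-sensitively.
    if samesite is not None and samesite not in ALLOWED_SAMESITE:
        raise ValueError(
            f"CLUE_SESSION_COOKIE_SAMESITE must be one of {sorted(ALLOWED_SAMESITE)}, "
            f"got {samesite!r}"
        )
    settings["CLUE_SESSION_COOKIE_SAMESITE"] = samesite
    return settings
```

For example, `resolve_settings({"CLUE_CONFIG_PATH": "/tmp/config.yml"})` overrides only the configuration path and leaves the other paths at their defaults.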
Configuration Files¶
Clue uses two main configuration files:
- /etc/clue/conf/config.yml - Main application configuration
- /etc/clue/conf/classification.yml - Classification configuration
Main Configuration (config.yml)¶
The main configuration file defines all aspects of the Clue API server. Here are the key sections:
API Configuration¶
api:
  # Security settings
  secret_key: "your-secret-key-here" # Flask secret key for sessions
  session_duration: 3600 # Session duration in seconds (1 hour)
  validate_session_ip: true # Validate session IP matches
  validate_session_useragent: true # Validate session user agent matches
  validate_session_xsrf_token: true # Enable XSRF token validation
  # Debugging and auditing
  debug: false # Enable Flask debug mode
  audit: true # Log API calls for auditing
  # Service discovery
  discover_url: null # Optional service discovery URL
  # External enrichment sources
  external_sources: [] # List of external enrichment services
  # OAuth on Behalf (OBO) targets
  obo_targets: {} # Services that Clue can OBO to
Authentication Configuration¶
auth:
  # API Key authentication
  allow_apikeys: false # Enable API key authentication
  apikeys: # Map of API keys to user identifiers
    "api-key-1": "user1"
    "api-key-2": "user2"
  # OAuth settings
  oauth:
    enabled: false # Enable OAuth authentication
    gravatar_enabled: false # Enable Gravatar for user avatars
    other_audiences: [] # Additional JWT audiences to accept
    providers: {} # OAuth provider configurations
  # Service account authentication
  service_account:
    enabled: false # Enable service account authentication
    accounts: [] # List of service account credentials
  # Token propagation
  propagate_clue_key: true # Include Clue token in OBO requests
Core Services Configuration¶
core:
  # Redis configuration
  redis:
    host: "127.0.0.1" # Redis server hostname
    port: 6379 # Redis server port
  # Extensions to load
  extensions: [] # List of Clue extensions to load
  # Metrics collection
  metrics:
    export_interval: 5 # Metrics export interval in seconds
    redis: # Redis instance for metrics
      host: "127.0.0.1"
      port: 6379
    apm_server: # Application Performance Monitoring
      server_url: null # APM server URL
      token: null # APM authentication token
Logging Configuration¶
logging:
  # Log levels: DEBUG, INFO, WARNING, ERROR, CRITICAL, DISABLED
  log_level: "DEBUG" # Current log level
  # Output destinations
  log_to_console: true # Log to console/stdout
  log_to_file: false # Log to files
  log_to_syslog: false # Log to syslog server
  # File logging settings
  log_directory: "/var/log/clue/" # Directory for log files
  log_as_json: false # Use JSON format for logs
  # Syslog settings
  syslog_host: "localhost" # Syslog server hostname
  syslog_port: 514 # Syslog server port
  # Health monitoring
  heartbeat_file: null # File to touch for health checks
  export_interval: 5 # Counter logging interval
UI Configuration¶
ui:
  cors_origins: [] # Allowed CORS origins for web UI
External Source Configuration¶
When configuring external enrichment sources, each source supports these options:
api:
  external_sources:
    - name: "example-source" # Unique source name
      url: "https://api.example.com" # Source API URL
      classification: "TLP:CLEAR" # Minimum classification level
      max_classification: "TLP:RED" # Maximum classification level
      include_default: true # Include in default queries
      production: false # Production-ready flag
      default_timeout: 30 # Request timeout in seconds
      built_in: true # Built-in source flag
      maintainer: "Admin <admin@example.com>" # RFC-5322 contact
      documentation_link: "https://docs.example.com" # Documentation URL
      datahub_link: "https://datahub.example.com" # DataHub entry
      obo_target: null # OBO target name
OAuth Provider Configuration¶
For OAuth authentication, configure providers like this:
auth:
  oauth:
    enabled: true
    providers:
      keycloak:
        client_id: "clue-api" # OAuth client ID
        client_secret: "your-client-secret" # OAuth client secret
        audience: "clue-api" # JWT audience
        scope: "openid profile email groups" # OAuth scopes
        jwks_uri: "https://auth.example.com/realms/clue/protocol/openid-connect/certs"
        access_token_url: "https://auth.example.com/realms/clue/protocol/openid-connect/token"
        authorize_url: "https://auth.example.com/realms/clue/protocol/openid-connect/auth"
        api_base_url: "https://auth.example.com/realms/clue/protocol/openid-connect"
        # User management
        auto_create: true # Auto-create users
        auto_sync: false # Auto-sync user data
        required_groups: ["clue-users"] # Required OAuth groups
        # Role and classification mapping
        role_map: # Map OAuth groups to Clue roles
          "clue-admins": "admin"
          "clue-analysts": "analyst"
        classification_map: # Map OAuth groups to clearance levels
          "clue-admins": "TLP:RED"
          "clue-analysts": "TLP:AMBER"
        # User ID configuration
        uid_randomize: false # Generate random usernames
        uid_regex: "^(.+)@example\\.com$" # Extract username from email
        uid_format: "{0}" # Username format string
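The interplay between uid_regex and uid_format is easier to see with a small example. The sketch below shows one plausible reading, in which the regex capture groups are substituted into the format string. This is an assumption about the semantics, not Clue's confirmed implementation, and `derive_username` is a hypothetical name. Note that the doubled backslash in the YAML double-quoted string becomes a single backslash in the actual regex.

```python
import re


def derive_username(email, uid_regex, uid_format="{0}"):
    """Build a username from an email address.

    Assumed semantics: the capture groups of `uid_regex` fill the positional
    placeholders of `uid_format`. Returns None when the email does not match.
    """
    match = re.match(uid_regex, email)
    if match is None:
        return None
    return uid_format.format(*match.groups())


# The regex from the configuration above, after YAML unescaping.
EXAMPLE_REGEX = r"^(.+)@example\.com$"
```

Under these assumptions, `derive_username("huey@example.com", EXAMPLE_REGEX)` yields `"huey"`, while an address from another domain yields `None`.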
Classification Configuration (classification.yml)¶
The classification configuration defines the data classification system used by Clue. This file follows the Assemblyline classification engine format.
For detailed information on configuring the classification engine, see the Assemblyline Classification Engine Documentation.
Key aspects of classification configuration:
- Classification Levels: Define hierarchical classification levels (e.g., TLP:CLEAR, TLP:GREEN, TLP:AMBER, TLP:RED)
- Required Classifications: Specify minimum classification levels for different data types
- Enforcement Rules: Configure how classifications are enforced and propagated
- Marking Schemes: Define visual and textual markings for classified data
Example basic classification configuration:
classification:
  enforce: true
  dynamic_groups: false
  levels:
    - TLP:CLEAR
    - TLP:GREEN
    - TLP:AMBER
    - TLP:RED
  required:
    - TLP:CLEAR
  groups:
    - name: "TLP"
      short_name: "TLP"
      description: "Traffic Light Protocol"
      auto_select: true
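Because the levels are hierarchical, access decisions reduce to comparing positions in the ordered list. The following is a simplified sketch of that idea only. It does not use or reproduce the Assemblyline classification engine, and the function names are hypothetical.

```python
# Ordered from least to most restrictive, as in the example configuration.
LEVELS = ["TLP:CLEAR", "TLP:GREEN", "TLP:AMBER", "TLP:RED"]


def is_accessible(user_clearance, data_classification, levels=LEVELS):
    """Return True if the user's clearance is at or above the data's level."""
    return levels.index(user_clearance) >= levels.index(data_classification)


def max_classification(a, b, levels=LEVELS):
    """Return the more restrictive of two levels, e.g. when combining results."""
    return max(a, b, key=levels.index)
```

With this ordering, a user cleared to TLP:AMBER can read TLP:GREEN data but not TLP:RED data. The real engine also handles groups, subgroups, and required markings, which this sketch leaves out.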
Configuration Validation¶
Clue validates configuration against a JSON schema on startup. If configuration is invalid, the server will fail to start with descriptive error messages indicating which settings need correction.
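To give a feel for the kind of mistake startup validation catches, here is a hand-written type check over a few of the settings documented above. It is a deliberately simplified stand-in, not JSON Schema validation and not Clue's validator. The names `EXPECTED_TYPES` and `find_type_errors` are hypothetical, and skipping missing keys assumes that defaults apply.

```python
# A few documented settings and the Python type each is expected to load as.
EXPECTED_TYPES = {
    ("api", "session_duration"): int,
    ("api", "debug"): bool,
    ("api", "audit"): bool,
    ("core", "redis", "host"): str,
    ("core", "redis", "port"): int,
    ("logging", "log_level"): str,
}

_MISSING = object()


def lookup(config, path):
    """Follow `path` through nested dicts; return _MISSING if any key is absent."""
    node = config
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


def find_type_errors(config, expected=EXPECTED_TYPES):
    """Return one message per setting whose value has the wrong type."""
    errors = []
    for path, expected_type in expected.items():
        value = lookup(config, path)
        if value is _MISSING:
            continue  # Assumption: missing keys fall back to defaults.
        wrong_type = not isinstance(value, expected_type)
        # bool is a subclass of int in Python, so reject it explicitly.
        bool_as_int = expected_type is int and isinstance(value, bool)
        if wrong_type or bool_as_int:
            errors.append(
                f"{'.'.join(path)}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
    return errors
```

A quoted number such as session_duration: "3600" is a typical case: YAML loads it as a string, and the check reports it by its dotted path.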
Docker Development¶
Building the Docker Image¶
Build the development Docker image:
poetry build
docker build -t clue-api:dev .
Docker Compose for Full Stack¶
The development environment uses Docker Compose to provide essential services for testing and development. The api/dev/docker-compose.yml file defines the following services:
Redis Service¶
redis:
  image: redis
  ports:
    - "6379:6379"
  healthcheck:
    test: ["CMD", "redis-cli", "ping"]
    interval: 30s
    timeout: 10s
    retries: 3
Purpose: Redis serves as the caching layer and session store for Clue. It's used for:
- Session management and user authentication state
- Caching enrichment results to improve performance
- Rate limiting and quota management
- Metrics aggregation and temporary data storage
Development Notes:
- Exposed on the standard port 6379
- Includes health checks to ensure the service is ready before tests run
- No persistence configured - data is lost when container stops (suitable for development)
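The container health check runs redis-cli ping. The same check can be approximated from the host without a Redis client library, because Redis accepts inline commands over a plain TCP connection and answers PING with +PONG. The sketch below is illustrative only, and `redis_ping` is a hypothetical name.

```python
import socket


def redis_ping(host="127.0.0.1", port=6379, timeout=2.0):
    """Return True if the server at host:port answers an inline PING with +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            # Inline command form: the command name followed by CRLF.
            conn.sendall(b"PING\r\n")
            return conn.recv(64).startswith(b"+PONG")
    except OSError:
        # Connection refused, timed out, or reset.
        return False
```

This works against the development container because it requires no authentication. A password-protected instance would reply with an error instead of +PONG, and the function would return False.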
Keycloak Service (Custom Build)¶
keycloak:
  build:
    context: ./keycloak
    dockerfile: Dockerfile
    no_cache: true
  environment:
    KC_HEALTH_ENABLED: true
  ports:
    - "9100:8080"
  expose:
    - "9100"
  healthcheck:
    test: ["CMD-SHELL", "curl -f http://localhost:8080/health/ready"]
    interval: 5s
    timeout: 5s
    retries: 15
Purpose: Keycloak provides OAuth/OpenID Connect authentication services for testing authentication flows in Clue.
Custom Dockerfile Features:
The custom Keycloak image (api/dev/keycloak/Dockerfile) extends the official Keycloak 18.0.2 image with:
- Pre-configured Admin Account:
  - Username: admin
  - Password: admin
  - Allows immediate access to Keycloak admin console
- Development Mode: Runs in development mode with debug enabled for easier troubleshooting
- Pre-imported Realm: Automatically imports the included realm configuration from keycloak-realm.json
- Enhanced Features: Enables token exchange and fine-grained authorization features that may be used by Clue
Pre-configured Test Users: The imported realm includes several test users for development and testing:
- admin - Administrative user
- dewey, donald, goose - Standard test users
- guest - Guest user with limited permissions
- huey, louie - Additional test users
Realm Configuration Highlights:
The keycloak-realm.json file configures a built-in realm with:
- Client Applications: Pre-configured OAuth clients for Clue API integration
- User Groups: Different user groups with varying permission levels
- Authentication Flows: Standard OAuth flows for web and API authentication
- Security Settings: Appropriate security headers and session management
- Internationalization: Support for English and French locales
- Token Lifespans: Configured for development (shorter lifespans for testing)
Starting the Development Stack¶
To start all development services:
cd api/dev
docker-compose up --build -d
To verify services are healthy:
# Check service status
docker-compose ps
# Check logs if needed
docker-compose logs redis
docker-compose logs keycloak
# Or use the built-in health check
poetry run python build_scripts/keycloak_health.py
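For readers curious what such a readiness wait involves, here is a minimal polling sketch. It is not the contents of build_scripts/keycloak_health.py, which this guide does not show. The URL http://localhost:9100/health/ready is an assumption derived from the port mapping and health check in the compose file, and the retry count and interval simply mirror that health check.

```python
import time
import urllib.error
import urllib.request


def is_ready(url, timeout=5.0):
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Non-2xx responses, refused connections, and timeouts all mean "not ready".
        return False


def wait_until_ready(check, retries=15, interval=5.0, sleep=time.sleep):
    """Call `check()` up to `retries` times; return True as soon as it succeeds."""
    for attempt in range(retries):
        if check():
            return True
        if attempt < retries - 1:
            sleep(interval)  # No need to sleep after the final attempt.
    return False


# Example, with the development stack running (assumed URL):
# wait_until_ready(lambda: is_ready("http://localhost:9100/health/ready"))
```

Passing the check as a callable keeps the retry loop independent of HTTP, so the same loop can wait on any other readiness probe.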
Accessing Services¶
- Redis: Available at localhost:6379 (no authentication required)
- Keycloak Admin Console: http://localhost:9100/admin
  - Username: admin
  - Password: admin
- Keycloak Realm: http://localhost:9100/realms/HogwartsMini
Stopping and Cleaning Up¶
# Stop services
docker-compose down
# Stop and remove volumes (clean slate)
docker-compose down -v
# Rebuild from scratch
docker-compose down -v && docker-compose up --build
Integration with Clue API¶
When the Clue API is configured for OAuth authentication, it can connect to the local Keycloak instance using:
auth:
  oauth:
    enabled: true
    providers:
      keycloak:
        client_id: "clue-api"
        jwks_uri: "http://localhost:9100/realms/HogwartsMini/protocol/openid-connect/certs"
        access_token_url: "http://localhost:9100/realms/HogwartsMini/protocol/openid-connect/token"
        authorize_url: "http://localhost:9100/realms/HogwartsMini/protocol/openid-connect/auth"
        api_base_url: "http://localhost:9100/realms/HogwartsMini/protocol/openid-connect"
        # ... other configuration
This setup provides a complete development environment for testing authentication, authorization, caching, and all Clue API functionality without requiring external services.
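The four provider URLs in the configuration above share one prefix built from the Keycloak base URL and the realm name. A small sketch makes that pattern explicit, which can help when pointing Clue at a different realm. The function name `keycloak_endpoints` is hypothetical and not part of Clue.

```python
def keycloak_endpoints(base_url, realm):
    """Return the OpenID Connect endpoint URLs for a Keycloak realm.

    Follows the URL layout used in the configuration above.
    """
    root = f"{base_url.rstrip('/')}/realms/{realm}/protocol/openid-connect"
    return {
        "jwks_uri": f"{root}/certs",
        "access_token_url": f"{root}/token",
        "authorize_url": f"{root}/auth",
        "api_base_url": root,
    }
```

Calling `keycloak_endpoints("http://localhost:9100", "HogwartsMini")` reproduces the four values shown for the local development instance.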
Troubleshooting¶
Environment Issues¶
If you encounter issues with the Python environment:
- Delete the existing environment: poetry env remove python
- Recreate it: poetry env use 3.12
- Reinstall dependencies: poetry install --all-extras --with test,dev,types
Service Dependencies¶
If Docker services aren't starting properly:
- Stop all containers: docker-compose down
- Remove volumes: docker-compose down -v
- Rebuild: docker-compose up --build
Additional Resources¶
Contributing¶
See Contributing for general information on how to contribute.
Beyond the standard contributing practices, the following guidelines apply specifically to the Clue API project:
Pre-commit Hooks¶
Install pre-commit hooks to automatically run checks before commits:
poetry run pre-commit install
Pre-submission Checks¶
Before opening a PR, ensure:¶
- All tests pass: poetry run test
- Code is properly formatted: poetry run ruff format clue
- Linting passes: poetry run ruff check clue
- Type checking passes: poetry run type_check
Installing the pre-commit hooks also helps catch these issues before each commit.
Getting Help¶
You can reach the Clue development team on the CCCS aurora Discord: https://discord.gg/GUAy9wErNu