This documentation is part of the "Projects with Books" initiative at zenOSmosis.
The source code for this project is available on GitHub.
Testing Strategy
Relevant source files
- examples/us_gaap_human_readable.rs
- python/narrative_stack/us_gaap_store_integration_test.sh
- src/network/sec_client.rs
- tests/config_manager_tests.rs
- tests/distill_us_gaap_fundamental_concepts_tests.rs
- tests/sec_client_tests.rs
This page documents the testing approach for the rust-sec-fetcher codebase, covering both Rust unit tests and Python integration tests. The testing strategy emphasizes isolated unit testing with mocking for Rust components and end-to-end integration testing for the Python ML pipeline. For information about the CI/CD automation of these tests, see CI/CD Pipeline.
Test Architecture Overview
The codebase employs a dual-layer testing strategy that mirrors its dual-language architecture. Rust components use unit tests with HTTP mocking, while Python components use containerized integration tests with real database instances.
graph TB
subgraph "Rust Unit Tests"
SecClientTest["sec_client_tests.rs\nTest SecClient HTTP layer"]
DistillTest["distill_us_gaap_fundamental_concepts_tests.rs\nTest concept transformation"]
ConfigTest["config_manager_tests.rs\nTest configuration loading"]
end
subgraph "Test Infrastructure"
MockitoServer["mockito::Server\nHTTP mock server"]
TempDir["tempfile::TempDir\nTemporary config files"]
end
subgraph "Python Integration Tests"
IntegrationScript["us_gaap_store_integration_test.sh\nTest orchestration"]
PytestRunner["pytest\nTest execution"]
end
subgraph "Docker Test Environment"
MySQLContainer["us_gaap_test_db\nMySQL 8.0 container"]
SimdRDriveContainer["simd-r-drive-ws-server\nWebSocket server"]
SQLSchema["us_gaap_schema_2025.sql\nDatabase schema fixture"]
end
SecClientTest --> MockitoServer
ConfigTest --> TempDir
IntegrationScript --> MySQLContainer
IntegrationScript --> SimdRDriveContainer
IntegrationScript --> SQLSchema
IntegrationScript --> PytestRunner
PytestRunner --> MySQLContainer
PytestRunner --> SimdRDriveContainer
Test Component Relationships
Sources: tests/sec_client_tests.rs:1-159 tests/distill_us_gaap_fundamental_concepts_tests.rs:1-1275 tests/config_manager_tests.rs:1-95 python/narrative_stack/us_gaap_store_integration_test.sh:1-39
Rust Unit Testing Strategy
The Rust test suite focuses on three critical areas: HTTP client behavior, data transformation correctness, and configuration management. Tests are isolated using mocking frameworks and temporary file systems.
SecClient HTTP Testing
The SecClient tests verify HTTP operations, retry logic, and authentication header generation using the mockito library for HTTP mocking.
| Test Function | Purpose | Mock Behavior |
|---|---|---|
| test_user_agent | Validates User-Agent header format | N/A (direct method call) |
| test_invalid_email_panic | Ensures invalid emails cause panic | N/A (panic verification) |
| test_fetch_json_without_retry_success | Tests successful JSON fetch | Returns 200 with valid JSON |
| test_fetch_json_with_retry_success | Tests successful fetch with retry available | Returns 200 immediately |
| test_fetch_json_with_retry_failure | Verifies retry exhaustion | Returns 500 three times |
| test_fetch_json_with_retry_backoff | Tests retry with eventual success | Returns 500 once, then 200 |
User Agent Validation Test Pattern
Sources: tests/sec_client_tests.rs:6-21
The test creates a minimal AppConfig with only the email field set tests/sec_client_tests.rs:8-10, constructs a ConfigManager and SecClient tests/sec_client_tests.rs:10-12, and then verifies that the User-Agent format matches the expected pattern, including the package version from CARGO_PKG_VERSION tests/sec_client_tests.rs:14-20.
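A minimal sketch of this pattern is shown below. The AppConfig, ConfigManager, and SecClient names come from the test file, but the constructor and accessor names used here (ConfigManager::from_app_config, SecClient::from_config_manager, user_agent()) and the exact User-Agent contents are illustrative assumptions, not the repository's actual API.

```rust
// Sketch only: constructor and accessor names are hypothetical.
use rust_sec_fetcher::config::{AppConfig, ConfigManager};
use rust_sec_fetcher::network::SecClient;

#[test]
fn user_agent_sketch() {
    // Minimal config: only the email field is populated.
    let mut app_config = AppConfig::default();
    app_config.email = Some("test@example.com".to_string());

    // Hypothetical wiring; the real test builds these end to end in the same way.
    let config_manager = ConfigManager::from_app_config(app_config);
    let client = SecClient::from_config_manager(&config_manager).expect("client");

    // The header should embed the package version (CARGO_PKG_VERSION) and the
    // configured contact email.
    let user_agent = client.user_agent();
    assert!(user_agent.contains(env!("CARGO_PKG_VERSION")));
    assert!(user_agent.contains("test@example.com"));
}
```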
HTTP Retry and Backoff Testing
The retry logic tests use mockito::Server to simulate various HTTP response scenarios:
Sources: tests/sec_client_tests.rs:64-158
graph TB
SetupMock["Setup mockito::Server"]
ConfigureResponse["Configure mock response\nStatus, Body, Expect count"]
CreateClient["Create SecClient\nwith max_retries config"]
FetchJSON["Call client.fetch_json()"]
VerifyResult["Verify result:\nSuccess or Error"]
VerifyCallCount["Verify mock was called\nexpected number of times"]
SetupMock --> ConfigureResponse
ConfigureResponse --> CreateClient
CreateClient --> FetchJSON
FetchJSON --> VerifyResult
FetchJSON --> VerifyCallCount
The retry failure test configures a mock to return HTTP 500 and expects exactly 3 calls (the initial request plus 2 retries) tests/sec_client_tests.rs:96-103, with max_retries = 2 set in the config tests/sec_client_tests.rs:106. The test then verifies that the result is an error after the retries are exhausted tests/sec_client_tests.rs:111-119.
The backoff test simulates transient failures by creating two mock endpoints: one that returns 500 once, and another that returns 200 tests/sec_client_tests.rs:126-140. This validates that the retry mechanism successfully recovers from temporary failures.
Sources: tests/sec_client_tests.rs:94-158
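The sketch below illustrates the retry-failure scenario. The mockito builder calls are the real library API; the SecClient and ConfigManager wiring, the fetch_json signature, and the placement of the max_retries field are assumptions about the crate under test.

```rust
// Sketch only: mockito calls are real; the crate-specific wiring is assumed.
use rust_sec_fetcher::config::{AppConfig, ConfigManager};
use rust_sec_fetcher::network::SecClient;

#[tokio::test]
async fn fetch_json_with_retry_failure_sketch() {
    let mut server = mockito::Server::new_async().await;

    // Always answer with HTTP 500 and require exactly 3 hits:
    // the initial request plus 2 retries (max_retries = 2).
    let mock = server
        .mock("GET", "/files/company_tickers.json")
        .with_status(500)
        .expect(3)
        .create_async()
        .await;

    let mut app_config = AppConfig::default();
    app_config.email = Some("test@example.com".to_string());
    app_config.max_retries = Some(2);

    let config_manager = ConfigManager::from_app_config(app_config); // hypothetical
    let client = SecClient::from_config_manager(&config_manager).expect("client");

    let url = format!("{}/files/company_tickers.json", server.url());
    let result = client.fetch_json(&url).await; // assumed async signature

    assert!(result.is_err(), "retries should be exhausted");
    mock.assert_async().await; // fails the test if the call count differs from 3
}
```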
US GAAP Concept Transformation Testing
The distill_us_gaap_fundamental_concepts_tests.rs file contains 1,275 lines of comprehensive tests covering all 64 FundamentalConcept variants. These tests verify the complex mapping logic that normalizes diverse US GAAP terminology into a standardized taxonomy.
Test Coverage by Concept Category
| Category | Test Functions | Concept Variants Tested | Lines |
|---|---|---|---|
| Assets & Liabilities | 6 | 9 | 5-148 |
| Income Statement | 15 | 25 | 20-464 |
| Cash Flow | 12 | 15 | 550-683 |
| Equity | 7 | 13 | 150-286 |
| Revenue Variations | 2 (with 57+ assertions) | 45+ | 954-1188 |
Hierarchical Mapping Test Pattern
The tests verify that specific concepts map to both their precise category and parent categories:
Sources: tests/distill_us_gaap_fundamental_concepts_tests.rs:131-139
graph TB
InputConcept["Input: 'AssetsCurrent'"]
CallDistill["Call distill_us_gaap_fundamental_concepts()"]
ExpectVector["Expect Vec with 2 elements"]
CheckSpecific["Assert contains:\nFundamentalConcept::CurrentAssets"]
CheckParent["Assert contains:\nFundamentalConcept::Assets"]
InputConcept --> CallDistill
CallDistill --> ExpectVector
ExpectVector --> CheckSpecific
ExpectVector --> CheckParent
The test for AssetsCurrent verifies it maps to both CurrentAssets (specific) and Assets (parent category) tests/distill_us_gaap_fundamental_concepts_tests.rs:132-138. This hierarchical pattern appears throughout the test suite for concepts that have parent-child relationships.
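A condensed sketch of this assertion pattern, assuming distill_us_gaap_fundamental_concepts takes a concept name and returns a Vec of FundamentalConcept values; the import paths are assumptions, and the real signature may differ (for example by wrapping the result in Option):

```rust
// Sketch only: import paths and the exact return type are assumptions.
use rust_sec_fetcher::enums::FundamentalConcept;
use rust_sec_fetcher::transformers::distill_us_gaap_fundamental_concepts;

#[test]
fn current_assets_hierarchy_sketch() {
    let concepts = distill_us_gaap_fundamental_concepts("AssetsCurrent");

    // Two elements: the specific concept plus its parent category.
    assert_eq!(concepts.len(), 2);
    assert!(concepts.contains(&FundamentalConcept::CurrentAssets));
    assert!(concepts.contains(&FundamentalConcept::Assets));
}
```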
Synonym Consolidation Testing
Multiple test functions verify that synonym variations of the same concept map to a single canonical form:
Sources: tests/distill_us_gaap_fundamental_concepts_tests.rs:80-112
graph LR
CostOfRevenue["CostOfRevenue"]
CostOfGoodsAndServicesSold["CostOfGoodsAndServicesSold"]
CostOfServices["CostOfServices"]
CostOfGoodsSold["CostOfGoodsSold"]
CostOfGoodsSoldExcluding["CostOfGoodsSold...\nExcludingDepreciation"]
CostOfGoodsSoldElectric["CostOfGoodsSoldElectric"]
Canonical["FundamentalConcept::\nCostOfRevenue"]
CostOfRevenue --> Canonical
CostOfGoodsAndServicesSold --> Canonical
CostOfServices --> Canonical
CostOfGoodsSold --> Canonical
CostOfGoodsSoldExcluding --> Canonical
CostOfGoodsSoldElectric --> Canonical
The test_cost_of_revenue function contains 6 assertions verifying that different US GAAP terminology variants all map to FundamentalConcept::CostOfRevenue tests/distill_us_gaap_fundamental_concepts_tests.rs:81-111. This pattern ensures consistent normalization across different reporting styles.
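A sketch of the same idea written as a loop over a handful of the synonyms shown above; the import paths are assumptions, and the real test spells out each assertion individually:

```rust
// Sketch only: import paths are assumptions.
use rust_sec_fetcher::enums::FundamentalConcept;
use rust_sec_fetcher::transformers::distill_us_gaap_fundamental_concepts;

#[test]
fn cost_of_revenue_synonyms_sketch() {
    let synonyms = [
        "CostOfRevenue",
        "CostOfGoodsAndServicesSold",
        "CostOfServices",
        "CostOfGoodsSold",
        "CostOfGoodsSoldElectric",
    ];

    for synonym in synonyms {
        let concepts = distill_us_gaap_fundamental_concepts(synonym);
        assert!(
            concepts.contains(&FundamentalConcept::CostOfRevenue),
            "{synonym} should normalize to CostOfRevenue"
        );
    }
}
```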
Industry-Specific Revenue Testing
The revenue tests demonstrate the most complex mapping scenario, where 57+ industry-specific revenue concepts all normalize to FundamentalConcept::Revenues:
Sources: tests/distill_us_gaap_fundamental_concepts_tests.rs:954-1188
graph TB
subgraph "Standard Revenue Terms"
Revenues["Revenues"]
SalesRevenueNet["SalesRevenueNet"]
SalesRevenueServicesNet["SalesRevenueServicesNet"]
end
subgraph "Industry-Specific Terms"
HealthCare["HealthCareOrganizationRevenue"]
RealEstate["RealEstateRevenueNet"]
OilGas["OilAndGasRevenue"]
Financial["FinancialServicesRevenue"]
Advertising["AdvertisingRevenue"]
Subscription["SubscriptionRevenue"]
Mining["RevenueMineralSales"]
end
subgraph "Specialized Terms"
Franchisor["FranchisorRevenue"]
Admissions["AdmissionsRevenue"]
Licenses["LicensesRevenue"]
Royalty["RoyaltyRevenue"]
Clearing["ClearingFeesRevenue"]
Passenger["PassengerRevenue"]
end
Canonical["FundamentalConcept::Revenues"]
Revenues --> Canonical
SalesRevenueNet --> Canonical
SalesRevenueServicesNet --> Canonical
HealthCare --> Canonical
RealEstate --> Canonical
OilGas --> Canonical
Financial --> Canonical
Advertising --> Canonical
Subscription --> Canonical
Mining --> Canonical
Franchisor --> Canonical
Admissions --> Canonical
Licenses --> Canonical
Royalty --> Canonical
Clearing --> Canonical
Passenger --> Canonical
The test_revenues function contains 47 distinct assertions, each verifying that a different industry-specific revenue term maps to the canonical FundamentalConcept::Revenues tests/distill_us_gaap_fundamental_concepts_tests.rs:955-1188. Some terms also map to more specific sub-categories, such as InterestAndDividendIncomeOperating mapping to both InterestAndDividendIncomeOperating and Revenues tests/distill_us_gaap_fundamental_concepts_tests.rs:989-994.
graph TB
CreateTempDir["tempfile::tempdir()"]
CreatePath["dir.path().join('config.toml')"]
WriteContents["Write TOML contents\nto file"]
LoadConfig["ConfigManager::from_config(path)"]
AssertValues["Assert config values\nmatch expectations"]
DropTempDir["Drop TempDir\n(automatic cleanup)"]
CreateTempDir --> CreatePath
CreatePath --> WriteContents
WriteContents --> LoadConfig
LoadConfig --> AssertValues
AssertValues --> DropTempDir
Configuration Manager Testing
The config_manager_tests.rs file tests configuration loading, validation, and error handling using temporary files created with the tempfile crate.
Temporary Config File Test Pattern
Sources: tests/config_manager_tests.rs:8-57
The helper function create_temp_config creates a temporary directory and writes the config contents to a config.toml file tests/config_manager_tests.rs:9-17. The function returns both the TempDir (to prevent premature cleanup) and the PathBuf tests/config_manager_tests.rs:16. Tests explicitly drop the TempDir at the end to ensure cleanup tests/config_manager_tests.rs:56.
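A minimal sketch of such a helper, using the tempfile and standard library APIs described above; the repository version may differ in signature or error handling:

```rust
// Sketch of the helper described above.
use std::fs;
use std::io::Write;
use std::path::PathBuf;
use tempfile::TempDir;

fn create_temp_config(contents: &str) -> (TempDir, PathBuf) {
    let dir = tempfile::tempdir().expect("failed to create temp dir");
    let path = dir.path().join("config.toml");

    let mut file = fs::File::create(&path).expect("failed to create config.toml");
    writeln!(file, "{contents}").expect("failed to write config contents");

    // Returning the TempDir alongside the path keeps the directory alive until
    // the caller drops it.
    (dir, path)
}
```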
Invalid Configuration Key Detection
The test suite verifies that unknown configuration keys are rejected with helpful error messages:
| Test | Config Content | Expected Behavior |
|---|---|---|
| test_load_custom_config | Valid keys only | Success, values loaded |
| test_load_non_existent_config | N/A (file doesn't exist) | Error returned |
| test_fails_on_invalid_key | Contains invalid_key | Error with valid keys listed |
Sources: tests/config_manager_tests.rs:68-94
The invalid key test verifies that the error message contains documentation of valid configuration keys, including their types tests/config_manager_tests.rs:88-93:
- email (String | Null)
- max_concurrent (Integer | Null)
- max_retries (Integer | Null)
- min_delay_ms (Integer | Null)
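A sketch of how such a test might look, assuming ConfigManager::from_config returns a Result whose error type implements Display; the import path and the exact error-message contents are assumptions:

```rust
// Sketch only: import path and error-message contents are assumptions.
use std::io::Write;
use rust_sec_fetcher::config::ConfigManager;

#[test]
fn fails_on_invalid_key_sketch() {
    // Write a config containing a key the loader does not recognize.
    let dir = tempfile::tempdir().expect("failed to create temp dir");
    let path = dir.path().join("config.toml");
    let mut file = std::fs::File::create(&path).expect("failed to create config.toml");
    writeln!(file, "invalid_key = 42").expect("failed to write config");

    let message = match ConfigManager::from_config(&path) {
        Ok(_) => panic!("an unknown key should be rejected"),
        Err(err) => err.to_string(),
    };

    // The error should enumerate the valid keys and their types.
    assert!(message.contains("email"));
    assert!(message.contains("max_retries"));

    drop(dir); // explicit cleanup, mirroring the real tests
}
```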
Python Integration Testing Strategy
The Python testing strategy uses Docker Compose to orchestrate a complete test environment with MySQL database and WebSocket services, enabling end-to-end validation of the data pipeline.
graph TB
ScriptStart["us_gaap_store_integration_test.sh"]
SetupEnv["Set environment variables\nPROJECT_NAME=us_gaap_it\nCOMPOSE=docker compose -p ..."]
ActivateVenv["source .venv/bin/activate"]
InstallDeps["uv pip install -e . --group dev"]
RegisterTrap["trap 'cleanup' EXIT"]
StartContainers["docker compose up -d\n--profile test"]
WaitMySQL["Wait for MySQL ping\nmysqladmin ping loop"]
CreateDB["CREATE DATABASE us_gaap_test"]
LoadSchema["mysql < us_gaap_schema_2025.sql"]
RunPytest["pytest -s -v\ntest_us_gaap_store.py"]
Cleanup["Cleanup trap:\ndocker compose down\n--volumes --remove-orphans"]
ScriptStart --> SetupEnv
SetupEnv --> ActivateVenv
ActivateVenv --> InstallDeps
InstallDeps --> RegisterTrap
RegisterTrap --> StartContainers
StartContainers --> WaitMySQL
WaitMySQL --> CreateDB
CreateDB --> LoadSchema
LoadSchema --> RunPytest
RunPytest --> Cleanup
Integration Test Orchestration Flow
Sources: python/narrative_stack/us_gaap_store_integration_test.sh:1-39
Docker Compose Test Profile
The script uses an isolated Docker Compose project name to prevent conflicts with development containers python/narrative_stack/us_gaap_store_integration_test.sh:8-9:
```sh
PROJECT_NAME="us_gaap_it"
COMPOSE="docker compose -p $PROJECT_NAME --profile test"
```
The --profile test flag ensures that only test-specific services are started python/narrative_stack/us_gaap_store_integration_test.sh:23, which include:
- db_test (MySQL container named us_gaap_test_db)
- simd_r_drive_ws_server_test (WebSocket server)
Sources: python/narrative_stack/us_gaap_store_integration_test.sh:8-23
MySQL Test Database Setup
The integration test script performs a multi-step database initialization:
Sources: python/narrative_stack/us_gaap_store_integration_test.sh:25-35
The script waits for MySQL to be ready using a ping loop python/narrative_stack/us_gaap_store_integration_test.sh:25-28. It then creates the test database python/narrative_stack/us_gaap_store_integration_test.sh:30-31 and loads the schema from a fixture file python/narrative_stack/us_gaap_store_integration_test.sh:33-35.
Test Cleanup and Isolation
The script registers a cleanup trap to ensure test containers are always removed python/narrative_stack/us_gaap_store_integration_test.sh:14-19.
This trap executes on both successful completion and script failure, ensuring:
- All containers in the us_gaap_it project are stopped and removed
- All volumes are deleted (--volumes)
- Orphaned containers from previous runs are cleaned up (--remove-orphans)
Sources: python/narrative_stack/us_gaap_store_integration_test.sh:14-19
Pytest Execution Environment
The test environment is configured with specific paths and options python/narrative_stack/us_gaap_store_integration_test.sh:37-38:
| Configuration | Value | Purpose |
|---|---|---|
| PYTHONPATH | src | Ensure module imports work |
| pytest flags | -s -v | Show output, verbose mode |
| Test path | tests/integration/test_us_gaap_store.py | Integration test module |
The -s flag disables output capturing, allowing print statements and logs to display immediately during test execution, which is useful for debugging long-running integration tests. The -v flag enables verbose mode with detailed test function names.
Sources: python/narrative_stack/us_gaap_store_integration_test.sh:37-38
Test Fixtures and Mocking Patterns
The codebase uses different mocking strategies appropriate to each language and component.
graph TB
CreateServer["mockito::Server::new_async()"]
ConfigureMock["server.mock(method, path)\n.with_status(code)\n.with_body(json)\n.expect(count)"]
CreateAsyncMock["create_async().await"]
GetServerUrl["server.url()"]
MakeRequest["client.fetch_json(url)"]
VerifyMock["Mock automatically verifies\nexpected call count"]
CreateServer --> ConfigureMock
ConfigureMock --> CreateAsyncMock
CreateAsyncMock --> GetServerUrl
GetServerUrl --> MakeRequest
MakeRequest --> VerifyMock
Rust Mocking with mockito
The mockito library provides HTTP server mocking for testing the SecClient:
Sources: tests/sec_client_tests.rs:36-62
Mock configuration example from test_fetch_json_without_retry_success tests/sec_client_tests.rs:39-45:
- Method: GET
- Path: /files/company_tickers.json
- Status: 200
- Header: Content-Type: application/json
- Body: JSON string with sample ticker data
- Call using server.url() to get the mock endpoint
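A minimal sketch of this mock configuration using the mockito builder API; the JSON body is a placeholder rather than the repository's fixture, and reqwest stands in here for the SecClient request that the real test performs:

```rust
// Sketch only: placeholder body; reqwest substitutes for SecClient::fetch_json.
#[tokio::test]
async fn mock_configuration_sketch() {
    let mut server = mockito::Server::new_async().await;

    let mock = server
        .mock("GET", "/files/company_tickers.json")
        .with_status(200)
        .with_header("content-type", "application/json")
        .with_body(r#"{"0":{"cik_str":320193,"ticker":"AAPL","title":"Apple Inc."}}"#)
        .expect(1)
        .create_async()
        .await;

    // server.url() points at the mock endpoint.
    let endpoint = format!("{}/files/company_tickers.json", server.url());
    let body = reqwest::get(endpoint.as_str()).await.unwrap().text().await.unwrap();
    assert!(body.contains("AAPL"));

    mock.assert_async().await; // verifies the expected call count of 1
}
```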
graph LR
CreateTempDir["tempfile::tempdir()"]
GetPath["dir.path().join('config.toml')"]
CreateFile["fs::File::create(path)"]
WriteContents["writeln!(file, contents)"]
ReturnBoth["Return (TempDir, PathBuf)"]
CreateTempDir --> GetPath
GetPath --> CreateFile
CreateFile --> WriteContents
WriteContents --> ReturnBoth
Temporary File Fixtures
Configuration tests use the tempfile crate to create isolated test environments:
Sources: tests/config_manager_tests.rs:8-17
The create_temp_config helper returns both the TempDir and PathBuf to prevent premature cleanup tests/config_manager_tests.rs:16. The TempDir must remain in scope until after the test completes, as dropping it deletes the temporary directory.
SQL Schema Fixtures
The Python integration tests use a versioned SQL schema file as a fixture:
| File | Purpose | Usage |
|---|---|---|
| tests/integration/assets/us_gaap_schema_2025.sql | Define us_gaap_test database structure | Loaded via docker exec python/narrative_stack/us_gaap_store_integration_test.sh:33-35 |
This approach ensures tests run against a known database structure that matches production schema, enabling reliable testing of database operations and query logic.
Sources: python/narrative_stack/us_gaap_store_integration_test.sh:33-35
Test Execution Commands
The following table summarizes how to run different test suites:
| Test Suite | Command | Working Directory | Prerequisites |
|---|---|---|---|
| All Rust unit tests | cargo test | Repository root | Rust toolchain |
| Specific Rust test file | cargo test --test sec_client_tests | Repository root | Rust toolchain |
| Single Rust test function | cargo test test_user_agent | Repository root | Rust toolchain |
| Python integration tests | ./us_gaap_store_integration_test.sh | python/narrative_stack/ | Docker, Git LFS, uv |
| Python integration with pytest | pytest -s -v tests/integration/ | python/narrative_stack/ | Docker, containers running |
Prerequisites for Integration Tests
The Python integration test script requires python/narrative_stack/us_gaap_store_integration_test.sh:1-2:
- Git LFS enabled and updated (for large test fixtures)
- Docker with Docker Compose v2
- Python virtual environment with the uv package manager
- Test data files committed with Git LFS
Sources: tests/sec_client_tests.rs:1-159 tests/distill_us_gaap_fundamental_concepts_tests.rs:1-1275 tests/config_manager_tests.rs:1-95 python/narrative_stack/us_gaap_store_integration_test.sh:1-39