This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

Network Layer & SecClient

Purpose and Scope

This page documents the network layer of the Rust sec-fetcher application, specifically the SecClient HTTP client and its associated infrastructure. The SecClient provides the foundational HTTP communication layer for all SEC EDGAR API interactions, implementing throttling, caching, and retry logic to ensure reliable and compliant data fetching.

This page covers:

  • The SecClient structure and initialization
  • Throttling and rate limiting policies
  • HTTP caching mechanisms
  • Request/response handling
  • User-Agent management and email validation

For information about the specific network fetching functions that use SecClient (such as fetch_company_tickers, fetch_us_gaap_fundamentals, etc.), see Data Fetching Functions. For details on the caching system architecture, see Caching & Storage System.


SecClient Architecture Overview

Component Diagram

Sources: src/network/sec_client.rs:1-181


SecClient Structure and Initialization

SecClient Fields

The SecClient struct maintains four core components:

| Field | Type | Purpose |
| --- | --- | --- |
| email | String | Contact email for SEC User-Agent header |
| http_client | ClientWithMiddleware | reqwest client with middleware stack |
| cache_policy | Arc<CachePolicy> | Shared cache configuration |
| throttle_policy | Arc<ThrottlePolicy> | Shared throttle configuration |

Sources: src/network/sec_client.rs:14-19

Construction from ConfigManager

The from_config_manager() constructor performs the following initialization sequence:

  1. Extract Configuration: Reads email, max_concurrent, min_delay_ms, and max_retries from AppConfig
  2. Create CachePolicy: Configures the cache with a 1-week TTL and ignores HTTP cache-control headers (respect_headers is false)
  3. Create ThrottlePolicy: Configures rate limiting with adaptive jitter
  4. Initialize HTTP Cache: Retrieves the shared HTTP cache store from Caches
  5. Build Middleware Stack: Combines the cache and throttle layers
  6. Construct Client: Creates a ClientWithMiddleware with the full middleware stack

Sources: src/network/sec_client.rs:23-89

Configuration Parameters

The following table details the configuration parameters required by SecClient:

| Parameter | AppConfig Field | Required | Default | Purpose |
| --- | --- | --- | --- | --- |
| Email | email | Yes | None | SEC User-Agent compliance |
| Max Concurrent | max_concurrent | Yes | None | Concurrent request limit |
| Min Delay (ms) | min_delay_ms | Yes | None | Base throttle delay |
| Max Retries | max_retries | Yes | None | Retry attempt limit |

Sources: src/network/sec_client.rs:28-43
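
Because all four parameters are required and have no default, initialization has to reject a configuration that omits any of them. The following is a minimal sketch of that check, not the project's code: it assumes the fields are stored as Option values, and the AppConfig shape, the ClientSettings type, and the extract_settings function are hypothetical names introduced here for illustration.

```rust
/// Hypothetical stand-in for the four AppConfig fields listed above.
/// The real AppConfig may use different types and carries more fields.
#[derive(Debug, Default, Clone)]
pub struct AppConfig {
    pub email: Option<String>,
    pub max_concurrent: Option<u64>,
    pub min_delay_ms: Option<u64>,
    pub max_retries: Option<u64>,
}

/// The same four values once they are known to be present.
#[derive(Debug, PartialEq)]
pub struct ClientSettings {
    pub email: String,
    pub max_concurrent: u64,
    pub min_delay_ms: u64,
    pub max_retries: u64,
}

/// Returns the settings, or an error naming the first missing field.
pub fn extract_settings(config: &AppConfig) -> Result<ClientSettings, String> {
    fn missing(field: &str) -> String {
        format!("missing required config field: {field}")
    }

    Ok(ClientSettings {
        email: config.email.clone().ok_or_else(|| missing("email"))?,
        max_concurrent: config
            .max_concurrent
            .ok_or_else(|| missing("max_concurrent"))?,
        min_delay_ms: config.min_delay_ms.ok_or_else(|| missing("min_delay_ms"))?,
        max_retries: config.max_retries.ok_or_else(|| missing("max_retries"))?,
    })
}
```

Checking every field up front means a missing value surfaces as a single descriptive error before any HTTP client is built.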


Throttle Policy Configuration

ThrottlePolicy Structure

The ThrottlePolicy from reqwest_drive controls request rate limiting with the following parameters:

| Field | Type | Source | Purpose |
| --- | --- | --- | --- |
| base_delay_ms | u64 | AppConfig.min_delay_ms | Minimum delay between requests |
| max_concurrent | u64 | AppConfig.max_concurrent | Maximum concurrent requests |
| max_retries | u64 | AppConfig.max_retries | Maximum retry attempts |
| adaptive_jitter_ms | u64 | Hardcoded: 500 | Randomized delay for retry backoff |

Sources: src/network/sec_client.rs:52-59
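
The mapping in the table can be sketched as a small constructor. This is an illustration under assumptions: the struct below is a local mirror of the four documented fields rather than the reqwest_drive type itself, and throttle_policy_from_config and ADAPTIVE_JITTER_MS are hypothetical names.

```rust
/// Local mirror of the four fields in the table above. The real type is
/// reqwest_drive's ThrottlePolicy, whose exact definition may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct ThrottlePolicy {
    pub base_delay_ms: u64,
    pub max_concurrent: u64,
    pub max_retries: u64,
    pub adaptive_jitter_ms: u64,
}

/// The jitter is not configurable; the table lists it as hardcoded to 500 ms.
pub const ADAPTIVE_JITTER_MS: u64 = 500;

/// Builds the policy from the three AppConfig values plus the fixed jitter.
pub fn throttle_policy_from_config(
    min_delay_ms: u64,
    max_concurrent: u64,
    max_retries: u64,
) -> ThrottlePolicy {
    ThrottlePolicy {
        base_delay_ms: min_delay_ms, // AppConfig.min_delay_ms
        max_concurrent,              // AppConfig.max_concurrent
        max_retries,                 // AppConfig.max_retries
        adaptive_jitter_ms: ADAPTIVE_JITTER_MS,
    }
}
```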

Request Throttling Mechanism

The throttle mechanism operates as follows:

  1. Concurrent Limit: Enforces max_concurrent simultaneous requests
  2. Base Delay: Applies base_delay_ms between sequential requests
  3. Retry Logic: Retries failed requests up to max_retries times
  4. Adaptive Jitter: Adds a randomized delay of up to adaptive_jitter_ms on retries to prevent a thundering herd
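
The retry and jitter steps above can be modeled in a few lines. This sketch makes several assumptions that are not taken from reqwest_drive: the delay before a retry is modeled as base_delay_ms plus a uniform jitter in 0..=adaptive_jitter_ms, max_retries is read as the number of retries after the initial attempt, and the XorShift64, retry_delay, and run_with_retries names are hypothetical. The actual middleware may compute its backoff differently.

```rust
use std::time::Duration;

/// Minimal xorshift64 generator so the sketch needs no external crates.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // xorshift must not start at zero, or it would stay at zero forever.
        Self(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Delay before a retry: the base delay plus a jitter in 0..=adaptive_jitter_ms.
pub fn retry_delay(
    base_delay_ms: u64,
    adaptive_jitter_ms: u64,
    rng: &mut XorShift64,
) -> Duration {
    let jitter = rng.next_u64() % adaptive_jitter_ms.saturating_add(1);
    Duration::from_millis(base_delay_ms.saturating_add(jitter))
}

/// Runs `op` once, then retries up to `max_retries` more times on error.
/// `wait` receives each computed delay; pass `std::thread::sleep` to pause for real.
pub fn run_with_retries<T, E>(
    max_retries: u64,
    base_delay_ms: u64,
    adaptive_jitter_ms: u64,
    rng: &mut XorShift64,
    mut op: impl FnMut() -> Result<T, E>,
    mut wait: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Retries exhausted: surface the last error to the caller.
            Err(err) if attempt >= max_retries => return Err(err),
            Err(_) => {
                attempt += 1;
                wait(retry_delay(base_delay_ms, adaptive_jitter_ms, rng));
            }
        }
    }
}
```

Taking the wait step as a closure keeps the sketch testable without sleeping. The random component matters because clients that fail at the same moment would otherwise all retry at the same moment too.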

The ThrottlePolicy can be overridden on a per-request basis by passing a custom policy to raw_request().

Sources: src/network/sec_client.rs:140-165
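
One way to picture the per-request override is as a choice between the client's shared policy and an optional custom one. The sketch below is hypothetical: the struct is a local mirror of the documented fields, and without_retries and effective_policy are illustrative helpers, not functions from sec-fetcher or reqwest_drive. In the real client the custom policy travels with the request rather than through a helper like this.

```rust
/// Local mirror of the documented ThrottlePolicy fields (not the reqwest_drive type).
#[derive(Debug, Clone, PartialEq)]
pub struct ThrottlePolicy {
    pub base_delay_ms: u64,
    pub max_concurrent: u64,
    pub max_retries: u64,
    pub adaptive_jitter_ms: u64,
}

/// Derives a one-off policy that never retries, leaving other settings untouched.
pub fn without_retries(shared: &ThrottlePolicy) -> ThrottlePolicy {
    ThrottlePolicy {
        max_retries: 0,
        ..shared.clone()
    }
}

/// Resolves the policy for a single request: the custom policy wins when
/// supplied, otherwise the client's shared default applies.
pub fn effective_policy(
    shared: &ThrottlePolicy,
    custom: Option<ThrottlePolicy>,
) -> ThrottlePolicy {
    custom.unwrap_or_else(|| shared.clone())
}
```

A caller that wants a single request to fail immediately could pass Some(without_retries(&shared)), while every other request keeps the shared settings.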


Cache Policy Configuration

CachePolicy Structure

The CachePolicy configuration defines HTTP caching behavior:

| Field | Value | Purpose |
| --- | --- | --- |
| default_ttl | Duration::from_secs(60 * 60 * 24 * 7) | 1 week cache lifetime |
| respect_headers | false | Ignore HTTP cache control headers |
| cache_status_override | None | No status code override |

Note: The 1-week TTL is currently hardcoded and marked with a TODO comment for future configurability.

Sources: src/network/sec_client.rs:45-50
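
The TTL expression works out to 60 × 60 × 24 × 7 = 604,800 seconds. The sketch below mirrors the three documented settings in a local struct and adds a freshness check to show what a fixed TTL means once response headers are ignored. The Option<u16> type for cache_status_override and the sec_cache_policy and is_fresh helpers are assumptions for illustration; reqwest_drive's own CachePolicy and freshness logic may differ.

```rust
use std::time::Duration;

/// Local mirror of the three settings in the table above.
/// The Option<u16> type for cache_status_override is an assumption.
#[derive(Debug, Clone)]
pub struct CachePolicy {
    pub default_ttl: Duration,
    pub respect_headers: bool,
    pub cache_status_override: Option<u16>,
}

/// Builds the policy with the documented values.
pub fn sec_cache_policy() -> CachePolicy {
    CachePolicy {
        // 60 s * 60 min * 24 h * 7 days = 604,800 s (one week).
        default_ttl: Duration::from_secs(60 * 60 * 24 * 7),
        respect_headers: false,
        cache_status_override: None,
    }
}

/// With respect_headers disabled, freshness in this model depends only on
/// how old the cached entry is compared with default_ttl.
pub fn is_fresh(policy: &CachePolicy, age: Duration) -> bool {
    age < policy.default_ttl
}
```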

Cache Storage Integration

The cache storage uses the following integration:

  1. HTTP Cache Store: Retrieved via Caches::get_http_cache_store(), a static OnceLock singleton
  2. Drive Integration: Uses init_cache_with_drive_and_throttle() from reqwest_drive
  3. Persistent Storage: Backed by simd-r-drive WebSocket server for cross-session persistence

Sources: src/network/sec_client.rs:73-81
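
The static OnceLock singleton pattern named in the first item can be sketched with the standard library alone. The HttpCacheStore type below is a hypothetical placeholder for the real store, and lazy initialization on first access is an assumption; the actual Caches module may instead initialize its stores explicitly at startup.

```rust
use std::sync::{Arc, OnceLock};

/// Hypothetical placeholder for the real HTTP cache store type.
#[derive(Debug)]
pub struct HttpCacheStore {
    pub name: &'static str,
}

/// Process-wide slot that can be written at most once.
static HTTP_CACHE_STORE: OnceLock<Arc<HttpCacheStore>> = OnceLock::new();

/// Returns the shared store, creating it on first use.
/// Every caller receives a handle to the same underlying instance.
pub fn get_http_cache_store() -> Arc<HttpCacheStore> {
    HTTP_CACHE_STORE
        .get_or_init(|| Arc::new(HttpCacheStore { name: "http" }))
        .clone()
}
```

Cloning the Arc only bumps a reference count, so each client built in the same process shares one store rather than opening its own.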


User-Agent Management

User-Agent Format

The get_user_agent() method generates a compliant User-Agent string for SEC EDGAR API requests:

Format: <package_name>/<version> (+<email>)
Example: sec-fetcher/0.1.0 (+contact@example.com)

Sources: src/network/sec_client.rs:91-108
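
The format above is a single string interpolation. In this minimal sketch, format_user_agent is a hypothetical free function; the real get_user_agent() is a method on SecClient and presumably takes the package name and version from the crate's own metadata.

```rust
/// Formats a User-Agent string as <package_name>/<version> (+<email>).
pub fn format_user_agent(package_name: &str, version: &str, email: &str) -> String {
    format!("{package_name}/{version} (+{email})")
}
```

Calling format_user_agent("sec-fetcher", "0.1.0", "contact@example.com") yields the example string shown above.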

Email Validation

Email validation occurs when the User-Agent string is generated, not when the SecClient is instantiated. This design ensures that:

  1. Every network request validates the email format
  2. Invalid configurations fail on the first network call
  3. The email is validated regardless of how the SecClient was constructed

Sources: src/network/sec_client.rs:91-99
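
The deferred check can be illustrated with a small stand-in client. Everything here is a simplification: the real code relies on the email_address crate, whereas looks_like_email is a deliberately crude hypothetical check, and the Client type, the fixed package name and version, and the panic message are all assumptions.

```rust
/// Crude illustrative check: exactly one '@', a non-empty local part, and a
/// dotted domain. The email_address crate applies different, fuller rules.
pub fn looks_like_email(candidate: &str) -> bool {
    match candidate.split_once('@') {
        Some((local, domain)) => {
            !local.is_empty()
                && !domain.contains('@')
                && domain.split('.').count() >= 2
                && domain.split('.').all(|label| !label.is_empty())
        }
        None => false,
    }
}

/// Stand-in for SecClient that holds only the email.
pub struct Client {
    email: String,
}

impl Client {
    /// Construction stores the email without checking it.
    pub fn new(email: &str) -> Self {
        Self {
            email: email.to_string(),
        }
    }

    /// Validation happens here, so it runs whenever a User-Agent is built.
    /// The panic message is hypothetical.
    pub fn get_user_agent(&self) -> String {
        assert!(
            looks_like_email(&self.email),
            "Invalid email format: {}",
            self.email
        );
        format!("sec-fetcher/0.1.0 (+{})", self.email)
    }
}
```

With this shape, Client::new("not-an-email") succeeds, and the problem only surfaces when get_user_agent() is called for the first request.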


Request Methods

raw_request Method

The raw_request() method provides low-level HTTP request functionality:

Parameters:

  • method: HTTP method (GET, POST, etc.)
  • url: Target URL
  • headers: Optional additional headers as key-value tuples
  • custom_throttle_policy: Optional per-request throttle override

Behavior:

  1. Constructs request with User-Agent header
  2. Applies optional custom headers
  3. Injects custom throttle policy if provided (via request extensions)
  4. Executes request through middleware stack
  5. Returns raw reqwest::Response

Sources: src/network/sec_client.rs:140-165
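
Steps 1 through 3 of the behavior list, assembling the request before it enters the middleware stack, can be modeled with plain data. This is a sketch under assumptions: PreparedRequest, ThrottleOverride, and prepare_request are hypothetical names, and the real method builds a reqwest request and attaches the policy through request extensions rather than a struct field. Execution and the response (steps 4 and 5) are out of scope here.

```rust
/// Hypothetical per-request throttle override, reduced to one field.
#[derive(Debug, Clone, PartialEq)]
pub struct ThrottleOverride {
    pub max_retries: u64,
}

/// Plain-data stand-in for an outgoing request.
#[derive(Debug, PartialEq)]
pub struct PreparedRequest {
    pub method: String,
    pub url: String,
    pub headers: Vec<(String, String)>,
    pub throttle_override: Option<ThrottleOverride>,
}

pub fn prepare_request(
    method: &str,
    url: &str,
    user_agent: &str,
    headers: Option<Vec<(&str, &str)>>,
    throttle_override: Option<ThrottleOverride>,
) -> PreparedRequest {
    // Step 1: every request carries the User-Agent header.
    let mut all_headers = vec![("User-Agent".to_string(), user_agent.to_string())];

    // Step 2: append any caller-supplied headers.
    for (name, value) in headers.unwrap_or_default() {
        all_headers.push((name.to_string(), value.to_string()));
    }

    // Step 3: carry the optional throttle override alongside the request.
    PreparedRequest {
        method: method.to_string(),
        url: url.to_string(),
        headers: all_headers,
        throttle_override,
    }
}
```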

fetch_json Method

The fetch_json() method is a convenience wrapper for JSON API requests:

Flow:

  1. Calls raw_request() with GET method
  2. Awaits response
  3. Deserializes response body to serde_json::Value
  4. Returns parsed JSON

Sources: src/network/sec_client.rs:167-179

Request Flow Diagram

Sources: src/network/sec_client.rs:140-179


Testing Infrastructure

Test Organization

The SecClient tests use the mockito crate for HTTP mocking.

Sources: tests/sec_client_tests.rs:1-159

Test Coverage

The test suite covers the following scenarios:

| Test | Purpose | Mock Behavior | Assertion |
| --- | --- | --- | --- |
| test_user_agent | User-Agent formatting | N/A | Matches expected format |
| test_invalid_email_panic | Email validation | N/A | Panics with expected message |
| test_fetch_json_without_retry_success | Successful request | 200 JSON response | JSON parsed correctly |
| test_fetch_json_with_retry_success | Retry not needed | 200 JSON response | JSON parsed correctly |
| test_fetch_json_with_retry_failure | Retry exhaustion | 500 error (3x) | Returns error |
| test_fetch_json_with_retry_backoff | Retry with recovery | 500 → 200 | JSON parsed correctly |

Key Testing Patterns:

  1. Mock Server: mockito::Server::new_async() creates isolated HTTP endpoints
  2. Configuration: Tests use AppConfig::default() with overrides
  3. Async Execution: All network tests use #[tokio::test]
  4. Expectations: mockito verifies request counts with .expect(n)

Sources: tests/sec_client_tests.rs:7-158


Dependencies and Integration

External Dependencies

| Crate | Purpose | Usage in SecClient |
| --- | --- | --- |
| reqwest | HTTP client | Core HTTP functionality |
| reqwest_drive | Middleware stack | Cache and throttle integration |
| tokio | Async runtime | All async operations |
| serde_json | JSON parsing | Response deserialization |
| email_address | Email validation | User-Agent email checking |

Sources: src/network/sec_client.rs:1-12, Cargo.lock:1-100

Internal Dependencies

The SecClient integrates with:

  1. ConfigManager: Provides configuration for initialization (Configuration System)
  2. Caches: Supplies HTTP cache storage singleton (Caching & Storage System)
  3. Network Module: Exposed via src/network.rs module
  4. Fetch Functions: Used by all data fetching operations (Data Fetching Functions)

Sources: src/network.rs:1-23, src/network/sec_client.rs:1-12


Error Handling

Error Types

SecClient methods return Result<T, Box<dyn Error>>, allowing propagation of:

  1. reqwest::Error: HTTP client errors (connection, timeout, etc.)
  2. serde_json::Error: JSON deserialization errors
  3. Custom Errors: From configuration or validation
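
The way unrelated error types flow into a single Box<dyn Error> can be shown with the standard library alone. In this sketch, std::num::ParseIntError stands in for library errors such as reqwest::Error or serde_json::Error, and ConfigError and parse_retry_limit are hypothetical names used only for illustration.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical custom error, standing in for configuration or validation failures.
#[derive(Debug)]
pub struct ConfigError(pub String);

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "config error: {}", self.0)
    }
}

impl Error for ConfigError {}

/// Both failure paths use `?`, which converts each concrete error type
/// into Box<dyn Error> automatically.
pub fn parse_retry_limit(raw: Option<&str>) -> Result<u64, Box<dyn Error>> {
    // Custom error path: the value is absent.
    let text = raw.ok_or_else(|| ConfigError("max_retries is not set".into()))?;
    // Library error path: the value is present but not a number.
    let value = text.trim().parse::<u64>()?;
    Ok(value)
}
```

Callers that need to tell the cases apart can use downcast_ref on the boxed error; callers that only log or propagate can treat every failure uniformly.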

Panic Conditions

The only panic condition is an invalid email format detected in get_user_agent().

This is intentional, because an invalid email violates SEC API requirements.

Sources: src/network/sec_client.rs:95-98


Usage Example

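Typical SecClient usage follows the pattern in the integration test cited below: build the client from configuration, then call fetch_json() on the target URL.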

Sources: tests/sec_client_tests.rs:36-62