
Main Application Flow

This document describes the main application entry point and execution flow in src/main.rs, including initialization, data fetching loops, CSV output organization, and error handling strategies. The main application orchestrates the high-level data collection process by coordinating the configuration system, network layer, and storage systems.

For detailed information about the network client and its middleware layers, see SecClient. For data model structures used throughout the application, see Data Models & Enumerations. For configuration and credential management, see Configuration System.

Application Entry Point and Initialization

The application entry point is the main() function in src/main.rs:174-240. The initialization sequence follows a consistent pattern regardless of which processing mode is active:

Sources: src/main.rs:174-180

```mermaid
graph TB
    Start["Program Start\n#[tokio::main]\nasync fn main()"]
    LoadConfig["ConfigManager::load()\nLoad TOML configuration\nValidate credentials"]
    CreateClient["SecClient::from_config_manager()\nInitialize HTTP client\nApply throttling & caching"]
    CheckMode{"Processing\nMode?"}
    FetchTickers["fetch_company_tickers(&client)\nRetrieve complete ticker dataset"]
    FetchInvCo["fetch_investment_company_\nseries_and_class_dataset(&client)\nGet fund listings"]
    USGAAPLoop["US-GAAP Processing Loop\n(Active Implementation)"]
    InvCoLoop["Investment Company Loop\n(Commented Prototypes)"]
    ErrorReport["Generate Error Summary\nPrint failed tickers"]
    End["Program Exit"]

    Start --> LoadConfig
    LoadConfig --> CreateClient
    CreateClient --> CheckMode
    CheckMode -->|US-GAAP Mode| FetchTickers
    CheckMode -->|Investment Co Mode| FetchInvCo
    FetchTickers --> USGAAPLoop
    FetchInvCo --> InvCoLoop
    USGAAPLoop --> ErrorReport
    InvCoLoop --> ErrorReport
    ErrorReport --> End
```

The initialization handles errors at each stage, returning early if critical components fail to load:

| Initialization Step | Error Handling | Line Reference |
| --- | --- | --- |
| Configuration Loading | Returns Box<dyn Error> on failure | src/main.rs:176 |
| Client Creation | Returns Box<dyn Error> on failure | src/main.rs:178 |
| Ticker Fetching | Returns Box<dyn Error> on failure | src/main.rs:180 |
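
In code, this early-return pattern reduces to a chain of ? operators at the top of main(). A minimal sketch, assuming the function names shown in the diagram above (exact signatures in src/main.rs may differ):

```rust
use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    // Each `?` below propagates a fatal error out of main() immediately.
    let config_manager = ConfigManager::load()?;
    let client = SecClient::from_config_manager(&config_manager)?;
    let company_tickers = fetch_company_tickers(&client).await?;

    // ... per-ticker processing loop (see below) ...

    Ok(())
}
```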

Active Implementation: US-GAAP Fundamentals Processing

The currently active implementation in main.rs processes US-GAAP fundamentals for all company tickers. This represents the production data collection pipeline.

Sources: src/main.rs:174-240

```mermaid
graph TB
    Start["Start US-GAAP Processing"]
    FetchTickers["fetch_company_tickers(&client)\nReturns Vec&lt;Ticker&gt;"]
    PrintTotal["Print total records\nDisplay first 60 tickers"]
    InitErrorLog["Initialize error_log:\nHashMap&lt;String, String&gt;"]
    LoopStart{"For each\ncompany_ticker\nin tickers"}
    PrintProgress["Print processing status:\nticker_symbol (i+1 of total)"]
    FetchFundamentals["fetch_us_gaap_fundamentals(\n&client,\n&company_tickers,\n&ticker_symbol)"]
    CreateFile["File::create(\ndata/22-june-us-gaap/{ticker}.csv)"]
    WriteCSV["CsvWriter::new(&mut file)\n.include_header(true)\n.finish(&mut df)"]
    CaptureError["Insert error into error_log:\nticker → error message"]
    CheckErrors{"error_log\nempty?"}
    PrintErrors["Print error summary:\nList all failed tickers"]
    PrintSuccess["Print success message"]
    End["End Processing"]

    Start --> FetchTickers
    FetchTickers --> PrintTotal
    PrintTotal --> InitErrorLog
    InitErrorLog --> LoopStart
    LoopStart -->|Next ticker| PrintProgress
    PrintProgress --> FetchFundamentals
    FetchFundamentals -->|Success| CreateFile
    CreateFile -->|Success| WriteCSV
    FetchFundamentals -->|Error| CaptureError
    CreateFile -->|Error| CaptureError
    WriteCSV -->|Error| CaptureError
    WriteCSV -->|Success| LoopStart
    CaptureError --> LoopStart
    LoopStart -->|Complete| CheckErrors
    CheckErrors -->|Has errors| PrintErrors
    CheckErrors -->|No errors| PrintSuccess
    PrintErrors --> End
    PrintSuccess --> End
```

Processing Loop Details

The main processing loop iterates through all company tickers sequentially, performing the following operations for each ticker:

  1. Progress Logging - src/main.rs:190-195: Prints current position in the dataset
  2. Data Fetching - src/main.rs:203: Calls fetch_us_gaap_fundamentals() with client, full ticker list, and current ticker symbol
  3. CSV Generation - src/main.rs:206-214: Creates file and writes DataFrame with headers
  4. Error Capture - src/main.rs:212-225: Logs any failures to error_log HashMap
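
Putting these steps together, every failure path converges on the error_log map while success falls through to the next iteration. A condensed sketch of the loop body, assuming a Polars DataFrame return type and a symbol field on the ticker struct (both assumptions, not verified signatures):

```rust
use std::collections::HashMap;
use std::fs::File;
use polars::prelude::*;

let mut error_log: HashMap<String, String> = HashMap::new();
let total = company_tickers.len();

for (i, company_ticker) in company_tickers.iter().enumerate() {
    let ticker_symbol = company_ticker.symbol.clone(); // field name assumed
    println!("Processing {} ({} of {})", ticker_symbol, i + 1, total);

    match fetch_us_gaap_fundamentals(&client, &company_tickers, &ticker_symbol).await {
        Ok(mut df) => {
            // File creation and CSV writing failures both land in error_log.
            let written = File::create(format!("data/22-june-us-gaap/{ticker_symbol}.csv"))
                .map_err(|e| e.to_string())
                .and_then(|mut file| {
                    CsvWriter::new(&mut file)
                        .include_header(true)
                        .finish(&mut df)
                        .map_err(|e| e.to_string())
                });
            if let Err(message) = written {
                error_log.insert(ticker_symbol, message);
            }
        }
        // Fetch errors are recorded and processing continues with the next ticker.
        Err(e) => {
            error_log.insert(ticker_symbol, e.to_string());
        }
    }
}
```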

Error Handling Strategy

The application uses a non-fatal error handling approach:

| Error Type | Handling Strategy | Line Reference |
| --- | --- | --- |
| Fetch Error | Logged to error_log, processing continues | src/main.rs:223-225 |
| File Creation Error | Logged to error_log, processing continues | src/main.rs:217-220 |
| CSV Write Error | Logged to error_log, processing continues | src/main.rs:211-214 |
| Configuration Error | Fatal, returns immediately | src/main.rs:176 |
| Client Creation Error | Fatal, returns immediately | src/main.rs:178 |

Sources: src/main.rs:185-227
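
The end-of-run summary then reduces to a branch on whether error_log is empty, roughly:

```rust
if error_log.is_empty() {
    println!("All tickers processed successfully.");
} else {
    eprintln!("{} tickers failed:", error_log.len());
    for (ticker, message) in &error_log {
        eprintln!("  {ticker}: {message}");
    }
}
```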

CSV Output Organization

The US-GAAP processing mode writes output to a flat directory structure:

data/22-june-us-gaap/
├── AAPL.csv
├── MSFT.csv
├── GOOGL.csv
└── ...

Each file is named using the ticker symbol and contains the complete US-GAAP fundamentals DataFrame for that company.

Sources: src/main.rs:205

Investment Company Processing Flow (Prototype)

Two commented-out prototype implementations exist in main.rs for processing investment company data. These represent alternative processing modes that may be activated in future iterations.

Production Prototype (Lines 18-122)

The first prototype implements production-ready investment company processing with comprehensive error handling:

Sources: src/main.rs:18-122

```mermaid
graph TB
    Start["Start Investment Co Processing"]
    InitLogger["env_logger::Builder\n.filter(None, LevelFilter::Info)"]
    InitErrorLog["Initialize error_log: Vec&lt;String&gt;"]
    LoadConfig["ConfigManager::load()\nHandle fatal errors"]
    CreateClient["SecClient::from_config_manager()\nHandle fatal errors"]
    FetchInvCo["fetch_investment_company_\nseries_and_class_dataset(&client)"]
    LogTotal["info! Total investment companies"]
    LoopStart{"For each fund\nin investment_\ncompanies"}
    LogProgress["info! Processing: i+1 of total"]
    CheckTicker{"fund.class_\nticker exists?"}
    LogTicker["info! Ticker symbol"]
    FetchNPORT["fetch_nport_filing_by_\nticker_symbol(&client, ticker)"]
    GetFirstLetter["ticker.chars().next()\n.to_ascii_uppercase()"]
    CreateDir["create_dir_all(\ndata/fund-holdings/{letter})"]
    LogRecords["info! Total records"]
    WriteCSV["latest_nport_filing\n.write_to_csv(file_path)"]
    LogError["error! Log message\nerror_log.push(msg)"]
    CheckFinal{"error_log\nempty?"}
    PrintSummary["error! Print error summary"]
    PrintSuccess["info! All funds successful"]
    End["End Processing"]

    Start --> InitLogger
    InitLogger --> InitErrorLog
    InitErrorLog --> LoadConfig
    LoadConfig -->|Error| LogError
    LoadConfig -->|Success| CreateClient
    CreateClient -->|Error| LogError
    CreateClient -->|Success| FetchInvCo
    FetchInvCo -->|Error| LogError
    FetchInvCo -->|Success| LogTotal
    LogTotal --> LoopStart
    LoopStart -->|Next fund| LogProgress
    LogProgress --> CheckTicker
    CheckTicker -->|Yes| LogTicker
    CheckTicker -->|No| LoopStart
    LogTicker --> FetchNPORT
    FetchNPORT -->|Error| LogError
    FetchNPORT -->|Success| GetFirstLetter
    GetFirstLetter --> CreateDir
    CreateDir -->|Error| LogError
    CreateDir -->|Success| LogRecords
    LogRecords --> WriteCSV
    WriteCSV -->|Error| LogError
    WriteCSV -->|Success| LoopStart
    LogError --> LoopStart
    LoopStart -->|Complete| CheckFinal
    CheckFinal -->|Has errors| PrintSummary
    CheckFinal -->|No errors| PrintSuccess
    PrintSummary --> End
    PrintSuccess --> End
```

Directory Organization for Fund Holdings

The investment company prototype organizes CSV output by the first letter of the ticker symbol:

data/fund-holdings/
├── A/
│   ├── AADR.csv
│   └── AAXJ.csv
├── B/
│   ├── BKLN.csv
│   └── BND.csv
├── V/
│   ├── VTI.csv
│   └── VXUS.csv
└── ...

This alphabetical categorization is implemented at src/main.rs:83-84 using ticker_symbol.chars().next().unwrap().to_ascii_uppercase().

Sources: src/main.rs:82-107
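
The bucketing logic is small enough to sketch inline. This version assumes it runs inside a function that returns Result so the ? operator can propagate I/O errors; the unwrap() mirrors the expression cited above, since the ticker is known to be non-empty at that point:

```rust
use std::fs::create_dir_all;

// Derive the bucket from the first character of the ticker symbol.
let first_letter = ticker_symbol.chars().next().unwrap().to_ascii_uppercase();
let dir = format!("data/fund-holdings/{first_letter}");

// Create the bucket directory if needed, then build the output file path.
create_dir_all(&dir)?;
let file_path = format!("{dir}/{ticker_symbol}.csv");
```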

Debug Prototype (Lines 124-171)

A simpler debug version exists for testing and development, with minimal error handling and console output:

| Feature | Debug Prototype | Production Prototype |
| --- | --- | --- |
| Error Logging | Fatal errors only | Comprehensive Vec/HashMap logs |
| Logger | Not initialized | env_logger with INFO level |
| Progress Tracking | Simple println! | Structured info! macros |
| Error Recovery | Immediate exit with ? | Continue processing, log errors |
| Output | Console + CSV | CSV only |

Sources: src/main.rs:124-171
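
The logger initialization that separates the two prototypes is a one-time env_logger setup; a sketch of the call shown in the production prototype's diagram:

```rust
use env_logger::Builder;
use log::LevelFilter;

// Production prototype: route info!/error! macros to stderr at INFO level and above.
Builder::new().filter(None, LevelFilter::Info).init();
```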

Data Flow Through Network Layer

The main application delegates all SEC API interactions to the network layer, which handles caching, throttling, and retries transparently:

Sources: src/main.rs:3-9 src/network/fetch_us_gaap_fundamentals.rs:12-28 src/network/fetch_nport_filing.rs:10-49 src/network/fetch_investment_company_series_and_class_dataset.rs:31-105

```mermaid
graph LR
    Main["main.rs\nmain()"]
    FetchTickers["network::fetch_company_tickers\nGET company_tickers.json"]
    FetchGAAP["network::fetch_us_gaap_fundamentals\nGET CIK{cik}/company-facts.json"]
    FetchInvCo["network::fetch_investment_company_\nseries_and_class_dataset\nGET investment-company-series-class-{year}.csv"]
    FetchNPORT["network::fetch_nport_filing_\nby_ticker_symbol\nGET data/{cik}/{accession}/primary_doc.xml"]
    SecClient["SecClient\nHTTP middleware\nThrottling, Caching, Retries"]
    SECAPI["SEC EDGAR API\nwww.sec.gov"]

    Main --> FetchTickers
    Main --> FetchGAAP
    Main --> FetchInvCo
    Main --> FetchNPORT
    FetchTickers --> SecClient
    FetchGAAP --> SecClient
    FetchInvCo --> SecClient
    FetchNPORT --> SecClient
    SecClient --> SECAPI
```

The network functions are designed to be composable, allowing the main application to chain multiple API calls together (e.g., fetching a CIK first, then using it to fetch submissions).

Composite Data Fetching Operations

Several network functions perform composite operations by calling other network functions. For example, fetch_nport_filing_by_ticker_symbol orchestrates multiple API calls:

Sources: src/network/fetch_nport_filing.rs:10-49

```mermaid
graph TB
    Input["Input: ticker_symbol"]
    FetchCIK["fetch_cik_by_ticker_symbol\n(&sec_client, ticker_symbol)"]
    FetchSubs["fetch_cik_submissions\n(&sec_client, cik)"]
    GetRecent["CikSubmission::most_recent_\nnport_p_submission(submissions)"]
    FetchFiling["fetch_nport_filing_by_cik_\nand_accession_number(\n&sec_client, cik, accession)"]
    FetchTickers["fetch_company_tickers\n(&sec_client)"]
    ParseXML["parse_nport_xml(\n&xml_data, &company_tickers)"]
    Output["Output: Vec&lt;NportInvestment&gt;"]

    Input --> FetchCIK
    FetchCIK --> FetchSubs
    FetchSubs --> GetRecent
    GetRecent --> FetchFiling
    FetchFiling --> FetchTickers
    FetchTickers --> ParseXML
    ParseXML --> Output
```

This composition pattern allows the main application to use high-level functions while the network layer handles the complexity of multi-step API interactions.
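
Expressed as code, the composition is a straight-line sequence of awaits with ? propagation. A hedged sketch assuming the function names from the diagram; the accession_number field and exact return types are assumptions:

```rust
use std::error::Error;

async fn fetch_nport_filing_by_ticker_symbol(
    sec_client: &SecClient,
    ticker_symbol: &str,
) -> Result<Vec<NportInvestment>, Box<dyn Error>> {
    // Resolve the ticker to a CIK, then find its most recent NPORT-P filing.
    let cik = fetch_cik_by_ticker_symbol(sec_client, ticker_symbol).await?;
    let submissions = fetch_cik_submissions(sec_client, &cik).await?;
    let submission = CikSubmission::most_recent_nport_p_submission(submissions)
        .ok_or("no NPORT-P submission found")?;

    // Fetch the filing XML and the ticker dataset needed to enrich the parse.
    let xml_data = fetch_nport_filing_by_cik_and_accession_number(
        sec_client,
        &cik,
        &submission.accession_number, // field name assumed
    )
    .await?;
    let company_tickers = fetch_company_tickers(sec_client).await?;

    parse_nport_xml(&xml_data, &company_tickers)
}
```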

Error Propagation and Recovery

The application uses Rust's Result type throughout the call stack, with different error handling strategies at each level:

Sources: src/main.rs:174-240 src/network/fetch_us_gaap_fundamentals.rs:12-28

```mermaid
graph TB
    MainFn["main() -> Result&lt;(), Box&lt;dyn Error&gt;&gt;"]
    NetworkFn["Network functions\n-> Result&lt;T, Box&lt;dyn Error&gt;&gt;"]
    Parser["Parsers\n-> Result&lt;T, Box&lt;dyn Error&gt;&gt;"]
    SecClient["SecClient\nImplements retries & backoff"]
    MainFatal["Fatal Errors:\nConfig load failure\nClient creation failure\nInitial ticker fetch failure"]
    MainNonFatal["Non-Fatal Errors:\nIndividual ticker fetch failures\nCSV write failures\n→ Logged to error_log\n→ Processing continues"]
    NetworkRetry["Automatic Retries:\nHTTP timeouts\nRate limit errors\n→ Exponential backoff\n→ Max 3 retries"]
    ParserErr["Parser Errors:\nMalformed XML/JSON\nMissing required fields\n→ Propagate to caller"]

    MainFn --> MainFatal
    MainFn --> MainNonFatal
    MainFn --> NetworkFn
    NetworkFn --> NetworkRetry
    NetworkFn --> SecClient
    NetworkFn --> Parser
    Parser --> ParserErr
    NetworkRetry --> MainNonFatal
    ParserErr --> MainNonFatal
```

Fatal errors (configuration, client setup) immediately return from main(), while per-ticker errors are captured in error_log and reported at the end without interrupting the processing loop.

File System Output Structure

The application writes structured CSV files to the local file system. The output directory organization differs between processing modes:

US-GAAP Mode Output

data/
└── 22-june-us-gaap/
    ├── AAPL.csv
    ├── MSFT.csv
    └── {ticker}.csv

Sources: src/main.rs:205

Investment Company Mode Output

data/
└── fund-holdings/
    ├── A/
    │   ├── AADR.csv
    │   └── {ticker}.csv
    ├── B/
    │   └── {ticker}.csv
    └── {letter}/
        └── {ticker}.csv

Sources: src/main.rs:82-102

The alphabetical organization in investment company mode enables efficient file system operations when dealing with thousands of fund ticker symbols, avoiding the creation of a single directory with excessive entries.

Performance Characteristics

The main application is designed for batch processing with the following characteristics:

| Aspect | Implementation | Location |
| --- | --- | --- |
| Concurrency | Sequential processing (no parallelization in main loop) | src/main.rs:187 |
| Rate Limiting | Handled by SecClient throttle policy | See SecClient |
| Caching | HTTP cache and preprocessor cache in SecClient | See Caching System |
| Memory Management | Processes one ticker at a time, drops DataFrames after write | src/main.rs:203-221 |
| Error Recovery | Non-fatal errors logged, processing continues | src/main.rs:185-227 |

The sequential processing model ensures consistent rate limiting and simplifies error tracking, though it could be parallelized in future iterations using Tokio tasks or Rayon parallel iterators.

Sources: src/main.rs:174-240
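
For illustration only, a bounded-concurrency variant of the loop could be built with futures::stream::buffer_unordered. Nothing like this exists in src/main.rs, and the symbol field is assumed as before:

```rust
use futures::stream::{self, StreamExt};

// Run at most 4 fetches in flight; SecClient's throttle policy still
// governs the actual request rate against the SEC API.
let results: Vec<_> = stream::iter(company_tickers.iter())
    .map(|t| {
        let client = &client;
        let all_tickers = &company_tickers;
        async move {
            let symbol = t.symbol.clone(); // field name assumed
            let result = fetch_us_gaap_fundamentals(client, all_tickers, &symbol).await;
            (symbol, result)
        }
    })
    .buffer_unordered(4)
    .collect()
    .await;
```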

Summary

The main application flow follows a simple but robust pattern:

  1. Initialize configuration and HTTP client with comprehensive error handling
  2. Fetch the complete dataset of tickers or investment companies
  3. Iterate through each entry sequentially
  4. Process each entry by fetching additional data and writing to CSV
  5. Log any errors without stopping processing
  6. Report a summary of successes and failures

This design prioritizes resilience (continue processing on errors) and observability (detailed logging and error summaries) over performance, making it suitable for overnight batch jobs that process thousands of securities. The commented prototype implementations demonstrate alternative processing modes that can be activated by uncommenting the relevant sections and commenting out the active implementation.