This documentation is part of the "Projects with Books" initiative at zenOSmosis.
The source code for this project is available on GitHub.
Main Application Flow
Relevant source files
- examples/nport_filing.rs
- src/main.rs
- src/network/fetch_investment_company_series_and_class_dataset.rs
- src/network/fetch_nport_filing.rs
- src/network/fetch_us_gaap_fundamentals.rs
- src/parsers/parse_nport_xml.rs
This document describes the main application entry point and execution flow in src/main.rs, including initialization, data fetching loops, CSV output organization, and error handling strategies. The main application orchestrates the high-level data collection process by coordinating the configuration system, network layer, and storage systems.
For detailed information about the network client and its middleware layers, see SecClient. For data model structures used throughout the application, see Data Models & Enumerations. For configuration and credential management, see Configuration System.
Application Entry Point and Initialization
The application entry point is the main() function in src/main.rs:174-240. The initialization sequence follows a consistent pattern regardless of which processing mode is active:
Sources: src/main.rs:174-180
graph TB
Start["Program Start\n#[tokio::main]\nasync fn main()"]
LoadConfig["ConfigManager::load()\nLoad TOML configuration\nValidate credentials"]
CreateClient["SecClient::from_config_manager()\nInitialize HTTP client\nApply throttling & caching"]
CheckMode{"Processing\nMode?"}
FetchTickers["fetch_company_tickers(&client)\nRetrieve complete ticker dataset"]
FetchInvCo["fetch_investment_company_\nseries_and_class_dataset(&client)\nGet fund listings"]
USGAAPLoop["US-GAAP Processing Loop\n(Active Implementation)"]
InvCoLoop["Investment Company Loop\n(Commented Prototypes)"]
ErrorReport["Generate Error Summary\nPrint failed tickers"]
End["Program Exit"]
Start --> LoadConfig
LoadConfig --> CreateClient
CreateClient --> CheckMode
CheckMode -->|US-GAAP Mode| FetchTickers
CheckMode -->|Investment Co Mode| FetchInvCo
FetchTickers --> USGAAPLoop
FetchInvCo --> InvCoLoop
USGAAPLoop --> ErrorReport
InvCoLoop --> ErrorReport
ErrorReport --> End
The initialization handles errors at each stage, returning early if critical components fail to load:
| Initialization Step | Error Handling | Line Reference |
|---|---|---|
| Configuration Loading | Returns Box<dyn Error> on failure | src/main.rs:176 |
| Client Creation | Returns Box<dyn Error> on failure | src/main.rs:178 |
| Ticker Fetching | Returns Box<dyn Error> on failure | src/main.rs:180 |
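For orientation, here is a minimal sketch of this fatal-error pattern, assuming the names shown in the diagram above (the exact signatures in src/main.rs may differ):

```rust
// Minimal sketch of the initialization sequence. ConfigManager, SecClient,
// and fetch_company_tickers are the names used in the diagrams above; their
// exact signatures are assumptions.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config_manager = ConfigManager::load()?; // fatal: propagate and exit
    let client = SecClient::from_config_manager(&config_manager)?; // fatal
    let company_tickers = fetch_company_tickers(&client).await?; // fatal
    // ... per-ticker processing loop follows (see below)
    Ok(())
}
```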
Active Implementation: US-GAAP Fundamentals Processing
The currently active implementation in main.rs processes US-GAAP fundamentals for all company tickers. This represents the production data collection pipeline.
Sources: src/main.rs:174-240
graph TB
Start["Start US-GAAP Processing"]
FetchTickers["fetch_company_tickers(&client)\nReturns Vec<Ticker>"]
PrintTotal["Print total records\nDisplay first 60 tickers"]
InitErrorLog["Initialize error_log:\nHashMap<String, String>"]
LoopStart{"For each\ncompany_ticker\nin tickers"}
PrintProgress["Print processing status:\nticker_symbol (i+1 of total)"]
FetchFundamentals["fetch_us_gaap_fundamentals(\n&client,\n&company_tickers,\n&ticker_symbol)"]
CreateFile["File::create(\ndata/22-june-us-gaap/{ticker}.csv)"]
WriteCSV["CsvWriter::new(&mut file)\n.include_header(true)\n.finish(&mut df)"]
CaptureError["Insert error into error_log:\nticker → error message"]
CheckErrors{"error_log\nempty?"}
PrintErrors["Print error summary:\nList all failed tickers"]
PrintSuccess["Print success message"]
End["End Processing"]
Start --> FetchTickers
FetchTickers --> PrintTotal
PrintTotal --> InitErrorLog
InitErrorLog --> LoopStart
LoopStart -->|Next ticker| PrintProgress
PrintProgress --> FetchFundamentals
FetchFundamentals -->|Success| CreateFile
CreateFile -->|Success| WriteCSV
FetchFundamentals -->|Error| CaptureError
CreateFile -->|Error| CaptureError
WriteCSV -->|Error| CaptureError
WriteCSV -->|Success| LoopStart
CaptureError --> LoopStart
LoopStart -->|Complete| CheckErrors
CheckErrors -->|Has errors| PrintErrors
CheckErrors -->|No errors| PrintSuccess
PrintErrors --> End
PrintSuccess --> End
Processing Loop Details
The main processing loop iterates through all company tickers sequentially, performing the following operations for each ticker (a condensed sketch follows the list):
- Progress Logging - src/main.rs:190-195: Prints the current position in the dataset
- Data Fetching - src/main.rs:203: Calls fetch_us_gaap_fundamentals() with the client, the full ticker list, and the current ticker symbol
- CSV Generation - src/main.rs:206-214: Creates the file and writes the DataFrame with headers
- Error Capture - src/main.rs:212-225: Logs any failures to the error_log HashMap
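The following sketch condenses that loop, using the calls and output path shown in the diagram above (the ticker field name and the error-message formatting are illustrative assumptions):

```rust
use std::collections::HashMap;
use std::fs::File;
use polars::prelude::*; // CsvWriter and the SerWriter trait

// Sketch of the per-ticker loop; `ticker` as a field name and the error
// messages are assumptions, not the codebase's exact wording.
let mut error_log: HashMap<String, String> = HashMap::new();
for (i, company_ticker) in company_tickers.iter().enumerate() {
    let ticker_symbol = &company_ticker.ticker;
    println!("Processing {} ({} of {})", ticker_symbol, i + 1, company_tickers.len());

    match fetch_us_gaap_fundamentals(&client, &company_tickers, ticker_symbol).await {
        Ok(mut df) => {
            let path = format!("data/22-june-us-gaap/{ticker_symbol}.csv");
            let write_result = File::create(&path)
                .map_err(|e| e.to_string())
                .and_then(|mut file| {
                    CsvWriter::new(&mut file)
                        .include_header(true)
                        .finish(&mut df)
                        .map_err(|e| e.to_string())
                });
            // Non-fatal: record the failure and move on to the next ticker.
            if let Err(message) = write_result {
                error_log.insert(ticker_symbol.clone(), message);
            }
        }
        Err(e) => {
            error_log.insert(ticker_symbol.clone(), e.to_string());
        }
    }
}
```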
Error Handling Strategy
The application uses a non-fatal error handling approach:
| Error Type | Handling Strategy | Result |
|---|---|---|
| Fetch Error | Logged to error_log, processing continues | src/main.rs:223-225 |
| File Creation Error | Logged to error_log, processing continues | src/main.rs:217-220 |
| CSV Write Error | Logged to error_log, processing continues | src/main.rs:211-214 |
| Configuration Error | Fatal, returns immediately | src/main.rs:176 |
| Client Creation Error | Fatal, returns immediately | src/main.rs:178 |
Sources: src/main.rs:185-227
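After the loop completes, the accumulated error_log drives the final report; a minimal sketch of the "Generate Error Summary" step:

```rust
// Hypothetical end-of-run report corresponding to the error-summary step above.
if error_log.is_empty() {
    println!("All tickers processed successfully.");
} else {
    eprintln!("{} ticker(s) failed:", error_log.len());
    for (ticker, message) in &error_log {
        eprintln!("  {ticker}: {message}");
    }
}
```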
CSV Output Organization
The US-GAAP processing mode writes output to a flat directory structure:
data/22-june-us-gaap/
├── AAPL.csv
├── MSFT.csv
├── GOOGL.csv
└── ...
Each file is named using the ticker symbol and contains the complete US-GAAP fundamentals DataFrame for that company.
Sources: src/main.rs:205
Investment Company Processing Flow (Prototype)
Two commented-out prototype implementations exist in main.rs for processing investment company data. These represent alternative processing modes that may be activated in future iterations.
Production Prototype (Lines 18-122)
The first prototype implements production-ready investment company processing with comprehensive error handling:
Sources: src/main.rs:18-122
graph TB
Start["Start Investment Co Processing"]
InitLogger["env_logger::Builder\n.filter(None, LevelFilter::Info)"]
InitErrorLog["Initialize error_log: Vec<String>"]
LoadConfig["ConfigManager::load()\nHandle fatal errors"]
CreateClient["SecClient::from_config_manager()\nHandle fatal errors"]
FetchInvCo["fetch_investment_company_\nseries_and_class_dataset(&client)"]
LogTotal["info! Total investment companies"]
LoopStart{"For each fund\nin investment_\ncompanies"}
LogProgress["info! Processing: i+1 of total"]
CheckTicker{"fund.class_\nticker exists?"}
LogTicker["info! Ticker symbol"]
FetchNPORT["fetch_nport_filing_by_\nticker_symbol(&client, ticker)"]
GetFirstLetter["ticker.chars().next()\n.to_ascii_uppercase()"]
CreateDir["create_dir_all(\ndata/fund-holdings/{letter})"]
LogRecords["info! Total records"]
WriteCSV["latest_nport_filing\n.write_to_csv(file_path)"]
LogError["error! Log message\nerror_log.push(msg)"]
CheckFinal{"error_log\nempty?"}
PrintSummary["error! Print error summary"]
PrintSuccess["info! All funds successful"]
End["End Processing"]
Start --> InitLogger
InitLogger --> InitErrorLog
InitErrorLog --> LoadConfig
LoadConfig -->|Error| LogError
LoadConfig -->|Success| CreateClient
CreateClient -->|Error| LogError
CreateClient -->|Success| FetchInvCo
FetchInvCo -->|Error| LogError
FetchInvCo -->|Success| LogTotal
LogTotal --> LoopStart
LoopStart -->|Next fund| LogProgress
LogProgress --> CheckTicker
CheckTicker -->|Yes| LogTicker
CheckTicker -->|No| LoopStart
LogTicker --> FetchNPORT
FetchNPORT -->|Error| LogError
FetchNPORT -->|Success| GetFirstLetter
GetFirstLetter --> CreateDir
CreateDir -->|Error| LogError
CreateDir -->|Success| LogRecords
LogRecords --> WriteCSV
WriteCSV -->|Error| LogError
WriteCSV -->|Success| LoopStart
LogError --> LoopStart
LoopStart -->|Complete| CheckFinal
CheckFinal -->|Has errors| PrintSummary
CheckFinal -->|No errors| PrintSuccess
PrintSummary --> End
PrintSuccess --> End
Directory Organization for Fund Holdings
The investment company prototype organizes CSV output by the first letter of the ticker symbol:
data/fund-holdings/
├── A/
│ ├── AADR.csv
│ └── AAXJ.csv
├── B/
│ ├── BKLN.csv
│ └── BND.csv
├── V/
│ ├── VTI.csv
│ └── VXUS.csv
└── ...
This alphabetical categorization is implemented at src/main.rs:83-84 using ticker_symbol.chars().next().unwrap().to_ascii_uppercase().
Sources: src/main.rs:82-107
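A hedged sketch of that bucketing logic (write_to_csv comes from the flow diagram above; the surrounding variable types are assumptions):

```rust
use std::fs::create_dir_all;

// Sketch of the first-letter bucketing (per src/main.rs:83-84).
// latest_nport_filing and its write_to_csv method are assumed from the
// prototype's flow diagram; the real types may differ.
let first_letter = ticker_symbol
    .chars()
    .next()
    .unwrap() // the prototype assumes a non-empty ticker symbol
    .to_ascii_uppercase();
let dir = format!("data/fund-holdings/{first_letter}");
create_dir_all(&dir)?;
let file_path = format!("{dir}/{ticker_symbol}.csv");
latest_nport_filing.write_to_csv(&file_path)?;
```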
Debug Prototype (Lines 124-171)
A simpler debug version exists for testing and development, with minimal error handling and console output:
| Feature | Debug Prototype | Production Prototype |
|---|---|---|
| Error Logging | Fatal errors only | Comprehensive Vec/HashMap logs |
| Logger | Not initialized | env_logger with INFO level |
| Progress Tracking | Simple println! | Structured info! macros |
| Error Recovery | Immediate exit with ? | Continue processing, log errors |
| Output | Console + CSV | CSV only |
Sources: src/main.rs:124-171
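The difference in error recovery reduces to ? versus match-and-log; a hypothetical side-by-side:

```rust
// Debug-prototype style: any error aborts the run immediately via `?`.
let filing = fetch_nport_filing_by_ticker_symbol(&client, ticker).await?;

// Production-prototype style: errors are recorded and the loop continues.
// error_log here is the Vec<String> from the production prototype's diagram.
match fetch_nport_filing_by_ticker_symbol(&client, ticker).await {
    Ok(filing) => { /* write CSV as shown above */ }
    Err(e) => error_log.push(format!("{ticker}: {e}")),
}
```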
Data Flow Through Network Layer
The main application delegates all SEC API interactions to the network layer, which handles caching, throttling, and retries transparently:
Sources: src/main.rs:3-9 src/network/fetch_us_gaap_fundamentals.rs:12-28 src/network/fetch_nport_filing.rs:10-49 src/network/fetch_investment_company_series_and_class_dataset.rs:31-105
graph LR
Main["main.rs\nmain()"]
FetchTickers["network::fetch_company_tickers\nGET company_tickers.json"]
FetchGAAP["network::fetch_us_gaap_fundamentals\nGET CIK{cik}/company-facts.json"]
FetchInvCo["network::fetch_investment_company_\nseries_and_class_dataset\nGET investment-company-series-class-{year}.csv"]
FetchNPORT["network::fetch_nport_filing_\nby_ticker_symbol\nGET data/{cik}/{accession}/primary_doc.xml"]
SecClient["SecClient\nHTTP middleware\nThrottling, Caching, Retries"]
SECAPI["SEC EDGAR API\nwww.sec.gov"]
Main --> FetchTickers
Main --> FetchGAAP
Main --> FetchInvCo
Main --> FetchNPORT
FetchTickers --> SecClient
FetchGAAP --> SecClient
FetchInvCo --> SecClient
FetchNPORT --> SecClient
SecClient --> SECAPI
The network functions are designed to be composable, allowing the main application to chain multiple API calls together (e.g., fetching a CIK first, then using it to fetch submissions).
Composite Data Fetching Operations
Several network functions perform composite operations by calling other network functions. For example, fetch_nport_filing_by_ticker_symbol orchestrates multiple API calls:
Sources: src/network/fetch_nport_filing.rs:10-49
graph TB
Input["Input: ticker_symbol"]
FetchCIK["fetch_cik_by_ticker_symbol\n(&sec_client, ticker_symbol)"]
FetchSubs["fetch_cik_submissions\n(&sec_client, cik)"]
GetRecent["CikSubmission::most_recent_\nnport_p_submission(submissions)"]
FetchFiling["fetch_nport_filing_by_cik_\nand_accession_number(\n&sec_client, cik, accession)"]
FetchTickers["fetch_company_tickers\n(&sec_client)"]
ParseXML["parse_nport_xml(\n&xml_data, &company_tickers)"]
Output["Output: Vec<NportInvestment>"]
Input --> FetchCIK
FetchCIK --> FetchSubs
FetchSubs --> GetRecent
GetRecent --> FetchFiling
FetchFiling --> FetchTickers
FetchTickers --> ParseXML
ParseXML --> Output
This composition pattern allows the main application to use high-level functions while the network layer handles the complexity of multi-step API interactions.
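A hedged sketch of that composition, using the function names from the diagram (the exact signatures and field names in src/network/fetch_nport_filing.rs may differ):

```rust
use std::error::Error;

// Sketch of the composite fetch; each step matches a node in the diagram
// above, but return types and field names are assumptions.
async fn fetch_nport_filing_by_ticker_symbol(
    sec_client: &SecClient,
    ticker_symbol: &str,
) -> Result<Vec<NportInvestment>, Box<dyn Error>> {
    let cik = fetch_cik_by_ticker_symbol(sec_client, ticker_symbol).await?;
    let submissions = fetch_cik_submissions(sec_client, &cik).await?;
    let latest = CikSubmission::most_recent_nport_p_submission(&submissions)
        .ok_or("no recent NPORT-P submission found")?;
    let xml_data = fetch_nport_filing_by_cik_and_accession_number(
        sec_client,
        &cik,
        &latest.accession_number, // field name is an assumption
    )
    .await?;
    let company_tickers = fetch_company_tickers(sec_client).await?;
    parse_nport_xml(&xml_data, &company_tickers)
}
```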
Error Propagation and Recovery
The application uses Rust's Result type throughout the call stack, with different error handling strategies at each level:
Sources: src/main.rs:174-240 src/network/fetch_us_gaap_fundamentals.rs:12-28
graph TB
MainFn["main() -> Result<(), Box<dyn Error>>"]
NetworkFn["Network functions\n-> Result<T, Box<dyn Error>>"]
Parser["Parsers\n-> Result<T, Box<dyn Error>>"]
SecClient["SecClient\nImplements retries & backoff"]
MainFatal["Fatal Errors:\nConfig load failure\nClient creation failure\nInitial ticker fetch failure"]
MainNonFatal["Non-Fatal Errors:\nIndividual ticker fetch failures\nCSV write failures\n→ Logged to error_log\n→ Processing continues"]
NetworkRetry["Automatic Retries:\nHTTP timeouts\nRate limit errors\n→ Exponential backoff\n→ Max 3 retries"]
ParserErr["Parser Errors:\nMalformed XML/JSON\nMissing required fields\n→ Propagate to caller"]
MainFn --> MainFatal
MainFn --> MainNonFatal
MainFn --> NetworkFn
NetworkFn --> NetworkRetry
NetworkFn --> SecClient
NetworkFn --> Parser
Parser --> ParserErr
NetworkRetry --> MainNonFatal
ParserErr --> MainNonFatal
Fatal errors (configuration, client setup) immediately return from main(), while per-ticker errors are captured in error_log and reported at the end without interrupting the processing loop.
File System Output Structure
The application writes structured CSV files to the local file system. The output directory organization differs between processing modes:
US-GAAP Mode Output
data/
└── 22-june-us-gaap/
├── AAPL.csv
├── MSFT.csv
└── {ticker}.csv
Sources: src/main.rs:205
Investment Company Mode Output
data/
└── fund-holdings/
├── A/
│ ├── AADR.csv
│ └── {ticker}.csv
├── B/
│ └── {ticker}.csv
└── {letter}/
└── {ticker}.csv
Sources: src/main.rs:82-102
The alphabetical organization in investment company mode enables efficient file system operations when dealing with thousands of fund ticker symbols, avoiding the creation of a single directory with excessive entries.
Performance Characteristics
The main application is designed for batch processing with the following characteristics:
| Aspect | Implementation | Location |
|---|---|---|
| Concurrency | Sequential processing (no parallelization in main loop) | src/main.rs:187 |
| Rate Limiting | Handled by SecClient throttle policy | See SecClient |
| Caching | HTTP cache and preprocessor cache in SecClient | See Caching System |
| Memory Management | Processes one ticker at a time, drops DataFrames after write | src/main.rs:203-221 |
| Error Recovery | Non-fatal errors logged, processing continues | src/main.rs:185-227 |
The sequential processing model ensures consistent rate limiting and simplifies error tracking, though it could be parallelized in future iterations using Tokio tasks or Rayon parallel iterators.
Sources: src/main.rs:174-240
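As an illustration only, a hypothetical concurrent variant using the futures crate (not present in the current code) could cap in-flight requests while still deferring rate limiting to SecClient:

```rust
use futures::stream::{self, StreamExt};

// Hypothetical parallel variant (not in the current codebase): fetch up to
// four tickers concurrently; SecClient's throttle policy still bounds the
// actual request rate. The `ticker` field name is an assumption.
let client_ref = &client;
let all_tickers = &company_tickers;
let results: Vec<_> = stream::iter(all_tickers)
    .map(|t| async move {
        let outcome = fetch_us_gaap_fundamentals(client_ref, all_tickers, &t.ticker).await;
        (t.ticker.clone(), outcome)
    })
    .buffer_unordered(4) // at most 4 fetches in flight at once
    .collect()
    .await;
```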
Summary
The main application flow follows a simple but robust pattern:
- Initialize configuration and HTTP client with comprehensive error handling
- Fetch the complete dataset of tickers or investment companies
- Iterate through each entry sequentially
- Process each entry by fetching additional data and writing to CSV
- Log any errors without stopping processing
- Report a summary of successes and failures
This design prioritizes resilience (continue processing on errors) and observability (detailed logging and error summaries) over performance, making it suitable for overnight batch jobs that process thousands of securities. The commented prototype implementations demonstrate alternative processing modes that can be activated by uncommenting the relevant sections and commenting out the active implementation.