This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

Data Fetching Functions

Purpose and Scope

This document describes the data fetching functions in the Rust sec-fetcher application. These functions provide the interface for retrieving financial data from the SEC EDGAR API, including company tickers, CIK submissions, NPORT filings, US GAAP fundamentals, and investment company datasets.

For information about the underlying HTTP client, throttling, and caching infrastructure, see Network Layer & SecClient. For details on how US GAAP data is transformed after fetching, see US GAAP Concept Transformation. For information about the data structures returned by these functions, see Data Models & Enumerations.

Overview of Data Fetching Architecture

The network module provides six primary fetching functions that retrieve different types of financial data from the SEC EDGAR API. Each function accepts a SecClient reference and returns structured data types.

Diagram: Data Fetching Function Overview

graph TB
    subgraph "Entry Points"
        FetchTickers["fetch_company_tickers()\nsrc/network/fetch_company_tickers.rs"]
        FetchCIK["fetch_cik_by_ticker_symbol()\nNot shown in files"]
        FetchSubs["fetch_cik_submissions()\nsrc/network/fetch_cik_submissions.rs"]
        FetchNPORT["fetch_nport_filing_by_ticker_symbol()\nfetch_nport_filing_by_cik_and_accession_number()\nsrc/network/fetch_nport_filing.rs"]
        FetchGAAP["fetch_us_gaap_fundamentals()\nsrc/network/fetch_us_gaap_fundamentals.rs"]
        FetchInvCo["fetch_investment_company_series_and_class_dataset()\nsrc/network/fetch_investment_company_series_and_class_dataset.rs"]
    end

    subgraph "SEC EDGAR API Endpoints"
        APITickers["company_tickers.json"]
        APISubmissions["submissions/CIK{cik}.json"]
        APINport["Archives/edgar/data/{cik}/{accession}/primary_doc.xml"]
        APIFacts["api/xbrl/companyfacts/CIK{cik}.json"]
        APIInvCo["files/investment/data/other/.../{year}.csv"]
    end

    subgraph "Return Types"
        VecTicker["Vec<Ticker>"]
        CikType["Cik"]
        VecCikSub["Vec<CikSubmission>"]
        VecNport["Vec<NportInvestment>"]
        DataFrame["TickerFundamentalsDataFrame"]
        VecInvCo["Vec<InvestmentCompany>"]
    end

    FetchTickers -->|HTTP GET| APITickers
    APITickers --> VecTicker

    FetchCIK -->|Searches| VecTicker
    FetchCIK --> CikType

    FetchSubs -->|HTTP GET| APISubmissions
    APISubmissions --> VecCikSub

    FetchNPORT -->|HTTP GET| APINport
    APINport --> VecNport

    FetchGAAP -->|HTTP GET| APIFacts
    APIFacts --> DataFrame

    FetchInvCo -->|HTTP GET| APIInvCo
    APIInvCo --> VecInvCo

Sources: src/network/fetch_company_tickers.rs src/network/fetch_cik_submissions.rs src/network/fetch_nport_filing.rs src/network/fetch_us_gaap_fundamentals.rs src/network/fetch_investment_company_series_and_class_dataset.rs

Company Ticker Fetching

Function: fetch_company_tickers

The fetch_company_tickers function retrieves the master list of operating company tickers from the SEC EDGAR API. This dataset provides the mapping between ticker symbols, company names, and CIK numbers.

Signature:

Implementation Details:

| Aspect | Detail |
| --- | --- |
| API Endpoint | Url::CompanyTickers (company_tickers.json) |
| Response Format | JSON object with numeric keys mapping to ticker data |
| Parsing Logic | Lines 18-31 |
| CIK Conversion | Uses Cik::from_u64() to format CIK numbers (line 22) |
| Origin Tag | All tickers tagged with TickerOrigin::CompanyTickers (line 28) |

Returned Data Structure:

Each Ticker in the result contains:

  • cik: 10-digit formatted CIK number
  • symbol: Ticker symbol (e.g., "AAPL")
  • company_name: Full company name
  • origin: Set to TickerOrigin::CompanyTickers
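The field layout above can be sketched with a few stand-in types. This is a minimal illustration, not the crate's actual definitions: the `Cik` wrapper, the `ticker_from_record` helper, and the exact signature of `from_u64` are assumptions made for the example. It shows how a numeric CIK from the JSON payload could be zero-padded into the 10-digit form.

```rust
/// Hypothetical stand-in for the crate's `Cik` type: a zero-padded, 10-digit string.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Cik(String);

impl Cik {
    /// Left-pads the numeric CIK with zeros to 10 digits (assumed behavior).
    pub fn from_u64(value: u64) -> Result<Self, String> {
        if value > 9_999_999_999 {
            return Err(format!("CIK {} does not fit in 10 digits", value));
        }
        Ok(Cik(format!("{:010}", value)))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Only the variant mentioned in this section is modeled here.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TickerOrigin {
    CompanyTickers,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Ticker {
    pub cik: Cik,
    pub symbol: String,
    pub company_name: String,
    pub origin: TickerOrigin,
}

/// Hypothetical helper: builds one `Ticker` from a raw (cik, symbol, name) record.
pub fn ticker_from_record(cik: u64, symbol: &str, name: &str) -> Result<Ticker, String> {
    Ok(Ticker {
        cik: Cik::from_u64(cik)?,
        symbol: symbol.to_string(),
        company_name: name.to_string(),
        origin: TickerOrigin::CompanyTickers,
    })
}
```

Under these assumptions, `ticker_from_record(320193, "AAPL", "Apple Inc.")` yields a ticker whose `cik` reads `0000320193`.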

Usage Example:

The function is commonly used as a prerequisite for other operations that require ticker-to-CIK mapping:
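A ticker-to-CIK lookup over the fetched list might look like the sketch below. The simplified `Ticker` struct and the `find_cik_by_symbol` helper are hypothetical names introduced for illustration, and the case-insensitive match is an assumption rather than documented behavior.

```rust
/// Simplified stand-in for the crate's `Ticker` (the CIK is kept as a plain string here).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Ticker {
    pub cik: String,
    pub symbol: String,
    pub company_name: String,
}

/// Hypothetical helper: returns the CIK of the first ticker whose symbol matches,
/// ignoring ASCII case. Returns `None` when the symbol is not in the list.
pub fn find_cik_by_symbol<'a>(tickers: &'a [Ticker], symbol: &str) -> Option<&'a str> {
    tickers
        .iter()
        .find(|t| t.symbol.eq_ignore_ascii_case(symbol))
        .map(|t| t.cik.as_str())
}
```

Returning `Option` keeps the "ticker not found" case explicit, so a caller can convert it into whatever error type the surrounding function uses.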

Sources: src/network/fetch_company_tickers.rs:1-34

CIK Lookup and Submissions

Function: fetch_cik_submissions

The fetch_cik_submissions function retrieves all SEC filings (submissions) for a given company, identified by its CIK number.

Signature:

Implementation Details:

The function performs the following operations:

  1. URL Construction: Creates the submissions endpoint URL using the Url::CikSubmission enum (line 12)
  2. JSON Fetching: Retrieves submission data via sec_client.fetch_json() (line 14)
  3. Data Extraction: Parses the filings.recent object containing parallel arrays (lines 25-51)
  4. Parallel Array Processing: Uses itertools::izip! to iterate over multiple arrays simultaneously (lines 55-60)

JSON Structure:

Diagram: CikSubmission JSON Structure

Parsing Implementation:

The function extracts four parallel arrays and combines them using izip!:

  • accessionNumber: Accession numbers for filings
  • form: Form types (e.g., "10-K", "NPORT-P")
  • primaryDocument: Primary document filenames
  • filingDate: Filing dates in "YYYY-MM-DD" format

Each set of parallel values is combined into a CikSubmission struct (lines 68-75).
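The parallel-array combination can be sketched without external crates. Note the substitution: this example chains the standard library's `Iterator::zip` instead of `itertools::izip!` so that it stays dependency-free, and the `CikSubmission` fields and the `zip_recent_filings` helper are assumed shapes, not the crate's definitions.

```rust
/// Simplified stand-in for the crate's `CikSubmission`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CikSubmission {
    pub accession_number: String,
    pub form: String,
    pub primary_document: String,
    pub filing_date: String,
}

/// Hypothetical helper: walks the four parallel arrays in lockstep.
/// Like `izip!`, chained `zip` stops at the shortest input.
pub fn zip_recent_filings(
    accession_numbers: &[&str],
    forms: &[&str],
    primary_documents: &[&str],
    filing_dates: &[&str],
) -> Vec<CikSubmission> {
    accession_numbers
        .iter()
        .zip(forms)
        .zip(primary_documents)
        .zip(filing_dates)
        .map(|(((accession, form), document), date)| CikSubmission {
            accession_number: accession.to_string(),
            form: form.to_string(),
            primary_document: document.to_string(),
            filing_date: date.to_string(),
        })
        .collect()
}
```

The nested tuple pattern is the main cost of plain `zip`; `izip!` flattens it into a single tuple, which is presumably why the source uses it.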

Date Parsing:

Filing dates are parsed from strings into NaiveDate objects (lines 61-63):
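As a dependency-free illustration of the same idea, the sketch below parses a "YYYY-MM-DD" string into a small hand-rolled date struct. The source uses `NaiveDate`; the `FilingDate` type and `parse_filing_date` function here are hypothetical stand-ins, and the validation is deliberately coarse (it does not check days per month).

```rust
/// Hypothetical stand-in for a parsed date. Field order makes the derived
/// ordering chronological (year, then month, then day).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct FilingDate {
    pub year: i32,
    pub month: u32,
    pub day: u32,
}

/// Parses "YYYY-MM-DD" with basic range checks on month and day.
pub fn parse_filing_date(s: &str) -> Result<FilingDate, String> {
    let parts: Vec<&str> = s.split('-').collect();
    if parts.len() != 3 {
        return Err(format!("expected YYYY-MM-DD, got {:?}", s));
    }
    let year: i32 = parts[0].parse().map_err(|e| format!("bad year in {:?}: {}", s, e))?;
    let month: u32 = parts[1].parse().map_err(|e| format!("bad month in {:?}: {}", s, e))?;
    let day: u32 = parts[2].parse().map_err(|e| format!("bad day in {:?}: {}", s, e))?;
    if !(1..=12).contains(&month) || !(1..=31).contains(&day) {
        return Err(format!("date out of range: {:?}", s));
    }
    Ok(FilingDate { year, month, day })
}
```

Converting to a typed date early means later steps, such as picking the most recent filing, can rely on ordinary comparisons instead of string handling.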

Sources: src/network/fetch_cik_submissions.rs:1-79 examples/lookup_cik.rs:43-70

NPORT Filing Fetching

NPORT-P filings contain detailed portfolio holdings for registered investment companies (mutual funds and ETFs). The module provides two related functions for fetching this data.

Function: fetch_nport_filing_by_ticker_symbol

Signature:

Workflow:

Diagram: NPORT Filing Fetch by Ticker Symbol

This convenience function orchestrates multiple calls (lines 10-30):

  1. Look up the CIK from the ticker symbol (line 14)
  2. Fetch all submissions for that CIK (line 16)
  3. Filter for the most recent NPORT-P submission (lines 18-20)
  4. Fetch the filing details (lines 22-27)
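Step 3 of the workflow above, selecting the most recent NPORT-P submission, can be sketched as follows. The struct is a simplified stand-in and the free function `most_recent_nport_p` is a hypothetical name; how the crate actually orders submissions is not shown in this document. The sketch relies on the fact that ISO "YYYY-MM-DD" strings sort chronologically when compared lexicographically.

```rust
/// Simplified stand-in for the crate's `CikSubmission`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CikSubmission {
    pub accession_number: String,
    pub form: String,
    pub filing_date: String, // "YYYY-MM-DD"
}

/// Hypothetical helper: keeps only NPORT-P filings and returns the one with
/// the latest filing date, or `None` if the company has never filed one.
pub fn most_recent_nport_p(submissions: &[CikSubmission]) -> Option<&CikSubmission> {
    submissions
        .iter()
        .filter(|s| s.form == "NPORT-P")
        .max_by(|a, b| a.filing_date.cmp(&b.filing_date))
}
```

A `None` result corresponds to the "no NPORT-P filing exists" error case listed under Error Handling.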

Function: fetch_nport_filing_by_cik_and_accession_number

Signature:

Implementation Details:

| Step | Operation | Code Reference |
| --- | --- | --- |
| 1. Fetch company tickers | Required for ticker mapping | line 39 |
| 2. Construct URL | Url::CikAccessionPrimaryDocument | line 41 |
| 3. Fetch XML | raw_request() with GET method | lines 43-46 |
| 4. Parse XML | parse_nport_xml() | line 48 |

XML Parsing Details:

The parse_nport_xml function extracts investment holdings from the primary document XML:

  • Main Element: <invstOrSec> (investment or security) (lines 26-46)
  • Fields Extracted: name, LEI, title, CUSIP, ISIN, balance, currency, USD value, percentage value, payoff profile, asset category, issuer category, country
  • Ticker Mapping: Fuzzy matches investment names to company tickers (lines 131-142)
  • Sorting: Results sorted by pct_val descending (line 125)
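The final sorting step can be illustrated in isolation. In this sketch `pct_val` is modeled as an `f64`, which is an assumption (the crate may use a decimal type), and the two-field `NportInvestment` is a trimmed stand-in for the real struct.

```rust
/// Trimmed stand-in for the crate's `NportInvestment`.
#[derive(Debug, Clone, PartialEq)]
pub struct NportInvestment {
    pub name: String,
    pub pct_val: f64, // assumed numeric type for the percentage of net assets
}

/// Sorts holdings so the largest percentage comes first.
/// `total_cmp` gives a total order over floats, so NaN cannot cause a panic.
pub fn sort_by_pct_val_desc(investments: &mut [NportInvestment]) {
    investments.sort_by(|a, b| b.pct_val.total_cmp(&a.pct_val));
}
```

Swapping the operands (`b` compared against `a`) is what makes the order descending.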

Sources: src/network/fetch_nport_filing.rs:1-49 src/parsers/parse_nport_xml.rs:12-146 examples/nport_filing.rs:1-29

US GAAP Fundamentals Fetching

Function: fetch_us_gaap_fundamentals

The fetch_us_gaap_fundamentals function retrieves standardized financial statement data (US Generally Accepted Accounting Principles) for a company.

Signature:

Type Alias:

The function returns a Polars DataFrame containing financial facts (line 9).

Implementation Flow:

Diagram: US GAAP Fundamentals Fetch Flow

Key Operations:

  1. CIK Lookup: Resolves the ticker symbol to a CIK using the provided company tickers list (line 18)
  2. URL Construction: Builds the company facts endpoint URL (line 20)
  3. API Call: Fetches JSON data from the SEC (line 25)
  4. Parsing: Converts the JSON into a structured DataFrame (line 27)
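The URL construction step can be sketched as a small pure function. The `company_facts_url` name and its validation are assumptions for illustration (the crate builds URLs through its `Url` enum); the path follows the `api/xbrl/companyfacts/CIK{cik}.json` pattern from the overview diagram, served from the SEC's `data.sec.gov` host.

```rust
/// Hypothetical helper: builds the company facts URL for a 10-digit, zero-padded CIK.
pub fn company_facts_url(cik: &str) -> Result<String, String> {
    // Reject anything that is not exactly ten ASCII digits.
    if cik.len() != 10 || !cik.bytes().all(|b| b.is_ascii_digit()) {
        return Err(format!("expected a 10-digit CIK, got {:?}", cik));
    }
    Ok(format!(
        "https://data.sec.gov/api/xbrl/companyfacts/CIK{}.json",
        cik
    ))
}
```

Validating the CIK before formatting keeps malformed identifiers from turning into requests that can only fail.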

API Response Structure:

The SEC company facts endpoint returns nested JSON containing:

  • US-GAAP taxonomy: Standardized accounting concepts
  • Unit types: Currency units (USD), shares, etc.
  • Time series data: Historical values with filing dates and periods

The parsing function (documented in detail in US GAAP Concept Transformation) extracts this into a tabular DataFrame format.

Sources: src/network/fetch_us_gaap_fundamentals.rs:1-28

Investment Company Dataset Fetching

The investment company dataset provides comprehensive information about registered investment companies, including mutual funds and ETFs, their series, and share classes.

graph TB
    Start["Start"]
    CheckCache["Check preprocessor_cache\nfor latest_funds_year"]
    CacheHit{{"Cache Hit?"}}
    UseCache["Use cached year"]
    UseCurrent["Use current year"]
    TryFetch["fetch_investment_company_series_and_class_dataset_for_year(year)"]
    Success{{"Success?"}}
    Store["Store year in cache\nTTL: 1 week"]
    Return["Return data"]
    Decrement["year -= 1"]
    CheckLimit{{"year >= 2024?"}}
    Error["Return error"]

    Start --> CheckCache
    CheckCache --> CacheHit
    CacheHit -->|Yes| UseCache
    CacheHit -->|No| UseCurrent
    UseCache --> TryFetch
    UseCurrent --> TryFetch

    TryFetch --> Success
    Success -->|Yes| Store
    Store --> Return
    Success -->|No| Decrement
    Decrement --> CheckLimit
    CheckLimit -->|Yes| TryFetch
    CheckLimit -->|No| Error

Function: fetch_investment_company_series_and_class_dataset

Signature:

Year Fallback Strategy:

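This function falls back year by year because the SEC publishes the dataset annually and may not yet have data for the current year. It starts from the cached year (or the current year on a cache miss) and steps backward until a fetch succeeds or the year drops below 2024: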

Diagram: Investment Company Dataset Year Fallback Logic

Implementation Details:

| Component | Description | Code Reference |
| --- | --- | --- |
| Cache Key | NAMESPACE_HASHER_LATEST_FUNDS_YEAR | lines 11-15 |
| Namespace | CacheNamespacePrefix::LatestFundsYear | line 13 |
| Cache Query | preprocessor_cache.read_with_ttl::<usize>() | lines 39-42 |
| Year Range | 2024 to current year | line 46 |
| Cache TTL | 1 week (604800 seconds) | line 59 |
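The fallback loop itself can be sketched independently of the network and cache layers. Here the per-year fetch is passed in as a closure, which is an assumption made so the example runs on its own; `fetch_with_year_fallback` and `EARLIEST_YEAR` are hypothetical names, with the 2024 lower bound taken from the table above.

```rust
/// Lower bound of the year range described in this section.
pub const EARLIEST_YEAR: usize = 2024;

/// Tries `start_year`, then each earlier year down to `EARLIEST_YEAR`,
/// returning the first year that fetches successfully along with its data.
pub fn fetch_with_year_fallback<T, E, F>(
    start_year: usize,
    mut fetch_for_year: F,
) -> Result<(usize, T), String>
where
    F: FnMut(usize) -> Result<T, E>,
{
    let mut year = start_year;
    while year >= EARLIEST_YEAR {
        match fetch_for_year(year) {
            Ok(data) => return Ok((year, data)),
            // Per-year errors are discarded in this sketch; only exhaustion is reported.
            Err(_) => year -= 1,
        }
    }
    Err(format!(
        "no dataset available between {} and {}",
        EARLIEST_YEAR, start_year
    ))
}
```

Returning the successful year alongside the data lets the caller store that year in the cache, so the next call can skip the years already known to be missing.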

Caching Strategy:

The function caches the most recent successful year to avoid repeated year fallback attempts (lines 38-42):
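The read-with-TTL behavior can be approximated with an in-memory map. This is only a sketch: the real `preprocessor_cache` is a separate component whose API is not shown here, so `TtlCache` and its method signatures are hypothetical.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical in-memory cache that stores a year together with its expiry time.
pub struct TtlCache {
    entries: HashMap<String, (usize, Instant)>,
}

impl TtlCache {
    pub fn new() -> Self {
        TtlCache { entries: HashMap::new() }
    }

    /// Stores `value` under `key`; it expires `ttl` from now.
    pub fn write_with_ttl(&mut self, key: &str, value: usize, ttl: Duration) {
        self.entries
            .insert(key.to_string(), (value, Instant::now() + ttl));
    }

    /// Returns the value only if the key exists and has not yet expired.
    pub fn read_with_ttl(&self, key: &str) -> Option<usize> {
        match self.entries.get(key) {
            Some((value, expires_at)) if Instant::now() < *expires_at => Some(*value),
            _ => None,
        }
    }
}
```

With a one-week TTL (`Duration::from_secs(604_800)`), a stale year is re-checked at most once a week, which matches the pace at which the SEC dataset changes.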

Function: fetch_investment_company_series_and_class_dataset_for_year

Signature:

This lower-level function fetches the dataset for a specific year (lines 80-105):

  1. URL Construction: Uses Url::InvestmentCompanySeriesAndClassDataset(year) (line 84)
  2. Throttle Override: Reduces max retries to 2 for faster fallback (lines 86-92)
  3. Raw Request: Uses raw_request() instead of higher-level fetch methods (lines 94-101)
  4. CSV Parsing: Calls parse_investment_companies_csv() (line 104)

Throttle Policy Override:
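One way to express such an override is to clone the base policy and change a single field. The `ThrottlePolicy` struct below and all of its fields are hypothetical; only the "max retries reduced to 2" detail comes from this document.

```rust
/// Hypothetical throttle policy; the field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct ThrottlePolicy {
    pub max_retries: u32,
    pub base_delay_ms: u64,
    pub max_concurrent: usize,
}

/// Returns a copy of `base` with retries capped at 2, so a missing year
/// fails quickly and the caller can move on to the previous year.
pub fn with_fast_fallback(base: &ThrottlePolicy) -> ThrottlePolicy {
    ThrottlePolicy {
        max_retries: 2,
        ..base.clone()
    }
}
```

Deriving the override from the base policy leaves the other settings untouched, so only the retry budget differs for this request.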

Sources: src/network/fetch_investment_company_series_and_class_dataset.rs:1-178

graph TB
    subgraph "Independent Functions"
        FT["fetch_company_tickers()"]
        FCS["fetch_cik_submissions(cik)"]
        FIC["fetch_investment_company_series_and_class_dataset()"]
    end

    subgraph "Dependent Functions"
        FCIK["fetch_cik_by_ticker_symbol()"]
        FGAAP["fetch_us_gaap_fundamentals()"]
        FNPORT1["fetch_nport_filing_by_ticker_symbol()"]
        FNPORT2["fetch_nport_filing_by_cik_and_accession_number()"]
    end

    subgraph "Data Models"
        VT["Vec<Ticker>"]
        C["Cik"]
        VCS["Vec<CikSubmission>"]
    end

    FT --> VT
    VT --> FCIK
    VT --> FGAAP
    VT --> FNPORT2

    FCIK --> C
    C --> FCS
    C --> FGAAP

    FCS --> VCS
    VCS --> FNPORT1

    FCIK --> FNPORT1
    FNPORT1 --> FNPORT2

Function Dependencies and Integration

The data fetching functions form a dependency graph where some functions rely on others to complete their tasks.

Diagram: Function Dependency Graph

Common Integration Patterns:

  1. Ticker to CIK Resolution:

    • fetch_company_tickers() → fetch_cik_by_ticker_symbol() → Cik
    • Used by: NPORT filing fetch, US GAAP fundamentals fetch
  2. Submission Filtering:

    • fetch_cik_submissions() → CikSubmission::most_recent_nport_p_submission()
    • Used by: NPORT filing by ticker symbol
  3. Ticker Mapping in Results:

    • fetch_company_tickers() → parsing functions (e.g., parse_nport_xml)
    • Enriches raw data with ticker information

Example Integration (from examples):

Sources: src/network/fetch_nport_filing.rs:3-5 examples/nport_filing.rs:16-22 examples/lookup_cik.rs:18-24

Error Handling

All data fetching functions return Result<T, Box<dyn Error>>, providing consistent error handling across the module. Common error scenarios include:

| Error Type | Cause | Example Functions |
| --- | --- | --- |
| Network Errors | HTTP request failures, timeouts | All functions |
| Parsing Errors | Invalid JSON/XML structure | fetch_cik_submissions, fetch_nport_filing_by_cik_and_accession_number |
| Not Found | Ticker symbol not found, no NPORT-P filing exists | fetch_nport_filing_by_ticker_symbol |
| Year Fallback Exhaustion | No data available for any year | fetch_investment_company_series_and_class_dataset |

Error Propagation:

Functions use the ? operator to propagate errors up the call stack, allowing callers to handle errors at the appropriate level. For example, in fetch_nport_filing_by_ticker_symbol:

If either dependency function fails, the error is immediately returned to the caller.
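The propagation pattern can be shown with two stub dependencies. Everything in this sketch is illustrative: `lookup_cik`, `latest_nport_accession`, and `fetch_by_ticker` are hypothetical stand-ins, and the "FUND" ticker, CIK 42, and accession string are made-up values.

```rust
use std::error::Error;

/// Stub for the ticker-to-CIK lookup (values are made up for the example).
fn lookup_cik(symbol: &str) -> Result<u64, Box<dyn Error>> {
    match symbol {
        "FUND" => Ok(42),
        "AAPL" => Ok(320193),
        other => Err(format!("ticker not found: {}", other).into()),
    }
}

/// Stub for "find the most recent NPORT-P accession number for this CIK".
fn latest_nport_accession(cik: u64) -> Result<String, Box<dyn Error>> {
    match cik {
        42 => Ok("made-up-accession".to_string()),
        other => Err(format!("no NPORT-P filing for CIK {}", other).into()),
    }
}

/// Each `?` returns early with the dependency's error, so the caller sees
/// the first failure unchanged.
pub fn fetch_by_ticker(symbol: &str) -> Result<String, Box<dyn Error>> {
    let cik = lookup_cik(symbol)?;
    let accession = latest_nport_accession(cik)?;
    Ok(accession)
}
```

Because the error type is `Box<dyn Error>`, both the lookup failure and the missing-filing failure flow through the same return type without any conversion code at the call site.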

Sources: src/network/fetch_cik_submissions.rs:8-11 src/network/fetch_us_gaap_fundamentals.rs:12-16 src/network/fetch_nport_filing.rs:10-13