This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

Data Models & Enumerations

Purpose and Scope

This page documents the core data structures and enumerations used throughout the rust-sec-fetcher application. These models represent SEC financial data, including company identifiers, filing metadata, investment holdings, and financial concepts. The data models are defined in src/models.rs:1-18 and src/enums.rs:1-12.

For information about how these models are used in data fetching operations, see Data Fetching Functions. For details on how FundamentalConcept is used in the transformation pipeline, see US GAAP Concept Transformation.

Sources: src/models.rs:1-18 src/enums.rs:1-12


SEC Identifier Models

The system uses three primary identifier types to reference companies and filings within the SEC EDGAR system.

Ticker

The Ticker struct represents a company's stock ticker symbol along with its SEC identifiers. It is returned by the fetch_company_tickers function and serves as the primary entry point for company lookups.

Structure:

  • cik: Cik - The company's Central Index Key
  • ticker_symbol: String - Stock ticker symbol (e.g., "AAPL")
  • company_name: String - Full company name
  • origin: TickerOrigin - Source of the ticker data

Sources: src/models.rs:10-11
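
The field list above can be sketched as a plain struct. This is a minimal illustration, not the crate's actual definition: the Cik stand-in, the TickerOrigin variants, and the find_cik helper are all hypothetical names introduced here.

```rust
/// Stand-in for the crate's `Cik` type (assumption: a thin wrapper around `u64`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Cik {
    pub value: u64,
}

/// Hypothetical variants; the real `TickerOrigin` variants are not listed on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TickerOrigin {
    CompanyTickers,
    Other,
}

/// Mirrors the four documented fields of `Ticker`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Ticker {
    pub cik: Cik,
    pub ticker_symbol: String,
    pub company_name: String,
    pub origin: TickerOrigin,
}

/// Hypothetical helper: case-insensitive lookup of a CIK by ticker symbol.
pub fn find_cik(tickers: &[Ticker], symbol: &str) -> Option<Cik> {
    tickers
        .iter()
        .find(|t| t.ticker_symbol.eq_ignore_ascii_case(symbol))
        .map(|t| t.cik)
}
```

With a list of Ticker values in hand, find_cik(&tickers, "aapl") would return the CIK of the entry whose symbol is "AAPL", or None when no entry matches.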

Cik (Central Index Key)

The Cik struct represents a 10-digit SEC identifier that uniquely identifies a company or entity. CIKs are zero-padded to exactly 10 digits.

Structure:

  • value: u64 - The numeric CIK value (0 to 9,999,999,999)

Validation:

  • CIK values must not exceed 10 digits
  • Values are zero-padded when formatted (e.g., 123 → "0000000123")
  • Parsing from strings handles both padded and unpadded formats

Error Handling:

  • CikError::InvalidCik - Returned when the CIK exceeds 10 digits
  • CikError::ParseError - Returned when parsing fails

Sources: src/models.rs:4-5
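
The validation and formatting rules above might look roughly like the following. This is a sketch using only the standard library; the constructor name new, the MAX_CIK constant, and the exact mapping of failures to error variants are assumptions rather than the crate's confirmed behavior.

```rust
use std::fmt;
use std::str::FromStr;

/// Largest value that fits in 10 decimal digits.
const MAX_CIK: u64 = 9_999_999_999;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Cik {
    value: u64,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CikError {
    /// The numeric value needs more than 10 digits.
    InvalidCik,
    /// The input is not a plain decimal number.
    ParseError,
}

impl Cik {
    /// Hypothetical constructor that enforces the 10-digit limit.
    pub fn new(value: u64) -> Result<Self, CikError> {
        if value > MAX_CIK {
            Err(CikError::InvalidCik)
        } else {
            Ok(Self { value })
        }
    }
}

impl FromStr for Cik {
    type Err = CikError;

    /// Accepts both padded ("0000000123") and unpadded ("123") input.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let trimmed = s.trim();
        if trimmed.is_empty() || !trimmed.bytes().all(|b| b.is_ascii_digit()) {
            return Err(CikError::ParseError);
        }
        // The input is digits only, so a parse failure here means u64 overflow.
        let value: u64 = trimmed.parse().map_err(|_| CikError::InvalidCik)?;
        Cik::new(value)
    }
}

impl fmt::Display for Cik {
    /// Zero-pads to exactly 10 digits.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:010}", self.value)
    }
}
```

Parsing "123" and formatting the result gives "0000000123", while an 11-digit input is rejected with CikError::InvalidCik.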

AccessionNumber

The AccessionNumber struct represents a unique identifier for SEC filings. Each accession number is exactly 18 digits and encodes the filer's CIK, filing year, and sequence number.

Format: XXXXXXXXXX-YY-NNNNNN

Components:

  • CIK (10 digits) - The SEC identifier of the filer
  • Year (2 digits) - Last two digits of the filing year
  • Sequence (6 digits) - Unique sequence number within the year

Example: "0001234567-23-000045" represents:

  • CIK: 0001234567
  • Year: 2023
  • Sequence: 000045

Key Methods:

  • from_str(accession_str: &str) - Parses from string (with or without dashes)
  • from_parts(cik_u64: u64, year: u16, sequence: u32) - Constructs from components
  • to_string() - Returns dash-separated format
  • to_unformatted_string() - Returns plain 18-digit string

Sources: src/models/accession_number.rs:1-188 src/models.rs:1-3
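
To make the format concrete, here is a simplified sketch of parsing and formatting an accession number with the standard library only. It stores the CIK as a bare u64 and keeps the year as the two-digit value found in the string; both choices, the return types, and the field helper are assumptions made for illustration and may differ from the real implementation.

```rust
use std::fmt;
use std::str::FromStr;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AccessionNumber {
    cik: u64,      // 10-digit filer CIK
    year: u16,     // two-digit year as written in the string (assumption)
    sequence: u32, // 6-digit sequence number
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AccessionNumberError {
    InvalidLength,
    ParseError,
}

/// Hypothetical helper: parses one numeric field of the accession number.
fn field<T: FromStr>(s: &str) -> Result<T, AccessionNumberError> {
    s.parse().map_err(|_| AccessionNumberError::ParseError)
}

impl AccessionNumber {
    pub fn from_parts(cik: u64, year: u16, sequence: u32) -> Result<Self, AccessionNumberError> {
        if cik > 9_999_999_999 || year > 99 || sequence > 999_999 {
            return Err(AccessionNumberError::ParseError);
        }
        Ok(Self { cik, year, sequence })
    }

    /// Plain 18-digit form without dashes.
    pub fn to_unformatted_string(&self) -> String {
        format!("{:010}{:02}{:06}", self.cik, self.year, self.sequence)
    }
}

impl FromStr for AccessionNumber {
    type Err = AccessionNumberError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Drop dashes so both input styles share one code path.
        let digits: String = s.chars().filter(|c| *c != '-').collect();
        if digits.len() != 18 {
            return Err(AccessionNumberError::InvalidLength);
        }
        if !digits.bytes().all(|b| b.is_ascii_digit()) {
            return Err(AccessionNumberError::ParseError);
        }
        // All bytes are ASCII digits, so byte-range slicing is safe here.
        Self::from_parts(
            field(&digits[0..10])?,
            field(&digits[10..12])?,
            field(&digits[12..18])?,
        )
    }
}

impl fmt::Display for AccessionNumber {
    /// Dash-separated form: XXXXXXXXXX-YY-NNNNNN.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:010}-{:02}-{:06}", self.cik, self.year, self.sequence)
    }
}
```

Under these assumptions, "0001234567-23-000045" and "000123456723000045" parse to the same value, and formatting that value reproduces the dash-separated string.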

SEC Identifier Relationships

Sources: src/models.rs:1-18 src/models/accession_number.rs:35-187


Filing Data Structures

CikSubmission

The CikSubmission struct represents metadata about an SEC filing submission. It is returned by the fetch_cik_submissions function.

Key Fields:

  • cik: Cik - The filer's Central Index Key
  • accession_number: AccessionNumber - Unique filing identifier
  • form: String - Filing form type (e.g., "10-K", "10-Q", "NPORT-P")
  • primary_document: String - Main document filename
  • filing_date: String - Date the filing was submitted

Sources: src/models.rs:7-8

NportInvestment

The NportInvestment struct represents a single investment holding from an NPORT-P filing (monthly portfolio holdings report for registered investment companies).

Mapped Fields (linked to company data):

  • mapped_ticker_symbol: Option<String> - Ticker symbol if matched
  • mapped_company_name: Option<String> - Company name if matched
  • mapped_company_cik_number: Option<String> - CIK if matched

Investment Identifiers:

  • name: String - Investment name
  • lei: String - Legal Entity Identifier
  • cusip: String - Committee on Uniform Securities Identification Procedures ID
  • isin: String - International Securities Identification Number
  • title: String - Investment title

Financial Values:

  • balance: Decimal - Number of shares or units held
  • val_usd: Decimal - Value in USD
  • pct_val: Decimal - Percentage of total portfolio value
  • cur_cd: String - Currency code

Classifications:

  • asset_cat: String - Asset category
  • issuer_cat: String - Issuer category
  • payoff_profile: String - Payoff profile type
  • inv_country: String - Investment country

Utility Methods:

  • sort_by_pct_val_desc(investments: &mut Vec<NportInvestment>) - Sorts holdings by percentage value in descending order

Sources: src/models/nport_investment.rs:1-46 src/models.rs:16-17
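
The descending sort can be illustrated with a reduced stand-in type. The Holding struct below is hypothetical and stores the percentage as integer basis points so the example needs no external crates; the real NportInvestment uses Decimal for pct_val.

```rust
/// Simplified, hypothetical stand-in for `NportInvestment`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Holding {
    pub name: String,
    /// Percentage of portfolio value in basis points (1.25% == 125).
    pub pct_val_bps: i64,
}

/// Sorts holdings from the largest to the smallest portfolio share.
pub fn sort_by_pct_val_desc(holdings: &mut Vec<Holding>) {
    // Comparing b to a (rather than a to b) yields descending order.
    holdings.sort_by(|a, b| b.pct_val_bps.cmp(&a.pct_val_bps));
}
```

Because sort_by is a stable sort, holdings with equal percentages keep their original relative order.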

InvestmentCompany

The InvestmentCompany struct represents an investment company (mutual fund, ETF, etc.) registered with the SEC. It is returned by the fetch_investment_companies function.

Sources: src/models.rs:13-14

Filing Data Structure Relationships

Sources: src/models.rs:1-18 src/models/nport_investment.rs:8-46


FundamentalConcept Enumeration

The FundamentalConcept enum is the most critical enumeration in the system (importance: 8.37), defining 64 standardized financial concepts derived from US GAAP (Generally Accepted Accounting Principles) taxonomies. These concepts are used by the distill_us_gaap_fundamental_concepts transformer to normalize diverse financial reporting variations into a consistent taxonomy.

Definition: src/enums/fundamental_concept_enum.rs:1-72

Traits Implemented:

  • Eq, PartialEq - Equality comparison
  • Hash - Hashable for use in maps
  • Clone - Cloneable
  • EnumString - Parse from string
  • EnumIter - Iterate over all variants
  • Display - Format as string
  • Debug - Debug formatting

Sources: src/enums/fundamental_concept_enum.rs:1-5 src/enums.rs:4-5
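
As a rough illustration of what these traits enable, the sketch below hand-writes string parsing, display, iteration, and map-key usage for a three-variant enum. The Concept enum, the ALL constant, and total_by_concept are hypothetical; the trait names EnumString and EnumIter suggest the real enum derives this behavior rather than implementing it by hand.

```rust
use std::collections::HashMap;
use std::fmt;
use std::str::FromStr;

/// Hypothetical three-variant subset of `FundamentalConcept`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Concept {
    Assets,
    Liabilities,
    Equity,
}

impl Concept {
    /// Hand-written stand-in for iterating over all variants.
    pub const ALL: [Concept; 3] = [Concept::Assets, Concept::Liabilities, Concept::Equity];
}

impl fmt::Display for Concept {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            Concept::Assets => "Assets",
            Concept::Liabilities => "Liabilities",
            Concept::Equity => "Equity",
        };
        f.write_str(name)
    }
}

impl FromStr for Concept {
    type Err = String;

    /// Matches the input against each variant's display name.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Concept::ALL
            .iter()
            .find(|c| c.to_string() == s)
            .cloned()
            .ok_or_else(|| format!("unknown concept: {}", s))
    }
}

/// Sums values per concept; `Hash` and `Eq` make the enum usable as a map key.
pub fn total_by_concept(facts: &[(Concept, i64)]) -> HashMap<Concept, i64> {
    let mut totals = HashMap::new();
    for (concept, value) in facts {
        *totals.entry(concept.clone()).or_insert(0) += value;
    }
    totals
}
```

Parsing "Equity" yields Concept::Equity, an unknown name yields an error, and repeated facts for the same concept accumulate under a single map key.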

Concept Categories

The 64 FundamentalConcept variants are organized into four primary categories corresponding to major financial statement types, plus an Other group:

| Category | Concept Count | Description |
|---|---|---|
| Balance Sheet | 13 | Assets, liabilities, equity positions |
| Income Statement | 23 | Revenues, expenses, income/loss measures |
| Cash Flow Statement | 13 | Operating, investing, financing cash flows |
| Equity & Comprehensive Income | 6 | Equity attributions and comprehensive income |
| Other | 9 | Special items, metadata, and miscellaneous |

Sources: src/enums/fundamental_concept_enum.rs:4-72

Complete FundamentalConcept Taxonomy

Balance Sheet Concepts

| Variant | Description |
|---|---|
| Assets | Total assets |
| CurrentAssets | Assets expected to be converted to cash within one year |
| NoncurrentAssets | Long-term assets |
| Liabilities | Total liabilities |
| CurrentLiabilities | Obligations due within one year |
| NoncurrentLiabilities | Long-term obligations |
| LiabilitiesAndEquity | Total liabilities plus equity (must equal total assets) |
| Equity | Total shareholder equity |
| EquityAttributableToParent | Equity attributable to parent company shareholders |
| EquityAttributableToNoncontrollingInterest | Equity attributable to minority shareholders |
| TemporaryEquity | Mezzanine equity (e.g., redeemable preferred stock) |
| RedeemableNoncontrollingInterest | Redeemable minority interest |
| CommitmentsAndContingencies | Off-balance-sheet obligations |

Income Statement Concepts

| Variant | Description |
|---|---|
| IncomeStatementStartPeriodYearToDate | Statement start period marker |
| Revenues | Total revenues (consolidated from 57+ variations) |
| RevenuesExcludingInterestAndDividends | Non-interest revenues |
| RevenuesNetInterestExpense | Revenues after interest expense |
| CostOfRevenue | Direct costs of producing goods/services |
| GrossProfit | Revenue minus cost of revenue |
| OperatingExpenses | Operating expenses excluding COGS |
| ResearchAndDevelopment | R&D expenses |
| CostsAndExpenses | Total costs and expenses |
| BenefitsCostsExpenses | Employee benefits and related costs |
| OperatingIncomeLoss | Income from operations |
| NonoperatingIncomeLoss | Income from non-operating activities |
| OtherOperatingIncomeExpenses | Other operating items |
| IncomeLossBeforeEquityMethodInvestments | Income before equity investments |
| IncomeLossFromEquityMethodInvestments | Income from equity-method investees |
| IncomeLossFromContinuingOperationsBeforeTax | Pre-tax income from continuing operations |
| IncomeTaxExpenseBenefit | Income tax expense or benefit |
| IncomeLossFromContinuingOperationsAfterTax | After-tax income from continuing operations |
| IncomeLossFromDiscontinuedOperationsNetOfTax | After-tax income from discontinued operations |
| ExtraordinaryItemsOfIncomeExpenseNetOfTax | Extraordinary items (after-tax) |
| NetIncomeLoss | Bottom-line net income or loss |
| NetIncomeLossAttributableToParent | Net income attributable to parent shareholders |
| NetIncomeLossAttributableToNoncontrollingInterest | Net income attributable to minority shareholders |
| NetIncomeLossAvailableToCommonStockholdersBasic | Net income available to common shareholders |
| PreferredStockDividendsAndOtherAdjustments | Preferred dividends and adjustments |

Industry-Specific Income Statement Concepts

| Variant | Description |
|---|---|
| InterestAndDividendIncomeOperating | Interest and dividend income (financial institutions) |
| InterestExpenseOperating | Interest expense (financial institutions) |
| InterestIncomeExpenseOperatingNet | Net interest income (banks) |
| InterestAndDebtExpense | Total interest and debt expense |
| InterestIncomeExpenseAfterProvisionForLosses | Interest income after loan loss provisions |
| ProvisionForLoanLeaseAndOtherLosses | Provision for credit losses (banks) |
| NoninterestIncome | Non-interest income (financial institutions) |
| NoninterestExpense | Non-interest expense (financial institutions) |

Cash Flow Statement Concepts

| Variant | Description |
|---|---|
| NetCashFlow | Total net cash flow |
| NetCashFlowContinuing | Net cash flow from continuing operations |
| NetCashFlowDiscontinued | Net cash flow from discontinued operations |
| NetCashFlowFromOperatingActivities | Cash from operating activities |
| NetCashFlowFromOperatingActivitiesContinuing | Operating cash flow (continuing) |
| NetCashFlowFromOperatingActivitiesDiscontinued | Operating cash flow (discontinued) |
| NetCashFlowFromInvestingActivities | Cash from investing activities |
| NetCashFlowFromInvestingActivitiesContinuing | Investing cash flow (continuing) |
| NetCashFlowFromInvestingActivitiesDiscontinued | Investing cash flow (discontinued) |
| NetCashFlowFromFinancingActivities | Cash from financing activities |
| NetCashFlowFromFinancingActivitiesContinuing | Financing cash flow (continuing) |
| NetCashFlowFromFinancingActivitiesDiscontinued | Financing cash flow (discontinued) |
| ExchangeGainsLosses | Foreign exchange gains/losses |

Equity & Comprehensive Income Concepts

| Variant | Description |
|---|---|
| ComprehensiveIncomeLoss | Total comprehensive income |
| ComprehensiveIncomeLossAttributableToParent | Comprehensive income attributable to parent |
| ComprehensiveIncomeLossAttributableToNoncontrollingInterest | Comprehensive income attributable to minorities |
| OtherComprehensiveIncomeLoss | Other comprehensive income (OCI) |

Other Concepts

| Variant | Description |
|---|---|
| NatureOfOperations | Description of business operations |
| GainLossOnSalePropertiesNetTax | Gains/losses on property sales (after-tax) |

Sources: src/enums/fundamental_concept_enum.rs:4-72

FundamentalConcept Taxonomy Diagram

Sources: src/enums/fundamental_concept_enum.rs:4-72


Other Enumerations

CacheNamespacePrefix

The CacheNamespacePrefix enum defines namespace prefixes used by the caching system to organize cached data by type. Each prefix corresponds to a specific data fetching operation.

Usage: Cache keys are constructed by combining a namespace prefix with a specific identifier (e.g., CacheNamespacePrefix::CompanyTickers + ticker_symbol).

Sources: src/enums.rs:1-2
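
One way such a key could be assembled is sketched below. Only the CompanyTickers variant name comes from this page; the second variant, the prefix strings, the ":" separator, and the cache_key helper are assumptions for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CacheNamespacePrefix {
    CompanyTickers,
    /// Hypothetical second variant, for illustration only.
    CikSubmissions,
}

impl CacheNamespacePrefix {
    /// Hypothetical prefix strings; the real values are not documented here.
    pub fn as_str(self) -> &'static str {
        match self {
            CacheNamespacePrefix::CompanyTickers => "company_tickers",
            CacheNamespacePrefix::CikSubmissions => "cik_submissions",
        }
    }
}

/// Joins a namespace prefix and an identifier into a single cache key.
pub fn cache_key(prefix: CacheNamespacePrefix, id: &str) -> String {
    format!("{}:{}", prefix.as_str(), id)
}
```

Keeping the prefix in an enum means two fetch operations cannot collide on the same identifier, since each operation writes under its own namespace.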

TickerOrigin

The TickerOrigin enum indicates the source or origin of a ticker symbol.

Variants:

  • Different ticker sources (e.g., SEC company tickers API, NPORT filings, manual mapping)

Usage: Stored in the Ticker struct to track data provenance.

Sources: src/enums.rs:7-8

Url

The Url enum defines the various SEC EDGAR API endpoints used by the application. Each variant represents a specific API URL pattern.

Usage: Used by the network layer to construct API requests without hardcoding URLs throughout the codebase.

Sources: src/enums.rs:10-11
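
The idea of centralizing endpoints in an enum can be sketched as follows. The variant names and the value method are hypothetical; the two URL patterns are the public SEC company tickers file and the EDGAR submissions endpoint, which expects a zero-padded 10-digit CIK.

```rust
/// Hypothetical subset of endpoint variants.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Url {
    CompanyTickers,
    CikSubmissions { cik: u64 },
}

impl Url {
    /// Builds the full request URL for this endpoint.
    pub fn value(&self) -> String {
        match self {
            Url::CompanyTickers => "https://www.sec.gov/files/company_tickers.json".to_string(),
            Url::CikSubmissions { cik } => {
                // The submissions endpoint uses the zero-padded 10-digit CIK.
                format!("https://data.sec.gov/submissions/CIK{:010}.json", cik)
            }
        }
    }
}
```

Call sites then request Url::CikSubmissions { cik: 320193 }.value() instead of formatting the path themselves, so a change to an endpoint pattern is made in one place.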

Enumeration Module Structure

Sources: src/enums.rs:1-12


Data Model Relationships

The following diagram illustrates how the core data models relate to each other and which models contain or reference other models.

graph TB
    subgraph Identifiers["SEC Identifier Models"]
        Cik["Cik\nvalue: u64"]
        Ticker["Ticker\ncik: Cik\nticker_symbol: String\ncompany_name: String\norigin: TickerOrigin"]
        AccessionNumber["AccessionNumber\ncik: Cik\nyear: u16\nsequence: u32"]
    end

    subgraph Filings["Filing Data Models"]
        CikSubmission["CikSubmission\ncik: Cik\naccession_number: AccessionNumber\nform: String\nprimary_document: String\nfiling_date: String"]
        InvestmentCompany["InvestmentCompany\n(identified by Cik)"]
        NportInvestment["NportInvestment\nmapped_ticker_symbol: Option&lt;String&gt;\nmapped_company_name: Option&lt;String&gt;\nmapped_company_cik_number: Option&lt;String&gt;\nname, lei, cusip, isin\nbalance, val_usd, pct_val"]
    end

    subgraph Enums["Enumerations"]
        FundamentalConcept["FundamentalConcept\n64 variants\n(Balance Sheet, Income Statement,\nCash Flow, Equity)"]
        TickerOrigin["TickerOrigin\n(ticker source/origin)"]
        CacheNamespacePrefix["CacheNamespacePrefix\n(cache organization)"]
        UrlEnum["Url\n(SEC EDGAR endpoints)"]
    end

    subgraph NetworkFunctions["Network Functions (3.2)"]
        FetchTickers["fetch_company_tickers\n→ Vec&lt;Ticker&gt;"]
        FetchCIK["fetch_cik_by_ticker_symbol\n→ Option&lt;Cik&gt;"]
        FetchSubmissions["fetch_cik_submissions\n→ Vec&lt;CikSubmission&gt;"]
        FetchNPORT["fetch_nport_filing\n→ Vec&lt;NportInvestment&gt;"]
        FetchInvCo["fetch_investment_companies\n→ Vec&lt;InvestmentCompany&gt;"]
    end

    subgraph Transformer["Transformation (3.3)"]
        Distill["distill_us_gaap_fundamental_concepts\nuses FundamentalConcept"]
    end

    Ticker -->|contains| Cik
    Ticker -->|uses| TickerOrigin
    AccessionNumber -->|contains| Cik
    CikSubmission -->|contains| Cik
    CikSubmission -->|contains| AccessionNumber
    InvestmentCompany -.->|identified by| Cik
    NportInvestment -.->|mapped to| Ticker

    FetchTickers -.->|returns| Ticker
    FetchCIK -.->|returns| Cik
    FetchSubmissions -.->|returns| CikSubmission
    FetchNPORT -.->|returns| NportInvestment
    FetchInvCo -.->|returns| InvestmentCompany

    Distill -.->|uses| FundamentalConcept

Sources: src/models.rs:1-18 src/enums.rs:1-12


Implementation Details

Error Handling

Both Cik and AccessionNumber implement custom error types to handle parsing and validation failures:

CikError:

  • InvalidCik - CIK exceeds 10 digits
  • ParseError - Parsing from string failed

AccessionNumberError:

  • InvalidLength - Accession number is not 18 digits
  • ParseError - Parsing failed
  • CikError - Wrapped CIK error

Both error types implement std::fmt::Display and std::error::Error for proper error propagation.

Sources: src/models/accession_number.rs:42-76
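
The wrapping of one error type inside another, and the propagation it enables, can be sketched like this. The variant names follow the lists above; the Display messages, the From conversion, and the two check functions are assumptions added for illustration.

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug, PartialEq, Eq)]
pub enum CikError {
    InvalidCik,
    ParseError,
}

impl fmt::Display for CikError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CikError::InvalidCik => f.write_str("CIK exceeds 10 digits"),
            CikError::ParseError => f.write_str("CIK could not be parsed"),
        }
    }
}

impl Error for CikError {}

#[derive(Debug, PartialEq, Eq)]
pub enum AccessionNumberError {
    InvalidLength,
    ParseError,
    CikError(CikError),
}

impl fmt::Display for AccessionNumberError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AccessionNumberError::InvalidLength => f.write_str("accession number is not 18 digits"),
            AccessionNumberError::ParseError => f.write_str("accession number could not be parsed"),
            AccessionNumberError::CikError(inner) => write!(f, "invalid CIK: {}", inner),
        }
    }
}

impl Error for AccessionNumberError {
    /// Exposes the wrapped CIK error as the underlying cause.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            AccessionNumberError::CikError(inner) => Some(inner),
            _ => None,
        }
    }
}

/// Assumed conversion that lets `?` turn a `CikError` into the wrapper type.
impl From<CikError> for AccessionNumberError {
    fn from(err: CikError) -> Self {
        AccessionNumberError::CikError(err)
    }
}

/// Hypothetical validation step that yields a `CikError`.
fn check_cik(value: u64) -> Result<u64, CikError> {
    if value > 9_999_999_999 {
        Err(CikError::InvalidCik)
    } else {
        Ok(value)
    }
}

/// The `?` operator converts the inner error through `From`.
pub fn check_accession_cik(value: u64) -> Result<u64, AccessionNumberError> {
    Ok(check_cik(value)?)
}
```

With a From conversion in place, code that builds an accession number can call CIK validation with ? and still return a single error type to its caller.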

Serialization

The NportInvestment struct uses serde for serialization/deserialization with serde_with for custom formatting:

  • #[serde_as(as = "DisplayFromStr")] - Applied to Decimal fields (balance, val_usd, pct_val) to serialize them as strings
  • Option<T> fields are used for nullable data from NPORT filings

Sources: src/models/nport_investment.rs:3-38

Decimal Precision

Financial values in NportInvestment use the rust_decimal::Decimal type, which provides:

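  • Fixed-precision decimal arithmetic (a 96-bit integer mantissa with a scale of 0 to 28 decimal places)
  • Exact representation of decimal fractions, avoiding binary floating-point rounding errors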
  • Safe comparison operations

Sources: src/models/nport_investment.rs:2-33
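
The motivation for a decimal type can be shown without any external crate. The sketch below contrasts f64 addition with a hypothetical scaled-integer type, Fixed4, which stands in for Decimal only to illustrate the principle of storing an integer mantissa with a fixed scale; it is not how rust_decimal is implemented or used.

```rust
/// Returns true if adding two f64 values gives exactly the expected f64.
pub fn f64_sum_matches(a: f64, b: f64, expected: f64) -> bool {
    a + b == expected
}

/// Hypothetical fixed-point value stored as an integer count of 1/10_000 units,
/// so 0.1 is `Fixed4(1_000)` and 0.3 is `Fixed4(3_000)`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Fixed4(pub i64);

impl Fixed4 {
    /// Exact addition; returns `None` on overflow instead of wrapping.
    pub fn checked_add(self, other: Fixed4) -> Option<Fixed4> {
        self.0.checked_add(other.0).map(Fixed4)
    }
}
```

In f64, 0.1 + 0.2 does not compare equal to 0.3, whereas Fixed4(1_000) plus Fixed4(2_000) is exactly Fixed4(3_000). Integer-backed values also have a total order, which is what makes comparisons such as sorting by pct_val well defined.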