This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

US GAAP Concept Transformation

Purpose and Scope

This page documents the US GAAP concept transformation system, which normalizes raw financial concept names from SEC EDGAR filings into a standardized taxonomy. The core functionality is provided by the distill_us_gaap_fundamental_concepts function, which maps the diverse US GAAP terminology (57+ revenue variations, 6 cost variants, multiple equity representations) into a consistent set of 64 FundamentalConcept enum variants.

For information about fetching US GAAP data from the SEC API, see Data Fetching Functions. For details on the data models that use these concepts, see Data Models & Enumerations. For the Python ML pipeline that processes the transformed concepts, see Python narrative_stack System.


System Overview

The transformation system acts as a critical normalization layer between raw SEC EDGAR filings and downstream data processing. Companies report financial data using various US GAAP concept names (e.g., Revenues, SalesRevenueNet, HealthCareOrganizationRevenue), and this system ensures all variations map to consistent concept identifiers.

Sources: tests/distill_us_gaap_fundamental_concepts_tests.rs:1-1275 examples/us_gaap_human_readable.rs:1-9

graph TB
    subgraph "Input Layer"
        RawFiling["SEC EDGAR Filing\nRaw JSON Data"]
RawConcepts["Raw US GAAP Concepts\n- Revenues\n- SalesRevenueNet\n- HealthCareOrganizationRevenue\n- InterestAndDividendIncomeOperating\n- 57+ revenue types in total"]
end
    
    subgraph "Transformation Layer"
        DistillFn["distill_us_gaap_fundamental_concepts"]
MappingEngine["Mapping Engine\n4 Pattern Types"]
OneToOne["One-to-One\nAssets → Assets"]
Hierarchical["Hierarchical\nAssetsCurrent → [CurrentAssets, Assets]"]
Synonyms["Synonyms\n6 cost types → CostOfRevenue"]
Industry["Industry-Specific\n57+ revenue types → Revenues"]
MappingEngine --> OneToOne
 
       MappingEngine --> Hierarchical
 
       MappingEngine --> Synonyms
 
       MappingEngine --> Industry
    end
    
    subgraph "Output Layer"
        FundamentalConcept["FundamentalConcept Enum\n64 standardized variants"]
BS["Balance Sheet Concepts\n- Assets, CurrentAssets\n- Liabilities, CurrentLiabilities\n- Equity, EquityAttributableToParent"]
IS["Income Statement Concepts\n- Revenues, CostOfRevenue\n- GrossProfit, OperatingIncomeLoss\n- NetIncomeLoss"]
CF["Cash Flow Concepts\n- NetCashFlowFromOperatingActivities\n- NetCashFlowFromInvestingActivities\n- NetCashFlowFromFinancingActivities"]
FundamentalConcept --> BS
 
       FundamentalConcept --> IS
 
       FundamentalConcept --> CF
    end
    
 
   RawFiling --> RawConcepts
 
   RawConcepts --> DistillFn
 
   DistillFn --> MappingEngine
 
   MappingEngine --> FundamentalConcept
    
    style DistillFn fill:#f9f9f9
    style FundamentalConcept fill:#f9f9f9

The FundamentalConcept Taxonomy

The FundamentalConcept enum defines 64 standardized financial concept variants, grouped here into four main categories: Balance Sheet, Income Statement, Cash Flow, and Equity. Each variant is a normalized concept that one or more raw US GAAP names map to. The diagram below shows a subset of the variants.

Sources: tests/distill_us_gaap_fundamental_concepts_tests.rs:1-1275

graph TB
    subgraph "FundamentalConcept Enum"
        Root["FundamentalConcept\n(64 total variants)"]
end
    
    subgraph "Balance Sheet Concepts"
        Assets["Assets"]
CurrentAssets["CurrentAssets"]
NoncurrentAssets["NoncurrentAssets"]
Liabilities["Liabilities"]
CurrentLiabilities["CurrentLiabilities"]
NoncurrentLiabilities["NoncurrentLiabilities"]
LiabilitiesAndEquity["LiabilitiesAndEquity"]
CommitmentsAndContingencies["CommitmentsAndContingencies"]
end
    
    subgraph "Income Statement Concepts"
        Revenues["Revenues"]
RevenuesNetInterestExpense["RevenuesNetInterestExpense"]
RevenuesExcludingInterestAndDividends["RevenuesExcludingInterestAndDividends"]
InterestAndDividendIncomeOperating["InterestAndDividendIncomeOperating"]
CostOfRevenue["CostOfRevenue"]
GrossProfit["GrossProfit"]
OperatingExpenses["OperatingExpenses"]
OperatingIncomeLoss["OperatingIncomeLoss"]
NonoperatingIncomeLoss["NonoperatingIncomeLoss"]
IncomeTaxExpenseBenefit["IncomeTaxExpenseBenefit"]
NetIncomeLoss["NetIncomeLoss"]
NetIncomeLossAttributableToParent["NetIncomeLossAttributableToParent"]
NetIncomeLossAttributableToNoncontrollingInterest["NetIncomeLossAttributableToNoncontrollingInterest"]
end
    
    subgraph "Cash Flow Concepts"
        NetCashFlow["NetCashFlow"]
NetCashFlowContinuing["NetCashFlowContinuing"]
NetCashFlowDiscontinued["NetCashFlowDiscontinued"]
NetCashFlowFromOperatingActivities["NetCashFlowFromOperatingActivities"]
NetCashFlowFromInvestingActivities["NetCashFlowFromInvestingActivities"]
NetCashFlowFromFinancingActivities["NetCashFlowFromFinancingActivities"]
ExchangeGainsLosses["ExchangeGainsLosses"]
end
    
    subgraph "Equity Concepts"
        Equity["Equity"]
EquityAttributableToParent["EquityAttributableToParent"]
EquityAttributableToNoncontrollingInterest["EquityAttributableToNoncontrollingInterest"]
TemporaryEquity["TemporaryEquity"]
RedeemableNoncontrollingInterest["RedeemableNoncontrollingInterest"]
end
    
 
   Root --> Assets
 
   Root --> CurrentAssets
 
   Root --> NoncurrentAssets
 
   Root --> Liabilities
 
   Root --> CurrentLiabilities
 
   Root --> NoncurrentLiabilities
 
   Root --> LiabilitiesAndEquity
 
   Root --> CommitmentsAndContingencies
    
 
   Root --> Revenues
 
   Root --> RevenuesNetInterestExpense
 
   Root --> RevenuesExcludingInterestAndDividends
 
   Root --> InterestAndDividendIncomeOperating
 
   Root --> CostOfRevenue
 
   Root --> GrossProfit
 
   Root --> OperatingExpenses
 
   Root --> OperatingIncomeLoss
 
   Root --> NonoperatingIncomeLoss
 
   Root --> IncomeTaxExpenseBenefit
 
   Root --> NetIncomeLoss
 
   Root --> NetIncomeLossAttributableToParent
 
   Root --> NetIncomeLossAttributableToNoncontrollingInterest
    
 
   Root --> NetCashFlow
 
   Root --> NetCashFlowContinuing
 
   Root --> NetCashFlowDiscontinued
 
   Root --> NetCashFlowFromOperatingActivities
 
   Root --> NetCashFlowFromInvestingActivities
 
   Root --> NetCashFlowFromFinancingActivities
 
   Root --> ExchangeGainsLosses
    
 
   Root --> Equity
 
   Root --> EquityAttributableToParent
 
   Root --> EquityAttributableToNoncontrollingInterest
 
   Root --> TemporaryEquity
 
   Root --> RedeemableNoncontrollingInterest
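As a rough illustration of how such a taxonomy can be expressed in Rust, the sketch below declares a cut-down enum with ten of the variants from the diagram and groups them by category. The names `Concept`, `Category`, and `category` are hypothetical and exist only for this example; the real FundamentalConcept enum has 64 variants, and its actual definition is not shown on this page.

```rust
// Cut-down stand-in for FundamentalConcept (the real enum has 64 variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Concept {
    Assets,
    CurrentAssets,
    Liabilities,
    Revenues,
    CostOfRevenue,
    NetIncomeLoss,
    NetCashFlow,
    NetCashFlowFromOperatingActivities,
    Equity,
    TemporaryEquity,
}

// Hypothetical grouping that mirrors the four subgraphs in the diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Category {
    BalanceSheet,
    IncomeStatement,
    CashFlow,
    Equity,
}

// Hypothetical helper: report which category a variant belongs to.
fn category(concept: Concept) -> Category {
    match concept {
        Concept::Assets | Concept::CurrentAssets | Concept::Liabilities => Category::BalanceSheet,
        Concept::Revenues | Concept::CostOfRevenue | Concept::NetIncomeLoss => {
            Category::IncomeStatement
        }
        Concept::NetCashFlow | Concept::NetCashFlowFromOperatingActivities => Category::CashFlow,
        Concept::Equity | Concept::TemporaryEquity => Category::Equity,
    }
}

fn main() {
    let samples = [Concept::CurrentAssets, Concept::Revenues, Concept::TemporaryEquity];
    for concept in samples.iter() {
        println!("{:?} belongs to {:?}", concept, category(*concept));
    }
}
```

Because the `match` is exhaustive, adding a variant to the enum without assigning it a category is a compile error, which is one reason an enum suits a fixed taxonomy like this.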

Mapping Pattern Types

The transformation system implements four distinct mapping patterns to handle the diverse ways companies report financial concepts.

Pattern 1: One-to-One Mapping

In the simplest pattern, a single US GAAP concept name maps directly to exactly one FundamentalConcept variant.

graph LR
 
   A["Assets"] --> FA["FundamentalConcept::Assets"]
B["Liabilities"] --> FB["FundamentalConcept::Liabilities"]
C["GrossProfit"] --> FC["FundamentalConcept::GrossProfit"]
D["CommitmentsAndContingencies"] --> FD["FundamentalConcept::CommitmentsAndContingencies"]
    style FA fill:#f9f9f9
    style FB fill:#f9f9f9
    style FC fill:#f9f9f9
    style FD fill:#f9f9f9

| Raw US GAAP Concept | FundamentalConcept Output |
| --- | --- |
| Assets | vec![Assets] |
| Liabilities | vec![Liabilities] |
| GrossProfit | vec![GrossProfit] |
| OperatingIncomeLoss | vec![OperatingIncomeLoss] |
| CommitmentsAndContingencies | vec![CommitmentsAndContingencies] |

Sources: tests/distill_us_gaap_fundamental_concepts_tests.rs:4-17 tests/distill_us_gaap_fundamental_concepts_tests.rs:31-36 tests/distill_us_gaap_fundamental_concepts_tests.rs:336-341
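A minimal sketch of the one-to-one pattern, assuming a plain `match` on the concept name. `Concept` and `distill_one_to_one` are hypothetical stand-ins for FundamentalConcept and distill_us_gaap_fundamental_concepts; the page does not show how the real function is implemented, only what it returns.

```rust
// Cut-down stand-in for FundamentalConcept, limited to the rows in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Concept {
    Assets,
    Liabilities,
    GrossProfit,
    OperatingIncomeLoss,
    CommitmentsAndContingencies,
}

// Hypothetical stand-in: each recognized name yields a one-element Vec.
fn distill_one_to_one(raw: &str) -> Option<Vec<Concept>> {
    let concept = match raw {
        "Assets" => Concept::Assets,
        "Liabilities" => Concept::Liabilities,
        "GrossProfit" => Concept::GrossProfit,
        "OperatingIncomeLoss" => Concept::OperatingIncomeLoss,
        "CommitmentsAndContingencies" => Concept::CommitmentsAndContingencies,
        // Names outside the table are treated as unmapped in this sketch.
        _ => return None,
    };
    Some(vec![concept])
}

fn main() {
    println!("{:?}", distill_one_to_one("GrossProfit"));
    println!("{:?}", distill_one_to_one("NotInTheTable"));
}
```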

Pattern 2: Hierarchical Mapping

A specific concept maps to several variants: the specific variant first, followed by its parent category. This enables queries at different levels of granularity.

graph LR
    subgraph "Input"
        CA["AssetsCurrent"]
CL["LiabilitiesCurrent"]
SE["StockholdersEquity"]
end
    
    subgraph "Output: Multiple Concepts"
        CA1["FundamentalConcept::CurrentAssets"]
CA2["FundamentalConcept::Assets"]
SE1["FundamentalConcept::EquityAttributableToParent"]
SE2["FundamentalConcept::Equity"]
end
    
 
   CA --> CA1
 
   CA --> CA2
 
   SE --> SE1
 
   SE --> SE2
    
    style CA1 fill:#f9f9f9
    style CA2 fill:#f9f9f9
    style SE1 fill:#f9f9f9
    style SE2 fill:#f9f9f9

| Raw US GAAP Concept | FundamentalConcept Output (Ordered) |
| --- | --- |
| AssetsCurrent | vec![CurrentAssets, Assets] |
| StockholdersEquity | vec![EquityAttributableToParent, Equity] |
| NetIncomeLoss | vec![NetIncomeLossAttributableToParent, NetIncomeLoss] |
| IncomeLossFromContinuingOperations | vec![IncomeLossFromContinuingOperationsAfterTax, NetIncomeLoss] |

Sources: tests/distill_us_gaap_fundamental_concepts_tests.rs:10-17 tests/distill_us_gaap_fundamental_concepts_tests.rs:158-165 tests/distill_us_gaap_fundamental_concepts_tests.rs:262-286 tests/distill_us_gaap_fundamental_concepts_tests.rs:692-698
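To make the granularity point concrete, the sketch below maps three raw names and then asks which of them roll up into a given concept. `Concept`, `distill_hierarchical`, and `raw_names_for` are hypothetical names for this example only; the mappings are copied from the tables above, and the query helper is an assumption about how a caller might use the result.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Concept {
    Assets,
    CurrentAssets,
    Equity,
    EquityAttributableToParent,
}

// Hypothetical stand-in for the real function, covering three raw names.
fn distill_hierarchical(raw: &str) -> Option<Vec<Concept>> {
    match raw {
        "Assets" => Some(vec![Concept::Assets]),
        // Specific variant first, parent category second.
        "AssetsCurrent" => Some(vec![Concept::CurrentAssets, Concept::Assets]),
        "StockholdersEquity" => Some(vec![Concept::EquityAttributableToParent, Concept::Equity]),
        _ => None,
    }
}

// Hypothetical helper: keep the raw names whose mapping includes `target`.
fn raw_names_for<'a>(raw_names: &[&'a str], target: Concept) -> Vec<&'a str> {
    raw_names
        .iter()
        .copied()
        .filter(|raw| {
            distill_hierarchical(raw).map_or(false, |concepts| concepts.contains(&target))
        })
        .collect()
}

fn main() {
    let names = ["Assets", "AssetsCurrent", "StockholdersEquity"];
    println!("Assets-level: {:?}", raw_names_for(&names, Concept::Assets));
    println!("CurrentAssets-level: {:?}", raw_names_for(&names, Concept::CurrentAssets));
}
```

Querying for `Concept::Assets` picks up both the total and the current-assets name, while querying for `Concept::CurrentAssets` picks up only the latter.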

Pattern 3: Synonym Consolidation

Multiple US GAAP concept names that represent the same financial concept are consolidated into a single FundamentalConcept variant.

graph LR
    subgraph "Input: 6 Cost Variants"
        C1["CostOfRevenue"]
C2["CostOfGoodsAndServicesSold"]
C3["CostOfServices"]
C4["CostOfGoodsSold"]
C5["CostOfGoodsSoldExcludingDepreciationDepletionAndAmortization"]
C6["CostOfGoodsSoldElectric"]
end
    
    subgraph "Output: Single Concept"
        CO["FundamentalConcept::CostOfRevenue"]
end
    
 
   C1 --> CO
 
   C2 --> CO
 
   C3 --> CO
 
   C4 --> CO
 
   C5 --> CO
 
   C6 --> CO
    
    style CO fill:#f9f9f9

Cost of Revenue Synonyms

| Raw US GAAP Concept | FundamentalConcept Output |
| --- | --- |
| CostOfRevenue | vec![CostOfRevenue] |
| CostOfGoodsAndServicesSold | vec![CostOfRevenue] |
| CostOfServices | vec![CostOfRevenue] |
| CostOfGoodsSold | vec![CostOfRevenue] |
| CostOfGoodsSoldExcludingDepreciationDepletionAndAmortization | vec![CostOfRevenue] |
| CostOfGoodsSoldElectric | vec![CostOfRevenue] |

Equity Noncontrolling Interest Synonyms

| Raw US GAAP Concept | FundamentalConcept Output |
| --- | --- |
| MinorityInterest | vec![EquityAttributableToNoncontrollingInterest] |
| PartnersCapitalAttributableToNoncontrollingInterest | vec![EquityAttributableToNoncontrollingInterest] |
| MinorityInterestInLimitedPartnerships | vec![EquityAttributableToNoncontrollingInterest] |
| MinorityInterestInOperatingPartnerships | vec![EquityAttributableToNoncontrollingInterest] |
| MinorityInterestInJointVentures | vec![EquityAttributableToNoncontrollingInterest] |
| NonredeemableNoncontrollingInterest | vec![EquityAttributableToNoncontrollingInterest] |
| NoncontrollingInterestInVariableInterestEntity | vec![EquityAttributableToNoncontrollingInterest] |

Sources: tests/distill_us_gaap_fundamental_concepts_tests.rs:80-112 tests/distill_us_gaap_fundamental_concepts_tests.rs:196-259
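Synonym consolidation fits naturally onto Rust's or-patterns, where several string literals share one `match` arm. The sketch below is one possible way to express the two synonym tables above; `Concept` and `distill_synonym` are hypothetical names, and nothing here is taken from the real implementation.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Concept {
    CostOfRevenue,
    EquityAttributableToNoncontrollingInterest,
}

// Hypothetical stand-in: every synonym in an arm yields the same single variant.
fn distill_synonym(raw: &str) -> Option<Vec<Concept>> {
    match raw {
        "CostOfRevenue"
        | "CostOfGoodsAndServicesSold"
        | "CostOfServices"
        | "CostOfGoodsSold"
        | "CostOfGoodsSoldExcludingDepreciationDepletionAndAmortization"
        | "CostOfGoodsSoldElectric" => Some(vec![Concept::CostOfRevenue]),
        "MinorityInterest"
        | "PartnersCapitalAttributableToNoncontrollingInterest"
        | "MinorityInterestInLimitedPartnerships"
        | "MinorityInterestInOperatingPartnerships"
        | "MinorityInterestInJointVentures"
        | "NonredeemableNoncontrollingInterest"
        | "NoncontrollingInterestInVariableInterestEntity" => {
            Some(vec![Concept::EquityAttributableToNoncontrollingInterest])
        }
        _ => None,
    }
}

fn main() {
    for raw in ["CostOfServices", "CostOfGoodsSoldElectric", "MinorityInterest"].iter() {
        println!("{} -> {:?}", raw, distill_synonym(raw));
    }
}
```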

Pattern 4: Industry-Specific Revenue Mapping

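The most complex pattern handles 57+ industry-specific revenue variations, all of which map to the Revenues concept. Some revenue types additionally map to a more specific variant (for example, InterestAndDividendIncomeOperating), which is listed before Revenues in the output.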

graph TB
    subgraph "Industry-Specific Revenue Types"
        R1["Revenues"]
R2["SalesRevenueNet"]
R3["HealthCareOrganizationRevenue"]
R4["RealEstateRevenueNet"]
R5["OilAndGasRevenue"]
R6["FinancialServicesRevenue"]
R7["AdvertisingRevenue"]
R8["SubscriptionRevenue"]
R9["RoyaltyRevenue"]
R10["ElectricUtilityRevenue"]
R11["PassengerRevenue"]
R12["CargoAndFreightRevenue"]
R13["... 45+ more types"]
end
    
    subgraph "Special Revenue Categories"
        RH1["InterestAndDividendIncomeOperating"]
RH2["RevenuesExcludingInterestAndDividends"]
RH3["RevenuesNetOfInterestExpense"]
RH4["InvestmentBankingRevenue"]
end
    
    subgraph "Output Concepts"
        Rev["FundamentalConcept::Revenues"]
RevInt["FundamentalConcept::InterestAndDividendIncomeOperating"]
RevExcl["FundamentalConcept::RevenuesExcludingInterestAndDividends"]
RevNet["FundamentalConcept::RevenuesNetInterestExpense"]
end
    
 
   R1 --> Rev
 
   R2 --> Rev
 
   R3 --> Rev
 
   R4 --> Rev
 
   R5 --> Rev
 
   R6 --> Rev
 
   R7 --> Rev
 
   R8 --> Rev
 
   R9 --> Rev
 
   R10 --> Rev
 
   R11 --> Rev
 
   R12 --> Rev
 
   R13 --> Rev
    
 
   RH1 --> RevInt
 
   RH1 --> Rev
 
   RH2 --> RevExcl
 
   RH2 --> Rev
 
   RH3 --> RevNet
 
   RH3 --> Rev
 
   RH4 --> RevExcl
 
   RH4 --> Rev
    
    style Rev fill:#f9f9f9
    style RevInt fill:#f9f9f9
    style RevExcl fill:#f9f9f9
    style RevNet fill:#f9f9f9

Sample Revenue Mappings

| Industry | Raw US GAAP Concept | FundamentalConcept Output |
| --- | --- | --- |
| General | Revenues | vec![Revenues] |
| General | SalesRevenueNet | vec![Revenues] |
| Healthcare | HealthCareOrganizationRevenue | vec![Revenues] |
| Real Estate | RealEstateRevenueNet | vec![Revenues] |
| Energy | OilAndGasRevenue | vec![Revenues] |
| Mining | RevenueMineralSales | vec![Revenues] |
| Hospitality | RevenueFromLeasedAndOwnedHotels | vec![Revenues] |
| Franchise | FranchisorRevenue | vec![Revenues] |
| Media | SubscriptionRevenue | vec![Revenues] |
| Media | AdvertisingRevenue | vec![Revenues] |
| Entertainment | AdmissionsRevenue | vec![Revenues] |
| Licensing | LicensesRevenue | vec![Revenues] |
| Licensing | RoyaltyRevenue | vec![Revenues] |
| Transportation | PassengerRevenue | vec![Revenues] |
| Transportation | CargoAndFreightRevenue | vec![Revenues] |
| Utilities | ElectricUtilityRevenue | vec![Revenues] |
| Financial | InterestAndDividendIncomeOperating | vec![InterestAndDividendIncomeOperating, Revenues] |
| Financial | InvestmentBankingRevenue | vec![RevenuesExcludingInterestAndDividends, Revenues] |
Sources: tests/distill_us_gaap_fundamental_concepts_tests.rs:954-1188 tests/distill_us_gaap_fundamental_concepts_tests.rs:1191-1225
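The revenue pattern combines a long list of plain synonyms with a handful of special cases that carry an extra, more specific variant. A sketch under stated assumptions: `Concept`, `PLAIN_REVENUE_NAMES`, and `distill_revenue` are hypothetical, the name list is only a sample of the 57+ variations, and the output order for RevenuesExcludingInterestAndDividends and RevenuesNetOfInterestExpense (specific variant first) is assumed by analogy with the two financial rows in the table.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Concept {
    Revenues,
    InterestAndDividendIncomeOperating,
    RevenuesExcludingInterestAndDividends,
    RevenuesNetInterestExpense,
}

// A sample of the industry-specific names that map to Revenues alone.
const PLAIN_REVENUE_NAMES: &[&str] = &[
    "Revenues",
    "SalesRevenueNet",
    "HealthCareOrganizationRevenue",
    "RealEstateRevenueNet",
    "OilAndGasRevenue",
    "SubscriptionRevenue",
    "AdvertisingRevenue",
    "RoyaltyRevenue",
    "PassengerRevenue",
    "CargoAndFreightRevenue",
    "ElectricUtilityRevenue",
];

// Hypothetical stand-in: special categories first, plain synonyms second.
fn distill_revenue(raw: &str) -> Option<Vec<Concept>> {
    match raw {
        "InterestAndDividendIncomeOperating" => Some(vec![
            Concept::InterestAndDividendIncomeOperating,
            Concept::Revenues,
        ]),
        "RevenuesExcludingInterestAndDividends" | "InvestmentBankingRevenue" => Some(vec![
            Concept::RevenuesExcludingInterestAndDividends,
            Concept::Revenues,
        ]),
        "RevenuesNetOfInterestExpense" => {
            Some(vec![Concept::RevenuesNetInterestExpense, Concept::Revenues])
        }
        name if PLAIN_REVENUE_NAMES.iter().any(|known| *known == name) => {
            Some(vec![Concept::Revenues])
        }
        _ => None,
    }
}

fn main() {
    for raw in ["OilAndGasRevenue", "InvestmentBankingRevenue"].iter() {
        println!("{} -> {:?}", raw, distill_revenue(raw));
    }
}
```

Every recognized name in this sketch ends with `Concept::Revenues`, so a consumer that only cares about total revenue can ignore the leading, more specific variant.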


The distill_us_gaap_fundamental_concepts Function

The core transformation function accepts a string representation of a US GAAP concept name and returns an Option<Vec<FundamentalConcept>>. The return type is an Option because not all US GAAP concepts are mapped (unmapped concepts return None), and a Vec because some concepts map to multiple standardized variants (hierarchical pattern).

graph TB
    subgraph "Function Input"
        InputStr["Input: &str\nUS GAAP concept name"]
end
    
    subgraph "distill_us_gaap_fundamental_concepts"
        Parse["Parse input string"]
MatchEngine["Pattern Matching Engine"]
Decision{"Concept\nRecognized?"}
BuildVec["Construct Vec<FundamentalConcept>\nApply mapping pattern"]
ReturnSome["Return Some(Vec<FundamentalConcept>)"]
ReturnNone["Return None"]
end
    
    subgraph "Function Output"
        OutputSome["Option::Some(Vec<FundamentalConcept>)\n1-2 variants typically"]
OutputNone["Option::None\nUnrecognized concept"]
end
    
 
   InputStr --> Parse
 
   Parse --> MatchEngine
 
   MatchEngine --> Decision
 
   Decision -->|Yes| BuildVec
 
   Decision -->|No| ReturnNone
 
   BuildVec --> ReturnSome
 
   ReturnSome --> OutputSome
 
   ReturnNone --> OutputNone
    
    style MatchEngine fill:#f9f9f9
    style ReturnSome fill:#f9f9f9

Function Signature and Flow

Example Usage

Sources: examples/us_gaap_human_readable.rs:1-9 tests/distill_us_gaap_fundamental_concepts_tests.rs:4-8
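The original listings for this section are not reproduced on this page. As a hedged illustration of the described shape (a `&str` in, an `Option<Vec<...>>` out), the sketch below shows how a caller might handle both the mapped and the unmapped case. `Concept`, `distill_stub`, and `describe` are hypothetical names; the stub covers only three raw names and is not the real function.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Concept {
    Assets,
    CurrentAssets,
    Revenues,
}

// Hypothetical stub with the same input and output types the page describes.
fn distill_stub(raw: &str) -> Option<Vec<Concept>> {
    match raw {
        "Assets" => Some(vec![Concept::Assets]),
        "AssetsCurrent" => Some(vec![Concept::CurrentAssets, Concept::Assets]),
        "SalesRevenueNet" => Some(vec![Concept::Revenues]),
        _ => None,
    }
}

// Hypothetical helper: render a mapping, or mark the name as unmapped.
fn describe(raw: &str) -> String {
    match distill_stub(raw) {
        Some(concepts) => {
            let names: Vec<String> = concepts.iter().map(|c| format!("{:?}", c)).collect();
            format!("{} -> [{}]", raw, names.join(", "))
        }
        None => format!("{} -> (unmapped)", raw),
    }
}

fn main() {
    for raw in ["Assets", "AssetsCurrent", "SalesRevenueNet", "NotMapped"].iter() {
        println!("{}", describe(raw));
    }
}
```

Returning `None` rather than an empty `Vec` lets callers tell an unrecognized name apart from a recognized one at the type level, for example with `if let Some(concepts) = ...`.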


Complex Hierarchical Examples

Some concepts demonstrate multi-level hierarchical relationships where a specific concept may relate to multiple parent categories.

graph TD
    subgraph "Net Income Loss Hierarchy"
        NIL1["NetIncomeLoss\n(raw input)"]
NIL2["FundamentalConcept::NetIncomeLossAttributableToParent"]
NIL3["FundamentalConcept::NetIncomeLoss"]
NILCS["NetIncomeLossAvailableToCommonStockholdersBasic\n(raw input)"]
NILCS1["FundamentalConcept::NetIncomeLossAvailableToCommonStockholdersBasic"]
NILCS2["FundamentalConcept::NetIncomeLoss"]
ILCO1["IncomeLossFromContinuingOperations\n(raw input)"]
ILCO2["FundamentalConcept::IncomeLossFromContinuingOperationsAfterTax"]
ILCO3["FundamentalConcept::NetIncomeLoss"]
NIL1 --> NIL2
 
       NIL1 --> NIL3
 
       NILCS --> NILCS1
 
       NILCS --> NILCS2
 
       ILCO1 --> ILCO2
 
       ILCO1 --> ILCO3
    end
    
    subgraph "Comprehensive Income Hierarchy"
        CI1["ComprehensiveIncomeNetOfTax\n(raw input)"]
CI2["FundamentalConcept::ComprehensiveIncomeLossAttributableToParent"]
CI3["FundamentalConcept::ComprehensiveIncomeLoss"]
CI1 --> CI2
 
       CI1 --> CI3
    end
    
    subgraph "Revenue Hierarchy"
        RNOI["RevenuesNetOfInterestExpense\n(raw input)"]
RNOI1["FundamentalConcept::RevenuesNetInterestExpense"]
RNOI2["FundamentalConcept::Revenues"]
RNOI --> RNOI1
 
       RNOI --> RNOI2
    end
    
    style NIL2 fill:#f9f9f9
    style NIL3 fill:#f9f9f9
    style NILCS1 fill:#f9f9f9
    style NILCS2 fill:#f9f9f9
    style ILCO2 fill:#f9f9f9
    style ILCO3 fill:#f9f9f9
    style CI2 fill:#f9f9f9
    style CI3 fill:#f9f9f9
    style RNOI1 fill:#f9f9f9
    style RNOI2 fill:#f9f9f9

Income Statement Hierarchies

Sources: tests/distill_us_gaap_fundamental_concepts_tests.rs:686-725 tests/distill_us_gaap_fundamental_concepts_tests.rs:39-77 tests/distill_us_gaap_fundamental_concepts_tests.rs:976-982
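If the output order is taken to be "most specific first, broadest last", as the ordered examples on this page suggest, a caller can pick either end of the result. The sketch below encodes the income mappings from the diagram above; `Concept`, `distill_income`, `most_specific`, and `broadest` are hypothetical names, and the order for NetIncomeLossAvailableToCommonStockholdersBasic and ComprehensiveIncomeNetOfTax is an assumption read off the diagram's node order.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Concept {
    NetIncomeLoss,
    NetIncomeLossAttributableToParent,
    NetIncomeLossAvailableToCommonStockholdersBasic,
    IncomeLossFromContinuingOperationsAfterTax,
    ComprehensiveIncomeLoss,
    ComprehensiveIncomeLossAttributableToParent,
}

// Hypothetical stand-in covering the four raw inputs shown in the diagram.
fn distill_income(raw: &str) -> Option<Vec<Concept>> {
    let concepts = match raw {
        "NetIncomeLoss" => vec![
            Concept::NetIncomeLossAttributableToParent,
            Concept::NetIncomeLoss,
        ],
        "NetIncomeLossAvailableToCommonStockholdersBasic" => vec![
            Concept::NetIncomeLossAvailableToCommonStockholdersBasic,
            Concept::NetIncomeLoss,
        ],
        "IncomeLossFromContinuingOperations" => vec![
            Concept::IncomeLossFromContinuingOperationsAfterTax,
            Concept::NetIncomeLoss,
        ],
        "ComprehensiveIncomeNetOfTax" => vec![
            Concept::ComprehensiveIncomeLossAttributableToParent,
            Concept::ComprehensiveIncomeLoss,
        ],
        _ => return None,
    };
    Some(concepts)
}

// First element: the most specific variant (under the ordering assumption).
fn most_specific(raw: &str) -> Option<Concept> {
    distill_income(raw).and_then(|concepts| concepts.first().copied())
}

// Last element: the broadest parent category (same assumption).
fn broadest(raw: &str) -> Option<Concept> {
    distill_income(raw).and_then(|concepts| concepts.last().copied())
}

fn main() {
    let raw = "IncomeLossFromContinuingOperations";
    println!("specific: {:?}, broadest: {:?}", most_specific(raw), broadest(raw));
}
```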

Cash Flow Hierarchies

Sources: tests/distill_us_gaap_fundamental_concepts_tests.rs:550-683


Equity Concept Mappings

Equity concepts demonstrate all four mapping patterns, including complex organizational structures (stockholders vs. partners vs. members) and noncontrolling interest variations.

Equity Structure Overview

| Organizational Type | Raw US GAAP Concept | FundamentalConcept Output |
| --- | --- | --- |
| Corporation (total) | StockholdersEquityIncludingPortionAttributableToNoncontrollingInterest | vec![Equity] |
| Corporation (parent) | StockholdersEquity | vec![EquityAttributableToParent, Equity] |
| Partnership (total) | PartnersCapitalIncludingPortionAttributableToNoncontrollingInterest | vec![Equity] |
| Partnership (parent) | PartnersCapital | vec![EquityAttributableToParent, Equity] |
| LLC (parent) | MembersEquity | vec![EquityAttributableToParent, Equity] |
| Generic | CommonStockholdersEquity | vec![Equity] |

Temporary Equity Variations

| Raw US GAAP Concept | FundamentalConcept Output |
| --- | --- |
| TemporaryEquityCarryingAmount | vec![TemporaryEquity] |
| TemporaryEquityRedemptionValue | vec![TemporaryEquity] |
| RedeemablePreferredStockCarryingAmount | vec![TemporaryEquity] |
| TemporaryEquityCarryingAmountAttributableToParent | vec![TemporaryEquity] |
| TemporaryEquityCarryingAmountAttributableToNoncontrollingInterest | vec![TemporaryEquity] |
| TemporaryEquityLiquidationPreference | vec![TemporaryEquity] |

Sources: tests/distill_us_gaap_fundamental_concepts_tests.rs:150-193 tests/distill_us_gaap_fundamental_concepts_tests.rs:262-286 tests/distill_us_gaap_fundamental_concepts_tests.rs:1228-1274
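The equity tables above can also be read as plain lookup data. The sketch below stores a few of their rows in a `HashMap` instead of a `match`; this table-driven layout is an alternative chosen for illustration, not a description of how the real function stores its mappings. `Concept`, `equity_table`, and `distill_equity` are hypothetical names.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Concept {
    Equity,
    EquityAttributableToParent,
    TemporaryEquity,
}

// Hypothetical lookup table; rows are copied from the equity tables above.
fn equity_table() -> HashMap<&'static str, Vec<Concept>> {
    let parent = vec![Concept::EquityAttributableToParent, Concept::Equity];
    let mut table = HashMap::new();
    table.insert(
        "StockholdersEquityIncludingPortionAttributableToNoncontrollingInterest",
        vec![Concept::Equity],
    );
    table.insert("StockholdersEquity", parent.clone());
    table.insert(
        "PartnersCapitalIncludingPortionAttributableToNoncontrollingInterest",
        vec![Concept::Equity],
    );
    table.insert("PartnersCapital", parent.clone());
    table.insert("MembersEquity", parent);
    table.insert("CommonStockholdersEquity", vec![Concept::Equity]);
    table.insert("TemporaryEquityCarryingAmount", vec![Concept::TemporaryEquity]);
    table.insert("TemporaryEquityRedemptionValue", vec![Concept::TemporaryEquity]);
    table
}

// Hypothetical lookup: names missing from the table come back as None.
fn distill_equity(table: &HashMap<&'static str, Vec<Concept>>, raw: &str) -> Option<Vec<Concept>> {
    table.get(raw).cloned()
}

fn main() {
    let table = equity_table();
    for raw in ["StockholdersEquity", "PartnersCapital", "MembersEquity"].iter() {
        println!("{} -> {:?}", raw, distill_equity(&table, raw));
    }
}
```

In this sketch the corporation, partnership, and LLC "parent" names all resolve to the same two-element result, which is what makes equity comparable across organizational types.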


Testing Strategy

The transformation system has comprehensive test coverage: 70+ test functions cover all 64 FundamentalConcept variants and their various input mappings.

Test Organization

graph TB
    subgraph "Test Structure"
        TestFile["distill_us_gaap_fundamental_concepts_tests.rs\n1275 lines, 70+ tests"]
        BalanceSheet["Balance Sheet Tests\n- Assets (one-to-one)\n- Current Assets (hierarchical)\n- Equity (multiple org types)\n- Temporary Equity (7 variations)"]
        IncomeStatement["Income Statement Tests\n- Revenues (57+ variations)\n- Cost of Revenue (6 synonyms)\n- Net Income (hierarchical)\n- Comprehensive Income (hierarchical)"]
        CashFlow["Cash Flow Tests\n- Operating Activities\n- Investing Activities\n- Financing Activities\n- Continuing vs Discontinued"]
        Special["Special Category Tests\n- Commitments and Contingencies\n- Nature of Operations\n- Exchange Gains/Losses\n- Research and Development"]
        TestFile --> BalanceSheet
        TestFile --> IncomeStatement
        TestFile --> CashFlow
        TestFile --> Special
    end

Test Pattern Examples

Each test function validates one or more related mappings:
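The original test listings are not reproduced on this page. The following is only a sketch of what such tests might look like, with one test per mapping pattern. `Concept`, `distill_stub`, and the test names are hypothetical; the real tests live in tests/distill_us_gaap_fundamental_concepts_tests.rs and call the real function.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Concept {
    Assets,
    CurrentAssets,
    CostOfRevenue,
}

// Hypothetical stub standing in for the function under test.
fn distill_stub(raw: &str) -> Option<Vec<Concept>> {
    match raw {
        "Assets" => Some(vec![Concept::Assets]),
        "AssetsCurrent" => Some(vec![Concept::CurrentAssets, Concept::Assets]),
        "CostOfRevenue" | "CostOfGoodsSold" | "CostOfServices" => {
            Some(vec![Concept::CostOfRevenue])
        }
        _ => None,
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // One-to-one: a single raw name yields a single variant.
    #[test]
    fn assets_maps_one_to_one() {
        assert_eq!(distill_stub("Assets"), Some(vec![Concept::Assets]));
    }

    // Hierarchical: the specific variant is followed by its parent.
    #[test]
    fn current_assets_also_rolls_up_to_assets() {
        assert_eq!(
            distill_stub("AssetsCurrent"),
            Some(vec![Concept::CurrentAssets, Concept::Assets])
        );
    }

    // Synonyms: several related raw names are checked in one test.
    #[test]
    fn cost_synonyms_share_one_concept() {
        for raw in ["CostOfRevenue", "CostOfGoodsSold", "CostOfServices"].iter() {
            assert_eq!(distill_stub(raw), Some(vec![Concept::CostOfRevenue]));
        }
    }
}

fn main() {
    println!("{:?}", distill_stub("AssetsCurrent"));
}
```

Asserting on the whole `Option<Vec<...>>` value checks the presence of a mapping, the set of variants, and their order in a single comparison.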

Test Coverage Summary

| Concept Category | Test Functions | Total Assertions | Pattern Types Tested |
| --- | --- | --- | --- |
| Balance Sheet | 15 | 35+ | All 4 patterns |
| Income Statement | 25 | 80+ | All 4 patterns |
| Cash Flow | 18 | 25+ | Hierarchical, One-to-One |
| Equity | 12 | 40+ | All 4 patterns |
| Total | 70+ | 180+ | All 4 patterns |

Sources: tests/distill_us_gaap_fundamental_concepts_tests.rs:1-1275


Integration with Data Pipeline

The distill_us_gaap_fundamental_concepts function is a critical component in the broader data processing pipeline, serving as the normalization layer between raw SEC filings and structured data storage.

graph TB
    subgraph "Data Source"
        SECAPI["SEC EDGAR API\ndata endpoint"]
        RawJSON["Raw JSON Response\nCompany Facts"]
    end

    subgraph "Rust Processing Pipeline"
        FetchFn["fetch_us_gaap_fundamentals\n(network module)"]
        ParseJSON["Parse JSON\nExtract concept names"]
        DistillFn["distill_us_gaap_fundamental_concepts\n(transformers module)"]
        FilterMapped["Filter to mapped concepts\nDiscard None results"]
        BuildRecords["Build data records\nwith FundamentalConcept enum"]
    end

    subgraph "Storage Layer"
        CSVOutput["CSV Files\nus-gaap/[ticker].csv"]
        PolarsDF["Polars DataFrame\nStructured columns"]
    end

    subgraph "Python ML Pipeline"
        Ingestion["Data Ingestion\nus_gaap_store.ingest_us_gaap_csvs"]
        Preprocessing["Preprocessing\nConcept/unit pair extraction"]
    end

    SECAPI --> RawJSON
    RawJSON --> FetchFn
    FetchFn --> ParseJSON
    ParseJSON --> DistillFn
    DistillFn --> FilterMapped
    FilterMapped --> BuildRecords
    BuildRecords --> PolarsDF
    PolarsDF --> CSVOutput
    CSVOutput --> Ingestion
    Ingestion --> Preprocessing

    style DistillFn fill:#f9f9f9
    style FilterMapped fill:#f9f9f9

Sources: tests/distill_us_gaap_fundamental_concepts_tests.rs:1-1275 examples/us_gaap_human_readable.rs:1-9
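The "filter to mapped concepts, discard None results" step in the pipeline diagram can be sketched with `filter_map`. Everything below is illustrative: `Concept`, `distill_stub`, `Record`, and `build_records` are hypothetical names, the record layout is an assumption, and the real pipeline builds Polars DataFrames rather than plain structs.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Concept {
    Revenues,
    CurrentAssets,
    Assets,
}

// Hypothetical stub standing in for the real transformation function.
fn distill_stub(raw: &str) -> Option<Vec<Concept>> {
    match raw {
        "Revenues" | "SalesRevenueNet" => Some(vec![Concept::Revenues]),
        "AssetsCurrent" => Some(vec![Concept::CurrentAssets, Concept::Assets]),
        _ => None,
    }
}

// Hypothetical record shape: the raw name, its normalized concepts, and a value.
#[derive(Debug, PartialEq)]
struct Record {
    raw_name: String,
    concepts: Vec<Concept>,
    value: f64,
}

// Keep only facts whose name is mapped; unmapped names are dropped.
fn build_records(facts: &[(&str, f64)]) -> Vec<Record> {
    facts
        .iter()
        .filter_map(|&(raw, value)| {
            distill_stub(raw).map(|concepts| Record {
                raw_name: raw.to_string(),
                concepts,
                value,
            })
        })
        .collect()
}

fn main() {
    let facts = [
        ("SalesRevenueNet", 100.0),
        ("SomeUnmappedConcept", 7.0),
        ("AssetsCurrent", 40.0),
    ];
    for record in build_records(&facts) {
        println!("{:?}", record);
    }
}
```

Because `distill_stub` returns an `Option`, `filter_map` performs the filtering and the record construction in one pass.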


Summary

The US GAAP concept transformation system provides:

  1. Standardization: Maps 200+ raw US GAAP concept names to 64 standardized FundamentalConcept variants
  2. Flexibility: Supports four mapping patterns (one-to-one, hierarchical, synonyms, industry-specific) to handle diverse reporting practices
  3. Queryability: Hierarchical mappings enable queries at multiple granularity levels (e.g., query for all Assets or specifically CurrentAssets)
  4. Reliability: Comprehensive test coverage with 70+ test functions and 180+ assertions validates all mapping patterns
  5. Integration: Serves as the critical normalization layer between the SEC EDGAR API and downstream data processing/ML pipelines

The transformation system is the highest-importance component in the Rust codebase (importance score 8.37), enabling consistent financial data analysis across companies with varying reporting conventions.

Sources: tests/distill_us_gaap_fundamental_concepts_tests.rs:1-1275 examples/us_gaap_human_readable.rs:1-9