This documentation is part of the "Projects with Books" initiative at zenOSmosis.
The source code for this project is available on GitHub.
Filing Retrieval & Rendering
Relevant source files
- examples/ipo_list.rs
- examples/ipo_show.rs
- src/network/filings/fetch_10k.rs
- src/network/filings/fetch_10q.rs
- src/network/filings/fetch_13f.rs
- src/network/filings/fetch_8k.rs
- src/network/filings/fetch_def14a.rs
- src/network/filings/fetch_filings.rs
- src/network/filings/fetch_form4.rs
- src/network/filings/fetch_nport.rs
- src/network/filings/fetch_s1.rs
- src/network/filings/fetch_s2.rs
- src/network/filings/fetch_s3.rs
- src/network/filings/fetch_schedule_13d.rs
- src/network/filings/fetch_schedule_13g.rs
- src/network/filings/filing_index.rs
- src/network/filings/mod.rs
- src/views/embedding_text_view.rs
- src/views/html_helpers.rs
- src/views/markdown_view.rs
Purpose and Scope
This document details the filings submodule within the network layer, responsible for retrieving and parsing form-specific SEC filings. It covers the implementation of specialized fetchers for major form types (10-K, 10-Q, 8-K, 13F, Form 4, N-PORT, etc.), the FilingIndex parser for exploring filing archives, IPO feed polling, and the views system for rendering document content into Markdown or embedding-optimized text.
Filing Retrieval Architecture
The filing retrieval system is built on top of the SecClient and CikSubmission models. It provides a tiered approach: generic filing retrieval, form-specific convenience functions, and deep-parsing functions that extract structured data (like XML-based holdings) from within a filing.
Diagram: Filing Retrieval Logic Flow
graph TD
subgraph "Code Entity Space: Filing Retrieval"
FetchFilings["fetch_filings()\nsrc/network/filings/fetch_filings.rs"]
SpecificFetch["fetch_10k_filings()\nfetch_8k_filings()\n..."]
DeepFetch["fetch_13f()\nfetch_form4()\nfetch_nport()"]
FilingIndex["fetch_filing_index()\nsrc/network/filings/filing_index.rs"]
end
subgraph "Natural Language Space: SEC Concepts"
Submissions["Company Submissions\n(submissions/CIK.json)"]
Archive["EDGAR Archive Directory\n(data/CIK/Accession/)"]
PrimaryDoc["Primary Document\n(10-K HTML, 13F XML)"]
Exhibits["Exhibits & Supporting Docs\n(EX-99.1, InfoTable.xml)"]
end
FetchFilings -->|Filters| Submissions
SpecificFetch -->|Wraps| FetchFilings
DeepFetch -->|Parses| PrimaryDoc
DeepFetch -->|Uses| FilingIndex
FilingIndex -->|Scrapes| Archive
Archive --> Exhibits
Sources: src/network/filings/mod.rs:1-29 src/network/filings/fetch_filings.rs:67-77 src/network/filings/filing_index.rs:108-114
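The first two tiers can be illustrated with a minimal sketch. It assumes the submission rows are already loaded in memory; `FilingRow`, `filter_filings`, and `filter_8k_filings` are hypothetical names invented for this example, not the crate's actual types or signatures (the real functions work through SecClient and fetch data over the network).

```rust
/// Hypothetical, simplified stand-in for one row of a company's submissions data.
#[derive(Debug, Clone, PartialEq)]
pub struct FilingRow {
    pub form: String,
    pub accession_number: String,
    /// ISO 8601 date (YYYY-MM-DD).
    pub filing_date: String,
}

/// Tier 1 (assumed shape): generic retrieval, reduced here to a filter
/// over rows that are already in memory.
pub fn filter_filings(rows: &[FilingRow], form: &str, limit: Option<usize>) -> Vec<FilingRow> {
    rows.iter()
        .filter(|row| row.form == form)
        // `None` means "no limit".
        .take(limit.unwrap_or(usize::MAX))
        .cloned()
        .collect()
}

/// Tier 2 (assumed shape): a form-specific convenience wrapper that only
/// fixes the form type and delegates to the generic filter.
pub fn filter_8k_filings(rows: &[FilingRow], limit: Option<usize>) -> Vec<FilingRow> {
    filter_filings(rows, "8-K", limit)
}
```

Keeping the wrapper this thin means that any form-specific behavior (historical variants, amendments) stays visible in one place instead of being spread across callers.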
Form-Specific Fetchers
The library provides dedicated functions for common SEC forms. These functions encapsulate form-specific logic, such as including historical variants (e.g., 10-K405) or handling amendments.
| Function | Form Type(s) | Key Features |
|---|---|---|
| fetch_10k_filings | 10-K, 10-K405 | Returns comprehensive annual reports; re-sorts mixed types newest-first. src/network/filings/fetch_10k.rs:59-77 |
| fetch_10q_filings | 10-Q | Returns quarterly reports (Q1-Q3). src/network/filings/fetch_10q.rs:43-52 |
| fetch_8k_filings | 8-K | Returns material event notifications. src/network/filings/fetch_8k.rs:54-63 |
| fetch_13f_filings | 13F-HR | Returns institutional holdings report metadata. src/network/filings/fetch_13f.rs:32-41 |
| fetch_form4_filings | 4, 4/A | Returns insider trading reports and amendments. src/network/filings/fetch_form4.rs:33-49 |
| fetch_def14a_filings | DEF 14A | Returns definitive proxy statements for shareholder meetings. src/network/filings/fetch_def14a.rs:57-66 |
| fetch_nport_filings | NPORT-P | Returns monthly portfolio holdings for registered funds. src/network/filings/fetch_nport.rs:35-44 |
Sources: src/network/filings/fetch_10k.rs:59-77 src/network/filings/fetch_13f.rs:32-41 src/network/filings/fetch_form4.rs:33-49
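The "re-sorts mixed types newest-first" behavior noted for fetch_10k_filings can be sketched as follows. This is an illustration under assumptions, not the crate's code: `AnnualReport` and `merge_newest_first` are hypothetical names, and the sketch assumes filing dates are ISO 8601 strings, so that lexicographic order matches chronological order.

```rust
/// Hypothetical, simplified annual-report record.
#[derive(Debug, Clone, PartialEq)]
pub struct AnnualReport {
    pub form: String,
    /// ISO 8601 date (YYYY-MM-DD); string order therefore equals date order.
    pub filing_date: String,
}

/// Merge two per-form result lists (for example 10-K and 10-K405) and
/// re-sort the combined list so the newest filing comes first.
pub fn merge_newest_first(
    mut current: Vec<AnnualReport>,
    historical: Vec<AnnualReport>,
) -> Vec<AnnualReport> {
    current.extend(historical);
    // Descending order: compare `b` against `a`.
    current.sort_by(|a, b| b.filing_date.cmp(&a.filing_date));
    current
}
```

The re-sort matters because each per-form list may already be ordered on its own, but simply concatenating them would place every 10-K405 after every 10-K regardless of date.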
Filing Index and Deep Parsing
While most filings are identified by a primary document, many (like 13F or 8-K) contain critical data in secondary files or require XML parsing of the primary document.
Filing Index Parser
The fetch_filing_index function scrapes the EDGAR -index.htm page to discover all files associated with an accession number. This is necessary because the submissions.json API only points to the “Primary Document,” which may not be the file containing the raw data (e.g., a 13F’s informationTable.xml).
- Implementation : Uses regex to parse the HTML table in the index page src/network/filings/filing_index.rs:9-12
- Normalization : Automatically strips iXBRL viewer prefixes (/ix?doc=) to find the actual file path src/network/filings/filing_index.rs:44-51
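The normalization step can be sketched in a few lines. The function name `strip_ix_viewer_prefix` is hypothetical, and the sketch assumes the viewer prefix appears at the very start of the link; the actual implementation may handle more variants.

```rust
/// Remove a leading iXBRL viewer prefix from an index link, if present.
/// Links without the prefix are returned unchanged.
pub fn strip_ix_viewer_prefix(href: &str) -> &str {
    href.strip_prefix("/ix?doc=").unwrap_or(href)
}
```

For a link such as `/ix?doc=/Archives/example.htm` (a made-up path), this yields `/Archives/example.htm`, which points at the underlying file rather than the viewer.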
Deep Data Extraction
Several functions go beyond metadata to return structured Rust models:
- fetch_13f : Uses the FilingIndex to find the “INFORMATION TABLE” XML, then parses it into ThirteenfHolding objects src/network/filings/fetch_13f.rs:77-107
- fetch_form4 : Strips XSLT prefixes from the primary document path to fetch raw XML and parses it into Form4Transaction objects src/network/filings/fetch_form4.rs:83-109
- fetch_nport : Fetches the primary XML and enriches it with ticker symbols from fetch_company_tickers src/network/filings/fetch_nport.rs:73-97
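Two of the lookups described above can be sketched without any network access. Everything here is an assumption made for illustration: `IndexEntry`, `find_information_table`, and `strip_xslt_prefix` are hypothetical names, the index is assumed to expose a document type and file name per row, and the XSLT-rendered path is assumed to be marked by a leading directory segment that starts with `xsl`.

```rust
/// Hypothetical shape of one row parsed from a filing index page.
#[derive(Debug, Clone, PartialEq)]
pub struct IndexEntry {
    pub doc_type: String,
    pub file_name: String,
}

/// Pick the XML file labeled as the information table (13F case).
pub fn find_information_table(entries: &[IndexEntry]) -> Option<&IndexEntry> {
    entries.iter().find(|entry| {
        entry.doc_type.eq_ignore_ascii_case("INFORMATION TABLE")
            && entry.file_name.to_ascii_lowercase().ends_with(".xml")
    })
}

/// Drop a leading `xsl...` directory segment so the path points at the
/// raw XML rather than the styled rendering (Form 4 case).
pub fn strip_xslt_prefix(primary_doc: &str) -> &str {
    match primary_doc.split_once('/') {
        Some((dir, rest)) if dir.starts_with("xsl") => rest,
        _ => primary_doc,
    }
}
```

Both helpers return borrowed data and leave unexpected input untouched, so a caller can fall back to the primary document when no match is found.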
Sources: src/network/filings/filing_index.rs:23-76 src/network/filings/fetch_13f.rs:81-93 src/network/filings/fetch_form4.rs:91-102
IPO Feed Polling
The system includes specialized logic for tracking Initial Public Offerings (IPOs) via the EDGAR Atom feed.
Diagram: IPO Feed and Registration Lifecycle
graph LR
subgraph "Code Entity Space: IPO Tracking"
GetIPOFeed["get_ipo_feed_entries()\nsrc/ops/ipo_ops.rs"]
GetIPOReg["get_ipo_registration_filings()\nsrc/ops/ipo_ops.rs"]
FormType["FormType::IPO_REGISTRATION_FORM_TYPES\nsrc/enums/form_type.rs"]
end
subgraph "Natural Language Space: IPO Lifecycle"
S1["S-1 / F-1\n(Initial Registration)"]
S1A["S-1/A / F-1/A\n(Amendments)"]
B4["424B4\n(Final Pricing)"]
AtomFeed["EDGAR Live Feed\n(Polling)"]
end
GetIPOFeed -->|Polls| AtomFeed
AtomFeed -->|Filters for| FormType
GetIPOReg -->|Aggregates| S1
GetIPOReg -->|Aggregates| S1A
GetIPOReg -->|Aggregates| B4
- get_ipo_feed_entries : Polls the EDGAR Atom feed (the fastest source for new filings) and filters for S-1, F-1, and 424B4 forms examples/ipo_list.rs:43-51
- get_ipo_registration_filings : Retrieves the full timeline of a company’s IPO process, from initial S-1 through all amendments to the final prospectus examples/ipo_show.rs:48-58
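The filtering half of the feed poll can be sketched as a plain in-memory filter. The names `FeedEntry`, `IPO_FORMS`, and `filter_ipo_entries` are hypothetical, and the form list below is an assumption assembled from the forms named on this page; the actual contents of FormType::IPO_REGISTRATION_FORM_TYPES may differ.

```rust
/// Assumed set of IPO-related form types, taken from the forms named on
/// this page. The crate's real constant may contain a different set.
pub const IPO_FORMS: [&str; 5] = ["S-1", "S-1/A", "F-1", "F-1/A", "424B4"];

/// Hypothetical, simplified feed entry.
#[derive(Debug, Clone, PartialEq)]
pub struct FeedEntry {
    pub company: String,
    pub form: String,
}

/// Keep only the entries whose form type is IPO-related.
pub fn filter_ipo_entries(entries: Vec<FeedEntry>) -> Vec<FeedEntry> {
    entries
        .into_iter()
        .filter(|entry| IPO_FORMS.contains(&entry.form.as_str()))
        .collect()
}
```

Matching on the exact form string keeps unrelated forms that merely share a prefix (for example S-3 or S-8) out of the result.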
Sources: examples/ipo_list.rs:1-17 examples/ipo_show.rs:21-33 examples/ipo_list.rs:88-108
Document Rendering (Views)
The views system provides traits and implementations for converting SEC HTML/XBRL documents into readable text formats.
The FilingView Trait
The core abstraction for rendering. It defines how to format headers, sections, and tables.
Implementations
- MarkdownView :
  - Goal : Lossless representation.
  - Tables : Preserved as GitHub-Flavored Markdown (GFM) pipe tables examples/ipo_show.rs:94-95
  - Usage : Standard reading and documentation.
- EmbeddingTextView :
  - Goal : Optimization for Large Language Model (LLM) embeddings.
  - Tables : Flattened into labeled sentences to preserve semantic context (e.g., “The value of Assets for 2023 was 100M”) examples/ipo_show.rs:96-97
  - Prose : Cleaned of excessive whitespace and HTML artifacts.
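The difference between the two table strategies is easiest to see side by side. The sketch below is not the crate's FilingView trait: `Table`, `TableView`, `PipeTableView`, and `LabeledSentenceView` are hypothetical names, and the sketch assumes the first column of each row holds the row label.

```rust
/// Hypothetical, minimal table model: a header row plus data rows.
pub struct Table {
    pub headers: Vec<String>,
    pub rows: Vec<Vec<String>>,
}

/// Hypothetical reduction of a view trait to table rendering only.
pub trait TableView {
    fn render_table(&self, table: &Table) -> String;
}

/// Renders a GFM-style pipe table (the Markdown strategy).
pub struct PipeTableView;

/// Flattens each cell into a labeled sentence (the embedding strategy).
pub struct LabeledSentenceView;

impl TableView for PipeTableView {
    fn render_table(&self, table: &Table) -> String {
        let mut out = format!("| {} |\n", table.headers.join(" | "));
        let divider = vec!["---"; table.headers.len()].join(" | ");
        out.push_str(&format!("| {} |\n", divider));
        for row in &table.rows {
            out.push_str(&format!("| {} |\n", row.join(" | ")));
        }
        out
    }
}

impl TableView for LabeledSentenceView {
    fn render_table(&self, table: &Table) -> String {
        let mut sentences = Vec::new();
        for row in &table.rows {
            // Assumption: the first cell is the row label, the rest are values.
            if let Some((label, values)) = row.split_first() {
                // Skip the first header so each value pairs with its own column.
                for (header, value) in table.headers.iter().skip(1).zip(values) {
                    sentences.push(format!("The value of {} for {} was {}.", label, header, value));
                }
            }
        }
        sentences.join(" ")
    }
}
```

Repeating the row label and column header in every sentence is redundant for a human reader, but it keeps each value meaningful when the text is later split into chunks for embedding.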
Rendering Pipeline
The render_filing operation examples/ipo_show.rs:48 orchestrates the process:
- Fetch the document content.
- Clean HTML using html_helpers.
- Apply the selected FilingView implementation.
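The last two steps of this pipeline can be sketched with the fetch step left out. All names here (`collapse_whitespace`, `DocumentView`, `PlainView`, `render_document`) are hypothetical, and whitespace collapsing stands in for the cleaning done by html_helpers, whose actual behavior is not shown on this page.

```rust
/// Stand-in for the cleaning step: collapse every run of whitespace
/// (including newlines) into a single space and trim both ends.
pub fn collapse_whitespace(text: &str) -> String {
    text.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Hypothetical view abstraction reduced to a single method.
pub trait DocumentView {
    fn render(&self, cleaned: &str) -> String;
}

/// Trivial view that returns the cleaned text unchanged.
pub struct PlainView;

impl DocumentView for PlainView {
    fn render(&self, cleaned: &str) -> String {
        cleaned.to_string()
    }
}

/// Clean the already-fetched content, then hand it to the selected view.
pub fn render_document(raw: &str, view: &dyn DocumentView) -> String {
    let cleaned = collapse_whitespace(raw);
    view.render(&cleaned)
}
```

Passing the view as a trait object lets the caller choose the output format at runtime without changing the cleaning step.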
Sources: examples/ipo_show.rs:92-108 src/views/markdown_view.rs:1-10 src/views/embedding_text_view.rs:1-10