This documentation is part of the "Projects with Books" initiative at zenOSmosis.
The source code for this project is available on GitHub.
Utility Functions
Purpose and Scope
This document covers the utility functions and helper modules provided by the utils module in the Rust sec-fetcher application. These utilities provide cross-cutting functionality used throughout the codebase, including data structure transformations, runtime mode detection, and collection extensions.
For information about the main application architecture and data flow, see Main Application Flow. For details on data models and core structures, see Data Models & Enumerations.
Sources: src/utils.rs:1-12
Module Overview
The utils module is organized as a collection of focused sub-modules, each providing a specific utility function or set of related functions. The module uses Rust's re-export pattern to provide a clean public API.
| Sub-module | Primary Export | Purpose |
|---|---|---|
| invert_multivalue_indexmap | invert_multivalue_indexmap() | Inverts multi-value index mappings |
| is_development_mode | is_development_mode() | Runtime environment detection |
| is_interactive_mode | is_interactive_mode(), set_interactive_mode_override() | Interactive mode state management |
| vec_extensions | VecExtensions trait | Extension methods for Vec<T> |
The module structure follows a pattern where each utility is isolated in its own file, promoting maintainability and testability while the parent utils.rs file acts as a facade.
Sources: src/utils.rs:1-12
Utility Module Architecture
Sources: src/utils.rs:1-12
IndexMap Inversion Utility
Function Signature
The invert_multivalue_indexmap function transforms a mapping where keys point to multiple values into a reverse mapping where values point to all their associated keys.
Algorithm and Complexity
The function performs the following operations:
- Initialization : Creates a new `IndexMap` with capacity matching the input map
- Iteration : Traverses all key-value pairs in the original mapping
- Inversion : For each value in a key's vector, adds that key to the value's entry in the inverted map
- Preservation : Maintains insertion order via the `IndexMap` data structure
Time Complexity: O(N) on average, where N is the total number of key-value associations across all vectors (each hash insertion is O(1) on average)
Space Complexity: O(N) for storing the inverted mapping
Use Cases in the Codebase
This utility is primarily used in the US GAAP transformation pipeline where bidirectional concept mappings are required. The function enables:
- Synonym Resolution : Mapping from multiple US GAAP tags to a single `FundamentalConcept` enum variant
- Reverse Lookups : Given a `FundamentalConcept`, finding all original US GAAP tags that map to it
- Hierarchical Queries : Supporting both forward (tag → concept) and reverse (concept → tags) navigation
Example Usage Pattern
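The inversion steps described under Algorithm and Complexity can be sketched as follows. This is a minimal, dependency-free illustration rather than the project's implementation: it swaps `IndexMap` for a `Vec` of pairs plus a `HashMap` position index so that it compiles with the standard library alone, and the name `invert_multivalue_pairs` and the concept and tag strings are hypothetical.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Inverts an ordered multi-value mapping given as `(key, values)` pairs.
/// The result lists each value once, in order of first appearance,
/// together with every key that referenced it.
/// (Hypothetical stand-in for an IndexMap-based version.)
fn invert_multivalue_pairs<K, V>(map: &[(K, Vec<V>)]) -> Vec<(V, Vec<K>)>
where
    K: Clone,
    V: Eq + Hash + Clone,
{
    // Remembers where each value's entry lives in `inverted`.
    let mut index: HashMap<V, usize> = HashMap::new();
    let mut inverted: Vec<(V, Vec<K>)> = Vec::with_capacity(map.len());

    for (key, values) in map {
        for value in values {
            let existing = index.get(value).copied();
            match existing {
                // Value already seen: append this key to its entry.
                Some(pos) => inverted[pos].1.push(key.clone()),
                // First occurrence: create a new entry at the end.
                None => {
                    index.insert(value.clone(), inverted.len());
                    inverted.push((value.clone(), vec![key.clone()]));
                }
            }
        }
    }
    inverted
}

fn main() {
    // Illustrative concept -> tags mapping; not taken from the project.
    let concept_to_tags = vec![
        ("ConceptA", vec!["TagOne", "TagTwo"]),
        ("ConceptB", vec!["TagTwo", "TagThree"]),
    ];

    // Reverse direction: tag -> concepts.
    let tag_to_concepts = invert_multivalue_pairs(&concept_to_tags);
    for (tag, concepts) in &tag_to_concepts {
        println!("{tag} -> {concepts:?}");
    }
}
```

A tag that appears under several concepts ends up with several keys in its entry, which is what makes the reverse lookup multi-valued.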
Sources: src/utils/invert_multivalue_indexmap.rs:1-65
Development Mode Detection
The is_development_mode function provides runtime detection of development versus production environments. This utility enables conditional behavior based on the execution context.
Typical Implementation Pattern
Development mode detection typically checks for:
- Cargo Profile : Whether compiled with the `--release` flag
- Environment Variables : Presence of `RUST_DEV_MODE`, `DEBUG`, or similar markers
- Build Configuration : Compile-time flags like `#[cfg(debug_assertions)]`
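A check of this kind might look like the sketch below. It is an assumption about the general pattern, not the actual body of `is_development_mode`: the function names, the accepted override values, and the precedence of the `RUST_DEV_MODE` variable over the build profile are all illustrative choices.

```rust
use std::env;

/// Decides development mode from an optional override string and the
/// build profile. Kept separate from the environment read so the
/// decision logic can be tested without touching process state.
fn dev_mode_from(override_value: Option<&str>, debug_build: bool) -> bool {
    let normalized = override_value.map(|v| v.trim().to_ascii_lowercase());
    match normalized.as_deref() {
        Some("1") | Some("true") => true,
        Some("0") | Some("false") => false,
        // Unset or unrecognized value: fall back to the build profile.
        _ => debug_build,
    }
}

/// Hypothetical wrapper: reads the (assumed) RUST_DEV_MODE variable and
/// combines it with the compile-time debug_assertions flag.
fn is_development_mode_sketch() -> bool {
    let override_value = env::var("RUST_DEV_MODE").ok();
    dev_mode_from(override_value.as_deref(), cfg!(debug_assertions))
}

fn main() {
    println!("development mode: {}", is_development_mode_sketch());
}
```

Letting an explicit environment value win over the build profile makes it possible to exercise production behavior from a debug build, at the cost of one more setting to document.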
Usage in Configuration System
The function integrates with the configuration system to adjust behavior:
| Development Mode | Production Mode |
|---|---|
| Relaxed validation | Strict validation |
| Verbose logging enabled | Minimal logging |
| Mock data allowed | Real API calls required |
| Cache disabled or short TTL | Full caching enabled |
This utility is likely referenced in src/config/config_manager.rs to adjust validation rules and in src/main.rs to control application initialization.
Sources: src/utils.rs:4-5
Interactive Mode Management
The interactive mode utilities manage application state related to user interaction, controlling whether the application should prompt for user input or run in automated mode.
Function Signatures
State Management Pattern
These functions likely implement a static or thread-local state management pattern:
- Default Behavior : Detect if running in a TTY (terminal) environment
- Override Mechanism : Allow explicit setting via `set_interactive_mode_override()`
- Query Interface : Check current state via `is_interactive_mode()`
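The three-part pattern above can be sketched with a process-wide static. The real signatures are not shown in this document, so the `Option<bool>` parameter, the `_sketch` names, and the choice of a `Mutex` over a thread-local are assumptions made for illustration.

```rust
use std::io::{self, IsTerminal};
use std::sync::Mutex;

// Process-wide override. `None` means "auto-detect from the terminal".
static INTERACTIVE_OVERRIDE: Mutex<Option<bool>> = Mutex::new(None);

/// Hypothetical setter: `Some(true)` / `Some(false)` force a mode,
/// `None` restores auto-detection.
fn set_interactive_mode_override_sketch(value: Option<bool>) {
    *INTERACTIVE_OVERRIDE.lock().unwrap() = value;
}

/// Hypothetical query: an explicit override wins; otherwise the process
/// counts as interactive only if stdin and stdout are both terminals.
fn is_interactive_mode_sketch() -> bool {
    let override_value = *INTERACTIVE_OVERRIDE.lock().unwrap();
    override_value.unwrap_or_else(|| io::stdin().is_terminal() && io::stdout().is_terminal())
}

fn main() {
    println!("auto-detected: {}", is_interactive_mode_sketch());

    // Force non-interactive behavior, e.g. for a batch run.
    set_interactive_mode_override_sketch(Some(false));
    println!("with override: {}", is_interactive_mode_sketch());
}
```

An override of this shape is also convenient in tests, where the terminal state of the test runner would otherwise leak into the result.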
Use Cases
| Scenario | Interactive Mode | Non-Interactive Mode |
|---|---|---|
| Missing config | Prompt user for input | Exit with error |
| API rate limit | Pause and notify user | Automatic retry with backoff |
| Data validation failure | Ask user to continue | Fail fast and exit |
| Progress reporting | Show progress bars | Log to stdout/file |
Integration with Main Application Flow
The interactive mode flags are likely used in src/main.rs to control:
- Whether to display progress indicators during fund processing
- How to handle missing credentials (prompt vs. error)
- Whether to confirm before writing large CSV files
- User confirmation for destructive operations
Sources: src/utils.rs:7-8
Vector Extensions Trait
The VecExtensions trait provides extension methods for Rust's standard Vec<T> type, adding domain-specific functionality needed by the sec-fetcher application.
Trait Pattern
Extension traits in Rust follow this pattern:
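A minimal sketch of the extension-trait pattern is shown here. The trait name `VecExtensionsSketch` and its single method are hypothetical and chosen only to illustrate the mechanism; they are not the methods of the project's `VecExtensions` trait.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical extension trait: declares the extra method.
trait VecExtensionsSketch<T> {
    /// Removes duplicates while keeping the first occurrence of each item.
    fn dedup_preserving_order(self) -> Vec<T>;
}

/// Implementing the trait for `Vec<T>` makes the method callable on any
/// vector whose element type satisfies the bounds.
impl<T: Eq + Hash + Clone> VecExtensionsSketch<T> for Vec<T> {
    fn dedup_preserving_order(self) -> Vec<T> {
        let mut seen = HashSet::new();
        // `insert` returns true only the first time a value is added.
        self.into_iter()
            .filter(|item| seen.insert(item.clone()))
            .collect()
    }
}

fn main() {
    let tickers = vec!["AAPL", "MSFT", "AAPL", "GOOG", "MSFT"];
    println!("{:?}", tickers.dedup_preserving_order());
}
```

The method is available only where the trait is in scope, so callers outside the defining module need to import it.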
Likely Extension Methods
Based on common patterns in data processing applications and the context of US GAAP data transformation, the trait likely provides:
| Method Category | Potential Methods | Purpose |
|---|---|---|
| Chunking | chunk_by_size(), batch() | Process large datasets in manageable batches |
| Deduplication | unique(), deduplicate_by() | Remove duplicate entries from fetched data |
| Filtering | filter_not_empty(), compact() | Remove null or empty elements |
| Transformation | map_parallel(), flat_map_concurrent() | Parallel data transformation |
| Validation | all_valid(), partition_valid() | Separate valid from invalid records |
Usage in Data Pipeline
The extensions are likely used extensively in:
- Network Module : Batching API requests, deduplicating ticker symbols
- Transform Module : Parallel processing of US GAAP concepts
- Main Application : Chunking investment company lists for concurrent processing
Sources: src/utils.rs:10-11
Utility Function Relationships
Sources: src/utils.rs:1-12 src/utils/invert_multivalue_indexmap.rs:1-65
Design Principles
Modularity
Each utility is isolated in its own sub-module with a single, well-defined responsibility. This design:
- Reduces Coupling : Utilities can be tested independently
- Improves Reusability : Functions can be used across different modules without dependencies
- Simplifies Maintenance : Changes to one utility don't affect others
Generic Programming
The invert_multivalue_indexmap function demonstrates generic programming principles:
- Type Parameters : Works with any types implementing `Eq + Hash + Clone`
- Zero-Cost Abstractions : No runtime overhead compared to specialized versions
- Compile-Time Guarantees : Type safety ensured by the compiler
Order Preservation
The use of IndexMap instead of HashMap in the inversion utility preserves insertion order, which is critical for:
- Deterministic Output : Consistent CSV file generation across runs
- Reproducible Transformations : Same input always produces same output order
- Debugging : Predictable ordering aids in troubleshooting
Sources: src/utils/invert_multivalue_indexmap.rs:4-14
Performance Characteristics
IndexMap Inversion
| Operation | Complexity | Notes |
|---|---|---|
| Inversion | O(N) | N = total key-value associations |
| Lookup in result | O(1) average | Hash-based access |
| Insertion order | Preserved | Via IndexMap internal ordering |
| Memory overhead | O(N) | Stores inverted mapping |
Runtime Mode Detection
| Operation | Complexity | Caching Strategy |
|---|---|---|
| First check | O(1) - O(log n) | Environment lookup or compile-time constant |
| Subsequent checks | O(1) | Static or lazy_static cached value |
The mode detection functions likely use lazy initialization patterns to cache their results, avoiding repeated environment variable lookups or system calls.
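One way such caching could be done with the standard library is `std::sync::OnceLock`. The sketch below is an assumption about the pattern, not the project's code: the function names, the `RUST_DEV_MODE` variable, and the lookup counter (present only to make the single evaluation observable) are illustrative.

```rust
use std::env;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Holds the computed value after the first call.
static DEV_MODE: OnceLock<bool> = OnceLock::new();
// Counts how often the environment is actually consulted.
static LOOKUPS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical cached check: the closure runs at most once per process.
fn cached_development_mode() -> bool {
    *DEV_MODE.get_or_init(|| {
        LOOKUPS.fetch_add(1, Ordering::Relaxed);
        env::var_os("RUST_DEV_MODE").is_some() || cfg!(debug_assertions)
    })
}

/// Number of times the initializer has run.
fn lookup_count() -> usize {
    LOOKUPS.load(Ordering::Relaxed)
}

fn main() {
    let first = cached_development_mode();
    let second = cached_development_mode();
    println!("first={first} second={second} lookups={}", lookup_count());
}
```

A write-once cache like this suits values that are fixed for the life of the process; a setting that can be overridden after startup would need mutable state instead.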
Sources: src/utils/invert_multivalue_indexmap.rs:26-28
Integration Points
US GAAP Transformation Pipeline
The utilities integrate with the concept transformation system (see US GAAP Concept Transformation):
- Forward Mapping : `distill_us_gaap_fundamental_concepts` uses hard-coded mappings
- Reverse Mapping : `invert_multivalue_indexmap` generates the reverse index
- Lookup Optimization : Enables O(1) queries for "which concepts contain this tag?"
Configuration System
The development and interactive mode utilities integrate with configuration management (see Configuration System):
- Validation : `is_development_mode` relaxes validation in development
- Credential Handling : `is_interactive_mode` determines prompt vs. error
- Logging : Both functions control verbosity and output format
Main Application Loop
The utilities support the main processing flow (see Main Application Flow):
- Batch Processing : `VecExtensions` enables efficient chunking of investment companies
- User Feedback : `is_interactive_mode` controls progress display
- Error Handling : Mode detection influences retry behavior and error messages
Sources: src/utils.rs:1-12