This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

Utility Functions

Purpose and Scope

This document covers the utility functions and helper modules in the utils module of the Rust sec-fetcher application. They supply cross-cutting functionality used throughout the codebase: data structure transformations, runtime mode detection, and collection extensions.

For information about the main application architecture and data flow, see Main Application Flow. For details on data models and core structures, see Data Models & Enumerations.

Sources: src/utils.rs:1-12


Module Overview

The utils module is organized as a collection of focused sub-modules, each providing a specific utility function or set of related functions. The module uses Rust's re-export pattern to provide a clean public API.

| Sub-module | Primary Export | Purpose |
|---|---|---|
| invert_multivalue_indexmap | invert_multivalue_indexmap() | Inverts multi-value index mappings |
| is_development_mode | is_development_mode() | Runtime environment detection |
| is_interactive_mode | is_interactive_mode(), set_interactive_mode_override() | Interactive mode state management |
| vec_extensions | VecExtensions trait | Extension methods for Vec<T> |

The module structure follows a pattern where each utility is isolated in its own file, promoting maintainability and testability while the parent utils.rs file acts as a facade.
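
The facade layout can be sketched in a single file. This is a minimal illustration and not the project's actual utils.rs: the sub-modules are written inline (the real ones live in separate files), only two of the four are shown, and both function bodies are placeholder assumptions.

```rust
// Sketch of a facade module that re-exports one item per sub-module.
// Inline modules stand in for the separate files under src/utils/.
pub mod utils {
    mod is_development_mode {
        // Placeholder body (assumption): treat debug builds as development.
        pub fn is_development_mode() -> bool {
            cfg!(debug_assertions)
        }
    }

    mod is_interactive_mode {
        // Placeholder body (assumption): always report non-interactive.
        pub fn is_interactive_mode() -> bool {
            false
        }
    }

    // Re-exports: callers write `utils::is_development_mode()` instead of
    // `utils::is_development_mode::is_development_mode()`.
    pub use self::is_development_mode::is_development_mode;
    pub use self::is_interactive_mode::is_interactive_mode;
}
```

Because the sub-modules themselves stay private, the re-exported names are the only public surface, which is what lets the parent file act as a facade.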

Sources: src/utils.rs:1-12


Utility Module Architecture

Sources: src/utils.rs:1-12


IndexMap Inversion Utility

Function Signature

The invert_multivalue_indexmap function transforms a mapping where keys point to multiple values into a reverse mapping where values point to all their associated keys.

Algorithm and Complexity

The function performs the following operations:

  1. Initialization: Creates a new IndexMap with capacity matching the input map
  2. Iteration: Traverses all key-value pairs in the original mapping
  3. Inversion: For each value in a key's vector, adds that key to the value's entry in the inverted map
  4. Preservation: Maintains insertion order via the IndexMap data structure

Time Complexity: O(N) on average, where N is the total number of key-value associations across all vectors (each association costs one hash lookup or insertion)

Space Complexity: O(N) for storing the inverted mapping
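
The four steps can be sketched with the standard library alone. This is a stand-in, not the project's function: a Vec of pairs plus a HashMap index plays the role that IndexMap plays in the real utility, and the name invert_multivalue_pairs and its signature are hypothetical.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Inverts `key -> [values]` into `value -> [keys]`, keeping first-seen order.
/// Std-only stand-in for an IndexMap-based inversion (hypothetical name).
pub fn invert_multivalue_pairs<K, V>(map: &[(K, Vec<V>)]) -> Vec<(V, Vec<K>)>
where
    K: Clone,
    V: Eq + Hash + Clone,
{
    // 1. Initialization: reserve room for at least one entry per input key.
    let mut inverted: Vec<(V, Vec<K>)> = Vec::with_capacity(map.len());
    let mut position: HashMap<V, usize> = HashMap::with_capacity(map.len());

    // 2. Iteration: visit every key together with its vector of values.
    for (key, values) in map {
        for value in values {
            // 3. Inversion: append the key to this value's entry, creating
            //    the entry the first time the value is seen.
            match position.get(value).copied() {
                Some(i) => inverted[i].1.push(key.clone()),
                None => {
                    position.insert(value.clone(), inverted.len());
                    inverted.push((value.clone(), vec![key.clone()]));
                }
            }
        }
    }

    // 4. Preservation: `inverted` lists values in first-seen order.
    inverted
}
```

Each association triggers one hash lookup and at most one insertion, which matches the O(N) average cost stated above.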

Use Cases in the Codebase

This utility is primarily used in the US GAAP transformation pipeline where bidirectional concept mappings are required. The function enables:

  • Synonym Resolution: Mapping from multiple US GAAP tags to a single FundamentalConcept enum variant
  • Reverse Lookups: Given a FundamentalConcept, finding all original US GAAP tags that map to it
  • Hierarchical Queries: Supporting both forward (tag → concept) and reverse (concept → tags) navigation

Example Usage Pattern
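
A hedged illustration of how the two lookup directions relate is given here. The enum is a two-variant stand-in for the real FundamentalConcept, the tag lists are sample data, and a plain HashMap is used in place of IndexMap, so ordering is not preserved in this sketch.

```rust
use std::collections::HashMap;

// Illustrative subset only; the real FundamentalConcept enum is larger.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum FundamentalConcept {
    Assets,
    Revenues,
}

/// Concept -> US GAAP tags treated as synonyms (sample data, assumption).
pub fn concept_to_tags() -> Vec<(FundamentalConcept, Vec<&'static str>)> {
    vec![
        (FundamentalConcept::Assets, vec!["Assets"]),
        (FundamentalConcept::Revenues, vec!["Revenues", "SalesRevenueNet"]),
    ]
}

/// Tag -> concepts, built by walking the table above exactly once.
pub fn tag_to_concepts() -> HashMap<&'static str, Vec<FundamentalConcept>> {
    let mut reverse: HashMap<&'static str, Vec<FundamentalConcept>> = HashMap::new();
    for (concept, tags) in concept_to_tags() {
        for tag in tags {
            // Create the entry on first sight, then record the concept.
            reverse.entry(tag).or_default().push(concept);
        }
    }
    reverse
}
```

With both tables available, a tag such as "SalesRevenueNet" resolves to its concept in one lookup, and a concept's synonym list is read directly from the first table.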

Sources: src/utils/invert_multivalue_indexmap.rs:1-65


Development Mode Detection

The is_development_mode function provides runtime detection of development versus production environments. This utility enables conditional behavior based on the execution context.

Typical Implementation Pattern

Development mode detection typically checks for:

  • Cargo Profile: Whether the binary was compiled with the --release flag
  • Environment Variables: Presence of RUST_DEV_MODE, DEBUG, or similar markers
  • Build Configuration: Compile-time flags such as #[cfg(debug_assertions)]
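
One way these checks could be combined is sketched here. It is not the project's implementation: the variable name RUST_DEV_MODE is taken from the list above as an example, and the accepted values and the precedence of the environment over the build profile are assumptions.

```rust
use std::env;

/// Pure decision function, kept separate so it can be tested without
/// touching the process environment.
pub fn dev_mode_from(env_value: Option<&str>, debug_build: bool) -> bool {
    match env_value {
        // An explicit setting wins over the build profile (assumption).
        Some(v) => matches!(
            v.trim().to_ascii_lowercase().as_str(),
            "1" | "true" | "yes"
        ),
        // No setting: fall back to the build profile.
        None => debug_build,
    }
}

/// Hypothetical wrapper; the variable name is an assumption.
pub fn is_development_mode_sketch() -> bool {
    let value = env::var("RUST_DEV_MODE").ok();
    dev_mode_from(value.as_deref(), cfg!(debug_assertions))
}
```

Here cfg!(debug_assertions) is resolved at compile time, so it is true for a default `cargo build` and false for `cargo build --release` unless the profile overrides it.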

Usage in Configuration System

The function integrates with the configuration system to adjust behavior:

| Development Mode | Production Mode |
|---|---|
| Relaxed validation | Strict validation |
| Verbose logging enabled | Minimal logging |
| Mock data allowed | Real API calls required |
| Cache disabled or short TTL | Full caching enabled |

This utility is likely referenced in src/config/config_manager.rs to adjust validation rules and in src/main.rs to control application initialization.

Sources: src/utils.rs:4-5


Interactive Mode Management

The interactive mode utilities manage application state related to user interaction, controlling whether the application should prompt for user input or run in automated mode.

Function Signatures

State Management Pattern

These functions likely implement a static or thread-local state management pattern:

  1. Default Behavior: Detect if running in a TTY (terminal) environment
  2. Override Mechanism: Allow explicit setting via set_interactive_mode_override()
  3. Query Interface: Check current state via is_interactive_mode()
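
A minimal sketch of such a pattern, assuming a process-wide override stored in a static Mutex and TTY detection through the standard library's IsTerminal trait, is given here. The signatures are guesses; the actual functions may take different arguments or use different storage.

```rust
use std::io::{self, IsTerminal};
use std::sync::Mutex;

// Process-wide override: None means "auto-detect from the terminal".
static INTERACTIVE_OVERRIDE: Mutex<Option<bool>> = Mutex::new(None);

/// Assumed signature: Some(flag) forces a mode, None restores detection.
pub fn set_interactive_mode_override(value: Option<bool>) {
    *INTERACTIVE_OVERRIDE.lock().unwrap() = value;
}

/// Returns the forced value if one is set, otherwise detects a terminal.
pub fn is_interactive_mode() -> bool {
    let forced = *INTERACTIVE_OVERRIDE.lock().unwrap();
    // Fall back to TTY detection on stdin and stdout when nothing is forced.
    forced.unwrap_or_else(|| io::stdin().is_terminal() && io::stdout().is_terminal())
}
```

Forcing Some(false) is useful in tests and scheduled jobs, where terminal detection would otherwise depend on how the process was launched.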

Use Cases

| Scenario | Interactive Mode | Non-Interactive Mode |
|---|---|---|
| Missing config | Prompt user for input | Exit with error |
| API rate limit | Pause and notify user | Automatic retry with backoff |
| Data validation failure | Ask user to continue | Fail fast and exit |
| Progress reporting | Show progress bars | Log to stdout/file |

Integration with Main Application Flow

The interactive mode flags are likely used in src/main.rs to control:

  • Whether to display progress indicators during fund processing
  • How to handle missing credentials (prompt vs. error)
  • Whether to confirm before writing large CSV files
  • Whether to require user confirmation for destructive operations

Sources: src/utils.rs:7-8


Vector Extensions Trait

The VecExtensions trait provides extension methods for Rust's standard Vec<T> type, adding domain-specific functionality needed by the sec-fetcher application.

Trait Pattern

Extension traits in Rust follow this pattern:
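
A minimal sketch of the extension-trait pattern is given here. The method unique_preserving_order is hypothetical and chosen only for illustration; it is not confirmed to exist on the project's VecExtensions trait.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Extension trait: declares extra methods for a type defined elsewhere.
pub trait VecExtensions<T> {
    /// Hypothetical method: drops repeated items, keeping first occurrences.
    fn unique_preserving_order(self) -> Vec<T>;
}

/// Implementing the trait for Vec<T> makes the method callable on any Vec
/// whose element type satisfies the bounds, once the trait is in scope.
impl<T: Eq + Hash + Clone> VecExtensions<T> for Vec<T> {
    fn unique_preserving_order(self) -> Vec<T> {
        let mut seen = HashSet::new();
        self.into_iter()
            // `insert` returns false for items already seen, filtering them out.
            .filter(|item| seen.insert(item.clone()))
            .collect()
    }
}
```

A caller would import the trait and then write, for example, `tickers.unique_preserving_order()` on a Vec of ticker symbols.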

Likely Extension Methods

Based on common patterns in data processing applications and the context of US GAAP data transformation, the trait likely provides:

| Method Category | Potential Methods | Purpose |
|---|---|---|
| Chunking | chunk_by_size(), batch() | Process large datasets in manageable batches |
| Deduplication | unique(), deduplicate_by() | Remove duplicate entries from fetched data |
| Filtering | filter_not_empty(), compact() | Remove null or empty elements |
| Transformation | map_parallel(), flat_map_concurrent() | Parallel data transformation |
| Validation | all_valid(), partition_valid() | Separate valid from invalid records |

Usage in Data Pipeline

The extensions are likely used extensively in:

  • Network Module: Batching API requests, deduplicating ticker symbols
  • Transform Module: Parallel processing of US GAAP concepts
  • Main Application: Chunking investment company lists for concurrent processing

Sources: src/utils.rs:10-11


Utility Function Relationships

Sources: src/utils.rs:1-12 src/utils/invert_multivalue_indexmap.rs:1-65


Design Principles

Modularity

Each utility is isolated in its own sub-module with a single, well-defined responsibility. This design:

  • Reduces Coupling: Utilities can be tested independently
  • Improves Reusability: Functions can be used across different modules without dependencies
  • Simplifies Maintenance: Changes to one utility don't affect others

Generic Programming

The invert_multivalue_indexmap function demonstrates generic programming principles:

  • Type Parameters: Works with any types implementing Eq + Hash + Clone
  • Zero-Cost Abstractions: Generic code is monomorphized, so there is no runtime overhead compared to hand-specialized versions
  • Compile-Time Guarantees: Type safety is enforced by the compiler

Order Preservation

The use of IndexMap instead of HashMap in the inversion utility preserves insertion order, which is critical for:

  • Deterministic Output: Consistent CSV file generation across runs
  • Reproducible Transformations: Same input always produces same output order
  • Debugging: Predictable ordering aids in troubleshooting

Sources: src/utils/invert_multivalue_indexmap.rs:4-14


Performance Characteristics

IndexMap Inversion

| Operation | Complexity | Notes |
|---|---|---|
| Inversion | O(N) | N = total key-value associations |
| Lookup in result | O(1) average | Hash-based access |
| Insertion order | Preserved | Via IndexMap internal ordering |
| Memory overhead | O(N) | Stores inverted mapping |

Runtime Mode Detection

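| Operation | Complexity | Caching Strategy |
|---|---|---|
| First check | O(1) for a compile-time constant; an environment lookup costs more (typically a scan of the process environment) | Environment lookup or compile-time constant |
| Subsequent checks | O(1) | Static or lazy_static cached value |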

The mode detection functions likely use lazy initialization patterns to cache their results, avoiding repeated environment variable lookups or system calls.
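
Lazy caching of this kind can be sketched with the standard library's OnceLock, used here in place of the lazy_static crate mentioned in the table. The function name, the RUST_DEV_MODE variable, and the rule that its mere presence enables development mode are all assumptions; the counter exists only to make the single lookup observable.

```rust
use std::env;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Cached result of the first detection.
static DEV_MODE: OnceLock<bool> = OnceLock::new();

// Counts how often the environment is actually consulted (illustration only).
pub static ENV_LOOKUPS: AtomicUsize = AtomicUsize::new(0);

/// Reads the environment on the first call; later calls return the cache.
pub fn cached_dev_mode() -> bool {
    *DEV_MODE.get_or_init(|| {
        ENV_LOOKUPS.fetch_add(1, Ordering::Relaxed);
        // Assumed rule: the variable's presence alone enables dev mode,
        // otherwise the build profile decides.
        env::var_os("RUST_DEV_MODE").is_some() || cfg!(debug_assertions)
    })
}
```

One consequence of this design is that changing the environment variable after the first call has no effect, which is usually acceptable for a mode fixed at startup.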

Sources: src/utils/invert_multivalue_indexmap.rs:26-28


Integration Points

US GAAP Transformation Pipeline

The utilities integrate with the concept transformation system (see US GAAP Concept Transformation):

  1. Forward Mapping: distill_us_gaap_fundamental_concepts uses hard-coded mappings
  2. Reverse Mapping: invert_multivalue_indexmap generates the reverse index
  3. Lookup Optimization: Enables average O(1) queries for "which concepts contain this tag?"

Configuration System

The development and interactive mode utilities integrate with configuration management (see Configuration System):

  1. Validation: is_development_mode relaxes validation in development
  2. Credential Handling: is_interactive_mode determines prompt vs. error
  3. Logging: Both functions control verbosity and output format

Main Application Loop

The utilities support the main processing flow (see Main Application Flow):

  1. Batch Processing: VecExtensions enables efficient chunking of investment companies
  2. User Feedback: is_interactive_mode controls progress display
  3. Error Handling: Mode detection influences retry behavior and error messages

Sources: src/utils.rs:1-12