Performance

whichtime is designed for high-performance parsing. This guide explains the optimizations and how to get the best performance.

Performance Characteristics

whichtime achieves high performance through three main optimizations:

1. Aho-Corasick Scanner

Before running individual parsers, whichtime pre-scans the input text using an Aho-Corasick automaton:

Single pass over the input text
SIMD-optimized via the aho-corasick crate
Identifies all potential date-related tokens at once
Enables fast-path filtering with should_apply()

This lets each parser decide cheaply whether it should attempt to parse the text at all.
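The gating idea can be sketched in a few lines. This is a simplified illustration, not whichtime's implementation: it uses a naive per-keyword substring search as a stand-in for the Aho-Corasick automaton, and the names `Token`, `KEYWORDS`, `scan`, and the free function `should_apply` are hypothetical.

```rust
/// Hypothetical token categories a pre-scan might report.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Token {
    Month,
    Weekday,
    Relative,
}

/// Illustrative keyword table; a real dictionary would be much larger.
const KEYWORDS: &[(&str, Token)] = &[
    ("january", Token::Month),
    ("monday", Token::Weekday),
    ("tomorrow", Token::Relative),
];

/// Stand-in for the pre-scan: returns a bitmask of the token kinds found.
/// This naive version searches once per keyword; an Aho-Corasick automaton
/// would find every keyword in a single pass over the text.
pub fn scan(text: &str) -> u8 {
    let lower = text.to_lowercase();
    let mut mask = 0u8;
    for (word, token) in KEYWORDS {
        if lower.contains(*word) {
            mask |= 1 << (*token as u8);
        }
    }
    mask
}

/// A parser that needs a given token kind is skipped when the bit is unset.
pub fn should_apply(mask: u8, needs: Token) -> bool {
    mask & (1 << (needs as u8)) != 0
}
```

With this shape, the scan runs once per input and each parser's check is a single bit test, so parsers that cannot match cost almost nothing.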

2. PHF Dictionaries

Keywords (months, weekdays, time units, etc.) are stored in compile-time perfect hash functions:

O(1) guaranteed lookup time
No runtime table construction - the hash function and table are generated at compile time
Zero heap allocations for lookups
Maps stored in read-only .rodata section

rust

// Requires the phf crate with its "macros" feature enabled
use phf::phf_map;

// Example: month name lookup is O(1)
pub static MONTH_MAP: phf::Map<&'static str, u32> = phf_map! {
    "january" => 1, "jan" => 1,
    "february" => 2, "feb" => 2,
    // ...
};

3. FastComponents

Date/time components use an optimized storage format:

Fixed-size array [i32; 10] instead of HashMap
Bitflags for tracking known vs. implied components
Copy semantics - no heap allocation, no cloning cost
Fits in ~44 bytes, within a single 64-byte cache line

rust

#[derive(Clone, Copy, Default)]
pub struct FastComponents {
    values: [i32; 10],      // Component values
    known: ComponentFlags,   // Known (certain) components
    implied: ComponentFlags, // Implied components
}
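To make the layout concrete, here is a minimal std-only sketch of the same idea. It is not the library's code: `Component`, `Flags`, `Components`, and the method names are hypothetical, and representing each flag set as a `u16` bitmask is an assumption. Under that assumption the struct is 40 bytes of values plus two 2-byte masks, which is consistent with the ~44 bytes figure above.

```rust
use std::mem::size_of;

/// Hypothetical component slots; the real set and order may differ.
#[derive(Clone, Copy)]
pub enum Component {
    Year = 0,
    Month = 1,
    Day = 2,
    Hour = 3,
    Minute = 4,
}

/// Assumed stand-in for `ComponentFlags`: one bit per slot.
#[derive(Clone, Copy, Default, PartialEq, Eq, Debug)]
pub struct Flags(u16);

#[derive(Clone, Copy, Default)]
pub struct Components {
    values: [i32; 10],
    known: Flags,
    implied: Flags,
}

impl Components {
    /// Records a value the text stated explicitly.
    pub fn assign(&mut self, c: Component, v: i32) {
        let bit = 1u16 << (c as u16);
        self.values[c as usize] = v;
        self.known.0 |= bit;
        self.implied.0 &= !bit;
    }

    /// Records a guessed value unless a certain one already exists.
    pub fn imply(&mut self, c: Component, v: i32) {
        let bit = 1u16 << (c as u16);
        if self.known.0 & bit == 0 {
            self.values[c as usize] = v;
            self.implied.0 |= bit;
        }
    }

    /// Returns the value if it is either known or implied.
    pub fn get(&self, c: Component) -> Option<i32> {
        let bit = 1u16 << (c as u16);
        if (self.known.0 | self.implied.0) & bit != 0 {
            Some(self.values[c as usize])
        } else {
            None
        }
    }

    /// True only for values that were stated explicitly.
    pub fn is_certain(&self, c: Component) -> bool {
        self.known.0 & (1u16 << (c as u16)) != 0
    }
}

/// Size of the sketch struct in bytes.
pub fn layout_size() -> usize {
    size_of::<Components>()
}
```

Because every field is `Copy`, passing a `Components` value around is a plain memory copy with no allocator involvement, which is the property the list above describes.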

Benchmarking

Run the included benchmarks:

bash

cargo bench -p whichtime-sys

This uses Criterion and produces HTML reports in target/criterion/.

Benchmark Categories

Simple inputs - Single expressions like "tomorrow"
Long text - Paragraphs with embedded dates
Pathological - Edge cases and complex expressions
Locales - Per-locale parsing performance

Performance Tips

1. Reuse Parser Instances

Creating a parser initializes regex patterns and other state, so create one parser and reuse it:

rust

// Good: Create once
let parser = WhichTime::new();
for text in inputs {
    parser.parse_date(text, None)?;
}

// Bad: Create each time
for text in inputs {
    let parser = WhichTime::new(); // Wasteful
    parser.parse_date(text, None)?;
}

2. Use the Right Method

If you only need the first date, use parse_date() instead of parse():

rust

// More efficient if you only need the first date
let date = parser.parse_date("text with date", None)?;

// Less efficient - parses all dates, then you take the first
let results = parser.parse("text with date", None)?;
let date = results.first();

3. Consider Text Length

Parsing time scales roughly linearly with input length. For extremely long documents, consider:

Chunking the text into smaller pieces
Pre-filtering to identify relevant sections
Using the scanner to check if dates exist before full parsing
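The first two ideas can be combined in application code before the parser is involved at all. The sketch below is one possible approach under stated assumptions: `HINTS`, `may_contain_date`, and `candidate_chunks` are hypothetical helpers, not part of whichtime, and the hint list is deliberately tiny. It splits a document on blank lines and keeps only paragraphs that contain a digit or a known keyword.

```rust
/// Hypothetical hint list; a real pre-filter would cover far more keywords.
const HINTS: &[&str] = &["today", "tomorrow", "yesterday", "monday", "january"];

/// Cheap check: does the chunk contain an ASCII digit or a known keyword?
pub fn may_contain_date(chunk: &str) -> bool {
    if chunk.bytes().any(|b| b.is_ascii_digit()) {
        return true;
    }
    let lower = chunk.to_lowercase();
    HINTS.iter().any(|h| lower.contains(*h))
}

/// Splits on blank lines and keeps only chunks worth handing to the parser.
pub fn candidate_chunks(doc: &str) -> Vec<&str> {
    doc.split("\n\n")
        .map(str::trim)
        .filter(|c| !c.is_empty() && may_contain_date(c))
        .collect()
}
```

Each surviving chunk can then be passed to `parse()` or `parse_date()`. A filter like this trades recall for speed: any date phrasing missing from the hint list is silently skipped, so it only makes sense when the expected expressions are well known.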

4. Batch Processing

When processing many texts, run them all through a single parser instance so the setup cost is paid only once:

rust

let parser = WhichTime::new();
let results: Vec<_> = inputs
    .iter()
    .map(|text| parser.parse_date(text, None))
    .collect();

Memory Usage

whichtime is designed for minimal memory allocation:

Parser state - Fixed size, allocated once
Results - Allocated per parse, proportional to matches found
Components - Stack-allocated (Copy type)
Dictionaries - Static, read-only memory

For typical inputs, memory usage is minimal. Very long texts with many matches will use more memory for results.

Comparison with Alternatives

While we don't publish formal benchmarks against other libraries, whichtime is designed to be competitive with or faster than JavaScript-based solutions, especially for:

Batch processing of many inputs
Server-side applications
Mobile applications where startup time matters

Profiling

To profile whichtime, start from the included benchmarks:

bash

# Using flamegraph (requires cargo-flamegraph)
cargo flamegraph --bench comparison -p whichtime-sys

# Using perf (Linux)
perf record --call-graph dwarf cargo bench -p whichtime-sys
perf report

Reporting Performance Issues

If you encounter performance problems:

Identify the specific input causing issues
Run benchmarks to quantify the problem
Open an issue with reproduction steps and benchmark results
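For a quick measurement to attach to an issue, a small std-only timing helper is often enough before reaching for Criterion. The following is a rough sketch, not a replacement for the benchmark suite: `median_time` is a hypothetical helper, and the closure stands in for whatever parser call is being measured.

```rust
use std::time::{Duration, Instant};

/// Runs `f` on `input` `iters` times and returns the median wall-clock time.
/// The median is less sensitive to one-off stalls than the mean.
pub fn median_time<F, R>(input: &str, iters: usize, mut f: F) -> Duration
where
    F: FnMut(&str) -> R,
{
    assert!(iters > 0, "need at least one iteration");
    let mut samples: Vec<Duration> = (0..iters)
        .map(|_| {
            let start = Instant::now();
            let out = f(input);
            let elapsed = start.elapsed();
            // Keep the result alive so the call is not optimized away.
            std::hint::black_box(out);
            elapsed
        })
        .collect();
    samples.sort();
    samples[samples.len() / 2]
}
```

A call such as `median_time(slow_input, 100, |s| parser.parse(s, None))`, built in release mode, gives a number that can be compared against a similar but fast input in the report.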
