burntsushi

Yeah this is a tough one. I'm not sure the right thing to do is for me to go around blasting PRs at those projects. They're probably already carrying support for both chrono and time, and asking them to support a third that is brand new is a bit of a stretch I think. Especially since I've promised breaking changes in the not-too-distant future. (Although I would like to do a Jiff 1.0 release about 1 year from now and commit to stability.) At least, I know I'd be hesitant if I were on the other side of it. But maybe folks are more flexible than me, I'm not sure.

I've been noodling on just adding these integrations to jiff itself. I do worry that if I do that, then the integrations will always stay with Jiff, even at 1.0. But maybe there just isn't another feasible choice.

But, why do you mention humantime? humantime doesn't have any integrations with time or chrono. humantime is more like a thin wrapper on top of std::time::Duration and std::time::SystemTime to make parsing and printing a bit nicer.


burntsushi @programming.dev 2y ago

You should absolutely not need to handle ISO 8601 and RFC 3339 manually. They are supported via the Display and FromStr trait implementations on every main type in Jiff (Span, Zoned, Timestamp, civil::DateTime, civil::Date and civil::Time). It's technically an implementation of a mixture of ISO 8601, RFC 3339 and RFC 9557, but the grammar is specified precisely by Temporal. See: https://docs.rs/jiff/latest/jiff/fmt/temporal/index.html

burntsushi @programming.dev 2y ago

OK, I've beefed that example up a little bit: https://github.com/BurntSushi/jiff/commit/08dfdde204c739e38147faf70b648e2d417d1c2e

I think the comparison is a bit more muddied now and probably worse overall to be honest. Maybe removing DateTime<Local> is the right thing to do. I'll think on it.

It's a subtle comparison to make... Probably most people don't even realize that they're doing the wrong thing.


burntsushi @programming.dev 2y ago

OK, fair enough. What should it say instead? Just omit the mention of DateTime<Local>? I used it because it's literally the only way to derive(Deserialize) in Chrono in a way that gives you DST aware arithmetic on the result without getting time zone information via some out-of-band mechanism.


burntsushi @programming.dev 2y ago

Again, to be clear, I'm not saying it's impossible to do. But in order to do it, you have to build your own abstractions. And even then, you still can't do it because tzfile doesn't give you enough to do it. And tzfile has a platform-specific API with no caching, so every time you parse a datetime with a tz ID in it, it completely reloads the TZif data from disk.

Some of these things are implementation quality issues that can be fixed. Others are library design problems where you can achieve your objective by building your own abstractions. Like, do you really not see this as something that should be mentioned in a comparison between these crates? You must recognize the difference between what you're doing and just plopping a Zoned in your struct, deriving Serialize and Deserialize, and then just letting the library do the right thing for you. And mentioning this is appropriate in the context of the "facts of comparison" because it translates into a real user experience difference for callers.


burntsushi @programming.dev 2y ago

Time zone transition changes happen all the time. Once you start storing datetimes in the future, you're in a bit of a precarious position here. Moreover, this is a standardized interchange format that other libraries will know how to read/write. (It's relatively newly standardized, but has been used in practice among other datetime libraries.)

I think you also glossed over some of my other points. How do you write your serialization code using Chrono? Does it work with both chrono-tz and tzfile?

The point is almost never about "it is literally impossible to accomplish task foo," but rather, it matters how it's approached and how easy it is to do. And if you have to rely on your users having very specific domain knowledge about this, it's likely there will be errors. As my design docs state, I didn't only make Jiff to offer more functionality. I also made it because I felt like the APIs could be better. That's a very subjective valuation, and I find arguments of the type, "well I can just use the old library in this way as long as I hold it right and it actually works just fine" to be missing the forest for the trees.

burntsushi @programming.dev 2y ago

The original name I wanted was gigawatt or some variation thereof. :)


burntsushi @programming.dev 2y ago

> It’s not built-in support.

Right. That's exactly what the code snippet says:

    
    // The serialized datetime has no time zone information,
    // so unless there is some out-of-band information saying
    // what its time zone is, we're forced to use a fixed offset:

So I feel like the point you're making here is already covered by the example comparison I wrote. It's not built-in, so you have to invent your own interchange format. And since your serialized format doesn't include offset information at the time the instant was created, it's impossible to do offset conflict resolution. For example, let's say you record one year from today in Ukraine:

    use jiff::{ToSpan, Unit, Zoned};

    fn main() -> anyhow::Result<()> {
        let now = Zoned::now().round(Unit::Minute)?.intz("Europe/Kyiv")?;
        let next_year = now.checked_add(1.year())?;
        println!("{next_year}");
        Ok(())
    }

And the output:

    
    $ cargo -q r
    2025-07-22T17:23:00+03:00[Europe/Kyiv]

And maybe you store this datetime somewhere.

At this point, it's looking like Ukraine is going to abolish DST for next year. So what happens to that datetime above? It no longer has the right offset. So now you need to choose whether to reject it altogether (the default), respect the offset (even if the civil time changes) or respect the civil time (even if the instant changes).

Here's an example of when this happened with Brazil abolishing DST: https://docs.rs/jiff/latest/jiff/fmt/temporal/struct.DateTimeParser.html#example-3
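To make those three choices concrete, here is a minimal std-only sketch. It is not Jiff's API: `Stored`, `Conflict` and `resolve` are hypothetical names, times are plain seconds, and the zone's current rule is passed in as a single offset instead of being looked up in tzdb.

```rust
/// A datetime as it was serialized: civil (wall clock) time plus the offset
/// that was valid when it was written. Hypothetical type, not Jiff's.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Stored {
    /// Civil time, as seconds since 1970-01-01T00:00:00 on the local clock.
    civil_secs: i64,
    /// UTC offset in seconds recorded at serialization time.
    offset_secs: i64,
}

/// What to do when the recorded offset disagrees with the zone's current rule.
#[derive(Clone, Copy, Debug)]
enum Conflict {
    /// Refuse to produce a datetime at all.
    Reject,
    /// Keep the recorded instant; the civil time may change.
    KeepOffset,
    /// Keep the recorded civil time; the instant may change.
    KeepCivil,
}

/// Returns (instant in Unix seconds, civil seconds under the current rule).
fn resolve(s: Stored, current_offset_secs: i64, how: Conflict) -> Result<(i64, i64), String> {
    let recorded_instant = s.civil_secs - s.offset_secs;
    if s.offset_secs == current_offset_secs {
        // No conflict: every strategy agrees.
        return Ok((recorded_instant, s.civil_secs));
    }
    match how {
        Conflict::Reject => Err(format!(
            "recorded offset {} does not match current offset {}",
            s.offset_secs, current_offset_secs
        )),
        Conflict::KeepOffset => Ok((recorded_instant, recorded_instant + current_offset_secs)),
        Conflict::KeepCivil => Ok((s.civil_secs - current_offset_secs, s.civil_secs)),
    }
}

fn main() {
    // 17:23 civil time recorded at +03:00, but the zone now says +02:00.
    let stored = Stored { civil_secs: 17 * 3600 + 23 * 60, offset_secs: 3 * 3600 };
    for how in [Conflict::Reject, Conflict::KeepOffset, Conflict::KeepCivil] {
        println!("{how:?}: {:?}", resolve(stored, 2 * 3600, how));
    }
}
```

The key point is that none of these strategies is available unless the offset was written down alongside the time zone ID when the datetime was serialized.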

burntsushi @programming.dev 2y ago

> Is the cache invalidated if system tzdata is updated?

Yes, although at present, there is a TTL. So an update may take "time" to propagate. jiff::tz::db().reset() will force the cache to be invalidated. I expect the cache invalidation logic to get tweaked as we get real experience with it.

> And what effect does the answer have on the example from “Jiff supports detecting time zone offset conflicts” if both zoned datetimes used the system timezone which got updated between 1. opening 2. parsing the two zoned datetimes.

It's hard to know precisely what you mean. But once you get a jiff::tz::TimeZone, that value is immutable: https://docs.rs/jiff/latest/jiff/tz/struct.TimeZone.html#a-timezone-is-immutable

New updates to tzdb are only observed when you do a tzdb lookup.
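As a rough illustration of that behavior, here is a std-only sketch of a TTL cache that hands out immutable snapshots. It is not how Jiff implements its cache: `ZoneCache`, `ZoneData` and the explicit `now` parameter are hypothetical, chosen so the example is deterministic.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::time::{Duration, Instant};

/// Stand-in for parsed time zone data. Once handed out, it never changes.
#[derive(Debug)]
struct ZoneData {
    version: u32,
}

/// Hypothetical TTL cache keyed by time zone ID.
struct ZoneCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, Arc<ZoneData>)>,
}

impl ZoneCache {
    fn new(ttl: Duration) -> Self {
        ZoneCache { ttl, entries: HashMap::new() }
    }

    /// Returns the cached value unless it is older than the TTL, in which
    /// case `load` is called to read it again.
    fn get(&mut self, id: &str, now: Instant, load: impl FnOnce() -> ZoneData) -> Arc<ZoneData> {
        if let Some((loaded_at, data)) = self.entries.get(id) {
            if now.duration_since(*loaded_at) < self.ttl {
                return Arc::clone(data);
            }
        }
        let data = Arc::new(load());
        self.entries.insert(id.to_string(), (now, Arc::clone(&data)));
        data
    }

    /// Drops every entry so that the next lookup reloads.
    fn reset(&mut self) {
        self.entries.clear();
    }
}

fn main() {
    let mut cache = ZoneCache::new(Duration::from_secs(60));
    let t0 = Instant::now();
    let old = cache.get("Europe/Kyiv", t0, || ZoneData { version: 1 });
    // The underlying data changed, but the entry is still within its TTL.
    let same = cache.get("Europe/Kyiv", t0 + Duration::from_secs(10), || ZoneData { version: 2 });
    // Forcing invalidation makes the next lookup observe the update.
    cache.reset();
    let new = cache.get("Europe/Kyiv", t0 + Duration::from_secs(11), || ZoneData { version: 2 });
    // `old` is untouched by the reload: values already handed out are immutable.
    println!("{} {} {}", old.version, same.version, new.version);
}
```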

> In this section, wouldn’t be more realistic for chrono users to use timezone info around the wire instead of on the wire, rather than using Local+FixedOffset?

That's kinda my point. How do they do that? And does it work with chrono-tz and tzfile? And what happens if tzdb updates lead to a serialized datetime with an incorrect offset in a future update of tzdb? There are all sorts of points of failure here that Jiff will handle for you by virtue of tighter integration with tzdb as a first class concept.

burntsushi @programming.dev 2y ago

How are you doing a date/time library without platform dependencies like libc or windows-sys? Are you rolling your own bindings in order to get the local time zone? (Or perhaps you aren't doing that at all.)

burntsushi @programming.dev 3y ago

Ah gotya, thanks!


burntsushi @programming.dev 3y ago

Disclosure: I'm the author of the memchr crate.

You mention the memchr crate, but you don't seem to have benchmarked it. Instead, you benchmarked the needle crate (last updated 7 years ago). Can you explain a bit more about your methodology?

The memchr crate in particular doesn't just use Rabin-Karp. It also uses Two-Way. And SIMD (with support for x86-64, aarch64 and wasm32).
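For readers who haven't seen it, here is a minimal textbook sketch of Rabin-Karp with a rolling hash. It is only meant to show the idea and is not the memchr crate's implementation; the function name, the hash base and the use of wrapping 32-bit arithmetic are all arbitrary choices for illustration.

```rust
/// Returns the byte offset of the first occurrence of `needle` in `haystack`.
fn rabin_karp(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    const BASE: u32 = 257;
    let n = needle.len();
    if n == 0 {
        return Some(0);
    }
    if haystack.len() < n {
        return None;
    }
    // BASE^(n-1), used to remove the byte that leaves the window.
    let mut high = 1u32;
    for _ in 1..n {
        high = high.wrapping_mul(BASE);
    }
    // Polynomial hash with arithmetic modulo 2^32.
    let hash = |bytes: &[u8]| {
        bytes
            .iter()
            .fold(0u32, |h, &b| h.wrapping_mul(BASE).wrapping_add(b as u32))
    };
    let target = hash(needle);
    let mut window = hash(&haystack[..n]);
    let mut start = 0;
    loop {
        // Hashes can collide, so confirm a candidate with a real comparison.
        if window == target && &haystack[start..start + n] == needle {
            return Some(start);
        }
        if start + n >= haystack.len() {
            return None;
        }
        // Roll the window: drop haystack[start], add haystack[start + n].
        window = window
            .wrapping_sub(high.wrapping_mul(haystack[start] as u32))
            .wrapping_mul(BASE)
            .wrapping_add(haystack[start + n] as u32);
        start += 1;
    }
}

fn main() {
    println!("{:?}", rabin_karp(b"Sherlock Holmes", b"Holmes"));
}
```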

burntsushi @programming.dev 3y ago

Both Perl and Python use backtracking regex engines and are thus susceptible to similar problems as discussed in the OP.


burntsushi @programming.dev 3y ago

Cross-posting from reddit:

The PR has more details, but here are a few ad hoc benchmarks using ripgrep on my M2 mac mini while searching a 5.5GB file.

This one is just a case insensitive search. A case insensitive regex expands to something like (ignoring Unicode) [Ss][Hh][Ee][Rr]..., which means that it has multiple literal prefixes. In fact, you can enumerate them! As long as the set is small enough, this is something that the new SIMD acceleration on aarch64 can handle (and has done for a long time on x86-64):

    $ time rg-before-teddy-aarch64 -i -c 'Sherlock Holmes' OpenSubtitles2018.half.en
    3055

    real    8.208
    user    7.731
    sys     0.467
    maxmem  5600 MB
    faults  191

    $ time rg-after-teddy-aarch64 -i -c 'Sherlock Holmes' OpenSubtitles2018.half.en
    3055

    real    1.137
    user    0.695
    sys     0.430
    maxmem  5904 MB
    faults  203
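The "you can enumerate them" step for the case insensitive search can be sketched like this. It is a simplified, ASCII-only illustration with a hypothetical `case_variants` helper, and it says nothing about how the regex crate actually selects or truncates its literals.

```rust
/// Enumerates every ASCII case variant of `prefix`.
fn case_variants(prefix: &str) -> Vec<String> {
    let mut out = vec![String::new()];
    for ch in prefix.chars() {
        let lower = ch.to_ascii_lowercase();
        let upper = ch.to_ascii_uppercase();
        let mut next = Vec::with_capacity(out.len() * 2);
        for s in &out {
            next.push(format!("{s}{upper}"));
            // Characters without case (spaces, digits) contribute one variant.
            if lower != upper {
                next.push(format!("{s}{lower}"));
            }
        }
        out = next;
    }
    out
}

fn main() {
    // A 3-letter prefix of the pattern yields 2^3 literal prefixes.
    let variants = case_variants("She");
    println!("{} variants: {:?}", variants.len(), variants);
}
```

The set doubles with every cased letter, which is why this only pays off when the enumerated set stays small.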

And of course, using multiple literals explicitly also uses this optimization:

    $ time rg-before-teddy-aarch64 -c 'Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty' OpenSubtitles2018.half.en
    3804

    real    9.055
    user    8.580
    sys     0.474
    maxmem  4912 MB
    faults  11

    $ time rg-after-teddy-aarch64 -c 'Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty' OpenSubtitles2018.half.en
    3804

    real    1.121
    user    0.697
    sys     0.422
    maxmem  4832 MB
    faults  11

And it doesn't just work for prefixes, it also works for inner literals too:

    $ time rg-before-teddy-aarch64 -c '\w+\s+(Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty)\s+\w+' OpenSubtitles2018.half.en
    773

    real    9.065
    user    8.586
    sys     0.477
    maxmem  6384 MB
    faults  11

    $ time rg-after-teddy-aarch64 -c '\w+\s+(Sherlock Holmes|John Watson|Irene Adler|Inspector Lestrade|Professor Moriarty)\s+\w+' OpenSubtitles2018.half.en
    773

    real    1.124
    user    0.702
    sys     0.421
    maxmem  6784 MB
    faults  11

If you're curious about how the SIMD stuff works, you can read my description of Teddy here. I ported this algorithm out of the Hyperscan project several years ago, and it has been one of the killer ingredients for making ripgrep fast in a lot of common cases. But it only worked on x86-64. With the rise and popularity of aarch64 and Apple silicon, I was motivated to port it over. I just recently finished analogous work for the memchr crate as well.

burntsushi @programming.dev 3y ago

Shortly after we resigned, the top-level team leads, project directors to the Foundation, core team members and the new mods got together to form an interim leadership cohort, sometimes called the "leadership chat." That then evolved into the Leadership Council by way of an RFC on governance.

