But it's still a "blind" fuzzer and it would be nice to write one that gets feedback from code coverage somehow. Instead, you have to run code coverage yourself and figure out how to change test data generation to improve it.
I've talked with lots of people in the PBT world who have always seen something like this as the end goal of the PBT ecosystem. It seemed like something that would happen eventually; someone just had to do it. I'm super excited to actually be doing it and bringing great PBT to any and every language.
It doesn't hurt that this is coming right as great PBT in every language is suddenly a lot more important thanks to AI code!
There's no doubt, I think, that testing will remain important and possibly become more important with more AI use, so better testing is helpful, PBT included. But the problem remains verifying that the tests actually test what they're supposed to. Mutation testing can allow agents to get good coverage with little human intervention, and PBT can make tests better and more readable. But people still have to read and understand them, and I suspect that many people who claim to generate thousands of LOC per day don't.
And even if the tests were great and people carefully reviewed them, that's not enough to make sure things don't go terribly wrong. Anthropic's C compiler experiment didn't fail because of bad testing. The tests weren't just good; it took humans years to write them by hand, and the agents still failed to converge.
I think good tests are a necessary condition for AI not generating terrible software, but we're clearly not yet at a point where they're a sufficient one. So "a huge part" - possibly, but there are other huge parts still missing.
PBT is for sure the future - which is apparently now? 10 years ago when I was talking about QuickCheck [0] all the JS and Ruby programmers in my city just looked at me like I had two heads.
[0] https://github.com/ryandv/chesskell/blob/master/test/Test/Ch...
Given the Curry-Howard isomorphism, couldn't we ask an AI to directly prove the property of the binary executable, under the assumption of the HW model, instead of running PBTs?
I by no means want to dismiss PBT, but it seems that this could be both faster and more reliable.
10 years ago might have been a little early (Hypothesis 1.0 came out 11 years ago this coming Thursday), but we had pretty wide adoption by year two and it's only been growing. It's just that the other languages have all lagged behind.
It's by no means universally adopted, but it's not a weird rare thing that nobody has heard of.
I think right now, if you're a happy proptest user, it's probably not clear that you should switch to Hegel. I'd love to hear about people trying, but I can't, hand on heart, say that it's clearly the correct thing for you to do given its early state, even though I believe it eventually will be.
But roughly the things that I think are clearly better about the Hegel approach and why it might be worth trying Hegel if you're starting greenfield are:
* Much better generator language than proptest (I really dislike proptest's choices here. This is partly a matter of personal aesthetics, but I do think explicitly constructed generators work better as an approach, and I think this has been borne out in Hypothesis). Hegel has a lot of flexible tooling for generating the data you want.
* Hegel gets you great shrinking out of the box, which always respects the validity requirements of your data. If you've written a generator that always ensures something is true, that should also be true of your shrunk data. This is... only kind of true in proptest at best. It doesn't have quite as many footguns in this space as original QuickCheck with its purely type-based shrinking, but you will often end up having to choose between shrinking that produces good results and shrinking that you're sure will give you valid data.
* Hegel's test replay is much better than seed saving. If you have a failing test and you rerun it, it will almost immediately fail again in exactly the same way. With approaches that don't use the Hypothesis model, the best you can hope for is to save a random seed and then rerun shrinking from that failing example, which is a lot slower.
There are probably a bunch of other quality of life improvements, but these are the things that have stood out to me when I've used proptest, and are in general the big contrast between the Hypothesis model and the more classic QuickCheck-derived ones.
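To make that validity footgun concrete, here's a tiny standalone Rust sketch. The names (`is_valid`, `naive_shrinks`) and the shrinker itself are invented for illustration; this is not proptest's or Hegel's actual machinery, just the general shape of the problem with invariant-ignoring, component-wise shrinking.

```rust
// Suppose a generator only ever produces ordered pairs (a, b) with a <= b.
fn is_valid(pair: (i32, i32)) -> bool {
    pair.0 <= pair.1
}

// A naive, type-based shrinker shrinks each component toward zero
// independently, ignoring the invariant the generator maintained.
fn naive_shrinks(pair: (i32, i32)) -> Vec<(i32, i32)> {
    let mut out = Vec::new();
    if pair.0 != 0 { out.push((0, pair.1)); }
    if pair.1 != 0 { out.push((pair.0, 0)); }
    out
}

fn main() {
    let original = (2, 5); // valid: 2 <= 5
    assert!(is_valid(original));
    // Shrinking the second component produces (2, 0), a pair the
    // generator could never have emitted:
    let broken: Vec<_> = naive_shrinks(original)
        .into_iter()
        .filter(|p| !is_valid(*p))
        .collect();
    assert_eq!(broken, vec![(2, 0)]);
}
```

A shrinker that operates on the underlying choice sequence instead of the final value avoids this, because every shrunk sequence is re-run through the generator and so always satisfies its invariants.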
Definitely. It's a lot harder to fake this with PBT than with example-based testing, but you can still write bad property-based tests and agents are pretty good at doing so.
I have generally found that agents with property-based tests are much better at not lying to themselves about it than agents with just example-based testing, but I still spend a lot of time yelling at Claude.
> So "a huge part" - possibly, but there are other huge parts still missing.
No argument here. We're not claiming to solve agentic coding. We're just testing people doing testing things, and we think that good testing tools are extra important in an agentic world.
That angle is legibility. How do you know your AI-written slop software is doing the right thing? One would normally read all the code. Bad news: that's not much less labor-intensive than not using AI at all.
But, if one has comprehensive property-based tests, they can instead read only the property-based tests to convince themselves the software is doing the right thing.
By analogy: one doesn't need to see the machine-checked proof to know the claim is correct. One only needs to check the theorem statement is saying the right thing.
Yeah, I know. Just an opportunity to talk about some of the delusions we're hearing from the "CEO class". Keep up the good work!
* Hypothesis/Hegel are very much focused on using test assertions rather than a single property that can be true or false. This naturally drives a style that is much more like "normal" testing, but also has the advantage that you can distinguish between different types of failing test. We don't go too hard on this, but both Hegel and Hypothesis will report multiple distinct failures if your test can fail in multiple ways.
* Hegelothesis's data generation, and how it interacts with testing, is much more flexible and basically fully imperative. You can generate whatever data you like wherever in your test you like, freely interleaving data generation and test execution.
* QuickCheck is very much type-first, with explicit generators as an afterthought. I think this is mostly a mistake even in Haskell, but in languages where "just wrap your thing in a newtype and define a custom implementation for it" will get you a "did you just tell me to go fuck myself?" response, it's a nonstarter. Hygel is generator-first: you can get the default generator for a type if you want, but it's mostly a convenience function, with the assumption that you're going to want a real generator specification at some point soon.
From an implementation point of view, and what enables the big conveniences, Hypothesis has a uniform underlying representation of test cases and does all its operations on them. This means you get:
* Test caching (if you rerun a failing test, it will immediately fail in the same way with the previously shrunk example)
* Validity guarantees on shrinking (your shrunk test case will always be one your generators could have produced. It's a huge footgun in QuickCheck that you can shrink to an invalid test case)
* Automatically improving the quality of your generators, never having to write your own shrinkers, and a whole bunch of other quality of life improvements that the universal representation lets us implement once and users don't have to care about.
The validity thing in particular is a huge pain point for a lot of users of PBT, and is what drove a lot of the core Hypothesis model to make sure that this problem could never happen.
The test caching is because I personally hated rerunning tests and not knowing whether it was just a coincidence that they were passing this time or that the test case had changed.
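A toy sketch of the underlying idea (the names here are invented, and the real Hypothesis internals are far more sophisticated): represent every test case as the sequence of choices the generators made, so replaying a failure is just feeding the recorded sequence back through the generators.

```rust
// Hypothetical sketch of a choice-sequence representation, not the
// actual Hypothesis/Hegel internals.
struct TestCase {
    choices: Vec<u64>, // recorded choices from a previous (failing) run
    index: usize,
}

impl TestCase {
    fn replay(choices: Vec<u64>) -> Self {
        TestCase { choices, index: 0 }
    }

    // Draw an integer in [0, max]. During replay we return the recorded
    // choice (clamped), so the same sequence reproduces the same data.
    fn draw_int(&mut self, max: u64) -> u64 {
        let v = self.choices.get(self.index).copied().unwrap_or(0);
        self.index += 1;
        v.min(max)
    }
}

fn main() {
    // Replaying the same choice sequence yields the same generated data,
    // which is what makes immediate, exact test replay possible.
    let mut a = TestCase::replay(vec![3, 7, 200]);
    let mut b = TestCase::replay(vec![3, 7, 200]);
    let data_a: Vec<u64> = (0..3).map(|_| a.draw_int(100)).collect();
    let data_b: Vec<u64> = (0..3).map(|_| b.draw_int(100)).collect();
    assert_eq!(data_a, data_b);
    assert_eq!(data_a, vec![3, 7, 100]); // 200 clamped to max
}
```

Shrinking in this model operates on the choice sequence itself and re-runs the generators, which is why shrunk examples are always ones the generators could have produced.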
My point isn't so much about PBT, but about how we don't yet know just how much agents help write real software (and how to get the most help from them).
[1]: I'm only using that number because Garry Tan, CEO of YC, claimed to generate 10K lines of text per day that he believes to be working code, and developers working with AI agents know they can't all be.
My personal hope is that we can port most of the Hypothesis test suite to hegel-rust, then point Claude at all the relevant code and tell it to write us a hegel-core in rust with that as its test harness. Liam thinks this isn't going to work, I think it's like... 90% likely to get us close enough to working that we can carry it over the finish line. It's not a small project though. There are a lot of fiddly bits in Hypothesis, and the last time I tried to get Claude to port it to Rust the result was better than I expected but still not good enough to use.
I also observed the cheating to increase. I recently tried to do a specific optimization on a big complex function. Wrote a PBT that checks that the original function returns the same values as the optimized function on all inputs. I also tracked the runtime to confirm that performance improved. Then I let Claude loose. The PBT was great at spotting edge cases but eventually Claude always started cheating: it modified the test, it modified the original function, it implemented other (easier) optimizations, ...
My guess is that you wouldn't have had a better time without PBT here and it would still have either cheated or claimed victory incorrectly, but definitely agreed that PBT can't fully fix the problem, especially if it's PBT that the agent is allowed to modify. I've still anecdotally found that the results are better than without it because even if agents will often cheat when problems are pointed out, they'll definitely cheat if problems aren't pointed out.
The necessity thing is the big thing: why unfold in this way and not some other way? The premises with which you set up your argument can lead to extreme distortions, even if you think you're being "charitable" or whatever. Descartes introduced mind-body dualism with the method of pure doubt, which at first glance seems like a legitimate angle of attack.
Unfortunately that's about as nuanced as my understanding gets. Importantly, this rules out a wide swath of "any conflict that ends in a resolution validates Hegel" sophistry.
Will: "Apparently Hegel actually hated the whole Hegelian dialectic and it's falsely attributed to him."
Me: "Oh, hm. But the name is funny and I'm attached to it now. How much of a problem is that?"
Will: "Well someone will definitely complain about it on hacker news."
Me: "That's true. Is that a problem?"
Will: "No, probably not."
(Which is to say: You're entirely right. But we thought the name was funny so we kept it. Sorry for the philosophical inaccuracy)
Statistics wasn't really mature enough yet to be applied to, say, political economy (a.k.a. economics), which is what Hegel was working in.
J.B. Say (1) was the leading mind in statistics at the time, but wasn't as popular in political circles. (Notably, Proudhon used Say's work as epistemology, versus Hegel and Marx.)
I've been in serious philosophy courses where they take the dialectic literally, as the epistemological source of reasoning, so it's not gone.
This is especially true in how Marx expanded it into dialectical materialism: he got stuck on the process as the right epistemological approach, and Marxists still love the dialectic and its Hegelian roots (Žižek is the biggest one here).
The dialectic eventually fell to robust numerical methods; it is a degenerate version of the sampling Markov process, which is really the best in class for epistemological grounding.
Someone posted this here years ago and I always thought it was a good visual: https://observablehq.com/@mikaelau/complete-system-of-philos...
I believe (unless my memory is broken) they get into this a bunch in Ep 15 of my favourite podcast "What's Left Of Philosophy": https://podcasts.apple.com/gb/podcast/15-what-is-dialectics-...
Also if you're not being complained about on HN, are you even really nerd-ing?
This is not quite accurate. Kant says very explicitly in the (rarely studied) Transcendental Doctrine of Method (Ch. 1, Section 4, A789/B817) that this kind of proof method (he calls it "apagogic") is unsuitable for transcendental proofs.
You might be thinking of the much more well-studied Antinomies of Pure Reason, in which he uses this kind of proof negatively (which is to say, to circumscribe the limits of reason) as part of his case against the way metaphysical arguments from philosophers of his time (whom he accused of a "dogmatic" use of reason) about the nature of the cosmos were posed.
The method he used in his Deduction is a transcendental argument, which is typically expressed using two things, X and Y. X is problematic (it can be true, but not necessarily so), and Y is dependent on X. So then if Y is true, X must necessarily be true as well.
One of the evilest tricks in marketing to developers is to ensure your post contains one small inaccuracy so somebody gets nerdsniped... not that I have ever done that.
I guess it would be more accurate to state Kant's actual premises here as the distinction between appearance and thing-in-itself rather than the deduction, but the deduction technique itself was fascinating when I first learned it, so that's what I associate most with Kant lol.
I guess I haven't thought critically about why we couldn't use a transcendental argument to support Descartes. I just treated it as a vague category error (to be fair, I don't actually know Descartes's philosophy that well, even less than I know Kant's lol). Could be a fun exercise when I have time.
Trump did this a lot with the legacy media in his first term. He would make inaccurate statements to the media on the topic he wanted to be in the spotlight, and the media would jump to "fact check" him. Guess what, now everyone is talking about illegal immigration, tariffs, or whatever subject Trump thought was to their advantage.
I think it's worth pointing out again that Hegel was at the height of contemporary philosophy at the time, but he wasn't a mathematician, and that is the key distinction.
Hegel lived in the pre-mathematical economics world, the continental-philosophy world of words with Kant etc., and never crossed into the mathematical world. So, in fairness to him, he was working with the limited capabilities and tools that he had.
Again, compare this to the scientific process described by Francis Bacon: there are no remixes of that, just improvements.
Ultimately, using the dialectic is trying to use an outdated technology for understanding human behavior.
I spent a little time looking at Hegel last week and it wasn't quite clear to me how I'd go about having something like a canonical generator for a type (similar to proptest's Arbitrary). I've found that to be very helpful while generating large structures to test something like serialization roundtripping against — in particular, the test-strategy library has derive macros that work very well for business logic types with, say, 10-15 enum variants each of which may have 0-10 subfields. I'm curious if that is supported today, or if you have plans to support this kind of composition in the future.
edit: oh I completely missed the macro to derive DefaultGenerator! Whoops
Interestingly, a lot of the arguments and formulations Kant had were lifted from Leibniz and reframed with a less mathematical flavor. I remember in particular that his argument against infinite regress was pretty much, pound for pound, a recitation of some conjecture from Leibniz (without attribution).
There's no doubt that basically nobody could've predicted 20th-century mathematics and physics a priori. I'm not too familiar with the physics side, but any modern philosopher who doesn't take computability seriously isn't worth their salt, for example. I'm not too familiar with statistics either, but I believe you that statistics and modern economic theories could disprove, say, Marxism as he envisioned it.
That definitely doesn't mean that all those tools from back then are useless or even just misinformed, IMO. I witness plenty of modern people (not you) being philosophically bankrupt when making claims.
So, in effect, applications of mathematics trended toward things that were machine-focused and finance-focused rather than human-focused.
The big transition happened after TV and the Internet (really just low-cost, high-reach advertising) became pervasive, and social scientists began applying statistical methods to consumer attention and action, using these channels as social-science experimentation platforms.
Social science moved from the squishy into the precise, precisely to give companies a market advantage in capturing market share by manipulating human behavior.
Ultimately, that has been the wet dream of political philosophers since Ptahhotep.
Hegel is irrelevant in the age of measurement
This is one of the areas we've dogfooded the least, so we'd definitely be happy to get feedback on any sharp corners here!
I think `from_type` is one of Hypothesis's most powerful and ergonomic strategies, and that while we probably can't get quite to that level in rust, we can still get something that's pretty great.
As Liam says, the derive generator is not very well dogfooded at present. The claude skill is a bit better, but we've only been through a few iterations of using it and getting Claude to improve it, and porting from proptest is one of the less well tested areas (because we don't use proptest much ourselves).
I expect all of this works, but I'd like to know ways that it works less well than it could. Or, you know, to bask in the glow of praise of it working perfectly if that turns out to be an option.
Hello. I wrote Hypothesis. Then, back in November, I joined Antithesis, shortly followed by Liam DeVoe (another core Hypothesis maintainer). The inevitable result was synthesis, which is why today we’re introducing our new family of property-based testing libraries, Hegel.1
Hegel is an attempt to bring the quality of property-based testing found in Hypothesis to every language, and to make this seamlessly integrate with Antithesis to increase its bug-finding power. Today we’re releasing Hegel for Rust, but this is the first of many libraries. We plan to release Hegel for Go in the next week or two, and we’ve got Hegel libraries in various states of readiness for C++, OCaml, and TypeScript that we plan to release over the coming weeks or months.
Here’s an example from Hegel for Rust to whet your appetite:
#[hegel::test(test_cases = 1000)]
fn test_fraction_parse_robustness(tc: hegel::TestCase) {
    let s: String = tc.draw(generators::text());
    let _ = Fraction::from_str(&s); // should never panic
}
This finds a bug in the fraction crate where from_str("0/0") panics rather than returning an error value.2
If that was already enough of a sales pitch for you, you can check out Hegel here.
If not, let me tell you a bit more about why property-based testing, and Hegel in particular, are pretty great and why I think you should use them.
We saw an example of it above with Hegel for Rust: property-based testing is testing where, rather than providing a full concrete test case yourself, you instead use the library to specify a range of values for which the test should pass. In our fraction example, our claim was a common one: our parser should never crash; it should always produce either a valid result or an error value.
You can think of that property-based test as infinitely many copies of tests that look like the following, where each test replaces the s value with a different string:
#[test]
fn test_fraction_parse_robustness() {
    let s: String = "0/0".to_string();
    let _ = Fraction::from_str(&s); // should never panic
}
The benefit of property-based testing libraries is that you don’t have to come up with those strings.
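In miniature, the library is doing something like the following hand-rolled driver. Everything here (the toy LCG, the string generator, using `str::parse::<f64>` as a stand-in for the fraction parser) is invented for illustration; Hegel's real generation and exploration are far smarter than uniform randomness.

```rust
// A crude linear congruential generator, standing in for the library's
// real source of randomness.
fn lcg(state: &mut u64) -> u64 {
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    *state >> 33
}

// Generate a short random ASCII string.
fn random_string(state: &mut u64) -> String {
    let len = (lcg(state) % 8) as usize;
    (0..len)
        .map(|_| char::from_u32((lcg(state) % 0x80) as u32).unwrap_or('?'))
        .collect()
}

fn main() {
    let mut state = 42u64;
    for _ in 0..1000 {
        let s = random_string(&mut state);
        // The "property": parsing must not panic. parse::<f64> returns
        // a Result rather than panicking, so this loop completes.
        let _ = s.parse::<f64>();
    }
}
```

The point of the library is that you never write this loop yourself, and that when an input fails, it also shrinks it down to a minimal failing example for you.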
“Doesn’t crash” is probably the most boring property-based test, but it’s surprisingly useful. Coming from Python, I find it especially so (it’s surprisingly hard to write a Python program that never crashes), but as we saw, such bugs show up even in Rust.
Here’s another example of a more interesting common property:
use hegel::generators::{self, Generator, integers, booleans};
use rust_decimal::Decimal;
use std::str::FromStr;

#[hegel::composite]
fn decimal_gen(tc: hegel::TestCase) -> Decimal {
    let int_part = tc.draw(integers::<i64>());
    let has_frac = tc.draw(booleans());
    if has_frac {
        let frac_digits = tc.draw(integers::<u32>()
            .min_value(1)
            .max_value(28));
        let frac_val = tc.draw(integers::<u64>()
            .max_value(10u64.saturating_pow(frac_digits.min(18))));
        let s = format!("{}.{:0>width$}", int_part, frac_val,
            width = frac_digits as usize);
        Decimal::from_str(&s).unwrap_or(Decimal::from(int_part))
    } else {
        Decimal::from(int_part)
    }
}

#[hegel::test(test_cases = 1000)]
fn test_decimal_scientific_roundtrip(tc: hegel::TestCase) {
    let d = tc.draw(decimal_gen());
    let sci = format!("{:e}", d);
    let parsed = Decimal::from_scientific(&sci)
        .expect(&format!("Failed to parse {:?} from {}", sci, d));
    assert_eq!(d, parsed);
}
Here we had to define our own custom generator for Decimal using Hegel’s support for composing generators. After that, we got to test a common property called “round tripping” — if you serialize a value into some format and then read it back, you should get the same value back. This is probably one of the most common non-trivial properties that it’s worth testing in most projects, as most software needs to transform data between different formats at some point. In this case it turns out that rust_decimal doesn’t correctly handle zero when converting numbers to scientific notation, and this test finds the bug.
I have a rough classification of bugs found by property-based testing as falling into three categories:
At Antithesis we’re most excited about the third category, but generally I find a lot of the initial value of property-based testing comes from shaking out the first two, because bugs of this type are so easy to find.
For example, here’s a test that shows heck running afoul of Unicode being cursed (reported bug):
use heck::ToTitleCase;

#[hegel::test(test_cases = 1000)]
fn test_title_case_idempotent(tc: hegel::TestCase) {
    let s: String = tc.draw(generators::text());
    let once = s.to_title_case();
    let twice = once.to_title_case();
    assert_eq!(once, twice);
}
This tests the intuitive property that once you’ve converted something into title case, it’s in title case and shouldn’t need further changes. Unfortunately, this fails by drawing “ß”, which the first to_title_case turns into "SS", which the second then turns into "Ss".
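You can reproduce the underlying casing quirk with nothing but Rust's standard library:

```rust
fn main() {
    // Uppercasing "ß" produces the two-character string "SS"; no
    // per-character case mapping can round-trip it back to "ß".
    assert_eq!("ß".to_uppercase(), "SS");
    assert_eq!("SS".to_lowercase(), "ss");
    // So a title-casing pass that sees "SS" has no way to know it was
    // once a single lowercase letter, and a second pass yields "Ss".
}
```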
The best example I’ve got for you right now of “complicated structural invariants” comes from this (it turns out, already known) bug Hegel found in the im library:
#[hegel::test(test_cases = 1000)]
fn test_ordmap_get_prev(tc: hegel::TestCase) {
    // Trick to boost the size to make sure we test on large key sets.
    let n = tc.draw(generators::integers::<usize>().max_value(200));
    let keys: Vec<i32> = tc.draw(generators::vecs(generators::integers()).min_size(n));
    let im_map: OrdMap<i32, i32> = keys.iter().map(|&k| (k, k)).collect();
    let bt_map: BTreeMap<i32, i32> = keys.iter().map(|&k| (k, k)).collect();
    let key = tc.draw(generators::integers::<i32>());
    let im_prev = im_map.get_prev(&key).map(|(k, v)| (*k, *v));
    let bt_prev = bt_map.range(..=key).next_back().map(|(&k, &v)| (k, v));
    assert_eq!(im_prev, bt_prev, "get_prev({}) mismatch with {} keys", key, im_map.len());
}
This finds that above a certain size, get_prev returns the wrong value.
This sort of test is a simple example of what we usually call “model-based testing” — you’ve got something you want to test, and you construct a “model” of it — usually some bad implementation of the same thing that e.g. stores everything in memory, or implements things inefficiently. You can then use property-based testing to check that the model and reality always agree.
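Here's the shape of that idea as a self-contained Rust sketch (no Hegel involved, and the operation sequence is hard-coded where a property-based test would generate it): a deliberately naive association list checked against std's BTreeMap as the model.

```rust
use std::collections::BTreeMap;

// The implementation under test: an association list with
// last-write-wins semantics and linear-time lookup.
struct AssocList {
    entries: Vec<(i32, i32)>,
}

impl AssocList {
    fn new() -> Self { AssocList { entries: Vec::new() } }
    fn insert(&mut self, k: i32, v: i32) { self.entries.push((k, v)); }
    fn get(&self, k: i32) -> Option<i32> {
        // Scan from the back so the most recent write wins.
        self.entries.iter().rev().find(|(ek, _)| *ek == k).map(|(_, v)| *v)
    }
}

fn main() {
    // The "model" is std's BTreeMap, which we trust to be correct.
    let ops = [(1, 10), (2, 20), (1, 11), (3, 30)];
    let mut real = AssocList::new();
    let mut model = BTreeMap::new();
    for (k, v) in ops {
        real.insert(k, v);
        model.insert(k, v);
        // Invariant: implementation and model agree on every probed key.
        for probe in [1, 2, 3, 4] {
            assert_eq!(real.get(probe), model.get(&probe).copied());
        }
    }
}
```

In a real model-based test the operations would come from a generator, so the library would explore sequences (and sizes, as in the OrdMap example above) you'd never think to write by hand.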
There are many more ways to use property-based testing than this. This post just showcases some of the more effective sorts of tests you can write with it. When getting started I actually tend to recommend starting with one of your existing tests and refactoring it, but once you start thinking in terms of this sort of testing you’ll start to see examples like the above ones everywhere.
If you’re not familiar with it, Hypothesis is the most widely used property-based testing library in the world.
Part of why Hypothesis is the most widely used library of this sort is that it’s written in Python, which I’m given to understand has a few users. But Hypothesis wasn’t the first property-based testing library in Python, only the first that achieved widespread use. This is because it has a lot of benefits over other property-based testing libraries.
The main3 ones are:
My running joke with Hypothesis is that every other property-based testing library is based on QuickCheck, which was a great innovation in testing, but is fundamentally written for Haskell programmers, and Haskell programmers are willing to put up with a lot of suffering for correctness. If Python programmers were willing to put up with suffering to achieve correctness, they’d not be writing Python in the first place!
Like everything else, Hypothesis started as basically a QuickCheck port, but over time as I (and later we) listened to what people found annoying about that, it diverged further and further from the original style of property-based testing which looks more like writing theorems about your code, and moved much more to a highly ergonomic extension to “normal” testing that increases its bug-finding power.
All of these benefits follow from the underlying model of Hypothesis, which is relatively simple.5 But the reality is that the real competitive advantage of Hypothesis is that we (me, Liam, and Zac Hatfield-Dodds) put an unreasonable amount of work into it. As a result, not many other libraries come close, because most people are only willing to put a reasonable amount of work in. Go’s Rapid library is probably the most credible port we’ve seen, but most of the other libraries that claim to be Hypothesis inspired didn’t adopt the core model, and as a result don’t get the benefits of it.
And, to be honest, we’re not willing to put that much work in again for new languages either! We’d love it if every language had a Hypothesis-quality property-based testing library, but not as much as we’d love not to have to maintain that for every language.
This led to the slightly crazy idea that I pitched when joining Antithesis: What if, instead of writing Hypothesis for every language, we just make it easy for other languages to use Hypothesis? It’s extremely common to wrap libraries in other languages in Python bindings, so why not go the other way?6
This is the core idea of Hegel: We run Hypothesis,7 and let it be the source of all generated data, wrapping it with a thin client library that turns it into values in your preferred target language. We get to implement the full feature-set of Hypothesis, because it’s all there for us already.
This means that each time you want to spin up a new Hegel library, you just have to implement the Hegel protocol and figure out the right API for the target language, and you’ve got a new high-quality property-based testing library for your language of choice. The only actually hard part is a bit of care and good taste to make sure it feels like a native citizen of the language.
As well as Hypothesis-grade property-based testing for every language, the other part of this is of course Antithesis. Medium-to-long term, the plan is that Hegel becomes one of the major entry points to running on Antithesis. That way, you can write your Hegel tests outside of Antithesis,8 get them working smoothly on your own infrastructure, and then easily run them on Antithesis to get increased bug-finding power, as well as all the usual debugging and reproducibility benefits you get from running on Antithesis.
Short term, this plan already more or less works! Hegel isn’t yet particularly good at testing the sort of highly concurrent distributed systems that are the bread-and-butter of Antithesis testing — it has largely inherited the limitations of Hypothesis in this regard. So we think Antithesis will be great if you’re writing Hegel tests and you want a bit more oomph, but Hegel will only sometimes be great if you’ve got Antithesis and want to improve your testing on it. Watch this space, though! Hopefully we’ll have some more updates on that over the coming months.
I’m obviously biased, but I really think Hegel is going to be a huge part of the future of how we do software development. I’d think that even if we weren’t currently in the middle of AI-based workflows changing everything, but we are and that makes a big difference.
As Liam has recently articulated well, property-based testing is going to be a huge part of how we make AI-agent-based software development not go terribly. For those of us who use property-based testing, it’s already been a huge part of how we make human-based software development not go terribly for the last several decades, but all of the advantages we’ve been leaning on are now extra important.
I’ve done a bunch of work on AI evaluations in the past, and one of the things that always stood out is how many times an AI would pass a coding evaluation and then you’d add property-based tests and find that a substantial fraction of its solutions now failed (this is, to be fair, also the experience of humans writing code and property-based testing it for the first time). AI has gotten much better since then, but its code is still, for want of a better word, sloppy, and we need tools to compensate for that.
But the converse of this is that it’s also never been easier to get started with property-based testing, because agents are actually pretty good at writing the tests! I have a confession to make: all those examples of bugs we found using Hegel? I didn’t write them. Claude did.
As well as the core Hegel libraries, we’re also releasing a Hegel skill for getting agents to write property-based tests for you. I don’t think it can — or should — replace you writing your own property-based tests, but the hardest part of property-based testing for people has always seemed to be writing the first test, because it forces you to think a lot more about how to generate data for testing your code. Letting an agent get you over that initial hump is going to be a huge win.
All of this is, of course, an argument that you should be using property-based testing, rather than Hegel in particular. Why should you use Hegel in particular?
Well, if you’ve already got great property-based tests that you’re happy with, you probably shouldn’t. Hegel is still early days and while we want it to be the best property-based testing library in every language, and are confident that we’ll get it there, we can’t deny that it’s got some rough edges. That being said, if you want to check it out anyway, I bet Claude will one-shot porting over your existing tests to it, and you can decide for yourself which you prefer (and if it’s the existing ones, we would really appreciate your telling us why so we can fix it!).
If, on the other hand, you’d like to get started on some green field property-based testing, we think Hegel is a great place to do it. It inherits a lot of power from its Hypothesis core, and we’ve made it as easy to use as possible.
In the short term, the big thing we’re working on is Hegel for other languages. As mentioned, we’ve got Go, C++, OCaml, and TypeScript in the works at various levels of readiness. Expect to see some or all of these over the coming weeks.
Between that, supporting users, and the inevitable feature requests and bug reports we expect/hope to get, we’re going to be a bit busy in the short term, but we’ve also got some more ambitious plans coming up.
We’d like to drop the Python dependency for Hegel. As well as being kind of weird, it’s definitely the current performance limiter on running Hegel tests. Our current long-term plan is to implement a second Hegel server in Rust, but we’re not promising this will happen or committing to any timelines yet.
After that, our top priority is getting Hegel better at the sort of workloads that Antithesis shines at. Currently we expect it to work well for the traditional sort of property-based testing that Hypothesis is already good at, but we’re looking to expand it to be better at highly concurrent and non-deterministic tests.
As well as making this better for testing the sorts of distributed systems people use Antithesis to test, it’s also a prerequisite for better integration with Antithesis’s other new open source property-based initiative, Bombadil.9 From the very beginning, it’s been the plan that Bombadil is going to get great shrinking and great Antithesis fuzzer integration through the Hegel protocol, but we didn’t have a runner capable of it when Oskar started the project, and we agreed it would be crazy to delay the project for that. Figuring out how to bridge that gap is very much on our roadmap.
Right now, Hegel is more or less a “developer preview”. We expect the underlying logic to be pretty rock solid, because Hypothesis is pretty rock solid, but there are definitely going to be some rough edges in how we interact with it. We’re pretty happy with the API but expect we’ve not got it 100% right.
We’d love it if you checked out Hegel and let us know about any bugs you find with it, whether they’re in your code or ours!