1. Fil-C is slower and bigger. Noticeably so. If you were OK with slower and bigger then the rewrite you should have considered wasn't to Rust in the last ten years but to Java or C# much earlier. That doesn't invalidate Fil-C's existence, but I want to point that out.
2. You're still writing C. If the program is finished or just occasionally doing a little bit of maintenance that's fine. I wrote C for most of my career, it's not a miserable language, and you are avoiding a rewrite. But if you're writing much new code Rust is just so much nicer. I stopped writing any C when I learned Rust.
3. This is runtime safety and you might need more. Rust gives you a bit more, often you can express at compile time things Fil-C would only have checked at runtime, but you might need everything and languages like WUFFS deliver that. WUFFS doesn't have runtime checks. It has proved to its satisfaction during compilation that your code is safe, so it can be executed at runtime in absolute safety. Your code might be wrong. Maybe your WUFFS GIF flipper actually makes frog GIFs purple instead of flipping them. But it can't crash, or execute x86 machine code hidden in the GIF, or whatever, that's the whole point.
I'm not convinced that tying the lifetimes into the type system is the correct way to do memory management. I've read too many articles of people being forced into refactoring the entire codebase to implement a feature.
Fil-Qt: A Qt Base build with Fil-C experience (143 points, 3 months ago, 134 comments) https://news.ycombinator.com/item?id=46646080
Linux Sandboxes and Fil-C (343 points, 4 months ago, 156 comments) https://news.ycombinator.com/item?id=46259064
Ported freetype, fontconfig, harfbuzz, and graphite to Fil-C (67 points, 5 months ago, 56 comments) https://news.ycombinator.com/item?id=46090009
A Note on Fil-C (241 points, 5 months ago, 210 comments) https://news.ycombinator.com/item?id=45842494
Notes by djb on using Fil-C (365 points, 6 months ago, 246 comments) https://news.ycombinator.com/item?id=45788040
Fil-C: A memory-safe C implementation (283 points, 6 months ago, 135 comments) https://news.ycombinator.com/item?id=45735877
Fil's Unbelievable Garbage Collector (603 points, 7 months ago, 281 comments) https://news.ycombinator.com/item?id=45133938
Guaranteed memory safety at compile time is clearly the better approach when you care about programs that are both functionally correct and memory safe. If I'm writing something that takes untrusted user input, like a web API, memory safety issues caught at runtime still end up as denial-of-service vulns. That's better, but it's still not great.
Not to disparage the Fil-C work, but the runtime approach has limitations.
When's the last time you told a C/C++ programmer you could add a garbage collector to their program, and saw their eyes light up?
> "rewrite it in rust for safety" just sounds stupid
To be fair, Fil-C is quite a bit slower than Rust, and uses more memory.
On the other hand, Fil-C supports safe dynamic linking and is strictly safer than Rust.
It's a trade off, so do what you feel
Fil-C just does the job with existing software in C or C++ without an expensive and bug riddled re-write and serves as a quick protection layer against the common memory corruption bugs found in those languages.
I love Fil-C. It's underrated. Not the same niche as Rust or Ada.
If it's guaranteed to crash, then it's memory-safe.
If you dislike that definition, then no mainstream language is memory-safe, since they all use crashes to handle out of bounds array accesses
https://play.rust-lang.org/?version=stable&mode=debug&editio...
It’s true that, assuming all things equal, compile-time checks are better than run-time. I love Rust. But Rust is only practical for a subset of correct programs. Rust is terrible for things like games where Rust simply can not prove at compile-time that usage is correct. And inability to prove correctness does NOT imply incorrectness.
I love Rust. I use it as much as I can. But it’s not the one true solution to all things.
Interesting. How costly would hardware acceleration support for Fil-C code be?
ar->invisible_bytes = calloc(length, sizeof(AllocationRecord));
Other languages have runtime exceptions on out-of-bounds access, Fil-C has unrecoverable crashes. This makes it pretty unsuitable to a lot of use cases. In Go or Java (arbitrary examples) I can write a web service full of unsafe out-of-bounds array reads, any exception/panic raised is scoped to the specific malformed request and doesn't affect the overall process. A design that's impossible in Fil-C.
- Explicitly unsafe
- Runtime crash
- Runtime crash w/ compile time avoidance when possible
(Also I think the commenter you're replying to just worded their comment inaccurately, code that crashes instead of violating memory safety is memory safe, a compilation error would just have been more useful than a runtime crash in most cases)
And inability to prove incorrectness does NOT imply correctness. I think most Rust users don't understand either, because of the hype.
Catch the panic & unwind, safe program execution continues. Fundamentally impossible in Fil-C.
But Rust provides both checked alternatives to indexed reads/writes (compile time safe returning Option<_>), and an exception recovery mechanism for out-of-bounds unsafe read/write. Fil-C only has one choice which is "crash immediately".
[1]: https://en.wikipedia.org/wiki/Capability_Hardware_Enhanced_R...
I am the author of Fil-C
If you want to see my write-ups of how it works, start here: https://fil-c.org/how
- Me. I'm a C++ programmer.
- Any C++ programmer who has added a GC to their C++ program. (Like the programmers who used the web browser you're using right now.)
- Folks who are already using Fil-C.
try-catch isn't a particularly complete solution either if you have any code outside of it (at the very least, the catch arm) or if data can get preserved across iterations that can easily get messed up if left half-updated (say, caches, poisoned mutexes, stuck-borrowed refcells) so you'll likely want a full restart to work well too, and might even prefer it sometimes.
Here's why:
1. For the first year of Fil-C development, I was doing it on a Mac, and it worked fine. I had lots of stuff running. No GUI in that version, though.
2. You could give Fil-C an FFI to Yolo-C. It would look sort of like the FFIs that Java, Python, or Ruby do. So, it would be a bit annoying to bridge to native APIs, but not infeasible. I've chosen not to give Fil-C such an FFI (except a very limited FFI to assembly for constant time crypto) because I wanted to force myself to port the underlying libraries to Fil-C.
3. Apple could do a Fil-C build of their userland, and MS could do a Fil-C build of their userland. Not saying they will do it. But the feasibility of this is "just" a matter of certain humans making choices, not anything technical.
I also don't think it's that niche a use case. It's one encountered by every web server or web client (scope exception to single connection/request). Or anything involving batch processing, something like "extract the text from these 10k PDFs on disk".
My original foray into GCs was making real time ones, and the Fil-C GC is based on that work. I haven’t fully made it real time friendly (the few locks it has aren’t RT-friendly) but if I had more time I could make it give you hard guarantees.
It’s already fully concurrent and on the fly, so it won’t pause you.
I just don’t like that design. It’s a matter of taste
Generally, I think one could want to recover from errors. But error recovery is something that needs to be designed in. You probably don't want to catch all errors, even in a loop handling requests for an application. If your application isn't designed to handle the same kinds of memory access issues as we're talking about here, the whole thing turns into non-existent-apples to non-existent-apples lol.
I've seen lots of chatter about Fil-C recently, which pitches itself as a memory safe implementation of C/C++. You can read the gritty details of how this is achieved, but for people coming across it for the first time, I think there is value in showing a simplified version, as once you've understood the simplified version it becomes a smaller mental step to then understand the production-quality version.
The real Fil-C has a compiler pass which rewrites LLVM IR, whereas the simplified model is an automated rewrite of C/C++ source code: unsafe code is transformed into safe code. The first rewrite is that within every function, every local variable of pointer type gains an accompanying local variable of AllocationRecord* type, for example:
| Original Source | After Fil-C Transform |
|---|---|
| `void f() {`<br>`T1* p1;`<br>`T2* p2;`<br>`uint64_t x;`<br>`...` | `void f() {`<br>`T1* p1; AllocationRecord* p1ar = NULL;`<br>`T2* p2; AllocationRecord* p2ar = NULL;`<br>`uint64_t x;`<br>`...` |
Where AllocationRecord is something like:
struct AllocationRecord {
char* visible_bytes;
char* invisible_bytes;
size_t length;
};
Trivial operations on local variables of pointer type are rewritten to also move around the AllocationRecord*:
| Original Source | After Fil-C Transform |
|---|---|
| `p1 = p2;` | `p1 = p2, p1ar = p2ar;` |
| `p1 = p2 + 10;` | `p1 = p2 + 10, p1ar = p2ar;` |
| `p1 = (T1*)x;` | `p1 = (T1*)x, p1ar = NULL;` |
| `x = (uintptr_t)p1;` | `x = (uintptr_t)p1;` |
When pointers are passed-to or returned-from functions, the code is rewritten to include the AllocationRecord* as well as the original pointer. Calls to particular standard library functions are additionally rewritten to call Fil-C versions of those functions. Putting this together, we get:
| Original Source | After Fil-C Transform |
|---|---|
| `p1 = malloc(x);`<br>`...`<br>`free(p1);` | `{p1, p1ar} = filc_malloc(x);`<br>`...`<br>`filc_free(p1, p1ar);` |
The (simplified) implementation of filc_malloc actually performs three distinct allocations rather than just the requested one:
void* filc_malloc(size_t length) {
AllocationRecord* ar = malloc(sizeof(AllocationRecord));
ar->visible_bytes = malloc(length);
ar->invisible_bytes = calloc(length, 1);
ar->length = length;
return {ar->visible_bytes, ar};
}
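The `{ar->visible_bytes, ar}` return value above is pair notation rather than valid C. As a minimal compilable sketch of the same idea, the pair can be modelled as a small struct. The names `FatPtr` and `filc_malloc_sketch` are hypothetical and not part of Fil-C, and aborting on allocation failure is an assumption made to keep the sketch short:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct AllocationRecord {
    char*  visible_bytes;
    char*  invisible_bytes;
    size_t length;
} AllocationRecord;

/* Hypothetical helper type standing in for the {p, par} pair notation. */
typedef struct FatPtr {
    void*             ptr;
    AllocationRecord* ar;
} FatPtr;

/* Simplified model of filc_malloc: three allocations per request. */
FatPtr filc_malloc_sketch(size_t length) {
    /* Assumption: a zero-length request still gets a 1-byte backing
       allocation, so a NULL result always means allocation failure. */
    size_t backing = length ? length : 1;
    AllocationRecord* ar = malloc(sizeof *ar);
    assert(ar != NULL);
    ar->visible_bytes   = malloc(backing);
    ar->invisible_bytes = calloc(backing, 1);  /* zeroed: no pointers stored yet */
    assert(ar->visible_bytes != NULL && ar->invisible_bytes != NULL);
    ar->length = length;
    FatPtr result = { ar->visible_bytes, ar };
    return result;
}
```

A caller would then keep `result.ptr` in the original pointer variable and `result.ar` in its accompanying `AllocationRecord*` variable.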
When a pointer variable is dereferenced, the accompanying AllocationRecord* is used to perform bounds checks:
| Original Source | After Fil-C Transform |
|---|---|
| `x = *p1;`<br>`...`<br>`*p2 = x;` | `assert(p1ar != NULL);`<br>`uint64_t i = (char*)p1 - p1ar->visible_bytes;`<br>`assert(i < p1ar->length);`<br>`assert((p1ar->length - i) >= sizeof(*p1));`<br>`x = *p1;`<br>`...`<br>`assert(p2ar != NULL);`<br>`uint64_t i = (char*)p2 - p2ar->visible_bytes;`<br>`assert(i < p2ar->length);`<br>`assert((p2ar->length - i) >= sizeof(*p2));`<br>`*p2 = x;` |
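The three assertions in the transformed code can be read as a single reusable check. The sketch below factors them into a helper; `filc_access_ok` and `filc_check_access` are hypothetical names chosen for illustration. It relies on unsigned wraparound: a pointer below `visible_bytes` produces a huge offset, which then fails the `i < length` test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct AllocationRecord {
    char*  visible_bytes;
    char*  invisible_bytes;
    size_t length;
} AllocationRecord;

/* Hypothetical helper: returns 1 if reading or writing `size` bytes at `p`
   stays inside the allocation described by `ar`, and 0 otherwise. */
int filc_access_ok(const void* p, const AllocationRecord* ar, size_t size) {
    if (ar == NULL) return 0;                   /* no provenance at all */
    uintptr_t i = (uintptr_t)p - (uintptr_t)ar->visible_bytes;
    if (i >= ar->length) return 0;              /* start is out of bounds */
    return (ar->length - i) >= size;            /* end must also be in bounds */
}

/* Aborting variant, matching the assert-based transform shown above. */
void filc_check_access(const void* p, const AllocationRecord* ar, size_t size) {
    assert(filc_access_ok(p, ar, size));
}
```

Because `filc_free` sets `length` to 0, the same check also rejects any access through a record whose memory has already been freed.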
Things become more interesting when the value being stored or loaded is itself a pointer. As already seen, local variables of pointer type have their accompanying AllocationRecord* variable inserted by the compiler, which the compiler can do because it has full control and visibility of all local variables. Once pointers exist in the heap rather than just in local variables, things become harder, but this is where invisible_bytes comes in: if there is a pointer at visible_bytes + i, then its accompanying AllocationRecord* is at invisible_bytes + i. In other words, invisible_bytes is an array with element type AllocationRecord*. To ensure sane access to this array, i must be a multiple of sizeof(AllocationRecord*). The extra logic for this is the alignment assertion and the accompanying load or store through invisible_bytes:
| Original | After Fil-C Transform |
|---|---|
| `p2 = *p1;`<br>`...`<br>`*p1 = p2;` | `assert(p1ar != NULL);`<br>`uint64_t i = (char*)p1 - p1ar->visible_bytes;`<br>`assert(i < p1ar->length);`<br>`assert((p1ar->length - i) >= sizeof(*p1));`<br>`assert((i % sizeof(AllocationRecord*)) == 0);`<br>`p2 = *p1;`<br>`p2ar = *(AllocationRecord**)(p1ar->invisible_bytes + i);`<br>`...`<br>`assert(p1ar != NULL);`<br>`uint64_t i = (char*)p1 - p1ar->visible_bytes;`<br>`assert(i < p1ar->length);`<br>`assert((p1ar->length - i) >= sizeof(*p1));`<br>`assert((i % sizeof(AllocationRecord*)) == 0);`<br>`*p1 = p2;`<br>`*(AllocationRecord**)(p1ar->invisible_bytes + i) = p2ar;` |
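To make the parallel-array idea concrete, here is a hedged sketch of a pointer store and load written as ordinary functions. `FatPtr`, `filc_store_ptr`, and `filc_load_ptr` are hypothetical names. The sketch assumes `sizeof(void*) == sizeof(AllocationRecord*)` and uses `memcpy` instead of the casts in the table to sidestep aliasing concerns:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct AllocationRecord {
    char*  visible_bytes;
    char*  invisible_bytes;
    size_t length;
} AllocationRecord;

/* Hypothetical pair type: a pointer plus its accompanying record. */
typedef struct FatPtr {
    void*             ptr;
    AllocationRecord* ar;
} FatPtr;

/* Runs the bounds and alignment checks, then returns the byte offset i. */
static size_t checked_slot_offset(const void* slot, const AllocationRecord* ar) {
    assert(ar != NULL);
    uintptr_t i = (uintptr_t)slot - (uintptr_t)ar->visible_bytes;
    assert(i < ar->length);
    assert((ar->length - i) >= sizeof(void*));
    assert((i % sizeof(AllocationRecord*)) == 0);
    return (size_t)i;
}

/* Models `*slot = value`: the address goes to visible_bytes + i and the
   record pointer goes to invisible_bytes + i. */
void filc_store_ptr(void* slot, AllocationRecord* slot_ar, FatPtr value) {
    size_t i = checked_slot_offset(slot, slot_ar);
    memcpy(slot_ar->visible_bytes + i, &value.ptr, sizeof value.ptr);
    memcpy(slot_ar->invisible_bytes + i, &value.ar, sizeof value.ar);
}

/* Models `value = *slot`: both halves are read back from the same offset. */
FatPtr filc_load_ptr(const void* slot, const AllocationRecord* slot_ar) {
    size_t i = checked_slot_offset(slot, slot_ar);
    FatPtr out;
    memcpy(&out.ptr, slot_ar->visible_bytes + i, sizeof out.ptr);
    memcpy(&out.ar, slot_ar->invisible_bytes + i, sizeof out.ar);
    return out;
}
```

A slot that has only ever been written as plain data keeps a NULL record in `invisible_bytes`, so a pointer loaded from it fails the `ar != NULL` assertion on first dereference.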
One thing we've not yet seen is filc_free, which does something like:
void filc_free(void* p, AllocationRecord* par) {
if (p != NULL) {
assert(par != NULL);
assert(p == par->visible_bytes);
free(par->visible_bytes);
free(par->invisible_bytes);
par->visible_bytes = NULL;
par->invisible_bytes = NULL;
par->length = 0;
}
}
The eagle-eyed will note that filc_malloc made three allocations, but filc_free only frees two of them: the AllocationRecord object isn't freed by filc_free. This gap gets covered by the addition of a garbage collector (GC). You heard that right - this is C/C++ with a GC. The production-quality Fil-C has a parallel concurrent incremental collector, but a stop-the-world collector suffices for a simple model. The collector traces through AllocationRecord objects, and frees any unreachable ones. It also does two more things:
1. When freeing an unreachable AllocationRecord, call filc_free on it.
2. If an AllocationRecord has length 0, any pointers to that AllocationRecord will be changed to point at a single canonical AllocationRecord with length 0.

Point 1 means that if you're using Fil-C, forgetting to call free is no longer a memory leak: the memory will be automatically freed by the GC. That isn't to say that calling free is useless, as it allows memory to be freed earlier than the GC might otherwise choose to. Point 2 means that after calling free on something, the accompanying AllocationRecord will eventually become unreachable, and thus itself eventually be freed.
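A stop-the-world collector for the simplified model might look roughly like the sketch below. Everything beyond the three original `AllocationRecord` fields is an assumption made for illustration: the `next_all` and `marked` bookkeeping fields, the explicit root array, and the function names are all hypothetical, and the recursive marking would need a worklist in anything beyond a toy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct AllocationRecord {
    char*  visible_bytes;
    char*  invisible_bytes;
    size_t length;
    struct AllocationRecord* next_all;  /* hypothetical: list of all records */
    int    marked;                      /* hypothetical: mark bit */
} AllocationRecord;

AllocationRecord* all_records = NULL;   /* every record the GC may free */
AllocationRecord  canonical_freed;      /* shared length-0 record (zeroed) */

AllocationRecord* gc_new_record(size_t length) {
    size_t backing = length ? length : 1;
    AllocationRecord* ar = malloc(sizeof *ar);
    assert(ar != NULL);
    ar->visible_bytes   = malloc(backing);
    ar->invisible_bytes = calloc(backing, 1);
    assert(ar->visible_bytes != NULL && ar->invisible_bytes != NULL);
    ar->length   = length;
    ar->marked   = 0;
    ar->next_all = all_records;
    all_records  = ar;
    return ar;
}

/* The part of filc_free that acts on the record itself. */
void gc_release_bytes(AllocationRecord* ar) {
    free(ar->visible_bytes);
    free(ar->invisible_bytes);
    ar->visible_bytes = ar->invisible_bytes = NULL;
    ar->length = 0;
}

static void gc_mark(AllocationRecord** slot) {
    AllocationRecord* ar = *slot;
    if (ar == NULL || ar == &canonical_freed) return;
    if (ar->length == 0) { *slot = &canonical_freed; return; }  /* point 2 */
    if (ar->marked) return;
    ar->marked = 1;
    /* invisible_bytes is an array of AllocationRecord*: trace each entry. */
    AllocationRecord** inner = (AllocationRecord**)ar->invisible_bytes;
    size_t n = ar->length / sizeof(AllocationRecord*);
    for (size_t k = 0; k < n; k++) gc_mark(&inner[k]);
}

/* Marks from the given root slots, sweeps, and returns the number of
   records freed. */
size_t gc_collect(AllocationRecord** roots[], size_t nroots) {
    for (size_t r = 0; r < nroots; r++) gc_mark(roots[r]);
    size_t freed = 0;
    AllocationRecord** link = &all_records;
    while (*link != NULL) {
        AllocationRecord* ar = *link;
        if (ar->marked) {
            ar->marked = 0;
            link = &ar->next_all;
        } else {
            *link = ar->next_all;
            gc_release_bytes(ar);   /* point 1: free the bytes too */
            free(ar);
            freed++;
        }
    }
    return freed;
}
```

In the real system the roots would be the compiler-inserted `AllocationRecord*` locals and globals rather than an array supplied by the caller.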
Once a GC is present, it becomes tempting to use it more. One such use is making it safe to take the address of local variables, even if the resultant pointer is used after the local variable goes out of scope. If the compiler sees that a local variable has its address taken, and cannot prove that the address doesn't escape beyond the lifetime of the local variable, then the Fil-C transform will promote that local variable to be heap-allocated via malloc rather than stack-allocated. A matching free doesn't need to be inserted, as the GC will pick it up.
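As an illustration of that promotion, consider a function that returns the address of a local, `int* make_counter(void) { int n = 0; return &n; }`, which yields a dangling pointer in ordinary C. The sketch below shows roughly what the transformed version could look like in the simplified model. `FatPtr` and `make_counter_transformed` are hypothetical names, and the real transform operates on LLVM IR rather than on source:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct AllocationRecord {
    char*  visible_bytes;
    char*  invisible_bytes;
    size_t length;
} AllocationRecord;

/* Hypothetical pair type: a pointer plus its accompanying record. */
typedef struct FatPtr {
    void*             ptr;
    AllocationRecord* ar;
} FatPtr;

/* Transformed form of: int* make_counter(void) { int n = 0; return &n; } */
FatPtr make_counter_transformed(void) {
    /* `n` has its address taken and the address escapes, so it is
       promoted from the stack to a heap allocation with its own record. */
    AllocationRecord* nar = malloc(sizeof *nar);
    assert(nar != NULL);
    nar->visible_bytes   = malloc(sizeof(int));
    nar->invisible_bytes = calloc(sizeof(int), 1);
    assert(nar->visible_bytes != NULL && nar->invisible_bytes != NULL);
    nar->length = sizeof(int);

    int* n = (int*)nar->visible_bytes;
    *n = 0;                      /* the original initializer */

    /* No matching free is inserted: in the model, the GC reclaims the
       allocation once nothing refers to its record. */
    FatPtr result = { n, nar };
    return result;
}
```

The returned pointer stays valid for as long as the caller holds it, which is why the use-after-scope pattern becomes safe.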
The final thing I want to highlight is the Fil-C version of memmove. This function from the C standard library manipulates arbitrary memory, and the compiler has no knowledge of what pointers might be present in that memory. To get past this problem, a reasonable heuristic is used: any pointers within arbitrary memory need to be completely within arbitrary memory, and need to be correctly aligned. This has the interesting consequence that memmove of eight aligned bytes behaves differently to eight separate 1-byte memmoves of the constituent bytes: the former will also memmove the corresponding range of invisible_bytes, whereas the latter will not.
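The heuristic can be sketched as follows. This is an assumption-laden model rather than Fil-C's actual memmove: `filc_memmove_sketch` is a hypothetical name, it assumes `sizeof(void*) == sizeof(AllocationRecord*)`, and it leaves the destination's `invisible_bytes` untouched whenever no whole aligned pointer slot is covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct AllocationRecord {
    char*  visible_bytes;
    char*  invisible_bytes;
    size_t length;
} AllocationRecord;

void filc_memmove_sketch(void* dst, AllocationRecord* dar,
                         const void* src, const AllocationRecord* sar,
                         size_t n) {
    if (n == 0) return;
    assert(dar != NULL && sar != NULL);
    uintptr_t di = (uintptr_t)dst - (uintptr_t)dar->visible_bytes;
    uintptr_t si = (uintptr_t)src - (uintptr_t)sar->visible_bytes;
    assert(di < dar->length && (dar->length - di) >= n);
    assert(si < sar->length && (sar->length - si) >= n);

    /* The visible bytes always move, exactly like plain memmove. */
    memmove(dar->visible_bytes + di, sar->visible_bytes + si, n);

    /* Records move only for pointer-sized slots that are aligned in both
       source and destination and lie completely inside the range. */
    const size_t w = sizeof(AllocationRecord*);
    if (di % w != si % w) return;        /* slots can never line up */
    size_t head = (w - si % w) % w;      /* bytes before first aligned slot */
    if (head >= n) return;               /* no aligned slot starts in range */
    size_t slots = (n - head) / w;       /* whole slots that fit */
    memmove(dar->invisible_bytes + di + head,
            sar->invisible_bytes + si + head,
            slots * w);
}
```

Copying a pointer one byte at a time therefore preserves its address but not its record, so the copy cannot be dereferenced.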
That wraps up the simplified model. Some of the additional complications in the production-quality version include:
- With multiple threads, filc_free can't immediately free anything, as the free-ing thread might be racing with a different thread trying to access the underlying memory. Atomic operations on pointers also need some extra magic, as the default rewriting of a pointer load or store is to two loads or stores, which breaks atomicity.
- An additional field in AllocationRecord is used to denote that the visible_bytes pointer is a pointer to executable code rather than regular data. Calls through a function pointer p1 check that p1 == p1ar->visible_bytes and that p1ar denotes a function pointer. To avoid type confusion attacks on function pointers, the function calling ABI also needs to verify that the type signature is correct. One way of doing this is to make all functions take the same type signature: all parameters are passed as if they were packed into a structure and passed through memory, and at ABI boundaries, every function expects to receive just a single AllocationRecord corresponding to that structure.
- It is tempting to have filc_malloc avoid immediately allocating invisible_bytes, and instead allocate it on-demand later should it ever be required. It is also tempting to colocate the AllocationRecord and visible_bytes into a single allocation. If the underlying malloc prepends metadata to every allocation, it looks tempting to put that metadata in AllocationRecord instead.

With the baseline understanding in place, I want to finish on a question: when might you want to use Fil-C? Personally, my answers are:
If p1 and p2 have the same type, is it valid for a compiler to rewrite if (p1 == p2) { f(p1); } to if (p1 == p2) { f(p2); }? In Fil-C, the answer is clearly "no", as it changes which AllocationRecord* gets passed along to f. This makes Fil-C a useful example of a concrete system which has pointer provenance.
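A small sketch can make the provenance point tangible. It builds two pointers with the same address but different records: one is one-past-the-end of a first region, the other is the start of an adjacent region. `f_can_read` and `provenance_demo` are hypothetical names, and the two records are carved out of a single buffer purely to guarantee adjacency:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct AllocationRecord {
    char*  visible_bytes;
    char*  invisible_bytes;
    size_t length;
} AllocationRecord;

/* Hypothetical stand-in for f(p): reports whether a 1-byte read through
   the pair {p, par} would pass the bounds check. */
int f_can_read(const char* p, const AllocationRecord* par) {
    if (par == NULL) return 0;
    uintptr_t i = (uintptr_t)p - (uintptr_t)par->visible_bytes;
    return i < par->length;
}

/* Returns 1 when two equal pointers behave differently under f. */
int provenance_demo(void) {
    static char buf[16];
    AllocationRecord a = { buf,     NULL, 8 };  /* covers buf[0..8)  */
    AllocationRecord b = { buf + 8, NULL, 8 };  /* covers buf[8..16) */
    char* p1 = buf + 8;   /* one past the end of a */
    char* p2 = buf + 8;   /* first byte of b */
    assert(p1 == p2);     /* the addresses compare equal */
    /* f(p1) would trap while f(p2) would succeed, so swapping one for
       the other inside `if (p1 == p2)` changes behaviour. */
    return !f_can_read(p1, &a) && f_can_read(p2, &b);
}
```

Under these assumptions `provenance_demo()` returns 1: the comparison sees only addresses, while the dereference check also sees which record travelled with the pointer.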