Meanwhile, game engines need operator overloading for adding and multiplying vectors (spatial transforms, lighting, physics), and Zig's core design philosophy prevents operator overloading.
Blind leading the blind. Disclaimer - I do professional rendering engines.
This is a frustrating decision. My use cases for low level languages overlap closely with my use cases for vectors (etc) with operator overloading. It was one of the first things which put a bad taste in my mouth about Zig.
https://6it.dev/blog/infographics-operation-costs-in-cpu-clo...
The general technique of SoA is pretty useful both in games and other applications, but of course I cannot speak to the specific use-case you are describing.
Also, does one really need operator overloading? That feels a little strong. I've gotten by with functions just fine. Does that make the GPU not like me, Mr. wise engineer?
Zig professes to be a C replacement, not a C++ replacement, so leaving out operator overloading is consistent with that design goal. But I agree, I would prefer to program in a language that expresses mathematical relationships more naturally.
/edit
Yes, as confirmed with cURL, using my browser's "User Agent": 410 blocked. Using some other "User Agent" and it passes along the data. Pretty silly, IMHO.
(physics_data.velocity + omega * change) * frame_delta_time
to
physics_data.velocity.add(omega.mul(change)).mul(frame_delta_time)
We learn to read and think about math a certain way, which is incompatible with Zig. Also, Zig's design philosophy of "reading code over writing code" is incompatible with the kind of small modification-test-cycles required when doing games, and creative programming in general. So Zig is sort of DOA anyway for that kind of thing.
But I've been using Zig for non-game projects and it's been fantastic, so definitely not "Blind leading the blind" for the overall language design, imo.
x = x.add(step.mul(2)).mod(width)
Or in C x = imod(iadd(x, imul(step, 2)), width)
vs x = (x + 2*step) % width
For me the answer is very simple: Operators make it easier to read the code, which makes it easier to spot bugs. It also makes it easier to turn formulas from textbooks into code. If 50% of the code you're working with is using vectors and matrices, not having operators for those parts is quite annoying.
Note that you can have vector operators without overloading, e.g. Odin has built in vector and matrix types.
But personally I think it's better to give the user more power instead of only letting the compiler author pick which types to allow operators on. Like how Java overloads + but only on the String class. Why do they get to do it, but not me?
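For what it's worth, here is a minimal sketch of what built-in vector operators look like in Zig today, with no overloading involved. It assumes a recent Zig (0.11 or later, where `@splat` takes its length from the result type); the `Vec3` alias and the `step` helper are made up for illustration, and the variable names just mirror the expression quoted earlier in the thread.

```zig
const std = @import("std");

// Built-in SIMD vector type: elementwise + and * work without overloading.
const Vec3 = @Vector(3, f32);

// Hypothetical helper mirroring (velocity + omega * change) * frame_delta_time.
fn step(velocity: Vec3, omega: Vec3, change: Vec3, frame_delta_time: f32) Vec3 {
    // A scalar is not broadcast implicitly; it has to be splatted first.
    const dt: Vec3 = @splat(frame_delta_time);
    return (velocity + omega * change) * dt;
}

test "vector arithmetic with built-in operators" {
    const velocity = Vec3{ 1, 2, 3 };
    const omega = Vec3{ 2, 2, 2 };
    const change = Vec3{ 1, 1, 1 };
    const result = step(velocity, omega, change, 0.5);
    try std.testing.expectEqual(Vec3{ 1.5, 2, 2.5 }, result);
}
```

This only covers elementwise arithmetic on fixed-length vectors of numbers. Matrices, quaternions and dot or cross products still need named functions.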
That being said, the parent commenter is actually referring to other recent proposals as opposed to existing `@Vector` functionality:
const @"<+>" = @import("operator_module").plus;
...
const x = (a <+> b);
math("(v + Ω * c) * Δt", .{ .v = physics_data.velocity, .@"Ω" = omega, .c = change, .@"Δt" = frame_delta_time })
I know this is already possible with comptime, though I haven't implemented it yet since I haven't needed vector math in what I'm working on currently. Can't decide whether using math names is better or worse than using the full variable names though.
I mean as an avid Lisp fan, I feel like Lisp basically answers the question of how much syntax you need in a language. I must admit though, not having to deal with operator precedence is really nice
(mod (+ x (* 2 step)) width)
One is to allow the use of simple mathematical symbols as names for functions, instead of allowing only alphanumeric identifiers.
Most programming languages allow only a small fixed set of symbols to be used as "operators", i.e. as function names.
The better solution is to allow any Unicode character from certain categories, e.g. "Sm" and "Po" ("Symbol, math" and "Punctuation, other"), which does not have an already assigned role in the language syntax, to be used as a function name.
Most LISP variants allow the use of various kinds of character symbols as function names.
The second problem is overloading. Overloading must be treated uniformly for all kinds of functions, regardless of whether their names are identifiers or operator symbols, i.e. not like in Java, where forbidding operator overloading was a mistake (an overreaction to C++, which allows the overloading of a few "operators" that are not normal functions and whose overloading should not have been allowed, e.g. the comma operator).
The overloading of operators, especially for user-defined data types, is absolutely essential for scientific and technical computing.
The majority of programmers have not been exposed to programs that contain a great amount of computation, so they are accustomed only to simple expressions that contain a few variables.
In scientific and technical computing it is very frequent to have very big expressions, which may contain a large number of operations and variables, where the variables may have various types, like complex numbers, vectors, matrices, complex vectors, complex matrices, or there may be a type system with distinct types for various physical quantities, like voltages, electric currents, capacitances and so on.
Anyone who has had to write such big expressions frequently will definitely prefer, both for writing and for reading, to use overloaded operator symbols instead of long function names, which would fill most of the visual space with superfluous characters, obscuring the structure of the big expression.
The third problem is the syntax of function invocation. Most programming languages allow functions whose names are identifiers to use only prefix invocation but for some symbolic operators they allow infix invocation.
Here I also prefer the languages that do not differentiate between functions with alphanumeric names and functions with symbolic names (i.e. operators). There are languages where for any function it may be specified that it must be invoked as an infix operator, if this is desired.
Which of the 3 classic solutions for expression syntax is best is debatable: traditional expressions with infix operators and multi-level precedence rules (like in FORTRAN and ALGOL), expressions with infix operators and a unique precedence rule for all operators (like in APL), or expressions without infix operators (like in LISP).
Each of the 3 solutions has advantages and disadvantages, so the choice between them is a matter of personal preferences.
I'd argue though that the real disadvantage to having overloadable arithmetic is that you're limited to one implementation. This is actually my biggest beef with Rust, namely traits/type classes. It locks you into a single implementation when you may want to do something different based on the context. Zig pushes the dispatch decision to the callsite, not a trait subsystem (see how Zig implements hash maps for example). So I'd personally prefer to use a DSL, since it lets me specify what type of dispatch to use.
> It locks you into a single implementation when you may want to do something different based on the context.
If you want differing behavior in a certain context, and if you don't want to use a different method to make the differing behavior explicit (e.g. the `wrapping_add` methods that Rust provides on numeric types), then you can use a different type for that context, e.g. the `std::num::Wrapping` type that Rust provides.
In general perhaps not, but in Zig it definitely does. Zig considers calling a function to change control flow, because it's no longer just an operator but something that can cause side effects, including mutating in place. Perhaps control flow isn't the right term, maybe non-trivial would be better?
With regard to wrappers, I personally find them ugly since 1. They bring in indirection, and I have a personal vendetta against unnecessary indirection, 2. Wrapping doesn't compose well and is a pain to shepherd between representations, 3. It's harder to make a function generic across different representations, and 4. Wrappers often don't re-export everything available to their underlying value.
Andreas Hohmann May 04, 2024 #software #zig #data oriented programming #comptime
If you want to see the power of Zig's comptime, look no further than MultiArrayList. This collection stores the data of a list of structs as a struct of arrays (SoA) rather than an array of structs (AoS). This technique is a staple of data-oriented design and array-oriented programming (long live APL) as used in high-performance applications such as game engines, scientific computing, and compilers. The latter is probably the reason why it already exists in Zig's standard library (see Andrew Kelley's Practical Data-Oriented Design presentation).
Generating a struct-of-arrays type for a struct is a non-trivial type manipulation that I expect from type-oriented languages such as TypeScript or Scala, but not from a low-level system programming language such as Zig. How does Zig pull this off? It turns out that it's "just" compile-time execution and reflection.
Types in Zig are compile time values. They can be assigned to constants, passed as arguments to (compile-time) functions, and returned by those functions. This is already visible in the syntax for type definitions. A named struct type, for example, is defined by initializing a constant with an anonymous struct declaration.
const std = @import("std");
const testing = std.testing;

const Token = struct {
    kind: enum { id, string, number },
    data: []const u8,
};

test "token size" {
    try testing.expectEqual(24, @sizeOf(Token));
    try testing.expectEqual(2400, @sizeOf([100]Token));
}
As the test shows, this structure needs 24 bytes on my 64 bit machine: 2*8 bytes for the data slice (pointer and length) and 8 bytes for the kind enum including its alignment padding. An array of 100 Token structs needs 2400 bytes.
The MultiArrayList cuts the memory needed down to 1700 bytes: 1600 bytes for the data slices and 100 bytes for the kinds. The test below demonstrates this by using a fixed buffer allocator with exactly 1704 bytes (I don't know why the 4 extra bytes are needed).
const TokenList = std.MultiArrayList(Token);

test "token lists" {
    var buffer: [1704]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buffer);
    const allocator = fba.allocator();

    var tokens = TokenList{};
    try tokens.setCapacity(allocator, 100);
    defer tokens.deinit(allocator);

    tokens.appendAssumeCapacity(.{ .kind = .number, .data = "1000" });
    try testing.expectEqual(Token{ .kind = .number, .data = "1000" }, tokens.get(0));
}
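The payoff of this layout shows up when only one field is needed. The sketch below (a minimal example, not taken from the standard library's own tests) uses `items(.kind)` to get a slice over just the kinds and counts the number tokens. It uses `testing.allocator` instead of the fixed buffer for brevity.

```zig
const std = @import("std");
const testing = std.testing;

const Token = struct {
    kind: enum { id, string, number },
    data: []const u8,
};

test "iterate a single column" {
    var tokens = std.MultiArrayList(Token){};
    defer tokens.deinit(testing.allocator);

    try tokens.append(testing.allocator, .{ .kind = .id, .data = "x" });
    try tokens.append(testing.allocator, .{ .kind = .number, .data = "1000" });
    try tokens.append(testing.allocator, .{ .kind = .number, .data = "42" });

    // items(.kind) is a slice over the contiguous kinds only;
    // the 16-byte data slices are never touched by this loop.
    var numbers: usize = 0;
    for (tokens.items(.kind)) |kind| {
        if (kind == .number) numbers += 1;
    }

    try testing.expectEqual(@as(usize, 3), tokens.len);
    try testing.expectEqual(@as(usize, 2), numbers);
}
```

With an array of structs, the same loop would stride over 24 bytes per token to read a single byte of each.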
How does this work under the hood? Types are passed to a function using a compile-time (comptime) parameter of type type. This is the foundation for Zig's generic types such as the collections in the standard library. Here is a minimal version of an array list of fixed size (see array_list.zig for the real thing).
// Assumes `std` and `testing` are already in scope.
const Allocator = std.mem.Allocator;

// Example element type used by the test below.
const Point = struct { x: i32, y: i32 };

pub fn FixedArrayList(comptime T: type) type {
    return struct {
        const Self = @This();

        items: []T,
        allocator: Allocator,

        pub fn init(allocator: Allocator, length: usize) Allocator.Error!Self {
            return .{
                .allocator = allocator,
                .items = try allocator.alloc(T, length),
            };
        }

        pub fn deinit(self: *Self) void {
            self.allocator.free(self.items);
        }
    };
}

test "allocates array list" {
    const allocator = testing.allocator;

    const n = 10;
    const PointList = FixedArrayList(Point);
    var points = try PointList.init(allocator, n);
    defer points.deinit();

    points.items[5] = .{ .x = 10, .y = 20 };
    // The list has no accessor method, so read the slice directly.
    try testing.expectEqual(20, points.items[5].y);
}
This is not a very useful structure as it only wraps a slice that one could use in most cases directly, but it demonstrates a simple type generator function. The type argument T is only used once for the type of the items slice, and we don't need to know anything about the inner structure of T.
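A type generator does not have to return a struct. As a smaller sketch of the same mechanism, the hypothetical `Smallest` function below picks between two existing integer types based on a compile-time argument, which shows that a type really is just a value that comptime code can compute and compare.

```zig
const std = @import("std");

// A type stored in a constant, like any other compile-time value.
const Id = u32;

// Hypothetical type function: returns the narrower type when the
// largest value to be stored fits into a byte.
fn Smallest(comptime max_value: comptime_int) type {
    return if (max_value <= 255) u8 else u32;
}

test "types are compile-time values" {
    try std.testing.expect(Id == u32);
    try std.testing.expect(Smallest(100) == u8);
    try std.testing.expect(Smallest(1000) == u32);
    try std.testing.expectEqual(1, @sizeOf(Smallest(100)));
}
```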
To construct the struct-of-arrays for a given struct, we have to go a step further. Our type construction function must be able to look at the original struct's fields and their types to construct the new struct-of-arrays type. Zig offers a compile-time reflection API for this purpose.
To get a feel for this API, let's define a type constructor for points with explicit coordinates x1, x2, x3, and so forth. Our type function takes the type T of the coordinates and the dimension n and returns a struct with one field for each of the dimensions.
pub fn PointN(comptime T: type, comptime N: comptime_int) type {
    // One StructField entry per coordinate: x1, x2, ..., xN.
    var fields: [N]std.builtin.Type.StructField = undefined;
    for (0..N) |i| {
        var num_buf: [128]u8 = undefined;
        fields[i] = .{
            .name = std.fmt.bufPrintZ(&num_buf, "x{d}", .{i + 1}) catch unreachable,
            .type = T,
            .default_value = null,
            .is_comptime = false,
            .alignment = @alignOf(T),
        };
    }
    return @Type(.{
        .Struct = .{
            .is_tuple = false,
            .layout = .Auto,
            .decls = &.{},
            .fields = &fields,
        },
    });
}

test "n-dimensional point" {
    const P3 = PointN(i32, 3);
    const p = P3{ .x1 = 10, .x2 = 20, .x3 = 30 };
    try testing.expectEqual(20, p.x2);
}
We construct the struct type dynamically using the @Type function. The metadata of the fields is an array of StructField objects. We are still not using the inner structure of the type T (besides its alignment obtained with @alignOf), but we can see what kind of metadata Zig is using to describe structs and fields. The final missing piece is a way to obtain this metadata for a given type. Zig's @typeInfo function does just that.
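As a minimal sketch of reading that metadata, the example below walks the fields of the Token struct from earlier. It goes through `std.meta.fields`, a thin standard-library wrapper over `@typeInfo`, because the spelling of the `@typeInfo` union tags has changed between Zig releases. The `countFieldsOfType` helper is hypothetical.

```zig
const std = @import("std");

const Token = struct {
    kind: enum { id, string, number },
    data: []const u8,
};

// Hypothetical helper: counts the fields of struct T whose type is F.
fn countFieldsOfType(comptime T: type, comptime F: type) usize {
    var n: usize = 0;
    // inline for unrolls the loop, so field.type is known at compile time.
    inline for (std.meta.fields(T)) |field| {
        if (field.type == F) n += 1;
    }
    return n;
}

test "inspect struct fields" {
    const fields = comptime std.meta.fields(Token);
    try std.testing.expectEqual(@as(usize, 2), fields.len);
    try std.testing.expectEqualStrings("kind", fields[0].name);
    try std.testing.expectEqualStrings("data", fields[1].name);
    try std.testing.expect(fields[1].type == []const u8);
    try std.testing.expectEqual(@as(usize, 1), countFieldsOfType(Token, []const u8));
}
```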
The MultiArrayList stores the data in a single byte array. This array is the concatenation of the arrays for the fields of the original structure. To optimize alignment, these sub-arrays are included in descending order of their alignment (the alignment of the corresponding field). The generated struct-of-arrays type keeps the field sizes and field index permutation (due to the sorting) in the sizes structure. This structure is used to compute the overall allocation size (capacityInBytes) and the offsets for the field slices.
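The size computation can be approximated in a few lines. The `soaBytes` helper below is a hypothetical stand-in for the idea behind capacityInBytes, not the standard library's code: it sums, over all fields, the field size times the capacity. It assumes a 64-bit target and reproduces the 1700 bytes quoted above for 100 tokens.

```zig
const std = @import("std");

const Token = struct {
    kind: enum { id, string, number },
    data: []const u8,
};

// Hypothetical helper: bytes needed to store `capacity` items when each
// field gets its own contiguous array.
fn soaBytes(comptime T: type, capacity: usize) usize {
    var total: usize = 0;
    inline for (std.meta.fields(T)) |field| {
        total += @sizeOf(field.type) * capacity;
    }
    return total;
}

test "struct-of-arrays size" {
    // 64-bit target assumed: 16 bytes per slice, 1 byte per enum tag.
    try std.testing.expectEqual(@as(usize, 1700), soaBytes(Token, 100));
    // The array-of-structs layout pays for 7 padding bytes per element.
    try std.testing.expectEqual(@as(usize, 2400), @sizeOf([100]Token));
}
```

No padding is needed between the sub-arrays in this sketch because each array's length in bytes is a multiple of its element alignment, and the descending order means every later array has a smaller or equal alignment than the one before it.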
The rest of the MultiArrayList code implements the various methods for adding and removing items, resizing, and conversion to a slice of structs. Considering the low-level pointer and index gymnastics, it is fairly readable.
Zig's compile-time type construction has its limits. It's not possible, for example, to generate methods dynamically, in contrast to the field construction that we used for the PointN structure. However, this is a current restriction of the reflection API and not a fundamental issue of the type generation functions, and this restriction may be lifted in the future. The main benefit of the comptime function approach is that we don't have to learn a new language (such as Rust's procedural macros), even though we do have to learn the reflection API.