
Emit LLVM bitcode without using LLVM #19031

Merged: 57 commits merged into ziglang:master on Feb 24, 2024

Conversation

antlilja (Contributor)

This PR should close #13265

With this PR, bitcode is created through Builder.zig; this bitcode is then parsed into a module by LLVM via LLVMParseBitcodeInContext2 and compiled into object files as before. The LLVM DIBuilder has been replaced by a system in Builder.zig. This PR also removes all uses of the LLVM library inside Builder.zig, along with many of the bindings in bindings.zig and zig_llvm.h/.cpp.
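
For readers unfamiliar with the LLVM-C side of this hand-off, below is a minimal sketch of the parsing step. The extern declarations and the parseBitcode wrapper are illustrative, not the actual code in bindings.zig or llvm.zig; only LLVMCreateMemoryBufferWithMemoryRange and LLVMParseBitcodeInContext2 are real LLVM-C entry points.

    const std = @import("std");

    // Opaque LLVM-C handles; the compiler wraps these in its own bindings.
    const LLVMContextRef = ?*opaque {};
    const LLVMMemoryBufferRef = ?*opaque {};
    const LLVMModuleRef = ?*opaque {};
    const LLVMBool = c_int;

    extern fn LLVMCreateMemoryBufferWithMemoryRange(
        data: [*]const u8,
        len: usize,
        name: [*:0]const u8,
        requires_null_terminator: LLVMBool,
    ) LLVMMemoryBufferRef;

    extern fn LLVMParseBitcodeInContext2(
        context: LLVMContextRef,
        mem_buf: LLVMMemoryBufferRef,
        out_module: *LLVMModuleRef,
    ) LLVMBool;

    /// Hand a Builder-produced bitcode blob to LLVM; the returned module can
    /// then go through LLVM's optimization passes and object-file emission
    /// as before. (Disposal of the memory buffer is elided for brevity.)
    fn parseBitcode(context: LLVMContextRef, bitcode: []const u32) !LLVMModuleRef {
        const bytes = std.mem.sliceAsBytes(bitcode);
        const buf = LLVMCreateMemoryBufferWithMemoryRange(bytes.ptr, bytes.len, "bitcode", 0);
        var module: LLVMModuleRef = null;
        // Returns nonzero on failure.
        if (LLVMParseBitcodeInContext2(context, buf, &module) != 0) return error.ParseFailed;
        return module;
    }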

Performance

Here are some perf stats from a ReleaseSafe build of the compiler compiling itself (on the llvm-bc branch) with an empty cache:

llvm-bc branch (ReleaseSafe):

Performance counter stats for 'zig build -Dstatic-llvm --zig-lib-dir lib --search-prefix zig-bootstrap/release/x86_64-linux-musl-native -Dno-langref -Dno-autodocs -Dtarget=x86_64-linux-musl -p stage3 install':

        209,102.81 msec task-clock:u                     #    2.236 CPUs utilized
                 0      context-switches:u               #    0.000 /sec
                 0      cpu-migrations:u                 #    0.000 /sec
         8,347,566      page-faults:u                    #   39.921 K/sec
   787,734,595,025      cycles:u                         #    3.767 GHz
     3,345,576,458      stalled-cycles-frontend:u        #    0.42% frontend cycles idle
    10,671,110,379      stalled-cycles-backend:u         #    1.35% backend cycles idle
   970,481,802,496      instructions:u                   #    1.23  insn per cycle
                                                  #    0.01  stalled cycles per insn
   195,371,479,830      branches:u                       #  934.332 M/sec
     6,637,680,252      branch-misses:u                  #    3.40% of all branches

      93.498796731 seconds time elapsed

     172.877783000 seconds user
      36.207457000 seconds sys

master branch (ReleaseSafe):

 Performance counter stats for 'zig build -Dstatic-llvm --zig-lib-dir lib --search-prefix zig-bootstrap/release/x86_64-linux-musl-native -Dno-langref -Dno-autodocs -Dtarget=x86_64-linux-musl -p stage3 install':

        220,426.12 msec task-clock:u                     #    2.026 CPUs utilized
                 0      context-switches:u               #    0.000 /sec
                 0      cpu-migrations:u                 #    0.000 /sec
         9,726,043      page-faults:u                    #   44.124 K/sec
   821,802,723,056      cycles:u                         #    3.728 GHz
     3,671,828,684      stalled-cycles-frontend:u        #    0.45% frontend cycles idle
     9,380,976,638      stalled-cycles-backend:u         #    1.14% backend cycles idle
 1,025,703,403,709      instructions:u                   #    1.25  insn per cycle
                                                  #    0.01  stalled cycles per insn
   206,035,478,692      branches:u                       #  934.714 M/sec
     6,999,582,116      branch-misses:u                  #    3.40% of all branches

     108.792765074 seconds time elapsed

     179.451902000 seconds user
      40.951977000 seconds sys

antlilja and others added 27 commits on February 21, 2024 16:24

The LLVM bitcode format requires all type references in structs to refer to earlier-defined types. We make sure types are ordered in the builder itself, to avoid iterating the types multiple times and changing the values of type indices (illustrated in the sketch after this commit list).

value_indices keeps track of the value index of each instruction in the function (i.e., it skips instructions which do not have a result).

* Added missing legacy field (unused_algebra)
* Made the struct the correct size (u32 -> u8)

This fixes a problem where empty strings were not emitted as null. It should also produce a smaller strtab, since all metadata strings were previously emitted in both the strtab and the strings block inside the metadata block.

The bitcode abbrev was missing the subrange code.
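
To illustrate the type-ordering constraint from the first commit message above, here is a standalone sketch; the Type representation and internType helper are hypothetical and are not Builder.zig's API. Interning types depth-first guarantees that a struct's element types receive smaller indices than the struct itself, so the emitted type table never needs a forward reference.

    const std = @import("std");

    // Hypothetical type representation, purely for illustration.
    const Type = union(enum) {
        int: u16, // bit width
        @"struct": []const *const Type,
    };

    /// Intern types depth-first: a struct's element types are assigned
    /// indices before the struct itself, so writing the table in index order
    /// never produces a forward reference. Recursive types are ignored here
    /// for simplicity.
    fn internType(
        table: *std.ArrayList(*const Type),
        indices: *std.AutoHashMap(*const Type, u32),
        ty: *const Type,
    ) std.mem.Allocator.Error!u32 {
        if (indices.get(ty)) |existing| return existing;
        switch (ty.*) {
            .int => {},
            .@"struct" => |fields| {
                for (fields) |field| {
                    _ = try internType(table, indices, field);
                }
            },
        }
        const idx: u32 = @intCast(table.items.len);
        try table.append(ty);
        try indices.put(ty, idx);
        return idx;
    }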
andrewrk (Member)

Nice work, @antlilja! I'm looking forward to merging this.

By the way, did you take any peak RSS measurements with these changes? I'm curious to know how it affected memory usage.

andrewrk enabled auto-merge February 24, 2024 06:16
jacobly0 disabled auto-merge February 24, 2024 12:24
jacobly0 (Member) commented Feb 24, 2024

I also have an idea for a new debug location system which should have a less clunky interface than my last one, without over-emitting debug locations, and a plan for how to deal with these "phantom" debug locations when emitting text IR. I'll ping you when it's done.

I have a fix for this, but it can emit the same debug location up to twice in LLVM IR, so I'll wait to see what your plan is.

jacobly0 merged commit b344ff0 into ziglang:master Feb 24, 2024
10 checks passed
antlilja (Contributor, Author)

Nice work, @antlilja! I'm looking forward to merging this.

By the way, did you take any peak RSS measurements with these changes? I'm curious to know how it affected memory usage.

I haven't, but I'll definitely take a look at that as well when benchmarking some improvements to debug locations.

jacobly0 (Member)

I realize these aren't completely comparable, but I compared the compiler compiling itself to unoptimized bitcode with an empty cache, from before the LLVM rewrite started to after this change lands:

3bada8e: (before rewrite)
13.08s
2.214172 GB RSS

edb6486: (after rewrite)
19.49s
2.836292 GB RSS

edb6486 + the following patch:
13.87s
1.0562 GB RSS

diff --git a/src/codegen/llvm.zig b/src/codegen/llvm.zig
index 5ea749d6d9..4eee469cc6 100644
--- a/src/codegen/llvm.zig
+++ b/src/codegen/llvm.zig
@@ -1187,14 +1187,8 @@ pub const Object = struct {
             }
         }
 
-        var bitcode_arena_allocator = std.heap.ArenaAllocator.init(
-            std.heap.page_allocator,
-        );
-        errdefer bitcode_arena_allocator.deinit();
-
-        const bitcode = try self.builder.toBitcode(
-            bitcode_arena_allocator.allocator(),
-        );
+        const bitcode = try self.builder.toBitcode(self.gpa);
+        defer self.gpa.free(bitcode);
 
         if (options.pre_bc_path) |path| {
             var file = try std.fs.cwd().createFile(path, .{});
@@ -1250,7 +1244,6 @@ pub const Object = struct {
 
             break :blk module;
         };
-        bitcode_arena_allocator.deinit();
 
         const target_triple_sentinel =
             try self.gpa.dupeZ(u8, self.builder.target_triple.slice(&self.builder).?);
diff --git a/src/codegen/llvm/Builder.zig b/src/codegen/llvm/Builder.zig
index a5aeb7dee3..38903c6289 100644
--- a/src/codegen/llvm/Builder.zig
+++ b/src/codegen/llvm/Builder.zig
@@ -14945,7 +14945,7 @@ pub fn toBitcode(self: *Builder, allocator: Allocator) bitcode_writer.Error![]co
         try strtab_block.end();
     }
 
-    return bitcode.toSlice();
+    return bitcode.toOwnedSlice();
 }
 
 const Allocator = std.mem.Allocator;
diff --git a/src/codegen/llvm/bitcode_writer.zig b/src/codegen/llvm/bitcode_writer.zig
index bfb406d087..d48a92dd40 100644
--- a/src/codegen/llvm/bitcode_writer.zig
+++ b/src/codegen/llvm/bitcode_writer.zig
@@ -40,9 +40,9 @@ pub fn BitcodeWriter(comptime types: []const type) type {
             self.buffer.deinit();
         }
 
-        pub fn toSlice(self: BcWriter) []const u32 {
+        pub fn toOwnedSlice(self: *BcWriter) Error![]const u32 {
             std.debug.assert(self.bit_count == 0);
-            return self.buffer.items;
+            return self.buffer.toOwnedSlice();
         }
 
         pub fn length(self: BcWriter) usize {

antlilja (Contributor, Author)

I realize these aren't completely comparable, but I compared the compiler compiling itself to unoptimized bitcode with an empty cache from before the llvm rewrite started to after this change lands:

3bada8e: (before rewrite) 13.08s 2.214172 GB RSS

edb6486: (after rewrite) 19.49s 2.836292 GB RSS

edb6486 + the following patch: 13.87s 1.0562 GB RSS

Yeah, now that I'm thinking about it, it's much more reasonable not to use an arena here, as it will definitely over-allocate.
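
A standalone sketch of the over-allocation being discussed (illustrative, not the compiler's code): when the growing buffer cannot be resized in place, the arena keeps every superseded buffer alive until deinit(), while a general-purpose allocator frees it immediately and toOwnedSlice() hands back exactly the used memory.

    const std = @import("std");

    pub fn main() !void {
        // Arena case: growth reallocations that cannot happen in place leave
        // the old, smaller buffers pinned until arena.deinit(), so peak
        // memory approaches the sum of all intermediate buffers.
        var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
        defer arena.deinit();
        var in_arena = std.ArrayList(u32).init(arena.allocator());
        var i: u32 = 0;
        while (i < 1 << 20) : (i += 1) try in_arena.append(i);

        // General-purpose case: each growth frees the previous buffer, and
        // toOwnedSlice() returns exactly the used memory for the caller to
        // free, mirroring the `self.gpa` version in the patch above.
        var gpa = std.heap.GeneralPurposeAllocator(.{}){};
        defer _ = gpa.deinit();
        var in_gpa = std.ArrayList(u32).init(gpa.allocator());
        i = 0;
        while (i < 1 << 20) : (i += 1) try in_gpa.append(i);
        const owned = try in_gpa.toOwnedSlice();
        defer gpa.allocator().free(owned);

        std.debug.print("owned {d} words\n", .{owned.len});
    }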

andrewrk (Member) commented Feb 24, 2024

Here is my measurement of the impact of this change building the self-hosted compiler:

Benchmark 1 (3 runs): master/zig build-exe ...
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          85.7s  ± 2.98s     82.2s  … 87.7s           0 ( 0%)        0%
  peak_rss           4.71GB ±  763KB    4.71GB … 4.71GB          0 ( 0%)        0%
  cpu_cycles          330G  ± 2.28G      328G  …  332G           0 ( 0%)        0%
  instructions        460G  ±  303M      460G  …  460G           0 ( 0%)        0%
  cache_references   23.4G  ±  184M     23.3G  … 23.6G           0 ( 0%)        0%
  cache_misses       2.01G  ±  206M     1.78G  … 2.19G           0 ( 0%)        0%
  branch_misses      2.22G  ± 2.09M     2.22G  … 2.23G           0 ( 0%)        0%
Benchmark 2 (3 runs): this-pr/zig build-exe ...
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           109s  ± 2.25s      106s  …  110s           0 ( 0%)        💩+ 26.9% ±  7.0%
  peak_rss           4.92GB ±  544KB    4.92GB … 4.92GB          0 ( 0%)        💩+  4.5% ±  0.0%
  cpu_cycles          409G  ± 3.65G      405G  …  412G           0 ( 0%)        💩+ 23.7% ±  2.1%
  instructions        592G  ±  283M      592G  …  592G           0 ( 0%)        💩+ 28.7% ±  0.1%
  cache_references   26.1G  ± 30.6M     26.1G  … 26.1G           0 ( 0%)        💩+ 11.4% ±  1.3%
  cache_misses       3.53G  ±  141M     3.38G  … 3.66G           0 ( 0%)        💩+ 75.3% ± 19.9%
  branch_misses      2.54G  ± 1.52M     2.53G  … 2.54G           0 ( 0%)        💩+ 14.0% ±  0.2%

@antlilja did you perhaps get your branches mixed up when measuring?

antlilja (Contributor, Author)

@antlilja did you perhaps get your branches mixed up when measuring?

Yeah, it's definitely possible I mixed something up or did something weird when switching between branches. That seems like the most reasonable explanation; there haven't been any changes since that benchmark that should have this huge an impact. The only major thing that's different is how we're emitting debug locations, but that's definitely not the cause of a difference this large. My bad.

andrewrk (Member) commented Feb 24, 2024

Well, I'm just glad we caught it now. Thanks for all your efforts on this branch.

Here's a new data point that includes #19069 vs old master:

Benchmark 1 (3 runs): 0.12.0-dev.2931+8d651f512/zig build-exe ...
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          84.6s  ± 2.89s     81.3s  … 86.7s           0 ( 0%)        0%
  peak_rss           4.68GB ± 70.2MB    4.64GB … 4.76GB          0 ( 0%)        0%
  cpu_cycles          330G  ± 6.41G      327G  …  338G           0 ( 0%)        0%
  instructions        464G  ± 12.6G      456G  …  478G           0 ( 0%)        0%
  cache_references   23.9G  ±  427M     23.6G  … 24.4G           0 ( 0%)        0%
  cache_misses       1.78G  ±  239M     1.51G  … 1.93G           0 ( 0%)        0%
  branch_misses      2.25G  ± 66.6M     2.21G  … 2.32G           0 ( 0%)        0%
Benchmark 2 (3 runs): master+19069/zig build-exe ...
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           105s  ± 4.07s      101s  …  109s           0 ( 0%)        💩+ 24.5% ±  9.5%
  peak_rss           4.92GB ±  141KB    4.92GB … 4.92GB          0 ( 0%)        💩+  5.0% ±  2.4%
  cpu_cycles          410G  ± 4.33G      405G  …  414G           0 ( 0%)        💩+ 24.1% ±  3.8%
  instructions        596G  ±  200M      596G  …  596G           0 ( 0%)        💩+ 28.6% ±  4.4%
  cache_references   26.3G  ± 42.8M     26.2G  … 26.3G           0 ( 0%)        💩+ 10.0% ±  2.9%
  cache_misses       3.24G  ±  303M     3.05G  … 3.59G           0 ( 0%)        💩+ 81.8% ± 34.6%
  branch_misses      2.56G  ± 1.66M     2.56G  … 2.56G           0 ( 0%)        💩+ 14.1% ±  4.8%

Increased peak RSS? I'm not sure how that happened. Now I'm starting to doubt my methodology. Why are my results so different than both @jacobly0 and @antlilja?

Jarred-Sumner (Contributor)

I wonder if toOwnedSlice is increasing memory usage, since it potentially re-allocates the shrunken array before freeing the old one? Or, if an arena allocator is in use when toOwnedSlice is called, or it's used on ArrayLists that expand a lot, that could also cause this.

antlilja (Contributor, Author)

Increased peak RSS? I'm not sure how that happened. Now I'm starting to doubt my methodology. Why are my results so different than both @jacobly0 and @antlilja?

Hmm, maybe those results differ from @jacobly0 because he only does the bitcode emission without actually compiling to a binary? But maybe I'm misinterpreting this line:

... I compared the compiler compiling itself to unoptimized bitcode with an empty cache ...

I'll do some benchmarks as a sanity check.

andrewrk (Member)

It's looking like, while the time and memory spent in Zig land are reduced, the time and memory spent in C++ land are increased even more. That's too bad. Looks like my prediction was very wrong.

That being said, it does not change the plan. This is a large step towards the long-term goal of eliminating the library dependency on LLVM and providing fast compilation speed by avoiding LLVM altogether.

andrewrk (Member)

2 more data points:

Hello World (-ODebug):

Benchmark 1 (5 runs): zig-dev build-exe hello.zig
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          1.04s  ± 19.6ms    1.03s  … 1.08s           0 ( 0%)        0%
  peak_rss            174MB ±  304KB     174MB …  175MB          0 ( 0%)        0%
  cpu_cycles         4.34G  ± 22.1M     4.31G  … 4.37G           0 ( 0%)        0%
  instructions       6.17G  ±  993K     6.17G  … 6.17G           0 ( 0%)        0%
  cache_references    279M  ± 2.07M      276M  …  281M           0 ( 0%)        0%
  cache_misses       10.3M  ± 83.3K     10.2M  … 10.4M           0 ( 0%)        0%
  branch_misses      34.7M  ± 42.3K     34.6M  … 34.7M           0 ( 0%)        0%
Benchmark 2 (5 runs): zig build-exe hello.zig
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          1.15s  ± 29.1ms    1.13s  … 1.20s           0 ( 0%)        💩+ 10.1% ±  3.5%
  peak_rss            181MB ±  160KB     180MB …  181MB          0 ( 0%)        💩+  3.7% ±  0.2%
  cpu_cycles         4.73G  ± 18.7M     4.71G  … 4.76G           0 ( 0%)        💩+  9.1% ±  0.7%
  instructions       7.15G  ± 3.55M     7.15G  … 7.16G           0 ( 0%)        💩+ 15.9% ±  0.1%
  cache_references    278M  ± 1.35M      276M  …  279M           0 ( 0%)          -  0.5% ±  0.9%
  cache_misses       10.4M  ±  222K     10.2M  … 10.7M           0 ( 0%)          +  0.8% ±  2.4%
  branch_misses      38.2M  ± 53.4K     38.1M  … 38.2M           0 ( 0%)        💩+ 10.1% ±  0.2%

Hello World (-OReleaseFast):

Benchmark 1 (3 runs): zig-dev build-exe hello.zig -OReleaseFast
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          7.61s  ± 1.60s     6.63s  … 9.45s           0 ( 0%)        0%
  peak_rss            195MB ± 26.6MB     179MB …  225MB          0 ( 0%)        0%
  cpu_cycles         32.7G  ± 7.18G     28.6G  … 41.0G           0 ( 0%)        0%
  instructions       44.7G  ± 9.70G     39.1G  … 55.9G           0 ( 0%)        0%
  cache_references   2.08G  ±  400M     1.84G  … 2.55G           0 ( 0%)        0%
  cache_misses       41.9M  ± 8.42M     36.9M  … 51.6M           0 ( 0%)        0%
  branch_misses       252M  ± 55.6M      220M  …  316M           0 ( 0%)        0%
Benchmark 2 (3 runs): zig build-exe hello.zig -OReleaseFast
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          7.80s  ± 1.50s     6.73s  … 9.51s           0 ( 0%)          +  2.5% ± 46.2%
  peak_rss            197MB ± 23.3MB     182MB …  223MB          0 ( 0%)          +  1.0% ± 29.2%
  cpu_cycles         32.8G  ± 6.73G     28.9G  … 40.6G           0 ( 0%)          +  0.3% ± 48.2%
  instructions       45.8G  ± 9.91G     40.1G  … 57.3G           0 ( 0%)          +  2.6% ± 49.8%
  cache_references   2.04G  ±  397M     1.81G  … 2.50G           0 ( 0%)          -  2.0% ± 43.3%
  cache_misses       39.6M  ± 6.53M     35.0M  … 47.1M           0 ( 0%)          -  5.5% ± 40.8%
  branch_misses       255M  ± 55.0M      223M  …  318M           0 ( 0%)          +  1.0% ± 49.7%

Note that for the near future, the plan is to use LLVM only for release builds, not debug builds, in which case the perf difference in this data point is insignificant.

andrewrk (Member) commented Feb 25, 2024

@Jarred-Sumner

i wonder if toOwnedSlice is increasing memory usage since it potentially re-allocates the shrunk array before freeing the old one? or, if arena allocator is in use when toOwnedSlice is called or on ArrayList that expand a lot that could also cause this

It's 100% guaranteed to reuse the same pointer when shrinking when using the C allocator (zig/lib/std/heap.zig, lines 121 to 123 at 31763d2):

    if (new_len <= buf.len) {
        return true;
    }

However, I did notice a related possible improvement to std.heap.raw_c_allocator: #19073. I expect this to have near-zero effect.
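
Putting the heap.zig excerpt together with the toOwnedSlice question, here is a small sketch (illustrative, not the compiler's code; it must be linked against libc): with std.heap.raw_c_allocator, toOwnedSlice() asks the allocator to shrink the buffer to the used length, the shrink succeeds in place per the lines quoted above, and the same pointer is returned, so no second copy of the buffer is ever live.

    const std = @import("std");

    pub fn main() !void {
        // raw_c_allocator forwards directly to malloc/free; its resize()
        // accepts any shrink, so toOwnedSlice() keeps the original pointer
        // instead of allocating a smaller copy.
        const c_alloc = std.heap.raw_c_allocator;

        var list = std.ArrayList(u32).init(c_alloc);
        try list.appendNTimes(0, 1000);

        const before = list.items.ptr;
        const owned = try list.toOwnedSlice();
        defer c_alloc.free(owned);

        std.debug.print("same pointer after shrink: {}\n", .{before == owned.ptr});
    }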

Jarred-Sumner (Contributor)

Another thought

What if LLVM's implementation uses the shorthand for repeated elements? Is the file size of the .bc files similar between the Zig-generated one and the LLVM-generated one?

antlilja (Contributor, Author)

Another thought

What if LLVM's implementation uses the shorthand for repeated elements? Is the file size of the .bc files similar between the Zig-generated one and the LLVM-generated one?

I have some measurements in my new PR: #19083

Review thread on src/codegen/llvm/Builder.zig:

        self: *Builder,
        elements: []const u32,
    ) Allocator.Error!Metadata {
        try self.ensureUnusedMetadataCapacity(1, Metadata.Expression, elements.len * @sizeOf(u32));
Member:

Sorry for not noticing during the original review, but all of these calls are passing the wrong third argument.

Contributor Author:

Submitted a fix: #20484

Successfully merging this pull request may close these issues: directly output LLVM bitcode rather than using LLVM's IRBuilder API (#13265).