-
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Emit LLVM bitcode without using LLVM #19031
Conversation
The LLVM bitcode requires all type references in structs to be to earlier defined types. We make sure types are ordered in the builder itself in order to avoid having to iterate the types multiple times and changing the values of type indicies.
value_indices keeps track of the value index of each instruction in the function (i.e skips instruction which do not have a result)
* Added missing legacy field (unused_algebra) * Made struct correct size (u32 -> u8)
This fixes a problem where empty strings where not emitted as null. This should also emit a smaller stringtab as all metadata strings were emitted in both the strtab and in the strings block inside the metadata block.
The bitcode abbrev was missing the subrange code
Nice work, @antlilja! I'm looking forward to merging this. By the way, did you take any peak RSS measurements with these changes? I'm curious to know how it affected memory usage. |
I have a fix for this, but it can emit the same debug location up to twice in LLVM IR, so I'll wait to see what your plan is. |
I haven't but I'll definitely take a look at that as well when benchmarking some improvements to debug locations. |
I realize these aren't completely comparable, but I compared the compiler compiling itself to unoptimized bitcode with an empty cache from before the llvm rewrite started to after this change lands: 3bada8e: (before rewrite) edb6486: (after rewrite) edb6486 + the following patch: diff --git a/src/codegen/llvm.zig b/src/codegen/llvm.zig
index 5ea749d6d9..4eee469cc6 100644
--- a/src/codegen/llvm.zig
+++ b/src/codegen/llvm.zig
@@ -1187,14 +1187,8 @@ pub const Object = struct {
}
}
- var bitcode_arena_allocator = std.heap.ArenaAllocator.init(
- std.heap.page_allocator,
- );
- errdefer bitcode_arena_allocator.deinit();
-
- const bitcode = try self.builder.toBitcode(
- bitcode_arena_allocator.allocator(),
- );
+ const bitcode = try self.builder.toBitcode(self.gpa);
+ defer self.gpa.free(bitcode);
if (options.pre_bc_path) |path| {
var file = try std.fs.cwd().createFile(path, .{});
@@ -1250,7 +1244,6 @@ pub const Object = struct {
break :blk module;
};
- bitcode_arena_allocator.deinit();
const target_triple_sentinel =
try self.gpa.dupeZ(u8, self.builder.target_triple.slice(&self.builder).?);
diff --git a/src/codegen/llvm/Builder.zig b/src/codegen/llvm/Builder.zig
index a5aeb7dee3..38903c6289 100644
--- a/src/codegen/llvm/Builder.zig
+++ b/src/codegen/llvm/Builder.zig
@@ -14945,7 +14945,7 @@ pub fn toBitcode(self: *Builder, allocator: Allocator) bitcode_writer.Error![]co
try strtab_block.end();
}
- return bitcode.toSlice();
+ return bitcode.toOwnedSlice();
}
const Allocator = std.mem.Allocator;
diff --git a/src/codegen/llvm/bitcode_writer.zig b/src/codegen/llvm/bitcode_writer.zig
index bfb406d087..d48a92dd40 100644
--- a/src/codegen/llvm/bitcode_writer.zig
+++ b/src/codegen/llvm/bitcode_writer.zig
@@ -40,9 +40,9 @@ pub fn BitcodeWriter(comptime types: []const type) type {
self.buffer.deinit();
}
- pub fn toSlice(self: BcWriter) []const u32 {
+ pub fn toOwnedSlice(self: *BcWriter) Error![]const u32 {
std.debug.assert(self.bit_count == 0);
- return self.buffer.items;
+ return self.buffer.toOwnedSlice();
}
pub fn length(self: BcWriter) usize { |
Yeah now that I'm thinking about it it's much more reasonable to not use an arena as it will definitely over allocate. |
Here is my measurement of the impact of this change building the self-hosted compiler:
@antlilja did you perhaps get your branches mixed up when measuring? |
Yeah it's definitely possible I missed or did something weird when switching between branches. Seems like the most reasonable explanation, there haven't been any changes since the benchmark that should have that huge of an impact. The only major thing that is different is how were emitting debug locations but is definitely not the cause of that difference. My bad. |
Well, I'm just glad we caught it now. Thanks for all your efforts on this branch. Here's a new data point that includes #19069 vs old master:
Increased peak RSS? I'm not sure how that happened. Now I'm starting to doubt my methodology. Why are my results so different than both @jacobly0 and @antlilja? |
i wonder if |
Hmm, maybe those results differ from @jacobly0 because he only does the bitcode emission without actually compiling to a binary? But maybe I'm misinterpreting this line:
I'll do some benchmarks as a sanity check |
It's looking like, while the time and memory spent in Zig land is reduced, the time and memory spent in C++ land is increased even more. That's too bad. Looks like my prediction was very wrong. That being said, it does not change the plan. This is a large step towards the long term goal of eliminating the library dependency on LLVM and providing fast compilation speed by avoiding LLVM altogether. |
2 more data points: Hello World (-ODebug):
Hello World (-OReleaseFast):
Note that for the near future, the plan is to not use LLVM for debug builds but only for release builds, in which the perf difference of this data point is insignificant. |
It's 100% guaranteed to reuse the same pointer when shrinking, when using the C allocator: Lines 121 to 123 in 31763d2
however I did notice a related possible improvement to make to |
Another thought What if LLVM's implementation uses the shorthand for repeated elements? Is the file size of the .bc files similar between the Zig-generated one and the LLVM-generated one? |
I have some measurements in my new PR: #19083 |
self: *Builder, | ||
elements: []const u32, | ||
) Allocator.Error!Metadata { | ||
try self.ensureUnusedMetadataCapacity(1, Metadata.Expression, elements.len * @sizeOf(u32)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for not noticing during the original review, but all of these calls are passing the wrong third argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Submitted a fix: #20484
This PR should close #13265
With this PR bitcode is created through the
Builder.zig
, this bitcode is then parsed into a module by LLVM throughLLVMParseBitcodeInContext2
and then compiled into object files as before. The LLVMDIBuilder
has been replaced by a system inBuilder.zig
. This PR also removes all uses of the LLVM library insideBuilder.zig
and removes a lot of the bindings inbindings.zig
andzig_llvm.h/.cpp
.Performance
Hare are some
perf
stats of a release safe version of the compiler compiling itself (on the llvm-bc branch) with an empty cache:llvm-bc branch (ReleaseSafe):
master branch (ReleaseSafe):