Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify Zks functions to use LLVM intrinsics #1298

Closed
wants to merge 4 commits into from

Conversation

luojia65
Copy link
Contributor

@luojia65 luojia65 commented Mar 21, 2022

This pull request replaces use of inline assembly into LLVM intrinsics for RISC-V Zks functions. This change would enable optimizations by using LLVM intrinsics. This pull request also adds a target feature attribute on each Zks function.

The Zks functions are now gated under RISC-V Rust target feature:
图片

Use of as cast: RISC-V Zks instructions does not use upper 32 bits of parameter and return value (we sign extened it as parameter, and truncate from return value). Ref: https://github.com/riscv/riscv-crypto/blob/master/doc/scalar/insns/sm3p0.adoc#Operation

Compile output:

; core_arch::core_arch::riscv_shared::sm3p0
; Function Attrs: inlinehint mustprogress nofree nosync nounwind readnone willreturn
define internal fastcc i32 @_ZN9core_arch9core_arch12riscv_shared5sm3p017h50602641902a2471E(i32 %x) unnamed_addr #0 {
start:
  %_3 = zext i32 %x to i64
  %_2 = tail call i64 @llvm.riscv.sm3p0.i64(i64 %_3) #5
  %0 = trunc i64 %_2 to i32
  ret i32 %0
}
000000008000004e <_ZN9core_arch9core_arch12riscv_shared5sm3p017h50602641902a2471E>:
    8000004e:	1502                	slli	a0,a0,0x20
    80000050:	9101                	srli	a0,a0,0x20
    80000052:	10851513          	sm3p0	a0,a0
    80000056:	8082                	ret

0000000080000058 <_ZN9core_arch9core_arch12riscv_shared5sm3p117h9f0bc8b014c2d03aE>:
    80000058:	1502                	slli	a0,a0,0x20
    8000005a:	9101                	srli	a0,a0,0x20
    8000005c:	10951513          	sm3p1	a0,a0
    80000060:	8082                	ret

Current LLVM does not have llvm.riscv.sm3p0.i32 on 64-bit machine, it would result in LLVM ERROR: unimplemented operand if we use function llvm.riscv.sm3p0.i32.

FIXME: Rust requires all functions under #[target_feature] be called in unsafe blocks, however in this case the Zks functions would be safe. How do deal with this?

Add target_feature_11 to feature list

Use LLVM intrinsics

LLVM intrinsic type differ from function type; transmute if necessary.
@rust-highfive
Copy link

r? @Amanieu

(rust-highfive has picked a reviewer for you, use r? to override)

@luojia65
Copy link
Contributor Author

It says:

the feature `aarch64_target_feature` has been stable since 1.61.0 and no longer requires an attribute to enable

Should I remove these stable feature gates as well?

The feature `aarch64_target_feature` and `adx_target_feature` have been stable since 1.61.0 and no longer requires an attribute to enable
@Amanieu
Copy link
Member

Amanieu commented Mar 21, 2022

You can probably avoid the zero-extension by sign-extending to isize instead. On RISC-V a u32 is held in sign-extended for in a 64-bit register so this should be a no-op.

FIXME: Rust requires all functions under #[target_feature] be called in unsafe blocks, however in this case the Zks functions would be safe. How do deal with this?

It's only unsafe if the caller itself doesn't have that target feature enabled. Basically it's unsafe because you need to check that the CPU you are running on actually supports this feature. This is necessary to avoid using undefined instructions which could have unknown behavior.

@luojia65
Copy link
Contributor Author

luojia65 commented Mar 22, 2022

@Amanieu Thanks for your suggestion! I tried to update code into transmution to i32 then use as to isize, it emitted as follows in LLVM.

#[inline]
#[target_feature(enable = "zksh")]
pub fn sm3p0(x: u32) -> u32 {
    // sign extend parameter to isize
    unsafe { sm3p0_isize(transmute::<_, i32>(x) as isize) as u32 }
}
#[target_feature(enable = "zksh")]
pub fn start4(a: u32) -> u32 {
    sm3p1(sm3p1(sm3p0(sm3p0(a)))) // test code generation
}
; Function Attrs: mustprogress nofree nosync nounwind readnone willreturn
define internal fastcc i32 @_ZN11hello_world6start417h1004698f151d95b4E(i32 %a) unnamed_addr #2 {
start:
  %_3.i = sext i32 %a to i64
  %_2.i = tail call i64 @llvm.riscv.sm3p0.i64(i64 %_3.i) #4
  %sext = shl i64 %_2.i, 32
  %_3.i1 = ashr exact i64 %sext, 32
  %_2.i2 = tail call i64 @llvm.riscv.sm3p0.i64(i64 %_3.i1) #4
  %sext7 = shl i64 %_2.i2, 32
  %_3.i3 = ashr exact i64 %sext7, 32
  %_2.i4 = tail call i64 @llvm.riscv.sm3p1.i64(i64 %_3.i3) #4
  %sext8 = shl i64 %_2.i4, 32
  %_3.i5 = ashr exact i64 %sext8, 32
  %_2.i6 = tail call i64 @llvm.riscv.sm3p1.i64(i64 %_3.i5) #4
  %0 = trunc i64 %_2.i6 to i32
  ret i32 %0
}

The LLVM code includes sign extension from i32 to i64, however the generated code still includes an sext.w instruction:

0000000080000008 <_ZN11hello_world6start417h1004698f151d95b4E>:
80000008: 01 25        	sext.w	a0, a0
8000000a: 13 15 85 10  	sm3p0	a0, a0
8000000e: 01 25        	sext.w	a0, a0
80000010: 13 15 85 10  	sm3p0	a0, a0
80000014: 01 25        	sext.w	a0, a0
80000016: 13 15 95 10  	sm3p1	a0, a0
8000001a: 01 25        	sext.w	a0, a0
8000001c: 13 15 95 10  	sm3p1	a0, a0
80000020: 82 80        	ret

How to hint LLVM that this intrinsic function will ignore the upper 32 bits of input parameter?

It's only unsafe if the caller itself doesn't have that target feature enabled. Basically it's unsafe because you need to check that the CPU you are running on actually supports this feature. This is necessary to avoid using undefined instructions which could have unknown behavior.

Thanks! I'll note that.

If code after this commit generates more than one instruction,
it should be reverted.
@Amanieu
Copy link
Member

Amanieu commented Mar 22, 2022

You don't need the transmute, you can just do as i32 as isize. But that probably won't fix the unnecessary sext.

How to hint LLVM that this intrinsic function will ignore the upper 32 bits of input parameter?

I don't think that is possible. This would require LLVM to recognize that the intrinsic only reads the low 32 bits.

@luojia65
Copy link
Contributor Author

luojia65 commented Mar 26, 2022

We may wait before 32-bit RISC-V Zks intrinsic functions land in LLVM :)

@bors
Copy link
Contributor

bors commented Sep 13, 2022

☔ The latest upstream changes (presumably b9cf2d7) made this pull request unmergeable. Please resolve the merge conflicts.

@Amanieu
Copy link
Member

Amanieu commented Oct 25, 2022

Rust now uses LLVM 15, are these functions available yet?

@coastalwhite
Copy link
Contributor

This PR is now already done as part of #1453. I think this PR can be closed.

@Amanieu Amanieu closed this Sep 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants