Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x64: lower iabs.i64x2 using a single AVX512 instruction when possible #2819

Merged
merged 4 commits into from
Apr 15, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 0 additions & 4 deletions cranelift/codegen/src/isa/x64/inst/args.rs
Original file line number Diff line number Diff line change
Expand Up @@ -460,9 +460,7 @@ pub(crate) enum InstructionSet {
BMI1,
#[allow(dead_code)] // never constructed (yet).
BMI2,
#[allow(dead_code)]
AVX512F,
#[allow(dead_code)]
AVX512VL,
}

Expand Down Expand Up @@ -995,13 +993,11 @@ impl fmt::Display for SseOpcode {

#[derive(Clone)]
pub enum Avx512Opcode {
#[allow(dead_code)]
Vpabsq,
}

impl Avx512Opcode {
/// Which `InstructionSet`s support the opcode?
#[allow(dead_code)]
pub(crate) fn available_from(&self) -> SmallVec<[InstructionSet; 2]> {
match self {
Avx512Opcode::Vpabsq => smallvec![InstructionSet::AVX512F, InstructionSet::AVX512VL],
Expand Down
22 changes: 21 additions & 1 deletion cranelift/codegen/src/isa/x64/inst/emit.rs
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,11 @@ use crate::isa::x64::inst::args::*;
use crate::isa::x64::inst::*;
use crate::machinst::{inst_common, MachBuffer, MachInstEmit, MachLabel};
use core::convert::TryInto;
use encoding::evex::{EvexInstruction, EvexVectorLength};
use encoding::rex::{
emit_simm, emit_std_enc_enc, emit_std_enc_mem, emit_std_reg_mem, emit_std_reg_reg, int_reg_enc,
low8_will_sign_extend_to_32, low8_will_sign_extend_to_64, reg_enc, LegacyPrefixes, RexFlags,
low8_will_sign_extend_to_32, low8_will_sign_extend_to_64, reg_enc, LegacyPrefixes, OpcodeMap,
RexFlags,
};
use log::debug;
use regalloc::{Reg, Writable};
Expand Down Expand Up @@ -1404,6 +1406,24 @@ pub(crate) fn emit(
};
}

Inst::XmmUnaryRmREvex { op, src, dst } => {
let opcode = match op {
Avx512Opcode::Vpabsq => 0x1f,
};
match src {
RegMem::Reg { reg: src } => EvexInstruction::new()
.length(EvexVectorLength::V128)
.prefix(LegacyPrefixes::_66)
.map(OpcodeMap::_0F38)
.w(true)
.opcode(opcode)
.reg(dst.to_reg().get_hw_encoding())
.rm(src.get_hw_encoding())
.encode(sink),
_ => todo!(),
};
}

Inst::XmmRmR {
op,
src: src_e,
Expand Down
7 changes: 7 additions & 0 deletions cranelift/codegen/src/isa/x64/inst/emit_tests.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3865,6 +3865,12 @@ fn test_x64_emit() {
"cvtdq2pd %xmm2, %xmm8",
));

insns.push((
Inst::xmm_unary_rm_r_evex(Avx512Opcode::Vpabsq, RegMem::reg(xmm2), w_xmm8),
"6272FD081FC2",
"vpabsq %xmm2, %xmm8",
));

// Xmm to int conversions, and conversely.

insns.push((
Expand Down Expand Up @@ -4276,6 +4282,7 @@ fn test_x64_emit() {
let mut isa_flag_builder = x64::settings::builder();
isa_flag_builder.enable("has_ssse3").unwrap();
isa_flag_builder.enable("has_sse41").unwrap();
isa_flag_builder.enable("has_avx512f").unwrap();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish there were a way to avoid this here and instead specify it up above when necessary--suggestions welcome because I can't think of a great way to do this without rewriting this entire file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having a single array with all the testing tuples, we could have several ones, one for each Flag combination we'd like to test. Then we could refactor this function so that it takes flags from a parameter and an array of test tuples, or something like this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, maybe not something to worry too much about, since there's no actual instruction selection test here. What was your concern?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I guess that's true. It just looked weird to lump the AVX512 flag in with the baseline flags. Maybe in a follow-on PR I will try to factor out the "check these instructions" code so that I can do as you suggest above.

let isa_flags = x64::settings::Flags::new(&flags, isa_flag_builder);

let rru = regs::create_reg_universe_systemv(&flags);
Expand Down
Loading