Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added parse.py, constants.py and Makefile modifications to generate yaml files. #21

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

AFOliveira
Copy link
Collaborator

I believe that this type of file organization is a good workaround to not having everything in the upstream repo yet.
This version still has some bugs namely on pseudoinstructions and on some assembly code. I will be working on those and I'll submit a new PR when they're solved.

@AFOliveira
Copy link
Collaborator Author

Sorry for the ammount of pushes, but this is a more clean structure with fewer LoC, reutilizing the riscv-opcode submodule

Signed-off-by: Afonso Oliveira <Afonso.Oliveira@synopsys.com>
Copy link
Collaborator

@dhower-qc dhower-qc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instructions with a full (no variables) encoding have a YAML format error. The match field starts with a single quote, but isn't terminated. For example:


ecall:
  long_name: No synopsis available.
  description: |
      No description available.
  definedBy: [I]
  assembly: ecall
  encoding:
    match: '00000000000000000000000001110011
    variables: []
  access:
    s: TODO
    u: TODO
    vs: TODO
    vu: TODO
  operation(): |
      

@dhower-qc
Copy link
Collaborator

FYI, the auto-inst YAML files were very useful finding some errors in the hand-written instruction encodings.

Most have been fixed in the pr/AFOliveira/21 branch.

The remaining issues fall into two categories:

  • Missing left_shift on the auto-inst YAML.
  • Differences in how auto-inst and the handwritten handle encodings of shifts between RV32 and RV64

Details:

Variable mismatch for auipc:
  auto: {"name"=>"imm", "location"=>"31-12"}
  arch: {"name"=>"imm", "location"=>"31-12", "left_shift"=>12}
Variable mismatch for beq:
  auto: {"name"=>"imm", "location"=>"31|7|30-25|11-8"}
  arch: {"name"=>"imm", "location"=>"31|7|30-25|11-8", "left_shift"=>1}
Variable mismatch for bge:
  auto: {"name"=>"imm", "location"=>"31|7|30-25|11-8"}
  arch: {"name"=>"imm", "location"=>"31|7|30-25|11-8", "left_shift"=>1}
Variable mismatch for bgeu:
  auto: {"name"=>"imm", "location"=>"31|7|30-25|11-8"}
  arch: {"name"=>"imm", "location"=>"31|7|30-25|11-8", "left_shift"=>1}
Variable mismatch for blt:
  auto: {"name"=>"imm", "location"=>"31|7|30-25|11-8"}
  arch: {"name"=>"imm", "location"=>"31|7|30-25|11-8", "left_shift"=>1}
Variable mismatch for bltu:
  auto: {"name"=>"imm", "location"=>"31|7|30-25|11-8"}
  arch: {"name"=>"imm", "location"=>"31|7|30-25|11-8", "left_shift"=>1}
Variable mismatch for bne:
  auto: {"name"=>"imm", "location"=>"31|7|30-25|11-8"}
  arch: {"name"=>"imm", "location"=>"31|7|30-25|11-8", "left_shift"=>1}
Variable mismatch for jal:
  auto: {"name"=>"imm", "location"=>"31|19-12|20|30-21"}
  arch: {"name"=>"imm", "location"=>"31|19-12|20|30-21", "left_shift"=>1, "sign_extend"=>true}
Encoding mismatch for lui
  auto: -------------------------0110111
  arch: -------------------------0110011
Variable mismatch for sd:
  auto: {"name"=>"imm", "location"=>"31-25|11-7"}
  arch: {"name"=>"imm", "location"=>"31-25|11-7", "sign_extend"=>true}
Encoding mismatch for sll
  auto: 0000000----------001-----0110011
  arch: 0000000----------001-----0010011
Encoding mismatch for slli: Auto does not split by XLEN, but arch does
Variable mismatch for slliw:
  auto: {"name"=>"shamt", "location"=>"24-20"}
  arch: {"name"=>"shamt", "location"=>"25-20"}
Encoding mismatch for srai: Auto does not split by XLEN, but arch does
Variable mismatch for sraiw:
  auto: {"name"=>"shamt", "location"=>"24-20"}
  arch: {"name"=>"shamt", "location"=>"25-20"}
Encoding mismatch for srli: Auto does not split by XLEN, but arch does
Variable mismatch for srliw:
  auto: {"name"=>"shamt", "location"=>"24-20"}
  arch: {"name"=>"shamt", "location"=>"25-20"}
Encoding mismatch for slli.uw
  auto: 000010-----------001-----0011011
  arch: 0000010----------001-----0011011
Encoding mismatch for rori: Auto does not split by XLEN, but arch does
Encoding mismatch for rori: Auto does not split by XLEN, but arch does
Encoding mismatch for bclri: Auto does not split by XLEN, but arch does
Encoding mismatch for bexti: Auto does not split by XLEN, but arch does
Encoding mismatch for binvi: Auto does not split by XLEN, but arch does
Encoding mismatch for bseti: Auto does not split by XLEN, but arch does
Encoding mismatch for csrrs
  auto: -----------------010-----1110011
  arch: -----------------010-----0010011
Encoding mismatch for rori: Auto does not split by XLEN, but arch does
Encoding mismatch for rori: Auto does not split by XLEN, but arch does
Encoding mismatch for rori: Auto does not split by XLEN, but arch does

@AFOliveira
Copy link
Collaborator Author

AFOliveira commented Sep 6, 2024

Instructions with a full (no variables) encoding have a YAML format error. The match field starts with a single quote, but isn't terminated. For example:

This should be solved on the new branch with my most recent commit.

Differences in how auto-inst and the handwritten handle encodings of shifts between RV32 and RV64

This was already solved it was only a bug when I converted the files to this structure, I will fix it and upload it

@AFOliveira
Copy link
Collaborator Author

@dhower-qc

Getting Back on this, my latest commit, added the possiblity to add left_shift encapsulation to variables.
I only added the ones present on the extensions already done on the arch folder. There are ambiguous results and therefores I will also be looking at riscv-opcodes to see if there is any change there that can enable this,

Missing left_shift on the auto-inst YAML.
I believe I actually found a way to do this due to instructions name, I will try and then commit it

@AFOliveira
Copy link
Collaborator Author

I think almost all the issues are now adressed on the PR branch with the exception of "sign_extend"=>true , which can be solved in the same way of the left_shift if you also agree so. I would like to know if there are any other instructions that were already hand-written on arch, so that I could add to the left_shift, which as of now is still pretty incomplete.

@dhower-qc
Copy link
Collaborator

I think almost all the issues are now adressed on the PR branch with the exception of "sign_extend"=>true , which can be solved in the same way of the left_shift if you also agree so. I would like to know if there are any other instructions that were already hand-written on arch, so that I could add to the left_shift, which as of now is still pretty incomplete.

Great! Like we discussed, we can just forget about sign_extend; it isn't being used and I will just get rid of it in existing files.

@AFOliveira
Copy link
Collaborator Author

@dhower-qc can I PR that separate branch or do you prefer to keep it on the side?

@dhower-qc
Copy link
Collaborator

I'm still seeing three left shift discrepancies:

Variable mismatch for auipc:
  auto: {"name"=>"imm", "location"=>"31-12"}
  arch: {"name"=>"imm", "location"=>"31-12", "left_shift"=>12}
Variable mismatch for jal:
  auto: {"name"=>"imm", "location"=>"31|19-12|20|30-21"}
  arch: {"name"=>"imm", "location"=>"31|19-12|20|30-21", "left_shift"=>1}
Variable mismatch for lui:
  auto: {"name"=>"imm", "location"=>"31-12"}
  arch: {"name"=>"imm", "location"=>"31-12", "left_shift"=>12}

@dhower-qc
Copy link
Collaborator

When there is a field that has a value restriction, e.g.:

c.addiw:
  encoding:
    match: ----------------001-----------01
    variables:
    - name: imm
      location: 12|6-2
    - name: rd/rs!=0
      location: 11-7

riscv-unified-db is expecting a "not" field, like so:

c.addiw:
  encoding:
    match: ----------------001-----------01
    variables:
    - name: imm
      location: 12|6-2
    - name: rd_rs
      location: 11-7
      not: 0

Is that something we can do automatically?

@AFOliveira
Copy link
Collaborator Author

I'm still seeing three left shift discrepancies:

Variable mismatch for auipc:
  auto: {"name"=>"imm", "location"=>"31-12"}
  arch: {"name"=>"imm", "location"=>"31-12", "left_shift"=>12}
Variable mismatch for jal:
  auto: {"name"=>"imm", "location"=>"31|19-12|20|30-21"}
  arch: {"name"=>"imm", "location"=>"31|19-12|20|30-21", "left_shift"=>1}
Variable mismatch for lui:
  auto: {"name"=>"imm", "location"=>"31-12"}
  arch: {"name"=>"imm", "location"=>"31-12", "left_shift"=>12}

This the case because there were only three left shifts in the .yaml and therefore I decided to wait for more instructions before properly handling all of them. The back-end is already done for them though.

@AFOliveira
Copy link
Collaborator Author

When there is a field that has a value restriction, e.g.:

c.addiw:
  encoding:
    match: ----------------001-----------01
    variables:
    - name: imm
      location: 12|6-2
    - name: rd/rs!=0
      location: 11-7

riscv-unified-db is expecting a "not" field, like so:

c.addiw:
  encoding:
    match: ----------------001-----------01
    variables:
    - name: imm
      location: 12|6-2
    - name: rd_rs
      location: 11-7
      not: 0

Is that something we can do automatically?

Yes, it can. I forgot to mention it, my bad. What happened was that I could not find any examples on how this was being handled, but it can definitely be done automatically. I'll handle as soon as I can, thank you for pointing that out!

@AFOliveira
Copy link
Collaborator Author

Is that something we can do automatically?

This is now done!

@AFOliveira
Copy link
Collaborator Author

I just found an way of automating the left_shift that I believe will be applicable for all instructions.

@AFOliveira
Copy link
Collaborator Author

Left_shift is now fully automated for all instructions.

@AFOliveira
Copy link
Collaborator Author

@dhower-qc I'm still doing some changes on this to see if we possibly could indeed move a significant number of instructions in before the RISC-V Summit NA and I wanted to ask how should we describe pseudoinstructions other than on the the origininstructions.

For fence as an example:

  pseudoinstructions:
  - when: (pred == 0x3) && (succ == 0x3) && (fm == 1)
    to: fence.tso
  - when: (pred == 1) && (succ == 0) && (fm == 0) && (rd == 0) && (rs1 == 0)
    to: pause

Is this enough to be able to generate a subsequent index for pseudo instructions as well or should we also print those pseudo instruction wish a flag of some type in order to identify them?

@dhower-qc
Copy link
Collaborator

Here is the checker script I've been using. When placed in ext/auto-instr, I run it like this (to use the container enviornment):

command

./bin/ruby ext/auto-instr/compare.rb

compare.rb

require 'pathname'
require 'yaml'

root = Pathname.new(__FILE__).dirname.realpath
arch_path = (root / '..' / '..' /'arch').realpath
auto_path = (root / 'yaml_output').realpath

def compare_variables(inst_name, auto_vars, arch_vars)
  auto_vars.each do |var|
    warn "No variables for #{inst_name}?" if arch_vars.nil?
    unless arch_vars.include?(var)
      arch_var = arch_vars.find { |v| v["name"] == var["name"]}
      warn "No variable '#{var["name"]}' for #{inst_name} in arch (#{arch_vars})" if arch_var.nil?
      if var["location"] == "#{arch_var["location"]}-#{arch_var["location"]}"
        next # ok, just written differently
      end
      warn "Variable mismatch for #{inst_name}:"
      warn "  auto: #{var}"
      warn "  arch: #{arch_var}"
    end
  end
end

Dir.glob("#{auto_path}/**/*.yaml") do |f|
  inst_name = File.basename(f, ".yaml")
  matches = arch_path.glob("**/#{inst_name}.yaml")
  next if matches.empty?

  auto_yaml = YAML.load_file(f)
  arch_yaml = YAML.load_file(matches.first)

  if !auto_yaml[inst_name]["encoding"]["RV32"].nil?
    if arch_yaml[inst_name]["encoding"]["RV32"].nil?
      warn "Encoding mismatch for #{inst_name}: Auto splits by XLEN, but arch does not"
      next
    elsif auto_yaml[inst_name]["encoding"]["RV32"]["match"] != arch_yaml[inst_name]["encoding"]["RV32"]["match"]
      warn "Encoding mismatch for #{inst_name}, RV32"
      warn "  auto: #{auto_yaml[inst_name]["encoding"]["RV32"]["match"]}"
      warn "  arch: #{arch_yaml[inst_name]["encoding"]["RV32"]["match"]}"
      next
    elsif auto_yaml[inst_name]["encoding"]["RV64"]["match"] != arch_yaml[inst_name]["encoding"]["RV64"]["match"]
      warn "Encoding mismatch for #{inst_name}, RV32"
      warn "  auto: #{auto_yaml[inst_name]["encoding"]["RV64"]["match"]}"
      warn "  arch: #{arch_yaml[inst_name]["encoding"]["RV64"]["match"]}"
      next
    end
  elsif !arch_yaml[inst_name]["encoding"]["RV32"].nil?
    if auto_yaml[inst_name]["encoding"]["RV32"].nil?
      warn "Encoding mismatch for #{inst_name}: Auto does not split by XLEN, but arch does"
      next
    elsif auto_yaml[inst_name]["encoding"]["RV32"]["match"] != arch_yaml[inst_name]["encoding"]["RV32"]["match"]
      warn "Encoding mismatch for #{inst_name}, RV32"
      warn "  auto: #{auto_yaml[inst_name]["encoding"]["RV32"]["match"]}"
      warn "  arch: #{arch_yaml[inst_name]["encoding"]["RV32"]["match"]}"
      next
    elsif auto_yaml[inst_name]["encoding"]["RV64"]["match"] != arch_yaml[inst_name]["encoding"]["RV64"]["match"]
      warn "Encoding mismatch for #{inst_name}, RV32"
      warn "  auto: #{auto_yaml[inst_name]["encoding"]["RV64"]["match"]}"
      warn "  arch: #{arch_yaml[inst_name]["encoding"]["RV64"]["match"]}"
      next
    end
  elsif auto_yaml[inst_name]["encoding"]["match"] != arch_yaml[inst_name]["encoding"]["match"]
    warn "Encoding mismatch for #{inst_name}"
    warn "  auto: #{auto_yaml[inst_name]["encoding"]["match"]}"
    warn "  arch: #{arch_yaml[inst_name]["encoding"]["match"]}"
    next
  end

  # check variables
  if !auto_yaml[inst_name]["encoding"]["RV32"].nil?
    compare_variables(inst_name, auto_yaml[inst_name]["encoding"]["RV32"]["variables"], arch_yaml[inst_name]["encoding"]["RV32"]["variables"])
    compare_variables(inst_name, auto_yaml[inst_name]["encoding"]["RV64"]["variables"], arch_yaml[inst_name]["encoding"]["RV64"]["variables"])
  else
    compare_variables(inst_name, auto_yaml[inst_name]["encoding"]["variables"], arch_yaml[inst_name]["encoding"]["variables"])
  end
end

@AFOliveira
Copy link
Collaborator Author

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants