[Vector ABI] Type size and alignment for vector types.

Vector alignment is discussed in #347. That discussion notes that aligning to 128 bits might deliver better performance on some RISC-V cores, but it could also waste considerable stack space on zve32 and zve64 cores. For instance, to make a vector value on the stack conform to such an ABI requirement, we could waste up to 96 bits per vector object on zve32, and the performance difference is not evident on every core implementation. Therefore, this proposal sets the alignment of vector types to the element alignment, to avoid wasting a significant amount of stack space in zve32 and zve64 configurations. Also, the ABI only specifies the minimum alignment; it does not prevent the compiler from adopting a higher alignment for specific CPUs. Fix #347.
riscv-non-isa · Jan 10, 2024 · 6b3f877 · 6b3f877
1 parent d4c38ee
commit 6b3f877

Showing 5 changed files with 663 additions and 1 deletion.
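
The stack-waste figure in the commit message can be checked with a little arithmetic. The sketch below is only an illustration: it assumes a zve32 core with the minimum VLEN of 32 bits, an alignment requirement of 128 bits (the value that matches the 96-bit figure), and a model in which each stack slot is rounded up to a multiple of the alignment. The helper name `padding_bits` is hypothetical and not part of the commit.

```python
def padding_bits(object_bits, alignment_bits):
    """Bits of padding when a stack slot is rounded up to the alignment.

    Assumed model: the slot size is the object size rounded up to the
    next multiple of alignment_bits.
    """
    slot_bits = -(-object_bits // alignment_bits) * alignment_bits
    return slot_bits - object_bits


# Assumed configuration: zve32 with VLEN = 32, so an LMUL=1 vector is 32 bits.
print(padding_bits(32, 128))  # 128-bit alignment
print(padding_bits(32, 32))   # element alignment for SEW=32
```

Under these assumptions the first call reports 96 bits of padding and the second reports none, which is the trade-off the commit message describes.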
diff --git a/Makefile b/Makefile
@@ -19,3 +19,6 @@ $(NAME).pdf: $(NAME).adoc $(wildcard *.adoc) resources/themes/risc-v_spec-pdf.ym
 	    -a pdf-fontsdir=resources/fonts \
 	    -v \
 	    $< -o $@
+
+regen_vector_type_infos:
+	python3 gen_vector_type_infos.py > vector_type_infos.adoc
diff --git a/gen_vector_type_infos.py b/gen_vector_type_infos.py
@@ -0,0 +1,112 @@
+import subprocess
+import os
+import sys
+import math
+
+LMUL=["mf8", "mf4", "mf2", "m1", "m2", "m4", "m8"]
+NF=[2, 3, 4, 5, 6, 7, 8]
+SEW=[8, 16, 32, 64]
+TYPES = ["int", "uint", "float", "bfloat"]
+
+def lmul2nreg(lmul):
+    if (lmul.startswith("mf")):
+        return 1
+    else:
+        return int(lmul[1:])
+
+def lmul2num(lmul):
+    if (lmul.startswith("mf")):
+        if lmul=="mf8":
+            return 0.125
+        if lmul=="mf4":
+            return 0.25
+        if lmul=="mf2":
+            return 0.5
+        assert False, "unsupported fractional LMUL: " + lmul
+    else:
+        return int(lmul[1:])
+
+def sizestr(lmul, nf=1):
+    sz_mul = lmul2num(lmul) * nf
+    if (sz_mul == 1):
+        return "(VLEN / 8)"
+    if (sz_mul > 1):
+        return "(VLEN / 8) * %g" %(sz_mul)
+    if (sz_mul < 1):
+        if (sz_mul == 0.5):
+            return "(VLEN / 8) / 2"
+        if (sz_mul == 0.25):
+            return "(VLEN / 8) / 4"
+        if (sz_mul == 0.125):
+            return "(VLEN / 8) / 8"
+        if (sz_mul == 0.375):
+            return "(VLEN / 8) * 0.375"
+        if (sz_mul == 0.625):
+            return "(VLEN / 8) * 0.625"
+        if (sz_mul == 0.75):
+            return "(VLEN / 8) * 0.75"
+        if (sz_mul == 0.875):
+            return "(VLEN / 8) * 0.875"
+        assert False, "unsupported size multiplier: %g" % sz_mul
+
+def valid_type(sew, lmul, base_t, nf=1):
+    nreg = lmul2nreg(lmul)
+    if nreg * nf > 8:
+        return False
+    if base_t == "bfloat" and sew != 16:
+        return False
+    if base_t == "float" and sew == 8:
+        return False
+    return True
+
+def get_note(sew, lmul, base_t, nf=1):
+    ln = lmul2num(lmul)
+    x = (32 / sew * ln)
+    notes = []
+    if x < 1:
+        notes += ["`*1`"]
+    if base_t == "float" and sew == 16:
+        notes += ["`*2`"]
+    if base_t == "bfloat":
+        notes += ["`*3`"]
+    if base_t == "float" and sew == 32:
+        notes += ["`*4`"]
+    if base_t == "float" and sew == 64:
+        notes += ["`*5`"]
+    if (base_t == "int" or base_t == "uint") and sew == 64:
+        notes += ["`*6`"]
+    return ", ".join(notes)
+
+print (".Type sizes and alignments for vector data types")
+print ("[cols=\"4,3,>3,>2\"]")
+print ("[width=80%]")
+print ("|===")
+print("| Internal Name          | Type                 | Size (Bytes)  | Alignment (Bytes)")
+print("")
+
+for sew in SEW:
+    for lmul in LMUL:
+        for t in TYPES:
+            if not valid_type(sew, lmul, t):
+                continue
+            typename = "v%s%s%s_t" %(t, sew, lmul)
+            mname = "__rvv_" + typename
+            size = sizestr(lmul)
+            print ("| %-22s | %-20s | %-18s | %d" %(mname, typename, size, sew/8))
+print ("|===")
+print ("")
+print (".Type sizes and alignments for vector tuple types")
+print ("[cols=\"4,3,>3,>2\"]")
+print ("[width=80%]")
+print ("|===")
+print ("| Internal Name          | Type                 | Size (Bytes)  | Alignment (Bytes)")
+print ("")
+for sew in SEW:
+    for lmul in LMUL:
+        for nf in NF:
+            for t in TYPES:
+                if not valid_type(sew, lmul, t, nf):
+                    continue
+                typename = "v%s%s%sx%s_t" %(t, sew, lmul, nf)
+                mname = "__rvv_" + typename
+                size = sizestr(lmul, nf)
+                print ("| %-22s | %-20s | %-18s | %d" %(mname, typename, size, sew/8))
+
+print ("|===")
diff --git a/riscv-cc.adoc b/riscv-cc.adoc
@@ -626,6 +626,38 @@ of the vararg save area.  The `va_arg` macro will increment its `va_list`
 argument according to the size of the given type, taking into account the
 rules about 2×XLEN aligned arguments being passed in "aligned" register pairs.
 
+=== Vector type sizes and alignments
+
+This section defines the sizes and alignments for the vector types defined in
+the _RISC-V Vector Extension Intrinsic Document_ <<rvv-intrinsic-doc>>.
+The actual size of each type is determined by the hardware configuration, which
+is based on the content of the `vlenb` register.
+
+There are three classes of vector types: the vector mask types, the vector
+data types and the vector tuple types.
+
+.Type sizes and alignments for vector mask types
+[cols="4,3,>3,>2"]
+[width=80%]
+|===
+| Internal Name              | Type                 | Size (Bytes)       | Alignment (Bytes)
+
+| __rvv_vbool1_t             | vbool1_t             |  VLENB             |  1
+| __rvv_vbool2_t             | vbool2_t             |  VLENB / 2         |  1
+| __rvv_vbool4_t             | vbool4_t             |  VLENB / 4         |  1
+| __rvv_vbool8_t             | vbool8_t             |  ceil(VLENB / 8)   |  1
+| __rvv_vbool16_t            | vbool16_t            |  ceil(VLENB / 16)  |  1
+| __rvv_vbool32_t            | vbool32_t            |  ceil(VLENB / 32)  |  1
+| __rvv_vbool64_t            | vbool64_t            |  ceil(VLENB / 64)  |  1
+|===
+
+include::vector_type_infos.adoc[]
+
+NOTE: A vector mask type may use only part of the space given by its size; the
+remaining content may be undefined, both in the register and in memory.
+
+NOTE: Size must be a positive integer.
+
 [appendix]
 == Linux-specific ABI
 

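The size and alignment rules in the tables above can be evaluated for a concrete VLEN. The following is a minimal sketch, not part of the commit: the helper names are hypothetical, VLEN = 128 is an assumed example configuration, and treating a non-integer byte count as an error is only one reading of the "Size must be a positive integer" note.

```python
from fractions import Fraction

# LMUL suffixes as exact fractions (mf8 = 1/8 ... m8 = 8).
LMUL = {
    "mf8": Fraction(1, 8), "mf4": Fraction(1, 4), "mf2": Fraction(1, 2),
    "m1": Fraction(1), "m2": Fraction(2), "m4": Fraction(4), "m8": Fraction(8),
}


def mask_size_bytes(vlenb, n):
    """Size of vbool<n>_t: ceil(VLENB / n), using integer arithmetic."""
    return -(-vlenb // n)


def data_size_bytes(vlenb, lmul, nf=1):
    """Size of a data type (nf=1) or tuple type (nf=2..8): VLENB * LMUL * NF."""
    size = vlenb * LMUL[lmul] * nf
    if size.denominator != 1:
        # Assumption: a fractional byte count means the type is unusable here.
        raise ValueError("size is not a whole number of bytes: %s" % size)
    return int(size)


def alignment_bytes(sew):
    """Alignment of a data or tuple type: the element size in bytes."""
    return sew // 8


vlenb = 128 // 8  # assumed example: VLEN = 128
print(mask_size_bytes(vlenb, 64))        # vbool64_t
print(data_size_bytes(vlenb, "m2"))      # vint32m2_t
print(data_size_bytes(vlenb, "mf2", 3))  # vint32mf2x3_t
print(alignment_bytes(32))               # alignment of the two types above
```

Exact fractions are used instead of floats so that the integer check on the result does not depend on rounding.
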
diff --git a/riscv-elf.adoc b/riscv-elf.adoc
@@ -142,12 +142,45 @@ any vector registers.
 
 {Cpp} name mangling for RISC-V follows
 the _Itanium {Cpp} ABI_ <<itanium-cxx-abi>>;
-there are no RISC-V specific mangling rules.
+with additional mangling rules for the RISC-V vector data types, vector mask
+types and vector tuple types, which are defined in the following section.
 
 See the "Type encodings" section in _Itanium {Cpp} ABI_
 for more detail on how to mangle types. Note that `__bf16` is mangled in the
 same way as `std::bfloat16_t`.
 
+=== Name Mangling for Vector Data Types, Vector Mask Types and Vector Tuple Types
+
+The vector data types and vector mask types, as defined in the section
+<<Vector type sizes and alignments>>, are treated as vendor-extended types in
+the _Itanium {Cpp} ABI_ <<itanium-cxx-abi>>. The mangled name for
+these types is `"u"<len>"__rvv_"<type-name>`. That is, the type name is
+prefixed with `__rvv_`, the result is prefixed by a decimal string giving
+its length (including the `__rvv_` prefix), and that is prefixed by "u".
+
+For example:
+
+[,c]
+----
+    void foo(vint8m1_t x);
+----
+
+is mangled as
+
+[,c]
+----
+    _Z3foou15__rvv_vint8m1_t
+----
+
+The encoding of a vector type is described by the following ABNF:
+
+[source,abnf]
+----
+mangled-name = "u" len "__rvv_" type-name
+
+len = nonzero *DIGIT
+nonzero = "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9"
+
+type-name = identifier-nondigit *identifier-char
+identifier-nondigit = ALPHA / "_"
+identifier-char = identifier-nondigit / DIGIT
+----
 == ELF Object Files
 
 The ELF object file format for RISC-V follows the
@@ -1977,3 +2010,6 @@ RISC-V International.
 
 * [[[riscv-zc-extension-group]]] "ZC* extension specification"
 https://github.com/riscv/riscv-code-size-reduction
+
+* [[[rvv-intrinsic-doc]]] "RISC-V Vector Extension Intrinsic Document"
+https://github.com/riscv-non-isa/rvv-intrinsic-doc
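
The mangling rule added to riscv-elf.adoc above can be sketched in a few lines. This is a simplified illustration rather than a real mangler: the helper names are hypothetical, and `mangle_unary_function` only covers an unqualified free function with a single vector-type parameter, so it ignores the rest of the Itanium rules (namespaces, substitutions, other parameter types).

```python
def mangle_rvv_type(type_name):
    """Encode a vector type as "u" <len> "__rvv_" <type-name>.

    <len> is the decimal length of the internal name, including the
    "__rvv_" prefix.
    """
    internal = "__rvv_" + type_name
    return "u%d%s" % (len(internal), internal)


def mangle_unary_function(func_name, param_type):
    """Hypothetical helper: mangle `void func_name(param_type)`.

    Only handles an unqualified function with one RVV-typed parameter.
    """
    return "_Z%d%s%s" % (len(func_name), func_name, mangle_rvv_type(param_type))


print(mangle_rvv_type("vint8m1_t"))
print(mangle_unary_function("foo", "vint8m1_t"))
```

For `foo` and `vint8m1_t` this reproduces the `_Z3foou15__rvv_vint8m1_t` example given in the section.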