Skip to content

Latest commit

 

History

History
196 lines (163 loc) · 7.2 KB

UIP-0127.md

File metadata and controls

196 lines (163 loc) · 7.2 KB
uip title description author status type category created
0127
Lagoon IEEE 754 Reals
Support jetted array operations
~lagrev-nocfep
Draft
Standards Track
Arvo
2024-03-26

Abstract

Nock and its predicated software programs are designed to facilitate computing as a legible state machine. At-scale scientific computing has never really been part of the design criteria for the Urbit ecosystem. While it's hard to imagine density functional theory quantum chemistry packages preferring Hoon, it is far easier to envision a world of machine learning and even LLMs that are built on top of or adjacent to Urbit. At their core, essentially all of the “scientific computing” approaches rely heavily on linear algebra packages: classically BLAS and LAPACK, but many others as well today. It behooves us to provide this to future developers building on Mars.

Lagoon has matured through several refactors into a developer-friendly interface. It is time to prepare and release a subset of completed array-operational work to the developer community. Since Lagoon relies on jets to operate reasonably quickly, we will prepare and release %real-valued operations first.

Motivation

Lagoon (Linear AlGebra in hOON) will offer BLAS-like vector and matrix operations (like NumPy's linear algebra essentials, but not solvers). This will facilitate a common and performant interface for all developers relying on a common lagnuage of data types and operators to represent vector, matrix, and tensor data. By focusing on %reals first, we can vet the interface further in production and lay the groundwork for expansion in further types in subsequent releases.

Specification

Data Types

Lagoon defines the following data types for %real-valued numbers (which are IEEE 754 floats compatible with corresponding @r types in Hoon). (Note that we are commenting out other prospective types for which some work has been done.)

  ::  /sur/lagoon
::::  Types for Lagoon compatibility
::
|%
+$  ray               ::  $ray:  n-dimensional array
  $:  =meta           ::  descriptor
      data=@ux        ::  data, row-major order
  ==
::
+$  prec  [a=@ b=@]   ::  fixed-point precision, a+b+1=bloq
+$  meta              ::  $meta:  metadata for a $ray
  $:  shape=(list @)  ::  list of dimension lengths
      =bloq           ::  logarithm of bitwidth
      =kind           ::  name of data type
      fxp=(unit prec) ::  fixed-point scale
  ==
::
+$  kind              ::  $kind:  type of array scalars
  $?  %real           ::  IEEE 754 float
      :: %uint           ::  unsigned integer
      :: %int2           ::  2s-complement integer
      :: %cplx           ::  BLAS-compatible packed floats
      :: %unum           ::  unum/posit
      :: %fixp           ::  fixed-precision
  ==
::
+$  baum              ::  $baum:  ndray with metadata
  $:  =meta           ::
      data=ndray      ::
  ==
::
+$  ndray             ::  $ndray:  n-dim array as nested list
    $@  @             ::  single item
    (list ndray)      ::  nonempty list of children, in row-major order
::
+$  slice  (unit [(unit @) (unit @)])
--

The core data type is a +$ray, which consists of a pair of metadata +$meta and data as a bit-aligned atom. The metadata which we track for any particular array are:

  1. shape=(list @), the dimensions of an array.
  2. bloq, the block size of an array in standard Hoon $2^n$ terms.
  3. kind, whether the value is %real or otherwise. (Only %real will be supported in this UIP's release.)
  4. fxp=(unit prec), the binary-point scale of a fixed-precision number as a unit. (This will always be ~ for %reals.)

data are in CBLAS-like row-major order, with the leading value of the array in the most signficant position bitwise. A +$ray's data atom has a leading 1 bit at the most significant bit of the array plus one. This allows us to represent leading zeroes.

There is also a +$ndray type corresponding to an unpacked list of the same data.

Operators

At the current time, Lagoon defines the following operators, a handful of which are for internal library convenience. This interface is the result of several complete refactors and we do not foresee additional major alterations in the course of producing the final product.

  • ++print
  • ++slog
  • ++to-tank
  • ++get-term
  • ++squeeze
  • ++submatrix
  • ++product
  • ++gather
  • ++get-item
  • ++set-item
  • ++get-row
  • ++set-row
  • ++get-col
  • ++set-col
  • ++get-bloq-offset
  • ++get-item-number
  • ++strides
  • ++get-dim
  • ++get-item-index
  • ++ravel
  • ++en-ray
  • ++de-ray
  • ++get-item-baum
  • ++fill
  • ++spac
  • ++unspac
  • ++scalar-to-ray
  • ++eye
  • ++zeros
  • ++ones
  • ++iota
  • ++magic
  • ++range
  • ++linspace
  • ++urge
  • ++scale
  • ++max
  • ++argmax
  • ++min
  • ++argmin
  • ++cumsum
  • ++prod
  • ++reshape
  • ++stack
  • ++hstack
  • ++vstack
  • ++transpose
  • ++diag
  • ++trace
  • ++dot
  • ++mmul
  • ++abs
  • ++add-scalar
  • ++sub-scalar
  • ++mul-scalar
  • ++div-scalar
  • ++mod-scalar
  • ++add
  • ++sub
  • ++mul
  • ++div
  • ++mod
  • ++pow-n
  • ++pow
  • ++exp
  • ++log
  • ++gth
  • ++gte
  • ++lth
  • ++lte
  • ++mpow-n
  • ++is-close
  • ++any
  • ++all
  • ++fun-scalar
  • ++trans-scalar
  • ++el-wise-op
  • ++bin-op

The IEEE 754 rounding mode is a property of the operation rather than of the value(s) involved. To effect this correctly, /lib/lagoon implements a ++lake wrapper in a door-like pattern to modify the core's behavior. This information propagates to gates.

Jetting

Jets will resolve at the level of (e.g.) ++add rather than ++bin-op.

Jets are built on top of the SoftBLAS which implements BLAS operations on top of SoftFloat. The build process for vere has been modified to link SoftBLAS (in the lagoon-jets branch of urbit/vere).

Because jets need to be compiled into the vere binary at the current time, /lib/lagoon needs to ship on the %base desk. However, we propose a departure from past convention: jet the library in /lib rather than move it into /sys.

In the long run, as more possibilities for userspace jets become available, /lib/lagoon may be moved out of %base. However, one change in policy at a time is all that we propose now.

Lagoon will jet /lib into the non chapter and place corresponding jets in /i.

Rationale for %real

IEEE 754 %reals are a well-understood, well-vetted component in the numerical ecosystem for scientific computing and Urbit's @r affordances.

We have comprehensive unit test coverage of /lib/lagoon and SoftBLAS, lending credence to verified behavior.

Taken together, these lead us to believe that introducing %real to the developer community first is a conservative assay to roll out or try several new features:

  1. Jetted /lib.
  2. Tetchy hardware–software coupling
  3. Linear algebra interface and affordances

Resources

Primary development is taking place in urbit/numerics.

Details are available in ~mopfel-winrux/numeric-computation-and-machine-learning and in summary form.

Copyright

Copyright and related rights waived via CC0.