This repository has been archived by the owner on Dec 18, 2024. It is now read-only.

intel/xetla

PROJECT NOT UNDER ACTIVE MANAGEMENT

This project will no longer be maintained by Intel.

Intel has ceased development and contributions including, but not limited to, maintenance, bug fixes, new releases, or updates, to this project.

Intel no longer accepts patches to this project.

If you have an ongoing need to use this project, are interested in independently developing it, or would like to maintain patches for the open source software community, please create your own fork of this project.

Important

The alternative project CUTLASS will include all XeTLA features; refer to Cutlass-Fork.

Contact: webadmin@linux.intel.com

Intel® Xe Templates for Linear Algebra

Intel® XeTLA v0.3.7 - December 2023

Intel® Xe Templates for Linear Algebra (Intel® XeTLA) is a collection of SYCL/ESIMD templates that enable high-performance General Matrix Multiply (GEMM), Convolution (CONV), and related computations on the Intel Xe GPU architecture. Intel® XeTLA offers reusable C++ templates at the kernel, group, and subgroup levels, allowing developers to optimize and specialize kernels based on data types, tiling policies, algorithms, fusion policies, and more.

One of the key features of Intel® XeTLA is its ability to abstract and hide details of Xe hardware implementations, particularly those related to matrix computations, such as the systolic array and other low-level instructions. This lets SYCL/DPC++ developers benefit from the performance of Intel® XeTLA without writing hardware-specific instructions themselves.
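The kernel, group, and subgroup levels described above can be pictured as nested tiling of the output matrix. The following is a minimal CPU-only sketch of that idea in plain C++, not XeTLA code: `TileShape`, `tiled_gemm`, and the tile sizes are hypothetical names chosen for illustration, and the loops run sequentially where a GPU would run workgroups and subgroups in parallel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile shape (rows x columns); not a XeTLA type.
struct TileShape {
    std::size_t m;
    std::size_t n;
};

// Row-major C (m x n) = A (m x k) * B (k x n), computed tile by tile.
// wg is the tile owned by one workgroup, sg the tile owned by one subgroup.
std::vector<float> tiled_gemm(const std::vector<float>& a,
                              const std::vector<float>& b,
                              std::size_t m, std::size_t n, std::size_t k,
                              TileShape wg, TileShape sg) {
    assert(a.size() == m * k && b.size() == k * n);
    assert(wg.m > 0 && wg.n > 0 && sg.m > 0 && sg.n > 0);
    std::vector<float> c(m * n, 0.0f);

    // Kernel level: cover the whole output with workgroup tiles.
    for (std::size_t gm = 0; gm < m; gm += wg.m) {
        for (std::size_t gn = 0; gn < n; gn += wg.n) {
            const std::size_t gm_end = std::min(gm + wg.m, m);
            const std::size_t gn_end = std::min(gn + wg.n, n);

            // Group level: split one workgroup tile into subgroup tiles.
            for (std::size_t sm = gm; sm < gm_end; sm += sg.m) {
                for (std::size_t sn = gn; sn < gn_end; sn += sg.n) {
                    const std::size_t sm_end = std::min(sm + sg.m, gm_end);
                    const std::size_t sn_end = std::min(sn + sg.n, gn_end);

                    // Subgroup level: compute one small tile of C.
                    for (std::size_t i = sm; i < sm_end; ++i) {
                        for (std::size_t j = sn; j < sn_end; ++j) {
                            float acc = 0.0f;
                            for (std::size_t p = 0; p < k; ++p) {
                                acc += a[i * k + p] * b[p * n + j];
                            }
                            c[i * n + j] = acc;
                        }
                    }
                }
            }
        }
    }
    return c;
}
```

Calling `tiled_gemm(a, b, m, n, k, {2, 2}, {1, 2})` gives the same result as an untiled triple loop; only the traversal order changes. In a real kernel the tile shapes would be compile-time template parameters tuned per data type and problem size.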

Compatibility

Category      Requirement                                     Installation
OS            Ubuntu 22.04                                    Install Ubuntu
GPU Card      Intel® Data Center GPU Max Series               N/A
GPU Driver    Stable 736.25 or later                          Install Intel GPU driver
Toolchain     Intel® oneAPI Base Toolkit 2024.0.1 or later    Install Intel® oneAPI Base Toolkit

Features

  • GEMM
    • Data Type
      • Vector-engine-based: fp32
      • Matrix-engine-based: tf32, fp16, bf16, int8
    • Memory Layout
      • Matrix A: row-major, col-major
      • Matrix B: row-major, col-major
      • Matrix C: row-major
  • Epilogue
    • Bias Add
    • GELU Forward
    • GELU Backward
    • RELU
    • Residual Add
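To make the feature list concrete, here is a small CPU reference in plain C++ showing what a GEMM with selectable A/B layouts and a fused epilogue computes. It is a sketch under stated assumptions, not the XeTLA API: `Layout`, `idx`, and `gemm_bias_relu_residual` are hypothetical names, and the epilogue order (Bias Add, then RELU, then Residual Add) is one possible chain picked for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names for illustration; none of these are XeTLA APIs.
enum class Layout { RowMajor, ColMajor };

// Linear index of element (r, c) in a rows x cols matrix with the given layout.
inline std::size_t idx(Layout layout, std::size_t r, std::size_t c,
                       std::size_t rows, std::size_t cols) {
    return layout == Layout::RowMajor ? r * cols + c : c * rows + r;
}

// C = relu(A * B + bias) + residual.
// A is m x k and B is k x n, each in its own layout; C and residual are
// m x n row-major, matching the "Matrix C: row-major" entry above.
std::vector<float> gemm_bias_relu_residual(
        const std::vector<float>& a, Layout layout_a,
        const std::vector<float>& b, Layout layout_b,
        const std::vector<float>& bias,      // length n, broadcast over rows
        const std::vector<float>& residual,  // m x n, row-major
        std::size_t m, std::size_t n, std::size_t k) {
    assert(a.size() == m * k && b.size() == k * n);
    assert(bias.size() == n && residual.size() == m * n);

    std::vector<float> c(m * n);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            // GEMM: dot product of row i of A with column j of B.
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                acc += a[idx(layout_a, i, p, m, k)] *
                       b[idx(layout_b, p, j, k, n)];
            }
            // Epilogue, applied while the accumulator is still in a register.
            acc += bias[j];                // Bias Add
            acc = std::max(acc, 0.0f);     // RELU
            acc += residual[i * n + j];    // Residual Add
            c[i * n + j] = acc;
        }
    }
    return c;
}
```

Fusing the epilogue into the GEMM loop avoids writing the intermediate product to memory and reading it back for each element-wise step, which is the motivation for epilogue fusion in general.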

Documentation

Project Structure

include/                       # Definitions of Intel® XeTLA APIs
    common/                    #    - Low-level APIs that wrap the equivalent ESIMD functionality
    experimental/              #    - Experimental features
    group/                     #    - Group level APIs 
    kernel/                    #    - Kernel level APIs
    subgroup/                  #    - Subgroup level APIs
    xetla.hpp                  #    - Single unified external header file

tests/                         # Tests to verify correctness of Intel® XeTLA APIs
    integration/               #    - Integration tests
    unit/                      #    - Unit tests
    utils/                     #    - Utilities for the unit and integration tests

examples/                      # Examples of Intel® XeTLA basic/fused kernels

tools/                         # Tools for code formatting, build environment...

media/                         # Documents

Contributing

Refer to Contributing Guidelines.

Limitations

Refer to Limitations.

Security

See Intel's Security Center for information on how to report a potential security issue or vulnerability.

See also: Security Policy

Copyright

Copyright (c) 2022-2023 Intel Corporation

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.