Performance-portable, length-agnostic SIMD with runtime dispatch
Updated Dec 20, 2024 - C++
Fast inference engine for Transformer models
Achieve peak performance on x86 CPUs and NVIDIA GPUs
Pintool library for running Quantum Break on pre-SSE4.2 CPUs
Fast SIMD alpha overlay and blending for Raspberry Pi and other ARM systems.
Example implementations of spinlocks
What is a camera calibration, why is it necessary, and how do we compute it?
Fast Recursive SHA256
A discoverable fractal world.
A C++ header-only library for vector, matrix, and quaternion math.
My C/C++/Intrinsic, OpenGL/OpenGLES2 experiments for desktop computers.
A GUI for viewing Intel intrinsic information combined with uops.info measurement data.
ECS-API is an ECS API framework, built to be high-performance yet lightweight and easy to use
OpenCV4 C++ camera calibration in a few lines
This program uses C++ and Windows-specific compiler intrinsics to read the CPU's built-in CPUID information, then decodes and displays the CPU manufacturer ID. It demonstrates how to extract and format this low-level information for display.
IQ-TREE ported to work for systems with ARM NEON ISA