Metal for Compute
This blog post describes how to use Apple’s Metal API for computation on a GPU in a C++ project. The accompanying repository is say4n/metal.compute and is based heavily off of Apple’s Swift/Objective-C original tutorial for performing calculations on a GPU.
Be sure to check them out!
before we get started
Head over to Apple’s getting started with metal-cpp
guide and follow the
instruction in the section titled “Step 1. Prepare your Mac”.
Additionally, make sure to also follow the part of the instruction that describe how to use metal-cpp as a single
header include (section titled “Metal-cpp single header alternative”) in a project.
We will be using Xcode for development, I used version 13.2.1 (13C100).
getting started
At this stage, this post assumes that you have the metal-cpp
sources extracted to a convenient location and that you
have executed the command to use metal-cpp
as a single header include by following the steps described previously.
The next step is to set up an Xcode C++ project.
setting up an Xcode project for C++
Create a new project with the command line application template in Xcode with macOS as the target platform.
Make sure to select C++ as the language!
configuring Xcode to use metal-cpp
In the build settings tab under the project navigator, search for the “Header Search Path” and set it to the base directory where you extracted metal-cpp
.
Next search for “C++ Language Dialect” and set it to C++17 or higher.
And that’s it, we are now ready to use the metal-cpp
to interface with Apple’s Metal API in our C++ project.
the metal shading language
Our tiny metal program will perform the very basic operation of multiplication. To that end, here is the code that defines the operation in the metal shading language.
//
// compute.metal
// metal-test
//
// Created by Sayan on 28/12/21.
//
#include <metal_stdlib>
using namespace metal;
kernel void work_on_arrays(device const float* inA,
device const float* inB,
device float* result,
uint index [[thread_position_in_grid]])
{
result[index] = inA[index] * inB[index];
}
What this does is it takes array elements from two float array inA
and inB
and stores their product in a third array result
.
You can read up more about the metal shading language in its language specification here.
You can also follow Apple’s Swift/Objective-C tutorial for performing calculations on a GPU for a better introduction to programs in the metal shading language.
the driver code
Now that we have defined our compute operation, we need to write the driver code that actually makes it all happen! The files of interest and their respective functions are described in the following sections.
main.cpp
This is the main file that ties everything together.
It instantiates an NS::AutoreleasePool
and a MTL::Device
object.
It then instantiates an object of the compute wrapper class that we describe below.
Then it passes the Metal device to be used for compute to the said wrapper class.
The data for the compute operation is then prepared and the compute command is send.
Finally, the NS::AutoreleasePool
is released.
metalComputeWrapper.hpp
This defines the metalComputeWrapper
class, a wrapper class that takes a MTL::Device
and performs operations define by the kernel in metal shading language.
It also prepares the various buffers as well as the data required by the various operations defined in the compute kernel.
It also verifies that the results produced after computation are as expected and lie within a margin of error.
metalComputeWrapper.cpp
This file contains the implementations of the various member functions of the metalComputeWrapper
class.
Note: This blog post is a Work in Progress and content may be added to it from time to time.