DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.
When used standalone, the DirectML API is a low-level DirectX 12 library suitable for high-performance, low-latency applications such as frameworks, games, and other real-time applications. Its seamless interoperability with Direct3D 12, low overhead, and conformance across hardware make DirectML ideal for accelerating machine learning when both high performance and reliable, predictable results across hardware are required.
More information about DirectML can be found in Introduction to DirectML.
- Getting Started with DirectML
- DirectML Samples
- DxDispatch Tool
- Windows ML on DirectML
- ONNX Runtime on DirectML
- TensorFlow with DirectML
- PyTorch with DirectML
- Feedback
- External Links
- Contributing
Visit the DirectX Landing Page for more resources for DirectX developers.
DirectML is distributed as a system component of Windows 10, and is available as part of the operating system (OS) in Windows 10, version 1903 (10.0; Build 18362), and newer.
Starting with DirectML version 1.4.0, DirectML is also available as a standalone redistributable package (see Microsoft.AI.DirectML), which is useful for applications that wish to use a fixed version of DirectML, or when running on older versions of Windows 10.
DirectML requires a DirectX 12-capable device. Almost all commercially available graphics cards released in the last several years support DirectX 12. Examples of compatible hardware include:
- AMD GCN 1st Gen (Radeon HD 7000 series) and above
- Intel Haswell (4th-gen Core) HD Integrated Graphics and above
- NVIDIA Kepler (GTX 600 series) and above
- Qualcomm Adreno 600 and above
DirectML exposes a native C++ DirectX 12 API. The header and library (DirectML.h/DirectML.lib) are available as part of the redistributable NuGet package, and are also included in the Windows 10 SDK version 10.0.18362 or newer.
- The Windows 10 SDK can be downloaded from the Windows Dev Center
- Microsoft.AI.DirectML on the NuGet Gallery
- DirectML programming guide
- DirectML API reference
DirectML is built in as a backend to several frameworks, such as Windows ML, ONNX Runtime, and TensorFlow.
See the following sections for more information:
DirectML C++ sample code is available under Samples.
- HelloDirectML: A minimal "hello world" application that executes a single DirectML operator.
- DirectMLSuperResolution: A sample that uses DirectML to execute a basic super-resolution model to upscale video from 540p to 1080p in real time.
- yolov4: YOLOv4 is an object detection model capable of recognizing up to 80 different classes of objects in an image. This sample contains a complete end-to-end implementation of the model using DirectML, and is able to run in real time on a user-provided video stream.
DirectML Python sample code is available under Python/samples. The samples require PyDirectML, an open source Python projection library for DirectML, which can be built and installed to a Python executing environment from Python/src. Refer to the Python/README.md file for more details.
- MobileNet: Adapted from the ONNX MobileNet model. MobileNet classifies an image into 1000 different classes. It is highly efficient in speed and size, ideal for mobile applications.
- MNIST: Adapted from the ONNX MNIST model. MNIST predicts handwritten digits using a convolutional neural network.
- SqueezeNet: Based on the ONNX SqueezeNet model. SqueezeNet performs image classification trained on the ImageNet dataset. It is highly efficient and provides results with good accuracy.
- FNS-Candy: Adapted from the Windows ML Style Transfer model sample, FNS-Candy re-applies specific artistic styles on regular images.
- Super Resolution: Adapted from the ONNX Super Resolution model, Super-Res upscales and sharpens the input images to refine the details and improve image quality.
DxDispatch is a simple command-line executable for launching DirectX 12 compute programs (including DirectML operators) without writing all the C++ boilerplate.
Windows ML (WinML) is a high-performance, reliable API for deploying hardware-accelerated ML inferences on Windows devices. DirectML provides the GPU backend for Windows ML.
DirectML acceleration can be enabled in Windows ML using the LearningModelDevice with any one of the DirectX DeviceKinds.
For more information, see Get Started with Windows ML.
- Windows Machine Learning Overview (docs.microsoft.com)
- Windows Machine Learning GitHub
- WinMLRunner, a tool for executing ONNX models using WinML with DirectML
ONNX Runtime is a cross-platform inferencing and training accelerator compatible with many popular ML/DNN frameworks, including PyTorch, TensorFlow/Keras, scikit-learn, and more.
DirectML is available as an optional execution provider for ONNX Runtime that provides hardware acceleration when running on Windows 10.
For more information about getting started, see Using the DirectML execution provider.
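As a rough illustration of opting in to the DirectML execution provider from Python, the sketch below prefers `DmlExecutionProvider` when ONNX Runtime reports it and otherwise falls back to the CPU provider. It assumes the `onnxruntime-directml` PyPI package is installed; `choose_providers`, `create_session`, and the model path passed to it are hypothetical names used only for this example.

```python
def choose_providers(available):
    """Return a provider priority list: DirectML first if present, then CPU."""
    preferred = ["DmlExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]


def create_session(model_path):
    """Create an InferenceSession for model_path (a hypothetical .onnx file)."""
    # Assumption: the onnxruntime-directml package provides this module
    # together with the DirectML execution provider.
    import onnxruntime as ort

    providers = choose_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)


if __name__ == "__main__":
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed")
    else:
        # Shows which providers would be requested on this machine.
        print(choose_providers(ort.get_available_providers()))
```

Listing the CPU provider after DirectML keeps the same script usable on machines where the DirectML provider is not present.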
TensorFlow is a popular open source platform for machine learning and is a leading framework for training of machine learning models.
DirectML acceleration for TensorFlow 1.15 is currently available for Public Preview. TensorFlow on DirectML enables training and inference of complex machine learning models on a wide range of DirectX 12-compatible hardware.
TensorFlow on DirectML is supported on both the latest versions of Windows 10 and the Windows Subsystem for Linux, and is available for download as a PyPI package. For more information about getting started, see GPU accelerated ML training (docs.microsoft.com).
- TensorFlow on DirectML GitHub repo
- TensorFlow on DirectML samples
- tensorflow-directml PyPI project
- TensorFlow GitHub | RFC: TensorFlow on DirectML
- TensorFlow homepage
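One way to check that a TensorFlow on DirectML installation can see a DirectX 12 device is to list the local devices and keep those of type `DML`. The sketch below assumes the `tensorflow-directml` package is installed and that it reports DirectML devices with the device type string `"DML"`; `dml_device_names` is a hypothetical helper written for this example.

```python
def dml_device_names(devices):
    """Return the names of devices whose device_type is "DML"."""
    # Assumption: tensorflow-directml reports DirectML devices as type "DML".
    return [d.name for d in devices if d.device_type == "DML"]


if __name__ == "__main__":
    try:
        # Provided by the tensorflow-directml PyPI package (a TensorFlow 1.15 build).
        from tensorflow.python.client import device_lib
    except ImportError:
        print("TensorFlow is not installed")
    else:
        # An empty list means no DirectML device was found.
        print(dml_device_names(device_lib.list_local_devices()))
```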
DirectML acceleration for PyTorch 1.13.0 is currently available for Public Preview. PyTorch with DirectML enables training and inference of complex machine learning models on a wide range of DirectX 12-compatible hardware. New in PyTorch version 1.13.0, pytorch-directml is now built as a separate plugin to the PyTorch library. DirectML acceleration for PyTorch 1.8.0 is still available but is now deprecated.
PyTorch on DirectML is supported on both the latest versions of Windows 10 and the Windows Subsystem for Linux, and is available for download as a PyPI package. For more information about getting started, see GPU accelerated ML training (docs.microsoft.com).
- PyTorch-1.13 on DirectML samples
- (Deprecated) PyTorch-1.8 on DirectML samples
- torch-directml PyPI project
- PyTorch homepage
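Because the DirectML backend is a separate plugin, a script selects it explicitly rather than through PyTorch's built-in device names. The following is a minimal sketch, assuming the `torch-directml` PyPI package is installed; `get_dml_device` and `add_on_device` are hypothetical helpers written for this example.

```python
def get_dml_device():
    """Return the default DirectML device, or None if the plugin is missing."""
    try:
        import torch_directml  # provided by the torch-directml PyPI package
    except ImportError:
        return None
    return torch_directml.device()


def add_on_device(device):
    """Add two small tensors on the given device and return a Python list."""
    import torch

    a = torch.tensor([1.0, 2.0]).to(device)
    b = torch.tensor([3.0, 4.0]).to(device)
    # Copy the result back to the CPU before converting it to a list.
    return (a + b).cpu().tolist()


if __name__ == "__main__":
    dml = get_dml_device()
    if dml is None:
        print("torch-directml is not installed; nothing to run")
    else:
        print(add_on_device(dml))
```

Tensors and models are moved to the DirectML device with `.to(device)`, the same way they would be moved to any other PyTorch device.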
We look forward to hearing from you!
- For TensorFlow with DirectML issues, bugs, and feedback; or for general DirectML issues and feedback, please file an issue or contact us directly at [email protected].
- For PyTorch with DirectML issues, bugs, and feedback; or for general DirectML issues and feedback, please file an issue or contact us directly at [email protected].
- For Windows ML issues, please file a GitHub issue at microsoft/Windows-Machine-Learning or contact us directly at [email protected].
- For ONNX Runtime issues, please file an issue at microsoft/onnxruntime.
DirectML programming guide
DirectML API reference
Introducing DirectML (Game Developers Conference '19)
Accelerating GPU Inferencing with DirectML and DirectX 12 (SIGGRAPH '18)
Windows AI: hardware-accelerated ML on Windows devices (Microsoft Build '20)
Gaming with Windows ML (DirectX Developer Blog)
DirectML at GDC 2019 (DirectX Developer Blog)
DirectX ❤ Linux (DirectX Developer Blog)
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.