Microsoft Unveils Surface Laptop Ultra: Nvidia RTX Spark Architecture Pairs Grace CPU and Blackwell GPU on Windows on Arm
Microsoft has announced the Surface Laptop Ultra, a high-performance workstation laptop built on the Nvidia RTX Spark platform, which pairs a 20-core Grace CPU with a Blackwell GPU over NVLink-C2C. The launch brings system-level changes to Windows 11 that enable up to 128GB of dynamically allocated unified memory and local execution of models with up to 120 billion parameters.
The Silicon Architecture: Grace, Blackwell, and NVLink-C2C
The core of the Surface Laptop Ultra is the Nvidia RTX Spark platform, utilizing custom Nvidia N1X silicon. This system-on-chip (SoC) integrates a 20-core Nvidia Grace CPU, co-developed with MediaTek, alongside an Nvidia Blackwell RTX GPU. The GPU features up to 6,144 CUDA cores and fifth-generation Tensor Cores with support for FP4 precision.
Rather than relying on traditional PCIe lanes, the CPU and GPU are linked by Nvidia's NVLink-C2C (chip-to-chip) interconnect. This high-bandwidth link underpins the platform's unified memory architecture: the CPU and GPU share a single pool of up to 128GB, and the system divides that pool between them dynamically according to real-time workload demands.
Kernel-Level Windows 11 Tuning and Memory Management
To take advantage of this hardware, Microsoft has made system-level changes to Windows 11 on Arm. The operating system has a rewritten memory management subsystem that significantly raises the limit on how much system memory the GPU can access directly. Windows 11 also manages page sizes more efficiently within shared memory regions, which is intended to benefit the rendering pipelines used by developers and creators.
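Why page size matters for a large shared region comes down to simple arithmetic: the larger the page, the fewer mappings the operating system and GPU have to track. The sketch below is illustrative only. The page sizes (4 KiB, 64 KiB, 2 MiB) and the 96 GiB region are assumed values chosen for the example, not figures published by Microsoft or Nvidia.

```python
# Illustrative arithmetic only: the page sizes and region size below are
# assumptions, not documented Windows 11 or RTX Spark parameters.

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_needed(region_bytes: int, page_bytes: int) -> int:
    """Number of pages required to map a region, rounded up."""
    return -(-region_bytes // page_bytes)  # ceiling division


def mapping_table(region_bytes: int, page_sizes: dict) -> dict:
    """Return the page count for each candidate page size."""
    return {name: pages_needed(region_bytes, size)
            for name, size in page_sizes.items()}


ASSUMED_PAGE_SIZES = {"4 KiB": 4 * KIB, "64 KiB": 64 * KIB, "2 MiB": 2 * MIB}
ASSUMED_REGION = 96 * GIB  # hypothetical GPU-visible shared region

if __name__ == "__main__":
    for name, count in mapping_table(ASSUMED_REGION, ASSUMED_PAGE_SIZES).items():
        print(f"{name:>6} pages: {count:,} mappings")
```

Under these assumptions, moving from 4 KiB to 2 MiB pages cuts the number of mappings for the same region by a factor of 512, which is the kind of saving that page-size tuning in shared regions would target.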
On the scheduling side, Microsoft has introduced a new workload-profile scheduling system designed to scale tasks across the 20 Grace CPU cores. For thermal and power management, the Microsoft Power and Thermal Framework has been adapted to the RTX Spark platform to balance peak performance against sustained workloads within the dual-fan chassis.
Emulation, Security, and Ecosystem Enablement
Legacy applications run through an optimized version of Microsoft's Prism emulation layer. Prism has been tuned for the RTX Spark microarchitecture and supports the AVX and AVX2 instruction set extensions, so x86 binaries that depend on them can run under emulation.
Security for local AI workflows is managed via Nvidia's OpenShell runtime, which integrates with new Microsoft-designed security and containment primitives. These containment features sandbox local agents such as Hermes and OpenClaw to isolate them from the core operating system.
On ecosystem compatibility, the RTX Spark platform natively supports the Epic and BattlEye anti-cheat engines. As a result, games such as League of Legends, Valorant, and PUBG Battlegrounds run natively on the platform. Adobe is also rearchitecting Premiere and Photoshop to run natively, using the unified memory pool and TensorRT.
Hardware Specifications and Chassis Engineering
The chassis weighs under 4.5 pounds (approximately 2 kg) and is cooled by a dual-fan thermal solution designed to limit throttling during sustained compute tasks. Key specifications include:
- 15-inch mini-LED PixelSense Ultra touchscreen
- 2880 x 1920 resolution at 262 pixels per inch
- 2,000 nits peak HDR brightness
- Dedicated I/O: 1x HDMI, USB-C, USB-A, SD card reader, 3.5mm headphone jack
- Replaceable SSD, user-accessible repair guides, and replacement parts
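Pixel density follows directly from resolution and diagonal size, so the listed panel figures can be sanity-checked with a few lines of arithmetic. The sketch below computes PPI from a diagonal and, inversely, the diagonal implied by a stated PPI. It treats the nominal 15-inch size as an exact diagonal measurement, which is an assumption, since marketing sizes are often rounded.

```python
import math


def diagonal_pixels(width_px: int, height_px: int) -> float:
    """Length of the panel diagonal, measured in pixels."""
    return math.hypot(width_px, height_px)


def pixels_per_inch(width_px: int, height_px: int, diagonal_in: float) -> float:
    """Pixel density for a panel of the given resolution and diagonal."""
    return diagonal_pixels(width_px, height_px) / diagonal_in


def implied_diagonal_in(width_px: int, height_px: int, ppi: float) -> float:
    """Diagonal size (inches) implied by a resolution and a stated PPI."""
    return diagonal_pixels(width_px, height_px) / ppi


if __name__ == "__main__":
    # Resolution and PPI as listed; 15.0 in is an assumed exact diagonal.
    print(f"PPI at 15.0 in: {pixels_per_inch(2880, 1920, 15.0):.1f}")
    print(f"Diagonal at 262 PPI: {implied_diagonal_in(2880, 1920, 262):.2f} in")
```

With these inputs, an exact 15.0-inch diagonal works out to roughly 231 PPI, while 262 PPI corresponds to a diagonal of roughly 13.2 inches, so one of the listed figures may be rounded or may describe a different measurement.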
With 1 petaflop of AI compute, the system lets developers run AI models of up to 120 billion parameters entirely on-device, with full CUDA support on the unified memory architecture.
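A rough weights-only estimate shows how a 120-billion-parameter model relates to a 128GB pool. The sketch below is a back-of-the-envelope calculation, not a statement of how the platform loads models: the announcement does not say which precision the 120-billion-parameter figure assumes, the pool is treated as 128 decimal gigabytes, and the KV cache, activations, and operating system overhead are ignored.

```python
# Back-of-the-envelope weight footprint. Ignores KV cache, activations,
# and OS/runtime overhead, all of which add to the real requirement.

BITS_PER_PARAM = {"FP16": 16, "FP8": 8, "FP4": 4}
POOL_GB = 128  # unified memory pool quoted in the article (assumed decimal GB)


def weight_footprint_gb(num_params: float, bits: int) -> float:
    """Weights-only size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits / 8 / 1e9


def fits_in_pool(num_params: float, bits: int, pool_gb: float = POOL_GB) -> bool:
    """True if the weights alone fit within the given memory pool."""
    return weight_footprint_gb(num_params, bits) <= pool_gb


if __name__ == "__main__":
    params = 120e9  # 120 billion parameters
    for name, bits in BITS_PER_PARAM.items():
        size = weight_footprint_gb(params, bits)
        print(f"{name}: {size:.0f} GB, fits in {POOL_GB} GB: {fits_in_pool(params, bits)}")
```

On this estimate the weights take about 240 GB at FP16, 120 GB at FP8, and 60 GB at FP4, so only the lower-precision formats leave meaningful headroom in a 128GB pool once runtime overhead is taken into account.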