Device Platform — NVIDIA, Qualcomm and AMD edge AI

Device · Robotics edge

NVIDIA AGX Thor

A Blackwell-class edge supercomputer for physical AI. NVIDIA Jetson AGX Thor packs up to 2,070 FP4 TFLOPS of generative-AI compute and 128 GB of unified memory into a power-configurable module small enough to live inside a robot, a vehicle or a machine — running several large models, vision and multi-sensor fusion at once, on-prem. We build, optimize and operate the full Unovie stack on Thor, so your edge agents run where the data is born.

2,070 TFLOPS

FP4 AI compute

128 GB

unified LPDDR5X

~7.5×

vs Jetson Orin

The device

An edge box you can hold.

Thor ships as a compact, fan-cooled edge node: a dense I/O wall of USB, networking, display and capture, with a Blackwell GPU and 128 GB of unified memory behind it. Mount it on the line, in the cab or at the cell — and run the models where the data is born.

NVIDIA AGX Thor edge node — periwinkle line drawing of the chassis and front I/O

01 — What it does

Physical AI at the edge

/blackwell

Blackwell on a module

A datacenter-class Blackwell GPU with FP4 and a transformer engine, packed into a module — generative and vision models that used to need a rack now run inside the machine.

Blackwell · FP4 · transformer-engine

/fusion

Multi-sensor, multi-model

A 14-core Arm Neoverse CPU and 273 GB/s of unified memory bandwidth run camera, lidar, radar and language models together, fused in real time for autonomy and inspection.

sensor-fusion · multi-model · real-time

/safety

Partitioned & safety-ready

MIG carves the GPU into isolated slices inside a configurable 40–130 W envelope, with a functional-safety design for robots and autonomous machines.

MIG · 40–130 W · safety

02 — How it works

Silicon to autonomy

Provision

Image Thor with the Unovie edge stack.

Serve

Local models + Nexus context, on-device.

Fuse

Vision, sensors and agents reason live.

Act

Closed-loop control, fully on-prem.

03 — Architecture

Built for the machine

/compute

Blackwell GPU + Tensor Cores

2,560 CUDA cores and next-gen Tensor Cores with FP4 and a transformer engine for on-device generative AI.

CUDA · Tensor · FP4

/cpu

14-core Arm Neoverse

A 14-core Arm Neoverse-V3AE cluster feeds the GPU and runs the control plane, sensors and OS.

Neoverse-V3AE · 14-core

/io

Sensor-grade I/O

High-speed camera, networking and PCIe lanes ingest many sensors at once with deterministic latency.

MIPI/CSI · PCIe · 10/25G

04 — By the numbers

By the numbers

2,070 TFLOPS

FP4 (sparse)

128 GB

LPDDR5X

273 GB/s

memory bandwidth

40–130 W

configurable
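
One way to read the bandwidth figure: single-stream token generation on a dense model is usually limited by memory traffic, because roughly every weight is read once per generated token. The sketch below turns that assumption into an upper bound on decode speed. The helper name, the 8B and 70B model sizes and the 4-bit and 8-bit weight formats are illustrative assumptions, not figures from NVIDIA or Unovie, and measured throughput will be lower.

```python
def decode_tokens_per_second_ceiling(params_billion: float,
                                     bits_per_weight: int,
                                     bandwidth_gb_s: float) -> float:
    """Upper bound on single-stream decode speed for a dense model.

    Assumes every weight is read once per generated token and ignores
    KV-cache traffic, activations and compute limits.
    """
    # 1e9 parameters * (bits / 8) bytes each = that many gigabytes of weights
    weight_gb = params_billion * bits_per_weight / 8
    return bandwidth_gb_s / weight_gb


THOR_BANDWIDTH_GB_S = 273.0  # memory bandwidth quoted in the figures above

if __name__ == "__main__":
    for size_b in (8, 70):      # hypothetical model sizes, in billions of parameters
        for bits in (4, 8):     # hypothetical weight precisions
            ceiling = decode_tokens_per_second_ceiling(size_b, bits, THOR_BANDWIDTH_GB_S)
            print(f"{size_b}B @ {bits}-bit: <= {ceiling:.1f} tokens/s")
```

Batching several requests amortizes each weight read across more tokens, so aggregate throughput can exceed this per-stream ceiling.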

Device · Desktop supercomputer

NVIDIA DGX Spark

A petaFLOP AI supercomputer that fits on a desk. NVIDIA DGX Spark pairs the GB10 Grace Blackwell Superchip with 128 GB of coherent unified memory and up to 1,000 TFLOPS of FP4 compute — enough to prototype, fine-tune and run models up to ~200B parameters locally, or ~405B across a linked pair. We run it as your private development and inference node: the full Unovie stack, your data, your room.

1,000 TFLOPS

FP4 AI compute

128 GB

coherent memory

200B

params, local

The device

A supercomputer that fits on a desk.

Spark is a desktop-sized chassis with a perforated cooling top and a full I/O wall — a GB10 Grace Blackwell Superchip and 128 GB of coherent memory inside. Develop, fine-tune and serve large models locally; promote them to the edge unchanged.

NVIDIA DGX Spark desktop AI supercomputer — periwinkle line drawing

01 — What it does

A supercomputer you own

/gb10

Grace Blackwell GB10

A 20-core Arm Grace CPU and a Blackwell GPU joined by NVLink-C2C share one coherent memory space — no PCIe copies between CPU and GPU.

GB10 · NVLink-C2C · coherent

/memory

128 GB for big models

Unified LPDDR5X holds models up to ~200B parameters; two units linked over ConnectX scale to ~405B — inference and fine-tuning without the cloud.

200B local · 405B linked · ConnectX
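
The parameter ceilings above follow from simple arithmetic on weight size. The page does not state the precision behind the ~200B and ~405B figures, so the sketch below assumes 4-bit weights (in line with the FP4 compute figure) and reserves a hypothetical 10% of memory for the OS, KV cache and activations. The function names and the reserve fraction are illustrative, not part of any NVIDIA tool.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Gigabytes needed to hold the weights alone."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: int, memory_gb: float,
         reserve_fraction: float = 0.10) -> bool:
    """True if the weights fit after reserving headroom.

    reserve_fraction is an assumed allowance for the OS, KV cache and
    activations; tune it for a real workload.
    """
    usable_gb = memory_gb * (1 - reserve_fraction)
    return weight_footprint_gb(params_billion, bits_per_weight) <= usable_gb


if __name__ == "__main__":
    cases = [
        ("200B, 4-bit, one unit", 200, 4, 128),
        ("200B, 8-bit, one unit", 200, 8, 128),
        ("405B, 4-bit, two units", 405, 4, 256),
    ]
    for label, params_b, bits, mem_gb in cases:
        size = weight_footprint_gb(params_b, bits)
        verdict = "fits" if fits(params_b, bits, mem_gb) else "does not fit"
        print(f"{label}: {size:.1f} GB of weights, {verdict} in {mem_gb} GB")
```

Under these assumptions a 200B model needs 100 GB at 4-bit, which fits in one unit, while the same model at 8-bit needs 200 GB and does not.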

/stack

The full NVIDIA AI stack

Runs NIM microservices, CUDA frameworks and the same containers as DGX in the datacenter — develop locally, deploy to the edge unchanged.

NIM · CUDA · portable

02 — How it works

Desk to deployment

Build

Prototype & fine-tune locally on Spark.

Ground

Wire in your Nexus context and data.

Validate

Run the same containers as production.

Promote

Ship unchanged to edge or MicroCloud.

03 — Architecture

One coherent memory space

/superchip

GB10 Grace Blackwell

Grace CPU and Blackwell GPU on one package, joined by the NVLink-C2C chip-to-chip interconnect.

GB10 · NVLink-C2C

/memory

128 GB unified LPDDR5X

CPU and GPU address one coherent pool — no host-device copies, and room for ~200B-parameter models.

unified · coherent · 200B

/fabric

ConnectX scale-out

ConnectX networking links two Sparks into a single ~405B-parameter inference target.

ConnectX · RDMA · 405B

04 — By the numbers

By the numbers

1,000 TFLOPS

FP4 AI

128 GB

unified memory

20

Arm Grace cores

4 TB

NVMe storage

Device · Power-efficient edge

Qualcomm QCS6490

A power-efficient edge-AI processor for robots, cameras and handhelds. The Qualcomm QCS6490 pairs an octa-core Kryo CPU, an Adreno GPU and a Hexagon AI processor for up to 12 TOPS of AI compute. It runs multi-camera vision and on-device models on a fanless, battery-friendly power budget, with Wi-Fi 6E and long industrial lifecycle support. We bring the Unovie stack to it, so intelligence runs at the far edge, on hardware you own.

12 TOPS

Hexagon AI

5

concurrent cameras

Wi-Fi 6E

FastConnect

The device

Built for the far edge.

QCS6490 reference hardware brings a full I/O wall — USB-C, USB 3.0, dual Ethernet, 10GbE and HDMI — to a compact, fanless box. Premium-tier on-device AI without the power bill, deployed where wires and watts are scarce.

Qualcomm QCS6490 edge box — periwinkle line drawing of the chassis and front I/O

01 — What it does

AI on a power budget

/hexagon

Hexagon AI at low watts

Up to 12 TOPS from the Hexagon processor with a fused tensor accelerator: vision, speech and sensor models on a budget that fits a fanless box or a battery.

12 TOPS · Hexagon · low-power

/vision

Triple ISP, many cameras

A Spectra triple ISP ingests up to five concurrent cameras with computer-vision hardware — multi-camera perception for robots, handhelds and smart cameras.

Spectra ISP · 5 cameras · CV

/connect

Wi-Fi 6E, built to last

FastConnect Wi-Fi 6E and Bluetooth 5.2 keep the edge connected wirelessly, with wide-temperature, long-lifecycle industrial availability.

Wi-Fi 6E · BT 5.2 · industrial

02 — How it works

Sense to inference

Capture

Up to 5 cameras and sensors stream in.

Process

Kryo CPU + Adreno GPU + Hexagon NPU.

Infer

Vision and language models on-device.

Connect

Results over Wi-Fi 6E, no cloud.
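
To size models for the capture-to-inference path above, it helps to divide the peak AI figure by the combined frame rate. The sketch below does that arithmetic for five cameras. The 30 fps rate, the 50% utilization case and the function name are assumptions for illustration; the page does not state frame rates or sustained utilization.

```python
def ops_budget_per_frame(peak_tera_ops: float, cameras: int, fps: float,
                         utilization: float = 1.0) -> float:
    """Giga-operations available per frame when compute is shared evenly.

    utilization scales the peak figure down to what a workload is assumed
    to sustain; 1.0 means the theoretical peak.
    """
    frames_per_second = cameras * fps
    return peak_tera_ops * 1000 * utilization / frames_per_second


PEAK_TERA_OPS = 12   # peak AI figure quoted for the Hexagon engine above
CAMERAS = 5          # concurrent cameras quoted above
ASSUMED_FPS = 30     # assumption: frame rate per camera

if __name__ == "__main__":
    peak = ops_budget_per_frame(PEAK_TERA_OPS, CAMERAS, ASSUMED_FPS)
    derated = ops_budget_per_frame(PEAK_TERA_OPS, CAMERAS, ASSUMED_FPS, utilization=0.5)
    print(f"per-frame budget at peak: {peak:.0f} giga-ops")
    print(f"per-frame budget at 50% utilization: {derated:.0f} giga-ops")
```

A per-camera model whose forward pass costs more than this budget would have to run at a lower frame rate or on fewer streams.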

03 — Architecture

A heterogeneous compute engine

/cpu

Octa-core Kryo CPU

A 6 nm octa-core Qualcomm Kryo CPU runs the OS, control and classical workloads beside the AI engines.

Kryo · octa-core · 6 nm

/npu

Hexagon + Adreno

The Hexagon processor with a fused tensor accelerator and the Adreno GPU share inference and graphics, for up to 12 TOPS of AI compute.

Hexagon · Adreno · 12 TOPS

/isp

Spectra triple ISP

A triple ISP captures up to five concurrent camera streams with 4K HDR video and hardware-accelerated computer vision.

Spectra · 5 cameras · 4K HDR

04 — By the numbers

By the numbers

12 TOPS

Hexagon AI

5

concurrent cameras

Wi-Fi 6E

FastConnect

6 nm

process

Device · Private AI server

AMD Ryzen AI Max+ 395

A private AI server in a small metal box. The AMD Ryzen AI Max+ 395 fuses 16 Zen 5 CPU cores, a Radeon 8060S iGPU and a next-gen XDNA 2 NPU for 126 platform AI TOPS, paired with 128 GB of LPDDR5X-8000. That is enough to run 70B-class models locally, behind dual 10GbE and USB4 so nodes cluster into a compute hub. We deploy the Unovie stack on it for secure, private inference on hardware you own.

126 TOPS

platform AI

128 GB

LPDDR5X-8000

70B

models, local

The device

A server that hides in plain sight.

An all-metal chassis with a built-in 230 W supply exposes dual 10GbE, dual USB4 and fast PCIe 4.0 NVMe on its I/O wall. It is a quiet, durable node: rack several together, or set a single unit on a desk.

AMD Ryzen AI Max+ 395 mini AI server — periwinkle line drawing

01 — What it does

A private model server

/apu

16 Zen 5 + Radeon + XDNA 2

Sixteen Zen 5 CPU cores, a Radeon 8060S iGPU and a next-gen XDNA 2 NPU combine for 126 AI TOPS: CPU, GPU and NPU inference in one package.

Zen 5 · Radeon 8060S · XDNA 2

/memory

128 GB for big models

128 GB of LPDDR5X-8000 keeps large models — 70B-class and up — resident and private, with no weights leaving the box.

128 GB · LPDDR5X-8000 · 70B local

/cluster

Clusters into a hub

Dual 10GbE and dual USB4 at 40 Gbps link nodes into an AI compute hub for distributed, local inference.

dual 10GbE · USB4 40G · clustering
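
When planning a cluster, the link speed sets how long it takes to move model weights or shards between nodes. The sketch below computes transfer time at raw line rate for the two links named above. The 35 GB payload (a 70B-parameter model at an assumed 4 bits per weight), the efficiency parameter and the function name are illustrative assumptions; real transfers carry protocol overhead and will take longer.

```python
def transfer_seconds(payload_gb: float, link_gbit_s: float,
                     efficiency: float = 1.0) -> float:
    """Seconds to move payload_gb gigabytes over a link of link_gbit_s gigabits/s.

    efficiency is the assumed fraction of line rate actually achieved;
    1.0 means raw line rate with no protocol overhead.
    """
    payload_gbit = payload_gb * 8
    return payload_gbit / (link_gbit_s * efficiency)


# Assumption: 70B parameters at 4 bits per weight = 35 GB of weights.
WEIGHTS_GB = 70 * 4 / 8

if __name__ == "__main__":
    for name, gbit_s in (("10GbE", 10), ("USB4", 40)):
        seconds = transfer_seconds(WEIGHTS_GB, gbit_s)
        print(f"{name}: {seconds:.0f} s to move {WEIGHTS_GB:.0f} GB at line rate")
```

The same function can estimate per-step traffic for distributed inference by passing the activation payload instead of the full weights.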

02 — How it works

Box to private cloud

Load

70B-class models resident in 128 GB.

Serve

CPU + Radeon iGPU + XDNA 2 NPU.

Cluster

Link nodes over 10GbE / USB4.

Operate

Private inference, fully on-prem.

03 — Architecture

One package, three engines

/cpu

16 Zen 5 cores

A 16-core Zen 5 CPU drives orchestration, data prep and classical workloads alongside inference.

Zen 5 · 16-core

/gpu

Radeon 8060S + XDNA 2

The Radeon 8060S iGPU and XDNA 2 NPU share AI work for 126 TOPS across vision, language and agents.

Radeon 8060S · XDNA 2 · 126 TOPS

/thermal

140W, vapor-chamber cooled

Dual turbine fans and a full-coverage vapor chamber sustain 140 W at about 32 dB — full performance, near silence.

140 W TDP · vapor chamber · ~32 dB

04 — By the numbers

By the numbers

126 TOPS

platform AI

128 GB

LPDDR5X-8000

16 TB

NVMe · PCIe 4.0

140 W

TDP, ~32 dB

Hardware you own.

Pick the silicon. We'll run it.