Architectural improvements of the Ampere architecture include the following:
• CUDA Compute Capability 8.0 for A100 and 8.6 for the GeForce 30 series
• TSMC's 7 nm FinFET process for A100
• Custom version of Samsung's 8 nm process (8N) for the GeForce 30 series
• Third-generation Tensor Cores with FP16,
bfloat16, TensorFloat-32 (TF32) and FP64 support and sparsity acceleration
• PureVideo feature set K hardware video decoding for the GeForce 30 series and feature set J for A100
• 5 NVDEC decoders for A100
• New hardware-based 5-core JPEG decoder (NVJPG) supporting YUV420, YUV422, YUV444, YUV400 and RGBA; not to be confused with Nvidia NVJPEG, the GPU-accelerated library for JPEG encoding and decoding
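
The reduced-precision Tensor Core formats named in the list above differ mainly in how many exponent and mantissa bits they keep. The following is a minimal Python sketch, not Nvidia's implementation, that emulates TF32 and bfloat16 precision on the CPU by clearing the low mantissa bits of an FP32 value. The names `FORMATS`, `truncate_mantissa`, `to_tf32` and `to_bf16` are hypothetical helpers introduced for illustration, and the sketch assumes simple truncation (round toward zero), whereas real hardware may round differently.

```python
import struct

# (exponent bits, explicit mantissa bits); every format also has one sign bit.
FORMATS = {
    "FP32": (8, 23),
    "TF32": (8, 10),  # FP32's exponent range with FP16's mantissa width
    "BF16": (8, 7),   # FP32's exponent range with a 7-bit mantissa
    "FP16": (5, 10),
}


def truncate_mantissa(x: float, mantissa_bits: int) -> float:
    """Keep only the top `mantissa_bits` of x's FP32 mantissa.

    Assumption: truncation (round toward zero) stands in for the
    hardware rounding mode, which this sketch does not model.
    """
    # Reinterpret the FP32 encoding of x as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    drop = 23 - mantissa_bits  # number of low mantissa bits to clear
    bits &= ~((1 << drop) - 1) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def to_tf32(x: float) -> float:
    """Emulate TF32 precision (10 mantissa bits)."""
    return truncate_mantissa(x, FORMATS["TF32"][1])


def to_bf16(x: float) -> float:
    """Emulate bfloat16 precision (7 mantissa bits)."""
    return truncate_mantissa(x, FORMATS["BF16"][1])


def total_bits(name: str) -> int:
    """Sign bit + exponent bits + mantissa bits actually carried by the format."""
    exponent, mantissa = FORMATS[name]
    return 1 + exponent + mantissa
```

Because TF32 and bfloat16 share FP32's 8-bit exponent, the same masking helper covers both; FP16 has a narrower exponent and would need range handling that this sketch leaves out.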
===Chips===
• GA100
• GA102
• GA103
• GA104
• GA106
• GA107
• GA10B

Comparison of Compute Capability: GP100 vs GV100 vs GA100

Comparison of Precision Support Matrix

Legend:
• FPnn: floating point with nn bits
• INTn: integer with n bits
• INT1: binary
• TF32: TensorFloat32
• BF16: bfloat16

Comparison of Decode Performance

==Ampere dies==