Ampere (microarchitecture)

Architectural improvements of the Ampere architecture include the following:
• CUDA Compute Capability 8.0 for A100 and 8.6 for the GeForce 30 series
• TSMC's 7 nm FinFET process for A100
• Custom version of Samsung's 8 nm process (8N) for the GeForce 30 series
• Third-generation Tensor Cores with FP16, bfloat16, TensorFloat-32 (TF32) and FP64 support and sparsity acceleration
• … for the GeForce 30 series and feature set J for A100
• 5 NVDEC for A100
• New hardware-based 5-core JPEG decoder (NVJPG) supporting YUV420, YUV422, YUV444, YUV400 and RGBA. Not to be confused with Nvidia NVJPEG (a GPU-accelerated library for JPEG encoding/decoding)

Chips
• GA100
• GA102
• GA103
• GA104
• GA106
• GA107
• GA10B

Comparison of Compute Capability: GP100 vs GV100 vs GA100

Comparison of Precision Support Matrix

Legend:
• FPnn: floating point with nn bits
• INTn: integer with n bits
• INT1: binary
• TF32: TensorFloat-32
• BF16: bfloat16

Comparison of Decode Performance

==Ampere dies==
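As an illustration of the reduced-precision formats named in the precision legend of the architecture overview, the following minimal Python sketch emulates TF32 and BF16 by discarding low-order mantissa bits of an FP32 value. Both formats keep FP32's 8-bit exponent, so only the mantissa width changes (10 bits for TF32, 7 bits for BF16). The helper names `truncate_mantissa`, `as_tf32` and `as_bf16` are hypothetical, and the use of plain truncation (round toward zero) is a simplifying assumption; it is not a statement about how the hardware rounds.

```python
import struct

# (exponent bits, explicit mantissa bits); every format also has 1 sign bit.
FORMATS = {
    "FP32": (8, 23),
    "TF32": (8, 10),
    "BF16": (8, 7),
    "FP16": (5, 10),
}


def truncate_mantissa(x: float, mantissa_bits: int) -> float:
    """Keep only the top `mantissa_bits` of an FP32 mantissa.

    Assumption: truncation (round toward zero) is used for simplicity.
    This only models formats that share FP32's 8-bit exponent, so it is
    not suitable for FP16, whose exponent range is smaller.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]  # FP32 bit pattern
    drop = 23 - mantissa_bits                            # low bits to clear
    mask = (0xFFFFFFFF << drop) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


def as_tf32(x: float) -> float:
    """Approximate a value as TF32 (10 mantissa bits)."""
    return truncate_mantissa(x, 10)


def as_bf16(x: float) -> float:
    """Approximate a value as BF16 (7 mantissa bits)."""
    return truncate_mantissa(x, 7)


if __name__ == "__main__":
    for name, (exp_bits, man_bits) in FORMATS.items():
        print(f"{name}: 1 sign + {exp_bits} exponent + {man_bits} mantissa bits")
    x = 1.0 + 2.0 ** -11
    print(as_tf32(x), as_bf16(x))
```

In this sketch, a value such as 1 + 2^-11 collapses to 1.0 in both emulated formats, because the smallest step above 1.0 is 2^-10 for TF32 and 2^-7 for BF16.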