en.wikipedia.org/wiki/Graphics_processing_unit
3 corrections found
first introduced in 2007
CUDA was introduced in 2006, not 2007. NVIDIA’s own documentation dates CUDA’s introduction to November 2006.
Full reasoning
NVIDIA's own programming guide states: "In November 2006, NVIDIA introduced CUDA". NVIDIA's CUDA FAQ likewise says CUDA has been deployed "since its introduction in 2006".
That means the date in this sentence is off by a year. Even if widespread adoption grew in 2007 and later, the platform itself was introduced in November 2006, not 2007.
2 sources
- CUDA C++ Programming Guide
In November 2006, NVIDIA introduced CUDA, a general purpose parallel computing platform and programming model...
- CUDA FAQ | NVIDIA Developer
Since its introduction in 2006, CUDA has been widely deployed through thousands of applications...
Three of the ten most powerful supercomputers in the world take advantage of GPU acceleration.
This statement is out of date. In the June 2025 TOP500 list, 9 of the top 10 systems use GPUs or GPU-class accelerators, not just 3.
Full reasoning
The current TOP500 list contradicts this sentence.
On the June 2025 TOP500 list, ranks 1, 2, 3, 4, 5, 6, 8, 9, and 10 all explicitly list accelerators/GPUs in their configurations:
- #1 El Capitan — AMD Instinct MI300A
- #2 Frontier — AMD Instinct MI250X
- #3 Aurora — Intel Data Center GPU Max
- #4 JUPITER Booster — NVIDIA GH200 Superchip
- #5 Eagle — NVIDIA H100
- #6 HPC6 — AMD Instinct MI250X
- #8 Alps — NVIDIA GH200 Superchip
- #9 LUMI — AMD Instinct MI250X
- #10 Leonardo — NVIDIA A100
Only #7 Fugaku is listed without a GPU accelerator. So the current figure is 9 of the top 10, not 3 of the top 10.
1 source
- TOP500 List - June 2025 | TOP500
Rank 1 El Capitan ... AMD Instinct MI300A ... 2 Frontier ... AMD Instinct MI250X ... 3 Aurora ... Intel Data Center GPU Max ... 4 JUPITER Booster ... NVIDIA GH200 Superchip ... 5 Eagle ... NVIDIA H100 ... 6 HPC6 ... AMD Instinct MI250X ... 7 Supercomputer Fugaku ... 8 Alps ... NVIDIA GH200 Superchip ... 9 LUMI ... AMD Instinct MI250X ... 10 Leonardo ... NVIDIA A100.
using 4×4 matrix multiplication and division.
Tensor/AI cores do matrix multiply-accumulate operations, not division. NVIDIA’s own Tensor Core documentation describes 4×4 matrix operations as fused multiply-add / multiply-accumulate.
Full reasoning
This sentence misdescribes how Tensor Cores work.
NVIDIA's Tensor Core documentation says Volta Tensor Cores operate on 4×4 matrices and perform floating-point fused-multiply-add (FMA) operations. NVIDIA's Turing tuning guide is even more explicit: "Each Tensor Core performs the matrix multiply-accumulate: D = A x B + C."
So the 4×4 part is broadly right for early Tensor Cores, but "division" is wrong. The operation is matrix multiply-accumulate (or multiply-add), not matrix division.
2 sources
- Turing Tuning Guide :: CUDA Toolkit Documentation
Each Tensor Core performs the matrix multiply-accumulate: D = A x B + C.
- Tensor Cores in NVIDIA Volta Architecture | NVIDIA
Each of Tesla V100's 640 Tensor Cores operates on a 4x4 matrix... each performing 64 floating-point fused-multiply-add (FMA) operations per clock.
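The operation quoted above, D = A x B + C on 4x4 matrices, can be sketched in plain Python as a scalar emulation. This is purely illustrative of what the correction describes, not Tensor Core code; in hardware the whole thing runs as parallel fused multiply-adds.

```python
# Scalar sketch of the 4x4 matrix multiply-accumulate a Tensor Core
# performs: D = A x B + C. Illustrative only -- real Tensor Cores
# execute this as parallel fused multiply-add operations in one step.

def mma_4x4(A, B, C):
    """Return D = A @ B + C for 4x4 matrices given as nested lists."""
    n = 4
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = C[i][j]  # start from the accumulator matrix C
            for k in range(n):
                acc += A[i][k] * B[k][j]  # the multiply-accumulate step
            D[i][j] = acc
    return D

# Example: with A = I (identity) and C = 0, the result D equals B,
# since I @ B + 0 = B. Note there is no division anywhere: the core
# operation is multiply-accumulate.
I = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
B = [[float(i * 4 + j) for j in range(4)] for i in range(4)]
Z = [[0.0] * 4 for _ in range(4)]
assert mma_4x4(I, B, Z) == B
```

The identity-matrix check makes the contrast concrete: the "division" wording in the original sentence has no counterpart in this operation.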