But fear not: Nvidia’s supplied all that juicy information in a supplementary Pascal tech deep dive for developers.
The details reveal some interesting nuggets. While the Pascal GP100 GPU features smaller 16nm transistors than the Titan X, which was built on 28nm technology, its die is actually roughly the same size, at around 600mm squared. Pascal puts the space to more efficient use though, stuffing 3584 CUDA cores and 224 texture units into 56 streaming multiprocessors (SMs). By comparison, the most potent Maxwell GPU, found in the Titan X and Tesla M40, features 3072 CUDA cores. And the Pascal architecture is technically capable of even more, supporting up to 3840 CUDA cores.
Here’s a block diagram of the Pascal GP100 architecture overall.
And here’s a closer look at the design of Pascal’s streaming multiprocessors, each of which packs 64 single-precision (FP32) CUDA cores and 32 double-precision (FP64) units. Added up across the whole GPU, that’s good for 10.6 teraflops of single-precision floating-point performance and 5.3 teraflops of double-precision performance.
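The teraflops figures follow from the core counts with a bit of arithmetic. Here's a rough sketch of that math in Python. Note that the 1480MHz boost clock and the "two floating-point operations per core per clock" rule (one fused multiply-add) are assumptions brought in for illustration rather than figures from this article, and the names in the snippet are made up for the example.

```python
# Figures quoted in the article.
FP32_CORES = 3584            # CUDA cores in the Pascal GP100 as shipped
FP64_PER_FP32 = 32 / 64      # each SM pairs 32 FP64 units with 64 FP32 cores

# Assumptions, not stated in the article:
BOOST_CLOCK_HZ = 1480e6      # assumed Tesla P100 boost clock (1480MHz)
FLOPS_PER_UNIT_PER_CLOCK = 2 # one fused multiply-add counts as two operations


def peak_teraflops(units, clock_hz=BOOST_CLOCK_HZ):
    """Theoretical peak throughput in teraflops for a number of math units."""
    return units * FLOPS_PER_UNIT_PER_CLOCK * clock_hz / 1e12


fp32_tflops = peak_teraflops(FP32_CORES)
fp64_tflops = peak_teraflops(FP32_CORES * FP64_PER_FP32)

print(f"FP32 peak: {fp32_tflops:.1f} teraflops")
print(f"FP64 peak: {fp64_tflops:.1f} teraflops")
```

Under those assumptions the numbers round to the quoted 10.6 and 5.3 teraflops. These are theoretical peaks, so real workloads will typically land lower.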
Finally, here’s a full breakdown of the Pascal GP100 GPU’s key tech specs, comparing it (as the Tesla P100, Pascal’s premier product) against the Maxwell-based Tesla M40 and Kepler-based Tesla K40.
All these numbers and diagrams are just the tip of the iceberg, though. Check out Nvidia’s Pascal GP100 introduction post for a far deeper dive into the new GPU’s capabilities (seriously, there’s a lot more). You’ll also want to check out PCWorld’s Pascal GPU coverage for more information about the rest of the chip, like its 16GB of second-gen HBM memory and ludicrously fast new NVLink interconnect technology. Remember: All these delicious goodies will trickle down to consumer graphics cards sooner rather than later, with the first 16nm GeForce models expected to land later this year.