Highlights from AMD's technical briefing

In the processor market, AMD Ryzen is putting more and more pressure on Intel. According to foreign media reports, Mercury Research's processor market share report for the third quarter of 2018 shows that AMD's share of desktop x86 processors rose to 13%, up 0.8 percentage points from the previous quarter and 2.1 percentage points year on year. In notebook processors, AMD's share rose 1.5 percentage points in the third quarter to 10.9%, while in the server market it grew by 10.6%.

The AMD Zen architecture has been an unprecedented success. This year it was refined into the enhanced Zen+ version, backed by a similarly optimized 12nm process. Now the second-generation Zen 2 architecture finally arrives, together with a new 7nm process.

AMD originally planned to have GlobalFoundries manufacture its new CPUs and GPUs on 7nm, but GlobalFoundries has abandoned 7nm and all subsequent processes. Fortunately, TSMC, the world's leading foundry, remains. AMD's 7nm CPUs and GPUs have all moved there, and so far things appear to be going smoothly, with both product design and the roadmap on schedule.

According to TSMC's figures, 7nm doubles transistor density compared with the current 14/12nm, halves power consumption at the same frequency, and improves performance by more than 25% at the same power consumption.

If 12nm was ahead of Intel's 14nm mostly in name, 7nm is a genuine technological lead, and it may even be slightly ahead of Intel's still-troubled 10nm.

Zen 2 is the world's first high-performance x86 CPU built on a 7nm process. Beyond the new process, the main changes include CPU core execution enhancements, deeper security hardening, and a modular design that allows flexible configuration and reduces manufacturing difficulty.

Zen 2 achieves twice the throughput of the first generation, thanks mainly to an improved execution pipeline, doubled floating-point and load-store units, doubled core density, and half the power consumption per operation.

On the front end, Zen 2 focuses on improving branch prediction, instruction prefetching, the instruction cache, and the op cache.

On the floating-point side, Zen 2 doubles the floating-point width to 256 bits, doubles load and store bandwidth, and raises dispatch/retire bandwidth, so that all modules maintain high throughput.

On security, AMD emphasized that the new architecture is immune to the Spectre vulnerabilities at the hardware level.

The new modular design is more flexible: each module can be optimized and provisioned individually, while the I/O die helps substantially improve overall latency and power consumption.

Note, however, that the CPU dies use the 7nm process while the I/O die remains on 14nm. Most of the I/O die consists of analog circuitry that benefits little from a new process. Moving it to 7nm would not significantly improve integration, performance, or power consumption, but it would significantly raise cost, so AMD adopted this mixed-process modular combination.

In fact, Intel is also experimenting with combining different processes within a single chip, for the same purpose.

The above covers the Zen 2 architecture in theory. It will ultimately ship in EPYC and Ryzen products, where results will vary, but the new architecture and process can be expected to bring significantly higher frequencies and lower power consumption. On both desktop and server, AMD will also maintain compatibility between earlier and upcoming generations over the next few years.

AMD also confirmed for the first time that the Zen 4 architecture is already in design and will follow the 7nm+ Zen 3, which puts it at 2021 at the earliest.

The 7nm Radeon Instinct series launches with two products, the MI60 and MI50. They are aimed mainly at machine learning training and high-performance computing, and can also be used for virtualization and machine learning inference. They will ship by the end of this year, with related systems and applications available next year and next-generation products to follow in 2020.

The Radeon Instinct MI60 and MI50 are still based on the Vega GPU architecture, but they move to the new process and are also tuned for data center workloads, with changes to the compute units, video memory, PCI-E 4.0 support, and more.

The 14nm Vega 10 has 12.5 billion transistors and a die area of 484 square millimeters. The 7nm Vega rises to 13.2 billion transistors (an increase of about 5.6%), while the area shrinks to 331 square millimeters (a reduction of 31.6%), only about 40% of its competitor, the 815-square-millimeter 12nm GV100.
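
The die figures above can be cross-checked with a few lines of arithmetic. This is only a sketch using the transistor counts and areas quoted in the article. The helper names are made up for illustration, and the density comparison is a derived figure, not one AMD stated.

```python
# Figures quoted in the article; everything below is derived arithmetic.
VEGA10 = {"transistors_bn": 12.5, "area_mm2": 484.0}  # 14nm Vega 10
VEGA7 = {"transistors_bn": 13.2, "area_mm2": 331.0}   # 7nm Vega
GV100_AREA_MM2 = 815.0                                 # 12nm GV100


def pct_change(old, new):
    """Percentage change from old to new (negative means a reduction)."""
    return (new - old) / old * 100.0


def density_mtr_per_mm2(chip):
    """Transistor density in millions of transistors per square millimeter."""
    return chip["transistors_bn"] * 1000.0 / chip["area_mm2"]


transistor_growth = pct_change(VEGA10["transistors_bn"], VEGA7["transistors_bn"])
area_change = pct_change(VEGA10["area_mm2"], VEGA7["area_mm2"])
area_vs_gv100 = VEGA7["area_mm2"] / GV100_AREA_MM2 * 100.0
density_ratio = density_mtr_per_mm2(VEGA7) / density_mtr_per_mm2(VEGA10)

print(f"transistor growth:  {transistor_growth:.1f}%")
print(f"area change:        {area_change:.1f}%")
print(f"area vs GV100:      {area_vs_gv100:.1f}%")
print(f"density ratio:      {density_ratio:.2f}x")
```

From the quoted counts, the transistor increase works out to about 5.6% and the area reduction to about 31.6%. The resulting chip-level density gain is roughly 1.5 times, lower than the 2x headline figure for the process, which is plausible given that not every part of a die scales equally.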

The new core has made targeted adjustments and enhancements to the basic computing units. For example, the vector ALU unit supports 16/32/64-bit operations, and all modules support ECC.

It is also the world's first GPU with 1TB/s of memory bandwidth, paired with up to 32GB of HBM2.

Thanks to architectural optimizations and higher clock speeds (specifics undisclosed), the MI60 delivers a striking performance gain over the MI25. For example, FP16 floating-point performance is 20% higher, INT8 and INT4 integer performance are 140% and 380% higher respectively, and new instructions better suited to machine learning workloads have been added.

For plain matrix multiplication, the MI60 improves by only a little over 25%, but for a specific application such as ResNet-50 the improvement can reach 2.8 times.

TensorFlow FP32 improves by 25-50%, and with Infinity Fabric the MI60 also scales almost linearly across multiple cards. For example, four cards deliver almost four times the performance of one.

Both Vega and EPYC now support PCI-E 4.0, but platforms are not yet available, so the scaling of eight cards in parallel under PCI-E 3.0 is limited to some extent.

The 7nm Vega is the first GPU to support PCI-E 4.0, and the Rome-generation EPYC is the first CPU to support it. Working together, they reach 64GB/s of bidirectional bandwidth, and up to four cards can run in parallel.

The Infinity Fabric bus provides 200GB/s of bandwidth between graphics cards, six times that of PCI-E 3.0. Note, however, that the interconnect uses hardware bridges, which makes it easier to handle the large volume of data transferred.
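
The bandwidth figures quoted here are consistent with standard PCI-E link rates. The sketch below assumes x16 links and that the article's numbers are raw bidirectional figures. The per-lane rates (8 GT/s for PCI-E 3.0, 16 GT/s for PCI-E 4.0) and the 128b/130b line coding are standard values, but the function name and parameters are illustrative only.

```python
def pcie_bandwidth_gbs(gt_per_s, lanes=16, bidirectional=True, encoded=False):
    """Approximate PCI-E link bandwidth in GB/s.

    gt_per_s: per-lane transfer rate (8 for PCI-E 3.0, 16 for PCI-E 4.0).
    encoded:  if True, account for 128b/130b line-coding overhead.
    """
    gbit_per_s = gt_per_s * lanes
    if encoded:
        gbit_per_s *= 128 / 130
    gbyte_per_s = gbit_per_s / 8
    return gbyte_per_s * 2 if bidirectional else gbyte_per_s


INFINITY_FABRIC_GBS = 200.0  # figure quoted in the article

pcie3 = pcie_bandwidth_gbs(8)
pcie4 = pcie_bandwidth_gbs(16)
pcie4_effective = pcie_bandwidth_gbs(16, encoded=True)
fabric_vs_pcie3 = INFINITY_FABRIC_GBS / pcie3

print(f"PCI-E 3.0 x16, raw bidirectional: {pcie3:.0f} GB/s")
print(f"PCI-E 4.0 x16, raw bidirectional: {pcie4:.0f} GB/s")
print(f"PCI-E 4.0 x16, after 128b/130b:   {pcie4_effective:.1f} GB/s")
print(f"Infinity Fabric vs PCI-E 3.0:     {fabric_vs_pcie3:.2f}x")
```

Under these assumptions, PCI-E 4.0 x16 comes to 64GB/s bidirectional, and 200GB/s is 6.25 times the 32GB/s of PCI-E 3.0 x16, in line with the "six times" claim.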

Thanks to hardware virtualization support (claimed to be unique), the MI60/MI50 can also run up to eight cards in parallel, although the implementation differs slightly: the cards are interconnected over the PCI-E bus.

If you do not need that many cards, one, two, or four can be assigned to a virtual machine and run in parallel within the same system. Note that the cards must be the same model and cannot be mixed.

The MI60 is the full configuration, with 64 compute units and 4096 stream processors. Peak integer performance is 118 TOPS for INT4 and 59 TOPS for INT8; peak floating-point performance is 29.5 TFLOPS for FP16, 14.7 TFLOPS for FP32, and 7.4 TFLOPS for FP64. Its features include full-chip ECC error correction, RAS, PCI-E 4.0, and dual-link Infinity Fabric. The video memory is 32GB of HBM2 on a 4096-bit interface with 1TB/s of bandwidth, and the thermal design power is 300W.
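
The peak numbers above hang together under a simple model. The sketch below is a rough consistency check, not AMD's published method. It assumes each stream processor performs one fused multiply-add (counted as two operations) per clock, that the other precisions run at fixed multiples of the FP32 rate, and that HBM2 runs at 2 Gbit/s per pin. None of these assumptions is stated in the article, and the clock speed in particular is undisclosed.

```python
# Figures quoted in the article for the MI60.
STREAM_PROCESSORS = 4096
MEMORY_BUS_BITS = 4096
QUOTED_TERA_OPS = {"FP64": 7.4, "FP32": 14.7, "FP16": 29.5, "INT8": 59.0, "INT4": 118.0}

# Assumptions for illustration, not from the article.
OPS_PER_SP_PER_CLOCK = 2   # one fused multiply-add counted as two operations
RATE_VS_FP32 = {"FP64": 0.5, "FP16": 2.0, "INT8": 4.0, "INT4": 8.0}
HBM2_GBIT_PER_PIN = 2.0    # assumed per-pin data rate


def implied_clock_ghz(tera_ops, stream_processors, ops_per_clock=OPS_PER_SP_PER_CLOCK):
    """Clock speed that would produce the quoted peak throughput."""
    return tera_ops * 1e12 / (stream_processors * ops_per_clock) / 1e9


def memory_bandwidth_gbs(bus_bits, gbit_per_pin):
    """Peak memory bandwidth in GB/s for a given bus width and pin rate."""
    return bus_bits * gbit_per_pin / 8.0


clock_ghz = implied_clock_ghz(QUOTED_TERA_OPS["FP32"], STREAM_PROCESSORS)
derived = {name: QUOTED_TERA_OPS["FP32"] * rate for name, rate in RATE_VS_FP32.items()}
bandwidth_gbs = memory_bandwidth_gbs(MEMORY_BUS_BITS, HBM2_GBIT_PER_PIN)

print(f"implied clock: {clock_ghz:.2f} GHz")
for name, value in derived.items():
    print(f"{name}: derived {value:.1f} vs quoted {QUOTED_TERA_OPS[name]}")
print(f"memory bandwidth: {bandwidth_gbs:.0f} GB/s")
```

With these assumptions the FP32 figure implies a clock of roughly 1.8GHz, the other precisions land within 1% of the quoted values, and the memory bandwidth comes to 1024GB/s, matching the 1TB/s claim.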

The MI50 is cut down to 60 compute units and 3840 stream processors, with performance about 9.5% lower and video memory capacity halved to 16GB. Everything else is the same as the MI60.
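
The gap between the unit count and the performance drop can be read as a hint about clock speed. The sketch below assumes peak performance scales with compute units times clock, which is a simplification, and the resulting clock ratio is an inference rather than a figure from the article.

```python
# Figures quoted in the article.
MI60_CUS, MI60_SPS = 64, 4096
MI50_CUS, MI50_SPS = 60, 3840
QUOTED_PERF_DROP = 0.095  # "about 9.5%"

# Both cards have the same number of stream processors per compute unit.
sps_per_cu_mi60 = MI60_SPS // MI60_CUS
sps_per_cu_mi50 = MI50_SPS // MI50_CUS

# Assumption: peak performance is proportional to (compute units x clock).
unit_ratio = MI50_CUS / MI60_CUS
perf_ratio = 1.0 - QUOTED_PERF_DROP
implied_clock_ratio = perf_ratio / unit_ratio

print(f"stream processors per CU: {sps_per_cu_mi60} and {sps_per_cu_mi50}")
print(f"unit ratio:               {unit_ratio:.4f}")
print(f"implied clock ratio:      {implied_clock_ratio:.3f}")
```

Removing four of 64 compute units accounts for only a 6.25% drop, so under this model the remaining difference would correspond to an MI50 clock roughly 3.5% lower than the MI60's.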

With its rival pressing this hard, Intel cannot sit still. According to AnandTech, Intel has announced that it will hold a "forward-looking" architecture briefing on December 11.

The event is highly confidential: the agenda and content are still unknown, and only a few media outlets and analysts may attend. For some reason, Intel has become more reticent about architectural details, and it even canceled the Intel Developer Forum.

Intel has plenty of topics to cover, such as CPUs, GPUs, AI, FPGAs, PCIe 4.0/DDR5, security vulnerabilities, the 10nm family, and Moore's Law, and perhaps even changes to its product line following the arrival of industry veterans such as Jim Keller and Raja Koduri.

In addition, there have been rumors that the Core architecture is nearing its end. Is Intel about to make a bigger move this time?
