Huawei’s new chip is the most complex CPU ever built

Huawei Kunpeng

Earlier this month, Huawei announced a new server processor, the Kunpeng 920, at CES. The CPU turned heads in the industry, as it is the first silicon design from the Chinese behemoth to pair impressive specs with strong benchmark numbers.

TechRadar Pro managed to get an exclusive interview with Mr Ai Wei, Fellow, Chipsets and Hardware Technology Strategy at Huawei, to discuss this new ARM-based processor in more detail, and some interesting titbits emerged concerning performance and transistor counts.

Mr Ai Wei, Fellow, Chipsets and Hardware Technology Strategy at Huawei

TechRadar Pro (TRP): Are the new processors (Kunpeng 920) going to be used by Huawei only in its servers or will they be available to other vendors (and potential Huawei competitors)?

Ai Wei: Our new CPU Kunpeng 920 will only be used on Huawei’s servers and other Huawei equipment, delivering value for our customers through equipment and cloud services. We will not be selling these chips directly.

TRP: Huawei servers were until now almost all x86-powered. What made you choose to go for an ARM architecture? How will both architectures sit within your product portfolio?

AW: The x86-based product market is still growing. At the same time, the service and data demands across multiple scenarios are driving the diversity of computing, and presenting new opportunities to the ARM industry. Huawei actively works with global partners to provide competitive products and solutions to customers.

TRP: How does your performance per core per MHz compare to the competition (Intel Xeon Gold, Cavium ThunderX2, Ampere eMAG)?

AW: Kunpeng 920 was independently designed by Huawei based on an ARMv8 architecture license. Kunpeng 920 significantly improves single-core performance by optimizing branch prediction algorithms, increasing the number of execution units, and adopting out-of-order execution. The CPU’s SPECint2006 score exceeded 10/GHz per core.

(Ed: In comparison, an HP 2P AMD EPYC 7601 system scored 14/GHz per core).

TRP: Why do you think Huawei will succeed with Kunpeng 920 where others have failed and lost hundreds of millions of dollars in the process?

AW: We are entering an intelligent society where all things are connected, sensing, and intelligent. In view of the industry trends and application requirements, a new era of diversified computing is unfolding. Multiple data types and scenarios are driving computing architecture optimization. Combining multiple computing architectures for optimal performance is a must. We remain customer-centric and offer multiple paths for our customers to address their diverse needs.

With the development of smartphones, edge computing, and the Internet of Things (IoT), and as data diversity drives more diversity in computing, the ARM industry will see many new opportunities for development. The ARM architecture is highly energy-efficient and can address new requirements from specific application scenarios. Technological improvement will help ARM deliver higher performance for data centers. According to estimates by ARM, 100 billion ARM-based CPUs will be shipped between 2017 and 2020, representing a huge market.

TRP: The Kunpeng 920 lands at a time of great change in the land of processors, with FPGAs and accelerators playing an increasingly important role in tackling data center loads. How do you see the Kunpeng family evolving over the next few years?

AW: The explosion and diversity of data present new opportunities and challenges. We will continue to innovate on the Kunpeng series CPUs, and provide higher-speed I/O with increased computing power. We will also constantly work with our industry partners to provide better solutions for our customers.

TRP: Are the cores standard ARM A76 cores, or ARM’s Ares platform, or something else? What is the expected die size at 7nm (roughly)? Any details regarding the transistor count?

AW: These cores are unrelated to ARM’s standard A76 or Ares cores; they were developed fully independently by Huawei. In terms of transistors, the chip integrates more than 20 billion. Information regarding die size is confidential.

TRP: You showed a SPECint score of 930. Is that the SPECint2006 rate? What was the SPECfp result?

AW: Yes, that number is the SPECint2006 rate score. The SPECfp score is over 800 with 64 cores at 2.6GHz.
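
(Ed: For readers who want to put the rate scores on the same per-core, per-GHz footing used earlier in the interview, here is a minimal sketch of the arithmetic. The helper name `per_core_per_ghz` is ours, and we assume the SPECint figure of 930 was measured on the same 64-core, 2.6GHz configuration quoted for SPECfp, which Huawei did not confirm in this interview. Because the SPECfp score was given as "over 800", that result is a lower bound.)

```python
def per_core_per_ghz(score: float, cores: int, ghz: float) -> float:
    """Divide an aggregate benchmark score by core count and clock speed."""
    if cores <= 0 or ghz <= 0:
        raise ValueError("cores and ghz must be positive")
    return score / (cores * ghz)


# Configuration quoted in the interview for the SPECfp result.
CORES = 64
CLOCK_GHZ = 2.6

# "Over 800" in the interview, so treat this as a lower bound.
specfp_norm = per_core_per_ghz(800, CORES, CLOCK_GHZ)

# ASSUMPTION: the 930 SPECint rate score used the same 64-core, 2.6GHz setup.
specint_norm = per_core_per_ghz(930, CORES, CLOCK_GHZ)

print(f"SPECfp rate:  {specfp_norm:.2f} per core per GHz")
print(f"SPECint rate: {specint_norm:.2f} per core per GHz")
```

Note that rate scores measure throughput across all cores running at once, so these normalized figures are not directly comparable with a single-core result such as the 10/GHz number quoted above.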