RFHPC168: Running Down the Latest TOP500 List

From left, Henry Newman, Dan Olds, Shahin Khan, and Rich Brueckner are the Radio Free HPC team

In this podcast, the Radio Free HPC team reviews the latest TOP500 list in front of a live audience in Denver at SC17.

The fiftieth TOP500 list of the fastest supercomputers in the world has China overtaking the US in the total number of ranked systems by a margin of 202 to 144. It is the largest number of supercomputers China has ever claimed on the TOP500 ranking, with the US presence shrinking to its lowest level since the list’s inception 25 years ago.

Just six months ago, the US led with 169 systems, with China coming in at 160. Despite the reversal of fortunes, the 144 systems claimed by the US still earn it a solid second-place finish, with Japan in third place with 35, followed by Germany with 20, France with 18, and the UK with 15.

China has also overtaken the US in aggregate performance. The Asian superpower now claims 35.3 percent of the TOP500 flops, with the US in second place with 29.8 percent.

The top 10 systems remain largely unchanged since the June 2017 list, with a couple of notable exceptions.

Sunway TaihuLight, a system developed by China’s National Research Center of Parallel Computer Engineering & Technology (NRCPC), and installed at the National Supercomputing Center in Wuxi, maintains its number one ranking for the fourth time, with a High Performance Linpack (HPL) mark of 93.01 petaflops.

Tianhe-2 (Milky Way-2), a system developed by China’s National University of Defense Technology (NUDT) and deployed at the National Supercomputer Center in Guangzhou, China, is still the number two system at 33.86 petaflops.

Piz Daint, a Cray XC50 system installed at the Swiss National Supercomputing Centre (CSCS) in Lugano, Switzerland, maintains its number three position with 19.59 petaflops, reaffirming its status as the most powerful supercomputer in Europe. Piz Daint was upgraded last year with NVIDIA Tesla P100 GPUs, which more than doubled its previous HPL performance of 9.77 petaflops.

The new number four system is the upgraded Gyoukou supercomputer, a ZettaScaler-2.2 system deployed at Japan’s Agency for Marine-Earth Science and Technology, which was the home of the Earth Simulator. Gyoukou achieved an HPL result of 19.14 petaflops using PEZY-SC2 accelerators alongside conventional Intel Xeon processors. The system’s 19,860,000 cores represent the highest level of concurrency ever recorded on the TOP500 rankings of supercomputers.

Titan, a five-year-old Cray XK7 system installed at the Department of Energy’s (DOE) Oak Ridge National Laboratory, and still the largest system in the US, slips down to number five. Its 17.59 petaflops are mainly the result of its NVIDIA K20x GPU accelerators.

Sequoia, an IBM BlueGene/Q system installed at DOE’s Lawrence Livermore National Laboratory, is the number six system on the list with a mark of 17.17 petaflops. It was deployed in 2011.

The new number seven system is Trinity, a Cray XC40 supercomputer operated by Los Alamos National Laboratory and Sandia National Laboratories. It was recently upgraded with Intel “Knights Landing” Xeon Phi processors, which propelled it from 8.10 petaflops six months ago to its current high-water mark of 14.14 petaflops.

Cori, a Cray XC40 supercomputer, installed at the National Energy Research Scientific Computing Center (NERSC), is now the eighth fastest supercomputer in the world. Its 1,630 Intel Xeon “Haswell” processor nodes and 9,300 Intel Xeon Phi 7250 nodes yielded an HPL result of 14.01 petaflops.

At 13.55 petaflops, Oakforest-PACS, a Fujitsu PRIMERGY CX1640 M1 installed at the Joint Center for Advanced High Performance Computing in Japan, is the number nine system. It too is powered by Intel “Knights Landing” Xeon Phi processors.

Fujitsu’s K computer, installed at the RIKEN Advanced Institute for Computational Science (AICS) in Kobe, Japan, is now the number 10 system at 10.51 petaflops. Its performance is derived from its 88 thousand SPARC64 processors linked by Fujitsu’s Tofu interconnect. Despite its tenth-place showing on HPL, the K Computer is the top-ranked system on the High-Performance Conjugate Gradients (HPCG) benchmark.

For the first time, each of the top 10 supercomputers delivered more than 10 petaflops on HPL. There are also 181 systems with performance greater than a petaflop – up from 138 on the June 2017 list. Taking a broader look, the combined performance of all 500 systems has grown to 845 petaflops, compared to 749 petaflops six months ago and 672 petaflops one year ago. Even though aggregate performance grew by nearly 100 petaflops, the relative increase is well below the list’s long-term historical trend.
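
To put those aggregates in perspective, the quoted totals work out to roughly 13 percent growth over six months and about 26 percent over the year. Below is a minimal sketch of the arithmetic, using only the numbers cited above:

```python
# Relative growth of aggregate TOP500 performance, using only the
# petaflop totals quoted above (Nov 2017, Jun 2017, Nov 2016).
current, six_months_ago, one_year_ago = 845.0, 749.0, 672.0

half_year_growth = (current - six_months_ago) / six_months_ago * 100
full_year_growth = (current - one_year_ago) / one_year_ago * 100

print(f"Six-month growth: {half_year_growth:.1f}%")   # ~12.8%
print(f"One-year growth:  {full_year_growth:.1f}%")   # ~25.7%
```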

A further reflection of this slowdown is the list turnover. The entry point in the latest rankings moved up to 548 teraflops, compared to 432 teraflops in June. The 548-teraflop system was in position 370 in the previous TOP500 list. The turnover is in line with what has been observed over the last four years, but is much lower than previous levels.

A total of 101 systems employ accelerator/coprocessor technology, compared to 91 six months ago. Of these, 86 use NVIDIA GPUs, 12 use Intel Xeon Phi coprocessors, and five use PEZY Computing accelerators. Two systems use a combination of NVIDIA GPUs and Intel Xeon Phi coprocessors. An additional 17 systems now use Xeon Phi chips as the main processing unit.

Green500 Highlights

Turning to the new Green500 rankings, the top three positions are taken by newly installed systems in Japan, all of which are based on the ZettaScaler-2.2 architecture and the PEZY-SC2 accelerator. The SC2 is a second-generation 2048-core chip that provides a peak performance of 8.192 teraflops in single-precision.

The most efficient of these ZettaScaler supercomputers is the Shoubu system B installed at RIKEN’s Advanced Center for Computing and Communication. It achieved a power efficiency of 17.0 gigaflops/watt.
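
For readers unfamiliar with the metric, Green500 efficiency is simply the measured HPL performance divided by the power drawn during the run. Below is a minimal sketch; the Rmax and power figures are hypothetical, chosen only to land in the same range as the systems discussed here, and are not the actual Shoubu system B measurements.

```python
def gflops_per_watt(hpl_rmax_teraflops, power_kilowatts):
    """Green500-style efficiency: HPL Rmax divided by power drawn during the run."""
    gigaflops = hpl_rmax_teraflops * 1000.0   # teraflops -> gigaflops
    watts = power_kilowatts * 1000.0          # kilowatts -> watts
    return gigaflops / watts

# Hypothetical example: an 850-teraflop HPL run drawing 50 kW of power
# works out to 17.0 gigaflops/watt, in the range of the systems above.
print(f"{gflops_per_watt(850.0, 50.0):.1f} gigaflops/watt")
```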

The number two Green500 system is the Suiren2 cluster at the High Energy Accelerator Research Organization/KEK, which managed to reach 16.8 gigaflops/watt.

The number three Green500 slot was captured by PEZY Computing’s own Sakura system, which achieved 16.7 gigaflops/watt. All of these top three systems sit in the bottom half of the TOP500 rankings: Shoubu system B at position 258, Suiren2 at 306, and Sakura at 275.

The fourth greenest supercomputer is a DGX SaturnV Volta system, which is installed at NVIDIA headquarters in San Jose, California. It achieved 15.1 gigaflops/watt, and comes in at number 149 on the TOP500 list. The number five system is Gyoukou, yet another ZettaScaler-2.2 machine. It achieved an efficiency of 14.2 gigaflops/watt and it currently ranks as the fourth most powerful supercomputer in the world.

Vendor trends

A total of 471 systems, representing 94.2 percent of the total, are now using Intel processors, which is slightly up from 92.8 percent six months ago. The share of IBM Power processors is at 14 systems, down from 21 systems in June.

The number of systems using Gigabit Ethernet is unchanged at 228, in large part thanks to the 204 systems now using 10G Ethernet. InfiniBand technology is found in 163 systems, down from 178 on the previous list, and remains the second most-used system interconnect technology on the list. Intel Omni-Path technology is now in 35 systems, down from 38 six months ago.

HPE has the lead in the number of installed supercomputers at 123, which represents nearly a quarter of all TOP500 systems. This includes several systems originally installed by SGI, which is now owned by HPE. HPE accounted for 144 systems six months ago.

Lenovo follows HPE with 81 systems, down from 88 on the June list. Inspur rose further in the ranks and now has 56 systems, up from only 20 six months ago. Cray now has 53 systems, down from 57 six months ago. Sugon features 51 systems on the list, up from 44 in June. IBM follows with only 19 systems remaining under its label. These are mostly BlueGene/Q supercomputers, reflecting an aging install base; the average age of IBM systems on the list is now five years.

Cray continues to be the clear performance leader, claiming 19.5 percent of the list’s aggregate performance. HPE is second with 15.2 percent of the TOP500 flops. Thanks to the number one Sunway TaihuLight system, NRCPC retains the third spot with 11.1 percent of the total performance. Lenovo is fourth with 9.1 percent of performance, followed by Inspur at 6.3 percent, IBM at 6.1 percent and Sugon at 5.2 percent. All top vendors, with the exception of Inspur and Sugon, lost performance share compared to six months ago.

HPCG Results

The TOP500 project now incorporates High-Performance Conjugate Gradients (HPCG) benchmark results into the list to provide a more balanced look at system performance. The benchmark exercises sparse matrix multiplication, global collectives, and vector updates, which more closely represent the mix of computation and data access patterns found in many supercomputing codes.
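
To give a flavor of the work HPCG measures, as opposed to the dense factorization HPL times, here is a minimal, unpreconditioned conjugate gradient sketch in Python. It is not the HPCG reference code, which adds a multigrid preconditioner, halo exchanges, and MPI-wide collectives, but it exercises the same basic pattern of sparse matrix-vector products, dot-product reductions, and vector updates.

```python
import numpy as np
from scipy.sparse import diags

def conjugate_gradient(A, b, tol=1e-8, max_iter=5000):
    """Unpreconditioned CG: dominated by sparse matrix-vector products,
    dot-product reductions, and AXPY-style vector updates."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p                    # sparse matrix-vector product
        alpha = rs_old / (p @ Ap)
        x += alpha * p                # vector update
        r -= alpha * Ap
        rs_new = r @ r                # reduction (a global collective in MPI runs)
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# Toy 1-D Poisson system (symmetric positive definite) as a test problem.
n = 500
A = diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)
x = conjugate_gradient(A, b)
print("residual norm:", np.linalg.norm(b - A @ x))
```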

As previously mentioned, the fastest system using the HPCG benchmark remains Fujitsu’s K computer, which is ranked number 10 in the overall TOP500 rankings. It achieved 602.7 teraflops on HPCG, followed closely by Tianhe-2 with a score of 580.0 teraflops. The upgraded Trinity supercomputer comes in at number three at 546.1 teraflops, followed by Piz Daint at number four with 486.4 teraflops, and Sunway TaihuLight at number five at 480.8 teraflops.

The International Space Station computer, built by HPE, is now listed in the HPCG results, making it the “highest” computer on the list.

About the TOP500 List

The first version of what became today’s TOP500 list started as an exercise for a small conference in Germany in June 1993. Out of curiosity, the authors decided to revisit the list in November 1993 to see how things had changed. About that time they realized they might be onto something and decided to continue compiling the list, which is now a much-anticipated, much-watched and much-debated twice-yearly event.

See our complete coverage of SC17

Download the MP3 * Subscribe on iTunes * RSS Feed

RFHPC146: Day-by-Day Preview of ISC 2017 in Frankfurt

In this podcast, Rich gives us a day-by-day preview of the upcoming ISC 2017 conference. The event takes place June 18-22 in Frankfurt, Germany.

“ISC High Performance focuses on HPC technological development and its application in scientific fields, as well as its adoption in commercial environments. In 2017 we offer you 13 fascinating HPC topics grouped under three categories: systems, applications, and emerging topics. All topics will be addressed in different power-packed sessions. The ISC tutorials, workshops and the exhibition will complement these sessions.”

Friday, June 16

  • HP-CAST is the high performance user group meeting for HPE customers and partners. Over 300 attendees are expected for the two-day meeting, which you can read about here on insideHPC.

Saturday, June 17

  • The Student Cluster Competition teams begin their system buildout at the Frankfurt Messe. An exhibitor pass is required for entry to the hall.
  • HP-CAST Day 2 has a focus on partner sessions.

Sunday, June 18

Monday, June 19

Tuesday, June 20

  • Exhibits open 10:00am – 6:00pm
  • DDN User Group, 12:30pm – 5:00pm at the Movenpick Hotel

Wednesday, June 21

  • Exhibits open 10:00am – 6:00pm
  • Student Cluster Awards Ceremony

Thursday, June 22

  • Workshops all day at the Marriott
  • Women in HPC Workshop 9:00am – 1:00pm
  • Dell HPC Community meeting, 8:00am – 3:00pm. The Dell EMC HPC Community event will feature keynote presentations by HPC experts and a networking breakfast to discuss best practices in the use of Dell EMC HPC Systems.
  • Rich heads out for a motorcycle tour of the Alps

Monday, June 26-28

After that, we do our Catch of the Week:

  • Dan interviews Carolyn Posti from Redline on the topic of HPC Benchmarking
  • Rich buys an Antminer S7 for mining Bitcoins

Download the MP3 * Subscribe on iTunes * RSS Feed

RFHPC144: Henry’s Trip to Best Buy

In this podcast, Henry goes on a shopping spree at Best Buy. His mom was moving into a new place, so he got her all-new electronics to the tune of $1624. Is Henry a good son or did he go cheap? We’ll find out.

After that, we do our Catch of the Week:

Rich notes that recent reports about the Aurora supercomputer were incorrect. Rick Borchelt from DOE: “On the record, Aurora contract is not cancelled.”

Shahin has been trying to keep up with the boom in cryptocurrency, which now has a market cap of something like $91 billion.

Dan is excited that Hitachi has stopped building its own mainframes but will supply IBM z Systems loaded with Hitachi VOS3 operating system software.

Download the MP3 * Subscribe on iTunes * RSS Feed

RFHPC142: A Look at the New Nvidia Volta GPUs

In this podcast, the Radio Free HPC team looks at Volta, Nvidia’s new GPU architecture that delivers up to 5x the performance of its predecessor.

At the GPU Technology Conference, Nvidia CEO Jen-Hsun Huang introduced a lineup of new Volta-based AI supercomputers, including a powerful new version of the DGX-1 deep learning appliance; announced the Isaac robot-training simulator; unveiled the NVIDIA GPU Cloud platform, giving developers access to the latest, optimized deep learning frameworks; and announced a partnership with Toyota to help build a new generation of autonomous vehicles.

Built with 21 billion transistors, the newly announced Volta V100 “delivers deep learning performance equal to 100 CPUs.” Representing an investment by NVIDIA of more than $3 billion, the processor is built “at the limits of photolithography,” Huang told the crowd.

Download the MP3 * Subscribe on iTunes * RSS Feed

RFHPC140: Catching up with the Exascale Computing Project

In this podcast, the Radio Free HPC team looks at a recent update on the Exascale Computing Project by Paul Messina.

“The Exascale Computing Project (ECP) was established with the goals of maximizing the benefits of HPC for the United States and accelerating the development of a capable exascale computing ecosystem. The ECP is a collaborative effort of two U.S. Department of Energy organizations – the Office of Science (DOE-SC) and the National Nuclear Security Administration (NNSA).”

Recent milestones include:

  • PathForward will soon announce six awards to vendors to develop new technologies that will be instrumental in Exascale system development.
  • The ECP Industry Council met for the first time recently with C-level executives from industry to lay out application requirements for exascale systems. The end goal is to improve industrial competitiveness in the United States.

After that, we do our Catch of the Week:

  • Shahin is impressed by the new Wolfram Data Repository, a public resource that hosts an expanding collection of computable datasets, curated and structured to be suitable for immediate use in computation, visualization, analysis and more. Built on the Wolfram Data Framework and the Wolfram Language, the repository provides a uniform system for storing data and making it immediately computable and useful. With datasets of many types and from many sources, it is designed to be a global resource for public data and data-backed publication.
  • Henry reminds us to always tug on the front panel of an ATM before using it. “Once you understand how easy and common it is for thieves to attach “skimming” devices to ATMs and other machines that accept debit and credit cards, it’s difficult not to closely inspect and even tug on the machines before using them.”

Download the MP3 * Subscribe on iTunes * RSS Feed

RFHPC129: Cray Looks to ARM HPC

In this podcast, the Radio Free HPC team looks at two hot stories from last week:

Cray to Develop ARM-based Isambard Supercomputer for UK Met Office. The GW4 Alliance, together with Cray and the UK Met Office, has been awarded £3m by EPSRC to deliver a new Tier 2 HPC service for UK-based scientists. This unique new service, named ‘Isambard’ after the renowned Victorian engineer Isambard Kingdom Brunel, will provide multiple advanced architectures within the same system in order to enable evaluation and comparison across a diverse range of hardware platforms.

Steve Pawlowski’s presentation from the Persistent Memory Summit: “As data proliferation continues to explode, computing architectures are struggling to get the right data to the processor efficiently, both in terms of time and power. But what if the best solution to the problem is not faster data movement, but new architectures that can essentially move the processing instructions into the data? Persistent memory arrays present just such an opportunity. Like any significant change, however, there are challenges and obstacles that must be overcome. Industry veteran Steve Pawlowski will outline a vision for the future of computing and why persistent memory systems have the potential to be more revolutionary than perhaps anyone imagines.”

Download the MP3 * Subscribe on iTunes * RSS Feed

 

RFHPC128: Quantum Software Goes Open Source

In this podcast, the Radio Free HPC team looks at D-Wave’s new open source software for quantum computing. The software is available on GitHub along with a whitepaper written by Cray Research alums Mike Booth and Steve Reinhardt.

D-Wave Systems released the open-source, quantum software tool as part of its strategic initiative to build and foster a quantum software development ecosystem. The new tool, qbsolv, enables developers to build higher-level tools and applications leveraging the quantum computing power of systems provided by D-Wave, without the need to understand the complex physics of quantum computers.

“Just as a software ecosystem helped to create the immense computing industry that exists today, building a quantum computing industry will require software accessible to the developer community,” said Bo Ewald, president, D-Wave International Inc. “D-Wave is building a set of software tools that will allow developers to use their subject-matter expertise to build tools and applications that are relevant to their business or mission. By making our tools open source, we expand the community of people working to solve meaningful problems using quantum computers.”
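
For context, qbsolv operates on quadratic unconstrained binary optimization (QUBO) problems, decomposing large instances into pieces small enough to hand to D-Wave hardware or a classical solver. The sketch below is a toy brute-force QUBO minimizer intended only to illustrate the problem class, not the qbsolv API; its coefficients are made up for the example.

```python
import itertools
import numpy as np

def solve_qubo_brute_force(Q):
    """Minimize x^T Q x over binary vectors x -- the problem class qbsolv
    decomposes and hands off to quantum hardware or a classical solver."""
    n = Q.shape[0]
    best_x, best_energy = None, float("inf")
    for bits in itertools.product([0, 1], repeat=n):
        x = np.array(bits)
        energy = x @ Q @ x
        if energy < best_energy:
            best_x, best_energy = x, energy
    return best_x, best_energy

# Toy 4-variable QUBO with made-up coefficients (diagonal entries act as
# linear terms, off-diagonal entries as pairwise couplings).
Q = np.array([[-1.0,  2.0,  0.0,  0.0],
              [ 0.0, -1.0,  2.0,  0.0],
              [ 0.0,  0.0, -1.0,  2.0],
              [ 0.0,  0.0,  0.0, -1.0]])
x, energy = solve_qubo_brute_force(Q)
print("best assignment:", x, "energy:", energy)
```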

After that, we do the Catch of the Week:

  • Shahin points us to the story about the miniaturization of accelerometers that could help with motion sickness and thus save lives.
  • Hater Dan shares the story that users are suing Apple, claiming the App Store is a monopoly.
  • Rich notes that Cray has announced the appointment of Stathis Papaefstathiou to the position of senior vice president of research and development. He fills the slot vacated by Peg Williams, who will retire.

Download the MP3 * Subscribe on iTunes * RSS Feed

 

RFHPC127: Technologies We’re Looking Forward to in 2017

In this podcast, the Radio Free HPC team shares the things we’re looking forward to in 2017.

  • Shahin is looking forward to the iPhone 8. Henry and Dan will stick with Android. Shahin is also actively watching for much needed advancements in IoT security.
  • Henry is looking forward to storage innovations and camera technologies in the fight against crime. He also heralds the return of specialized processing devices for specific application workloads.
  • Dan thinks the continuing technology wars between CPUs and GPUs, and between Omni-Path and InfiniBand, are great theater.
  • Rich is looking forward to traveling to a great set of conferences in the first half of the year. He has just updated the insideHPC Events Calendar with the lion’s share of major HPC events for 2017.

After that, we each share our Catch of the Week:

Download the MP3 * Subscribe on iTunes * RSS Feed

RFHPC123: SC16 Student Cluster Competition & Results

In this podcast, the Radio Free HPC team reviews the results from SC16 Student Cluster Competition. 

This year, the advent of clusters with the new Nvidia Tesla P100 GPUs made a huge impact, nearly tripling the Linpack record for the competition.

The Student Cluster Competition returned for its 10th year at SC16. The competition, which debuted at SC07 in Reno and has since been replicated in Europe, Asia, and Africa, is a real-time, non-stop, 48-hour challenge in which teams of six undergraduates assemble a small cluster at SC16 and race to complete a real-world workload across a series of scientific applications, demonstrate knowledge of system architecture and application performance, and impress HPC industry judges. The students partner with vendors to design and build a cutting-edge cluster from commercially available components, not to exceed a 3120-watt power limit, and work with application experts to tune and run the competition codes.

For the first time ever, the team that won top honors also won the award for achieving highest performance for the Linpack benchmark application. The team “SwanGeese” is from the University of Science and Technology of China. In traditional Chinese culture, the rare Swan Goose stands for teamwork, perseverance and bravery. This is the university’s third appearance in the competition.

Also, an ACM SIGHPC Certificate of Appreciation was presented to the authors of a recent SC paper selected for the SC16 Student Cluster Competition Reproducibility Initiative. The selected paper was “A Parallel Connectivity Algorithm for de Bruijn Graphs in Metagenomic Applications” by Patrick Flick, Chirag Jain, Tony Pan and Srinivas Aluru from Georgia Institute of Technology.

After that, we go round-robin for our Catch of the Week:

Download the MP3 * Subscribe on iTunes * RSS Feed

See our complete coverage of SC16, which took place Nov. 13-18 in Salt Lake City.