ExaScale is a 4-way Competition


In this post-ISC show, the RadioFree team discusses

  • Magical cooling technology from Europe. Dan describes "magic beads" that draw heat away and can carry on doing so pretty much forever, a technology from the venerable Fraunhofer Institute that was showcased by Lenovo.
  • How the pursuit of ExaScale computing is turning into a heated competition among the US, China, Japan, and Europe. The European effort is targeting 2 pre-exa installations in the coming months and 2 actual ExaScale installations in the 2022-2023 timeframe, at least one of which will be based on European technology. This presumably refers to the European Processor Initiative.
    The software ecosystems are an important consideration: how they evolve, and whether or not they converge, will be a big issue.
  • Another heated competition at the ISC Student Cluster Competition with the team from South Africa claiming the top spot. Dan has developed an efficiency metric that he will unveil in a future episode. This could separate the prowess of the team from that of the system!

Catch of the Week

Henry:

Henry points out the challenge for customers when the company that breached their data goes out of business.

Collections Firm Behind LabCorp, Quest Breaches Files for Bankruptcy

A medical billing firm responsible for a recent eight-month data breach that exposed the personal information on nearly 20 million Americans has filed for bankruptcy, citing “enormous expenses” from notifying affected consumers and the loss of its four largest customers.

Shahin:

Shahin highlights a paper on the beginnings of the programming language APL. A cool historical account.

The Socio-Technical Beginnings of APL, by Eugene McDonnell

This paper gives some of the history of implementations of APL, and concentrates on the system aspects of these implementations, paying special attention to the evolution of the workspace concept, the time-sharing scheduling strategy, and the handling of the terminal. It contrasts the development of APL with the development of other time-sharing systems which were being built at the same time.

Dan:

Dan relays the sad story of the multi-year demise of the honor bar at the hotel near ISC.

Listen in to hear the full conversation.

Download the MP3 * Subscribe on iTunes * RSS Feed

Sign up for our insideHPC Newsletter

TOP500 Jun2019, Facebook Coin

The new TOP500 list of most powerful supercomputers is out and we do our usual quick analysis. Not much changed in the TOP10 but a lot is changing further down the list. Here is a quick take:

  • There are 65 new entries in 2019.
  • US science is receiving support via DOE sites and academic sites like TACC.
  • 26 countries are represented. China continues to widen its lead, now with 219 entries, followed by the US with 116, Japan with 29, France with 19, the UK with 18, Germany with 14, Ireland and the Netherlands with 13 each, and Singapore with 10.
  • Vendors substantially reflect the country standings. Lenovo has 175 entries, Inspur 71, and Sugon 63, all based in China. Cray has 42 and HPE has 40 (which will combine when their deal closes), followed by Bull at 21, Dell at 17, and IBM at 16.
  • There are a lot of “accidental supercomputers” on the list. These are systems that probably are not doing much science or AI work, but they could; the vendors counted them, and listing them seems to be within the rules. It’s controversial but not a new practice.
  • There are several systems listed as “Internet” companies. Hard to tell what that means but it points to the existence of very large clusters in the cloud for whatever purpose. Last year, there was one system listed as Amazon EC2, which remains on the list. This time, there is also one at Facebook. Usually the big social/cloud players don’t care to participate, though they obviously could summon the resources to run the benchmarks.
  • Just over half of the systems use Ethernet as a fabric. A quarter use InfiniBand, nearly 50 use Intel’s OmniPath, and the rest, 55, use custom interconnects like the ones Cray provides. The team talks about whether Cray+HPE will enter the interconnect business for real; if so, they will be formidable.
  • The majority of entries, 367, do not have any accelerators. 125 use Nvidia GPUs.
  • The overwhelming majority of the systems, 478 of them, are based on Intel CPUs. 13 are IBM, and there is 1 system based on Arm provided by Cavium, now part of Marvell.
  • So when it comes to chips, it’s an Intel game with a respectable showing by Nvidia when GPUs are used. Alternatives are bound to appear as the dozens of AI chips in the works become available and as Arm, AMD, and IBM build on their current offerings. The recently announced system at Oak Ridge will be all AMD, and that will point to an alternative as well.
  • Notably, Intel is listed as the vendor for 2 entries and Nvidia is listed for 4. While Intel has largely stayed away from looking like a system vendor, Nvidia is going for it with its usual alacrity. That, and the pending acquisition of Mellanox by Nvidia, should serve as a warning to all system vendors who might feel stuck between treating Nvidia as an important supplier and as an up-and-coming competitor.
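
To put the counts above in proportion, here is a minimal Python sketch that converts them into shares of the list. It only uses numbers quoted in this post and assumes a list size of 500; the dictionary and function names are ours, for illustration.

```python
# Counts quoted in the list above; nothing here comes from the TOP500 data files.
TOTAL = 500  # assumption: the TOP500 list has 500 entries

countries = {
    "China": 219, "US": 116, "Japan": 29, "France": 19, "UK": 18,
    "Germany": 14, "Ireland": 13, "Netherlands": 13, "Singapore": 10,
}
vendors = {
    "Lenovo": 175, "Inspur": 71, "Sugon": 63, "Cray": 42,
    "HPE": 40, "Bull": 21, "Dell": 17, "IBM": 16,
}


def share(count, total=TOTAL):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


china_share = share(countries["China"])
us_share = share(countries["US"])
# Lenovo, Inspur, and Sugon are the three China-based vendors named above.
chinese_vendor_share = share(vendors["Lenovo"] + vendors["Inspur"] + vendors["Sugon"])
# Cray and HPE combined, as they will be once their deal closes.
cray_hpe_share = share(vendors["Cray"] + vendors["HPE"])
intel_cpu_share = share(478)   # systems based on Intel CPUs
no_accel_share = share(367)    # systems without accelerators
# Entries from the countries not named individually in the post.
other_country_entries = TOTAL - sum(countries.values())

if __name__ == "__main__":
    print("China:", china_share, "% | US:", us_share, "%")
    print("Lenovo + Inspur + Sugon:", chinese_vendor_share, "%")
    print("Cray + HPE:", cray_hpe_share, "%")
    print("Intel CPUs:", intel_cpu_share, "% | no accelerators:", no_accel_share, "%")
    print("Entries from other countries:", other_country_entries)
```

By this arithmetic China holds 43.8% of the entries and the US 23.2%, and a combined Cray+HPE would account for 16.4% of the list.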

CryptoSuper500

Shahin mentions the 2nd edition of the CryptoSuper500 list (really 50 for now), a list developed by his colleague Dr. Stephen Perrenod, which was launched last November and is being released at the same time as the TOP500. The TOP500 has spawned variations that look at different workloads and attributes, for example, the Green500, Graph500, and IO500 lists. CryptoSuper500 was inspired by those lists. The material for the inaugural edition of the CryptoSuper500 list is here.

Cryptocurrency mining operations are often pooled and are very much supercomputing class, typically using accelerator technologies such as custom ASICs, FPGAs, or GPUs. Bitcoin is the most notable of such currencies. Scroll down for the top-10 list and see the slides for the full list and the methodology.

Catch of the Week

Henry:

Henry talks about the check-out lanes at Target all being down for unknown reasons, though he hesitates to call that a cybersecurity breach. It turned out he was right: the company blamed an “internal technology issue”.

Target down (then back up) as cash registers fail and leave long lines

Target’s payment systems appeared to be missing the mark the day before Father’s Day, as terminals went AWOL for a couple of hours in a number of the company’s US retail outlets. The outage caused long lines but prompted an encouraging show of sympathy for Target employees from people on Twitter. And there were some jokes too, of course.

Shahin:

Facebook is expected to release a new cryptocurrency that is already impacting the crypto market.

Here’s what we know so far about the secretive Facebook coin

Facebook is likely to release information about its secretive cryptocurrency project, codenamed Libra, as soon as June 18, TechCrunch reports.

As is traditional with new cryptocurrencies, the social networking giant is expected to release a so-called “white paper” outlining how the currency works and the company’s plans for it.

Dan:

Dan reminds us all of the inimitable Erich Anton Paul von Däniken and his ancient astronauts hypotheses!

Listen in to hear the full conversation.

Enterprises go HPC, Chips go Open Source, China goes for the top spot

We keep wanting to make these introductions pretty brief, but not this time, apparently! Here’s this week’s synopsis.

Nvidia GTC 2019 announcements

We discussed the recent GTC conference. Dan has been attending since well before it became the big and important conference that it is today. We get a quick update on what was covered: the long keynote, automotive and robotics, the Mellanox acquisition, and how a growing fraction of enterprise applications will be AI.

In agreement with the message from GTC, Shahin reiterates his long-held belief that the future of enterprise applications will be HPC and once again asserts that AI as we know it today is a subset of HPC. Not everyone agrees. Henry brings up varying precisions in AI, and a discussion ensues about what HPC is. There seems to be agreement that regardless of what label you put on it, it is the same (HPC) industry and community that is driving this new trend. That led to a discussion of selling into the enterprise and the need for new models and vocabulary.

Speaking of varying precision, the team also briefly discusses Nvidia’s new automatic mixed precision capability for TensorFlow.
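
For readers wondering why precision matters here, the following is a small Python sketch of loss scaling, a technique commonly used in mixed-precision training. It is not Nvidia's implementation and does not use TensorFlow; it only uses the standard library to round numbers to 16-bit floats. The gradient value and the scale factor are hypothetical, chosen for illustration.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # The "e" format code packs a 16-bit float; unpacking returns it as a Python float.
    return struct.unpack("<e", struct.pack("<e", x))[0]


LOSS_SCALE = 1024.0  # hypothetical scale factor, a power of two
tiny_grad = 1e-8     # hypothetical gradient, smaller than fp16 can represent

# Stored directly in half precision, the gradient underflows to zero.
naive = to_fp16(tiny_grad)

# Scaled up first, it survives the trip through half precision ...
scaled = to_fp16(tiny_grad * LOSS_SCALE)

# ... and can be scaled back down in higher precision afterwards.
recovered = scaled / LOSS_SCALE

if __name__ == "__main__":
    print("stored directly in fp16:", naive)
    print("scaled, stored, unscaled:", recovered)
```

The recovered value is not exact, since half precision keeps few significant bits, but it stays close to the original instead of vanishing.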

China plans multibillion dollar investment in supercomputing

On the heels of the Aurora announcement, the South China Morning Post reported that China is investing to take the top spot in supercomputing. No surprise, but interesting to see, and consistent with the general view that supercomputing drives competitive strength.

Catch of the Week

Henry:

Facebook Stored Hundreds of Millions of User Passwords in Plain Text for Years

Hundreds of millions of Facebook users had their account passwords stored in plain text and searchable by thousands of Facebook employees — in some cases going back to 2012, KrebsOnSecurity has learned. Facebook says an ongoing investigation has so far found no indication that employees have abused access to this data.

Shahin:

MIPS R6 Architecture Now Available for Open Use

MIPS 32-bit and 64-bit architecture – the most recent version, release 6 – will become available Thursday (March 28) for anyone to download at MIPS Open web page. Under the MIPS Open program, participants have full access to the MIPS R6 architecture free of charge – with no licensing or royalty fees.

Dan:

Vengeful sacked IT bod destroyed ex-employer’s AWS cloud accounts. Now he’ll spent rest of 2019 in the clink

An irate sacked techie who rampaged through his former employer’s AWS accounts with a purloined login, nuking 23 servers and triggering a wave of redundancies, has been jailed.

Dead LAN’s hand: IT staff ‘locked out’ of data center’s core switch after the only bloke who could log into it dies

‘We can replace it but we have no idea what the config is on the device’

Listen in to hear the full conversation.
