Global Lambda Integrated Facility

To be, or not to be?

8 October 2018 -- The 18th Annual Global LambdaGrid Workshop was held on 19-21 September 2018 at the Kulturværftet in Helsingør (Elsinore), Denmark. Kronborg Castle, located next to the venue, was immortalised as Elsinore in the William Shakespeare play Hamlet, but there proved to be nothing rotten with the state of high-bandwidth networking as 50 participants from 19 countries came to hear how these networks are facilitating exascale computing in support of biological, medical, physics, energy production and environmental research, and to discuss the latest infrastructure developments. An important focus of the workshop was also to consider how high-bandwidth research connections and exchange points can be better planned and coordinated, and whether the GLIF and GNA (Global Network Architecture) Tech activities should be merged into a new entity.

The event was co-located with the 30th NORDUnet Conference (NDN18), and thanks go to NORDUnet for hosting and supporting the workshop.

The keynote was provided by Steven Newhouse (EBI), who presented the ELIXIR Compute Platform being used for analysing life science data. In common with high-energy physics, genomics research produces a lot of data, but the data is more complex and variable, requires sequencing and imaging on shorter timescales, and of course raises privacy issues. The European Molecular Biology Laboratory is based across six countries and employs over 1,600 people, but also collaborates with thousands of other scientists and requires access to existing national repositories as well. High-bandwidth networks are therefore necessary to interconnect its on-site computer and storage clusters, and will increasingly be necessary to facilitate connectivity with other research and commercial cloud resources such as EGI.eu and HelixNebula.

David Martin (Argonne National Laboratory) continued this theme by presenting the US Department of Energy's Exascale Computing Initiative. This aims to develop and operate the next generation of supercomputers at the Argonne, Lawrence Livermore, Los Alamos and Oak Ridge National Labs by 2021, along with a software stack that will present a common computing platform for supporting advanced research applications and neural networks. The Argonne Leadership Computing Facility will be based around an Intel Aurora supercomputer with over 1,000 petaflops of processing, 8 PB of memory, and 10 TB/s of input/output capability that will require future network connections in the petabit-per-second range.

Joe Mambretti (Northwestern University) then discussed the Open Science Data Cloud (OSDC), an open-source cloud-based infrastructure that allows scientists to manage, share and analyse large datasets. The aim is to have 1-2 PB of storage at each participating campus, interconnected with 100 Gb/s+ links, but presented and managed as a common namespace with uniform interfaces and policies.

The rest of the day was devoted to how network automation can integrate compute and storage facilities, particularly across multiple domains. Migiel de Vos (SURFnet) presented the work being undertaken for SURFnet 7, and explained the distinction between automation and orchestration: the former is task- and domain-specific, whilst the latter builds intelligent processes that combine multiple automated tasks across multiple domains. This requires the development of new information models, standardised interfaces, automated administration, and predetermined service delivery agreements.
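As a rough illustration of that distinction (a minimal sketch with hypothetical function and domain names, not SURFnet's actual tooling), the orchestration layer can be thought of as code that chains the per-domain automated tasks into an end-to-end service:

```python
# Hypothetical sketch: automation is task- and domain-specific,
# orchestration coordinates those automated tasks across domains.

from dataclasses import dataclass

@dataclass
class CircuitRequest:
    src_domain: str
    dst_domain: str
    bandwidth_gbps: int

def automate_provision(domain: str, request: CircuitRequest) -> str:
    """Domain-specific automation: configure one network domain (stub)."""
    print(f"[{domain}] provisioning {request.bandwidth_gbps} Gb/s segment")
    return f"{domain}-segment-ok"

def orchestrate_end_to_end(request: CircuitRequest, path: list[str]) -> list[str]:
    """Orchestration: run the automated task in every domain along the path."""
    return [automate_provision(domain, request) for domain in path]

if __name__ == "__main__":
    req = CircuitRequest("SURFnet", "ESnet", 100)
    print(orchestrate_end_to_end(req, ["SURFnet", "NetherLight", "ESnet"]))
```

In a real deployment each automated step would of course drive actual configuration systems through standardised interfaces rather than print statements.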

Gerben van Malenstein (SURFnet) then discussed the LHCONE point-to-point service, which allows Layer 2 circuits to be dynamically established between Data Transfer Nodes for exchanging data from the Large Hadron Collider. This was built on the AutoGOLE work, which was now enabled on 21 open exchange points. Nevertheless, whilst AutoGOLE was a functional and proven multi-domain system, there was still limited uptake by network services and end-users; broader adoption would be necessary to completely remove human configuration of network equipment and create a truly global research platform.
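For context, AutoGOLE builds on the OGF Network Service Interface (NSI) Connection Service, whose reservation lifecycle proceeds through reserve, reserveCommit and provision steps. The sketch below only illustrates that message sequence; the NsiClient class, its methods and the endpoint names are hypothetical stand-ins, since real deployments speak the NSI CS 2.0 SOAP protocol through network service agents rather than this illustrative API.

```python
# Conceptual sketch of the NSI Connection Service lifecycle used by AutoGOLE:
# reserve -> reserveCommit -> provision. All names below are placeholders.

class NsiClient:
    def __init__(self, aggregator_url: str) -> None:
        self.aggregator_url = aggregator_url

    def reserve(self, src_stp: str, dst_stp: str, bandwidth_mbps: int) -> str:
        """Ask the aggregator to hold resources along the path (stub)."""
        print(f"reserve {src_stp} <-> {dst_stp} at {bandwidth_mbps} Mb/s")
        return "connection-id-1"

    def reserve_commit(self, connection_id: str) -> None:
        """Confirm the held reservation."""
        print(f"reserveCommit {connection_id}")

    def provision(self, connection_id: str) -> None:
        """Activate the Layer 2 circuit so the DTNs can start transferring data."""
        print(f"provision {connection_id}")

if __name__ == "__main__":
    nsi = NsiClient("https://aggregator.example.org/nsi")  # placeholder URL
    cid = nsi.reserve("netherlight:dtn-1", "starlight:dtn-2", 100000)
    nsi.reserve_commit(cid)
    nsi.provision(cid)
```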

The Governance Working Group, chaired by David Wilde (AARNet), largely focused on the reporting formalities for the activities and finances during the previous year, as the main discussion of the GLIF and GNA proposals was scheduled for the following day. There was discussion about the format for future meetings though, which it was suggested should comprise two events per year held in conjunction with GNA Tech meetings and preferably co-located with other major R&E networking conferences. However, one of these should ideally be held shortly before Supercomputing each year, in a venue that allows high-performance demonstrations to be given a 'trial run' in advance of SC. It was also felt beneficial to retain the Programme Committee for these meetings, and Buseung Cho (KISTI) volunteered to join it.

The GLIF Americas Working Group meeting (chaired by Maxine Brown, University of Illinois at Chicago, and Joe Mambretti, Northwestern University) was also held the day before the workshop, and discussed developments and requirements in North and South American R&E networking.

Most of the following day was devoted to technical discussions chaired by Lars Fischer (NORDUnet) and Eric Boyd (University of Michigan). These focused on some practical examples of network automation being used at the University of Michigan, a passive network measurement system with programmable querying at 100 Gb/s line rates being developed by the IRNC AMIS project, as well as discussions on how to automate the generation of network topology maps.

Topology maps help users see how they can reach counterparts in other parts of the world, and where particular services are available. They are also useful as a marketing tool, showing investors and stakeholders how they contribute towards creating a truly global infrastructure and demonstrating how widely the NREN model is accepted around the world; the GLIF map, for example, has become a somewhat iconic piece of artwork.
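As a hedged sketch of what automating such a map might look like (an assumed link inventory and the networkx/matplotlib libraries, not the actual GLIF map workflow): the links are kept as data and the drawing is regenerated from them, so the map never drifts from the inventory.

```python
# Illustrative sketch: regenerate a topology map from a link inventory.
# The link list below is hypothetical example data.

import networkx as nx
import matplotlib.pyplot as plt

# (exchange point A, exchange point B, capacity in Gb/s)
links = [
    ("NetherLight", "StarLight", 100),
    ("StarLight", "KRLight", 100),
    ("NetherLight", "MAN LAN", 100),
]

graph = nx.Graph()
for a, b, capacity in links:
    graph.add_edge(a, b, capacity=capacity)

pos = nx.spring_layout(graph, seed=42)  # deterministic layout
nx.draw_networkx(graph, pos, node_color="lightblue", font_size=8)
nx.draw_networkx_edge_labels(
    graph, pos,
    edge_labels={(a, b): f"{d['capacity']} Gb/s" for a, b, d in graph.edges(data=True)},
)
plt.axis("off")
plt.savefig("topology_map.png", dpi=200)
```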

Other developments were the establishment of a new GOLE called South Atlantic Crossroads (SAX) based in Fortaleza, Brazil, which was expected to interconnect with new cable systems to Angola (SACS) and Portugal (EllaLink), as well as to AMPATH and SouthernLight over the existing MONET connection. There were also plans to procure a new 100 Gb/s connection from Europe to the Asia-Pacific, running from Geneva to Singapore via the Indian Ocean, to supplement the existing link from Amsterdam to Tokyo via Russia.

There were further updates on the new KREOnet network which supported 100 Gb/s links between five major Korean cities and Chicago (StarLight) via KRLight, as well as multiple 10 Gb/s links to 11 other Korean cities, Hong Kong and Seattle. The KREOnet-S infrastructure further offered SDN capabilities permitting dynamic and on-demand virtual network slicing, whilst a Science DMZ provided high-performance computing facilities for KISTI's new 25.5 petaflop supercomputer.

SURFnet was transitioning its network to SURFnet 8 and would be upgrading its core network and international links, whilst StarLight was developing a Trans-Pacific SDN testbed, as well as an SDX for the GENI initiative.

The closing plenary session focused on how to better coordinate and/or combine the GLIF and GNA Technical Working Group activities. The GLIF Co-Chairs Jim Ghadbane (CANARIE) and David Wilde (AARNet) outlined some ideas around this, and then opened the floor for discussion on how things should proceed.

It was acknowledged that GLIF and GNA Tech have different focuses, with GNA more concerned with delivering outcomes for production networks, whilst GLIF is more research oriented, one of its strengths being its ability to test technologies that are not always production ready. However, there was significant overlap amongst the participants, it was recognised that GLIF has moved beyond facilitating lambdas, and the GLIF and GNA activities were seen as complementary. There was therefore benefit in holding meetings together and coordinating the respective activities, with a view to creating a global forum for discussing lower-layer issues in R&E networks. Equally though, it was felt that GLIF has been beneficial to many projects, activities and infrastructures over the years, and that the essence of its open, consensus-driven approach should be maintained.

The details of how this should be undertaken need further discussion, and there was consensus that a multi-stakeholder governance model needed to be defined before the scope of the activities was agreed. The GLIF and GNA leadership will discuss the next steps and come up with a more specific proposal during October, with a view to holding a videoconference amongst stakeholders towards the end of the month.

The workshop concluded with a closing address from GLIF Co-Chair David Wilde who thanked NORDUnet for hosting the workshop, the speakers, and everyone who contributed to the discussions over the three days.

The proceedings of the workshop are available at http://www.glif.is/meetings/2018/

About GLIF -- The Global Lambda Integrated Facility (GLIF) is an international virtual organisation of NRENs, consortia and institutions that promotes lambda networking. GLIF provides lambdas internationally as an integrated facility to support data-intensive scientific research, and supports middleware development for lambda networking. It brings together some of the world's premier networking engineers to develop an international infrastructure by identifying equipment, connection requirements, and necessary engineering functions and services. More information is available on the GLIF website at http://www.glif.is/