Handbook of European HPC projects

ASPIDE

exAScale ProgramIng models for extreme Data procEssing

OBJECTIVES 

The ASPIDE project will contribute new programming paradigms, APIs, runtime tools and methodologies for expressing data-intensive tasks on Exascale systems. These can pave the way for exploiting massive parallelism over a simplified model of the system architecture, promoting high performance and efficiency, and offering powerful operations and mechanisms for processing extreme data sources at high speed and/or in real time.

DCEX PROGRAMMING MODEL 

The design of parallel patterns for programming Exascale applications has been one of the main goals of the ASPIDE project. DCEx provides several features that improve the performance of data-intensive computations on Exascale systems, where the cost of accessing, moving, and processing data across a parallel system can be enormous. Workflow modelling in DCEx makes it possible to set up data life cycle management that supports data locality and data affinity. The elementary workflow units allow either placing the data close to the computational node where it is processed (data locality) or moving the computation to where the data was previously generated (data affinity, which avoids data movements). In this way, the proposed solution helps application developers access and use resources without having to manage low-level architectural entities. Likewise, we want to provide an easy way to switch among different execution modes or policies without modifying the application’s source code.
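
The locality idea described above can be illustrated with a minimal sketch. It is not the DCEx API: the function `place_tasks`, its arguments, and the node and block names are hypothetical, and the fallback to the least-loaded node is an assumption made only for this example.

```python
def place_tasks(tasks, block_location, nodes):
    """Assign each task to the node that already holds its input block.

    tasks:          dict mapping task name -> input block id (hypothetical names)
    block_location: dict mapping block id -> node currently holding that block
    nodes:          list of available node names
    When the block location is unknown, fall back to the least-loaded node.
    """
    load = {node: 0 for node in nodes}
    placement = {}
    for task, block in tasks.items():
        node = block_location.get(block)
        if node not in load:
            # Assumed fallback policy: least-loaded node, ties broken by name.
            node = min(nodes, key=lambda n: (load[n], n))
        placement[task] = node
        load[node] += 1
    return placement


if __name__ == "__main__":
    tasks = {"filter": "block-1", "reduce": "block-2", "report": "block-9"}
    block_location = {"block-1": "node0", "block-2": "node1"}
    print(place_tasks(tasks, block_location, ["node0", "node1", "node2"]))
```

Because the policy lives in one function, a different policy could be swapped in without touching the task definitions, which mirrors the goal of switching execution policies without changing application code.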

AD-HOC IN-MEMORY STORAGE SYSTEM (IMSS) 

IMSS is a proposal to enhance I/O in both traditional HPC and High-Performance Data Analytics (HPDA) systems. The architecture follows a client-server model in which the client itself is responsible for deploying the server entities. We propose two deployment modes: an application-attached deployment constrained to the application’s nodes, and an application-detached deployment that uses offshore nodes. The client layer is in charge of exploiting data locality and implements multiple I/O patterns that provide diverse data distribution policies.
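
To make the notion of a data distribution policy concrete, the sketch below maps the blocks of a dataset onto storage servers in three ways. The policy names and function signatures are illustrative assumptions and are not taken from the IMSS interface.

```python
import zlib


def round_robin(block_index, num_servers):
    """Spread consecutive blocks evenly across the servers."""
    return block_index % num_servers


def hashed(dataset, block_index, num_servers):
    """Pick a server from a stable hash of the dataset name and block index."""
    key = f"{dataset}:{block_index}".encode()
    return zlib.crc32(key) % num_servers


def local_first(local_server):
    """Keep every block on the server co-located with the client (locality)."""
    return local_server


if __name__ == "__main__":
    # Hypothetical dataset name and a deployment with four servers.
    for block in range(6):
        print(block, round_robin(block, 4), hashed("traces", block, 4), local_first(2))
```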

INTELLIGENT ANOMALY DETECTION SYSTEM

ASPIDE proposes a data analysis engine that detects events in monitoring data for the purpose of application autotuning. We developed a machine-learning-based anomaly detection approach capable of detecting anomalies during Exascale application execution, such as hardware failures and communication bottlenecks. We use the event and anomaly detection engine to constrain the search space of the optimization problem, thus further improving the execution efficiency of Exascale applications.
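
As a rough illustration of point-anomaly detection on monitoring data, the sketch below flags samples using a robust z-score (median and median absolute deviation). This is a simple statistical stand-in chosen for brevity, not the machine learning models used in ASPIDE; the threshold of 3.5 and the sample values are assumptions.

```python
import statistics


def point_anomalies(samples, threshold=3.5):
    """Return indices of samples whose robust z-score exceeds the threshold."""
    med = statistics.median(samples)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = statistics.median([abs(x - med) for x in samples])
    if mad == 0:
        return []  # No measurable spread, so nothing can be ranked as anomalous.
    return [
        i for i, x in enumerate(samples)
        if abs(0.6745 * (x - med) / mad) > threshold
    ]


if __name__ == "__main__":
    # Hypothetical message latencies in milliseconds, with one obvious spike.
    latencies = [10, 11, 10, 12, 11, 10, 95, 11]
    print(point_anomalies(latencies))
```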

AUTO-TUNING

We introduce the ASPIDE auto-tuning approach, based on a multi-objective optimization algorithm that considers a multi-dimensional search space with pluggable objectives, including execution time and energy. Moreover, to further improve application execution, the ASPIDE approach uses a machine learning (ML) based event detection approach capable of identifying point and contextual anomalies. In general, the ASPIDE auto-tuner helps developers understand the non-functional properties of their applications by making it easy to analyse and experiment with the input parameters. It further supports them in exposing the insights they obtain through tunable parameters.
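
With several objectives such as execution time and energy, there is usually no single best configuration but a set of trade-offs. The sketch below shows the general idea of keeping only non-dominated configurations (a Pareto front). It illustrates the concept only and does not reproduce the ASPIDE optimization algorithm; the configuration labels and measurements are invented.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one.

    All objectives are minimized (for example time in seconds, energy in joules).
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(results):
    """Keep configurations that no other configuration dominates.

    results: dict mapping a configuration label -> tuple of objective values
    """
    return {
        config: objectives
        for config, objectives in results.items()
        if not any(dominates(other, objectives) for other in results.values())
    }


if __name__ == "__main__":
    # Hypothetical measurements: (execution time, energy) per configuration.
    measured = {
        "threads=2": (18.0, 45.0),
        "threads=4": (10.0, 50.0),
        "threads=8": (6.0, 70.0),
        "threads=16": (6.5, 90.0),
    }
    print(sorted(pareto_front(measured)))
```

Additional objectives can be plugged in by extending the tuples, since `dominates` compares them element by element.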


LARGE SCALE MONITORING

One important challenge in Exascale computing is developing scalable components that can monitor, in a coordinated and efficient manner, the use of hardware resources and the behaviour of applications. ASPIDE provides a hierarchical monitoring system built from well-known components: data aggregators, machine learning techniques, and time series analysis, which aim to reduce the overhead of sensing large-scale infrastructures.
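
One way hierarchical aggregation can reduce monitoring traffic is sketched below: each level forwards a fixed-size summary instead of raw samples, and summaries merge without loss of count, mean, minimum, or maximum. The `Summary` type and the node/rack layout are assumptions made for illustration and do not describe the actual ASPIDE monitor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    """Fixed-size digest of a set of metric samples."""
    count: int
    total: float
    minimum: float
    maximum: float

    @property
    def mean(self):
        return self.total / self.count


def summarize(samples):
    """Leaf level: condense raw samples collected on one node."""
    return Summary(len(samples), sum(samples), min(samples), max(samples))


def merge(summaries):
    """Aggregator level: combine child summaries into one of the same size."""
    summaries = list(summaries)
    return Summary(
        count=sum(s.count for s in summaries),
        total=sum(s.total for s in summaries),
        minimum=min(s.minimum for s in summaries),
        maximum=max(s.maximum for s in summaries),
    )


if __name__ == "__main__":
    # Hypothetical CPU-load samples: two nodes in rack A, one node in rack B.
    rack_a = merge([summarize([1.0, 3.0]), summarize([5.0])])
    rack_b = merge([summarize([7.0])])
    root = merge([rack_a, rack_b])
    print(root, root.mean)
```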

PROJECT’S CONTACT:

Javier Garcia-Blas

Transition to Exascale Computing

Call:
FETHPC-02-2017

Coordinating Organization:
Universidad Carlos III de Madrid, Spain

Project Timespan
2018-06-15 – 2021-06-14

Other Partners:
  • Institute e-Austria Timisoara, Romania
  • Università della Calabria, Italy
  • Universität Klagenfurt, Austria
  • Instytut Chemii Bioorganicznej Polskiej Akademii Nauk – Poznań Supercomputing and Networking Center (PSNC), Poland
  • Servicio Madrileño de Salud, Spain
  • INTEGRIS SA, Italy
  • Atos (Bull SAS), France