Author: Francisco Javier Nieto De Santos (ATOS SPAIN)
New ICT applications for agriculture are becoming more and more complex. At the same time, new types of data and very large amounts of information are now accessible that were not available before. This makes it possible to do research on new aspects of agriculture (how crops can be grown in an optimal way, how to make better decisions, how crops can be affected by infections and climate change, etc.) and on new applications for agriculture.
EUXDAT aims at providing an e-Infrastructure that supports research on agriculture-related topics and makes it possible to deliver applications for agriculture, by enabling access to a large amount of data and computational resources.
The main features that we wanted to cover with EUXDAT are:
- Provide several data connectors in order to enable the collection of data of different types and from several data sources (satellite images, images taken from drones with hyperspectral cameras, sensor data, soil-related datasets, LPIS, DEM, etc.);
- Provide easy access to computational and storage resources, both Cloud and HPC, so that users do not need to know the underlying details and the resources are used optimally;
- Facilitate data cataloging, management and movement in such a way that it is easy for users to publish and search for data, and to move and use it as needed;
- Provide data analytics tools, development tools and some implemented scenarios, so users can run applications for agriculture and experiment with their own code.
In order to implement these features, we have proposed an architecture that covers all the functional and non-functional requirements that we have been collecting since the beginning of the project. This architecture was defined in detail in one of our deliverables (D2.6).
We proposed a few components to manage the infrastructure in the background: the Orchestrator, the Monitoring, the SLA Manager and the Billing & Accounting components.
The Orchestrator takes the workflows that define the applications and runs them on HPC and Cloud resources, aiming at optimizing resource usage. The SLA Manager makes it possible to use Service Level Agreements, allowing different quality levels for the execution (e.g. gold, silver, etc.) at different costs. The Monitoring collects information about the infrastructure and applications (their status, whether there are errors, availability, etc.), which is used by the Orchestrator and the SLA Manager, among other components. The Billing & Accounting component collects information about users’ activity in order to charge them later on, depending on the resources and applications used.
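To make the relationship between SLA levels and billing more concrete, here is a minimal Python sketch. It is only an illustration: the `UsageRecord` type, the `bill` function and the prices are hypothetical and are not taken from the EUXDAT implementation. It shows usage records, as an accounting component might collect them, being turned into a charge per user that depends on the SLA level.

```python
from dataclasses import dataclass

# Hypothetical prices in cents per CPU hour for each SLA level.
# The actual EUXDAT levels and costs are not specified in this post.
PRICE_CENTS_PER_CPU_HOUR = {"gold": 10, "silver": 5}


@dataclass
class UsageRecord:
    """One accounting entry: who ran something, at which SLA level, for how long."""
    user: str
    sla_level: str
    cpu_hours: int


def bill(records):
    """Return the total charge in cents per user for the given usage records."""
    totals = {}
    for record in records:
        cost = record.cpu_hours * PRICE_CENTS_PER_CPU_HOUR[record.sla_level]
        totals[record.user] = totals.get(record.user, 0) + cost
    return totals


records = [
    UsageRecord("alice", "gold", 10),
    UsageRecord("alice", "silver", 4),
    UsageRecord("bob", "silver", 20),
]
print(bill(records))
```

Prices are kept as integer cents so that the totals do not suffer from floating-point rounding.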
The boxes in yellow are higher-level components that provide functionality closer to the users, related to the access to data and code: the Data Manager, the Data & Algorithms Catalogue and the Data & Algorithms Repository.
The Data Manager provides not only tools for moving data, but also a good list of data connectors, so users can access climatic information, sensor data, satellite images, etc. This data, together with the code of the applications and libraries that process it, can be hosted in the Data & Algorithms Repository (i.e. providing an internal Git repository). The Data & Algorithms Catalogue, in turn, provides a way to publish, search and access both data and applications/code, so users can navigate through the catalogues and select what they want to use.
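The publish and search behaviour of a catalogue can be sketched in a few lines of Python. This is an assumption-laden toy model, not the EUXDAT API: the `Catalogue` and `CatalogueEntry` classes, their fields and the example entry names are all hypothetical. It only illustrates the idea of registering data and algorithms with keywords and looking them up later.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    """Hypothetical catalogue record for a dataset or an algorithm."""
    name: str
    kind: str  # "data" or "algorithm"
    keywords: set = field(default_factory=set)


class Catalogue:
    """In-memory stand-in for a data and algorithms catalogue."""

    def __init__(self):
        self._entries = []

    def publish(self, entry):
        """Make an entry visible to later searches."""
        self._entries.append(entry)

    def search(self, keyword, kind=None):
        """Return names of entries tagged with keyword, optionally filtered by kind."""
        keyword = keyword.lower()
        return [
            entry.name
            for entry in self._entries
            if keyword in entry.keywords and (kind is None or entry.kind == kind)
        ]


catalogue = Catalogue()
catalogue.publish(CatalogueEntry("example-satellite-tiles", "data", {"satellite", "imagery"}))
catalogue.publish(CatalogueEntry("example-vegetation-index", "algorithm", {"satellite", "vegetation"}))
print(catalogue.search("satellite"))
print(catalogue.search("satellite", kind="algorithm"))
```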
Security is an important aspect, not only because of regulations like GDPR, but also because some data should be kept confidential and because not all users will have access to all the features. Therefore, the Identity and Authorization Manager component takes care of managing user accounts and granting access to the functionalities, enabling Single Sign-On across the e-Infrastructure.
Finally, we defined the EUXDAT Portal as the one-stop shop for accessing all the features provided by the e-Infrastructure. It gives access to the catalogues, to a GUI providing monitoring information, to the Jupyter Notebooks (for online development), to the scenarios’ frontends, etc. It is a set of interfaces that make it easy to access and use the e-Infrastructure features.
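The idea that not all users have access to all features can be illustrated with a simple role-based check. The sketch below is hypothetical: the role names, feature names and the `is_allowed` helper are invented for illustration and do not describe how the Identity and Authorization Manager is actually implemented.

```python
# Hypothetical mapping from roles to the features they may use.
ROLE_FEATURES = {
    "researcher": {"catalogue", "notebooks", "monitoring"},
    "farmer": {"catalogue", "scenarios"},
}

# Hypothetical user accounts with their assigned roles.
USER_ROLES = {
    "alice": {"researcher"},
    "bob": {"farmer"},
}


def is_allowed(user, feature):
    """Return True if any role assigned to the user grants the feature."""
    roles = USER_ROLES.get(user, set())
    return any(feature in ROLE_FEATURES.get(role, set()) for role in roles)


print(is_allowed("alice", "notebooks"))
print(is_allowed("bob", "notebooks"))
```

Unknown users and unknown roles resolve to empty sets, so access is denied by default.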
Thanks to this set of high-level components, it is possible to implement a full solution that covers the current requirements and that, in the future, will allow for a flexible evolution and improvement.