Quantyca Data Science Lab BNL
Business Summary

BNL Gruppo BNP Paribas, with over 100 years of history, is one of Italy’s leading banking groups, with 2.5 million private customers, 130,000 small businesses and professionals and over 33,000 companies and institutions.
Since 2006, the Bank has been part of the BNP Paribas Group, one of the world leaders in banking and financial services, operating in 72 countries.

 

The need to better understand correlations and trends concerning complex phenomena such as customer preferences, changes in demand for a specific product and/or service, or the analysis of market competition, has led BNL to invest in the acquisition of specific skills for extracting useful information from the multitude of data it has at its disposal on a daily basis.

The results of the strategic actions driven by these insights, together with the growing availability of structured and unstructured data, led BNL to create in 2020 a platform for the industrialisation of data science processes that promotes a systematic approach to data analysis.

The new platform, created in collaboration with Quantyca, has enabled BNL to standardise the set of methods, processes, algorithms and technologies used by its data scientists, thereby reducing the cost and time required to develop and release Machine Learning (ML) and Artificial Intelligence (AI) models.

Challenges

Continuous demand from the business for insights from data analysis has led to steady growth in both technical skills and the number of data professionals joining the various teams.
As of the beginning of 2022, 10 working groups of 4-5 members each carry out data analysis on a daily basis, for a total of about 50 data scientists. These numbers are still growing and call for technological and methodological standardisation of the processes these teams follow.

The main problem with lacking an IT infrastructure capable of hosting end-to-end Data Science processes is the difficulty of reproducing experiments. That is, it must be possible to rebuild a model at any time after its initial implementation with at most marginal variations in inference performance. Reproducibility reduces errors, speeds up experimentation, encourages reuse, and gives all stakeholders confidence in the validity of the results obtained. Without reproducibility, it would also not be possible to adopt practices that support collaboration and automation.
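The reproducibility requirement described above can be illustrated with a minimal sketch. It is not taken from BNL's platform: the toy dataset, the threshold "model" and all function names are hypothetical, and only the Python standard library is used. The idea shown is that an experiment fully determined by a stored configuration (here a seed and a sample count), plus a fingerprint of its input data, can be rebuilt later and checked against the original.

```python
import hashlib
import json
import random


def make_dataset(seed: int, n: int = 200) -> list[tuple[float, int]]:
    """Generate a toy labelled dataset deterministically from a seed."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)
        # Label is 1 when x is above 0.6, with roughly 10% label noise.
        y = int(x > 0.6)
        if rng.random() < 0.1:
            y = 1 - y
        data.append((x, y))
    return data


def train_threshold(data, candidates: int = 101) -> float:
    """Pick the threshold on x that maximises training accuracy."""
    best_t, best_acc = 0.0, -1.0
    for i in range(candidates):
        t = i / (candidates - 1)
        acc = sum(int(x > t) == y for x, y in data) / len(data)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t


def run_experiment(config: dict) -> dict:
    """Run one experiment and return a record that allows it to be rebuilt."""
    data = make_dataset(config["seed"], config["n_samples"])
    threshold = train_threshold(data)
    accuracy = sum(int(x > threshold) == y for x, y in data) / len(data)
    # Fingerprint the exact inputs so a later rerun can be checked against them.
    data_hash = hashlib.sha256(json.dumps(data).encode()).hexdigest()
    return {"config": config, "data_sha256": data_hash,
            "threshold": threshold, "accuracy": accuracy}


config = {"seed": 42, "n_samples": 200}  # hypothetical stored configuration
first = run_experiment(config)
second = run_experiment(config)  # a later rerun from the stored configuration
```

In this sketch the two records are identical because every source of randomness is derived from the stored seed. In a real setting the same principle would extend to versioned code, data and library environments.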

The absence of a basic infrastructure also increases the time, and thus the cost, of model development and, above all, makes it difficult to move models into production. Detecting any degradation of the algorithms requires a monitoring system that can assess model performance, and processes must be standardised so that the activities of data scientists comply with privacy and security policies.

Finally, thinking in terms of a platform for the industrialisation of Data Science processes makes it possible to invest in the relationship between traditional IT and Data Scientists, guaranteeing the development agility the latter require and, at the same time, the architectural robustness typical of the work of Data/Software Engineers.

Solution

The entire solution is based on an infrastructure that automates the data processing needed to compute the features required by ML models, as well as the training and execution of those models. At the same time, it integrates easily with BNL’s systems through API exposure (online serving) or batch integration techniques (offline serving).
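The difference between online and offline serving can be sketched as follows. This is an illustration under stated assumptions, not the platform's actual code: the scoring function, the feature names (balance_norm, activity_norm) and the CSV layout are all hypothetical. The sketch shows one prediction function reused by two entry points, one answering a single JSON request and one scoring a whole batch file.

```python
import csv
import io
import json


def predict(features: dict) -> float:
    """Hypothetical scoring function standing in for a trained model."""
    score = 0.3 * features["balance_norm"] + 0.7 * features["activity_norm"]
    return round(score, 4)


def serve_online(request_body: str) -> str:
    """Online serving: score one JSON request and return a JSON response."""
    features = json.loads(request_body)
    return json.dumps({"score": predict(features)})


def serve_offline(input_csv: str) -> str:
    """Offline serving: score every row of a CSV batch and return a CSV."""
    reader = csv.DictReader(io.StringIO(input_csv))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["customer_id", "score"])
    for row in reader:
        features = {"balance_norm": float(row["balance_norm"]),
                    "activity_norm": float(row["activity_norm"])}
        writer.writerow([row["customer_id"], predict(features)])
    return out.getvalue()


# Made-up inputs, for illustration only.
online = serve_online('{"balance_norm": 0.5, "activity_norm": 1.0}')
batch = serve_offline("customer_id,balance_norm,activity_norm\n"
                      "c1,0.5,1.0\n"
                      "c2,0.0,0.5\n")
```

Keeping the model logic in a single function means both integration styles return the same score for the same input, which is one plausible way to keep online and batch consumers consistent.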

 

Thanks to Quantyca we have built a digital ecosystem available to our data scientists that enables the development and training of Machine Learning models, but which is at the same time highly integrable with pre-existing banking systems. The integrability is the real added value of the platform as it allows the improvement of our business processes and the achievement of objectives through modern Machine Learning and Artificial Intelligence techniques.

Giovanni Cauzillo - Head of Data Intelligence Platform at BNL

 

BNL, with the support of Quantyca, was able to create the Data Science Lab, a self-service platform that data scientists use to develop, train and deploy machine learning models. The platform facilitates and automates this process while making it scalable and reproducible.
It also provides tools for isolating environments and projects, provisioning the development environment, and versioning code and models.

The environment hosting the Data Science Lab is divided into two macro areas: laboratory and production. The laboratory area manages the components that data scientists use during development. The production area, on the other hand, is dedicated to storing projects and models and to executing them.

The development environment provides self-service access to JupyterHub servers pre-configured with the necessary tooling (Python and Conda libraries) to create, train and run models in Jupyter notebooks.
Models are chosen for production using MLflow, which records the runs of the developed models together with their parameters and performance metrics.
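The selection step can be sketched independently of any tracking tool. The code below does not use MLflow's API: RunRecord and select_candidate are hypothetical names, and the parameters and metric values are made up for illustration. It only shows the underlying idea, namely that once each run is stored with its parameters and metrics, choosing a candidate for production becomes a simple comparison over recorded runs.

```python
from dataclasses import dataclass, field


@dataclass
class RunRecord:
    """Hypothetical stand-in for one tracked model run."""
    run_id: str
    params: dict
    metrics: dict = field(default_factory=dict)


def select_candidate(runs, metric: str, minimum: float):
    """Return the best run on `metric`, or None if none reaches `minimum`."""
    eligible = [r for r in runs
                if r.metrics.get(metric, float("-inf")) >= minimum]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r.metrics[metric])


# Made-up runs, for illustration only.
runs = [
    RunRecord("run-a", {"max_depth": 3}, {"auc": 0.81}),
    RunRecord("run-b", {"max_depth": 5}, {"auc": 0.86}),
    RunRecord("run-c", {"max_depth": 8}, {"auc": 0.84}),
]
best = select_candidate(runs, metric="auc", minimum=0.85)
```

The minimum threshold is a design choice in this sketch: it lets the selection return nothing rather than promote the least bad of a set of weak runs.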

The production environment uses GitLab to create the CI/CD pipelines that automate the building of usable Docker images for model execution, version them in a private Docker registry and expose them to consumers through APIs built with Flask.
Alongside these technologies, the Elastic Stack is used to monitor the infrastructure and the events and performance of individual components, while Kubernetes orchestrates the Docker containers and takes care of automatically allocating the necessary resources.

Results

Thanks to the new platform for the industrialisation of Data Science processes, BNL has empowered its Data Scientists and their ecosystem with skills and tools that enable them to produce deliverables with the same architectural robustness as traditional development.
In essence, the Data Science Lab environment reconciles the agility needs of data scientists with the stability and maintainability needs of IT.

The platform has been successfully adopted by BNL’s data scientists, reducing development costs and time and making it easier and faster to release new models into production, so that business needs are met more effectively and efficiently.

Resources

Whitepaper
Free
17/06/2022

BNL – Data Science Lab
