ArangoDB announced the release of ArangoML Pipeline Cloud,
a fully-hosted, fully-managed common metadata layer for production-grade data
science and Machine Learning (ML) platforms. ArangoML Pipeline Cloud runs on
ArangoDB Oasis, ArangoDB's recently released cloud service, and is the latest
offering in ArangoDB's ML extension, ArangoML.
ArangoML Pipeline Cloud meets the needs of both data scientists, who are
concerned with the quality of the data, feature training, and model results,
and DevOps teams, who need to manage which datasets and deployments are in use,
their performance, and how they are being deployed. ArangoML Pipeline Cloud
centralizes the metadata produced across the ML pipeline, providing a common
interface to show relationships of the data, features, and model training
results, as well as the deployments, management, and serving logistics.
The ArangoML Pipeline solution is pipeline-agnostic, allowing any combination
of pipeline components to be connected. Additionally, as a cloud-based service,
it can be up and running in just a few clicks.
"Common metadata is an often overlooked aspect when
building production-grade ML pipelines, but is as important as good
training data," said Jörg Schad, Head of Engineering and Machine
Learning at ArangoDB. "It is not only crucial for DataOps teams when looking
for reproducible builds, audit trails, or compliance with privacy regulations,
but also extremely valuable for data scientists,
allowing them to easily grasp the lineage of models, what artifacts are
involved, and also enabling performance comparisons across different models and
approaches."
As a multi-model database, ArangoDB can easily accommodate and unite unstructured,
highly interlinked data, such as inference and model descriptions, and allow
relationships between them to be stored as a graph that can be managed by the
DevOps engineer and used by the data scientist at the same time.
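To make the idea of pipeline metadata stored as a graph more concrete, the following is a minimal sketch in plain Python. It is not the ArangoML Pipeline API and does not talk to ArangoDB; the `MetadataGraph` class, its methods, and the node names such as `dataset/v1` are hypothetical and chosen only for illustration. It shows how recording "was produced from" relationships lets a data scientist trace the lineage of a deployed model, while a DevOps engineer works with the same records.

```python
class MetadataGraph:
    """Toy in-memory stand-in for a graph of ML pipeline metadata.

    Hypothetical illustration only; not the ArangoML Pipeline interface.
    """

    def __init__(self):
        self.nodes = {}    # node id -> attribute dict
        self.parents = {}  # node id -> list of ids it was produced from

    def add_node(self, node_id, kind, **attrs):
        """Register a metadata record (dataset, featureset, model, ...)."""
        self.nodes[node_id] = {"kind": kind, **attrs}

    def add_edge(self, source, target):
        """Record that `target` was produced from `source`."""
        self.parents.setdefault(target, []).append(source)

    def lineage(self, node_id):
        """Return all upstream node ids, nearest ancestors first."""
        seen, queue, order = set(), [node_id], []
        while queue:
            current = queue.pop(0)
            for parent in self.parents.get(current, []):
                if parent not in seen:
                    seen.add(parent)
                    order.append(parent)
                    queue.append(parent)
        return order


graph = MetadataGraph()
graph.add_node("dataset/v1", "dataset", rows=10000)
graph.add_node("features/v1", "featureset")
graph.add_node("model/v1", "model", algorithm="linear regression")
graph.add_node("deployment/prod", "deployment")

graph.add_edge("dataset/v1", "features/v1")
graph.add_edge("features/v1", "model/v1")
graph.add_edge("model/v1", "deployment/prod")

# Trace everything the production deployment depends on.
print(graph.lineage("deployment/prod"))
# prints ['model/v1', 'features/v1', 'dataset/v1']
```

In a real deployment the nodes and edges would live in database collections rather than Python dictionaries, and the lineage lookup would be a graph traversal query instead of the loop shown here.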