DevOps Basics - 101 Theory

What is DevOps?

_images/devops1.jpg
  • it is not just tools
  • it is not just processes
  • it is not a trendy job title
  • it is not just automation

DevOps is the practice of operations and development engineers participating together in the entire service lifecycle, from design through the development process to production support.

_images/devops-whatisdevops.png

Adopting these practices can lead to more robust and reliable systems. It can also improve the whole development cycle, from R&D through development to production: shorter deployment turnarounds, better system monitoring and alerting, and better development planning. DevOps focuses on continuous integration and continuous delivery of software by leveraging on-demand IT resources (infrastructure as code) and by automating the integration, testing, and deployment of code.
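To make the idea of automating integration, test, and deployment a little more concrete, here is a minimal Python sketch of a pipeline that runs its stages in order and stops at the first failure. It is an illustration only: the function run_pipeline and the stage functions integrate, run_tests, and deploy are hypothetical names invented for this example, and real CI/CD systems express the same idea through their own configuration formats.

```python
from typing import Callable, List, Tuple

# A stage is a (name, step) pair; the step returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that passed.
    """
    passed: List[str] = []
    for name, step in stages:
        if not step():
            print(f"stage '{name}' failed; stopping pipeline")
            break
        passed.append(name)
    return passed


# Hypothetical placeholder stages (assumptions for illustration only).
def integrate() -> bool:
    return True  # e.g. merge and build the code


def run_tests() -> bool:
    return 2 + 2 == 4  # stand-in for an automated test suite


def deploy() -> bool:
    return True  # e.g. release the build to production


if __name__ == "__main__":
    stages = [("integrate", integrate), ("test", run_tests), ("deploy", deploy)]
    print(run_pipeline(stages))
```

The key design point is that deployment never runs unless every earlier stage has succeeded, which is what makes frequent, automated releases safe.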

Why do we need DevOps in Data Science?

Although DataOps inherits many principles from DevOps, it is strongly driven by its stakeholders:

https://cdn-images-1.medium.com/max/1600/1*3rC6Y-U0uUeWQ46Qb-4a6g.png

In this case the main stakeholders are data scientists or analysts who are focused on building and deploying models and visualizations. They focus on domain expertise. They are interested in getting models to be more predictive or deciding how to best visually render data.

When the final product is a deployed model, a data scientist’s workflow looks something like this:

https://github.com/Azure/MachineLearningNotebooks/raw/master/how-to-use-azureml/machine-learning-pipelines/aml-pipelines-concept.png

with the intermediate steps often being a series of iterative processes.

By adopting principles such as continuous integration and continuous deployment, and by monitoring the deployed apps and models, we can increase the reliability of our data solutions.
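One way these principles might carry over to a model is an automated quality gate: a check that runs on every change and blocks deployment when the model's accuracy on held-out data falls below a threshold. The sketch below is a minimal illustration under stated assumptions. The function names accuracy and passes_quality_gate, the default threshold of 0.8, and the sample labels are all hypothetical and chosen only for this example.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def passes_quality_gate(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    min_accuracy: float = 0.8,  # assumed threshold, for illustration only
) -> bool:
    """Return True if the model is accurate enough to be deployed."""
    return accuracy(y_true, y_pred) >= min_accuracy


if __name__ == "__main__":
    # Hypothetical held-out labels and model predictions.
    labels = [1, 0, 1, 1]
    predictions = [1, 0, 0, 1]
    print(accuracy(labels, predictions))
    print(passes_quality_gate(labels, predictions))
```

The same check can be reused after deployment: running it periodically on freshly labelled data is a simple form of monitoring that flags a model whose performance has degraded.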