Machine Learning projects: Agile or Waterfall approach?

The answer to this question is easy: an agile approach. Why? Because agile recognizes that building the solution requires repeated loops.

Reason 1: ML models change over time

Machine Learning projects are built on ML models, and models change over time. Why does a model change over time?

A model changes over time because the data used to train it follows certain distributions, and those distributions can be expected to drift. When live data drifts away from the training data, the model's predictions tend to degrade, so deploying a model is not a one-time exercise but a continuous process.

You can manage the initiative as a waterfall project with some internal loops for the model implementation, but at the end of the project you will have to hand it over to operations, who will then have to take on continuous monitoring of the data and potential updates of the model.
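
As an illustration of the continuous monitoring of data mentioned above, here is a minimal sketch of a drift check for a single numeric feature. It uses the Population Stability Index (PSI), one common way to compare the training distribution with recent production data; it is not the only option. The function names `psi` and `needs_retraining`, the simulated data, and the threshold of 0.2 (a frequently quoted rule of thumb) are assumptions made for this example, not something prescribed by any specific tool.

```python
import math
import random


def psi(reference, current, bins=10):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature

    def shares(sample):
        counts = [0] * bins
        for x in sample:
            # Values outside the reference range are clamped into the edge bins.
            idx = min(max(int((x - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) and division by zero for empty bins.
        return [max(c / len(sample), 1e-6) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Assumed threshold: a rule of thumb, to be tuned for each use case.
DRIFT_THRESHOLD = 0.2


def needs_retraining(reference, current, threshold=DRIFT_THRESHOLD):
    """Flag the feature when its distribution has moved past the threshold."""
    return psi(reference, current) > threshold


# Simulated data (hypothetical): training sample, a stable production
# sample, and a production sample whose mean has drifted.
rng = random.Random(0)
training = [rng.gauss(0.0, 1.0) for _ in range(5000)]
stable = [rng.gauss(0.0, 1.0) for _ in range(5000)]
drifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]

print("stable: ", needs_retraining(training, stable))
print("drifted:", needs_retraining(training, drifted))
```

A check like this would run on a schedule after the hand-over, which is exactly the kind of recurring loop that a one-off waterfall delivery does not plan for.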

Reason 2: ML initiatives start with a small scope; once validated, the scope grows

Machine Learning initiatives require different parts of the organization to evolve, along with the data and the validation of the proposed solutions. To walk this path you have to “cross the river by feeling the stones”.

Usually, ML initiatives start by covering a small scope; once the solution is validated, the scope increases. This also implies that during this initial scope:

  1. The organization learns about the implications of ML.
  2. The organization starts to build capabilities and to understand the gaps it has.
  3. The organization understands the costs and can estimate what scaling ML solutions will cost.
  4. The organization can learn from mistakes, wrong decisions, etc.
  5. The organization can pivot, as the scope put in production is small and has no major impact on operations.
  6. The organization can better understand and estimate the effort needed to prepare data (a major challenge in the majority of initiatives).
  7. Operations can start to understand the implications of ML solutions, where they have to take care of infrastructure (with new requirements), applications (with a different architecture) and data (with different requirements and availability).

Every executive wants the ML snowball to grow, and they understand that starting small and then increasing the scope is the right approach for building capabilities supported by ML solutions. For this reason, agile methodologies, which recognize the existence of different loops, are the best approach.


For these two main reasons, I think that an agile methodology is more suitable for ML projects.

I think the probability of failure is high when a waterfall approach is used for a solution that requires continuous loops.
