Sunday, February 19, 2023

 

MLOps using Airflow, MLflow and Kafka

Apache Kafka is a distributed messaging platform that allows you to sequentially log streaming data into topic-specific feeds, which other applications in turn can tap into.
Apache Airflow is a task scheduling platform that allows you to create, orchestrate and monitor data workflows.
MLflow is an open-source tool that enables you to keep track of your ML experiments, among other things by logging the parameters, results, models and data of each trial.

This hypothetical example requires the following containers:

a container which has Airflow and your typical data science toolkit installed (in our case Pandas, NumPy and Keras) in order to create and update the model, whilst also scheduling such tasks
a PostgreSQL container which serves as Airflow’s underlying metadata database
a Kafka container, which handles streaming data
a ZooKeeper container, which among other things is responsible for keeping track of Kafka topics, partitions and the like
an MLflow container, which keeps track of the results of the update runs and the characteristics of the resulting models




A typical folder structure could look like this:
project_folder
├── dags
│   └── src
│       ├── data
│       ├── models
│       └── preprocessing
├── data
│   ├── to_use_for_training
│   └── used_for_training
├── models
│   ├── current_model
│   └── archive
├── airflow_docker
├── mlflow_docker
└── docker_compose.yml
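If you want to bootstrap this layout quickly, a minimal sketch along the following lines could create the empty folders. The function name create_skeleton and the PROJECT_DIRS list are illustrative assumptions; they simply mirror the tree above and are not part of the referenced article.

```python
from pathlib import Path

# Folder names copied from the tree above; the list itself is an assumption
# made for illustration, not something the original project ships.
PROJECT_DIRS = [
    "dags/src/data",
    "dags/src/models",
    "dags/src/preprocessing",
    "data/to_use_for_training",
    "data/used_for_training",
    "models/current_model",
    "models/archive",
    "airflow_docker",
    "mlflow_docker",
]


def create_skeleton(root):
    """Create the project folders under `root` and return the created paths."""
    root = Path(root)
    created = []
    for relative in PROJECT_DIRS:
        path = root / relative
        # parents=True builds intermediate folders, exist_ok=True makes reruns safe
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling create_skeleton("project_folder") once is enough; running it again leaves existing folders untouched. The docker_compose.yml file is not created here, since its contents depend on your setup.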


This example uses the MNIST data set. One of the tasks in the Airflow DAG fetches the data, splits it into a train, a test and a streaming set, and puts them in the right format for training the CNN. The streaming set simulates the dynamic data that comes in after the initial model has been put into action.
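The three-way split can be sketched in plain Python as follows. This is only an illustration of the idea: the function name split_dataset, the default fractions and the fixed seed are assumptions, and the referenced article may split the data differently.

```python
import random


def split_dataset(samples, train_frac=0.6, test_frac=0.2, seed=42):
    """Shuffle `samples` and split them into train, test and streaming sets.

    Whatever is left after the train and test shares becomes the streaming
    set, which stands in for data arriving after the first model is live.
    The default fractions are assumptions chosen for illustration.
    """
    if train_frac <= 0 or test_frac <= 0 or train_frac + test_frac >= 1:
        raise ValueError("fractions must be positive and sum to less than 1")

    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # seeded, so the split is reproducible

    n_train = int(len(indices) * train_frac)
    n_test = int(len(indices) * test_frac)

    train = [samples[i] for i in indices[:n_train]]
    test = [samples[i] for i in indices[n_train:n_train + n_test]]
    stream = [samples[i] for i in indices[n_train + n_test:]]
    return train, test, stream
```

In the actual pipeline the samples would be (image, label) pairs from MNIST, and the streaming set would later be published to a Kafka topic rather than kept in memory.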

Construct & fit the model - Task 2, among other things, fetches the train and test sets from the previous step.

It then constructs and fits the CNN and stores it in the current_model folder.
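A rough sketch of what such a task could look like with Keras is shown below. The architecture, the hyperparameters, the file name model.h5 and the helper names are all assumptions made for illustration; the referenced article defines its own network. TensorFlow is imported inside the functions so the module can be loaded by the scheduler even where TensorFlow is not installed.

```python
from pathlib import Path

# Assumed location, taken from the folder structure in this post.
CURRENT_MODEL_DIR = Path("models") / "current_model"


def model_file_path(model_dir=CURRENT_MODEL_DIR):
    """Return the file the fitted model is written to (file name is an assumption)."""
    return Path(model_dir) / "model.h5"


def build_cnn(input_shape=(28, 28, 1), num_classes=10):
    """Build a small CNN for 28x28 grayscale digits (illustrative architecture)."""
    from tensorflow import keras  # lazy import, see the note above

    model = keras.Sequential([
        keras.Input(shape=input_shape),
        keras.layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        keras.layers.MaxPooling2D(pool_size=(2, 2)),
        keras.layers.Flatten(),
        keras.layers.Dense(num_classes, activation="softmax"),
    ])
    # Integer class labels are assumed, hence the sparse loss
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


def fit_and_store(x_train, y_train, x_test, y_test,
                  model_dir=CURRENT_MODEL_DIR, epochs=3):
    """Fit the CNN on the train set, validate on the test set, save the result."""
    model = build_cnn()
    model.fit(x_train, y_train, epochs=epochs, batch_size=128,
              validation_data=(x_test, y_test))

    target = model_file_path(model_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    model.save(str(target))
    return target
```

In an Airflow DAG, fit_and_store would typically be wrapped in a Python task that runs after the data preparation task; the arrays are expected to have shape (n, 28, 28, 1) with pixel values already scaled.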






References 
https://www.vantage-ai.com/en/blog/keeping-your-ml-model-in-shape-with-kafka-airflow-and-mlflow

