This assignment builds on your individual work with Kafka for real-time air quality prediction. Before continuing, make sure every team member has completed the individual assignment, because this project extends those concepts into a complete, production-ready system.
Unlike the individual assignment, which focused primarily on setting up Kafka and basic modeling, this team project guides you through implementing a comprehensive MLOps pipeline. Some technologies in this assignment may be new to you (MLflow, Docker, Kubernetes, Evidently), so we've structured the assignment to provide additional guidance where needed.
Your team will develop an end-to-end production-ready environmental monitoring system that transforms your individual Kafka-based models into a scalable, deployable, and monitorable solution. This project covers the complete AI system lifecycle, from model development to deployment and monitoring.
By the end of this assignment, you will:
- Advance Model Development Using MLflow: Learn to track experiments and manage models
- Containerize Applications Using Docker: Package your solution for consistent deployment
- Deploy Applications with Kubernetes: Create basic deployments for your containerized application
- Implement Real-time Data Processing: Enhance your Kafka implementation
- Set Up Model Monitoring: Learn to track model performance with Evidently
- Communicate Technical Solutions: Present your work through documentation and presentations
You'll continue working with the UCI Air Quality Dataset from your individual assignment:
- 9,358 hourly instances with measurements of various pollutants
- Features include CO, NOx, NO2, Benzene, and other measurements
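As a starting point, here is a minimal sketch of loading the dataset with pandas. It assumes the CSV layout in which the file is commonly distributed: semicolon-separated fields, decimal commas, trailing empty columns and rows, and `-200` as the missing-value marker. The function name `load_air_quality` and the sample rows are illustrative only; check the column names and format against your own copy of the file.

```python
import io

import pandas as pd


def load_air_quality(source):
    """Load the Air Quality CSV into a DataFrame indexed by timestamp.

    `source` may be a file path or a file-like object.
    Assumed layout: ';' separator, ',' decimal mark, -200 marks missing values.
    """
    df = pd.read_csv(source, sep=";", decimal=",")
    # Drop the empty trailing columns and rows produced by stray separators.
    df = df.dropna(axis=1, how="all").dropna(axis=0, how="all")
    # Treat the -200 sentinel as a missing measurement.
    df = df.replace(-200, float("nan"))
    # Combine the separate Date and Time columns into one timestamp index.
    df["timestamp"] = pd.to_datetime(
        df["Date"] + " " + df["Time"], format="%d/%m/%Y %H.%M.%S"
    )
    return df.drop(columns=["Date", "Time"]).set_index("timestamp").sort_index()


# Made-up sample rows in the assumed layout (a subset of columns, not real data).
sample = (
    "Date;Time;CO(GT);C6H6(GT);NOx(GT);NO2(GT);;\n"
    "10/03/2004;18.00.00;2,6;11,9;166;113;;\n"
    "10/03/2004;19.00.00;-200;9,4;103;92;;\n"
    ";;;;;;;\n"
)
df = load_air_quality(io.StringIO(sample))
print(df)
```

With the real file, pass its path instead of the in-memory sample, for example `load_air_quality("AirQualityUCI.csv")`. Converting the sentinel to `NaN` up front keeps missing readings from being mistaken for real measurements in later modeling and monitoring steps.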