House Rent Predictor — Dockerized App with Health Checkpoints and Docker Hub Deployment
Introduction
The House Rent Predictor project showcases the complete process of designing, containerizing, and deploying a machine learning application using modern cloud technologies. The primary goal of the system is to predict monthly house rent based on several user-defined features such as BHK, area (in square feet), number of bathrooms, city, furnishing status, tenant preference, locality, and floor number. The predictive model is a regression-based machine learning model that was pre-trained and integrated into a user-friendly web interface built using Streamlit.
To ensure reproducibility and platform independence, the entire application—along with its dependencies and configurations—was containerized using Docker. The solution consists of two main containers: one hosting the Streamlit frontend and prediction logic, and another managing the MySQL database that stores user inputs and predicted results. These containers are orchestrated through Docker Compose, ensuring smooth communication, health checks, and persistent data management using Docker volumes.
This documentation comprehensively covers the entire lifecycle of the project, organized into three logical parts (Part 1, Part 2, and Part 3). It includes detailed architecture descriptions, procedural steps with original commands and corresponding screenshots, an explanation of container configurations and health checks, and the final deployment to Docker Hub for public access.
Through this project, the integration of machine learning and cloud computing is effectively demonstrated — from model creation and application development to containerization, testing, and deployment. The resulting architecture provides a scalable, modular, and easily reproducible framework that can serve as a reference for deploying similar data-driven applications in cloud environments.
Objectives of Part 1
- Build the application image for the Streamlit-based rent predictor.
- Package the trained model file (trainedmodel.sav) and dataset (House_Rent_Dataset.csv) into the image (large files excluded with .dockerignore as needed).
- Ensure a reproducible local development image that runs the Streamlit UI at http://localhost:8501.
- Verify model inference inside the container.
Objectives of Part 2
- Split the project into two services: house_rent_app (Streamlit app) and house_rent_db (MySQL database).
- Add Docker health checks for both app (HTTP probe) and DB (mysqladmin ping) and use depends_on to wait for DB health.
- Ensure persistent storage for MySQL using a named Docker volume.
- Implement DB initialization (table creation) and logging of predictions into MySQL.
- Validate connectivity and data insertion from the app to the DB.
Objectives of Part 3
- Tag and push the final application image to Docker Hub (shebin21/house_rent_app:latest).
- Prepare reproducible deployment instructions and a docker-compose.yml that uses the published image or local build.
- Produce this documentation and architecture diagrams for submission.
- Provide GitHub/DockerHub links to the final artifacts.
Name of the containers involved and the download links
App image (My project)
- shebin21/house_rent_app:latest – pushed to Docker Hub.
- Pull: docker pull shebin21/house_rent_app:latest
- Docker Hub repo: https://hub.docker.com/r/shebin21/house_rent_app
Database (official image used)
- mysql:8.0 – official MySQL image used for backend DB.
- Docker Hub: https://hub.docker.com/_/mysql
Base images used during build (references)
- python:3.11-slim – base for Streamlit image. Docker Hub: https://hub.docker.com/_/python
- (If used for training/testing locally) ghcr.io/ other standard images as required.
Name of the other software involved along with the purpose
Development and Orchestration Tools
| Tool | Purpose | Version |
| Python | Core programming language used to build the Streamlit application and handle machine learning logic. | 3.11+ |
| Docker Desktop | Primary containerization platform used to build, run, and manage the application and database containers. | N/A |
| Docker Compose | Orchestration tool that defines and runs multi-container applications (Streamlit + MySQL) using a single YAML configuration. | 3.9 (Compose file format) |
| Visual Studio Code | Integrated Development Environment (IDE) used for writing Python code, editing configuration files, and testing locally. | N/A |
| Git & Docker Hub | Version control and cloud image repository for storing and distributing Docker images publicly. | Latest |
Application Frameworks and Libraries
| Library / Framework | Role | Version |
| Streamlit | Frontend framework for building and hosting interactive web interfaces for the machine learning model. | 1.40+ |
| NumPy | Provides numerical computation support and array handling for preprocessing data. | Latest |
| Pandas | Used for data manipulation, cleaning, and dataset operations. | Latest |
| scikit-learn | Machine learning toolkit used for training and implementing the regression model. | Latest |
| XGBoost | Gradient boosting framework providing high-performance prediction capabilities. | Latest |
| pickle-mixin | Enables loading serialized ML models from .sav or .pkl files. | Latest |
| python-dotenv | Loads and manages environment variables securely from the .env file into the application environment. | Latest |
Database and Storage
| System / Tool | Purpose | Version |
| MySQL | Relational database system used to store user inputs and prediction logs. | 8.0+ |
| Docker Volume (db_data) | Persistent storage for database files ensuring data is retained across container restarts. | N/A |
| MySQL Connector for Python | Python client library that enables connection and interaction between the Streamlit app and MySQL database. | Latest |
Infrastructure and Deployment
| Component / Tool | Purpose | Version |
| Dockerfile | Defines instructions for building the custom Streamlit application image. | N/A |
| docker-compose.yml | Orchestrates frontend (app) and backend (DB) containers, defining ports, environment variables, and health checks. | N/A |
| .env File | Stores sensitive configuration details such as database user, password, and host information securely. | N/A |
| Docker Hub Repository | Hosts the final image for public access and deployment. | Latest |
Supporting and Utility Tools
| Tool | Purpose | Version |
| CMD / PowerShell / Terminal | Used to execute Docker build, run, and push commands during development and deployment. | N/A |
| pip (Python Package Installer) | Installs all required dependencies listed in requirements.txt. | Latest |
| Browser (Chrome/Edge) | Used for testing and accessing the Streamlit web interface running on port 8501. | Latest |
Overall architecture of the project
Architecture summary:
- User (Browser) → accesses the Streamlit app at http://localhost:8501.
- Streamlit App Container (house_rent_app):
  - Loads trainedmodel.sav and dataset to perform inference.
  - Encodes categorical inputs and computes predicted rent.
  - Inserts prediction record into MySQL.
- MySQL Container (house_rent_db):
  - Stores predictions table.
  - Persistent data stored in Docker named volume db_data.
- Healthchecks:
  - App: HTTP probe to http://localhost:8501/ (or Streamlit health endpoint).
  - DB: mysqladmin ping with appuser credentials.
- Docker Hub: final app image published for reuse.
- Architecture Image
The House Rent Prediction System follows a multi-layer containerized architecture designed for modularity, scalability, and persistence. The system integrates a web-based machine learning application with a relational database, all orchestrated using Docker and Docker Compose for a fully automated deployment environment.
At the heart of the architecture is a Streamlit application container, which serves as the primary interface for users. The container exposes port 8501, allowing users to access the prediction interface through any web browser. Users input property details such as BHK, Size (sq. ft), City, Number of Bathrooms, Furnishing Status, Tenant Type, and Point of Contact.
Once the input is submitted, the application performs preprocessing — converting categorical attributes into numerical codes, replacing missing locality values with pre-computed averages, and preparing a feature vector for model inference. The trained machine learning model (trainedmodel.sav), built using Scikit-learn and XGBoost, predicts the log-transformed rent value. This prediction is then exponentiated to produce the final monthly rent estimate, which is displayed instantly to the user via the Streamlit interface.
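The preprocessing and inverse-log step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the category codes, the feature order, and the `StubModel` class are all assumptions made so the example runs without the real trainedmodel.sav.

```python
import math

# Hypothetical category codes; the real app's encoding is not shown in this write-up.
CITY_CODES = {"Bangalore": 0, "Chennai": 1, "Delhi": 2, "Hyderabad": 3, "Kolkata": 4, "Mumbai": 5}
FURNISHING_CODES = {"Unfurnished": 0, "Semi-Furnished": 1, "Furnished": 2}


def build_features(bhk, size_sqft, bathrooms, city, furnishing):
    """Turn raw form inputs into a numeric feature vector (assumed column order)."""
    return [bhk, size_sqft, bathrooms, CITY_CODES[city], FURNISHING_CODES[furnishing]]


def predict_rent(model, features):
    """Ask the model for log(rent), then exponentiate to recover the rent."""
    log_rent = model.predict([features])[0]
    return math.exp(log_rent)


class StubModel:
    """Stand-in for the unpickled trainedmodel.sav so the sketch is self-contained."""

    def predict(self, rows):
        # Pretend log(rent) depends only on size; this is not a fitted model.
        return [math.log(20.0 * row[1]) for row in rows]


features = build_features(2, 1000, 2, "Chennai", "Semi-Furnished")
rent = predict_rent(StubModel(), features)
print(round(rent))
```

In the real application the stub would presumably be replaced by the object loaded from trainedmodel.sav, which exposes the same `predict` method in scikit-learn and XGBoost's scikit-learn wrapper.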
Supporting the application is a MySQL 8.0 database container, which functions as the persistent data storage layer. This container maintains a structured table named predictions, where all user inputs, computed rent values, and timestamps are securely stored. The system uses a Docker volume (db_data) to persist database contents even when containers are stopped or rebuilt.
The application addresses the database by its Compose service name (db). This eliminates manual configuration and ensures smooth communication between containers. To ensure system reliability and smooth startup, the architecture employs Docker Compose for service orchestration. Each container includes a health check mechanism: the database runs a mysqladmin ping command to verify readiness, while the application runs an HTTP check to confirm that the web service is accessible on port 8501.
Docker Compose uses the depends_on condition with service_healthy to manage startup sequencing — ensuring the frontend waits until the backend database is fully initialized. Once both containers are healthy, the system becomes operational without any manual intervention.
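For readers who want to see what the application's HTTP health check amounts to, here is a small Python sketch of the same idea. It is an assumption-laden equivalent of the `curl --fail` probe used in the project, not the healthcheck itself; the function name `is_healthy` is hypothetical.

```python
import urllib.error
import urllib.request


def is_healthy(url, timeout=3.0):
    """Return True if the URL answers with a 2xx status, roughly like `curl --fail`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP 4xx/5xx responses
        # (urllib raises HTTPError, a URLError subclass, for those).
        return False
```

A container healthcheck only needs an exit code, so a wrapper script could call `is_healthy("http://localhost:8501/")` and exit with 0 or 1 accordingly.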
Security and maintainability are achieved by separating configuration details into a .env file, keeping sensitive data such as credentials outside the source code. Unnecessary build files and cache directories are excluded using a .dockerignore file, resulting in leaner image builds.
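As an illustration of keeping credentials out of the source code, the sketch below reads database settings from environment variables. The variable names (`DB_HOST`, `DB_USER`, and so on) and the default database name are assumptions, since the project's .env file is not reproduced here. In the actual app, `load_dotenv()` from python-dotenv would typically be called first to populate the environment from .env; it is left out so the example needs only the standard library.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DbSettings:
    host: str
    port: int
    user: str
    password: str
    database: str


def load_db_settings(env=os.environ):
    """Read DB settings from a mapping of environment variables (names are assumed)."""
    return DbSettings(
        host=env.get("DB_HOST", "db"),
        port=int(env.get("DB_PORT", "3306")),
        user=env["DB_USER"],          # required: raises KeyError if missing
        password=env["DB_PASSWORD"],  # required: no hard-coded fallback
        database=env.get("DB_NAME", "house_rent"),
    )


# Example with an explicit mapping instead of the real process environment.
settings = load_db_settings({"DB_USER": "appuser", "DB_PASSWORD": "change-me"})
print(settings.host, settings.port, settings.database)
```

Failing fast on a missing user or password makes a misconfigured container stop at startup rather than at the first database write.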
The application container is designed to remain stateless, meaning no critical data is stored inside it, while the database container manages persistent state through the attached volume. This separation ensures the system is both portable and fault-tolerant.
For deployment, the final Streamlit application image was published on Docker Hub under the repository shebin21/house_rent_app:latest. This enables anyone to reproduce the environment by pulling the image and running it directly on their system using:
- docker pull shebin21/house_rent_app:latest
- docker run -p 8501:8501 shebin21/house_rent_app
This approach ensures that the system remains consistent across all environments — from development to deployment — eliminating compatibility issues.
The overall system architecture, as depicted in the diagram, is organized into distinct layers:
-
User Interaction Layer: Handles user input and result display through the browser interface.
-
Frontend / Application Layer: Processes input data, executes the ML model, and communicates with the database.
-
Backend / Database Layer: Manages data storage, ensuring persistent record-keeping of all predictions.
-
Infrastructure Layer: Defines the network, service dependencies, and health checks using Docker Compose.
-
Deployment Layer: Hosts the final image on Docker Hub for easy reuse and collaboration.
This layered design provides a clear separation of responsibilities, ensuring the application is modular, reproducible, and highly portable.
It allows smooth transitions between development, testing, and deployment, while maintaining system reliability and data consistency across container restarts or environment changes.
Procedure — Part 1: Build Basic Containers and Images
Your image (e.g. house_rent_app:latest) should appear in the list.
Step 6 – Start Containers
Step 7 – Running Containers
Step 8 – Streamlit Web Application
Step 9 – Database Validation
Procedure — Part 2 (Add health checkpoints & separate services)
init.sql, mount it at /docker-entrypoint-initdb.d/init.sql to create predictions table automatically.
Example init.sql
CREATE TABLE IF NOT EXISTS predictions (
latest appears (size, digest, last pushed)
How to Run the Project (Deployment Guide)
Step 1: Pull the Image
Open your terminal or PowerShell and run the following command to pull the pre-built image from Docker Hub:
Command- docker pull shebin21/house_rent_app:latest
Step 2: Run the Container
After pulling the image, run the container using:
Command- docker run -p 8501:8501 shebin21/house_rent_app
This maps the application’s internal port (8501) to your local machine’s port 8501.
Step 3: Access the Application
Once the container starts successfully, open your browser and go to: http://localhost:8501
You will see the House Rent Predictor web interface. Enter property details such as BHK, size (in sqft), city, and furnishing status to get the predicted rent instantly.
Step 4: Using Docker Compose (Optional)
If you have cloned the complete repository containing both the Streamlit App and MySQL Database, navigate to the project directory and run: docker-compose up
To stop running containers: docker stop <container_id>
To remove stopped containers: docker rm <container_id>
Modifications done in the containers after building
The modifications made to the images/containers after the initial builds include:
- Added Healthchecks
  - App: HTTP probe with curl --fail http://localhost:8501/ in the healthcheck.
  - DB: mysqladmin ping -h localhost -u appuser -pApp@12345.
- Environment variable support
  - Externalized DB credentials into .env and read them through python-dotenv inside app.py.
- DB schema initialization
  - Either added init.sql in the /docker-entrypoint-initdb.d/ mount to auto-create the predictions table, or created the table manually from the DB shell.
- Persistence
  - Mounted named volume db_data:/var/lib/mysql to persist database files on host.
- Exception handling & logging
  - App catches DB exceptions and logs them; ensured warnings.filterwarnings('ignore') was set and connections committed correctly.
- Pushed image to Docker Hub
  - Tagged with shebin21/house_rent_app:latest and pushed so the image is reusable.
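The exception handling and commit behaviour listed above might look something like the following sketch. The column names in `INSERT_SQL` and the `log_prediction` helper are hypothetical, because the full predictions schema and app.py are not shown in this write-up. The function accepts any DB-API style connection, so with MySQL Connector for Python it would be passed the object returned by `mysql.connector.connect(...)`.

```python
import logging

logger = logging.getLogger("house_rent_app")

# Hypothetical column names; adjust to the real predictions table.
INSERT_SQL = (
    "INSERT INTO predictions "
    "(bhk, size_sqft, bathrooms, city, furnishing, predicted_rent) "
    "VALUES (%s, %s, %s, %s, %s, %s)"
)


def log_prediction(conn, record):
    """Insert one prediction row; return True on success, False on a DB error."""
    params = (
        record["bhk"], record["size_sqft"], record["bathrooms"],
        record["city"], record["furnishing"], record["predicted_rent"],
    )
    cursor = None
    try:
        cursor = conn.cursor()
        cursor.execute(INSERT_SQL, params)  # parameterized, so inputs are not spliced into SQL
        conn.commit()                       # without commit the row is not persisted
        return True
    except Exception:
        # A real app would likely catch the driver's own error class instead.
        logger.exception("Could not log prediction to MySQL")
        return False
    finally:
        if cursor is not None:
            cursor.close()
```

Returning a boolean instead of re-raising lets the UI still show the predicted rent when the database is temporarily unavailable, which appears to be the intent of catching and logging DB exceptions.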
DockerHub link of your modified containers
- Docker Hub (App image): https://hub.docker.com/r/shebin21/house_rent_app
- Pull command: docker pull shebin21/house_rent_app:latest
What are the outcomes of your DA?
- A reproducible Docker image (shebin21/house_rent_app) capable of running the rent predictor UI anywhere with Docker.
- A multi-service Docker Compose setup that orchestrates the Streamlit app and MySQL DB with health checks and persistent storage.
- An automated DB initialization workflow for schema creation (predictions table).
- Successful upload of the app image to Docker Hub enabling sharing and redeployment.
- Verified end-to-end flow: user inputs → model inference → DB logging → verification via SQL queries.
Conclusion
The House Rent Predictor project successfully demonstrates the complete lifecycle of developing and deploying a machine learning application using cloud-native principles. By containerizing both the application and database components, the system achieves a clean separation between the stateless application layer and the stateful data persistence layer, aligning with modern best practices in distributed application design.
The project highlights how Docker-based containerization simplifies deployment, improves reproducibility, and eliminates environment-related issues that typically occur during model deployment. Through the use of Docker Compose, the application ensures robust communication between containers, automatic service orchestration, and efficient resource utilization. The inclusion of health checks, persistent volumes, and environment variables further enhances the project’s reliability, security, and maintainability, making it production-ready and adaptable to real-world use cases.
Additionally, publishing the application image to Docker Hub enables seamless accessibility — allowing any user to pull, run, and interact with the model in just a few commands, without manual setup or dependency management. This cloud-centric design not only demonstrates strong technical implementation but also emphasizes the importance of portability and scalability in modern software systems.
Overall, this project provides a comprehensive understanding of how machine learning models can be operationalized using container technologies. The three-phase process — building the core application, stabilizing it through structured container orchestration, and finally publishing it for global reproducibility — serves as a valuable template for deploying ML-powered web applications in real-world environments. The House Rent Predictor stands as an example of applying cloud computing concepts to achieve a fully functional, efficient, and easily deployable intelligent system.
References
- Official Docker Documentation and Docker Hub: https://docs.docker.com/
- MySQL Official Documentation: https://dev.mysql.com/doc/
- Streamlit Documentation: https://docs.streamlit.io/
Acknowledgement
I would like to express my sincere gratitude to Dr. T. Subbulakshmi for her valuable guidance, continuous support, and clear instructions throughout the completion of this Digital Assignment. I would also like to extend my appreciation to VIT SCOPE for offering the Cloud Computing course during the present semester, which provided the foundational knowledge and motivation to work on this project. Additionally, I acknowledge the use of official resources such as Docker documentation and Docker Hub, MySQL official documentation, and the Streamlit documentation, which were instrumental in understanding and implementing the various components of this assignment.
Appendix: Useful commands
Build & run locally
docker build -t cloud-house_rent_app .
docker run -p 8501:8501 --env-file .env cloud-house_rent_app
Compose (build & start)
DB access
Tag & push
