Session 1 / 6
Tags: Machine Learning, Agriculture, Python

Crop Disease Detector

Building an AI that identifies plant diseases from a single leaf photo — the problem, the process, and the lessons learned.

Author: Muhammad Rafi Arsya
Year: 2026
Read time: ~15 min
Status: Live
Why I Built This

It started with a simple question that I could not stop thinking about.

Every year, a significant portion of agricultural harvests in developing countries is lost, not because treatments do not exist, but because plant diseases are identified too late. By the time a farmer notices that something is wrong, the damage has already spread.

"What if a farmer could know what's killing their crops just from a photo taken on their phone?"

That question would not leave me alone. And so I built the answer.

The idea is straightforward: a farmer takes a photo of a diseased leaf, uploads it, and the AI tells them exactly what's wrong — in seconds, without needing a lab, an expert, or expensive equipment.

The Scale of the Problem
20–40% of global crop yield lost to plant diseases annually
$220B estimated annual economic loss worldwide
<3s time to detect disease with this AI model

Detection speed: Traditional vs AI
Expert visit: 3–14 days
Lab test: 2–9 days
This AI: <3 sec

The Real Problem
Images: a healthy leaf and a diseased leaf.

Plant diseases are difficult to diagnose visually, especially in the early stages. The differences between a healthy leaf and a diseased one can be subtle: a slight discolouration here, a texture shift there. These patterns are too small and too inconsistent for the human eye to catch reliably across thousands of plants.

This is exactly the kind of problem that machine learning is built to solve. Not because AI is smarter than humans, but because it can process patterns at a scale and consistency that humans cannot.

The model never gets told what to look for. It just sees enough examples until it learns to see what we can't.

That is the core insight behind this project — and the reason I found it worth building.

Session 2 / 6

The Dataset

41,274 leaf photos, 16 disease classes, and what I learned from exploring the data before training anything.

PlantVillage Dataset

The dataset I used is called PlantVillage — a publicly available collection of leaf images created specifically for plant disease research. It is available for free on Kaggle.

My version of the dataset contained:

41,274 leaf images
16 disease classes
5+ plant types

Class distribution (images per class)
Tomato Bacterial Spot: 4,254
Tomato Healthy: 3,072
Tomato Late Blight: 2,941
Tomato Early Blight: 2,764
Pepper Bell Healthy: 2,322
Tomato Septoria Spot: 2,127
Potato Late Blight: 1,198
Potato Healthy: 304 (smallest class)
Class imbalance: the largest class has about 14 times as many images as the smallest, a key challenge during training.

What's Inside

The dataset covers multiple plant types including Tomato, Potato, Pepper, and others. Each class is a combination of plant and condition — for example: Tomato___Late_blight or Pepper___bell___healthy.
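
A minimal sketch of how a folder-style label like these could be split into a plant and a condition for display. The helper name `parse_label` and the "Unknown" fallback are assumptions for illustration, not the app's actual code.

```python
def parse_label(label: str) -> tuple[str, str]:
    """Split a PlantVillage-style class name into (plant, condition).

    Hypothetical helper: the last triple underscore separates the plant
    from the condition; remaining underscores become spaces.
    """
    def tidy(text: str) -> str:
        # Turn underscores into spaces and collapse repeated whitespace.
        return " ".join(text.replace("_", " ").split())

    if "___" not in label:
        # No separator found: treat the whole label as the plant name.
        return tidy(label), "Unknown"

    plant, condition = label.rsplit("___", 1)
    return tidy(plant), tidy(condition)


print(parse_label("Tomato___Late_blight"))     # ('Tomato', 'Late blight')
print(parse_label("Pepper___bell___healthy"))  # ('Pepper bell', 'healthy')
```

Splitting on the last separator keeps multi-part plant names such as "Pepper bell" together.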

One thing I noticed immediately was the class imbalance. Some classes had over 4,000 images (Tomato Bacterial Spot: 4,254) while others had as few as 304 (Potato Healthy). This kind of imbalance matters — it can cause the model to "favour" the majority classes and struggle with rarer ones.
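
The imbalance is easy to quantify before training. The sketch below uses only the eight class counts shown in the chart in this post (not all 16 classes), so treat it as a partial picture.

```python
# Image counts copied from the class-distribution chart in this post
# (only the eight classes listed there, not the full set of 16).
class_counts = {
    "Tomato Bacterial Spot": 4254,
    "Tomato Healthy": 3072,
    "Tomato Late Blight": 2941,
    "Tomato Early Blight": 2764,
    "Pepper Bell Healthy": 2322,
    "Tomato Septoria Spot": 2127,
    "Potato Late Blight": 1198,
    "Potato Healthy": 304,
}

largest = max(class_counts, key=class_counts.get)
smallest = min(class_counts, key=class_counts.get)
imbalance_ratio = class_counts[largest] / class_counts[smallest]

print(f"{largest} vs {smallest}: {imbalance_ratio:.1f}x")
```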

Contoh gambar dari dataset PlantVillage

Sample images from the PlantVillage dataset — each class represents a unique plant-disease combination.

This is why data exploration matters before you write a single line of model code. The data tells you what problems to expect.

Understanding this upfront shaped how I approached the training process — specifically around data augmentation, which I'll cover in the next section.

Data Augmentation

To make the model more robust, and to stretch the limited examples in the smaller classes, I applied data augmentation during training. This means artificially creating variations of existing images by:

Rotation (up to 20°): a leaf photographed at an angle should still be recognisable as the same disease.
Horizontal and vertical flip: disease patterns don't change based on orientation.
Zoom (up to 20%): simulates photos taken from different distances.

The goal was to force the model to learn the actual features of each disease — not just memorise the specific photos it was trained on.
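
The post does not say which API produced these augmentations, so the following is one possible sketch that assumes Keras preprocessing layers. The function names are hypothetical. One detail worth noting: `RandomRotation` takes a fraction of a full turn rather than degrees, so 20° has to be converted.

```python
MAX_ROTATION_DEGREES = 20   # from the list above
MAX_ZOOM_FRACTION = 0.20    # from the list above


def rotation_factor(max_degrees: float) -> float:
    """Convert degrees to the fraction of a full turn that Keras expects."""
    return max_degrees / 360.0


def build_augmenter():
    """Assumed pipeline: random flip, rotation and zoom as Keras layers."""
    # Imported lazily so the helper above works without TensorFlow installed.
    from tensorflow import keras

    return keras.Sequential([
        keras.layers.RandomFlip("horizontal_and_vertical"),
        keras.layers.RandomRotation(rotation_factor(MAX_ROTATION_DEGREES)),
        keras.layers.RandomZoom(MAX_ZOOM_FRACTION),
    ])


print(f"rotation factor: {rotation_factor(MAX_ROTATION_DEGREES):.4f}")
```

These layers only transform images when called with `training=True`, so validation and inference images pass through unchanged.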

Session 3 / 6

The Model

Why I didn't build from scratch — and what Transfer Learning actually means in practice.

Transfer Learning

Building a deep learning model from scratch typically requires a very large dataset and long training runs on expensive hardware. That was not an option for a second-semester project running on a laptop with no GPU.

Instead, I used Transfer Learning — one of the most powerful ideas in modern machine learning.

Transfer Learning means taking a model that has already learned to understand images — trained on millions of photos — and reusing that knowledge for a new, specific task.

Think of it like this: if you already know how to read, learning a new language is much faster than learning to read from scratch. The model already understands shapes, textures, edges, and patterns. I just needed to teach it what diseased leaves specifically look like.

MobileNetV2

The pre-trained model I chose is called MobileNetV2, a lightweight neural network architecture designed to run efficiently even on mobile devices. It was pre-trained on ImageNet: the standard pre-training subset contains roughly 1.28 million images across 1,000 categories, drawn from a full collection of about 14 million images.

My architecture looked like this:

    # Base: MobileNetV2 (frozen, pre-trained on ImageNet)
    MobileNetV2 → GlobalAveragePooling2D
            ↓
    # My custom classifier layers
    Dense(256, activation='relu')
    BatchNormalization
    Dropout(0.3)
    Dense(128, activation='relu')
    Dropout(0.2)
            ↓
    # Output: 16 disease classes
    Dense(16, activation='softmax')

The base MobileNetV2 was frozen at first — meaning its weights did not change during training. Only my custom layers on top were trained. This is called feature extraction.
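
For readers who want to see the layer stack as code, here is a sketch of how it could be written with the Keras functional API. The layer sizes come from this post. The function name `build_model` and the exact wiring are assumptions, not the project's source.

```python
IMG_SHAPE = (224, 224, 3)   # input size stated in this post
NUM_CLASSES = 16            # number of output classes stated in this post


def build_model():
    """Frozen MobileNetV2 base plus the custom classifier head (a sketch)."""
    # Imported lazily so the constants can be reused without TensorFlow.
    from tensorflow import keras
    from tensorflow.keras import layers

    base = keras.applications.MobileNetV2(
        input_shape=IMG_SHAPE, include_top=False, weights="imagenet"
    )
    base.trainable = False  # Phase 1: feature extraction only

    inputs = keras.Input(shape=IMG_SHAPE)
    # training=False keeps the base's BatchNorm statistics fixed.
    x = base(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(256, activation="relu")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.3)(x)
    x = layers.Dense(128, activation="relu")(x)
    x = layers.Dropout(0.2)(x)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
    return keras.Model(inputs, outputs)
```

Calling `build_model()` downloads the ImageNet weights the first time it runs.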

In the second phase, I unfroze the last 20 layers of MobileNetV2 for fine-tuning — allowing the model to slightly adjust its understanding of images specifically for leaf disease patterns.
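
One way to express "unfreeze the last 20 layers" is to compute a trainable flag per layer and apply it to the base model. Both helper names below are hypothetical, and the sketch assumes a Keras-style model object with a `layers` list.

```python
def trainable_flags(num_layers: int, num_unfrozen: int = 20) -> list[bool]:
    """One flag per layer: True only for the last `num_unfrozen` layers."""
    first_trainable = max(num_layers - num_unfrozen, 0)
    return [index >= first_trainable for index in range(num_layers)]


def unfreeze_top_layers(base_model, num_unfrozen: int = 20) -> None:
    """Apply the flags to a model such as the MobileNetV2 base."""
    # The model itself must be trainable before per-layer flags matter.
    base_model.trainable = True
    flags = trainable_flags(len(base_model.layers), num_unfrozen)
    for layer, flag in zip(base_model.layers, flags):
        layer.trainable = flag


print(sum(trainable_flags(100)))  # 20
```

In Keras the model needs to be compiled again after changing these flags, typically with the lower learning rate used for fine-tuning.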

Model Architecture — Visual Overview
Input image (224 × 224 × 3)
→ MobileNetV2 (pre-trained on ImageNet, frozen in Phase 1)
→ Global average pooling (1280-dim vector)
→ Dense 256, ReLU, then BatchNorm + Dropout 0.3
→ Dense 128, ReLU, then Dropout 0.2
→ Output: 16 classes, softmax
Phase 1: Feature Extraction — only top layers trained
Phase 2: Fine-tuning — last 20 MobileNetV2 layers unfrozen
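
As a rough check on how small the custom head is, its parameter count can be worked out by hand from the layer sizes above. This is plain arithmetic, assuming standard dense layers with biases and a standard batch normalisation layer.

```python
def dense_params(inputs: int, units: int) -> int:
    """Weights plus one bias per unit."""
    return inputs * units + units


def batchnorm_params(features: int) -> int:
    """Gamma, beta, moving mean and moving variance (the last two are not trained)."""
    return 4 * features


# Pooling and dropout layers have no parameters, so they are omitted.
head = {
    "Dense 256": dense_params(1280, 256),
    "BatchNorm": batchnorm_params(256),
    "Dense 128": dense_params(256, 128),
    "Output 16": dense_params(128, 16),
}
total = sum(head.values())

for name, count in head.items():
    print(f"{name:<10} {count:>8,}")
print(f"{'Total':<10} {total:>8,}")
```

The head comes to 363,920 parameters, most of them in the first dense layer.
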
Tech Stack
Python 3.11, TensorFlow 2.x, Keras, MobileNetV2, NumPy, Pillow, Matplotlib, Scikit-learn
Session 4 / 6

Training & Results

What the training graphs actually mean — and an honest account of what went wrong.

The Training Process

Training happened in two phases across 15 epochs on a standard laptop CPU, with no GPU. Each epoch took roughly 3–4 minutes, for a total training time of around 45–60 minutes.

Phase 1 trained only my custom layers on top of the frozen MobileNetV2 base. Phase 2 unfroze the last 20 layers for fine-tuning with a much lower learning rate.

The Honest Results

Training accuracy reached around 57–58%, while validation accuracy peaked at around 31% before declining.

This gap is the signature of overfitting.

Figure: training vs validation accuracy and loss over 15 epochs. Training accuracy reaches about 57% while validation accuracy stays near 31%; the diverging curves are a textbook overfitting signature.

Overfitting is when a model learns the training data too well — it memorises the specific examples it was trained on instead of learning the underlying patterns. The result: it performs well on data it has seen, and poorly on data it hasn't.

The analogy I keep coming back to: a student who memorises every past exam paper but fails when the questions change even slightly. The knowledge is there, but the understanding is not generalised.
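
Two small checks make this concrete: the size of the gap, and the epoch where validation accuracy peaked. The gap below uses the approximate figures quoted in this post. The per-epoch curve is an illustrative example, not the logged values from this project, and both function names are hypothetical.

```python
def generalisation_gap(train_acc: float, val_acc: float) -> float:
    """Difference between training and validation accuracy."""
    return train_acc - val_acc


def best_epoch(val_history: list[float]) -> int:
    """1-based epoch at which validation accuracy peaked."""
    return val_history.index(max(val_history)) + 1


# Approximate final figures quoted in this post.
gap = generalisation_gap(train_acc=0.57, val_acc=0.31)
print(f"gap: {gap:.2f}")

# Illustrative validation curve, NOT the values logged in this project.
example_val = [0.18, 0.24, 0.29, 0.31, 0.30, 0.28, 0.27]
print("validation peaked at epoch", best_epoch(example_val))
```

Stopping training at the peak epoch, or restoring the weights from it, is a common response to a curve shaped like this.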

Why Did This Happen?

Several factors contributed to the overfitting:

1. Class imbalance: some classes had 14× more images than others, causing the model to favour majority classes.
2. Limited dataset variety: my subset had only 16 classes. The full PlantVillage dataset has 38, meaning the model had less diversity to generalise from.
3. No GPU: training on CPU limited the number of epochs and batch size experimentation I could do.
I am not hiding this. Overfitting is one of the most common problems in machine learning — and documenting it honestly is more valuable than pretending it does not exist.
Session 5 / 6

The Web App

Turning a trained model into something anyone can use, from Gradio to Hugging Face Spaces.

From Model to Product

A model that only runs locally on my laptop is not useful to anyone else. The point of this project was always to build something that could actually reach the people it was designed for.

So I built a web interface using Gradio — a Python library that lets you wrap any ML model in a clean, interactive UI with minimal code.
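
To give a sense of how little code that takes, here is a minimal sketch of a Gradio wrapper. The `predict` function is a placeholder that returns a uniform guess, and the class list is a hypothetical two-class subset; a real app would load the trained model and call it here.

```python
CLASS_NAMES = ["Tomato___healthy", "Tomato___Late_blight"]  # hypothetical subset


def predict(image) -> dict[str, float]:
    """Placeholder: return equal confidence for every class.

    A real implementation would preprocess `image` and run the model.
    """
    share = 1.0 / len(CLASS_NAMES)
    return {name: share for name in CLASS_NAMES}


def build_demo():
    """Wrap `predict` in an image-upload UI with a ranked label output."""
    # Imported lazily so `predict` can be tested without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=predict,
        inputs=gr.Image(type="pil"),
        outputs=gr.Label(num_top_classes=3),
        title="Crop Disease Detector",
    )
```

Calling `build_demo().launch()` starts a local server with the upload form.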

The flow is simple:

Demo: uploading a leaf photo and getting a disease prediction in real time.

1. Upload a leaf photo: JPG or PNG, from a phone or camera.
2. Image preprocessing: the image is resized to 224×224 px and normalised to values between 0 and 1.
3. Model inference: the image passes through MobileNetV2, which extracts features, and then through the classifier, which outputs probabilities for each of the 16 disease classes.
4. Result displayed: the top prediction is shown with the plant name, condition, and confidence percentage.
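
Steps 2 and 4 can be sketched in plain Python. The function names, class names and probabilities below are hypothetical; resizing is left out and would normally be handled by an image library such as Pillow. The scaling follows the 0 to 1 range described in this post, although Keras also provides `mobilenet_v2.preprocess_input`, which scales inputs to [-1, 1] instead.

```python
def scale_pixels(pixels: list[int]) -> list[float]:
    """Map 8-bit channel values (0-255) to floats in [0, 1]."""
    return [value / 255.0 for value in pixels]


def top_predictions(probabilities, class_names, k=3):
    """Return the k most likely classes as (name, percent) pairs, best first."""
    ranked = sorted(
        zip(class_names, probabilities),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [(name, round(prob * 100, 1)) for name, prob in ranked[:k]]


# Hypothetical model output for four classes.
names = ["Tomato healthy", "Tomato Late blight", "Potato healthy", "Potato Late blight"]
probs = [0.25, 0.62, 0.05, 0.08]

print(scale_pixels([0, 128, 255]))
for name, percent in top_predictions(probs, names):
    print(f"{name}: {percent}%")
```
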
Real Examples — What the App Actually Does

Here are two real predictions I ran using the live app — one on a clearly diseased leaf, and one on a healthy one. This is exactly what a farmer (or anyone) would see after uploading a photo.

Case 1 — Diseased Leaf
Prediction: Diseased
Plant: PlantVillage
Condition: Unknown
Confidence: 70.5%
Other possibilities
Tomato — Septoria leaf spot (19.2%)
Pepper bell — Bacterial spot (5.0%)
The leaf in this photo had visible necrotic holes and yellowing — clear signs of disease. The model flagged it as diseased at 70.5% confidence. The label "Unknown" shows a known model limitation: it detects disease presence but sometimes can't pinpoint the exact type.
Case 2 — Healthy Leaf
Prediction: Healthy
Plant: Tomato
Condition: Healthy
Confidence: 68.5%
Other possibilities
PlantVillage — Unknown (30.8%)
Tomato — Late blight (0.3%)
A clean, uniformly green leaf — no spots, no discolouration. The model correctly identified it as a healthy Tomato leaf at 68.5%. The 30.8% "Unknown" alternative reflects the model's uncertainty, which is honest behaviour rather than false confidence.
Neither prediction is perfect, but together they demonstrate the core function working. A diseased leaf gets flagged. A healthy leaf gets cleared. That is the minimum viable signal a farmer needs.
Deployment

The app is deployed on Hugging Face Spaces — a free hosting platform for ML applications. It runs 24/7, accessible to anyone with a browser, without any login required.

The entire infrastructure stack:

Gradio (UI), TensorFlow (CPU), Hugging Face Spaces, GitHub (version control)
Session 6 / 6

What's Next

The lessons learned, what I would do differently, and where this project goes from here.

What I Would Fix

If I were to retrain this model today, these are the first things I would change:

1. Download the full 38-class dataset: more diversity = better generalisation. The 16-class subset was a limitation.
2. Increase Dropout to 0.5: more aggressive regularisation to combat overfitting.
3. Add class weights: to compensate for the class imbalance during training.
4. Lower the learning rate further: a smaller learning rate often leads to better generalisation in fine-tuning.
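
For the class-weights item, a common scheme is inverse-frequency ("balanced") weighting: each class gets total / (number of classes × class count). The sketch below applies it to the eight class counts shown earlier in this post, so the exact values would change once all classes are included. The function name is hypothetical.

```python
# Counts for the eight classes charted in this post (not all 16).
class_counts = {
    "Tomato Bacterial Spot": 4254,
    "Tomato Healthy": 3072,
    "Tomato Late Blight": 2941,
    "Tomato Early Blight": 2764,
    "Pepper Bell Healthy": 2322,
    "Tomato Septoria Spot": 2127,
    "Potato Late Blight": 1198,
    "Potato Healthy": 304,
}


def balanced_class_weights(counts: dict[str, int]) -> dict[str, float]:
    """weight = total / (num_classes * count), so rare classes weigh more."""
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}


weights = balanced_class_weights(class_counts)
for name, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{name:<24} {weight:.2f}")
```

Keras accepts weights like these through the `class_weight` argument of `model.fit`, keyed by integer class index rather than by name.
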
What I Learned

Beyond the technical skills, this project taught me something more important: the difference between building something that works in a notebook and building something that works in the world.

Deploying a model — dealing with dependency conflicts, Python version mismatches, file format issues, and server environments — is a different skill entirely from training one. And it matters just as much.

The best model in the world is useless if nobody can access it. Building is not finished when the training is done. Building is finished when someone else can use it.

This project is not perfect. The accuracy is lower than I want it to be. The overfitting is real. But it is live, it is functional, and it taught me more than any tutorial ever could.

Try It

The app is live. Upload a leaf photo — from your phone, from the internet, from your garden — and see what the model thinks.
