An end-to-end AI pipeline for brain MRI/fMRI analysis using the Kaggle API, deep learning (CNN, ResNet50, VGG16), and explainable AI (Grad-CAM) to detect neurological disorders such as brain tumors.
This project develops a Kaggle API-based AI pipeline to:
- Access and download fMRI/MRI brain scan datasets via the Kaggle API
- Preprocess neuroimaging data (normalization, denoising, segmentation)
- Perform Exploratory Data Analysis (EDA) with heatmaps and visualizations
- Train CNN models from scratch and via transfer learning (ResNet50, VGG16)
- Evaluate models using Accuracy, AUC-ROC, Precision, Recall, F1-Score
- Interpret model decisions using Grad-CAM explainability
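The preprocessing step listed above can be illustrated with a minimal sketch. It assumes a grayscale scan already loaded as a 2-D NumPy array. The function names, the nearest-neighbour resize, the 3x3 mean filter, and the min-max scaling are illustrative choices, not necessarily what this project's notebook uses. The 128x128 target size is taken from the model's first layer output shape.

```python
import numpy as np

def resize_nearest(img, size=(128, 128)):
    """Nearest-neighbour resize of a 2-D array to `size` = (rows, cols)."""
    rows = np.arange(size[0]) * img.shape[0] // size[0]
    cols = np.arange(size[1]) * img.shape[1] // size[1]
    return img[rows[:, None], cols[None, :]]

def denoise_mean3(img):
    """3x3 mean filter with edge padding (a simple stand-in for denoising)."""
    h, w = img.shape
    padded = np.pad(img.astype(np.float32), 1, mode="edge")
    out = np.zeros((h, w), dtype=np.float32)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w]
    return out / 9.0

def normalize_minmax(img, eps=1e-8):
    """Scale intensities to the range [0, 1]."""
    img = img.astype(np.float32)
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo + eps)

def preprocess(img, size=(128, 128)):
    """Resize, then denoise, then normalize (hypothetical helper)."""
    return normalize_minmax(denoise_mean3(resize_nearest(img, size)))

# Synthetic stand-in for a scan; real images would be read from the dataset.
rng = np.random.default_rng(0)
scan = rng.integers(0, 256, size=(240, 200)).astype(np.uint8)
x = preprocess(scan)
print(x.shape)
```

For RGB input, the same functions could be applied per channel before stacking the result.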
This project uses the Brain MRI Images for Brain Tumor Detection dataset from Kaggle.
| Property | Details |
|---|---|
| Source | Kaggle — navoneel/brain-mri-images-for-brain-tumor-detection |
| Size | 253 brain MRI images |
| Classes | Tumor (Yes) / No Tumor (No) |
| Format | JPEG images |
| License | Public / Open Access |
The dataset is not included in this repository; download it with the Kaggle API.
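A minimal sketch of fetching the dataset with the official `kaggle` Python package is shown here. The target directory `data/brain_mri` and the helper names are assumptions for illustration. The `kaggle` import is placed inside the function because, in commonly used versions, importing the package attempts authentication and fails when no credentials are configured.

```python
from pathlib import Path

DATASET_SLUG = "navoneel/brain-mri-images-for-brain-tumor-detection"
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

def download_dataset(target_dir="data/brain_mri"):
    """Download and unzip the dataset; requires valid Kaggle credentials."""
    # Lazy import so that this module can be loaded without credentials.
    from kaggle.api.kaggle_api_extended import KaggleApi

    target = Path(target_dir)  # hypothetical location, adjust as needed
    target.mkdir(parents=True, exist_ok=True)
    api = KaggleApi()
    api.authenticate()
    api.dataset_download_files(DATASET_SLUG, path=str(target), unzip=True)
    return target

def count_images(root):
    """Count image files in each immediate subfolder of `root`."""
    counts = {}
    for folder in sorted(Path(root).iterdir()):
        if folder.is_dir():
            counts[folder.name] = sum(
                1 for f in folder.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts
```

Calling `count_images(download_dataset())` would give a quick check that the per-class folders add up to the 253 images listed in the table, assuming the archive unpacks into one folder per class.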
TensorFlow version: 2.19.0
GPU available: True
Model: "sequential"

| Layer (type) | Output Shape | Param # |
|---|---|---|
| conv2d (Conv2D) | (None, 128, 128, 32) | 896 |
| batch_normalization (BatchNormalization) | (None, 128, 128, 32) | 128 |
| conv2d_1 (Conv2D) | (None, 128, 128, 32) | 9,248 |
| max_pooling2d (MaxPooling2D) | (None, 64, 64, 32) | 0 |
| dropout (Dropout) | (None, 64, 64, 32) | 0 |
| conv2d_2 (Conv2D) | (None, 64, 64, 64) | 18,496 |
| batch_normalization_1 (BatchNormalization) | (None, 64, 64, 64) | 256 |
| conv2d_3 (Conv2D) | (None, 64, 64, 64) | 36,928 |
| max_pooling2d_1 (MaxPooling2D) | (None, 32, 32, 64) | 0 |
| dropout_1 (Dropout) | (None, 32, 32, 64) | 0 |
| conv2d_4 (Conv2D) | (None, 32, 32, 128) | 73,856 |
| batch_normalization_2 (BatchNormalization) | (None, 32, 32, 128) | 512 |
| conv2d_5 (Conv2D) | (None, 32, 32, 128) | 147,584 |
| max_pooling2d_2 (MaxPooling2D) | (None, 16, 16, 128) | 0 |
| dropout_2 (Dropout) | (None, 16, 16, 128) | 0 |
| global_average_pooling2d (GlobalAveragePooling2D) | (None, 128) | 0 |
| dense (Dense) | (None, 256) | 33,024 |
| batch_normalization_3 (BatchNormalization) | (None, 256) | 1,024 |
| dropout_3 (Dropout) | (None, 256) | 0 |
| dense_1 (Dense) | (None, 1) | 257 |

Total params: 322,209 (1.23 MB)
Trainable params: 321,249 (1.23 MB)
Non-trainable params: 960 (3.75 KB)
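The parameter counts in the summary can be reproduced by hand, which is a useful sanity check on the architecture. The sketch below assumes 3x3 convolution kernels and a 3-channel input; both are inferred from the counts (for example, 896 = 3·3·3·32 + 32) rather than stated in the summary, and the helper names are illustrative.

```python
def conv2d_params(kernel, in_ch, out_ch):
    """Weights (kernel x kernel x in_ch x out_ch) plus one bias per filter."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(in_units, out_units):
    """Weight matrix plus one bias per output unit."""
    return in_units * out_units + out_units

def batchnorm_params(channels):
    """gamma and beta (trainable) plus moving mean and variance (non-trainable)."""
    return 4 * channels

# (input channels, output channels) of the six Conv2D layers, in order
conv_layers = [(3, 32), (32, 32), (32, 64), (64, 64), (64, 128), (128, 128)]
# channel/unit counts seen by the four BatchNormalization layers
bn_channels = [32, 64, 128, 256]

total = sum(conv2d_params(3, i, o) for i, o in conv_layers)
total += sum(batchnorm_params(c) for c in bn_channels)
total += dense_params(128, 256) + dense_params(256, 1)

non_trainable = sum(2 * c for c in bn_channels)  # moving statistics only
trainable = total - non_trainable

print(total, trainable, non_trainable)  # 322209 321249 960
```

The pooling, dropout, and global-average-pooling layers contribute no parameters, which is why they do not appear in the calculation.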
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.svm import SVC

X_train_flat = X_train.reshape(X_train.shape[0], -1)
X_test_flat = X_test.reshape(X_test.shape[0], -1)

pca_ml = PCA(n_components=100, random_state=RANDOM_SEED)
X_train_pca = pca_ml.fit_transform(X_train_flat)
X_test_pca = pca_ml.transform(X_test_flat)

print(" Training SVM...")
svm_model = SVC(kernel='rbf', C=10, gamma='scale', probability=True, random_state=RANDOM_SEED)
svm_model.fit(X_train_pca, y_train)
svm_pred = svm_model.predict(X_test_pca)
svm_acc = accuracy_score(y_test, svm_pred)
print(f" SVM Test Accuracy: {svm_acc:.4f}")

print("\n Training Random Forest...")
rf_model = RandomForestClassifier(n_estimators=200, max_depth=15, random_state=RANDOM_SEED, n_jobs=-1)
rf_model.fit(X_train_pca, y_train)
rf_pred = rf_model.predict(X_test_pca)
rf_acc = accuracy_score(y_test, rf_pred)
print(f" Random Forest Test Accuracy: {rf_acc:.4f}")

print("\n=== SVM Classification Report ===")
print(classification_report(y_test, svm_pred, target_names=['No Tumor', 'Tumor']))

print("\n=== Random Forest Classification Report ===")
print(classification_report(y_test, rf_pred, target_names=['No Tumor', 'Tumor']))
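The SVM above is trained with `probability=True`, so class probabilities are available and AUC-ROC could be reported alongside accuracy. The sketch below shows what that metric measures, using a small pure-Python implementation; `roc_auc`, `y_true`, and `scores` are hypothetical names with toy values. In the project, the inputs would presumably be `y_test` and `svm_model.predict_proba(X_test_pca)[:, 1]`, and `sklearn.metrics.roc_auc_score` computes the same quantity.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive sample is scored
    higher than a random negative sample (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy labels and scores, not results from this project.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, scores))  # 0.75
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a test split drawn from 253 images but would be replaced by a rank-based computation on larger data.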
- Python 3.8 or higher
- Kaggle account with an API key (a `kaggle.json` token created from your Kaggle account settings)
- GPU recommended (Google Colab works well)
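The Kaggle client looks for the API token at `~/.kaggle/kaggle.json` and warns when the file is readable by other users. A minimal sketch of installing the token from Python follows; `install_kaggle_token` is a hypothetical helper and the credentials are placeholders. The demonstration writes to a temporary directory so that no real configuration is touched.

```python
import json
import os
import stat
import tempfile
from pathlib import Path

def install_kaggle_token(username, key, config_dir=None):
    """Write kaggle.json into config_dir (default: ~/.kaggle), owner-only access."""
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)
    token_path = config_dir / "kaggle.json"
    token_path.write_text(json.dumps({"username": username, "key": key}))
    # Restrict permissions to read/write for the owner (0o600 on POSIX systems).
    os.chmod(token_path, stat.S_IRUSR | stat.S_IWUSR)
    return token_path

# Demonstration with placeholder credentials in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    path = install_kaggle_token("your-username", "your-api-key", config_dir=tmp)
    print(path.name)
```

As an alternative, the client also reads the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, which can be more convenient on Google Colab.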
git clone https://github.com/astro-keerthana/fmri-brain-scan-analysis.git
cd fmri-brain-scan-analysis
[Kaggle API] ──► [Download Dataset] ──► [DICOM/NIfTI/JPEG Loading]
                                                    │
                                                    ▼
                              [Preprocessing: Resize → Denoise → Normalize]
                                                    │
                                                    ▼
                               [EDA: Heatmaps, Distributions, Statistics]
                                                    │
                                                    ▼
                    ┌───────────────────────────────┼───────────────────────┐
                    ▼                               ▼                       ▼
              [Custom CNN]                     [ResNet50]                [VGG16]
                    │                               │                       │
                    └───────────────────────────────┼───────────────────────┘
                                                    │
                                                    ▼
                            [Evaluation: Accuracy, AUC, F1, Confusion Matrix]
                                                    │
                                                    ▼
                                        [Grad-CAM Explainability]
                                                    │
                                                    ▼
                                        [Final Report & Outputs]
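The Grad-CAM step near the end of the pipeline reduces to a short computation once the feature maps of a convolutional layer and the gradients of the class score with respect to them are available. The sketch below shows that core step in NumPy with random stand-in arrays. In the project, both arrays would presumably be obtained from the trained model (for example with `tf.GradientTape` on the last convolutional layer); the function name and the shapes used here are assumptions.

```python
import numpy as np

def grad_cam(activations, gradients, eps=1e-8):
    """Grad-CAM heatmap for one image and one class score.

    activations, gradients: arrays of shape (H, W, K), where K is the
    number of feature maps of the chosen convolutional layer.
    """
    # Channel weights: gradients averaged over the spatial dimensions.
    weights = gradients.mean(axis=(0, 1))                       # shape (K,)
    # Weighted sum of the feature maps.
    cam = np.tensordot(activations, weights, axes=([2], [0]))   # shape (H, W)
    # Keep only regions with a positive influence on the class score.
    cam = np.maximum(cam, 0.0)
    # Scale to [0, 1] for display.
    return cam / (cam.max() + eps)

# Stand-in arrays with the output shape of the last Conv2D layer (32, 32, 128).
rng = np.random.default_rng(0)
acts = rng.random((32, 32, 128))
grads = rng.standard_normal((32, 32, 128))
heatmap = grad_cam(acts, grads)
print(heatmap.shape)
```

To overlay the result on a scan, the heatmap would be upsampled to the input resolution (128x128 here) and blended with the image.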