DeepMedico™: Health Sensing & Sleep Breathing Irregularity Detection
An end-to-end deep learning pipeline for sleep breathing irregularity detection and sleep stage classification
This repository contains a Python script designed to load, parse, and visualize physiological data collected during a sleep study for subject AP20. The script processes various CSV files containing time-series health sensing data, including airflow, SpO2 levels, respiratory efforts, sleep stages, and detected flow events.

- Abstract
- Keywords
- Introduction
- Features
- Dataset
- Pipeline Architecture
- Methodology
- Directory Structure
- Usage
- Input Format
- Requirements
- Outputs
- Advanced Notes

Abstract

Sleep-related breathing disorders such as obstructive sleep apnea (OSA) and hypopnea require accurate and scalable detection systems. DeepMedico™ presents a research-grade, end-to-end framework that integrates:
- Multi-modal physiological signal visualization
- Signal preprocessing and dataset engineering
- Deep temporal modeling using CNN and Conv-LSTM
- Subject-independent evaluation using Leave-One-Participant-Out cross-validation (LOPO CV)

The pipeline is designed to be modular, reproducible, and clinically relevant.

Keywords

Sleep Study · Apnea · Hypopnea · Physiological Signals · Time-Series · CNN · Conv-LSTM · SpO₂ · Respiration · Deep Learning

Introduction

Manual polysomnography (PSG) analysis is costly and time-consuming. Automated systems must handle:
- Inter-subject variability
- Temporal dependencies
- Strict evaluation protocols to prevent data leakage

DeepMedico™ addresses these challenges with a fully automated and explainable pipeline, spanning visualization to model evaluation.

Features

- Multi-File Processing: Automatically loads and processes multiple CSV files related to a single sleep study session.
- Data Parsing: Handles various data formats, including semicolon-separated values and different metadata structures (e.g., skipping header rows).
- Time-Series Analysis: Converts string timestamps into datetime objects for accurate time-series plotting.
- Visualization: Generates dedicated plots for:
  - Sleep Flow Events (e.g., Hypopnea)
  - Sleep Profile (Sleep Stages)
  - SpO2 (Blood Oxygen Saturation)
  - Thoracic Respiration
  - Airflow
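
The parsing and timestamp conversion described above can be sketched with the standard library alone. This is a minimal illustration, not the repository's actual loader: the helper name `load_signal`, the number of metadata rows, the two-column layout, the timestamp format, and the sample text are all assumptions made for the example.

```python
import csv
from datetime import datetime


def load_signal(text, skip_rows=0, delimiter=";", time_format="%d.%m.%Y %H:%M:%S"):
    """Parse delimited text into a list of (datetime, float) samples.

    Hypothetical helper: skip_rows, delimiter and time_format are assumptions
    and must be adapted to the real export format.
    """
    lines = text.splitlines()[skip_rows:]      # drop metadata/header rows
    samples = []
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) < 2 or not row[0].strip():
            continue                            # ignore blank or malformed rows
        timestamp = datetime.strptime(row[0].strip(), time_format)
        samples.append((timestamp, float(row[1])))
    return samples


# Made-up sample text: two metadata rows followed by two readings.
example = """Signal: Flow
Unit: a.u.
29.05.2024 21:00:00; 0.12
29.05.2024 21:00:01; -0.05
"""
samples = load_signal(example, skip_rows=2)
print(samples[0][0].isoformat(), samples[0][1])
```

Parsing with an explicit format string fails loudly on unexpected timestamps, which is usually preferable to silent misinterpretation of day/month order.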

Dataset

Each participant's overnight recording includes:
- Nasal airflow
- Thoracic respiratory movement
- SpO₂ levels
- Expert-annotated breathing events
- Sleep stage labels

Dataset: https://drive.google.com/drive/folders/1J95cTl574LLdj4uelYwjyv0094d8sOpD?usp=sharing

Pipeline Architecture

DeepMedico™ is a complete end-to-end pipeline for detecting breathing irregularities (e.g., apnea, hypopnea) and classifying sleep stages from overnight sleep study signals using deep learning. Data flows through the following stages:

    Raw Signals
        ↓
    Visualization (EDA & QC)
        ↓
    Preprocessing & Filtering
        ↓
    Windowing & Labeling
        ↓
    Parquet Dataset
        ↓
    CNN / Conv-LSTM Models
        ↓
    LOPO Cross-Validation

Methodology

Signal Visualization
- Multi-signal time-aligned plots
- Apnea/Hypopnea overlays
- Exported as per-participant PDFs for EDA and QC

Signal Preprocessing
- Bandpass filtering (0.17–0.4 Hz)
- Timestamp normalization
- Noise and drift suppression
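
A 0.17–0.4 Hz bandpass could be sketched as follows. Only the passband comes from this README; the filter family (Butterworth), the order, the zero-phase `sosfiltfilt` application, the 32 Hz sampling rate, and the helper name `bandpass_breathing` are assumptions chosen for illustration, and the synthetic signal is made up.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def bandpass_breathing(signal, fs, low_hz=0.17, high_hz=0.4, order=4):
    """Zero-phase Butterworth bandpass (filter type and order are assumptions)."""
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, signal)


fs = 32.0                                    # assumed sampling rate in Hz
t = np.arange(0, 120, 1 / fs)                # two minutes of samples
breathing = np.sin(2 * np.pi * 0.25 * t)     # 15 breaths/min, inside the passband
drift = 0.5 * t / t[-1]                      # slow baseline drift
artifact = 0.3 * np.sin(2 * np.pi * 5.0 * t) # fast component, outside the passband
filtered = bandpass_breathing(breathing + drift + artifact, fs)
```

Second-order sections are used here because narrow, low-frequency passbands can be numerically fragile in transfer-function form; forward-backward filtering avoids shifting events in time relative to their annotations.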

Dataset Engineering
- 30-second windows
- 50% overlap
- Event-based labeling
- Parquet storage for efficiency
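
The windowing and event-based labeling steps can be illustrated in plain Python. The window length and overlap follow this README; the labeling rule (a window takes the label of the event covering the largest share of it, provided that share reaches `min_fraction`) is an assumption, as are the helper names and the example event.

```python
def window_starts(total_seconds, window_s=30.0, overlap=0.5):
    """Start times (seconds) of fixed-length windows that fit fully in the recording."""
    step = window_s * (1.0 - overlap)          # 15 s hop for 30 s windows at 50% overlap
    starts = []
    start = 0.0
    while start + window_s <= total_seconds:
        starts.append(start)
        start += step
    return starts


def label_window(start, window_s, events, min_fraction=0.5):
    """Label a window by its most-overlapping event (assumed rule).

    events: list of (event_start_s, event_end_s, event_type) tuples.
    Windows without sufficient event coverage are labeled 'Normal'.
    """
    end = start + window_s
    best_label, best_overlap = "Normal", 0.0
    for ev_start, ev_end, ev_type in events:
        overlap_s = max(0.0, min(end, ev_end) - max(start, ev_start))
        if overlap_s > best_overlap:
            best_label, best_overlap = ev_type, overlap_s
    return best_label if best_overlap >= min_fraction * window_s else "Normal"


# Made-up example: a 2-minute recording with one 22-second hypopnea.
events = [(40.0, 62.0, "Hypopnea")]
starts = window_starts(120.0)
labels = [label_window(s, 30.0, events) for s in starts]
print(list(zip(starts, labels)))
```

With a recording of length T, window length W, and hop H, the number of complete windows is floor((T - W) / H) + 1, which gives 7 for the example above.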

Deep Learning Models
- 1D CNN: local temporal features
- Conv-LSTM: long-range dependencies
- Multi-class classification

Evaluation Protocol
- Leave-One-Participant-Out CV
- Per-class Precision / Recall / F1
- Mean ± Std across folds
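
The reported metrics can be computed as in the sketch below. It is written in plain Python for clarity and is not taken from `modeling.py`; the helper names are hypothetical, the labels and fold scores are made up, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def per_class_scores(y_true, y_pred, classes):
    """Precision, recall and F1 for each class, computed one-vs-rest."""
    scores = {}
    for cls in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores[cls] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


def summarize(fold_values):
    """Mean and sample standard deviation of one metric across folds."""
    spread = stdev(fold_values) if len(fold_values) > 1 else 0.0
    return mean(fold_values), spread


# Made-up predictions for a single fold.
y_true = ["Normal", "Normal", "Hypopnea", "Hypopnea"]
y_pred = ["Normal", "Hypopnea", "Hypopnea", "Hypopnea"]
fold_scores = per_class_scores(y_true, y_pred, ["Normal", "Hypopnea"])

# Made-up F1 values from three folds, aggregated as mean and std.
f1_mean, f1_std = summarize([0.8, 0.6, 0.7])
print(fold_scores["Hypopnea"], round(f1_mean, 3), round(f1_std, 3))
```

Reporting per-class scores matters here because event windows are typically much rarer than normal ones, so overall accuracy alone can hide poor event detection.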
Bonus Task: Sleep stage classification using the same framework.
Highly Modular: Each step can be run independently.

Directory Structure

    DeepMedico/
    ├── Data/
    │   ├── AP20/                          # Example participant folder
    │   │   ├── nasal_airflow.csv          # Nasal airflow signal (timestamp, value)
    │   │   ├── thoracic_movement.csv      # Thoracic respiration signal
    │   │   ├── spo2.csv                   # Blood oxygen saturation (SpO₂)
    │   │   ├── events.csv                 # Apnea / Hypopnea annotations
    │   │   └── sleep_profile.csv          # Sleep stage labels
    │   └── ...                            # Other participants (AP21, AP22, ...)
    │
    ├── Visualizations/                    # Signal + annotation PDFs
    ├── Dataset/                           # Windowed breathing-event dataset (Parquet)
    ├── SleepStageDataset/                 # Sleep stage dataset & features
    ├── Results/                           # Metrics, logs, CV outputs
    │
    ├── vis.py                             # Signal visualization (EDA & QC)
    ├── create_dataset.py                  # Preprocessing & windowing
    ├── modeling.py                        # CNN / Conv-LSTM training
    ├── sleep_stage_classification.py      # Sleep stage classification (bonus)
    │
    ├── requirements.txt                   # Python dependencies
    └── setup.py                           # Package installation

Usage

Visualize Signals
- Produces the signal and annotation PDF for one participant folder.

    python vis.py -name "Data/AP20"

Create Dataset
- Cleans and filters the signals.
- Segments them into 30 s windows with 50% overlap.
- Labels windows according to breathing event overlap.

    python create_dataset.py -in_dir "Data" -out_dir "Dataset" --format parquet

Train & Evaluate Models
- Trains the 1D CNN and Conv-LSTM models, evaluated with leave-one-participant-out CV.
- Reports per-class classification metrics and mean/std result tables.

    python modeling.py --dataset "Dataset/sleep_breathing_dataset.parquet" --model both --epochs 100

(Bonus) Sleep Stage Classification
- Same pipeline as above, but with sleep stage labels rather than breathing event labels.

    python sleep_stage_classification.py -in_dir "Data" -out_dir "SleepStageDataset" --train

Complete Execution Pipeline

    python vis.py -name "Data/AP20"
    python vis.py -name "Data/AP21"
    python vis.py -name "Data/AP22"
    python vis.py -name "Data/AP23"
    python vis.py -name "Data/AP24"
    python create_dataset.py -in_dir "Data" -out_dir "Dataset" --format parquet
    python modeling.py --dataset "Dataset/sleep_breathing_dataset.parquet" --model both --epochs 100
    python sleep_stage_classification.py -in_dir "Data" -out_dir "SleepStageDataset" --train
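
To avoid typing one `vis.py` call per participant, the commands listed in this section could be assembled by a small driver script. The sketch below is not part of the repository: `find_participants`, `build_commands`, and `run_pipeline` are hypothetical helpers, and the only command-line flags used are the ones shown in this README.

```python
import subprocess
import sys
from pathlib import Path


def find_participants(data_dir="Data"):
    """Participant folders under data_dir as POSIX-style paths (empty if missing)."""
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(p.as_posix() for p in root.iterdir() if p.is_dir())


def build_commands(participants, data_dir="Data", dataset_dir="Dataset",
                   stage_dir="SleepStageDataset", epochs=100):
    """Build the command list for all pipeline steps, in execution order."""
    python = sys.executable                    # run scripts with the current interpreter
    commands = [[python, "vis.py", "-name", folder] for folder in participants]
    commands.append([python, "create_dataset.py", "-in_dir", data_dir,
                     "-out_dir", dataset_dir, "--format", "parquet"])
    commands.append([python, "modeling.py", "--dataset",
                     f"{dataset_dir}/sleep_breathing_dataset.parquet",
                     "--model", "both", "--epochs", str(epochs)])
    commands.append([python, "sleep_stage_classification.py", "-in_dir", data_dir,
                     "-out_dir", stage_dir, "--train"])
    return commands


def run_pipeline(commands):
    """Run each step in order and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Build (but do not run) the commands for two example participants.
commands = build_commands(["Data/AP20", "Data/AP21"])
for command in commands:
    print(" ".join(command[1:]))
```

Calling `run_pipeline(build_commands(find_participants()))` from the repository root would then execute every step; `check=True` ensures later steps do not run on the output of a failed one.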

Input Format

Each participant subfolder (e.g., AP20/) should contain:
- nasal_airflow.csv (timestamp,value)
- thoracic_movement.csv (timestamp,value)
- spo2.csv (timestamp,value)
- events.csv (start_time,end_time,event_type)
- sleep_profile.csv (start_time,end_time,sleep_stage)

Timestamps must be in an unambiguous format (ideally ISO 8601).
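
Before running the pipeline, a participant folder can be checked against the layout above. The sketch below is illustrative and not part of the repository: the helper name `check_participant_folder` is hypothetical, and it assumes each file is comma-separated with a first row naming the listed columns, which this README does not state explicitly.

```python
import csv
import tempfile
from pathlib import Path

# File names and columns as listed in this README.
EXPECTED_COLUMNS = {
    "nasal_airflow.csv": ["timestamp", "value"],
    "thoracic_movement.csv": ["timestamp", "value"],
    "spo2.csv": ["timestamp", "value"],
    "events.csv": ["start_time", "end_time", "event_type"],
    "sleep_profile.csv": ["start_time", "end_time", "sleep_stage"],
}


def check_participant_folder(folder):
    """Return a list of problems; an empty list means the layout looks valid."""
    problems = []
    for filename, columns in EXPECTED_COLUMNS.items():
        path = Path(folder) / filename
        if not path.is_file():
            problems.append(f"missing file: {filename}")
            continue
        with path.open(newline="") as handle:
            header = next(csv.reader(handle), [])   # assumed header row
        if [name.strip() for name in header] != columns:
            problems.append(f"unexpected header in {filename}: {header}")
    return problems


# Demonstration on a throwaway folder in which spo2.csv is deliberately absent.
with tempfile.TemporaryDirectory() as tmp:
    for filename, columns in EXPECTED_COLUMNS.items():
        if filename != "spo2.csv":
            (Path(tmp) / filename).write_text(",".join(columns) + "\n")
    demo_problems = check_participant_folder(tmp)
print(demo_problems)
```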

Requirements

- Python >= 3.8
- See requirements.txt

Install requirements:

    pip install -r requirements.txt

Or install as a package:

    python setup.py install

Outputs

- Visualizations/: Per-participant signal PDF files (EDA).
- Dataset/: Parquet file with windowed features and labels.
- Results/: Model performance metrics (JSON and logs).
- SleepStageDataset/: Sleep stage dataset, metadata, and (if --train is passed) model performance.
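
Advanced Notes

- Filtering: Bandpass 0.17–0.4 Hz, which corresponds to roughly 10 to 24 breaths per minute. It attenuates slow baseline drift below the band and faster movement artifacts above it.
- Windowing: 30 seconds with 50% overlap, matching the 30-second epochs used in standard sleep scoring.
- Class Labels: 'Normal', 'Hypopnea', 'Obstructive Apnea' (event labeling).
- Sleep Stages: 'Wake', 'N1', 'N2', 'N3', 'REM' (bonus/extension).
- Evaluation: Leave-one-participant-out CV keeps all windows from a participant in the same fold, which prevents subject-level data leakage. Random window-level splits are inappropriate for physiological data because windows from the same participant, including overlapping ones, would land in both the training and test sets.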
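
The leave-one-participant-out split itself is simple to express. The following is a plain-Python sketch, not the code in `modeling.py`; `lopo_folds` is a hypothetical helper and the participant assignment is made up.

```python
def lopo_folds(participant_ids):
    """Yield (held_out, train_idx, test_idx), holding out one participant per fold."""
    for held_out in sorted(set(participant_ids)):
        test_idx = [i for i, pid in enumerate(participant_ids) if pid == held_out]
        train_idx = [i for i, pid in enumerate(participant_ids) if pid != held_out]
        yield held_out, train_idx, test_idx


# Made-up example: the participant that each of six windows belongs to.
window_owner = ["AP20", "AP20", "AP21", "AP21", "AP21", "AP22"]
folds = list(lopo_folds(window_owner))
for held_out, train_idx, test_idx in folds:
    train_people = {window_owner[i] for i in train_idx}
    assert held_out not in train_people        # no participant on both sides
    print(held_out, "train:", len(train_idx), "test:", len(test_idx))
```

scikit-learn's `LeaveOneGroupOut` produces the same kind of split when the participant ID is passed as the `groups` argument.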
