Multi-Agent QLoRA-Optimized LLMs for Personalized Healthcare

Team: M. Nithyashree, Pavithra N, Krupa P. Nadgir

Summary: A lightweight, memory-efficient framework for deploying multi-task Large Language Models (LLMs) on edge devices in healthcare settings. Integrates real-time IoT sensing with quantized LoRA adapters to deliver diagnostic reasoning, health education, and one-day diet planning—all on resource-constrained hardware.

Features

Modular adapter-based fine-tuning
4-bit quantization with QLoRA for minimal memory usage
Three specialized agents:
- Diagnostic reasoning
- Personalized health tips generation
- One-day diet planning
Real-time IoT integration (heart rate, SpO₂, ECG, temperature)
Edge deployment of memory-efficient LLM models

Architecture

Base Model: DeepSeek-R1-Distill-LLaMA-8B

Hosted on Hugging Face, distilled and quantized to 4-bit

Adapters:

LoRA for low-rank parameter updates. QLoRA adds adaptive scaling for quantized weights. Stored as .safetensors, dynamically loaded per task

IoT Pipeline: Hardware sensors stream vitals to the edge device
Deployment: Single quantized model in memory, Task switching by loading specific adapter

Components

Component	Description
Diagnostic Agent	Interprets symptom descriptions and suggests possible medical conditions
Health Education Generator	Produces tailored preventive care and wellness guidance
One-Day Diet Planner	Generates culturally relevant, individualized meal plans based on real-time health metrics
Web Dashboard & API	User interface for text/voice input, visualization of vitals, and real-time recommendations

Prerequisites

Python 3.10+

PyTorch 2.x

bitsandbytes

transformers

peft (Parameter-Efficient Fine-Tuning library)

NVIDIA Jetson Orin Nano (8–16 GB RAM)

Results

Fine-tuned models achieve coherent, context-aware outputs vs. generic base model responses
Training convergence within ~1 k samples per task
GPU power usage under 80 W; inference on 4-bit model with < 16 GB RAM
Real-time switching between agents with minimal overhead

Key Contributions

Demonstrated feasibility of running advanced LLMs on low-power edge hardware
Introduced dynamic adapter loading to support multiple healthcare tasks from a single base model
Integrated live IoT data for truly personalized, context-aware AI-driven health insights

Research paper published

Here is the link to our research paper on this for detailed information:

[Multi-Agent QLoRA-Optimized LLMs for Personalized Healthcare : M Nithyashree, Krupa P. Nadgir, Pavithra N and Sindhu D. V.] (https://drive.google.com/file/d/1iOIQPCFURs6PIhn7Y7YVTBb5yoASxk0u/view)

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
MedLLM_v3_final.ipynb		MedLLM_v3_final.ipynb
README.md		README.md
block_diag_med_llm.png		block_diag_med_llm.png
patient_profiles_final.csv		patient_profiles_final.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Multi-Agent QLoRA-Optimized LLMs for Personalized Healthcare

Features

Architecture

Components

Prerequisites

Results

Key Contributions

Research paper published

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Multi-Agent QLoRA-Optimized LLMs for Personalized Healthcare

Features

Architecture

Components

Prerequisites

Results

Key Contributions

Research paper published

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages