Commit 51fb801

[docs] Added "Persistent & Scheduled Firmware Upgrades"
1 parent d49d586 commit 51fb801

1 file changed: +161 -0 lines changed


developer/gsoc-ideas-2026.rst

Lines changed: 161 additions & 0 deletions
@@ -1039,3 +1039,164 @@ Expected Outcomes
- Comprehensive **documentation**, including setup guides and best
  practices.
- A **short tutorial video** demonstrating installation and usage.

Persistent & Scheduled Firmware Upgrades
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. image:: ../images/gsoc/ideas/2023/firmware.jpg

.. important::

    Languages and technologies used: **Python**, **Django**, **Celery**,
    **REST API**, **JavaScript**.

    **Mentors**: *Federico Capoano*, *TBA*.

    **Project size**: 350 hours.

    **Difficulty rate**: medium.

This project aims to enhance `OpenWISP Firmware Upgrader
<https://github.com/openwisp/openwisp-firmware-upgrader>`__ with two
complementary features that improve reliability and operational
flexibility for mass firmware upgrades: **persistent retries** for offline
devices (`#379
<https://github.com/openwisp/openwisp-firmware-upgrader/issues/379>`__)
and **scheduled execution** for planned maintenance windows (`#380
<https://github.com/openwisp/openwisp-firmware-upgrader/issues/380>`__).

Currently, firmware upgrades in OpenWISP happen immediately via Celery
tasks. If a device is offline at the moment of upgrade, the task fails and
requires manual retry. In large deployments, this becomes unmanageable.
Additionally, network operators need the ability to schedule upgrades
during low-usage windows without manual intervention at execution time.

Expected outcomes
+++++++++++++++++

Introduce support for **persistent mass upgrades** that automatically
retry for offline devices and **scheduled mass upgrades** that execute at
a user-defined future time.

1. **Persistent mass upgrades** (`#379
   <https://github.com/openwisp/openwisp-firmware-upgrader/issues/379>`__)

   Mass upgrade operations should be able to retry indefinitely for
   devices that are offline at the initial execution time.

   - Add a ``persistent`` boolean field to mass upgrade operations
     (visible in admin and REST API, checked by default, immutable after
     creation).
   - Track retry count and scheduled retry time in the
     ``UpgradeOperation`` model.
   - Implement **device online detection**:

     - Prefer using the ``health_status_changed`` signal from OpenWISP
       Monitoring (with mocking for testing).
     - Fallback: periodic retries with randomized exponential backoff
       (configurable, max once every 12 hours).

   - Implement **retry strategy**:

     - Randomized exponential backoff with indefinite retries.
     - Periodic reminders (default every 2 months) via
       ``generic_notification`` to admins about devices still pending
       upgrade, with links filtering pending devices.
     - Continue until admin cancels or all devices are upgraded.

   - **Integration with Celery**: Use a new Celery task to "wake up"
     pending upgrades, with randomized delays to prevent system overload.
   - **Failure handling**: Use ``generic_notification`` for failures
     requiring attention (devices offline too long, upgrade errors).
   - **Edge cases**: Handle concurrent signal triggers, ensure only one
     upgrade per device, no rollback support needed.

2. **Scheduled mass upgrades** (`#380
   <https://github.com/openwisp/openwisp-firmware-upgrader/issues/380>`__)

   Allow users to schedule mass upgrades for future execution.

   - **UI**: Add optional datetime scheduling on the mass upgrade
     confirmation page. Default is immediate execution unless a future
     datetime is set.
   - **Validation**: The scheduled datetime must:

     - Be in the future.
     - Respect a minimum delay (e.g., 10 minutes).
     - Not exceed a maximum horizon (e.g., 6 months).

   - **Timezone handling**: User input in browser timezone, storage in
     UTC, server timezone clearly indicated in UI.
   - **Status model**: Extend to include a ``scheduled`` state with
     transitions: scheduled → running, scheduled → canceled, scheduled →
     failed.
   - **Execution model**: Use a Celery Beat periodic task (every minute)
     to scan and execute due upgrades. **Avoid Celery eta/countdown** for
     reliability with far-future tasks.
   - **Runtime validation**: Re-evaluate devices, permissions, and
     firmware availability at execution time. Cancel with error if all
     targets become invalid.
   - **Conflict prevention**: Prevent creating conflicting mass upgrades
     (scheduled or immediate) when one is already pending.
   - **Notifications**: Send ``generic_notification`` when scheduled
     upgrades start and complete.

3. **Combined features**

   Scheduled upgrades should also support persistence. A scheduled upgrade
   that starts but has offline devices should continue retrying according
   to the persistence logic.

4. **General requirements**

   - Operations are editable only while in ``scheduled`` status.
   - Clear exposure of scheduled status and datetime in admin list, detail
     view, and REST API.
   - Full feature parity between Django admin and REST API.

5. **Testing and documentation**

   - Test coverage **must not decrease** from current levels.
   - **Browser tests** for the scheduling UI and admin interface workflows
     are required.
   - Documentation has to be kept up to date, including:

     - Usage instructions for persistent and scheduled upgrades.
     - Updated screenshots reflecting UI changes.
     - One short example usage video for each feature.
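
The validation rules for scheduled upgrades listed above (in the future,
minimum delay, maximum horizon, storage in UTC) can be sketched in plain
Python. This is only an illustrative sketch, not the project's
implementation: the function name ``validate_scheduled_at`` and the
constants ``MIN_DELAY`` and ``MAX_HORIZON`` are hypothetical, and the
values are the examples mentioned in the text, with 6 months approximated
as 180 days.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical defaults, taken from the examples in the requirements.
MIN_DELAY = timedelta(minutes=10)
MAX_HORIZON = timedelta(days=180)  # assumption: "6 months" ~ 180 days


def validate_scheduled_at(scheduled_at, now=None):
    """Return ``scheduled_at`` converted to UTC, or raise ValueError.

    ``now`` can be injected to make the check deterministic in tests.
    """
    if scheduled_at.tzinfo is None:
        # Naive datetimes are rejected: the user's timezone must be known.
        raise ValueError("scheduled datetime must be timezone-aware")
    now = now or datetime.now(timezone.utc)
    scheduled_utc = scheduled_at.astimezone(timezone.utc)
    delta = scheduled_utc - now
    if delta <= timedelta(0):
        raise ValueError("scheduled datetime must be in the future")
    if delta < MIN_DELAY:
        raise ValueError("scheduled datetime is earlier than the minimum delay")
    if delta > MAX_HORIZON:
        raise ValueError("scheduled datetime exceeds the maximum horizon")
    return scheduled_utc
```

In a Django project the same checks would more likely live in model or
form validation; the sketch only shows the order of the checks and the
conversion to UTC before storage.
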

Prerequisites to work on this project
+++++++++++++++++++++++++++++++++++++

Applicants must demonstrate a solid understanding of:

- **Python**, **Django**, and **JavaScript**.
- REST APIs and background task processing (Celery, Celery Beat).
- Timezone handling and datetime management.
- Experience with `OpenWISP Firmware Upgrader
  <https://github.com/openwisp/openwisp-firmware-upgrader>`__ is
  essential. Contributions or resolved issues in this repository are
  considered strong evidence of the required proficiency.

Open questions for contributors
+++++++++++++++++++++++++++++++

1. **Persistence implementation**: What is the optimal database schema for
   tracking persistent upgrade state while maintaining compatibility with
   existing upgrade operation models?
2. **Scheduling mechanism**: How exactly should the Celery Beat periodic
   task be configured to reliably detect and execute due scheduled
   upgrades without performance issues?
3. **Timezone UX**: What is the best way to handle timezone display and
   input in the admin interface to minimize user confusion?
4. **Backoff strategy**: What are the optimal parameters for randomized
   exponential backoff (initial delay, max delay, randomization factor)?
5. **Conflict detection**: How should conflicting operations be detected
   and prevented? What defines a "conflict"?
6. **Monitoring integration**: How exactly should the
   ``health_status_changed`` signal from OpenWISP Monitoring be integrated
   for optimal online detection?
7. **Notification frequency**: What are the optimal default periods for
   reminder notifications about pending persistent upgrades?
8. **Edge case handling**: How should edge cases be handled, such as
   devices that are offline for months, or mass upgrades with very large
   device counts?
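
As a starting point for the backoff question above, randomized
exponential backoff can be sketched as follows. The sketch uses the
"full jitter" variant, which is one possible choice and not something the
project has decided. The function name ``retry_delay`` and the 5-minute
initial delay are hypothetical, and the 12-hour cap is an assumed reading
of the "max once every 12 hours" requirement.

```python
import random
from datetime import timedelta


def retry_delay(
    retry_count,
    base=timedelta(minutes=5),  # hypothetical initial delay
    cap=timedelta(hours=12),  # assumed reading of the 12 hour limit
    rng=random,
):
    """Return a random delay before the next retry ("full jitter").

    The upper bound doubles with each retry until it reaches ``cap``;
    the actual delay is drawn uniformly between zero and that bound,
    so that many devices do not retry at the same moment.
    """
    if retry_count < 0:
        raise ValueError("retry_count must be >= 0")
    # Clamp the exponent so indefinite retries never produce huge numbers.
    exponent = min(retry_count, 32)
    upper = min(cap.total_seconds(), base.total_seconds() * (2**exponent))
    return timedelta(seconds=rng.uniform(0, upper))
```

Passing a seeded ``random.Random`` instance as ``rng`` makes the delays
reproducible in tests. Other variants, such as keeping a fixed minimum
delay and randomizing only part of the interval, trade spread for a
guaranteed pause between attempts.
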
