Implement EnabledPeriod for scheduled scaling activity by artazar · Pull Request #243 · truefoundry/KubeElasti

artazar · 2026-02-04T09:59:47Z

Description

Introduce scheduled scale-down activity of KubeElasti.

In short: add enabledPeriod specification for ElastiService custom object that supports adding a cron schedule and duration for the KubeElasti activity. Outside of this configured schedule no scale-down should happen. By default this is set to "always".

Fixes #242

Type of change

Please delete options that are not relevant.

New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

[+] Added comprehensive tests
[+] Build and deployed a custom image and verified that it works as expected in my cluster

Checklist:

[+] I have performed a self-review of my own code
[+] I have made corresponding changes to the documentation
[+] I have added tests that prove my fix is effective or that my feature works
[+] New and existing unit tests pass locally with my changes
[+] A PR is open to update the helm chart (link)[https://github.com/truefoundry/elasti/tree/main/charts/elasti] if applicable
[+] I have updated the CHANGELOG.md file with the changes I made

Additional notes:

Note this has been implemented with the help of Claude Code but under strict supervision and solid integration testing.

Summary by CodeRabbit

New Features
- Scheduled activity for ElastiService: add enabledPeriod (cron schedule + duration) to control when scale-to-zero is allowed.
Documentation
- Configuration guide, examples, defaults (daily midnight / 24h), UTC note, and behavior when omitted or invalid.
Behavior
- Outside enabledPeriod prevents scale-down; invalid/unparsable schedule or duration logs warnings and is treated fail-open.
CRD & Samples
- CRD schema and sample manifests updated to include enabledPeriod and examples.
Tests
- Added unit tests covering parsing and enabled-period logic.

coderabbitai · 2026-02-04T09:59:57Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Walkthrough

Adds scheduled enabled periods to ElastiService (cron schedule + duration): new API/CRD fields and deepcopy, a cron utility (parsing/validation/window checks) with tests, integrates enabled-period checks into scaling decision to gate scale-downs, updates docs/samples, and adds the cron dependency.

Changes

Cohort / File(s)	Summary
API & CRD `operator/api/v1alpha1/elastiservice_types.go`, `operator/api/v1alpha1/zz_generated.deepcopy.go`, `operator/config/crd/bases/elasti.truefoundry.com_elastiservices.yaml`	Add `EnabledPeriod` type and `spec.enabledPeriod` (optional) with `schedule` and `duration`; generate deepcopy funcs and extend CRD schema with defaults.
Documentation & Samples `docs/src/gs-configure-elastiservice.md`, `operator/config/samples/elasti_v1alpha1_elastiservice.yaml`, `CHANGELOG.md`	Document `enabledPeriod` fields, semantics, examples, defaults, UTC/cron guidance; add sample manifest using `enabledPeriod`; add changelog entry.
Cron utility & tests `pkg/cronutil/cronutil.go`, `pkg/cronutil/cronutil_test.go`, `pkg/go.mod`	New cron utility: `ParseCronSchedule`, `ValidateDuration`, `IsInEnabledPeriod` (and injectable `IsInEnabledPeriodAt`) plus helper to find last trigger; comprehensive unit tests; add `github.com/robfig/cron/v3` dependency.
Scaling logic & tests `pkg/scaling/scale_handler.go`, `pkg/scaling/scale_handler_test.go`	Integrate enabled-period check into scale decision flow to prevent scale-down outside the enabled window; add `ScaleHandler.isInEnabledPeriod` with defaults/validation and handle errors by warning and assuming enabled; update scale-handler tests.
Operator modules `operator/go.mod`	Add indirect dependency on `github.com/robfig/cron/v3` required by packages.

Sequence Diagram

sequenceDiagram
    participant ScaleHandler as ScaleHandler
    participant CronUtil as CronUtil
    participant CronParser as CronParser

    ScaleHandler->>ScaleHandler: calculateScaleDirection()
    alt enabledPeriod present
        ScaleHandler->>ScaleHandler: isInEnabledPeriod(enabledPeriod)
        ScaleHandler->>CronUtil: ValidateDuration(duration)
        CronUtil-->>ScaleHandler: duration or error
        alt duration invalid
            ScaleHandler->>ScaleHandler: log warning, assume enabled (prevent scale-down)
        else duration valid
            ScaleHandler->>CronUtil: IsInEnabledPeriod(schedule, duration)
            CronUtil->>CronParser: ParseCronSchedule(5-field)
            CronParser-->>CronUtil: Schedule or error
            alt schedule invalid
                CronUtil-->>ScaleHandler: error
            else schedule valid
                CronUtil->>CronUtil: findLastTriggerTime(now, lookback)
                CronUtil-->>ScaleHandler: bool (inPeriod)
            end
        end
        alt not in enabled period
            ScaleHandler->>ScaleHandler: return ScaleUp (block scale-down)
        else in enabled period
            ScaleHandler->>ScaleHandler: continue normal scaling logic
        end
    else no enabledPeriod
        ScaleHandler->>ScaleHandler: continue normal scaling logic
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibble cron lines under moonlit charts,
Windows of running stitched with tiny hearts.
Schedules whisper, durations counted clear,
Nightly hush, daytime cheer.
Hop—scale sleeps softly through the year.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Implement EnabledPeriod for scheduled scaling activity' accurately summarizes the main change: adding EnabledPeriod functionality for controlling when scale-down occurs based on schedule.
Linked Issues check	✅ Passed	All coding requirements from issue `#242` are met: scheduled scale-down capability via enabledPeriod field with cron schedule support, duration configuration, fail-open defaults, and comprehensive test coverage.
Out of Scope Changes check	✅ Passed	All changes are in scope. Beyond core EnabledPeriod implementation, sample YAML updates provide helpful documentation context, and go.mod updates add only the required cron library dependency.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Tip

Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

artazar · 2026-02-04T10:07:01Z

@coderabbitai review

coderabbitai · 2026-02-04T10:07:10Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI agents

In `@docs/src/gs-configure-elastiservice.md`:
- Around line 146-189: Add a single trailing newline at the end of the markdown
file that contains the "### **5. EnabledPeriod: Control when scale-to-zero is
active (Optional)**" section so the file ends with a newline character (fixes
the markdownlint missing-trailing-newline warning); simply open the document and
ensure the last line is followed by one newline and save.

In `@pkg/cronutil/cronutil_test.go`:
- Around line 122-195: TestIsInEnabledPeriod currently only checks for errors
and ignores tt.want and tt.mockNow; add a time-injectable helper
IsInEnabledPeriodAt(schedule string, duration time.Duration, now time.Time)
(bool, error) in pkg/cronutil/cronutil.go that parses the schedule (use
ParseCronSchedule), finds the last trigger (reuse/findLastTriggerTime) within
duration+24h, computes enabledUntil := lastTrigger.Add(duration), and returns
now.UTC().Before(enabledUntil); make IsInEnabledPeriod call
IsInEnabledPeriodAt(time.Now().UTC()). Update TestIsInEnabledPeriod to call
IsInEnabledPeriodAt(tt.schedule, tt.duration, tt.mockNow) and assert the
returned bool equals tt.want (and still check error expectations for invalid
schedules).

In `@pkg/cronutil/cronutil.go`:
- Around line 39-57: The current findLastTriggerTime implementation may iterate
every occurrence in the lookback window for high-frequency schedules; update
findLastTriggerTime to avoid linear scans by performing an exponential/backoff
search using schedule.Next to quickly find a time window that contains the last
trigger, then perform a binary search (by probing schedule.Next at midpoints)
between the last known-before-now and the first-after-now to locate the exact
lastTrigger; keep references to schedule, earliestTime/now/lookbackWindow,
triggerTime and lastTrigger when describing the probes and ensure you still
return lastTrigger (zero if none found).

docs/src/gs-configure-elastiservice.md

pkg/cronutil/cronutil_test.go

pkg/cronutil/cronutil.go

coderabbitai

Actionable comments posted: 4

🤖 Fix all issues with AI agents

In `@operator/api/v1alpha1/elastiservice_types.go`:
- Around line 29-41: The EnabledPeriod struct accepts arbitrary Schedule and
Duration strings; add CRD-level validation so malformed cron or duration values
are rejected early: add kubebuilder validation tags to the Schedule and Duration
fields on EnabledPeriod (e.g.,
+kubebuilder:validation:Pattern="<cron-5-field-regex>" for Schedule and
+kubebuilder:validation:Pattern="<duration-regex>" for Duration) to enforce a
5-item UTC cron and supported time-unit formats, or alternatively implement an
admission webhook validating EnabledPeriod.Schedule and EnabledPeriod.Duration
in the webhook handler (validate functions that parse/regex-check values) so
invalid values are denied at create/update.

In `@pkg/cronutil/cronutil_test.go`:
- Around line 197-225: The subtests in TestIsInEnabledPeriodEdgeCases are
non-deterministic because they rely on time.Now(); change both subtests to use
fixed UTC timestamps (e.g., time.Date(...).UTC()) to construct the cron
"schedule" and to compute any expected boundaries, call
IsInEnabledPeriod(schedule, duration) and assert the returned boolean (use the
returned value for the "just after trigger time" test and for the "long
duration" test assert true). Update references in the test to use the fixed time
instead of time.Now() and ensure you check the returned got value (and err) in
each subtest so the assertions are deterministic and explicit.

In `@pkg/scaling/scale_handler_test.go`:
- Around line 53-71: The test helper testPollingIntervalParsing duplicates
production logic and diverges on env var name/validation; extract the
parsing/validation into a single production function (e.g., ParsePollingInterval
or GetPollingInterval) that reads the correct env var (with optional legacy
fallback), applies the same rules (ParseDuration, positive check, default 30s,
and emits warnings via the same logger), update StartScaleDownWatcher to call
that new function, and make the test call the production function instead of
using testPollingIntervalParsing so both code paths share identical behavior.

In `@pkg/scaling/scale_handler.go`:
- Around line 184-197: The enabled-period check currently logs and falls through
on error from h.isInEnabledPeriod, which can allow scale-down; change the error
path to "fail closed" by treating parse/validation errors as not enabled: when
h.isInEnabledPeriod(es.Spec.EnabledPeriod) returns an error, log the warning
(h.logger.Warn...) and return ScaleUp, nil (or otherwise prevent scale-down /
enforce min replicas) so the handler blocks scale-down on EnabledPeriod
evaluation failures.

operator/api/v1alpha1/elastiservice_types.go

pkg/cronutil/cronutil_test.go

pkg/scaling/scale_handler_test.go

pkg/scaling/scale_handler.go

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@pkg/scaling/scale_handler.go`:
- Around line 187-201: The inline comment in the enabled-period check is stale:
when isInEnabledPeriod returns an error the code returns ScaleUp (preventing
scale-down) but the comment says "assume enabled to avoid disruption." Update
that comment to reflect the actual behavior in the block (e.g., "On error,
default to preventing scale-down" or "assume disabled — prevent scale-down")
near the isInEnabledPeriod call and the return ScaleUp to avoid confusion; no
logic change to the return or h.logger usage.

pkg/scaling/scale_handler.go

ramantehlan

Left a comment.

pkg/cronutil/cronutil.go

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

pkg/scaling/scale_handler.go (1)

187-200: The fail-closed behavior (Lines 190–194) was previously flagged and addressed.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/scaling/scale_handler.go` around lines 187 - 200, The current fail-closed
behavior in the enabled-period check (using es.Spec.EnabledPeriod and
h.isInEnabledPeriod) returns ScaleUp on error which prevents scale-down; change
this to fail-open so that on error you log the warning via h.logger.Warn
(including service and err) but return ScaleDown, nil instead of ScaleUp, nil,
while keeping the existing branch that returns ScaleUp when enabled==false
(outside period) unchanged.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 71f0129 and 5c026fd.

📒 Files selected for processing (1)

pkg/scaling/scale_handler.go

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/scaling/scale_handler.go`:
- Around line 195-199: The code currently returns ScaleUp when
calculateScaleDirection indicates scaling outside the enabled window, which
refreshes UpdateLastScaledUpTime and prevents subsequent scale-to-zero; change
that branch in calculateScaleDirection/handleScaleFromZero to return NoScale
instead of ScaleUp (and ensure you do not call UpdateLastScaledUpTime when no
actual scale-up will occur) so the cooldown timestamp isn't updated on these
polling cycles and scale-down can proceed when the window opens.

---

Duplicate comments:
In `@pkg/scaling/scale_handler.go`:
- Around line 187-200: The current fail-closed behavior in the enabled-period
check (using es.Spec.EnabledPeriod and h.isInEnabledPeriod) returns ScaleUp on
error which prevents scale-down; change this to fail-open so that on error you
log the warning via h.logger.Warn (including service and err) but return
ScaleDown, nil instead of ScaleUp, nil, while keeping the existing branch that
returns ScaleUp when enabled==false (outside period) unchanged.

pkg/scaling/scale_handler.go

artazar · 2026-02-23T08:23:29Z

@ramantehlan good from my side, please check once more, thank you!

ramantehlan · 2026-02-24T21:20:27Z

Thank you @artazar, will have a look

artazar force-pushed the main branch from 2e31cb9 to 801e79a Compare February 4, 2026 10:04

coderabbitai bot reviewed Feb 4, 2026

View reviewed changes

docs/src/gs-configure-elastiservice.md Outdated Show resolved Hide resolved

pkg/cronutil/cronutil_test.go Show resolved Hide resolved

pkg/cronutil/cronutil.go Show resolved Hide resolved

artazar force-pushed the main branch 2 times, most recently from cf8c414 to 80168e6 Compare February 5, 2026 02:03

coderabbitai bot reviewed Feb 5, 2026

View reviewed changes

operator/api/v1alpha1/elastiservice_types.go Show resolved Hide resolved

pkg/cronutil/cronutil_test.go Show resolved Hide resolved

pkg/scaling/scale_handler_test.go Show resolved Hide resolved

pkg/scaling/scale_handler.go Show resolved Hide resolved

Implement EnabledPeriod for scheduled scaling activity

ae30dbf

artazar force-pushed the main branch from 80168e6 to ae30dbf Compare February 5, 2026 02:48

Merge branch 'main' into main

71f0129

coderabbitai bot reviewed Feb 12, 2026

View reviewed changes

pkg/scaling/scale_handler.go Show resolved Hide resolved

ramantehlan approved these changes Feb 12, 2026

View reviewed changes

ramantehlan reviewed Feb 12, 2026

View reviewed changes

pkg/scaling/scale_handler.go Show resolved Hide resolved

ramantehlan requested changes Feb 12, 2026

View reviewed changes

ramantehlan reviewed Feb 12, 2026

View reviewed changes

pkg/cronutil/cronutil.go Show resolved Hide resolved

fix: improve message on isInEnabledPeriod check

5c026fd

coderabbitai bot reviewed Feb 23, 2026

View reviewed changes

pkg/scaling/scale_handler.go Show resolved Hide resolved

artazar requested a review from ramantehlan February 24, 2026 04:41

Comments

Conversation

artazar commented Feb 4, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

How Has This Been Tested?

Checklist:

Additional notes:

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Feb 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

artazar commented Feb 4, 2026

Uh oh!

coderabbitai bot commented Feb 4, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

ramantehlan left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

artazar commented Feb 23, 2026

Uh oh!

ramantehlan commented Feb 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

artazar commented Feb 4, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 4, 2026 •

edited

Loading