@nickbirnberg
Member

@nickbirnberg nickbirnberg commented Dec 16, 2025

What does this PR do?

Adds a new optional options section to the DatadogPodAutoscalerSpec in the v1alpha2 API. This section initially contains an outOfMemory configuration with a bumpUpRatio field that defines the ratio to multiply memory by when OOM events are detected.

Example usage:

  spec:
    options:
      outOfMemory:
        bumpUpRatio: "1.2"  # Multiply memory by 1.2 (20% increase) on OOM
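As a rough illustration of the hierarchy above, the sketch below models options → outOfMemory → bumpUpRatio with stand-in Go structs and prints the JSON form the API server would store. The type names (exampleSpec, exampleOptions, exampleOutOfMemory) and the renderSpec helper are hypothetical and are not the types added by this PR; in particular, the ratio is modelled as a plain string here, whereas the PR uses resource.Quantity.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// exampleOutOfMemory is a hypothetical stand-in for the outOfMemory section.
// The ratio is kept as a string for this sketch only.
type exampleOutOfMemory struct {
	BumpUpRatio string `json:"bumpUpRatio,omitempty"`
}

// exampleOptions is a hypothetical stand-in for the optional options section.
type exampleOptions struct {
	OutOfMemory *exampleOutOfMemory `json:"outOfMemory,omitempty"`
}

// exampleSpec holds only the part of the spec discussed here.
type exampleSpec struct {
	Options *exampleOptions `json:"options,omitempty"`
}

// renderSpec builds a spec with the given ratio and returns its JSON encoding.
func renderSpec(ratio string) (string, error) {
	spec := exampleSpec{
		Options: &exampleOptions{
			OutOfMemory: &exampleOutOfMemory{BumpUpRatio: ratio},
		},
	}
	out, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	out, err := renderSpec("1.2")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Pointer fields with omitempty are one common way to keep a section optional: when options is nil, nothing is serialized for it.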

Motivation

Support memory scale-up configuration when out-of-memory events are detected, allowing users to specify how much to increase memory allocation in response to OOM kills.

Additional Notes

  • The bumpUpRatio field uses the resource.Quantity type to avoid floating point precision issues in CRDs
  • A value of "1.2" means memory is increased by 20% (the current value is multiplied by 1.2)
  • The new types are defined in the common package (api/datadoghq/common/datadogpodautoscaler_types.go) for potential reuse
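To make the multiplier semantics concrete, here is a minimal sketch of applying a decimal ratio such as "1.2" to a memory size without floating point, using math/big from the Go standard library. The bumpMemory helper is hypothetical and is not code from this PR (which only changes the API); rounding up to a whole byte is an assumption made for this example.

```go
package main

import (
	"fmt"
	"math/big"
)

// bumpMemory multiplies a memory size in bytes by a decimal ratio such as
// "1.2" using exact rational arithmetic, then rounds up to a whole byte.
// Hypothetical helper for illustration only.
func bumpMemory(currentBytes int64, ratio string) (int64, error) {
	if currentBytes < 0 {
		return 0, fmt.Errorf("memory must be non-negative, got %d", currentBytes)
	}
	r, ok := new(big.Rat).SetString(ratio)
	if !ok {
		return 0, fmt.Errorf("invalid ratio %q", ratio)
	}
	if r.Sign() <= 0 {
		return 0, fmt.Errorf("ratio must be positive, got %q", ratio)
	}
	product := new(big.Rat).Mul(r, new(big.Rat).SetInt64(currentBytes))
	// Ceiling division: add one to the quotient when there is a remainder.
	quo, rem := new(big.Int).QuoRem(product.Num(), product.Denom(), new(big.Int))
	if rem.Sign() > 0 {
		quo.Add(quo, big.NewInt(1))
	}
	if !quo.IsInt64() {
		return 0, fmt.Errorf("result overflows int64")
	}
	return quo.Int64(), nil
}

func main() {
	// 512 MiB bumped by a ratio of "1.2".
	bumped, err := bumpMemory(512*1024*1024, "1.2")
	if err != nil {
		panic(err)
	}
	fmt.Println(bumped)
}
```

Parsing the ratio from its string form keeps "1.2" exact (6/5), which is the same concern that motivates avoiding a float field in the CRD.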

Minimum Agent Versions

N/A - This is an API-only change for the operator CRD.

  • Agent: N/A
  • Cluster Agent: N/A

Describe your test plan

  1. Apply a DatadogPodAutoscaler with the new options field:
apiVersion: datadoghq.com/v1alpha2
kind: DatadogPodAutoscaler
metadata:
  name: test-dpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  owner: Local
  options:
    outOfMemory:
      bumpUpRatio: "1.2"
  2. Verify the resource is accepted by the API server
  3. Verify kubectl get dpa test-dpa -o yaml shows the options field correctly

Checklist

  • PR has at least one valid label: bug, enhancement, refactoring, documentation, tooling, and/or dependencies
  • PR has a milestone or the qa/skip-qa label

@codecov-commenter

codecov-commenter commented Dec 16, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 38.09%. Comparing base (8518b87) to head (12c7132).

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #2412   +/-   ##
=======================================
  Coverage   38.09%   38.09%           
=======================================
  Files         299      299           
  Lines       25182    25182           
=======================================
  Hits         9594     9594           
  Misses      14853    14853           
  Partials      735      735           
Flag Coverage Δ
unittests 38.09% <ø> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data


// requests:
// minAllowed:
// maxAllowed:
// recommendationOptions:
Contributor

I'm wondering whether we should include "recommendations" in the name: this isn't really tied to recommendations, and technically it could be applied in-cluster on top of them.

What do you think about

options | settings:
  outOfMemory:
    increaseFactor: 20% (same syntax as `Deployment.MaxUnavailable`?)

Member Author

That works for me; I updated the PR with that hierarchy. However, I kept the bumpUpRatio name and the multiplier form for familiarity (with VPA) and to reduce confusion, but I can switch to percentages.

Contributor

Makes sense to align with VPA then.
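For reference, the two notations discussed in this thread describe the same quantity: an increase of 20% corresponds to a multiplier of 1.2. A small sketch of that conversion follows; percentToRatio is a hypothetical helper written for this illustration, not something the operator provides, and the two-decimal output format is an arbitrary choice.

```go
package main

import (
	"fmt"
	"math/big"
	"strings"
)

// percentToRatio converts an increase such as "20%" into the equivalent
// multiplier, formatted with two decimals (for example "1.20").
// Hypothetical helper for illustration only.
func percentToRatio(percent string) (string, error) {
	trimmed := strings.TrimSuffix(strings.TrimSpace(percent), "%")
	p, ok := new(big.Rat).SetString(trimmed)
	if !ok {
		return "", fmt.Errorf("invalid percentage %q", percent)
	}
	// ratio = 1 + p/100, computed exactly with rationals.
	fraction := new(big.Rat).Quo(p, big.NewRat(100, 1))
	ratio := new(big.Rat).Add(big.NewRat(1, 1), fraction)
	return ratio.FloatString(2), nil
}

func main() {
	ratio, err := percentToRatio("20%")
	if err != nil {
		panic(err)
	}
	fmt.Println(ratio)
}
```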

@nickbirnberg nickbirnberg force-pushed the nickbirnberg/CASCL-628-support-memory-scale-up-factor branch from 5c5a0e9 to 3093276 Compare January 14, 2026 19:06
@nickbirnberg nickbirnberg added the enhancement and qa/skip-qa labels Jan 14, 2026
@nickbirnberg nickbirnberg marked this pull request as ready for review January 14, 2026 20:31
@nickbirnberg nickbirnberg requested review from a team as code owners January 14, 2026 20:31
@nickbirnberg
Member Author

/merge

@gh-worker-devflow-routing-ef8351

gh-worker-devflow-routing-ef8351 bot commented Jan 15, 2026

View all feedbacks in Devflow UI.

2026-01-15 14:13:06 UTC ℹ️ Start processing command /merge


2026-01-15 14:13:22 UTC ℹ️ MergeQueue: waiting for PR to be ready

This pull request is not mergeable according to GitHub. Common reasons include pending required checks, missing approvals, or merge conflicts — but it could also be blocked by other repository rules or settings.
It will be added to the queue as soon as checks pass and/or get approvals. View in MergeQueue UI.
Note: if you pushed new commits since the last approval, you may need additional approval.
You can remove it from the waiting list with /remove command.


2026-01-15 14:29:42 UTC ℹ️ MergeQueue: merge request added to the queue

The expected merge time in main is approximately 1h (p90).


2026-01-15 15:36:33 UTC ℹ️ MergeQueue: This merge request was merged

@dd-mergequeue dd-mergequeue bot merged commit 670fad6 into main Jan 15, 2026
46 checks passed
@dd-mergequeue dd-mergequeue bot deleted the nickbirnberg/CASCL-628-support-memory-scale-up-factor branch January 15, 2026 15:36