Skip to content

New Controller Interface#2899

Closed
termi-official wants to merge 63 commits intoSciML:masterfrom
termi-official:do/controller-refactor
Closed

New Controller Interface#2899
termi-official wants to merge 63 commits intoSciML:masterfrom
termi-official:do/controller-refactor

Conversation

@termi-official
Copy link
Contributor

@termi-official termi-official commented Oct 26, 2025

Resolves #2883 .

TODO

  • Upgrade non-standard controllers to new interface
  • Introduce multi-controller for algorithms with stiffness switching
  • Add appropriate documentation of the full API

Not addressed in this PR / Remaining things to be addressed

  • Reformulate PI controller to remove inverses (break semantics)
  • Remove redundant variables from integrator (breaking change)
  • Move controller cache into algorithm cache whenever appropriate (and maybe add the controller to the algorithm)
  • Remove q from step_accept_controller!(integrator, controller_cache::AbstractControllerCache, q)
  • Remove DummyController
  • Update DelayDiffEq and friends to use the new interface

erracc::QT
dtacc::tType
# TODO ^^^ remove these
controller_cache::CC
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at some of the algorithms, we even might want to think moving the controller into the algorithm cache together with some helpers to construct standard controllers.

@termi-official termi-official marked this pull request as ready for review October 27, 2025 18:25
@oscardssmith
Copy link
Member

@ChrisRackauckas what would you think about removing the Legacy controllers (the renamed version of the current ones). I don't think this version is less breaking, and the new controllers aren't breaking (except for people making their own controllers which I don't think is common)

@termi-official
Copy link
Contributor Author

[..] Legacy controllers (the renamed version of the current ones).

Please note that I have changed the names back. They now differ only by an intermediate supertype. So PIController <: AbstractLegacyController is the old one and NewPIController <: AbstractController is the new one.

@ChrisRackauckas
Copy link
Member

@ChrisRackauckas what would you think about removing the Legacy controllers (the renamed version of the current ones). I don't think this version is less breaking, and the new controllers aren't breaking (except for people making their own controllers which I don't think is common)

I think it's best to just have a deprecation path. v7 is coming soon and we just delete them with that. So if it's not too hard to have a deprecation path that would definitely help people with the upgrading. But yes the main user is likely @ranocha and the Trixi crew, I at least don't see any use cases publicly in the wild.

@ChrisRackauckas
Copy link
Member

While doing this refactor, maybe it would be good to add the vector of EEst to the controller? Right now we usually call this atmp, but it's not uniform across the methods. Basically, it would be good to make that vector of all error estimates be something that is uniformly handled, not just the norm(vector_EEst). The reason is I'm going to want to set it up with logging really soon, because knowing which value is causing step rejections is really helpful for knowing why a solver haulted. But in the current form you'd need to implement that separately per solver. It would be good to make sure this can be more nicely handled in the new controller interface, as that would then make it straightforward to add logging about error control in the controller interface rather than at per-solver level.

@termi-official
Copy link
Contributor Author

termi-official commented Oct 30, 2025

Should this also go into this PR or a separate one? If I understand you correctly, then we would like to have changes of the form

@cache mutable struct ImplicitEulerCache{
    uType, rateType, uNoUnitsType, N, AV, StepLimiter} <:
                      SDIRKMutableCache
    u::uType
    uprev::uType
    uprev2::uType
    fsalfirst::rateType
    atmp::uNoUnitsType # Remove this variable here
    nlsolver::N
    algebraic_vars::AV
    step_limiter!::StepLimiter
end

and analogue

mutable struct IControllerCache{C, T, uNoUnityType} <: AbstractControllerCache
    controller::C
    q::T
    dtreject::T
    # atmp::uNoUnityType
    # I believe this should go here or in the algorithm cache, but not in the integrator itself.
    # EEst::T
end

together with the respective change in perform_step! and the constant cache?

Also, I think it makes sense to move the controller cache into the algorithm cache and the controller should be part of the algorithm constructor. E.g.

struct ImplicitEuler{CS, AD, F, F2, P, CT, FDT, ST, CJ, StepLimiter} <:
       OrdinaryDiffEqNewtonAdaptiveAlgorithm{CS, AD, FDT, ST, CJ}
    linsolve::F
    nlsolve::F2
    precs::P
    extrapolant::Symbol
    controller::CT # -> was "controller::Symbol" before
    step_limiter!::StepLimiter
    autodiff::AD
end

which defaults to e.g. PIController, while we get for the caches

@cache mutable struct ImplicitEulerCache{
    uType, rateType, uNoUnitsType, N, AV, StepLimiter, CC} <:
                      SDIRKMutableCache
    u::uType
    uprev::uType
    uprev2::uType
    fsalfirst::rateType
    atmp::uNoUnitsType
    nlsolver::N
    algebraic_vars::AV
    step_limiter!::StepLimiter
    controller_cache::CC
end

This could help to simplify the init logic a little bit, as we hand down the initialization of the controller to the algorithm -- and ultimately the algorithm knows best if the passed controller is compatible or not.

@ChrisRackauckas
Copy link
Member

Rebase

@termi-official
Copy link
Contributor Author

There is some stuff failing which I need to check first.

@termi-official termi-official force-pushed the do/controller-refactor branch from 1f64a43 to 90a9148 Compare January 5, 2026 17:46
@termi-official
Copy link
Contributor Author

Fixing the DiffEqDevTools.jl/Core/1.10 failure is a bit tougher without breaking anything. From my understanding we need the controller to be fully differentiable somewhere, but the type of Q seems to be not necessarily a dual number.

Furthermore, I cannot figure out how to fix downgrade CI, as I need to bump and register the Core package first, such that the new types become visible in the downstream packages.

@oscardssmith
Copy link
Member

downgrade CI isn't your fault

@termi-official
Copy link
Contributor Author

For the default and composite algorithms the post_newton_controller! is never called when a BDF algorithm is at turn.

@termi-official
Copy link
Contributor Author

Before someone spends too much time here, I need to push a few more commits to get CI on track again. I do not handle all corner cases in the default alg correctly right now.

@termi-official
Copy link
Contributor Author

termi-official commented Jan 6, 2026

Okay, the default algorithm tests are on track. The remaining failures (which are not already addressed by open PRs) can be grouped as follows:

  1. For all CIs on pre, they fail with a JET compat issue, so I assume it is unrelated
  2. Most CIs on lts fail because they do not pick up the correct package version. They consistently fail with the new types not being found in OrdinaryDiffEqCore.
  3. I suspect the Downstream CI fails, because there is more adjustment necessary outside of this PR (MethodError: no method matching Float64(::Measurements.Measurement{Float64}) during EEst assignment).
  4. Then there is one more test in the InterfaceI group which I have absolutely now clue why it is failing, but it consistently fails for Julia 1.12 only, i.e. 1.11 and 1.10 work fine. An MWE is here.
using OrdinaryDiffEq, Test, LinearAlgebra

T = 100.0;
Ttr = 0.0;
d0 = 1.0e-9;
threshold = 10^4 * d0;
dt = 0.1;
diff_eq_kwargs = Dict();

@inbounds function eom_lorenz!(du, u, p, t)
    σ = 10.0
    ρ = 28.0
    β = 8 / 3
    du[1] = σ * (u[2] - u[1])
    du[2] = u[1] *- u[3]) - u[2]
    du[3] = u[1] * u[2] - β * u[3]
end

prob = ODEProblem(eom_lorenz!, [0.0, 10.0, 0], (zero(T), T))
integ1 = init(prob, Tsit5(); beta1 = 0.14, diff_eq_kwargs...)
@. prob.u0 = prob.u0 + d0 / sqrt(3)
integ2 = init(prob, Tsit5(); beta1 = 0.14, diff_eq_kwargs...)
integ1.opts.advance_to_tstop = true
integ2.opts.advance_to_tstop = true

# Confirm that the old controller is used
@info integ1.opts.controller

dist = d0
λ = zero(eltype(integ1.u))
i = 0
tvector = dt:dt:T

for τ in tvector
    # evolve until rescaling:
    push!(integ1.opts.tstops, τ)
    @info integ1.u
    step!(integ1)
    @info integ1.u
    push!(integ2.opts.tstops, τ)
    @info integ2.u
    step!(integ2)
    @info integ2.u
    global dist = norm(integ1.u .- integ2.u)
    @info dist
    # Rescale:
    if dist  threshold
        # Rescale and reset everything:
        integ2.u .= integ1.u #.+ (integ2.u .- integ1.u)./a
        u_modified!(integ2, true)
        set_proposed_dt!(integ2, integ1)
        break
    end
end

τ = tvector[end]
push!(integ1.opts.tstops, τ);
step!(integ1);
push!(integ2.opts.tstops, τ);
step!(integ2);
@test dist = norm(integ1.u .- integ2.u) == 0

Is a stripped down example of the full test using the "old" controllers from master.

@oscardssmith
Copy link
Member

@ChrisRackauckas can you set claude on 4? I can confirm that it fails on master.

@termi-official
Copy link
Contributor Author

Commit 554ebc4 reveals that the integration test on chaotic systems (test/interface/umodified_test.jl) can also sporadically break on master, as the test relied on both integrators hitting the boundary of the control regieme in the last step, such that both controllers have qoldinit valued old errors. Which does not happen super rarely on the lorentz system, so we got away most of the time.

There are two more types of test failures which I do not know how to address.

  1. I see several permission denied errors sporadically appearing on random packages which I did not touch. These are not triggered on the latest master CI run.
  2. Some LTS CI runners does not pick up all changes correctly. I have changes to Core introducing new types, which are accessed on Extrapolation, LowOrderRK, FIRK, and ImplicitDiscreteSolve (which fail), as well as in LowStorageRK, and BDF (which are green).

Regarding 2 I see the following difference in the CI.

  • LowStorageRK resolves [1dea7af3] OrdinaryDiffEq v6.105.0 ~/github-runners/deepsea4-17/_work/OrdinaryDiffEq.jl/OrdinaryDiffEq.jl
  • LowOrderRK resolves [1dea7af3] OrdinaryDiffEq v6.105.0 ~/work/OrdinaryDiffEq.jl/OrdinaryDiffEq.jl

@termi-official
Copy link
Contributor Author

Before merging this, please wait until the Trixi group have a chance to check again that I really broke nothing (cc @ranocha )

@termi-official termi-official marked this pull request as draft January 14, 2026 18:19
@termi-official
Copy link
Contributor Author

The lts failure is related to Julia 1.10 not using the sources and hence picking up the wrong package versions (latest release version vs latest commit). Downgrade CI fails on master, too, and I cannot figure out why. So, if no other CI fails, this should be good to go from my end.

@ChrisRackauckas
Copy link
Member

The lts failure is related to Julia 1.10 not using the sources and hence picking up the wrong package versions (latest release version vs latest commit).

So it's a breaking change?

@termi-official
Copy link
Contributor Author

termi-official commented Jan 23, 2026

The lts failure is related to Julia 1.10 not using the sources and hence picking up the wrong package versions (latest release version vs latest commit).

So it's a breaking change?

No, this is purely related to how the CI is setup in Github. Right now, the CI relies on the sources feature which was introduced in Pkg 1.11 (https://github.com/JuliaLang/Pkg.jl/blob/master/CHANGELOG.md#pkg-v111-release-notes). The lts CI uses Julia 1.10, which ignores the section in the corresponding Project.toml files and therefore the runner picks up the latest release of the subpackages instead of the current commit, which then leads to the failure of

  • (OrdinaryDiffEqFIRK, lts)
  • (OrdinaryDiffEqExtrapolation, lts)
  • (InterfaceI, lts)
  • (ImplicitDiscreteSolve, lts)

The remaining lts tests do not even cover the PR, as the version of the latest commit is not picked up anyway.

The failure of (AD, 1.11) is real unfortunately, so let me fix that before merging.

@termi-official
Copy link
Contributor Author

The AD test has been fixed in c6b065e . The remaining failures are the same oulined above.

@ChrisRackauckas
Copy link
Member

No, this is purely related to how the CI is setup in Github. Right now, the CI relies on the sources feature which was introduced in Pkg 1.11 (https://github.com/JuliaLang/Pkg.jl/blob/master/CHANGELOG.md#pkg-v111-release-notes). The lts CI uses Julia 1.10, which ignores the section in the corresponding Project.toml files and therefore the runner picks up the latest release of the subpackages instead of the current commit, which then leads to the failure of

That is irrelevant, that just means it's testing the break. If the solver libraries require the new version of core then it's a breaking one? Is there no way to make this have a deprecation path?

@termi-official
Copy link
Contributor Author

Oh, I see. Thanks for pointing this out. Right now FIRK, IDS, and Extrapolation depend on the master of Core due to the newly introduced controller base types. Let me see, I should be able to resolve this with some compile-time tricks.

@termi-official
Copy link
Contributor Author

I start getting quite a few errors of the following kind in the CI setup:

Error: Unable to locate executable file: julia. Please verify either the file path exists or the file can be found within a directory specified by the PATH environment variable. Also check the file mode to verify the file is executable.

uuid = "bbf590c4-e513-4bbe-9b18-05decba2e5d8"
authors = ["ParamThakkar123 <paramthakkar864@gmail.com>"]
version = "3.2.0"
version = "3.3.0"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChrisRackauckas Just that this does not go unnoticed, I need to bump the version to make the CI work. See below for more details.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's kind of funky and doesn't seem to work so well... instead just make an OrdinaryDiffEqCore v4?

Comment on lines +32 to +41
@static if Base.pkgversion(OrdinaryDiffEqCore) >= v"3.3"
@eval begin
import OrdinaryDiffEqCore: AbstractLegacyController, default_controller_v7,
legacy_default_controller, setup_controller_cache, AbstractControllerCache
end
else
@eval begin
import OrdinaryDiffEqCore: default_controller
end
end
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only sane way I found to make this PR truely non-breaking. I used this trick in all packages downstream of Core which are touched by this PR. For OrdinaryDiffEq v7 we can remove these static switches.

Comment on lines +106 to +115
# FIXME AitkenNeville is missing integration with the extrapolation controller and picks up the PI controller instead.
function default_controller(
alg::AitkenNeville,
cache,
qoldinit, _beta1 = nothing, _beta2 = nothing
)
QT = typeof(qoldinit)
beta1, beta2 = _digest_beta1_beta2(alg, cache, Val(QT), _beta1, _beta2)
return PIController(beta1, beta2)
end
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added this here to preserve the behavior of this algorithm as on master. However, if I am not mistaken Aitken-Neville should also use the extrapolation controller, right? Should I include the fix in this PR or should I preserve the behavior for now?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh that should probably be an extraoplation controller. Thanks, yeah can you do a clean up PR? I was just trying to make sure everything got green and missed this.

Copy link
Contributor Author

@termi-official termi-official Feb 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will do! Okay, not straight forward to fix for me, as I am not into the theory. See #3029 .

ChrisRackauckas added a commit that referenced this pull request Feb 2, 2026
…factor

Fix deprecation path for controller refactor (PR #2899)
@ChrisRackauckas
Copy link
Member

Merged in #3018

@termi-official termi-official deleted the do/controller-refactor branch February 2, 2026 12:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Interaction between controller api and time loop

4 participants