AMReX-Astro · BenWibking · Oct 19, 2025 · Oct 19, 2025 · Oct 19, 2025 · Oct 19, 2025
diff --git a/AGENTS.md b/AGENTS.md
@@ -0,0 +1,19 @@
+# Repository Guidelines
+
+## Project Structure & Module Organization
+The repository is organized around modular physics kernels in `EOS/`, `networks/`, `integration/`, and companion directories such as `conductivity/`, `neutrinos/`, `opacity/`, and `rates/`. Shared math utilities and build scripts remain in `util/`, while reusable constants sit in `constants/`. Interfaces consumed by AMReX-based application codes are in `interfaces/`. Unit regression assets reside under `unit_test/`, and Sphinx documentation sources are in `Docs/` with release history tracked in `CHANGES.md`.
+
+## Build, Test, and Development Commands
+Export `AMREX_HOME` and `MICROPHYSICS_HOME` before building so the GNU make stubs locate AMReX sources. Build a standalone unit test in place, e.g. `cd unit_test/burn_cell && make -j4`, which yields `main3d.gnu.ex`. Select physics at compile time with flags like `make NETWORK_DIR=aprox19 EOS_DIR=helmholtz`, and reset a directory via `make clean`.
+
+## Coding Style & Naming Conventions
+C++ sources follow `.editorconfig`: four-space indentation, LF line endings, and tabs only inside Makefiles. Headers keep the uppercase `.H` suffix and implementations use `.cpp`. Favor the existing snake_case routine pattern (`nse_check`, `burn_cell`) and guard autogenerated directories such as `util/autodiff`. Run `.clang-tidy` with `make USE_CLANG_TIDY=TRUE` on significant changes.
+
+## Testing Guidelines
+Each directory beneath `unit_test/` ships a `GNUmakefile`, `inputs_*` controls, and README notes; run the built binary with the matching inputs file (for example `./main3d.gnu.ex inputs_aprox13`). For new coverage, clone an existing `test_*` layout, describe the scenario, and rerun the most relevant case before posting. Capture scratch artifacts via `.gitignore` rather than committing them.
+
+## Commit & Pull Request Guidelines
+Develop on topic branches and open pull requests against `development`; merges to `main` occur monthly. Keep commit subjects concise and imperative, mirroring history such as `add NSE network compatibility docs (#1852)`. Update `CHANGES.md` for user-facing fixes, link issues, and note required AMReX revisions. PRs should summarize physics choices, list tests run, and attach plots or logs when behavior shifts.
+
+## Documentation & Automation
+Sphinx sources in `Docs/` back the published guide; update them with code changes and ensure `make -C Docs html` finishes cleanly. GitHub Actions publishes the result and exercises unit tests, so keep workflows green by limiting optional dependencies and recording new Python requirements in `requirements.txt`.
diff --git a/integration/BackwardEuler/be_integrator.H b/integration/BackwardEuler/be_integrator.H
@@ -2,6 +2,7 @@
 #define BE_INTEGRATOR_H
 
 #include <AMReX_Algorithm.H>
+#include <limits>
 
 #include <be_type.H>
 #include <network.H>
@@ -11,7 +12,6 @@
 #endif
 #include <burn_type.H>
 #include <linpack.H>
-#include <numerical_jacobian.H>
 #ifdef STRANG
 #include <integrator_rhs_strang.H>
 #endif
@@ -31,6 +31,7 @@ AMREX_GPU_HOST_DEVICE AMREX_FORCE_INLINE
 int single_step (BurnT& state, BeT& be, const amrex::Real dt)
 {
     constexpr int int_neqs = integrator_neqs<BurnT>();
+    constexpr amrex::Real uround = std::numeric_limits<amrex::Real>::epsilon();
 
     int ierr = IERR_SUCCESS;
     bool converged = false;
@@ -88,10 +89,41 @@ int single_step (BurnT& state, BeT& be, const amrex::Real dt)
         if (be.jacobian_type == 1) {
             jac(be.t, state, be, be.jac);
         } else {
-            jac_info_t jac_info;
-            jac_info.h = dt;
-            numerical_jac(state, jac_info, be.jac);
-            be.n_rhs += (NumSpec+1);
+            // Build a numerical Jacobian by differencing the wrapped RHS.
+            // This keeps the Jacobian consistent with whatever variables
+            // the integrator is actually evolving (X or number densities).
+            amrex::Real fac = 0.0_rt;
+            for (int i = 1; i <= int_neqs; ++i) {
+                const amrex::Real wt = (i <= NumSpec) ?
+                    (integrator_rp::rtol_spec * std::abs(be.y(i)) + integrator_rp::atol_spec) :
+                    (integrator_rp::rtol_enuc * std::abs(be.y(i)) + integrator_rp::atol_enuc);
+                fac += (ydot(i) / wt) * (ydot(i) / wt);
+            }
+            fac = std::sqrt(fac / int_neqs);
+
+            amrex::Real r0 = 1000.0_rt * std::abs(dt) * uround * int_neqs * fac;
+            if (r0 == 0.0_rt) {
+                r0 = 1.0_rt;
+            }
+
+            constexpr bool in_jacobian = true;
+            amrex::Array1D<amrex::Real, 1, int_neqs> ydot_pert;
+            for (int j = 1; j <= int_neqs; ++j) {
+                const amrex::Real yj = be.y(j);
+                const amrex::Real wt = (j <= NumSpec) ?
+                    (integrator_rp::rtol_spec * std::abs(yj) + integrator_rp::atol_spec) :
+                    (integrator_rp::rtol_enuc * std::abs(yj) + integrator_rp::atol_enuc);
+                const amrex::Real r = amrex::max(std::sqrt(uround) * std::abs(yj), r0 * wt);
+
+                be.y(j) += r;
+                rhs(be.t, state, be, ydot_pert, in_jacobian);
+                for (int i = 1; i <= int_neqs; ++i) {
+                    be.jac(i, j) = (ydot_pert(i) - ydot(i)) / r;
+                }
+                be.y(j) = yj;
+            }
+
+            be.n_rhs += int_neqs;
         }
 
         be.n_jac++;

diff --git a/integration/VODE/vode_dvnlsd.H b/integration/VODE/vode_dvnlsd.H
@@ -148,7 +148,7 @@ amrex::Real dvnlsd (int& NFLAG, BurnT& state, DvodeT& vstate)
             // sometime VODE goes way off tangent.  If these mass fractions
             // are really bad, then let's just bail now
 
-            if (integrator_rp::do_corrector_validation && !integrator_rp::use_number_densities) {
+            if (integrator_rp::do_corrector_validation) {
 #ifdef SDC
                 const amrex::Real rho_current = state.rho_orig + vstate.tn * state.ydot_a[SRHO];
                 const amrex::Real thresh = species_failure_tolerance * rho_current;

diff --git a/integration/VODE/vode_dvode.H b/integration/VODE/vode_dvode.H
@@ -170,14 +170,18 @@ int dvode (BurnT& state, DvodeT& vstate)
 
            if (vstate.n_step == 0) {
 #ifndef AMREX_USE_GPU
+           if (!state.suppress_failure_output) {
                std::cout << amrex::Font::Bold << amrex::FGColor::Red << "DVODE: too much accuracy requested at start of integration" << amrex::ResetDisplay << std::endl;
+           }
 #endif
                return IERR_TOO_MUCH_ACCURACY_REQUESTED;
            }
 
            // Too much accuracy requested for machine precision.
 #ifndef AMREX_USE_GPU
-           std::cout << "DVODE: too much accuracy requested" << std::endl;
+           if (!state.suppress_failure_output) {
+               std::cout << "DVODE: too much accuracy requested" << std::endl;
+           }
 #endif
            for (int i = 1; i <= int_neqs; ++i) {
                vstate.y(i) = vstate.yh(i,1);
@@ -197,7 +201,9 @@ int dvode (BurnT& state, DvodeT& vstate)
        if (kflag == -1) {
            // Error test failed repeatedly or with ABS(H) = HMIN.
 #ifndef AMREX_USE_GPU
-           std::cout << amrex::Font::Bold << amrex::FGColor::Red << "DVODE: error test failed repeatedly or with abs(H) = HMIN" << amrex::ResetDisplay << std::endl;
+           if (!state.suppress_failure_output) {
+               std::cout << amrex::Font::Bold << amrex::FGColor::Red << "DVODE: error test failed repeatedly or with abs(H) = HMIN" << amrex::ResetDisplay << std::endl;
+           }
 #endif
            // Set Y array, T, and optional output.
            for (int i = 1; i <= int_neqs; ++i) {
@@ -211,7 +217,9 @@ int dvode (BurnT& state, DvodeT& vstate)
        else if (kflag == -2) {
            // Convergence failed repeatedly or with ABS(H) = HMIN.
 #ifndef AMREX_USE_GPU
-           std::cout << amrex::Font::Bold << amrex::FGColor::Red << "DVODE: corrector convergence failed repeatedly or with abs(H) = HMIN" << amrex::ResetDisplay << std::endl;
+           if (!state.suppress_failure_output) {
+               std::cout << amrex::Font::Bold << amrex::FGColor::Red << "DVODE: corrector convergence failed repeatedly or with abs(H) = HMIN" << amrex::ResetDisplay << std::endl;
+           }
 #endif
            // Set Y array, T, and optional output.
            for (int i = 1; i <= int_neqs; ++i) {

diff --git a/integration/integrator_setup_strang.H b/integration/integrator_setup_strang.H
@@ -181,7 +181,7 @@ void integrator_cleanup (IntegratorT& int_state, BurnT& state,
 
     // If we failed, print out the current state of the integration.
 
-    if (! state.success) {
+    if (! state.success && !state.suppress_failure_output) {
 #ifndef AMREX_USE_GPU
         std::cout << amrex::Font::Bold << amrex::FGColor::Red << "[ERROR] integration failed in net" << amrex::ResetDisplay << std::endl;
         std::cout << "istate = " << istate << std::endl;

diff --git a/integration/utils/jacobian_utilities.H b/integration/utils/jacobian_utilities.H
@@ -2,7 +2,8 @@
 #define JACOBIAN_UTILITIES_H
 
 #ifndef AMREX_USE_GPU
-#include <format>
+#include <iomanip>
+#include <iostream>
 #endif
 
 #include <burn_type.H>
@@ -31,20 +32,20 @@ print_jacobian (const T& jac) {
 
     constexpr int int_neqs = integrator_neqs<BurnT>();
 
-    std::cout << std::format("{:9} ", "*");
+    std::cout << std::setw(9) << "*" << " ";
     for (int i = 1; i <= NumSpec; ++i) {
-        std::cout << std::format("{:9} ", short_spec_names_cxx[i-1]);
+        std::cout << std::setw(9) << short_spec_names_cxx[i-1] << " ";
     }
-    std::cout << std::format("{:9} ", "e") << std::endl;
+    std::cout << std::setw(9) << "e" << std::endl;
 
     for (int j = 1; j <= int_neqs; ++j) {
         if (j < int_neqs) {
-            std::cout << std::format("{:5} ", short_spec_names_cxx[j-1]);
+            std::cout << std::setw(5) << short_spec_names_cxx[j-1] << " ";
         } else {
-            std::cout << std::format("{:5} ", "e");
+            std::cout << std::setw(5) << "e" << " ";
         }
         for (int i = 1; i <= int_neqs; ++i) {
-            std::cout << std::format("{:9.3g} ", jac.get(i, j));
+            std::cout << std::setw(9) << std::setprecision(3) << jac.get(i, j) << " ";
         }
         std::cout << std::endl;
     }

diff --git a/integration/utils/numerical_jacobian.H b/integration/utils/numerical_jacobian.H
@@ -59,12 +59,16 @@ void numerical_jac(const BurnT& state, const jac_info_t& jac_info, JacNetArray2D
 
     BurnT state_delp = state;
 
-    // default -- plus convert the dY/dt into dX/dt
+    // default state RHS
 
-    actual_rhs(state, ydotm);
+    actual_rhs(state_delp, ydotm);
 
-    for (int q = 1; q <= NumSpec; q++) {
-        ydotm(q) *= aion[q-1];
+    // We integrate X, not Y, except for primordial chemistry where
+    // we integrate number densities directly.
+    if (! integrator_rp::use_number_densities) {
+        for (int q = 1; q <= NumSpec; q++) {
+            ydotm(q) *= aion[q-1];
+        }
     }
 
     // start by computing |f|, which is the norm of the RHS / weight
@@ -112,10 +116,12 @@ void numerical_jac(const BurnT& state, const jac_info_t& jac_info, JacNetArray2D
 
         actual_rhs(state_delp, ydotp);
 
-        // We integrate X, so convert from the Y we got back from the RHS
-
-        for (int q = 1; q <= NumSpec; q++) {
-            ydotp(q) *= aion[q-1];
+        // We integrate X, not Y, except for primordial chemistry where
+        // we integrate number densities directly.
+        if (! integrator_rp::use_number_densities) {
+            for (int q = 1; q <= NumSpec; q++) {
+                ydotp(q) *= aion[q-1];
+            }
         }
 
         // now fill in all of the rows for this column X_n
@@ -150,8 +156,10 @@ void numerical_jac(const BurnT& state, const jac_info_t& jac_info, JacNetArray2D
 
     actual_rhs(state_delp, ydotp);
 
-    for (int q = 1; q <= NumSpec; q++) {
-        ydotp(q) *= aion[q-1];
+    if (! integrator_rp::use_number_densities) {
+        for (int q = 1; q <= NumSpec; q++) {
+            ydotp(q) *= aion[q-1];
+        }
     }
 
     // first fill just the last column with dy/dT

diff --git a/interfaces/burn_type.H b/interfaces/burn_type.H
@@ -162,6 +162,9 @@ struct burn_t
   // integrator error code
   short error_code{};
 
+  // suppress verbose failure diagnostics (useful for GPU retries)
+  bool suppress_failure_output{};
+
 };
 
 

diff --git a/unit_test/burn_cell_primordial_chem/GNUmakefile b/unit_test/burn_cell_primordial_chem/GNUmakefile
@@ -2,20 +2,21 @@ PRECISION  = DOUBLE
 PROFILE    = FALSE
 
 # Set DEBUG to TRUE if debugging
-DEBUG      = TRUE
+DEBUG      = FALSE
 
 DIM        = 1
 
 COMP       = gnu
+EXTRACXXFLAGS += -Wno-psabi
 
 USE_MPI    = FALSE
-USE_OMP    = FALSE
+USE_OMP    = TRUE
 #set USE_CUDA to TRUE to compile and run on GPUs
 USE_CUDA   = FALSE
 USE_REACT = TRUE
 
 # Set USE_MICROPHYSICS_DEBUG to TRUE if debugging
-USE_MICROPHYSICS_DEBUG = TRUE
+USE_MICROPHYSICS_DEBUG = FALSE
 
 EBASE = main
 
@@ -38,3 +39,18 @@ Bpack   := ./Make.package
 Blocs   := .
 
 include $(MICROPHYSICS_HOME)/unit_test/Make.unit_test
+
+ifeq ($(USE_CUDA),TRUE)
+ifeq ($(shell uname -s),Linux)
+ifeq ($(shell uname -m),aarch64)
+# NVTX3 can trigger GCC outlined atomics on AArch64; make link+codegen robust.
+EXTRACXXFLAGS += -mno-outline-atomics
+NVCC_FLAGS += -Xcompiler -mno-outline-atomics
+override XTRALIBS += -latomic
+endif
+endif
+endif
+
+ifeq ($(shell uname),Darwin)
+AMREX_LINKER := ./post_link_install_name.sh $(AMREX_LINKER)
+endif
diff --git a/unit_test/burn_cell_primordial_chem/README.md b/unit_test/burn_cell_primordial_chem/README.md
@@ -21,3 +21,29 @@
 # continuous integration
 
 The code is built with the `primordial_chem` network and run with `inputs_primordial_chem`.
+
+# reference and compare workflow
+
+Use this two-step workflow to split CPU reference generation from later
+comparison runs.
+
+1. Generate/update the saved CPU reference solution:
+
+   ```bash
+   ./main1d.gnu.ex inputs_primordial_chem
+   ```
+
+   This runs the normal Richardson-enabled path and writes
+   `burn_cell_final_state.txt`.
+
+2. Compare a future run against the saved reference:
+
+   ```bash
+   ./main1d.gnu.ex inputs_primordial_chem --compare-final-state burn_cell_final_state.txt
+   ```
+
+   In compare mode, the code does not run Richardson. It performs one batch
+   run on the active backend (GPU when enabled, otherwise CPU), compares each
+   zone against the saved reference, and reports per-zone `PASS`/`FAIL`.
+   For `T`, `e`, and each species number density, the comparison threshold is
+   `2 * |saved truncation error|` from the reference file.
diff --git a/unit_test/burn_cell_primordial_chem/_parameters b/unit_test/burn_cell_primordial_chem/_parameters
@@ -16,6 +16,18 @@ tfirst        real       0.0
 
 # number of steps for the single zone burn
 nsteps        int        1000
+# print multi-zone burn progress every N timesteps (<= 0 disables)
+burn_progress_interval int 100
+
+# estimate truncation error with three tolerance levels and Richardson extrapolation
+richardson_enable         bool       1
+richardson_tol_factor     real       10.0
+richardson_print_per_zone bool       0
+# non-trace threshold as a fraction of the most abundant species in each zone
+richardson_trace_species_cutoff real 0.0
+
+# compute per-zone Jacobian stiffness diagnostics (expensive on CPU)
+compute_stiffness_diagnostics bool 0
 
 # initial temperature
 temperature   real       1e2