1 change: 1 addition & 0 deletions docs/make.jl
@@ -23,6 +23,7 @@ open(joinpath(@__DIR__, "src", "index.md"), "w") do io
for line in eachline(joinpath(dirname(@__DIR__), "README.md"))
println(io, line)
end

for line in eachline(joinpath(@__DIR__, "src", "reproducibility.md"))
println(io, line)
end
12 changes: 11 additions & 1 deletion docs/pages.jl
@@ -1,3 +1,13 @@
# Put in a separate page so it can be used by SciMLDocs.jl

pages = ["Home" => "index.md", "tutorials.md", "api.md"]
pages = [
"Home" => "index.md",
"Tutorials" => "tutorials.md",
"Derivatives" => "derivatives.md",
"Gradients" => "gradients.md",
"Jacobians" => "jacobians.md",
"Hessians" => "hessians.md",
"Jacobian-Vector Products" => "jvp.md",
"Step Size Selection" => "epsilons.md",
"Internal Utilities" => "utilities.md"
]
58 changes: 0 additions & 58 deletions docs/src/api.md

This file was deleted.

26 changes: 26 additions & 0 deletions docs/src/derivatives.md
@@ -0,0 +1,26 @@
# Derivatives

Functions for computing derivatives of scalar-valued functions.

## Overview

Derivatives are computed for scalar→scalar maps `f(x)` where `x` can be a single point or a collection of points. The derivative functions support:

- **Forward differences**: 2 function evaluations per point (1 if `f(x)` is already known), `O(h)` accuracy
- **Central differences**: 2 function evaluations per point, `O(h²)` accuracy
- **Complex step**: 1 function evaluation per point, machine precision accuracy

For optimal performance with repeated computations, use the cached versions with `DerivativeCache`.
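
For example (a minimal out-of-place sketch; `Val(:central)` is the default, and `Val(:forward)`/`Val(:complex)` select the other methods):

```julia
using FiniteDiff

# Derivative of sin at x = 1.0, central differences by default
d_central = FiniteDiff.finite_difference_derivative(sin, 1.0)

# Forward-difference and complex-step variants
d_forward = FiniteDiff.finite_difference_derivative(sin, 1.0, Val(:forward))
d_complex = FiniteDiff.finite_difference_derivative(sin, 1.0, Val(:complex))

d_central ≈ cos(1.0)   # true up to O(h²) error
```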

## Functions

```@docs
FiniteDiff.finite_difference_derivative
FiniteDiff.finite_difference_derivative!
```

## Cache

```@docs
FiniteDiff.DerivativeCache
```
71 changes: 71 additions & 0 deletions docs/src/epsilons.md
@@ -0,0 +1,71 @@
# Step Size Selection (Epsilons)

Functions and theory for computing optimal step sizes in finite difference approximations.

## Theory

The choice of step size (epsilon) in finite difference methods is critical for accuracy. Too large a step leads to truncation error, while too small a step leads to round-off error. The optimal step size balances these two sources of error.

### Error Analysis

For a function `f` with bounded derivatives, the total error in finite difference approximations consists of:

1. **Truncation Error**: Comes from the finite difference approximation itself
- Forward differences: `O(h)` where `h` is the step size
- Central differences: `O(h²)`
- Hessian central differences: `O(h²)` for second derivatives

2. **Round-off Error**: Comes from floating-point arithmetic
- Forward differences: `O(eps/h)` where `eps` is machine epsilon
- Central differences: `O(eps/h)`

### Optimal Step Sizes

Minimizing the total error `truncation + round-off` gives optimal step sizes (concrete `Float64` values are sketched after this list):

- **Forward differences**: `h* = sqrt(eps)` - balances `O(h)` truncation with `O(eps/h)` round-off
- **Central differences**: `h* = eps^(1/3)` - balances `O(h²)` truncation with `O(eps/h)` round-off
- **Hessian central**: `h* = eps^(1/4)` - balances `O(h²)` truncation for mixed derivatives
- **Complex step**: `h* = eps` - no subtractive cancellation, only limited by machine precision
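
Plugging in machine epsilon for `Float64` gives concrete values (plain Julia arithmetic, not FiniteDiff internals):

```julia
eps64 = eps(Float64)         # ≈ 2.22e-16

h_forward = sqrt(eps64)      # ≈ 1.49e-8   (forward differences)
h_central = cbrt(eps64)      # ≈ 6.06e-6   (central differences)
h_hessian = eps64^(1 / 4)    # ≈ 1.22e-4   (Hessian central differences)
h_complex = eps64            # ≈ 2.22e-16  (complex step)
```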

## Adaptive Step Sizing

The step size computation uses both relative and absolute components:

```julia
epsilon = max(relstep * abs(x), absstep) * dir
```

This ensures:
- **Large values**: Use relative step `relstep * |x|` for scale-invariant accuracy
- **Small values**: Use absolute step `absstep` to avoid underflow
- **Direction**: Multiply by `dir` (±1) for forward differences

## Implementation

The step size computation is handled by internal functions:

- **`compute_epsilon(fdtype, x, relstep, absstep, dir)`**: Computes the actual step size for a given finite difference method and input value
- **`default_relstep(fdtype, T)`**: Returns the optimal relative step size for a given method and numeric type

These functions are called automatically by all finite difference routines, but understanding their behavior can help with custom implementations or debugging numerical issues.
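
A hedged sketch of how they fit together (these helpers are internal and unexported, so their exact signatures may change between versions):

```julia
using FiniteDiff

# Optimal relative step for a central difference on Float64 (≈ cbrt(eps(Float64)))
relstep = FiniteDiff.default_relstep(Val(:central), Float64)

# Step actually taken at x = 100.0 for a forward difference in the +1 direction:
# epsilon = max(relstep * abs(x), absstep) * dir
relstep_fwd = FiniteDiff.default_relstep(Val(:forward), Float64)
epsilon = FiniteDiff.compute_epsilon(Val(:forward), 100.0, relstep_fwd, relstep_fwd, 1)
```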

## Special Cases

### Complex Step Differentiation

For complex step differentiation, the step size is simply machine epsilon, since this method avoids subtractive cancellation entirely.

⚠️ **Important**: The function `f` must be complex analytic when the input is complex!

### Sparse Jacobians

When computing sparse Jacobians with graph coloring, the step size is computed based on the norm of the perturbation vector to ensure balanced accuracy across all columns in the same color group.

## Practical Considerations

- **Default step sizes** are optimal for most smooth functions
- **Custom step sizes** may be needed for functions with unusual scaling or near-discontinuities
- **Relative steps** should scale with the magnitude of the input
- **Absolute steps** provide a fallback for inputs near zero
- **Direction parameter** allows for one-sided differences when needed (e.g., at domain boundaries)
38 changes: 38 additions & 0 deletions docs/src/gradients.md
@@ -0,0 +1,38 @@
# Gradients

Functions for computing gradients of scalar-valued functions with respect to vector inputs.

## Function Types

Gradients support two types of function mappings:

- **Vector→scalar**: `f(x)` where `x` is a vector and `f` returns a scalar (the usual gradient)
- **Scalar→vector**: `x` is a scalar and `f` returns a vector; use `f(fx, x)` for in-place evaluation or `fx = f(x)` for out-of-place

## Performance Notes

- **Forward differences**: `n + 1` function evaluations (`n` if `fx` is already cached), `O(h)` accuracy
- **Central differences**: `2n` function evaluations, `O(h²)` accuracy
- **Complex step**: `n` function evaluations, machine precision accuracy

## Cache Management

When using `GradientCache` with pre-computed function values (a usage sketch follows this list):

- If you provide `fx`, then `fx` will be used in forward differencing to skip a function call
- You must update `cache.fx` before each call to `finite_difference_gradient!`
- For immutable types (scalars, `StaticArray`), use `@set` from [Setfield.jl](https://github.com/jw3126/Setfield.jl)
- Consider aliasing existing arrays into the cache for memory efficiency
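
A short sketch of both calling styles, following the constructors documented below (the quadratic objective is only illustrative):

```julia
using FiniteDiff

f(x) = sum(abs2, x)                # vector → scalar
x  = rand(5)
df = zeros(5)

# Out-of-place, central differences by default
g = FiniteDiff.finite_difference_gradient(f, x)

# Cached, in-place version for repeated evaluations
cache = FiniteDiff.GradientCache(df, x)
FiniteDiff.finite_difference_gradient!(df, f, x, cache)

df ≈ 2 .* x                        # gradient of sum(abs2, x) is 2x
```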

## Functions

```@docs
FiniteDiff.finite_difference_gradient
FiniteDiff.finite_difference_gradient!
```

## Cache

```@docs
FiniteDiff.GradientCache
```
45 changes: 45 additions & 0 deletions docs/src/hessians.md
@@ -0,0 +1,45 @@
# Hessians

Functions for computing Hessian matrices of scalar-valued functions.

## Function Requirements

Hessian functions are designed for scalar-valued functions `f(x)` where:

- `x` is a vector of parameters
- `f(x)` returns a scalar value
- The Hessian `H[i,j] = ∂²f/(∂x[i]∂x[j])` is automatically symmetrized

## Mathematical Background

For a scalar function `f: ℝⁿ → ℝ`, the Hessian central difference approximation is:

```
H[i,j] ≈ (f(x + eᵢhᵢ + eⱼhⱼ) - f(x + eᵢhᵢ - eⱼhⱼ) - f(x - eᵢhᵢ + eⱼhⱼ) + f(x - eᵢhᵢ - eⱼhⱼ)) / (4hᵢhⱼ)
```

where `eᵢ` is the i-th unit vector and `hᵢ` is the step size in dimension i.
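
For intuition, the stencil can be checked directly in plain Julia against a function whose mixed derivative is known (this is just the formula above, not the library's internal implementation):

```julia
# f(x) = x₁² x₂ has ∂²f/∂x₁∂x₂ = 2x₁
f(x) = x[1]^2 * x[2]
x = [1.5, -0.7]
h = eps(Float64)^(1 / 4)           # optimal step for second derivatives
e1, e2 = [1.0, 0.0], [0.0, 1.0]

H12 = (f(x + h * e1 + h * e2) - f(x + h * e1 - h * e2) -
       f(x - h * e1 + h * e2) + f(x - h * e1 - h * e2)) / (4h^2)

H12 ≈ 2 * x[1]                     # ≈ 3.0, accurate to O(h²)
```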

## Performance Considerations

- **Complexity**: Requires `O(n²)` function evaluations for an n-dimensional input
- **Accuracy**: Central differences provide `O(h²)` accuracy for second derivatives
- **Memory**: The result is returned as a `Symmetric` matrix view
- **Alternative**: For large problems, consider finite-differencing an analytical or AD gradient instead (the Hessian is the Jacobian of the gradient), which avoids the `O(n²)` evaluations of `f`

## StaticArrays Support

The cache constructor automatically detects `StaticArray` types and adjusts the `inplace` parameter accordingly for optimal performance.
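
A minimal usage sketch (the API is documented below; the quadratic objective is illustrative):

```julia
using FiniteDiff

q(x) = x[1]^2 + 3x[1] * x[2] + 2x[2]^2
x = [1.0, 2.0]

# Out-of-place: returns a Symmetric matrix
H = FiniteDiff.finite_difference_hessian(q, x)

# Cached, in-place version
cache = FiniteDiff.HessianCache(x)
Hbuf = zeros(2, 2)
FiniteDiff.finite_difference_hessian!(Hbuf, q, x, cache)

H ≈ [2.0 3.0; 3.0 4.0]             # analytical Hessian
```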

## Functions

```@docs
FiniteDiff.finite_difference_hessian
FiniteDiff.finite_difference_hessian!
```

## Cache

```@docs
FiniteDiff.HessianCache
```
41 changes: 41 additions & 0 deletions docs/src/jacobians.md
@@ -0,0 +1,41 @@
# Jacobians

Functions for computing Jacobian matrices of vector-valued functions.

## Function Types

Jacobians support the following function signatures:

- **Out-of-place**: `fx = f(x)` where both `x` and `fx` are vectors
- **In-place**: `f!(fx, x)` where `f!` modifies `fx` in-place

## Sparse Jacobians

FiniteDiff.jl provides efficient sparse Jacobian computation using graph coloring (a hedged usage sketch follows this list):

- Pass a `colorvec` of matrix colors to enable column compression
- Provide `sparsity` as a sparse or structured matrix (`Tridiagonal`, `Banded`, etc.)
- Structured and sparse matrix types are supported through the ArrayInterfaceCore.jl interface, which supplies their structural nonzero patterns
- Results are automatically decompressed unless `sparsity=nothing`
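
A hedged sketch of the coloring interface, assuming the `colorvec`/`sparsity` keywords of `JacobianCache` documented below (the tridiagonal map and the hand-written 3-coloring are only illustrative):

```julia
using FiniteDiff, LinearAlgebra, SparseArrays

# In-place map with a tridiagonal Jacobian pattern
function f!(fx, x)
    n = length(x)
    fx[1] = -2x[1] + x[2]
    for i in 2:n-1
        fx[i] = x[i-1] - 2x[i] + x[i+1]
    end
    fx[n] = x[n-1] - 2x[n]
    return nothing
end

n = 10
x = rand(n)
sparsity = sparse(Tridiagonal(ones(n - 1), -2ones(n), ones(n - 1)))
colorvec = [mod1(i, 3) for i in 1:n]    # valid 3-coloring of a tridiagonal pattern

cache = FiniteDiff.JacobianCache(x, colorvec = colorvec, sparsity = sparsity)
J = copy(sparsity)
FiniteDiff.finite_difference_jacobian!(J, f!, x, cache)
```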

## Performance Notes

- **Forward differences**: `n + 1` function evaluations (`n` if `fx` is cached), `O(h)` accuracy
- **Central differences**: `2n` function evaluations, `O(h²)` accuracy
- **Complex step**: `n` function evaluations, machine precision accuracy
- **Sparse Jacobians**: Use graph coloring to reduce function evaluations significantly

For non-square Jacobians, specify the output vector `fx` when creating the cache to ensure proper sizing.
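
A minimal dense sketch of the two calling styles (the residual functions here are only illustrative):

```julia
using FiniteDiff

# In-place residual r!(fx, x)
function r!(fx, x)
    fx[1] = x[1]^2 + x[2]
    fx[2] = sin(x[2]) - x[1]
    return nothing
end

x  = [1.0, 2.0]
fx = zeros(2)

# Cached, in-place Jacobian (forward differences by default)
cache = FiniteDiff.JacobianCache(x, fx)
J = zeros(2, 2)
FiniteDiff.finite_difference_jacobian!(J, r!, x, cache)

# Out-of-place variant for fx = f(x)
g(x) = [x[1]^2 + x[2], sin(x[2]) - x[1]]
Jg = FiniteDiff.finite_difference_jacobian(g, x)
```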

## Functions

```@docs
FiniteDiff.finite_difference_jacobian
FiniteDiff.finite_difference_jacobian!
```

## Cache

```@docs
FiniteDiff.JacobianCache
```
48 changes: 48 additions & 0 deletions docs/src/jvp.md
@@ -0,0 +1,48 @@
# Jacobian-Vector Products (JVP)

Functions for computing Jacobian-vector products efficiently without forming the full Jacobian matrix.

## Mathematical Background

The JVP computes `J(x) * v` where `J(x)` is the Jacobian of function `f` at point `x` and `v` is a direction vector. This is computed using finite difference approximations:

- **Forward**: `J(x) * v ≈ (f(x + h*v) - f(x)) / h`
- **Central**: `J(x) * v ≈ (f(x + h*v) - f(x - h*v)) / (2h)`

where `h` is the step size and `v` is the direction vector.
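
A hedged out-of-place sketch, assuming the `(f, x, v)` argument order of the docstrings listed below (the function is illustrative):

```julia
using FiniteDiff

f(x) = [x[1]^2 + x[2], x[1] * x[2]]
x = [1.0, 2.0]
v = [1.0, 0.0]                      # direction vector

# J(x) * v without forming J; forward differences by default
jvp = FiniteDiff.finite_difference_jvp(f, x, v)

# Analytical check: J = [2x₁ 1; x₂ x₁], so J*v = [2x₁, x₂] = [2.0, 2.0]
isapprox(jvp, [2.0, 2.0]; atol = 1e-6)   # true up to O(h) forward-difference error
```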

## Performance Benefits

JVP functions are particularly efficient when you only need directional derivatives:

- **Function evaluations**: Only 2 function evaluations (vs `O(n)` for full Jacobian)
- **Forward differences**: 2 function evaluations, `O(h)` accuracy
- **Central differences**: 2 function evaluations, `O(h²)` accuracy
- **Memory efficient**: No need to store the full Jacobian matrix

## Use Cases

JVP is particularly useful for:

- **Optimization**: Computing directional derivatives along search directions
- **Sparse directions**: When `v` has few non-zero entries
- **Memory constraints**: Avoiding storage of large Jacobian matrices
- **Newton–Krylov methods**: Solving the Newton system `J * s = -F(x)` with iterative linear solvers that only require `J * v` products

## Limitations

- **Complex step**: JVP does not currently support complex step differentiation (`Val(:complex)`)
- **In-place functions**: For in-place function evaluation, ensure proper cache sizing

## Functions

```@docs
FiniteDiff.finite_difference_jvp
FiniteDiff.finite_difference_jvp!
```

## Cache

```@docs
FiniteDiff.JVPCache
```