28 changes: 21 additions & 7 deletions .claude/skills/new-monitoring-feature/SKILL.md
@@ -21,11 +21,22 @@ There is one skill per narrow concern. This one is the wiring map.

## 0. Register the `Layer` — the feature's entry point

A `Layer` is how OAP slices services / instances / endpoints by data source. **Every new feature needs a new `Layer` enum value.** The UI, storage partitioning, menu navigation, and OAL aggregation all key off it.
A `Layer` is how OAP slices services / instances / endpoints by data source. The UI, storage partitioning, menu navigation, and OAL aggregation all key off it.

**Only one place to edit** — `oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/analysis/Layer.java`. Add a new enum constant with a unique id and `normal` flag. Ids are never reused; pick the next integer. Examples: `IOS(47, true)`, `APISIX(27, true)`, `VIRTUAL_DATABASE(11, false)` for inferred/non-real services.
`Layer` is a registry-backed value type (no longer a closed enum). Built-in layers are declared as `public static final Layer` constants in `oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/analysis/Layer.java`; external layers are registered through the `Layer.register(name, ordinal, normal)` API at boot. **Pick the registration path that matches the scope of your feature:**

UI template folders are auto-discovered: `UITemplateInitializer.UI_TEMPLATE_FOLDER` is computed from `Layer.values()` + `"custom"` at class-init time. Drop a `ui-initialized-templates/<layer-name-lowercased>/` folder on disk and the initializer picks it up on the next boot. Missing folders are silently skipped. There is no allowlist to append to.
| Your feature ships as | Registration path |
|---|---|
| Part of the OAP distribution (in-tree, the common case for new SkyWalking-supported targets) | Add a `public static final Layer` constant to `Layer.java` with the next sequential ordinal in `0–49`. Examples: `IOS = register("IOS", 47, true)`, `APISIX = register("APISIX", 27, true)`, `VIRTUAL_DATABASE = register("VIRTUAL_DATABASE", 11, false)` for inferred/non-real services. |
| An out-of-tree MAL or LAL rule file | Add a top-level `layerDefinitions:` block to the rule file. The DSL loader funnels each entry through `Layer.register` before compiling the rule. One file ships the layer + the rules that produce its telemetry. |
| An out-of-tree plugin module (jar) | Implement `org.apache.skywalking.oap.server.core.analysis.LayerExtension` and register via `META-INF/services/`. Discovered by `LayerExtensionLoader` during `CoreModuleProvider.prepare()`. |
| Operator-deployed config (no code, no DSL) | Add an entry to `oap-server/server-starter/src/main/resources/layer-extensions.yml` (or override on the OAP node's classpath). |

**Ordinal conventions:** `0–49` is in active use by built-ins. `50–999` is reserved by convention for future built-in layers. External layers are recommended (not required) to start at `>= 1000` to avoid colliding with future built-ins on OAP upgrade. Collisions in either direction are detected at boot via the ordinal-uniqueness check, which fails OAP startup loudly.

**Storage encoding is the ordinal int**, persisted in BanyanDB / Elasticsearch / JDBC. Every OAP node that reads or writes a given layer must agree on its `(name, ordinal)` mapping — deploy `layer-extensions.yml` and any `layerDefinitions:` rule files identically across all nodes. The registry is sealed at the start of `Core.notifyAfterCompleted()`; later registration attempts throw.
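The registration rules above (unique names, unique ordinals, no-op on identical re-registration, rejection after sealing) can be sketched as a small stand-alone registry. This is a minimal illustration, not the OAP implementation: `LayerRegistrySketch` and its fields are hypothetical names, the exception types are assumptions, and treating a differing `normal` flag as a conflict is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the registry behind Layer.register; not the OAP class. */
final class LayerRegistrySketch {
    private final Map<String, Integer> ordinalByName = new HashMap<>();
    private final Map<Integer, String> nameByOrdinal = new HashMap<>();
    private final Map<String, Boolean> normalByName = new HashMap<>();
    private boolean sealed;

    /** Returns true when a new layer was added, false for an identical re-registration. */
    synchronized boolean register(String name, int ordinal, boolean normal) {
        if (sealed) {
            // Assumption: every attempt after sealing is rejected, identical or not.
            throw new IllegalStateException("Layer registry is sealed, rejected: " + name);
        }
        Integer knownOrdinal = ordinalByName.get(name);
        String knownName = nameByOrdinal.get(ordinal);
        // Identical (name, ordinal, normal) triple: nothing to do.
        if (knownOrdinal != null && knownOrdinal == ordinal
                && normalByName.get(name) == normal) {
            return false;
        }
        // Same name with a different ordinal, or same ordinal with a different name.
        if (knownOrdinal != null || knownName != null) {
            throw new IllegalArgumentException(
                "Conflicting layer registration: " + name + "=" + ordinal
                    + " (known ordinal for name: " + knownOrdinal
                    + ", known name for ordinal: " + knownName + ")");
        }
        ordinalByName.put(name, ordinal);
        nameByOrdinal.put(ordinal, name);
        normalByName.put(name, normal);
        return true;
    }

    /** After this call, register(...) always throws. */
    synchronized void seal() {
        sealed = true;
    }
}
```

Keeping two indexes (by name and by ordinal) is what lets a collision in either direction be caught at registration time rather than surfacing later as mismatched data in storage.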

**UI template folders are auto-discovered by file scan, not by `Layer.values()`.** `UITemplateInitializer` walks `ui-initialized-templates/**/*.json` recursively (depth 2) and trusts each template's own `configuration.layer` field. Drop a folder of dashboard JSONs on disk and the initializer picks them up on the next boot — folder name is purely organizational.

**Component ID lookup in Java code**: IDs declared in `component-libraries.yml` are loaded at runtime into `ComponentLibraryCatalogService`'s `componentName2Id` map — they are **not** exposed as Java enum constants. To look up by name in listener code, inject the catalog service and resolve once at construction:
```java
@@ -35,11 +46,14 @@ int myComponentId = catalog.getComponentId("My-Component-Name");
```
Cache as an `int` field; runtime comparisons are then plain `componentId == myComponentId`. **Trap:** there is a `ComponentsDefine` class under `skywalking-trace-receiver-plugin/src/test/java/.../mock/ComponentsDefine.java`. It is a test-only mock holding four hand-picked constants (Tomcat, Dubbo, RocketMQ, MongoDB). Do not import or extend it from production code.

Emit the layer from every source object your feature produces:
Emit the layer from every source object your feature produces. Built-in layers have a static-field accessor; external layers are looked up by name through the registry:

```java
service.setLayer(Layer.<YOUR_LAYER>);
serviceInstance.setServiceLayer(Layer.<YOUR_LAYER>);
endpoint.setServiceLayer(Layer.<YOUR_LAYER>);
// Built-in layer (constant)
service.setLayer(Layer.IOS);

// External layer (registered via yaml / SPI / layerDefinitions:)
service.setLayer(Layer.nameOf("IOT_FLEET"));
```

Downstream consumers (the core OAL, the `service ly <LAYER>` swctl query, topology filters, the UI root dashboard's layer selector) all work off this single `Layer` value.
1 change: 1 addition & 0 deletions docs/en/changes/changes.md
@@ -97,6 +97,7 @@
* MAL: add `safeDiv(divisor)` on `SampleFamily` that yields `0` when the divisor is `0` instead of `Infinity`/`NaN`. Replace `/` with `safeDiv(...)` in Envoy AI Gateway latency-average rules so `sum / count * 1000` no longer produces dropped or out-of-range samples when a counter is zero in a window.
* Fix: `envoy-ai-gateway` metrics rules, make the metrics value return `0` when the divisor is `0`.
* Fix: LAL compiler treated `(tag("x") as Integer) + (tag("y") as Integer)` as string concatenation instead of numeric addition. Expressions like `input_tokens + output_tokens < 10000` produced the concatenated string `"2589115"` rather than the integer sum `2704`, so token-threshold conditions never triggered `abort {}`. The compiler now detects all-numeric operands (cast to `Integer` or `Long`) and emits proper `long` arithmetic.
* Custom `Layer`s can be declared without modifying the OAP source — via an operator-managed `layer-extensions.yml`, inline `layerDefinitions:` block in a MAL or LAL rule file, or a plugin extension. UI dashboard templates for new layers are auto-discovered from the `ui-initialized-templates/` directory. Recommended ordinal range for external layers is `>= 1000`; conflicting names or ordinals are reported at boot.
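The two arithmetic fixes in the entries above can be reproduced in a few lines of plain Java. This is only an illustration of the behaviour described, using the numbers from the changelog; `TokenSumSketch` and its methods are hypothetical and are not the LAL compiler output or the MAL `safeDiv` implementation.

```java
/** Hypothetical illustration of the two changelog fixes; not OAP code. */
final class TokenSumSketch {
    /** What the LAL compiler used to emit: string concatenation of the two tags. */
    static String concatenated(String inputTokens, String outputTokens) {
        return inputTokens + outputTokens;
    }

    /** What it emits now for numeric casts: long arithmetic. */
    static long summed(String inputTokens, String outputTokens) {
        return Long.parseLong(inputTokens) + Long.parseLong(outputTokens);
    }

    /** Division that yields 0 for a zero divisor instead of Infinity or NaN. */
    static double safeDiv(double dividend, double divisor) {
        return divisor == 0 ? 0 : dividend / divisor;
    }
}
```

With the tag values `"2589"` and `"115"`, `concatenated` returns `"2589115"` while `summed` returns `2704`, which is why a `< 10000` threshold only behaves as intended with the numeric form.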

#### UI
* Add mobile menu icon and i18n labels for the iOS layer.
45 changes: 45 additions & 0 deletions docs/en/concepts-and-designs/lal.md
@@ -30,6 +30,51 @@ Use `tag 'key': sourceAttribute("attr")` in the extractor to selectively persist
Layer should be declared in the LAL script to represent the analysis scope of the logs.
LAL rules are routed by layer — only rules matching the incoming log's layer are evaluated.

### Inline layer declarations (`layerDefinitions:`)

A LAL file may declare its own custom layers with a top-level `layerDefinitions:` block.
Each entry is funneled through `Layer.register(name, ordinal, normal)` **before**
the rules in the same file compile, so a LAL file is fully self-describing — a new
monitoring target can land as a single LAL file without an enum edit elsewhere in the OAP
source.

```yaml
layerDefinitions:
  - name: IOT_FLEET     # upper-snake-case, must match [A-Z][A-Z0-9_]*
    ordinal: 1000       # unique across all layers; >= 1000 recommended
    normal: true        # true = agent-installed (default), false = conjectured/virtual

rules:
  - name: iot-fleet-access
    layer: IOT_FLEET
    dsl: |
      filter {
        text { regexp $/(?<status>\d+)\s+(?<path>\S+)/$ }
        sink { sampler { rateLimit { rpm 1800 } } }
      }
```

Notes:
- **Storage encoding is the ordinal int**, persisted in BanyanDB / Elasticsearch / JDBC.
Every OAP node that reads or writes a given layer must agree on its `(name, ordinal)`
mapping — deploy a LAL file with `layerDefinitions:` identically across all nodes.
- **Identical re-registration is a no-op**, so the same `IOT_FLEET` entry can appear in
multiple LAL files (and additionally in a MAL file, in `layer-extensions.yml`, or via the
`LayerExtension` SPI). Conflicting registrations cause OAP boot to fail loudly with the
offending file in the stack trace.
- **Ordinals 0–49** are in active use by the OAP distribution's built-in layers; **50–999**
are reserved by convention for future built-ins. External layers should start at `>= 1000`
— enforcement is not strict, but staying above the reserved band avoids upgrade-time
collisions.
- `layer: auto` works with extension layers too — the extractor body can call
`layer "IOT_FLEET"` and the runtime resolves it through the registry.

Three other registration paths exist for layers that are **not** specific to a LAL file: an
operator-managed `layer-extensions.yml`, a `LayerExtension` Java SPI for plugin jars, and
the built-in static fields in `Layer.java` for distribution layers. See
[`Layer.java`](../../../oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/analysis/Layer.java)
javadoc for the full picture.
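As a rough pre-flight check for a `layerDefinitions:` entry, the name pattern and ordinal bands from the notes above can be expressed as follows. This is a sketch under stated assumptions: `LayerDefinitionCheck` is a hypothetical helper, the band labels are made up for illustration, and OAP itself only recommends (does not enforce) the `>= 1000` range.

```java
import java.util.regex.Pattern;

/** Hypothetical pre-flight helper for layerDefinitions entries; not part of OAP. */
final class LayerDefinitionCheck {
    // Pattern quoted in the YAML comment: upper-snake-case, leading letter.
    private static final Pattern NAME = Pattern.compile("[A-Z][A-Z0-9_]*");

    static boolean isValidName(String name) {
        return name != null && NAME.matcher(name).matches();
    }

    /** Classifies an ordinal into the bands described in the notes. */
    static String ordinalBand(int ordinal) {
        if (ordinal < 0) {
            return "invalid";
        }
        if (ordinal <= 49) {
            return "built-in";   // in active use by the distribution
        }
        if (ordinal <= 999) {
            return "reserved";   // kept free for future built-ins by convention
        }
        return "external";       // recommended range for custom layers
    }
}
```

For the `IOT_FLEET` example, `isValidName("IOT_FLEET")` holds and `ordinalBand(1000)` falls in the external band, so the entry follows both conventions.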

When `layer: auto` is declared, the rule matches logs where `service.layer` is absent (common for OTLP
sources that don't set this attribute). The script is expected to set the layer in the extractor:

38 changes: 38 additions & 0 deletions docs/en/concepts-and-designs/mal.md
@@ -377,6 +377,44 @@ name: <string>
exp: <string>
```

### <layer_definitions>

Optional top-level block for declaring custom layers inline alongside the rules that produce
their telemetry. Each entry is funneled through `Layer.register(name, ordinal, normal)`
**before** the rules in the same file compile, so a MAL file is fully self-describing — a new
monitoring target can land as a single MAL file without an enum edit elsewhere in the OAP source.

```yaml
layerDefinitions:
  - name: IOT_FLEET     # upper-snake-case, must match [A-Z][A-Z0-9_]*
    ordinal: 1000       # unique across all layers; >= 1000 recommended
    normal: true        # true = agent-installed (default), false = conjectured/virtual

metricsRules:
  - name: device_battery_percentage
    exp: iot_device_battery_level.tagAverage(['service'], ['host'])
expSuffix: instance(['host'], ['service'], Layer.nameOf('IOT_FLEET'))
```

Notes:
- **Storage encoding is the ordinal int**, persisted in BanyanDB / Elasticsearch / JDBC. Every
OAP node that reads or writes a given layer must agree on its `(name, ordinal)` mapping —
deploy a MAL file with `layerDefinitions:` identically across all nodes.
- **Identical re-registration is a no-op**, so the same `IOT_FLEET` entry can appear in multiple
MAL files (and additionally in a LAL file, in `layer-extensions.yml`, or via the
`LayerExtension` SPI). Conflicting registrations (same name with different ordinal, or same
ordinal with different name) cause OAP boot to fail loudly with the offending file in the
stack trace.
- **Ordinals 0–49** are in active use by the OAP distribution's built-in layers; **50–999** are
reserved by convention for future built-ins. External layers should start at `>= 1000` —
enforcement is not strict, but staying above the reserved band avoids upgrade-time collisions.

Three other registration paths exist for layers that are **not** specific to a MAL file: an
operator-managed `layer-extensions.yml`, a `LayerExtension` Java SPI for plugin jars, and the
built-in static fields in `Layer.java` for distribution layers. See
[`Layer.java`](../../../oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/analysis/Layer.java)
javadoc for the full picture.
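Because the stored value is the ordinal, a deployment check that compares the `(name, ordinal)` mappings of two nodes can catch drift before it reaches storage. The sketch below assumes each node's mapping has already been collected into a plain map; `LayerMappingDiff` is a hypothetical helper, and how the maps are gathered (config files, an admin endpoint) is left open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical deployment check: do two nodes agree on every layer ordinal? */
final class LayerMappingDiff {
    /** Returns one line per layer name whose ordinal differs or is missing on a node. */
    static List<String> disagreements(Map<String, Integer> nodeA, Map<String, Integer> nodeB) {
        // Sorted union of names so the report order is stable.
        Set<String> names = new TreeSet<>(nodeA.keySet());
        names.addAll(nodeB.keySet());

        List<String> report = new ArrayList<>();
        for (String name : names) {
            Integer a = nodeA.get(name);
            Integer b = nodeB.get(name);
            if (a == null || !a.equals(b)) {
                report.add(name + ": " + a + " vs " + b);
            }
        }
        return report;
    }
}
```

An empty result means both nodes would encode and decode every listed layer identically; any line in the report points at a rule file or `layer-extensions.yml` that was not deployed uniformly.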

## More Examples

Please refer to [OAP Self-Observability](../../../oap-server/server-starter/src/main/resources/otel-rules/oap.yaml).
Original file line number Diff line number Diff line change
@@ -23,6 +23,7 @@
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.skywalking.oap.server.core.analysis.Layer;

/**
* Code generation for LAL block-level structures: {@code extractor},
@@ -265,7 +266,11 @@ private static void generateFieldToOutput(
sb.append(", \"")
.append(LALCodegenHelper.escapeJava(field.getFormatPattern()))
.append("\")");
} else if (paramType.isEnum()) {
} else if (paramType.isEnum() || paramType == Layer.class) {
// `Layer` was historically an enum and is now a registry-backed value type with a
// matching `valueOf(String)`; both flow through the same `Type.valueOf(string)`
// codegen so LAL extractors written as `layer "MYSQL"` still resolve to the typed
// setter argument.
sb.append(paramType.getName()).append(".valueOf(");
LALValueCodegen.generateCastedValueAccess(sb, field.getValue(), "String", genCtx);
sb.append(")");
Original file line number Diff line number Diff line change
@@ -33,6 +33,8 @@
import java.util.Map;
import lombok.Data;
import lombok.extern.slf4j.Slf4j;
import org.apache.skywalking.oap.server.core.UnexpectedException;
import org.apache.skywalking.oap.server.core.analysis.LayerDefinition;
import org.apache.skywalking.oap.server.core.rule.ext.RuleSetMerger;
import org.apache.skywalking.oap.server.library.module.ModuleManager;
import org.apache.skywalking.oap.server.library.module.ModuleStartException;
@@ -48,6 +50,12 @@
@Slf4j
public class LALConfigs {
private List<LALConfig> rules;
/**
* Optional inline layer registrations. When present, each entry is registered through
* {@code Layer.register(...)} before the rules in this file are compiled, so a
* LAL file is self-describing for any custom layers it references.
*/
private List<LayerDefinition> layerDefinitions;

public static List<LALConfigs> load(final String path, final List<String> files) throws Exception {
return loadInternal(path, files, null, /* useInstalledManager= */ true);
@@ -127,6 +135,7 @@ private static List<LALConfigs> loadInternal(final String path, final List<Strin
if (configs == null || configs.getRules() == null) {
continue;
}
registerInlineLayers(ruleName, configs);
// sourceFileName is only present for entries that came from disk; resolver-
// only rules synthesise a name so diagnostics still print something.
final String src = sourceFileName.getOrDefault(ruleName, ruleName + ".yaml");
@@ -141,4 +150,25 @@ private static List<LALConfigs> loadInternal(final String path, final List<Strin
throw new ModuleStartException("Failed to load LAL config rules", e);
}
}

    /**
     * Funnel any inline {@code layerDefinitions:} entries through {@code Layer.register}.
     * Conflict checks (reserved-range, name uniqueness, ordinal uniqueness, sealed-state) live
     * in {@code Layer.register}; failures here surface with the offending rule name in
     * the stack trace, which is enough for an operator to find the bad file.
     */
    private static void registerInlineLayers(final String ruleName, final LALConfigs configs) {
        final List<LayerDefinition> defs = configs.getLayerDefinitions();
        if (defs == null || defs.isEmpty()) {
            return;
        }
        for (final LayerDefinition def : defs) {
            try {
                def.register();
            } catch (RuntimeException e) {
                throw new UnexpectedException(
                    "LAL rule " + ruleName + " layerDefinitions entry rejected: " + def, e);
            }
        }
    }
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/

package org.apache.skywalking.oap.log.analyzer.v2.provider;

import java.util.Collections;
import java.util.List;
import org.apache.skywalking.oap.server.core.analysis.Layer;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNotSame;

/**
 * Verifies that a LAL rule file declaring a top-level {@code layerDefinitions:} block has
 * each entry funneled through {@link Layer#register} before the rules compile, so the LAL
 * file is self-describing for any custom layers it references.
 *
 * <p>Owns ordinals 1300–1309. Does not call {@link Layer#seal()} — the registry is a
 * process-wide singleton and sealing here would taint sibling tests.
 */
class LALConfigsLayerDefinitionsTest {

    @Test
    void inlineLayerDefinitionsRegisterAndRulesParse() throws Exception {
        final List<LALConfigs> configs = LALConfigs.load(
            "test-lal-with-layer-defs",
            Collections.singletonList("test-rule"),
            null
        );

        assertEquals(1, configs.size(), "Expected exactly one LALConfigs loaded from the fixture");

        // Inline layerDefinitions: produced two registry entries before the LAL DSL was
        // compiled.
        final Layer a = Layer.nameOf("TEST_LAL_LAYER_A");
        assertNotSame(Layer.UNDEFINED, a);
        assertEquals(1300, a.value());
        assertEquals(true, a.isNormal());

        final Layer b = Layer.nameOf("TEST_LAL_LAYER_B");
        assertNotSame(Layer.UNDEFINED, b);
        assertEquals(1301, b.value());
        assertEquals(false, b.isNormal());

        // The rule list survived the layerDefinitions parse.
        final LALConfigs single = configs.get(0);
        assertEquals(1, single.getRules().size());
        assertEquals("TEST_LAL_LAYER_A", single.getRules().get(0).getLayer());
        assertEquals(2, single.getLayerDefinitions().size());
    }

    @Test
    void reloadingTheSameFileIsIdempotent() throws Exception {
        LALConfigs.load("test-lal-with-layer-defs", Collections.singletonList("test-rule"), null);
        LALConfigs.load("test-lal-with-layer-defs", Collections.singletonList("test-rule"), null);

        assertEquals(1300, Layer.nameOf("TEST_LAL_LAYER_A").value());
    }
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Fixture for LALConfigsLayerDefinitionsTest. Owns ordinals 1300-1309.
# Verifies that a top-level `layerDefinitions:` block at the head of a LAL rule file is
# funneled through Layer.register() before the rules compile.

layerDefinitions:
  - name: TEST_LAL_LAYER_A
    ordinal: 1300
    normal: true
  - name: TEST_LAL_LAYER_B
    ordinal: 1301
    normal: false

rules:
  - name: dummy-rule
    layer: TEST_LAL_LAYER_A
    dsl: |
      filter {
        sink { sampler { rateLimit { rpm 10 } } }
      }