Framework internal-fault model

Gap capability: the runtime distinguishes two classes of in-cycle fault — a recoverable fault contained at the task boundary, and a non-recoverable violation of an internal dispatch invariant that fails fast.

Feature: Framework internal-fault model FEAT_0024
status: open
satisfies: FEAT_0010
is satisfied by: REQ_0123, REQ_0124, REQ_0125
is implemented by: BB_0094

The runtime distinguishes two classes of in-cycle fault and handles them oppositely. A recoverable fault (a user item returning an error, panicking, or overrunning its deadline) is contained at the task boundary, leaving sibling tasks and the process running: an error or panic is surfaced as an item error, and a deadline overrun as a Cycle-overrun fault primitive (FEAT_0018) fault transition. A non-recoverable fault is a violation of an internal dispatch invariant (lock poisoning, ready-ring overflow, broken in-degree accounting) and means the executor’s own state is unsound; the runtime fails fast rather than execute further logic over corrupt state. This feature is the runtime realisation of Internal fault detection an... (AFSR_0004) for the panic case.

Requirement: Framework-invariant violation triggers fail-fast REQ_0123
status: draft
satisfies: FEAT_0024
is refined by: ADR_0065
is implemented by: IMPL_0085
is verified by: TEST_0823, TEST_0824

Any panic that escapes the per-item catch_unwind boundary — i.e. a panic originating in framework dispatch machinery rather than in a user item’s execute — shall be treated as a non-recoverable internal-invariant violation. The runtime shall not swallow such a panic and shall not attempt to continue or resume dispatch.

On such a violation the runtime shall, in order: (1) invoke a user-registered fatal handler (see User-registered fatal handler (REQ_0125)) on a best-effort, time-bounded basis; then (2) call std::process::abort. Because abort runs no destructors, the output safe-state guarantee rests entirely on the external fieldbus watchdog (Output-slave watchdog enabl... (AOU_0016)), not on any runtime code executing after the violation. The fail-fast boundary is realised at every runtime thread top — the pool worker loop, the inline-mode submit path, and the executor dispatch thread’s run loop — since a user-item panic is already converted to an error below this boundary and can never reach it. See Abort on framework-invarian... (ADR_0065) for the rationale and the documented failure model; this requirement refines the internal-fault-propagation obligation of Internal fault detection an... (AFSR_0004).

The containment carve-out of User-item panic is containe... (REQ_0124) covers only a user item’s execute. Panics raised in framework-invoked user callbacks that run outside that inner catch, namely Observer and ExecutionMonitor methods (e.g. on_app_error, post_execute), escape to this boundary and therefore trigger the fail-fast path. Integrators shall treat those callbacks as non-panicking.
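
The ordering required above (handler first, then abort, never resume) can be illustrated with a minimal sketch. All names here (thread_top, run_loop, fatal_handler, terminate) are hypothetical and not taken from the runtime. The terminate parameter stands in for std::process::abort so the ordering can be exercised without killing the process, and the time bound on the handler is omitted.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical guard wrapped around the top of a runtime thread.
/// `terminate` is an assumption made for testability; a real thread top
/// would pass `|| std::process::abort()`.
pub fn thread_top<R, H, T>(run_loop: R, fatal_handler: H, terminate: T)
where
    R: FnOnce(),
    H: FnOnce(),
    T: FnOnce(),
{
    // A panic that reaches this point escaped the per-item catch, so it is
    // treated as a framework-invariant violation.
    if panic::catch_unwind(AssertUnwindSafe(run_loop)).is_err() {
        // Step 1: best-effort handler call. The handler is itself
        // catch-guarded, so a panic inside it is discarded.
        let _ = panic::catch_unwind(AssertUnwindSafe(fatal_handler));
        // Step 2: terminate. Dispatch is never resumed.
        terminate();
    }
}
```

In this sketch a run loop that returns normally reaches neither the handler nor the terminate step.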

Requirement: User-item panic is contained, not a fail-fast REQ_0124
status: implemented
satisfies: FEAT_0024
is implemented by: IMPL_0086
is verified by: TEST_0825
links outgoing: IMPL_0086, TEST_0825

A panic originating in a user item’s execute shall be caught and converted to an ItemError (PanickedTask). The error shall be surfaced to the configured Observer via on_app_error and propagated as the item’s error result, stopping downstream items in its enclosing chain or DAG per Abort propagation (REQ_0022). This shall happen without aborting the process, without invoking the fatal handler of User-registered fatal handler (REQ_0125), and without affecting independent sibling tasks (which continue to be dispatched on subsequent cycles).

A panicking item does not transition the task to the Faulted state of Cycle-overrun fault primitive (FEAT_0018); that state is reserved for deadline-budget breaches (Per-task overrun fault tran... (REQ_0070)). Containment here means the panic is reified as a normal item error, not escalated to the framework fail-fast path. This contained-panic behaviour is load-bearing for the task-isolation guarantee of Internal fault detection an... (AFSR_0004) and shall not regress to the fail-fast path of Framework-invariant violati... (REQ_0123).
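
A minimal sketch of this containment, assuming a simplified item signature: the function run_item and the shape of the ItemError enum are illustrative guesses, and only the names ItemError and PanickedTask come from the text above. Observer notification via on_app_error is left out.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Illustrative error type; the real ItemError may look different.
#[derive(Debug, PartialEq)]
pub enum ItemError {
    /// The item's execute returned an error on its own.
    Failed(String),
    /// The item's execute panicked; the message is kept when it is textual.
    PanickedTask(String),
}

/// Hypothetical per-item boundary: runs `execute` and reifies a panic as a
/// normal item error instead of letting it unwind into the dispatcher.
pub fn run_item<F>(execute: F) -> Result<(), ItemError>
where
    F: FnOnce() -> Result<(), ItemError>,
{
    match panic::catch_unwind(AssertUnwindSafe(execute)) {
        // No panic: pass the item's own result through unchanged.
        Ok(result) => result,
        // Panic: convert the payload into an error value.
        Err(payload) => {
            let msg = payload
                .downcast_ref::<&str>()
                .map(|s| (*s).to_string())
                .or_else(|| payload.downcast_ref::<String>().cloned())
                .unwrap_or_else(|| "non-string panic payload".to_string());
            Err(ItemError::PanickedTask(msg))
        }
    }
}
```

Because the panic comes back as an ordinary Err value, the caller can stop the enclosing chain or DAG while independent tasks remain untouched.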

Requirement: User-registered fatal handler REQ_0125
status: draft
satisfies: FEAT_0024
is implemented by: IMPL_0085
is verified by: TEST_0823
links outgoing: BB_0094, IMPL_0085, TEST_0823

The runtime shall accept an optional fatal handler, registered at Executor::build time, invoked once on the fail-fast path of Framework-invariant violati... (REQ_0123) immediately before std::process::abort. The default handler is a no-op. The handler contract, which the runtime shall document and enforce, is: it runs over known-unsound executor state and therefore must not access executor internals; it is time-bounded; and a panic raised inside the handler shall route directly to abort (the handler is itself catch-guarded). Its intended use is a narrow last-gasp action, such as driving a hardware safe-state output or flushing a black-box recorder, not recovery.
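
One possible way to make the handler both time-bounded and catch-guarded is to run it on a helper thread and wait with a timeout. This is a sketch under assumptions, not the runtime's mechanism: run_fatal_handler, HandlerOutcome, and the budget parameter are hypothetical, and the source does not state how the time bound is enforced or how long it is.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Hypothetical outcome type; the caller aborts in every case.
#[derive(Debug, PartialEq)]
pub enum HandlerOutcome {
    Completed,
    Panicked,
    TimedOut,
}

/// Runs `handler` on a helper thread and waits at most `budget` for it.
pub fn run_fatal_handler<H>(handler: H, budget: Duration) -> HandlerOutcome
where
    H: FnOnce() + Send + 'static,
{
    let (done_tx, done_rx) = mpsc::channel::<()>();
    // The helper thread owns the sender. If the handler panics, the sender
    // is dropped during unwinding and the receiver sees a disconnect.
    thread::spawn(move || {
        handler();
        let _ = done_tx.send(());
    });
    match done_rx.recv_timeout(budget) {
        Ok(()) => HandlerOutcome::Completed,
        Err(RecvTimeoutError::Disconnected) => HandlerOutcome::Panicked,
        Err(RecvTimeoutError::Timeout) => HandlerOutcome::TimedOut,
    }
}
```

The outcome is informational only: on the fail-fast path the caller would proceed to std::process::abort whichever variant is returned. A handler that overruns its budget is simply left behind on its thread, which the abort then ends.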