# Troubleshoot missed Schedule Actions

> Diagnose why a Schedule did not fire by alerting on the missed catchup window metric, then narrow down to the affected Schedule with ListSchedules and DescribeSchedule.

When a [Schedule](/schedule) does not start a Workflow Execution at its expected time, either the Action was skipped intentionally (the Schedule was paused, the Overlap Policy applied, or the end time was reached) or the Temporal Service could not take the Action within the [Catchup Window](/schedule#catchup-window). This guide covers the second case.

## Alert on missed catchup window

The Temporal Service emits a counter each time it skips a scheduled Action because it could not run it within the configured Catchup Window. Because the counter is cumulative, alert on any non-zero increase over the evaluation interval rather than on the raw value.

### Temporal Cloud

Alert on [`temporal_cloud_v1_schedule_missed_catchup_window_count`](/cloud/metrics/openmetrics/metrics-reference#temporal_cloud_v1_schedule_missed_catchup_window_count) grouped by `temporal_namespace`.

Example PromQL:

```
sum by (temporal_namespace) (
  increase(temporal_cloud_v1_schedule_missed_catchup_window_count[5m])
) > 0
```

### Self-hosted

Alert on [`schedule_missed_catchup_window`](/references/cluster-metrics#schedule_missed_catchup_window) grouped by `namespace`.

Example PromQL:

```
sum by (namespace) (
  increase(schedule_missed_catchup_window[5m])
) > 0
```

Both the Temporal Cloud and the self-hosted metric are scoped to the Namespace, not to individual Schedules. A non-zero increase tells you that at least one Schedule in the Namespace missed an Action, but not which one.
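
If you want to feed the alerting Namespaces into a script rather than a dashboard, the following minimal sketch runs the same query through the Prometheus HTTP API. It assumes the metrics are stored in a Prometheus-compatible server that you can reach over HTTP. The function names and the base URL you pass in are illustrative and are not part of Temporal.

```python
import json
import urllib.parse
import urllib.request

# Self-hosted query from the example above, written on one line.
QUERY = "sum by (namespace) (increase(schedule_missed_catchup_window[5m])) > 0"


def alerting_namespaces(response, label="namespace"):
    """Map each Namespace in an instant-query response to its sample value."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    found = {}
    for sample in response["data"]["result"]:
        # An instant-vector sample looks like
        # {"metric": {<labels>}, "value": [<timestamp>, "<number as string>"]}.
        name = sample["metric"].get(label, "<unlabeled>")
        found[name] = float(sample["value"][1])
    return found


def query_prometheus(base_url, query=QUERY):
    """Run an instant query against the Prometheus HTTP API (/api/v1/query)."""
    params = urllib.parse.urlencode({"query": query})
    url = f"{base_url.rstrip('/')}/api/v1/query?{params}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

For example, `alerting_namespaces(query_prometheus("http://localhost:9090"))` returns a dictionary keyed by Namespace, where the address is a placeholder for your own server. For Temporal Cloud, pass the Cloud query and `label="temporal_namespace"`.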

## Investigate which Schedule missed an Action

Once the alert fires, narrow your search down to the affected Schedule in two steps.

### 1. List Schedules in the Namespace

Enumerate the Schedules in the alerting Namespace:

```
temporal schedule list --namespace <your-namespace>
```

[`ListSchedules`](/cli/command-reference/schedule#list) returns Schedule Ids and summary information. It does not return per-Schedule miss counters, so use it only to produce the set of Schedule Ids to inspect.

### 2. Describe each Schedule

For each Schedule Id returned, run:

```
temporal schedule describe \
  --schedule-id <your-schedule-id> \
  --namespace <your-namespace>
```

[`DescribeSchedule`](/cli/command-reference/schedule#describe) returns full Schedule state, including the `info` block with cumulative counters. The relevant fields:

| Field | Meaning |
|-------|---------|
| `missedCatchupWindow` | Actions skipped because they could not run within the Catchup Window. Non-zero here identifies the Schedule responsible for the alert. |
| `overlapSkipped` | Actions skipped because the previous run was still in progress and the Overlap Policy is `Skip`. |
| `bufferDropped` | Buffered Actions dropped because the buffer was full under `BufferOne` or `BufferAll`. |
| `bufferSize` | Current depth of the Action buffer. |
| `recentActions` | Most recent Action times and results. |
| `runningWorkflows` | Workflow Executions currently running for this Schedule. |

Scripting the fan-out against the JSON output (`temporal schedule describe -o json`) is usually faster than inspecting each Schedule interactively.
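
One way to script the fan-out is sketched below. It assumes the `temporal` CLI is on your `PATH`, that `temporal schedule list` accepts the same `-o json` flag as `describe`, that each list entry exposes a `scheduleId` field, and that the counter appears at `info.missedCatchupWindow`. Verify these names against the output of your CLI version before relying on the script; the function names are illustrative.

```python
import json
import subprocess


def parse_cli_json(text):
    """Parse output that is either one JSON document or one JSON object per line."""
    text = text.strip()
    if not text:
        return []
    try:
        doc = json.loads(text)
        return doc if isinstance(doc, list) else [doc]
    except json.JSONDecodeError:
        # Fall back to JSON Lines: one object per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]


def missed_count(description):
    """Read info.missedCatchupWindow; the value may arrive as a string or a number."""
    return int(description.get("info", {}).get("missedCatchupWindow", 0) or 0)


def run_cli(*args):
    """Run the temporal CLI and return its standard output."""
    done = subprocess.run(["temporal", *args], check=True,
                          capture_output=True, text=True)
    return done.stdout


def schedules_with_misses(namespace, run=run_cli):
    """Return {schedule_id: missed_count} for Schedules with a non-zero counter."""
    listing = parse_cli_json(
        run("schedule", "list", "--namespace", namespace, "-o", "json"))
    misses = {}
    for entry in listing:
        schedule_id = entry["scheduleId"]  # assumed field name
        described = parse_cli_json(
            run("schedule", "describe", "--schedule-id", schedule_id,
                "--namespace", namespace, "-o", "json"))
        if not described:
            continue
        count = missed_count(described[0])
        if count:
            misses[schedule_id] = count
    return misses
```

Calling `schedules_with_misses("<your-namespace>")` issues one `describe` call per Schedule, so expect the run time to grow with the number of Schedules in the Namespace. The `run` parameter exists so that you can substitute a stub when testing the parsing logic without a live Service.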

## Interpret the result

Once you have identified the Schedule with a non-zero `missedCatchupWindow`, use the rest of the `DescribeSchedule` output to determine impact and root cause.

### Assess impact

- Compare `recentActions` to the Schedule's Spec to determine how many Actions were skipped and over what time period.
- If the Schedule uses the `Skip` Overlap Policy and the preceding run was long-running, the miss may reflect that run exceeding the Catchup Window, not a Service outage.
- For business-critical Schedules, [Backfill](/schedule#backfill) the skipped interval once the underlying cause is resolved.
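
The comparison in the first bullet can be sketched as a set difference between the times the Spec should have produced and the times that were recorded. The sketch below assumes a simple fixed-interval Spec aligned to the start of the window and assumes you have already extracted the nominal Action times from `recentActions` as `datetime` values. Calendar or cron-style Specs need their own expansion logic, and `recentActions` holds only the most recent Actions, so keep the window short.

```python
from datetime import datetime, timedelta


def expected_times(start, end, interval):
    """Nominal Action times in [start, end] for a fixed-interval Spec."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    times = []
    current = start
    while current <= end:
        times.append(current)
        current += interval
    return times


def skipped_times(start, end, interval, actual):
    """Nominal times in the window that have no matching recorded Action."""
    seen = set(actual)
    return [t for t in expected_times(start, end, interval) if t not in seen]
```

The length of the returned list is the number of skipped Actions in the window, and its first and last entries bound the interval you may want to Backfill.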

### Common root causes

- **Service or Namespace outage longer than the Catchup Window.** The default Catchup Window is one year, so a miss typically means the Schedule is configured with a tighter window (minimum ten seconds) and the outage exceeded it.
- **Namespace rate limiting.** If scheduled starts are throttled, Actions can queue past the Catchup Window. Cross-check [`temporal_cloud_v1_schedule_rate_limited_count`](/cloud/metrics/openmetrics/metrics-reference#temporal_cloud_v1_schedule_rate_limited_count) (Cloud) or [`schedule_rate_limited`](/references/cluster-metrics#schedule_rate_limited) (self-hosted) in the same time range.
- **Buffer overruns under `BufferAll`.** Long-running Workflow Executions under `BufferAll` can push buffered Actions past the Catchup Window. Cross-check [`temporal_cloud_v1_schedule_buffer_overruns_count`](/cloud/metrics/openmetrics/metrics-reference#temporal_cloud_v1_schedule_buffer_overruns_count) (Cloud) or [`schedule_buffer_overruns`](/references/cluster-metrics#schedule_buffer_overruns) (self-hosted) and examine `bufferSize`.
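
To reason about the first cause, it can help to model which Actions survive an outage. The sketch below is a simplified model of the rule described in this guide, not the Service's implementation: an Action whose nominal time is older than the Catchup Window when the Service recovers is counted as missed, and the rest fire late. How the Service treats an Action exactly on the boundary is an assumption here.

```python
from datetime import datetime, timedelta


def classify_after_outage(nominal_times, recovered_at, catchup_window):
    """Split overdue Actions into (late, missed) at the moment of recovery."""
    late, missed = [], []
    for nominal in nominal_times:
        if nominal > recovered_at:
            continue  # not due yet, so unaffected by the outage
        delay = recovered_at - nominal
        # Assumption: a delay equal to the window still counts as in time.
        if delay <= catchup_window:
            late.append(nominal)
        else:
            missed.append(nominal)
    return late, missed
```

Running the model with a wider `catchup_window` moves entries from `missed` to `late`, which illustrates the trade-off of widening the window: fewer misses, but more late Actions firing during recovery.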

### Remediate

- Widen the Catchup Window if the current value is tighter than your Service's worst-case unavailability. The trade-off is that more late Actions will fire during recovery.
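- Revisit the Overlap Policy if runs routinely exceed the Spec interval. Under sustained delay, `Skip` drops overlapping Actions (counted in `overlapSkipped`), while `BufferAll` queues them and can drop them once the buffer is full (counted in `bufferDropped`).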
- Increase Namespace throughput limits if rate limiting is the contributing factor.
- [Backfill](/schedule#backfill) the missed interval if the skipped Actions need to run.

## Related reading

- [Schedule concept](/schedule)
- [Catchup Window](/schedule#catchup-window)
- [Temporal CLI schedule reference](/cli/command-reference/schedule)
- [Temporal Cloud OpenMetrics metrics reference](/cloud/metrics/openmetrics/metrics-reference)
- [Self-hosted cluster metrics reference](/references/cluster-metrics)
