# Application failures

> Learn what application failures are in Temporal, how they differ from platform failures, and how errors propagate between Activities and Workflows.

Temporal handles many types of failures automatically through Durable Execution.
Temporal recovers from Worker crashes, network interruptions, and infrastructure outages without any intervention from your application.
But some failures require your application to detect and respond to them.
Understanding which failures Temporal handles and which ones your application must handle is fundamental to building reliable Temporal applications.

## Platform failures vs application failures 

Failures fall into two categories based on where they are detected and mitigated: platform failures and application failures.

### Platform failures

Platform failures occur due to issues with the infrastructure: server outages, network interruptions, Worker crashes, or other environmental factors outside of your application's control.
Temporal's Durable Execution handles these failures transparently.
When a Worker crashes mid-execution, another Worker picks up the work and continues from where it left off.
Your application code does not need to account for these failures.

Platform failures are resolved through **forward recovery**: the system retries the failed operation, and if the retry succeeds, the application continues from the point of failure without undoing any previous work.
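
As a rough illustration of forward recovery, the sketch below retries an operation until it succeeds and then continues, without undoing anything. It is a plain-Python model, not Temporal SDK code; `run_with_forward_recovery` and `flaky_fetch` are hypothetical names introduced only for this example.

```python
# Illustrative model only: none of these names come from a Temporal SDK.

def run_with_forward_recovery(operation, max_attempts):
    """Retry `operation` until it succeeds or the attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            # On success the caller continues from this point;
            # no previously completed work is undone.
            return operation(attempt)
        except ConnectionError as error:
            # Treated as a transient failure: remember it and try again.
            last_error = error
    raise last_error


def flaky_fetch(attempt):
    """Hypothetical operation that fails on attempts 1 and 2, then succeeds."""
    if attempt < 3:
        raise ConnectionError(f"network interruption on attempt {attempt}")
    return "payload"
```

With five attempts available, `run_with_forward_recovery(flaky_fetch, 5)` returns the payload on the third attempt; with only two, the last `ConnectionError` is raised to the caller.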

### Application failures

Application failures are generated by your code.
They indicate an issue with your application logic, such as invalid input data, a business rule violation, or a failed call to an external service.

Application failures typically do not resolve through retries alone.
Recovering from an application failure may require fixing a bug, passing different input data, or performing some external mitigation.

Application failures often involve **backward recovery**: the system undoes some of the work that has already been performed to return to a previous state.
For example, if a payment step fails after inventory has already been reserved, the application may need to release that inventory.
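
The inventory example can be sketched as follows. This is a minimal plain-Python model of backward recovery, not Temporal SDK code: the functions stand in for Activities, and `PaymentDeclined`, `reserve_inventory`, `release_inventory`, `charge_payment`, and `place_order` are hypothetical names.

```python
# Illustrative model only: these functions stand in for Activities and are
# not part of any Temporal SDK.

class PaymentDeclined(Exception):
    """Hypothetical application failure raised by the payment step."""


def reserve_inventory(stock, item):
    stock[item] -= 1


def release_inventory(stock, item):
    stock[item] += 1


def charge_payment(balance, price):
    if balance < price:
        raise PaymentDeclined("insufficient funds")
    return balance - price


def place_order(stock, item, balance, price):
    """Reserve inventory, then charge; undo the reservation if charging fails."""
    reserve_inventory(stock, item)
    try:
        charge_payment(balance, price)
    except PaymentDeclined:
        # Backward recovery: undo the step that already succeeded.
        release_inventory(stock, item)
        return "rejected"
    return "confirmed"
```

Retrying `charge_payment` with the same balance would fail the same way, which is why the sketch compensates instead of retrying.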

For guidance on categorizing failures and deciding how to handle them, see [Error handling](/best-practices/error-handling).

## How Temporal represents failures 

When a failure surfaces to your application code, the SDK represents it as a typed error object.
Each SDK uses the conventions of its language: what is called a Failure in one SDK might be called an Error or Exception in another.

Most SDKs have a base class that other failure types extend.
This provides a common interface and shared behavior across different failure types.
For example:

- TypeScript: [TemporalFailure](https://typescript.temporal.io/api/classes/common.TemporalFailure)
- Java: [TemporalFailure](https://www.javadoc.io/doc/io.temporal/temporal-sdk/latest/io/temporal/failure/TemporalFailure.html)
- Python: [FailureError](https://python.temporal.io/temporalio.exceptions.FailureError.html)
- PHP: [TemporalFailure](https://php.temporal.io/classes/Temporal-Exception-Failure-TemporalFailure.html)
- Go: Uses specific error types rather than a base class

For the complete list of failure types and their SDK-specific classes, see [Failures reference](/references/failures).

Errors that extend this base class are referred to as **Temporal failures**.
These are the SDK's typed error classes for failures that surface to application code, whether generated by the system (such as `ActivityFailure` or `TimeoutFailure`) or by your code (`ApplicationFailure`).
Platform failures like Worker crashes and network interruptions do not produce Temporal failure objects.
The platform handles those transparently through retries.

The SDK uses whether an error is a Temporal failure to determine how to handle it.
In Workflow code, throwing a Temporal failure fails the Workflow Execution, while throwing any other error fails the Workflow Task and is retried automatically.

The Temporal failure types are:

| Failure type | Description |
| :--- | :--- |
| **Application Failure** | Raised by your code to indicate application-specific errors. This is the only failure type you create directly. |
| **Activity Failure** | Wraps an error from an Activity Execution. The `cause` field contains the underlying error. |
| **Child Workflow Failure** | Wraps an error from a Child Workflow Execution. |
| **Timeout Failure** | Occurs when an Activity or Workflow exceeds its configured timeout. |
| **Cancelled Failure** | Results from cancellation of a Workflow, Activity, or Timer. |
| **Terminated Failure** | Occurs when a Workflow Execution is forcefully terminated. |
| **Server Failure** | Originates from the Temporal Service itself. |

Do not extend the base failure class or any of its children in your code.
The provided classes are designed to work with Temporal's serialization mechanism, which converts failures to Protocol Buffer messages for communication across process and language boundaries.
Custom subclasses can break this serialization and lead to unexpected behavior.

### Application Failure

Application Failure is the failure type you use to communicate application-specific errors.
It is the only failure type designed to be created and thrown directly by your code.

When you throw an Application Failure, you can set these fields:

- **message**: A human-readable description of the error.
- **type**: A string that categorizes the failure (for example, `"InvalidInput"` or `"InsufficientFunds"`).
- **non_retryable**: A flag that prevents the operation from being retried, regardless of the Retry Policy.
- **details**: Additional data about the failure.

Any non-Temporal error thrown from an Activity is automatically converted to an Application Failure.
During this conversion, the error's type name, message, and call stack are preserved, and `non_retryable` is set to `false`.
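
The fields and the conversion described above can be modeled roughly as follows. This is a plain-Python sketch under stated assumptions, not the SDK's implementation: the `ApplicationFailure` class here is a stand-in, and `to_application_failure` is a hypothetical helper that mimics what an SDK does on your behalf.

```python
import traceback

# Illustrative model only: this class is a stand-in, not an SDK class.

class ApplicationFailure(Exception):
    def __init__(self, message, failure_type=None, non_retryable=False, details=()):
        super().__init__(message)
        self.message = message              # human-readable description
        self.type = failure_type            # category string, e.g. "InvalidInput"
        self.non_retryable = non_retryable  # True prevents further retries
        self.details = tuple(details)       # additional data about the failure
        self.stack_trace = ""


def to_application_failure(error):
    """Model the automatic conversion of a non-Temporal error."""
    if isinstance(error, ApplicationFailure):
        return error  # already an Application Failure: keep it unchanged
    failure = ApplicationFailure(
        str(error),
        failure_type=type(error).__name__,  # the error's type name is preserved
        non_retryable=False,                # converted errors stay retryable
    )
    failure.stack_trace = "".join(
        traceback.format_exception(type(error), error, error.__traceback__)
    )
    return failure
```

For example, converting a `ValueError` raised by `int("abc")` yields a failure whose `type` is `"ValueError"` and whose `non_retryable` flag is `False`.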

### Failure Converters

When Temporal returns a failure, the default Failure Converter copies error messages and stack traces as plain text.
This text is accessible in the Web UI and through the CLI.

If your errors might contain sensitive information, you can encrypt the message and stack trace by configuring a custom Failure Converter with a codec.
See [Failure Converter](/failure-converter) for details.

## Workflow Task failures vs Workflow Execution failures 

When an error occurs in Workflow code, it produces one of two outcomes depending on the error type: a Workflow Task failure or a Workflow Execution failure.
Understanding the difference is important because they have very different implications.

|  | Workflow Task failure | Workflow Execution failure |
| :--- | :--- | :--- |
| **Caused by** | Non-Temporal errors (null reference, division by zero, type errors, non-determinism errors) | Temporal failures thrown by your code, such as Application Failure |
| **Retried?** | Yes, automatically | No |
| **Workflow state** | Preserved. You can fix the bug and redeploy without losing progress. | Moves to the "Failed" state permanently. No more attempts are made. |
| **Typical cause** | A bug in the Workflow code | A permanent business logic failure where retrying with the same input will not help |

When a Workflow Task failure is retried:

1. The Worker removes the Workflow Execution from its cache.
2. The Temporal Service schedules a new Workflow Task on the original Task Queue.
3. A Worker picks up the Task and replays the Workflow Execution from Event History to restore the correct state before continuing.
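
The distinction drawn in this section can be modeled as a simple type check. The sketch below is a plain-Python illustration, not SDK code: the three classes are stand-ins for an SDK's failure types, and `outcome_of_unhandled_error` is a hypothetical function. It also includes the Cancelled Failure case, which ends the Workflow in the "Cancelled" state and not the "Failed" state.

```python
# Illustrative model only: these classes stand in for an SDK's failure types.

class TemporalFailure(Exception):
    """Stand-in for the SDK's base failure class."""


class ApplicationFailure(TemporalFailure):
    """Stand-in for the failure type raised by application code."""


class CancelledFailure(TemporalFailure):
    """Stand-in for the failure produced by cancellation."""


def outcome_of_unhandled_error(error):
    """Return what happens when `error` escapes Workflow code unhandled."""
    if isinstance(error, CancelledFailure):
        return "workflow execution cancelled"
    if isinstance(error, TemporalFailure):
        return "workflow execution failed"
    # Any other error is treated as a bug: the Workflow Task fails and is
    # retried, and the Workflow Execution itself stays open.
    return "workflow task failed, retried"
```

Under this model, a `ZeroDivisionError` leaves the Workflow Execution open for a fix and redeploy, while an `ApplicationFailure` closes it.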

## How errors propagate 

When an Activity fails, Temporal wraps the error in an Activity Failure before delivering it to the Workflow.
The Activity Failure provides context about the failure, including the Activity Type, the number of retry attempts, and the original cause.

The original error is in the `cause` field.
For example, if an Activity throws an Application Failure with `type: "InvalidInput"`, the Workflow receives an Activity Failure whose `cause` is that Application Failure.
If an Activity times out instead, the `cause` is a Timeout Failure.

This wrapping pattern applies to other execution types as well.
A failed Child Workflow delivers a Child Workflow Failure to the parent Workflow, with the original error in the `cause` field.
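
A Workflow that catches an Activity Failure usually inspects `cause` to decide what to do next. The sketch below models that in plain Python. It is not SDK code: the classes are stand-ins, the explicit `cause` attribute is an assumption of this model, and `classify_activity_failure` and `root_cause` are hypothetical helpers.

```python
# Illustrative model only: these classes stand in for an SDK's failure types.

class TemporalFailure(Exception):
    def __init__(self, message, cause=None):
        super().__init__(message)
        self.cause = cause  # the wrapped, underlying failure (if any)


class ApplicationFailure(TemporalFailure):
    def __init__(self, message, failure_type=None, non_retryable=False, cause=None):
        super().__init__(message, cause)
        self.type = failure_type
        self.non_retryable = non_retryable


class TimeoutFailure(TemporalFailure):
    pass


class ActivityFailure(TemporalFailure):
    def __init__(self, message, activity_type, attempt, cause):
        super().__init__(message, cause)
        self.activity_type = activity_type  # which Activity failed
        self.attempt = attempt              # number of attempts made


def classify_activity_failure(failure):
    """Look one level down, at the error the Activity Failure wraps."""
    cause = failure.cause
    if isinstance(cause, ApplicationFailure):
        return f"application failure: {cause.type}"
    if isinstance(cause, TimeoutFailure):
        return "timeout"
    return "other"


def root_cause(failure):
    """Follow the `cause` chain to the innermost failure."""
    current = failure
    while getattr(current, "cause", None) is not None:
        current = current.cause
    return current
```

The same traversal applies to a Child Workflow Failure, where the chain can be several levels deep.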

If a Temporal failure propagates unhandled through Workflow code, it fails the Workflow Execution.
The exception is Cancelled Failure, which puts the Workflow in "Cancelled" state instead of "Failed".

### The outermost error type determines retryability 

When an Activity returns an error, the SDK inspects the **outermost** error to decide how to represent the failure to the Temporal Service.
The SDK performs a type check on the outermost error and converts it to a Protocol Buffer [Failure](https://api-docs.temporal.io/#temporal.api.failure.v1.Failure) message.
If the outermost error is an Application Failure, the SDK preserves its `non_retryable` flag and `type` field in the resulting `ApplicationFailureInfo` proto.
If the outermost error is any other type, the SDK falls back to creating a default, **retryable** `ApplicationFailureInfo`.

The Temporal Service only inspects the **top-level** `failure_info` on the Failure proto when making retry decisions.
The original error is preserved in the `cause` chain, but the Service does not look at `cause` to determine retryability.

This means that wrapping an Application Failure in a generic language error silently loses the `non_retryable` flag.
If an Activity throws a non-retryable Application Failure, but your code catches it and re-throws it wrapped in a standard error, the Activity will be retried despite the original intent.

> **⚠️ Caution: Wrapping errors can lose retryability flags**
>
> If you need to add context to an error, wrap it in another Application Failure that preserves the `non_retryable` flag.
> Do not wrap Application Failures in generic language errors (such as `Error` in TypeScript, `Exception` in Python, or an error created with `fmt.Errorf` in Go), as this causes the SDK to treat the error as a new, retryable failure.

This behavior is consistent across all Temporal SDKs.
For language-specific examples and correct wrapping patterns, see the error handling guide for your SDK.
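
As a language-neutral illustration of the rule in this section, the following plain-Python model shows why the flag is lost. It is a sketch, not SDK code: `ApplicationFailure` is a stand-in class, the dictionary returned by `to_failure_info` only loosely mirrors `ApplicationFailureInfo`, and `will_retry`, `wrap_generic`, and `wrap_preserving` are hypothetical helpers. The model assumes the Retry Policy still has attempts remaining.

```python
# Illustrative model only: a stand-in class and hypothetical helpers.

class ApplicationFailure(Exception):
    def __init__(self, message, failure_type=None, non_retryable=False):
        super().__init__(message)
        self.type = failure_type
        self.non_retryable = non_retryable


def to_failure_info(error):
    """Model the SDK-side conversion: only the outermost error is inspected."""
    if isinstance(error, ApplicationFailure):
        return {"type": error.type, "non_retryable": error.non_retryable}
    # Any other outermost type falls back to a default, retryable failure.
    return {"type": type(error).__name__, "non_retryable": False}


def will_retry(failure_info):
    """Model the Service-side decision: only top-level info is consulted."""
    return not failure_info["non_retryable"]


def wrap_generic(original, context):
    """Incorrect pattern: the flag survives only in the cause chain."""
    wrapped = RuntimeError(f"{context}: {original}")
    wrapped.__cause__ = original
    return wrapped


def wrap_preserving(original, context):
    """Correct pattern: re-wrap in an Application Failure and copy the flag."""
    wrapped = ApplicationFailure(
        f"{context}: {original}",
        failure_type=original.type,
        non_retryable=original.non_retryable,
    )
    wrapped.__cause__ = original
    return wrapped
```

In this model, a non-retryable failure wrapped with `wrap_generic` is retried, while the same failure wrapped with `wrap_preserving` is not.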

## Failures in Event History 

Failures are recorded in Event History, which provides a detailed record for debugging.

### Activity failures

An Activity Execution that completes results in three Events: `ActivityTaskScheduled`, `ActivityTaskStarted`, and `ActivityTaskCompleted`.

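If an Activity fails and the Retry Policy does not allow another attempt (for example, because the failure is non-retryable or the maximum number of attempts has been reached), the Temporal Service adds an `ActivityTaskFailed` Event that contains the error details.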
If an Activity times out, an `ActivityTaskTimedOut` Event is added instead.

While an Activity is running, `ActivityTaskScheduled` is the most recent Event visible for that Activity.
The `ActivityTaskStarted` Event is not written until the Activity Task closes, because the final retry attempt number (an attribute of `ActivityTaskStarted`) is not known until then.
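
The Event sequences described above can be summarized in a small lookup. The sketch below is a simplified plain-Python model, not an SDK or Service API: `activity_events` is a hypothetical function, and it assumes a Worker picked up the Activity Task before it closed, so cases where the Activity never starts are out of scope.

```python
# Illustrative model only: a simplified view of one Activity's Events.

CLOSING_EVENT = {
    "completed": "ActivityTaskCompleted",
    "failed": "ActivityTaskFailed",
    "timed_out": "ActivityTaskTimedOut",
}


def activity_events(state):
    """Return the Events visible in Event History for one Activity."""
    if state == "running":
        # While the Activity is running (including between retries), only
        # the scheduling Event has been written.
        return ["ActivityTaskScheduled"]
    # Once the Activity Task closes, the started Event is written together
    # with the closing Event.
    return ["ActivityTaskScheduled", "ActivityTaskStarted", CLOSING_EVENT[state]]
```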

You can view pending Activity Executions in the Web UI's Pending Activities section, which shows the Activity Type, current retry attempt, remaining attempts, and heartbeat information.

### Workflow Execution failures

An Activity failure does not by itself cause a Workflow Execution failure, because Workflow code can catch the error and handle it.
The Workflow Execution fails only if the error propagates out of the Workflow function without being caught, or is caught and intentionally re-raised as an Application Failure.

When a Workflow Execution fails, the Temporal Service adds a `WorkflowExecutionFailed` Event.
If the failure was caused by an unhandled Activity error, the `activityFailureInfo` is attached to that Event.
