# Custom Data Cleanup Processors

Umbraco Engage includes a data-cleanup pipeline that periodically anonymizes and deletes analytics data based on the configured retention periods. You can extend this pipeline with custom processors to perform additional cleanup or anonymization logic.

## The IAnalyticsDataCleanupProcessor interface

To create a custom processor, implement the `IAnalyticsDataCleanupProcessor` interface:

```csharp
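using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;
using Umbraco.Engage.Infrastructure.Analytics.Cleanup.Processors;

public class MyCustomCleanupProcessor : IAnalyticsDataCleanupProcessor
{
    // Determines which cleanup group this processor runs in.
    public AnalyticsDataCleanupType Type => AnalyticsDataCleanupType.DeleteOrphanedData;

    public async IAsyncEnumerable<AnalyticsDataCleanupResult> ProcessAsync(
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        // Perform cleanup logic here and record how many rows it affected.
        var numberOfRecordsAffected = 0;
        await Task.CompletedTask; // Placeholder; replace with your asynchronous cleanup call.

        yield return new AnalyticsDataCleanupResult("MyCustomTable", numberOfRecordsAffected);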
    }
}
```

The interface requires two members:

| Member                            | Description                                                                                                                                                                                      |
| --------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `Type`                            | The type of cleanup operation this processor performs. Determines when the processor runs relative to other processors.                                                                          |
| `ProcessAsync(CancellationToken)` | Performs the cleanup and yields one or more `AnalyticsDataCleanupResult` records indicating the table name and number of affected rows. Use the cancellation token to support graceful shutdown. |
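
A processor can report several tables by yielding one result per table. The following is a minimal sketch of that pattern, not a reference implementation: the table names and the `DeleteExpiredRowsAsync` helper are hypothetical placeholders for your own data access code.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;
using Umbraco.Engage.Infrastructure.Analytics.Cleanup.Processors;

public class MultiTableCleanupProcessor : IAnalyticsDataCleanupProcessor
{
    // Hypothetical table names, used for illustration only.
    private static readonly string[] Tables = { "MyCustomTableA", "MyCustomTableB" };

    public AnalyticsDataCleanupType Type => AnalyticsDataCleanupType.DeleteOrphanedData;

    public async IAsyncEnumerable<AnalyticsDataCleanupResult> ProcessAsync(
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        foreach (var table in Tables)
        {
            // Stop before starting the next table if cancellation was requested.
            cancellationToken.ThrowIfCancellationRequested();

            var affected = await DeleteExpiredRowsAsync(table, cancellationToken);

            // Report each table as soon as its cleanup has finished.
            yield return new AnalyticsDataCleanupResult(table, affected);
        }
    }

    // Hypothetical placeholder: replace with your own data access code.
    private static Task<int> DeleteExpiredRowsAsync(string table, CancellationToken cancellationToken)
        => Task.FromResult(0);
}
```

Checking the token between tables keeps each unit of work small, so a shutdown request does not have to wait for every table to be processed.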

## Cleanup types

The `AnalyticsDataCleanupType` enum controls the execution order. Processors are grouped by type, and the groups run in this sequence:

| Value                 | Description                                                                                                                                  |
| --------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
| `Anonymize`           | Replaces personally identifiable information with anonymized values for data that has exceeded its retention period. Runs first.             |
| `DeleteAnalyticsData` | Deletes analytics data (pageviews, control group data, raw data) that has exceeded its retention period. Runs second.                        |
| `DeleteOrphanedData`  | Deletes orphaned records no longer referenced by any analytics data (for example: sessions, visitors, devices without pageviews). Runs last. |
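
To make the ordering concrete, the sketch below models it with stand-in types. It is not the Engage implementation: `CleanupType`, `Registered`, and `OrderingSketch` are hypothetical names, and the sketch assumes that processors of the same type keep their registration order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in enum mirroring the documented sequence; not the real Engage type.
public enum CleanupType { Anonymize, DeleteAnalyticsData, DeleteOrphanedData }

// Stand-in for a registered processor: a name plus its cleanup type.
public record Registered(string Name, CleanupType Type);

public static class OrderingSketch
{
    // Sorts by cleanup type. OrderBy is a stable sort, so processors of the
    // same type keep the order in which they were registered.
    public static IReadOnlyList<string> ExecutionOrder(IEnumerable<Registered> registered)
        => registered.OrderBy(p => p.Type).Select(p => p.Name).ToList();

    public static void Main()
    {
        var registered = new[]
        {
            new Registered("OrphanA", CleanupType.DeleteOrphanedData),
            new Registered("AnonymizeA", CleanupType.Anonymize),
            new Registered("OrphanB", CleanupType.DeleteOrphanedData),
            new Registered("DeleteA", CleanupType.DeleteAnalyticsData),
        };

        Console.WriteLine(string.Join(" -> ", ExecutionOrder(registered)));
        // prints "AnonymizeA -> DeleteA -> OrphanA -> OrphanB"
    }
}
```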

## Registering a custom processor

Register your processor using the `EngageDataCleanupProcessors()` extension method on `IUmbracoBuilder` inside a composer:

```csharp
using Umbraco.Cms.Core.Composing;
using Umbraco.Cms.Core.DependencyInjection;
using Umbraco.Engage.Infrastructure.Analytics.Cleanup.Processors;

public class MyCleanupComposer : IComposer
{
    public void Compose(IUmbracoBuilder builder)
    {
        builder.EngageDataCleanupProcessors()
            .Append<MyCustomCleanupProcessor>();
    }
}
```

The collection builder supports the standard Umbraco ordering methods such as `Append<T>()`, `InsertBefore<TBefore, T>()`, and `InsertAfter<TAfter, T>()` to control where your processor runs within its cleanup type group.
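
As an illustration of the ordering methods, the sketch below registers a second processor so that it runs directly before `MyCustomCleanupProcessor`. `MyOtherCleanupProcessor` and `MyOrderedCleanupComposer` are hypothetical names introduced for this example. Both processors use the same cleanup type, so the relative order applies within that group.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;
using Umbraco.Cms.Core.Composing;
using Umbraco.Cms.Core.DependencyInjection;
using Umbraco.Engage.Infrastructure.Analytics.Cleanup.Processors;

// Hypothetical second processor that should run before MyCustomCleanupProcessor.
public class MyOtherCleanupProcessor : IAnalyticsDataCleanupProcessor
{
    public AnalyticsDataCleanupType Type => AnalyticsDataCleanupType.DeleteOrphanedData;

    public async IAsyncEnumerable<AnalyticsDataCleanupResult> ProcessAsync(
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        await Task.CompletedTask; // Placeholder for real cleanup work.
        yield return new AnalyticsDataCleanupResult("MyOtherTable", 0);
    }
}

public class MyOrderedCleanupComposer : IComposer
{
    public void Compose(IUmbracoBuilder builder)
    {
        builder.EngageDataCleanupProcessors()
            .Append<MyCustomCleanupProcessor>()
            // Place MyOtherCleanupProcessor directly before MyCustomCleanupProcessor.
            .InsertBefore<MyCustomCleanupProcessor, MyOtherCleanupProcessor>();
    }
}
```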

## Result type

Each processor yields one or more `AnalyticsDataCleanupResult` records:

```csharp
public record AnalyticsDataCleanupResult(string TableName, int RecordsAffected);
```

These results are logged by the cleanup background job and surfaced in the Engage settings dashboard.
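
When testing a processor in isolation, you can consume its results with `await foreach`. The helper below is a sketch for that purpose only; `CleanupProcessorRunner` is a hypothetical name and is not part of Engage. Because the result is a positional record, each item can be deconstructed into its table name and row count.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Umbraco.Engage.Infrastructure.Analytics.Cleanup.Processors;

public static class CleanupProcessorRunner
{
    // Runs a single processor and returns the total number of affected rows.
    public static async Task<int> RunAsync(
        IAnalyticsDataCleanupProcessor processor,
        CancellationToken cancellationToken = default)
    {
        var total = 0;

        // Deconstruct each positional record into its two values.
        await foreach (var (tableName, recordsAffected) in processor.ProcessAsync(cancellationToken))
        {
            Console.WriteLine($"{processor.Type}: {tableName} -> {recordsAffected} rows");
            total += recordsAffected;
        }

        return total;
    }
}
```

For example, `await CleanupProcessorRunner.RunAsync(new MyCustomCleanupProcessor())` runs the processor from the first section and returns the sum of its reported row counts.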
