
Indexing Settings

The Indexing settings tab contains all the settings that control how Mitzu creates the event and property catalog.


Options

  • Lookback Days: Mitzu only indexes events from the latest N days in the event tables of your data warehouse. This setting defines N, the number of days the indexing process looks back.

  • Sample Size: Mitzu indexes only a small sample of events rather than every event. Increasing the sample size improves indexing precision but makes the indexing process take longer.

  • Bucketed Table Indexing: Mitzu efficiently indexes any table by default using a single SQL query. However, this process takes longer if your data warehouse contains wide tables (tables with many columns). Processing the table in buckets of columns can improve performance, especially in data lakes with Parquet or ORC files.

  • Dimension Table Sample Size: Mitzu indexes only a small sample of rows from dimension tables. Increasing the sample size improves indexing precision but makes the indexing process take longer.

  • Data Scrambling: By default, Mitzu reads the data warehouse in its default order (no explicit ordering), which may result in skewed data reads. Data scrambling randomizes the reading order, at the cost of slower indexing.
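
To make the interaction between these options more concrete, here is a minimal Python sketch of how lookback days, sample size, column bucketing, and scrambling could shape the queries an indexer sends. This is not Mitzu's actual implementation: the function names, the table and column names, and the generated SQL are all hypothetical, and `ORDER BY RANDOM()` is only valid in some SQL dialects.

```python
from datetime import date, timedelta


def column_buckets(columns, bucket_size):
    """Split a wide table's column list into consecutive buckets."""
    if bucket_size < 1:
        raise ValueError("bucket_size must be at least 1")
    return [columns[i:i + bucket_size] for i in range(0, len(columns), bucket_size)]


def build_index_queries(table, time_column, columns, today,
                        lookback_days, sample_size,
                        bucket_size=None, scramble=False):
    """Return one illustrative SQL string per bucket of columns.

    Hypothetical helper: it only illustrates the settings described above.
    """
    # Lookback Days: only rows newer than this date are considered.
    start = today - timedelta(days=lookback_days)
    # Bucketed Table Indexing: one query per group of columns,
    # or a single query over all columns when bucketing is off.
    buckets = column_buckets(columns, bucket_size) if bucket_size else [columns]
    # Data Scrambling: random ordering (dialect-specific, and slower).
    order_by = " ORDER BY RANDOM()" if scramble else ""
    # Sample Size: cap the number of rows read per query.
    return [
        f"SELECT {', '.join(bucket)} FROM {table} "
        f"WHERE {time_column} >= DATE '{start.isoformat()}'"
        f"{order_by} LIMIT {sample_size}"
        for bucket in buckets
    ]


queries = build_index_queries(
    table="events",
    time_column="event_time",
    columns=["user_id", "event_name", "country", "plan", "device"],
    today=date(2024, 3, 31),
    lookback_days=30,
    sample_size=1000,
    bucket_size=2,
)
for query in queries:
    print(query)
```

With a bucket size of 2, the five columns above produce three smaller queries instead of one wide query. Setting `scramble=True` adds a random ordering to each query, which illustrates why scrambling trades indexing speed for a less skewed sample.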