
Introduction To Data Modeling

info

Mitzu can work on top of RAW data.

It doesn't require any transformation, normalization, or other modeling. This makes it a great tool for datasets that are ingested into the data warehouse via:

  • ELT/ETL tools, like Fivetran, Airbyte, Rivery, Stitch, etc.
  • CDPs, like Segment, RudderStack, Jitsu, etc.

However, you might have use cases where other tools and services require your data to be modeled. In this section, we will cover the basics of data modeling for Mitzu.


Events and dimensions

Mitzu can work well on top of event data models. If you are familiar with the facts and dimensions model (Star schema), the event data model is very similar to it.

The main difference is that every fact is always related to a user. This means that each event table contains these two columns:

  • User ID: The ID of the user that performed the event.
  • Event time: The time when the event occurred.

Optionally, you can also add a Group ID column to the event table. This column marks the team or organization the user belonged to.

Every event that a user performs in the product or service must be stored as a single row in the event table.
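To make the shape of an event table concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `page_views` and the column names are assumptions chosen for illustration; Mitzu does not require these exact names, and your warehouse will use its own SQL dialect.

```python
import sqlite3

# Hypothetical event table: every row is one event performed by one user.
# Table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE page_views (
        user_id    TEXT NOT NULL,  -- who performed the event
        event_time TEXT NOT NULL,  -- when it happened (ISO 8601 string)
        group_id   TEXT,           -- optional: team or organization
        url        TEXT            -- an extra column describing the event
    )
    """
)

rows = [
    ("u1", "2024-01-01T10:00:00", "org_a", "/home"),
    ("u1", "2024-01-01T10:05:00", "org_a", "/pricing"),
    ("u2", "2024-01-01T11:00:00", None, "/home"),
]
conn.executemany("INSERT INTO page_views VALUES (?, ?, ?, ?)", rows)

# One row per event means a simple COUNT gives events per user.
events_per_user = dict(
    conn.execute("SELECT user_id, COUNT(*) FROM page_views GROUP BY user_id")
)
print(events_per_user)
```

Because each event is exactly one row, counting rows per user is the same as counting events per user.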

info

Some event tables might contain an event_name or event_type column. This column represents the name of the event the user performed at a given time.

Event tables that have the event_name column are called multi event tables. Alternatively, they are called "one big table" or "wide table". Tables that don't have the event_name column are called single event tables.

success

Mitzu supports both single and multi event tables.
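The distinction between the two table kinds can be sketched in a few lines of Python. The helper names below (`is_multi_event_table`, `split_by_event_name`) are hypothetical and are not part of Mitzu; the sketch only assumes that a multi event table carries an `event_name` or `event_type` column, as described above.

```python
from collections import defaultdict

# Column names that mark a table as a multi event table (per the text above).
EVENT_NAME_COLUMNS = {"event_name", "event_type"}


def is_multi_event_table(columns):
    """Return True if any column looks like an event name/type column."""
    return any(column.lower() in EVENT_NAME_COLUMNS for column in columns)


def split_by_event_name(rows, name_column="event_name"):
    """Group the rows of a multi event table by their event name."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[name_column]].append(row)
    return dict(grouped)


multi_event_rows = [
    {"user_id": "u1", "event_time": "2024-01-01T10:00:00", "event_name": "sign_up"},
    {"user_id": "u1", "event_time": "2024-01-01T10:05:00", "event_name": "purchase"},
    {"user_id": "u2", "event_time": "2024-01-01T11:00:00", "event_name": "sign_up"},
]

print(is_multi_event_table(multi_event_rows[0].keys()))
print(is_multi_event_table(["user_id", "event_time", "url"]))
print(sorted(split_by_event_name(multi_event_rows)))
```

Splitting a multi event table by event name yields the equivalent of several single event tables, which is why both layouts can describe the same data.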

Event properties

Any column that is present in an event table is considered an event property. If your tables have complex column types such as JSON, MAP, VARIANT, or STRUCT, the nested key-value pairs can be considered event properties as well. More on this subject here.
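As a rough illustration of how nested key-value pairs can be read as event properties, the sketch below flattens a JSON column into dotted property names. The `flatten_properties` helper, the `payload` column, and the dotted naming scheme are all assumptions made for this example; they do not describe how Mitzu itself names or discovers nested properties.

```python
import json


def flatten_properties(value, prefix=""):
    """Flatten nested dicts into a single dict with dotted keys."""
    flat = {}
    for key, item in value.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(item, dict):
            # Recurse into nested objects, extending the dotted path.
            flat.update(flatten_properties(item, name))
        else:
            flat[name] = item
    return flat


# Hypothetical event row with a JSON-typed column called "payload".
row = {
    "user_id": "u1",
    "event_time": "2024-01-01T10:00:00",
    "payload": json.dumps({"device": {"os": "iOS", "version": "17"}, "plan": "pro"}),
}

properties = flatten_properties(json.loads(row["payload"]), prefix="payload")
print(properties)
```

Each flattened key, such as `payload.device.os`, can then be treated like any other column of the event table.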

Dimension tables

Dimension tables contain information about entities of your business. Typically these are:

  • Orders
  • Users
  • Products
  • Items

These don't have an event_time or event_name column. However, they can have a "primary key" column that is used to identify the entity.

info

Mitzu currently supports 2 types of dimension tables:

  • User
  • Group

Support for arbitrary dimension tables will be introduced later.

The primary keys in the dimension tables should match the user_id and group_id columns in the event tables.
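The key-matching requirement can be illustrated with a small join, again using Python's built-in sqlite3 module. The table names (`users`, `orgs`, `events`) and the `country` and `plan` columns are hypothetical; the only point taken from the text is that the dimension tables' primary keys line up with `user_id` and `group_id` in the event table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: one row per entity, identified by a primary key.
    CREATE TABLE users (user_id TEXT PRIMARY KEY, country TEXT);
    CREATE TABLE orgs  (group_id TEXT PRIMARY KEY, plan TEXT);

    -- Event table: user_id and group_id reference the dimension keys.
    CREATE TABLE events (
        user_id TEXT, group_id TEXT, event_time TEXT, event_name TEXT
    );

    INSERT INTO users VALUES ('u1', 'HU'), ('u2', 'US');
    INSERT INTO orgs  VALUES ('org_a', 'pro');
    INSERT INTO events VALUES
        ('u1', 'org_a', '2024-01-01T10:00:00', 'sign_up'),
        ('u2', NULL,    '2024-01-01T11:00:00', 'sign_up');
    """
)

# Enrich each event with user and group attributes via the matching keys.
enriched = conn.execute(
    """
    SELECT e.user_id, e.event_name, u.country, o.plan
    FROM events AS e
    LEFT JOIN users AS u ON u.user_id = e.user_id
    LEFT JOIN orgs  AS o ON o.group_id = e.group_id
    ORDER BY e.event_time
    """
).fetchall()
print(enriched)
```

A LEFT JOIN is used here so that events without a group (or with a missing dimension row) are kept rather than dropped.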