Introduction To Data Modeling
Mitzu can work on top of RAW data.
It doesn't require any transformation, normalization, or other modeling. This makes it a great tool for datasets that are ingested into the data warehouse via:
- ELT/ETL tools, like Fivetran, Airbyte, Rivery, Stitch, etc.
- CDPs, like Segment, RudderStack, Jitsu, etc.
However, you might have use cases where other tools and services require your data to be modeled. In this section, we will cover the basics of data modeling for Mitzu.
Events and dimensions
Mitzu works well on top of event data models. If you are familiar with the facts and dimensions model (star schema), the event data model is very similar to it.
The main difference is that every fact is always related to a user. This means that every event table contains at least these two columns:
- User ID: The ID of the user that performed the event.
- Event time: The time when the event occurred.
Optionally, you can also add a Group ID column to the event table. This column identifies the team or organization the user belonged to.
Every event a user performs in the product or service must be stored as exactly one row in the event table.
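The row shape described above can be sketched in a few lines of Python. This is only an illustration: the `CheckoutEvent` class, the `cart_value` property, and the sample values are hypothetical and are not part of Mitzu.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class CheckoutEvent:
    """One row of a hypothetical checkout event table."""
    user_id: str                     # who performed the event
    event_time: datetime             # when the event occurred
    group_id: Optional[str] = None   # optional team or organization
    cart_value: float = 0.0          # hypothetical extra column


# One row per event the user performed.
rows = [
    CheckoutEvent("u1", datetime(2024, 1, 5, 9, 30, tzinfo=timezone.utc), "org_a", 42.0),
    CheckoutEvent("u1", datetime(2024, 1, 6, 14, 0, tzinfo=timezone.utc), "org_a", 13.5),
    CheckoutEvent("u2", datetime(2024, 1, 6, 14, 5, tzinfo=timezone.utc)),
]

# Count events per user to show that each event is its own row.
events_per_user = {}
for row in rows:
    events_per_user[row.user_id] = events_per_user.get(row.user_id, 0) + 1
```

Here user `u1` performed two checkouts and therefore has two rows, while `u2` has one row and no group.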
Some event tables contain an event_name or event_type column. This column holds the name of the event the user performed at a given time.
Event tables that have an event_name column are called multi event tables. They are also known as "one big table" or "wide table".
Tables that don't have an event_name column are called single event tables.
Mitzu supports both single and multi event tables.
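To make the distinction concrete, the sketch below splits a small multi event table into single event tables, one per event name. The sample rows and the `split_by_event_name` helper are hypothetical; Mitzu does not require this step, since it supports both layouts.

```python
from collections import defaultdict

# A hypothetical multi event table: event_name tells the events apart.
multi_event_table = [
    {"user_id": "u1", "event_time": "2024-01-05T09:30:00Z", "event_name": "page_view"},
    {"user_id": "u1", "event_time": "2024-01-05T09:31:10Z", "event_name": "checkout"},
    {"user_id": "u2", "event_time": "2024-01-05T10:02:45Z", "event_name": "page_view"},
]


def split_by_event_name(rows):
    """Group rows into one single event table per event_name value."""
    tables = defaultdict(list)
    for row in rows:
        # In a single event table the table itself identifies the event,
        # so the event_name column is dropped.
        single = {key: value for key, value in row.items() if key != "event_name"}
        tables[row["event_name"]].append(single)
    return dict(tables)


single_event_tables = split_by_event_name(multi_event_table)
```

After the split, `single_event_tables["page_view"]` holds two rows and `single_event_tables["checkout"]` holds one, and none of them carries an event_name column.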
Event properties
Any column that is present in an event table is considered an event property. If your tables have complex column types such as JSON, MAP, VARIANT, or STRUCT, the nested key-value pairs can be considered event properties as well. More on this subject here.
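As a rough illustration of how nested key-value pairs can be read as event properties, the sketch below flattens a JSON column into one entry per leaf key. The `flatten_properties` helper, the `properties` column, and the dotted naming scheme are assumptions made for this example; they do not describe how Mitzu itself names or discovers nested properties.

```python
import json


def flatten_properties(value, prefix=""):
    """Flatten nested dicts into {"a.b.c": leaf_value} pairs."""
    flat = {}
    for key, item in value.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(item, dict):
            flat.update(flatten_properties(item, path))
        else:
            flat[path] = item
    return flat


# A hypothetical event row with a JSON column called "properties".
row = {
    "user_id": "u1",
    "event_time": "2024-01-05T09:30:00Z",
    "properties": json.loads('{"device": {"os": "iOS", "version": "17.2"}, "plan": "pro"}'),
}

event_properties = flatten_properties(row)
```

In this sketch the plain columns (`user_id`, `event_time`) and the nested keys (`properties.device.os`, `properties.device.version`, `properties.plan`) all end up as event properties.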
Dimension tables
Dimension tables contain information about entities of your business. Typically these are:
- Orders
- Users
- Products
- Items
These tables don't have an event_time or event_name column.
However, they can have a "primary key" column that is used to identify the entity.
Mitzu currently supports two types of dimension tables:
- User
- Group
Support for arbitrary dimension tables will be introduced later.
The primary keys in the dimension tables should match the user_id and group_id columns in the event tables.
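The key-matching requirement can be checked with a simple join. The sketch below uses Python's built-in sqlite3 module with hypothetical `users` and `page_views` tables; the table names, columns, and sample rows are assumptions for illustration only, and your warehouse's SQL dialect may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical user dimension table with a primary key.
    CREATE TABLE users (user_id TEXT PRIMARY KEY, country TEXT);
    -- Hypothetical single event table.
    CREATE TABLE page_views (user_id TEXT, event_time TEXT, url TEXT);

    INSERT INTO users VALUES ('u1', 'DE'), ('u2', 'US');
    INSERT INTO page_views VALUES
        ('u1', '2024-01-05T09:30:00Z', '/home'),
        ('u2', '2024-01-05T10:02:45Z', '/pricing'),
        ('u3', '2024-01-05T11:15:00Z', '/home');
""")

# Enrich each event with its user's dimension attributes.
joined = conn.execute("""
    SELECT e.user_id, e.url, u.country
    FROM page_views AS e
    LEFT JOIN users AS u ON u.user_id = e.user_id
    ORDER BY e.event_time
""").fetchall()

# Events whose user_id has no matching primary key in the dimension table.
unmatched = [user_id for user_id, _url, country in joined if country is None]
```

Here the event from `u3` has no matching row in `users`, so it ends up in `unmatched`. A check like this is one way to spot keys that don't line up between event and dimension tables.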