
AWS Athena

Overview

Mitzu connects to AWS Athena using an AWS user with the right permissions to access your data. To connect Mitzu to AWS Athena, you need to create this user first and then configure its credentials in Mitzu.

If you use other AWS services, we recommend creating a dedicated AWS service account that only has the permissions required to run Athena, and using the IAM credentials from that account to connect Mitzu to Athena.

See Identity and access management in Athena.

Supported data types

Mitzu maps data warehouse types to Mitzu types according to the following table:

| Mitzu type | Data warehouse type |
|------------|---------------------|
| String | CHAR, CHAR(length), STRING, VARCHAR(length) |
| Number | TINYINT, SMALLINT, INT, INTEGER, BIGINT, FLOAT, DOUBLE |
| Boolean | BOOLEAN |
| Datetime | TIME, DATE, TIMESTAMP |
| Map | MAP |
| Struct | STRUCT |
| Array | ARRAY |

Info: All unrecognized types will be handled as strings.
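The mapping above, including the fallback to strings, can be illustrated with a minimal Python sketch. This is not Mitzu's actual implementation: the function name `mitzu_type`, the lookup dictionary, and the way length or element parameters are stripped are all assumptions made for illustration.

```python
import re

# Type names are taken from the table above; the dictionary layout is an
# assumption made for this sketch, not Mitzu's internal representation.
_MITZU_TYPES = {
    "String": {"CHAR", "STRING", "VARCHAR"},
    "Number": {"TINYINT", "SMALLINT", "INT", "INTEGER", "BIGINT", "FLOAT", "DOUBLE"},
    "Boolean": {"BOOLEAN"},
    "Datetime": {"TIME", "DATE", "TIMESTAMP"},
    "Map": {"MAP"},
    "Struct": {"STRUCT"},
    "Array": {"ARRAY"},
}


def mitzu_type(warehouse_type: str) -> str:
    """Return the Mitzu type name for a data warehouse type string (hypothetical helper)."""
    # Keep only the leading type name, e.g. "varchar(255)" -> "VARCHAR",
    # "array<int>" -> "ARRAY".
    match = re.match(r"\s*([A-Za-z_]+)", warehouse_type)
    base = match.group(1).upper() if match else ""
    for mitzu_name, warehouse_names in _MITZU_TYPES.items():
        if base in warehouse_names:
            return mitzu_name
    # Unrecognized types are handled as strings.
    return "String"
```

For example, `mitzu_type("varchar(255)")` and `mitzu_type("binary")` both return `"String"`, the first because VARCHAR is listed in the table and the second because it is not recognized.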

Create an AWS Athena service user

Head to AWS IAM and create a new user. This user needs access to three primary resources:

  • Files in S3
  • AWS Glue
  • AWS Athena

See the AWS IAM documentation for more information about AWS users and how to create them.

The following example IAM policy document contains the required permissions. Replace bucket1 and bucket2 with the names of your own buckets:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Athena",
      "Effect": "Allow",
      "Action": [
        "athena:BatchGetNamedQuery",
        "athena:BatchGetQueryExecution",
        "athena:GetNamedQuery",
        "athena:GetQueryExecution",
        "athena:GetQueryResults",
        "athena:GetQueryResultsStream",
        "athena:GetWorkGroup",
        "athena:ListDatabases",
        "athena:ListDataCatalogs",
        "athena:ListNamedQueries",
        "athena:ListQueryExecutions",
        "athena:ListTagsForResource",
        "athena:ListWorkGroups",
        "athena:ListTableMetadata",
        "athena:StartQueryExecution",
        "athena:StopQueryExecution",
        "athena:CreatePreparedStatement",
        "athena:DeletePreparedStatement",
        "athena:GetPreparedStatement"
      ],
      "Resource": "*"
    },
    {
      "Sid": "Glue",
      "Effect": "Allow",
      "Action": [
        "glue:BatchGetPartition",
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetPartition",
        "glue:GetPartitions",
        "glue:GetTable",
        "glue:GetTables",
        "glue:GetTableVersion",
        "glue:GetTableVersions"
      ],
      "Resource": "*"
    },
    {
      "Sid": "S3ReadAccess",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": [
        "arn:aws:s3:::bucket1",
        "arn:aws:s3:::bucket1/*",
        "arn:aws:s3:::bucket2",
        "arn:aws:s3:::bucket2/*"
      ]
    },
    {
      "Sid": "AthenaResultsBucket",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:AbortMultipartUpload",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": ["arn:aws:s3:::bucket2", "arn:aws:s3:::bucket2/*"]
    }
  ]
}
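Each bucket in the policy appears twice: once as the bucket ARN (needed for bucket-level actions such as `s3:ListBucket`) and once with `/*` (needed for object-level actions such as `s3:GetObject`). If you have many buckets, a small script can generate the two S3 statements. The sketch below is one possible way to do this; the helper names `s3_resources` and `s3_statements` are hypothetical, and the actions are copied from the example policy.

```python
import json


def s3_resources(bucket_names):
    """Return the bucket-level and object-level ARN for each bucket name."""
    resources = []
    for name in bucket_names:
        resources.append(f"arn:aws:s3:::{name}")    # the bucket itself
        resources.append(f"arn:aws:s3:::{name}/*")  # every object in the bucket
    return resources


def s3_statements(data_buckets, results_bucket):
    """Build the two S3 statements of the example policy (hypothetical helper)."""
    return [
        {
            "Sid": "S3ReadAccess",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket", "s3:GetBucketLocation"],
            "Resource": s3_resources(data_buckets),
        },
        {
            "Sid": "AthenaResultsBucket",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:AbortMultipartUpload",
                "s3:ListBucket",
                "s3:GetBucketLocation",
            ],
            "Resource": s3_resources([results_bucket]),
        },
    ]


# bucket1 and bucket2 are the placeholder names used in the example policy.
print(json.dumps(s3_statements(["bucket1", "bucket2"], "bucket2"), indent=2))
```

The printed statements can be pasted into the `Statement` array next to the Athena and Glue statements.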

Set the credentials in Mitzu

Find and copy the AWS_ACCESS_KEY_ID and AWS_SECRET_KEY into Mitzu. For AWS Athena, the Catalog should stay set to AwsDataCatalog, or you can leave the field empty. For S3 Staging Dir, make sure you choose the correct bucket for storing intermediate files.


Click the Test connection button to check if Mitzu can connect to your data warehouse using the entered values.

Warning: Mitzu will try to connect to your data warehouse and execute a SELECT 1; command. You may need to grant Mitzu further permissions to see and query your data tables.
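If the connection test fails, it can help to run the same kind of check outside Mitzu with the service user's credentials. The following is a minimal sketch, not Mitzu's own code: the function `run_select_one` is hypothetical, and it expects an Athena client such as the one returned by boto3's `boto3.client("athena", ...)`. It starts a `SELECT 1` query with the given catalog and S3 staging directory, then polls until the query reaches a final state.

```python
import time


def run_select_one(client, staging_dir, catalog="AwsDataCatalog",
                   poll_seconds=1.0, timeout_seconds=60.0):
    """Run SELECT 1 through an Athena client and return the final query state."""
    started = client.start_query_execution(
        QueryString="SELECT 1",
        QueryExecutionContext={"Catalog": catalog},
        # Athena writes query results to this S3 location.
        ResultConfiguration={"OutputLocation": staging_dir},
    )
    query_id = started["QueryExecutionId"]
    deadline = time.monotonic() + timeout_seconds
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Query {query_id} is still in state {state}")
        time.sleep(poll_seconds)


# Example usage (requires boto3 and real credentials; the region and the
# "athena-results/" prefix are placeholders, not values from this guide):
#
#   import boto3
#   client = boto3.client(
#       "athena",
#       region_name="us-east-1",
#       aws_access_key_id="...",
#       aws_secret_access_key="...",
#   )
#   print(run_select_one(client, "s3://bucket2/athena-results/"))
```

A `FAILED` state usually points to missing permissions on the staging bucket or on Athena itself; the `Status` entry returned by `get_query_execution` may also carry a `StateChangeReason` with more detail.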

To save the settings, click the Test connection & Save button.

Next steps

Once the connection is tested and saved, the event and dimension tables can be configured. Please follow the setting up event tables guide.