AWS Athena

Overview

Mitzu connects to AWS Athena through an AWS user that has permission to access your data. To set up the connection, first create this user, then configure its credentials in Mitzu.

If you use other AWS services, we recommend creating a dedicated AWS service account that has only the permissions required to run Athena, and using that account's IAM credentials to connect Mitzu to Athena.

See Identity and access management in Athena.

Supported data types

Mitzu maps the data warehouse types according to the following table:

Mitzu type | Data warehouse type
String     | CHAR, CHAR(length), STRING, VARCHAR(length)
Number     | TINYINT, SMALLINT, INT, INTEGER, BIGINT, FLOAT, DOUBLE
Boolean    | BOOLEAN
Datetime   | TIME, DATE, TIMESTAMP
Map        | MAP
Struct     | STRUCT
Array      | ARRAY

Info: All unrecognized types will be handled as strings.
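
The mapping above can be sketched as a small lookup. This is only an illustration of the table, not Mitzu's actual implementation; the function name `map_athena_type` and the way the type string is parsed are assumptions made for the example.

```python
import re

# Hypothetical lookup built from the table above; not Mitzu's actual code.
_TYPE_GROUPS = {
    "String": {"char", "string", "varchar"},
    "Number": {"tinyint", "smallint", "int", "integer", "bigint", "float", "double"},
    "Boolean": {"boolean"},
    "Datetime": {"time", "date", "timestamp"},
    "Map": {"map"},
    "Struct": {"struct"},
    "Array": {"array"},
}

# Invert the groups into a flat "base type -> Mitzu type" dictionary.
_BASE_TO_MITZU = {
    base: mitzu for mitzu, bases in _TYPE_GROUPS.items() for base in bases
}


def map_athena_type(type_name: str) -> str:
    """Return the Mitzu type for a data warehouse column type string."""
    # Keep only the leading identifier, so "VARCHAR(255)" becomes "varchar"
    # and "array<string>" becomes "array".
    match = re.match(r"\s*([a-z_]+)", type_name.lower())
    base = match.group(1) if match else ""
    # Unrecognized types fall back to String, as the note above states.
    return _BASE_TO_MITZU.get(base, "String")
```

For example, `map_athena_type("VARCHAR(255)")` gives `"String"`, and a type that is not in the table, such as `DECIMAL(10,2)`, also falls back to `"String"`.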

Create an AWS Athena service user

Head to AWS IAM and create a new user. This user should be able to access three primary resources:

  • Files in S3
  • AWS Glue
  • AWS Athena

See the AWS IAM documentation for more information about AWS users and how to create them.

Here is an example IAM Policy document containing the proper permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Athena",
      "Effect": "Allow",
      "Action": [
        "athena:BatchGetNamedQuery",
        "athena:BatchGetQueryExecution",
        "athena:GetNamedQuery",
        "athena:GetQueryExecution",
        "athena:GetQueryResults",
        "athena:GetQueryResultsStream",
        "athena:GetWorkGroup",
        "athena:ListDatabases",
        "athena:ListDataCatalogs",
        "athena:ListNamedQueries",
        "athena:ListQueryExecutions",
        "athena:ListTagsForResource",
        "athena:ListWorkGroups",
        "athena:ListTableMetadata",
        "athena:StartQueryExecution",
        "athena:StopQueryExecution",
        "athena:CreatePreparedStatement",
        "athena:DeletePreparedStatement",
        "athena:GetPreparedStatement"
      ],
      "Resource": "*"
    },
    {
      "Sid": "Glue",
      "Effect": "Allow",
      "Action": [
        "glue:BatchGetPartition",
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetPartition",
        "glue:GetPartitions",
        "glue:GetTable",
        "glue:GetTables",
        "glue:GetTableVersion",
        "glue:GetTableVersions"
      ],
      "Resource": "*"
    },
    {
      "Sid": "S3ReadAccess",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": [
        "arn:aws:s3:::bucket1",
        "arn:aws:s3:::bucket1/*",
        "arn:aws:s3:::bucket2",
        "arn:aws:s3:::bucket2/*"
      ]
    },
    {
      "Sid": "AthenaResultsBucket",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:AbortMultipartUpload",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": ["arn:aws:s3:::bucket2", "arn:aws:s3:::bucket2/*"]
    }
  ]
}
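
The bucket names in the policy (bucket1, bucket2) are placeholders, and each bucket appears twice: once as the bucket itself and once with a /* suffix for the objects inside it. As a minimal sketch, the two S3 statements could be generated for your own buckets as follows. The helper names `bucket_arns` and `s3_statements` are hypothetical and exist only for this example.

```python
import json


def bucket_arns(bucket_names):
    """Return the bucket ARN and the object-level ARN for each bucket."""
    arns = []
    for name in bucket_names:
        # Bucket-level actions such as s3:ListBucket apply to the bucket ARN.
        arns.append(f"arn:aws:s3:::{name}")
        # Object-level actions such as s3:GetObject apply to the /* ARN.
        arns.append(f"arn:aws:s3:::{name}/*")
    return arns


def s3_statements(data_buckets, results_bucket):
    """Build the S3ReadAccess and AthenaResultsBucket statements shown above."""
    return [
        {
            "Sid": "S3ReadAccess",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket", "s3:GetBucketLocation"],
            "Resource": bucket_arns(data_buckets),
        },
        {
            "Sid": "AthenaResultsBucket",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:AbortMultipartUpload",
                "s3:ListBucket",
                "s3:GetBucketLocation",
            ],
            "Resource": bucket_arns([results_bucket]),
        },
    ]


if __name__ == "__main__":
    # Reproduces the S3 part of the example policy with its placeholder names.
    print(json.dumps(s3_statements(["bucket1", "bucket2"], "bucket2"), indent=2))
```

The output can be pasted into the Statement array next to the Athena and Glue statements, which do not depend on bucket names.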

Set the credentials in Mitzu

Copy the user's AWS_ACCESS_KEY_ID and AWS_SECRET_KEY into Mitzu. For AWS Athena, either keep the Catalog as AwsDataCatalog or leave the field empty. For S3 Staging Dir, make sure you choose the correct bucket for storing intermediary files; the service user needs write access to it, as in the AthenaResultsBucket statement of the example policy.

Click the Test connection button to check if Mitzu can connect to your data warehouse using the entered values.

Warning: Mitzu will try to connect to your data warehouse and execute a SELECT 1; command. You may need to grant Mitzu further permissions to see and query your data tables.

To save the settings, click the Test connection & Save button.
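
If the connection test fails, it can help to run the same SELECT 1 check outside Mitzu with the service user's credentials. The sketch below is not how Mitzu performs the test; it assumes an Athena client object exposing `start_query_execution` and `get_query_execution`, as the boto3 Athena client does. The function name `run_select_one` and its polling parameters are hypothetical.

```python
import time

# Query states after which Athena stops processing the query.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def run_select_one(athena_client, staging_dir, catalog="AwsDataCatalog",
                   poll_seconds=1.0, max_polls=30):
    """Run SELECT 1 through the given client and return the final query state."""
    started = athena_client.start_query_execution(
        QueryString="SELECT 1",
        QueryExecutionContext={"Catalog": catalog},
        # Athena writes the query result files to the staging directory.
        ResultConfiguration={"OutputLocation": staging_dir},
    )
    query_id = started["QueryExecutionId"]

    # Poll until the query reaches a terminal state or we give up.
    for _ in range(max_polls):
        execution = athena_client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)
    raise TimeoutError(f"Query {query_id} did not finish after {max_polls} polls")
```

With boto3 installed, the client could be created with `boto3.client("athena", region_name=..., aws_access_key_id=..., aws_secret_access_key=...)` and passed in together with the staging directory as an S3 URI. A result of SUCCEEDED indicates that the user can start queries and write results to the staging bucket; access to your own tables still depends on the Glue and S3 read permissions.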

Next steps

Once the connection is tested and saved, the event and dimension tables can be configured. Please follow the setting up event tables guide.