AWS Athena
Overview
Mitzu connects to AWS Athena using an AWS user that has the permissions required to access your data. To connect Mitzu to AWS Athena, first create this user, then configure its credentials in Mitzu.
If you use other AWS services, we recommend creating a dedicated AWS service account that has only the permissions required to run Athena, and entering that account's IAM credentials to connect Mitzu to Athena.
See Identity and access management in Athena.
Supported data types
Mitzu maps the data warehouse types according to the following table:
Mitzu type | Data warehouse type |
---|---|
String | CHAR, CHAR(length), STRING, VARCHAR(length) |
Number | TINYINT, SMALLINT, INT, INTEGER, BIGINT, FLOAT, DOUBLE |
Boolean | BOOLEAN |
Datetime | TIME, DATE, TIMESTAMP |
Map | MAP |
Struct | STRUCT |
Array | ARRAY |
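The mapping above can be sketched as a small lookup. This is only an illustration of the table, not Mitzu's implementation: the names `ATHENA_TO_MITZU` and `mitzu_type` are hypothetical, and returning `None` for types that are not listed (for example `DECIMAL`) is an assumption, since the table does not say how such types are handled.

```python
import re
from typing import Optional

# Athena type name -> Mitzu type, copied from the table above.
ATHENA_TO_MITZU = {
    "CHAR": "String", "STRING": "String", "VARCHAR": "String",
    "TINYINT": "Number", "SMALLINT": "Number", "INT": "Number",
    "INTEGER": "Number", "BIGINT": "Number", "FLOAT": "Number",
    "DOUBLE": "Number",
    "BOOLEAN": "Boolean",
    "TIME": "Datetime", "DATE": "Datetime", "TIMESTAMP": "Datetime",
    "MAP": "Map", "STRUCT": "Struct", "ARRAY": "Array",
}


def mitzu_type(athena_type: str) -> Optional[str]:
    """Return the Mitzu type for an Athena column type, or None if unlisted."""
    # Drop parameters such as "(255)" or "<string>" and normalise the case.
    base = re.split(r"[(<]", athena_type.strip(), maxsplit=1)[0].strip().upper()
    return ATHENA_TO_MITZU.get(base)


if __name__ == "__main__":
    for column_type in ["varchar(255)", "bigint", "array<string>", "decimal(10,2)"]:
        print(column_type, "->", mitzu_type(column_type))
```

Stripping the parameters first lets `CHAR(length)` and `VARCHAR(length)` share one entry each with their unparameterised forms.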
Create an AWS Athena service user
Head to AWS IAM and create a new user. This user should be able to access three primary resources:
Files in S3
AWS Glue
AWS Athena
The AWS IAM documentation has more information about AWS users and how to create them.
Here is an example IAM Policy document containing the proper permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Athena",
      "Effect": "Allow",
      "Action": [
        "athena:BatchGetNamedQuery",
        "athena:BatchGetQueryExecution",
        "athena:GetNamedQuery",
        "athena:GetQueryExecution",
        "athena:GetQueryResults",
        "athena:GetQueryResultsStream",
        "athena:GetWorkGroup",
        "athena:ListDatabases",
        "athena:ListDataCatalogs",
        "athena:ListNamedQueries",
        "athena:ListQueryExecutions",
        "athena:ListTagsForResource",
        "athena:ListWorkGroups",
        "athena:ListTableMetadata",
        "athena:StartQueryExecution",
        "athena:StopQueryExecution",
        "athena:CreatePreparedStatement",
        "athena:DeletePreparedStatement",
        "athena:GetPreparedStatement"
      ],
      "Resource": "*"
    },
    {
      "Sid": "Glue",
      "Effect": "Allow",
      "Action": [
        "glue:BatchGetPartition",
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetPartition",
        "glue:GetPartitions",
        "glue:GetTable",
        "glue:GetTables",
        "glue:GetTableVersion",
        "glue:GetTableVersions"
      ],
      "Resource": "*"
    },
    {
      "Sid": "S3ReadAccess",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": [
        "arn:aws:s3:::bucket1",
        "arn:aws:s3:::bucket1/*",
        "arn:aws:s3:::bucket2",
        "arn:aws:s3:::bucket2/*"
      ]
    },
    {
      "Sid": "AthenaResultsBucket",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:AbortMultipartUpload",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": ["arn:aws:s3:::bucket2", "arn:aws:s3:::bucket2/*"]
    }
  ]
}
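In the example policy, `bucket1` and `bucket2` are placeholders for your own bucket names. Each bucket appears twice because bucket-level actions such as `s3:ListBucket` apply to the bucket ARN, while object-level actions such as `s3:GetObject` apply to the `/*` ARN. A minimal sketch of generating the two S3 statements for your own buckets could look like this; the function names `s3_resources` and `s3_statements` are hypothetical helpers, not part of Mitzu or AWS.

```python
import json


def s3_resources(buckets):
    """Return the bucket-level and object-level ARN for each bucket name."""
    arns = []
    for name in buckets:
        arns.append(f"arn:aws:s3:::{name}")    # bucket ARN: s3:ListBucket, s3:GetBucketLocation
        arns.append(f"arn:aws:s3:::{name}/*")  # object ARN: s3:GetObject, s3:PutObject, ...
    return arns


def s3_statements(data_buckets, results_bucket):
    """Build the S3ReadAccess and AthenaResultsBucket statements of the policy."""
    return [
        {
            "Sid": "S3ReadAccess",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket", "s3:GetBucketLocation"],
            "Resource": s3_resources(data_buckets),
        },
        {
            "Sid": "AthenaResultsBucket",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:AbortMultipartUpload",
                "s3:ListBucket",
                "s3:GetBucketLocation",
            ],
            "Resource": s3_resources([results_bucket]),
        },
    ]


if __name__ == "__main__":
    # Same bucket layout as the example policy above.
    print(json.dumps(s3_statements(["bucket1", "bucket2"], "bucket2"), indent=2))
```

Only the results bucket receives write actions; the data buckets stay read-only.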
Set the credentials in Mitzu
Copy the AWS_ACCESS_KEY_ID and AWS_SECRET_KEY of the service user into Mitzu. For AWS Athena, leave the Catalog field set to AwsDataCatalog or leave it empty. For S3 Staging Dir, make sure you choose the correct bucket for storing intermediary files.
Click the Test connection button to check whether Mitzu can connect to your data warehouse using the entered values.
The connection test only runs a SELECT 1; command. You may need to grant Mitzu further permissions to see and query your data tables.
To save the settings, click the Test connection & Save button.
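If the connection test fails, it can help to reproduce it outside Mitzu with the same credentials. The following is a rough sketch using the boto3 SDK, not Mitzu's actual implementation. The function names, the `region` parameter (this page does not mention a region setting), and treating an empty catalog as `AwsDataCatalog` are all assumptions made for illustration.

```python
import time


def select_one_request(staging_dir, catalog="AwsDataCatalog"):
    """Build the arguments for Athena's StartQueryExecution call."""
    return {
        "QueryString": "SELECT 1",
        # Assumption: an empty Catalog field means the default catalog.
        "QueryExecutionContext": {"Catalog": catalog or "AwsDataCatalog"},
        # Athena writes query results to this S3 location (the S3 Staging Dir).
        "ResultConfiguration": {"OutputLocation": staging_dir},
    }


def run_connection_test(access_key_id, secret_key, region, staging_dir,
                        catalog="AwsDataCatalog", timeout_s=60):
    """Run SELECT 1 on Athena and return (state, reason)."""
    import boto3  # third-party AWS SDK; imported lazily so the helper above has no dependencies

    client = boto3.client(
        "athena",
        region_name=region,
        aws_access_key_id=access_key_id,
        aws_secret_access_key=secret_key,
    )
    response = client.start_query_execution(**select_one_request(staging_dir, catalog))
    query_id = response["QueryExecutionId"]

    # Poll until the query reaches a terminal state or the timeout expires.
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = client.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]
        if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return status["State"], status.get("StateChangeReason")
        time.sleep(1)
    return "TIMEOUT", None
```

A call such as `run_connection_test(key_id, secret, "us-east-1", "s3://bucket2/staging/")` uses placeholder values. A `FAILED` state with an access-denied reason usually points to a missing permission on the staging bucket or in the IAM policy.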
Next steps
Once the connection is tested and saved, the event and dimension tables can be configured. Please follow the setting up event tables guide.