Prerequisites

  • Alation Cloud Service: applies to Alation Cloud Service instances of Alation

  • Customer Managed: applies to customer-managed instances of Alation

This section helps you prepare to configure the connector properties in Alation, including obtaining authentication details and ensuring your project contains the necessary files. The prerequisites depend on your dbt product and vary between dbt Cloud and Core.

dbt Cloud

Before using the connector for extraction in dbt Cloud, you must complete the following steps:

Set Up a dbt Project

Create a project with the required models or use an existing one.

For more information on setting up a project, see dbt Project. You must also execute the required jobs and tests for the connector to extract their results. For information about models, jobs, and test status, see dbt models and Test status.

Generate a Service Account Token

A service token is a unique access token linked to an account and is used to assign a specific set of permissions for managing access. Service tokens are the preferred method to enable access on behalf of the dbt Cloud account.

To generate a service account token, see Generate Service Account Tokens.

Important

In the New service token window, you must select Read-Only and All projects in the Permission set and Project fields, respectively.

Note

Optionally, you can use a Personal Access Token (PAT) instead of a service token. A PAT is a unique access token tied to an individual user account. For more details, refer to Personal Access Tokens. However, dbt recommends using service tokens over PATs. For details, see Authentication tokens.

Enable ELT Source in Alation

To enable an ELT source in Alation, contact Alation Support.

dbt Core

Before using the connector for extraction in dbt Core, you must complete the following steps:

Prepare the Artifacts for dbt Projects

To set up dbt projects, your project structure must contain the artifacts listed in this table. These artifacts are JSON files that the connector needs to extract metadata and lineage. If you already have a project, verify that these files are present in the project structure. If not, create the files and place them in the project structure.

Note

Alation doesn’t support the ODBC project type.

manifest.json

  Description: Fetches the dbt model information and builds table-level lineage.

  How to generate: Run the dbt build command.

env_details.json

  Description: Fetches the host and port information to build source system information and table-level or column-level lineage.

  How to generate: Create manually.

catalog.json (optional)

  Description: Fetches the dbt model column information and builds column-level lineage.

  How to generate: Run the dbt docs generate command.

run_results.json (optional)

  Description: Fetches the job run and test run information to build data health.

  How to generate: Run the dbt build command.
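As a quick sanity check before uploading, a short script can confirm that a project folder contains the artifacts listed above. This is a convenience sketch, not part of the connector; the split between required and optional files follows the table:

```python
from pathlib import Path

# Artifacts from the table above.
REQUIRED = ("manifest.json", "env_details.json")
OPTIONAL = ("catalog.json", "run_results.json")

def missing_artifacts(project_dir):
    """Return the required artifact files absent from project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED if not (root / name).is_file()]

def absent_optional(project_dir):
    """Return the optional artifacts that are absent. Extraction still
    works without them, but some features (for example, column-level
    lineage from catalog.json) will be unavailable."""
    root = Path(project_dir)
    return [name for name in OPTIONAL if not (root / name).is_file()]
```

Run this against each project folder before placing it in the storage location.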

Generate Files for dbt Project

You can generate the required files using the following methods:

Generate the Manifest and Run Files

To generate the manifest.json and run_results.json files, Alation recommends using the dbt build command. Alternatively, you can use the dbt run or dbt test commands to create these files (manifest.json using dbt run, and run_results.json using dbt run or dbt test); place the resulting files in the project folder structure. If this produces multiple run_results.json files (one each from dbt run and dbt test), rename them and place them in your project as shown below:

../../../_images/dbt-core-proj-structure3.png
Generate the Environment File

Alation displays database table information on the catalog pages of dbt objects under the Source System Information field. To enable this, Alation requires a custom file called env_details.json, which should be included alongside the other required files. This file contains the host and port details for the database resources in the extracted project.

You can create the env_details.json file manually or using a custom build.

To manually create the file:

  1. Identify the required resources in your project and collect the host and port information for each.

  2. Create a JSON file using a text editor of your choice, following this format:

    {
    "host": "<host1>",
    "port": <port_for_host1>
    }
    

    Here’s an example of the env_details.json file created manually for Databricks Unity Catalog.

    {
        "host":"dbc-25e69bfd-44ed.cloud.databricks.com",
        "port": 443
    }
    

    Note

    If you are using Databricks without Unity Catalog, after creating the env_details.json file, you must set unityCatalog to false in the env_details.json file.

    {
        "host":"dbc-25e69bfd-44ed.cloud.databricks.com",
        "port": 443,
        "unityCatalog": false
    }
    

The host and port information is available in the profiles.yml file under the ~/.dbt folder within your dbt Core environment.
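The manual steps above can also be scripted. The following is a minimal sketch, where the host, port, and Unity Catalog flag are placeholder values you would replace with the details from your own profiles.yml:

```python
import json

# Placeholder values; copy the real host and port from the profiles.yml
# file under the ~/.dbt folder of your dbt Core environment.
env_details = {
    "host": "dbc-25e69bfd-44ed.cloud.databricks.com",
    "port": 443,
}

# Only required for Databricks without Unity Catalog: add the flag
# "unityCatalog": false as described in the note above.
unity_catalog = True
if not unity_catalog:
    env_details["unityCatalog"] = False

# Write env_details.json alongside the other project artifacts.
with open("env_details.json", "w") as f:
    json.dump(env_details, f, indent=4)
```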

The information below explains how to find the host and port details for each type of data source.

Snowflake

Example from the profiles.yml file:

snowflake_dbt_project:
  outputs:
    prod:
      account: alation_partner.us-east-1
      database: IM_SNOWFLAKE_CLL_1
      password: <password>
      role: ACCOUNTADMIN
      schema: prod_schema_dbt_core_gen2
      threads: 1
      type: snowflake
      user: <user>
      warehouse: TEST
  target: prod

host: Take the value of the account field and append .snowflakecomputing.com to it.

Example: "host": "alation_partner.us-east-1.snowflakecomputing.com"

port: The value is always the default port 443.

PostgreSQL

Example from the profiles.yml file:

postgres_dbt:
  outputs:
    prod:
      dbname: test_alation_adbc_database_01
      host: 10.13.34.128
      pass: <password>
      port: 5432
      schema: target_schema
      threads: 1
      type: postgres
      user: <user>
  target: prod

host: Look for the value in the host field.

Example: "host": "10.13.34.128"

port: Look for the value in the port field.

Example: "port": 5432

Redshift

Example from the profiles.yml file:

redshift_dbt:
  outputs:
    prod:
      dbname: test_alation_adbc_database_01
      host: test.chby8zuitgrf.us-east-1.redshift.amazonaws.com
      pass: <password>
      port: 5439
      schema: target_schema
      threads: 1
      type: redshift
      user: <user>
  target: prod

host: Look for the value in the host field.

Example: "host": "test.chby8zuitgrf.us-east-1.redshift.amazonaws.com"

port: Look for the value in the port field.

Example: "port": 5439

Google BigQuery

Not applicable (no profiles.yml entry is needed).

host: The value is always www.googleapis.com.

port: The value is always 443.

Unity Databricks

Example from the profiles.yml file:

unitydatabricks_dbt:
  outputs:
    prod:
      catalog: ap_test_catalog
      host: dbc-xxxx.cloud.databricks.com
      http_path: sql/protocolv1/o/7841352139603430/0205-054336-bjxhu84o
      schema: databricks_dbt_target_schema
      threads: 1
      token: <token>
      type: databricks
  target: prod

host: Look for the value in the host field.

Example: "host": "dbc-xxxx.cloud.databricks.com"

port: The value is always the default port 443.

Non-Unity Databricks

Example from the profiles.yml file:

unitydatabricks_dbt:
  outputs:
    prod:
      catalog: null
      host: dbc-xxxx.cloud.databricks.com
      http_path: sql/protocolv1/o/7841352139603430/0317-045430-puca15i6
      schema: dbt_core_gen2_aws_databricks_target_schema
      threads: 1
      token: <token>
      type: databricks
  target: prod

host: Look for the value in the host field.

Example: "host": "dbc-xxxx.cloud.databricks.com"

port: The value is always the default port 443.
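The per-source rules above can be condensed into a small helper. This is a sketch, not an official Alation utility; the input dict is one output block from profiles.yml, and the field names match the examples above:

```python
def derive_env_details(output):
    """Map one profiles.yml output block to env_details.json content,
    following the per-source host and port rules described above."""
    source_type = output["type"]
    if source_type == "snowflake":
        # Append snowflakecomputing.com to the account value; port is 443.
        return {"host": output["account"] + ".snowflakecomputing.com",
                "port": 443}
    if source_type in ("postgres", "redshift"):
        # host and port are taken directly from the profile.
        return {"host": output["host"], "port": output["port"]}
    if source_type == "bigquery":
        # BigQuery uses fixed values.
        return {"host": "www.googleapis.com", "port": 443}
    if source_type == "databricks":
        # Databricks uses the host field; port is always 443.
        return {"host": output["host"], "port": 443}
    raise ValueError(f"Unsupported source type: {source_type}")
```

For example, passing the Snowflake output block above yields the host alation_partner.us-east-1.snowflakecomputing.com with port 443.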

Generate the Catalog File

To generate the catalog.json file, use the dbt docs generate command. This command generates the documentation for the dbt project and creates the catalog.json file. Place the file in the project folder structure.

Important

The catalog.json file is optional; however, if it is not included, Alation will not display column-level lineage information.

Create the File Structure for dbt Projects

Alation requires these files, if not already present, to be placed in a designated storage location that follows a specific directory structure. Depending on your setup, your project may include a single environment or multiple environments to support different use cases.

If a project on the storage system does not follow the required structure, the connector will skip metadata and lineage extraction for that project.

Project Structure for a Single Environment

../../../_images/dbt-core-proj-structure1.png

The <project_name> is a placeholder that represents a specific dbt project.

Example

../../../_images/dbt-core-proj-structure1-example.png

Alation extracts from the production environment by default if you don’t specify an environment.

../../../_images/dbt-core-proj-structure4.png

Example

../../../_images/dbt-core-proj-structure4-example.png

Project Structure for Multiple Environments

A project structure with multiple environments allows you to catalog development or staging sources or targets in Alation.

../../../_images/dbt-core-proj-structure2.png

Example

../../../_images/dbt-core-proj-structure2-example.png

Grant Access to Storage Location

Alation supports the following storage locations to store your files:

  • GitHub

  • Amazon S3

  • Azure Blob Storage

Based on your preferred storage location, you must allow Alation to access the projects in the respective storage location.

Grant Access to Projects on Amazon S3

  1. Create an S3 bucket in your AWS account or use an existing one.

  2. Create an AWS IAM user with the following permissions. Replace BUCKET_NAME with the actual name of your S3 bucket.

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:ListBucket"
                ],
                "Resource": [
                    "arn:aws:s3:::BUCKET_NAME/*",
                    "arn:aws:s3:::BUCKET_NAME"
                ]
             }
        ]
    }
    

For more information on creating an IAM user with the required permissions, see Create an IAM User in Your AWS Account.
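Before attaching the policy, you can render and check it locally to catch a forgotten BUCKET_NAME substitution. This sketch mirrors the policy document above, substituting only the bucket name:

```python
import json

def render_s3_policy(bucket_name):
    """Build the read-only S3 policy shown above for a concrete bucket.
    Only BUCKET_NAME is substituted; the statement is otherwise the
    same as the document's example."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}/*",
                    f"arn:aws:s3:::{bucket_name}",
                ],
            }
        ],
    }

# "my-dbt-artifacts" is a placeholder bucket name.
print(json.dumps(render_s3_policy("my-dbt-artifacts"), indent=4))
```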

Grant Access to Projects on GitHub

  1. Create a GitHub repository or use an existing one.

  2. Create a GitHub access token. For details on how to create a Personal Access Token, see Managing your personal access tokens.

  3. In your GitHub repository page, go to Settings > Developer settings > Personal access tokens > Tokens (classic) and grant the repo scope to the token.

    ../../../_images/dbt-core-github-access.png

Grant Access to Projects on Azure Blob Storage

  1. Create an Azure Blob Storage account in your Azure account or use an existing one.

  2. Create a storage access key or Shared Access Signature. For details on how to create a storage access key, see Use the account access key and for Shared Access Signature, see Create a storage SAS.

    The storage access key must have full access to the storage account. The Shared Access Signature must have the following permissions:

    • Allowed services - Blob

    • Allowed resource types - Service, Container, and Object

    • Allowed permissions - Read

    ../../../_images/dbt-core-azure-sas.png

Upload the Projects

After you set up the project structure, upload the projects with the prepared files to GitHub, Amazon S3, or Azure Blob Storage based on your preferred storage location.

Note

If you already have projects with required files in GitHub, Amazon S3, or Azure Blob Storage, you can use the connector to extract metadata and lineage, provided the projects are in the required structure.

Enable ELT Source

To enable an ELT source in Alation, contact Alation Support.