Prerequisites

  • Alation Cloud Service: applies to Alation Cloud Service instances of Alation

  • Customer Managed: applies to customer-managed instances of Alation

This section helps you prepare to configure the connector properties in Alation, including obtaining authentication details and ensuring your project contains the necessary files. The prerequisites depend on your dbt product and vary between dbt Cloud and Core.

dbt Cloud

Before using the connector for extraction in dbt Cloud, you must complete the following steps:

Set Up a dbt Project

Create a project with the required models or use an existing one.

For more information on setting up a project, see dbt Project. You must also execute the required jobs and tests for the connector to extract their results. For information about models, jobs, and test status, see dbt models and Test status.

Generate a Service Account Token

A service token is a unique access token linked to an account and is used to assign a specific set of permissions for managing access. Service tokens are the preferred method to enable access on behalf of the dbt Cloud account.

To generate a service account token, see Generate Service Account Tokens.

Important

In the New service token window, you must select Read-Only and All projects in the Permission set and Project fields, respectively.

Note

Optionally, you can use a Personal Access Token (PAT) instead of a service token. A PAT is a unique access token tied to an individual user account. For more details, refer to Personal Access Tokens. However, dbt recommends using service tokens over PATs. For details, see Authentication tokens.

Enable ELT Source in Alation

To enable an ELT source in Alation, contact Alation Support.

dbt Core

Before using the connector for extraction in dbt Core, you must complete the following steps:

Prepare the Artifacts for dbt Projects

To set up dbt projects, your project structure must contain the artifacts listed in this table. These artifacts are JSON files that the connector needs to extract metadata and lineage. If you already have a project, verify that these files are present in the project structure. If not, create the files and place them in the project structure.

Note

Alation doesn’t support the ODBC project type.

manifest.json

  Description: Fetches the dbt model information and builds table-level lineage.

  How to generate: Run the dbt build command.

env_details.json

  Description: Fetches the host and port information to build source system information and table-level or column-level lineage.

  How to generate: Create manually.

catalog.json (optional)

  Description: Fetches the dbt model column information and builds column-level lineage.

  How to generate: Run the dbt docs generate command.

run_results.json (optional)

  Description: Fetches the job run and test run information to build data health.

  How to generate: Run the dbt build command.
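As a quick sanity check before uploading, a short script can confirm that a project folder contains the artifacts listed above. This is a convenience sketch, not part of the connector; the split between required and optional files follows the table:

```python
from pathlib import Path

# Artifacts from the table above.
REQUIRED = ("manifest.json", "env_details.json")
OPTIONAL = ("catalog.json", "run_results.json")

def missing_artifacts(project_dir):
    """Return the required artifact files absent from project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED if not (root / name).is_file()]

def absent_optional(project_dir):
    """Return the optional artifacts that are absent. Extraction still
    works without them, but some features (for example, column-level
    lineage from catalog.json) will be unavailable."""
    root = Path(project_dir)
    return [name for name in OPTIONAL if not (root / name).is_file()]
```

Run this against each project folder before placing it in the storage location.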

Generate Files for dbt Project

You can generate the required files using the following methods:

Generate the Manifest and Run Files

To generate the manifest.json and run_results.json files, Alation recommends using the dbt build command. Alternatively, you can use the dbt run or dbt test commands to create these files (manifest.json using dbt run, and run_results.json using dbt run or dbt test); place the resulting files in the project folder structure. If this produces multiple run_results.json files (one each from dbt run and dbt test), rename them and place them in your project as shown below:

../../../_images/dbt-core-proj-structure3.png
Generate the Environment File

Alation displays database table information on the catalog pages of dbt objects under the Source System Information field. To enable this, Alation requires a custom file called env_details.json, which should be included alongside the other required files. This file contains the host and port details for the database resources in the extracted project.

You can create the env_details.json file manually or using a custom build.

To manually create the file:

  1. Identify the required resources in your project and collect the host and port information for each.

  2. Create a JSON file using a text editor of your choice, following this format:

    {
    "host": "<host1>",
    "port": <port_for_host1>
    }
    

    Here’s an example of the env_details.json file created manually for Databricks Unity Catalog.

    {
        "host":"dbc-25e69bfd-44ed.cloud.databricks.com",
        "port": 443
    }
    

    Note

    If you are using Databricks without Unity Catalog, after creating the env_details.json file, you must set unityCatalog to false in the env_details.json file.

    {
        "host":"dbc-25e69bfd-44ed.cloud.databricks.com",
        "port": 443,
        "unityCatalog": false
    }
    

The host and port information is available in the profiles.yml file under the ~/.dbt folder within your dbt Core environment.
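The manual steps above can also be scripted. The following is a minimal sketch, where the host, port, and Unity Catalog flag are placeholder values you would replace with the details from your own profiles.yml:

```python
import json

# Placeholder values; copy the real host and port from the profiles.yml
# file under the ~/.dbt folder of your dbt Core environment.
env_details = {
    "host": "dbc-25e69bfd-44ed.cloud.databricks.com",
    "port": 443,
}

# Only required for Databricks without Unity Catalog: add the flag
# "unityCatalog": false as described in the note above.
unity_catalog = True
if not unity_catalog:
    env_details["unityCatalog"] = False

# Write env_details.json alongside the other project artifacts.
with open("env_details.json", "w") as f:
    json.dump(env_details, f, indent=4)
```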

The information below explains how to find the host and port details for each type of data source.

Snowflake

Example from the profiles.yml file:

snowflake_dbt_project:
  outputs:
    prod:
      account: alation_partner.us-east-1
      database: IM_SNOWFLAKE_CLL_1
      password: <password>
      role: ACCOUNTADMIN
      schema: prod_schema_dbt_core_gen2
      threads: 1
      type: snowflake
      user: <user>
      warehouse: TEST
  target: prod

host: Take the value of the account field and append .snowflakecomputing.com to it.

Example: "host": "alation_partner.us-east-1.snowflakecomputing.com"

port: The value is always the default port 443.

PostgreSQL

Example from the profiles.yml file:

postgres_dbt:
  outputs:
    prod:
      dbname: test_alation_adbc_database_01
      host: 10.13.34.128
      pass: <password>
      port: 5432
      schema: target_schema
      threads: 1
      type: postgres
      user: <user>
  target: prod

host: Look for the value in the host field.

Example: "host": "10.13.34.128"

port: Look for the value in the port field.

Example: "port": 5432

Redshift

Example from the profiles.yml file:

redshift_dbt:
  outputs:
    prod:
      dbname: test_alation_adbc_database_01
      host: test.chby8zuitgrf.us-east-1.redshift.amazonaws.com
      pass: <password>
      port: 5439
      schema: target_schema
      threads: 1
      type: redshift
      user: <user>
  target: prod

host: Look for the value in the host field.

Example: "host": "test.chby8zuitgrf.us-east-1.redshift.amazonaws.com"

port: Look for the value in the port field.

Example: "port": 5439

Google BigQuery

Not applicable (no profiles.yml entry is needed).

host: The value is always www.googleapis.com.

port: The value is always 443.

Unity Databricks

Example from the profiles.yml file:

unitydatabricks_dbt:
  outputs:
    prod:
      catalog: ap_test_catalog
      host: dbc-xxxx.cloud.databricks.com
      http_path: sql/protocolv1/o/7841352139603430/0205-054336-bjxhu84o
      schema: databricks_dbt_target_schema
      threads: 1
      token: <token>
      type: databricks
  target: prod

host: Look for the value in the host field.

Example: "host": "dbc-xxxx.cloud.databricks.com"

port: The value is always the default port 443.

Non-Unity Databricks

Example from the profiles.yml file:

unitydatabricks_dbt:
  outputs:
    prod:
      catalog: null
      host: dbc-xxxx.cloud.databricks.com
      http_path: sql/protocolv1/o/7841352139603430/0317-045430-puca15i6
      schema: dbt_core_gen2_aws_databricks_target_schema
      threads: 1
      token: <token>
      type: databricks
  target: prod

host: Look for the value in the host field.

Example: "host": "dbc-xxxx.cloud.databricks.com"

port: The value is always the default port 443.
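The per-source rules above can be condensed into a small helper. This is a sketch, not an official Alation utility; the input dict is one output block from profiles.yml, and the field names match the examples above:

```python
def derive_env_details(output):
    """Map one profiles.yml output block to env_details.json content,
    following the per-source host and port rules described above."""
    source_type = output["type"]
    if source_type == "snowflake":
        # Append snowflakecomputing.com to the account value; port is 443.
        return {"host": output["account"] + ".snowflakecomputing.com",
                "port": 443}
    if source_type in ("postgres", "redshift"):
        # host and port are taken directly from the profile.
        return {"host": output["host"], "port": output["port"]}
    if source_type == "bigquery":
        # BigQuery uses fixed values.
        return {"host": "www.googleapis.com", "port": 443}
    if source_type == "databricks":
        # Databricks uses the host field; port is always 443.
        return {"host": output["host"], "port": 443}
    raise ValueError(f"Unsupported source type: {source_type}")
```

For example, passing the Snowflake output block above yields the host alation_partner.us-east-1.snowflakecomputing.com with port 443.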

Generate the Catalog File

To generate the catalog.json file, use the dbt docs generate command. This command generates the documentation for the dbt project and creates the catalog.json file. Place the file in the project folder structure.

Important

The catalog.json file is optional; however, if it is not included, Alation will not display column-level lineage information.

Create the File Structure for dbt Projects

Alation requires these files, if not already present, to be placed in a designated storage location that follows a specific directory structure. Depending on your setup, your project may include a single environment or multiple environments to support different use cases.

If a project on the storage system does not follow the required structure, the connector will skip metadata and lineage extraction for that project.

Project Structure for a Single Environment

../../../_images/dbt-core-proj-structure1.png

The <project_name> is a placeholder that represents a specific dbt project.

Example

../../../_images/dbt-core-proj-structure1-example.png

Alation extracts from the production environment by default if you don’t specify an environment.

../../../_images/dbt-core-proj-structure4.png

Example

../../../_images/dbt-core-proj-structure4-example.png

Project Structure for Multiple Environments

A project structure with multiple environments allows you to catalog development or staging sources or targets in Alation.

../../../_images/dbt-core-proj-structure2.png

Example

../../../_images/dbt-core-proj-structure2-example.png

Grant Access to Storage Location

Alation supports the following storage locations to store your files:

  • GitHub

  • Amazon S3

  • Azure Blob Storage

Based on your preferred storage location, you must allow Alation to access the projects in the respective storage location.

Grant Access to Projects on Amazon S3

  1. Create an S3 bucket in your AWS account or use an existing one.

  2. Create an AWS IAM user with the following permissions. Replace BUCKET_NAME with the actual name of your S3 bucket.

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:ListBucket"
                ],
                "Resource": [
                    "arn:aws:s3:::BUCKET_NAME/*",
                    "arn:aws:s3:::BUCKET_NAME"
                ]
             }
        ]
    }
    

For more information on creating an IAM user with the required permissions, see Create an IAM User in Your AWS Account.
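Before attaching the policy, you can render and check it locally to catch a forgotten BUCKET_NAME substitution. This sketch mirrors the policy document above, substituting only the bucket name:

```python
import json

def render_s3_policy(bucket_name):
    """Build the read-only S3 policy shown above for a concrete bucket.
    Only BUCKET_NAME is substituted; the statement is otherwise the
    same as the document's example."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}/*",
                    f"arn:aws:s3:::{bucket_name}",
                ],
            }
        ],
    }

# "my-dbt-artifacts" is a placeholder bucket name.
print(json.dumps(render_s3_policy("my-dbt-artifacts"), indent=4))
```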

Grant Access to Projects on GitHub

  1. Create a GitHub repository or use an existing one.

  2. Create a GitHub access token. For details on how to create a Personal Access Token, see Managing your personal access tokens.

  3. In your GitHub repository page, go to Settings > Developer settings > Personal access tokens > Tokens (classic) and grant the repo scope to the token.

    ../../../_images/dbt-core-github-access.png

Grant Access to Projects on Azure Blob Storage

  1. Create an Azure Blob Storage account in your Azure account or use an existing one.

  2. Create a storage access key or Shared Access Signature. For details on how to create a storage access key, see Use the account access key and for Shared Access Signature, see Create a storage SAS.

    The storage access key must have full access to the storage account. The Shared Access Signature must have the following permissions:

    • Allowed services - Blob

    • Allowed resource types - Service, Container, and Object

    • Allowed permissions - Read

    ../../../_images/dbt-core-azure-sas.png

Upload the Projects

After you set up the project structure, upload the projects with the prepared files to GitHub, Amazon S3, or Azure Blob Storage based on your preferred storage location.

Note

If you already have projects with required files in GitHub, Amazon S3, or Azure Blob Storage, you can use the connector to extract metadata and lineage, provided the projects are in the required structure.

Enable ELT Source

To enable an ELT source in Alation, contact Alation Support.