Overview

Alation Cloud Service: Applies to Alation Cloud Service instances of Alation

Customer Managed: Applies to customer-managed instances of Alation

Available from Alation version 2024.3.4

The OCF Connector for dbt Gen2 is developed by Alation and is available for download from the Connector Hub on the Alation Customer Portal. Ask an Alation admin with access to the Customer Portal to download the connector from the Connectors section (Customer Portal > Connectors).

Note

The dbt Gen2 OCF connector is available as a standalone connector. For information on how this connector is different from the add-on connector for dbt, see ELT Connectors.

The connector extracts projects, models, model columns, and job runs (jobs executed for a given model in an instance) from dbt. After extraction, the metadata is represented in the data catalog as a hierarchy of catalog pages. Alation users can use the full catalog functionality to search for the extracted metadata, curate the corresponding catalog pages, document the data source, and exchange information about it.

Team

You may require assistance from the following administrators to install and configure this connector in Alation:

  • Alation Server administrator:

    • Installs the connector.

    • Enables and configures metadata and lineage extraction.

  • dbt administrator:

    • Provides the API URLs for accessing dbt and extracting metadata through the dbt Admin and Discovery APIs.

    • Provides the account ID required for the API URLs.

    • Provides an access token for use with the dbt APIs.
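As a rough illustration of how the API URL, account ID, and access token fit together, the sketch below builds (but does not send) a request to the dbt Cloud Admin API that lists the projects in an account. The base URL, the `/api/v2/accounts/{account_id}/projects/` path, and the `Token` authorization scheme are assumptions based on a common dbt Cloud setup, and the account ID and token are placeholders; use the values your dbt administrator provides. The function name is hypothetical and is not part of the connector.

```python
import urllib.request

# Assumption: a commonly used dbt Cloud base URL. Your dbt administrator
# provides the actual API URL for your region or plan.
DEFAULT_BASE_URL = "https://cloud.getdbt.com"


def build_projects_request(account_id, token, base_url=DEFAULT_BASE_URL):
    """Build, without sending, a request that lists the projects in an account."""
    url = f"{base_url.rstrip('/')}/api/v2/accounts/{account_id}/projects/"
    headers = {
        # Assumption: token-style authorization header for the Admin API.
        "Authorization": f"Token {token}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


# Placeholder values for illustration only.
request = build_projects_request(12345, "dbt-token-placeholder")
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; a successful response suggests that the URL, account ID, and token are consistent with each other before you enter them in Alation.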

Scope

The table below shows which metadata objects are extracted by this connector and which operations are supported.

| Feature | Scope | Availability |
|---|---|---|
| Authentication (dbt Cloud) | | |
| dbt access token | Authentication using an access token that can be obtained from dbt account settings. dbt Cloud supports both service account tokens and Personal Access Tokens (PAT). | Yes |
| Authentication (dbt Core) | | |
| GitHub access token | Authentication using an access token that can be obtained from GitHub. | Yes |
| Authentication with IAM user | Amazon S3 bucket authentication with an IAM user. | Yes |
| Shared Access Signature | Azure Blob storage authentication using a Shared Access Signature. | Yes |
| Metadata extraction (MDE) | | |
| Default MDE | Extraction of metadata using dbt APIs. | Yes |
| Selective extraction | Extraction of selected metadata using configurable filters. | Yes |
| Extracted metadata objects | | |
| Project | List of projects. | Yes |
| Models | List of models within a project. | Yes |
| Model columns | Includes model columns, model description, and job runs. | Yes |
| Test status | Includes data health of associated data source tables. | Yes |
| Lineage | | |
| Jinja code extraction | Extraction of Jinja code from dbt into dataflow content. Jinja code is used as the transformation scripting language in dbt. Jinja code is extracted from the dbt models. | Yes |
| Automatic lineage generation | Auto-calculation of lineage based on the lineage data extracted from dbt. | Yes |
| Table-level lineage | Calculation of lineage data at the table level. | Yes |
| Column-level lineage | Calculation of lineage data at the column level. | Yes |
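To give a sense of where table-level lineage and Jinja code come from in a dbt project, the following sketch reads model dependencies and raw model code from a manifest-like dictionary. It is not the connector's implementation. It assumes the manifest.json layout written by recent dbt versions (`nodes`, `resource_type`, `depends_on.nodes`, `raw_code`); the sample project and the function names are hypothetical.

```python
def table_level_edges(manifest):
    """Return sorted (upstream, downstream) pairs for every model in the manifest."""
    edges = []
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, and other node types
        for parent in node.get("depends_on", {}).get("nodes", []):
            edges.append((parent, unique_id))
    return sorted(edges)


def jinja_by_model(manifest):
    """Map each model's unique ID to its raw, Jinja-templated code."""
    # Assumption: the field is named raw_code; older dbt versions used raw_sql.
    return {
        unique_id: node.get("raw_code", "")
        for unique_id, node in manifest.get("nodes", {}).items()
        if node.get("resource_type") == "model"
    }


# Hypothetical two-model project, shaped like a parsed manifest.json.
sample_manifest = {
    "nodes": {
        "model.shop.stg_orders": {
            "resource_type": "model",
            "raw_code": "select * from {{ source('raw', 'orders') }}",
            "depends_on": {"nodes": ["source.shop.raw.orders"]},
        },
        "model.shop.orders": {
            "resource_type": "model",
            "raw_code": "select * from {{ ref('stg_orders') }}",
            "depends_on": {"nodes": ["model.shop.stg_orders"]},
        },
        "test.shop.not_null_orders_id": {
            "resource_type": "test",
            "depends_on": {"nodes": ["model.shop.orders"]},
        },
    }
}

for upstream, downstream in table_level_edges(sample_manifest):
    print(f"{upstream} -> {downstream}")
```

With a real project, you would load the dictionary with `json.load` from manifest.json instead of defining it inline.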

Supported Data Sources for dbt Gen2 OCF Connector

The dbt Gen2 OCF connector supports metadata and lineage extraction from the following data sources in both dbt Cloud and dbt Core:

  • Azure Databricks

  • Databricks on AWS

    Note

    Databricks on AWS supports table-level lineage and does not support column-level lineage.

  • Databricks on Google Cloud

  • Databricks Unity Catalog

  • Google BigQuery

  • PostgreSQL

  • Redshift

  • Snowflake

dbt Core

To configure dbt Core in Alation, you need to include the following files in your dbt project:

  • manifest.json

  • env_details.json

  • run_results.json (Optional)

  • catalog.json (Optional)

For details, see Prerequisites.
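Before uploading files to a storage location, it can help to confirm that the required files are present. The sketch below is a minimal local check, assuming all four files sit in a single directory that you pass in; the function name and the report format are hypothetical, and the check does not validate the file contents.

```python
import tempfile
from pathlib import Path

REQUIRED_FILES = ("manifest.json", "env_details.json")
OPTIONAL_FILES = ("run_results.json", "catalog.json")


def check_dbt_core_files(directory):
    """Report which required and optional files are missing from a directory."""
    root = Path(directory)
    return {
        "missing_required": [n for n in REQUIRED_FILES if not (root / n).is_file()],
        "missing_optional": [n for n in OPTIONAL_FILES if not (root / n).is_file()],
    }


# Demonstration with a temporary directory that contains only manifest.json.
with tempfile.TemporaryDirectory() as demo_dir:
    (Path(demo_dir) / "manifest.json").write_text("{}")
    report = check_dbt_core_files(demo_dir)

print(report)
```

An empty `missing_required` list means both required files were found; missing optional files are reported separately so that they can be treated as a warning instead of an error.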

Alation supports the following storage locations for storing your files:

  • GitHub

  • Amazon S3

  • Azure Blob Storage

Understand Catalog Pages of dbt Objects

After extraction, Alation generates a catalog page for each extracted project, model, and model column.

Use the Alation UI to understand and navigate the dbt metadata hierarchy.

Project Catalog Page

The project catalog page contains:

  • A list of models within the project.

  • The Job Runs table, which tracks jobs run on the project.

  • The Source System Information field, which lists data sources associated with the project.

  • Built-in and custom catalog fields associated with the catalog template of this object.

Here’s an example of a project catalog page.

../../../_images/dbt-project-catalog.png

Model Catalog Page

The model catalog page includes:

  • A list of model columns.

  • The Source System Information field with the linked tables in an environment, associated with the extracted RDBMS data source.

  • The Job Runs table, which shows the latest job execution summary for a given model.

  • Built-in and custom catalog fields associated with the catalog template of this object.

Here’s an example of a model catalog page.

../../../_images/dbt-model-catalog-page.png
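For dbt Core projects, a per-model execution summary of the kind shown in the Job Runs table can be approximated from run_results.json. The sketch below is illustrative only and does not describe how the connector populates the table. It assumes the run_results.json layout written by recent dbt versions (a `results` list whose entries carry `unique_id`, `status`, and `execution_time`); the sample data and the function name are hypothetical.

```python
def summarize_runs(run_results):
    """Map each model's unique ID to its run status and execution time in seconds."""
    summary = {}
    for result in run_results.get("results", []):
        unique_id = result.get("unique_id", "")
        if not unique_id.startswith("model."):
            continue  # ignore tests, seeds, and other non-model results
        summary[unique_id] = {
            "status": result.get("status"),
            "execution_time": result.get("execution_time"),
        }
    return summary


# Hypothetical content, shaped like a parsed run_results.json.
sample_run_results = {
    "results": [
        {"unique_id": "model.shop.stg_orders", "status": "success", "execution_time": 1.5},
        {"unique_id": "model.shop.orders", "status": "error", "execution_time": 0.25},
        {"unique_id": "test.shop.not_null_orders_id", "status": "pass", "execution_time": 0.1},
    ]
}

for model_id, run in summarize_runs(sample_run_results).items():
    print(model_id, run["status"], run["execution_time"])
```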

Model Column Catalog Page

The model column catalog page provides details on:

  • Extracted model column properties

  • The Source System Information field with the associated columns from the respective source tables.

  • Built-in and custom catalog fields associated with the catalog template of this object.

Limitations

dbt Core

The following limitations apply to dbt Core:

  • For otype catalog pages, the connector does not retrieve the createdDate, lastUpdatedDate, and comment fields from the manifest.json file.

  • For models, the connector does not retrieve the owner from the manifest.json file.

  • For model columns, the connector does not retrieve the Tag and Comment fields from the manifest.json and catalog.json files.

  • For job runs, the connector does not retrieve the NextRunAt field from the run_results.json file.