Data Dictionaries

Applies to: Alation Cloud Service and customer-managed instances of Alation

A data dictionary is a consolidated summary file of all curation information for a catalog page. Data dictionaries are available for:

  • RDBMS objects (data sources, schemas, tables)

  • Folders in a document hub (available from version 2024.3.5)

    Note

    Curation information for child columns in a table and child documents in a folder is included in the data dictionary of the parent object. Data dictionaries cannot be generated separately for individual columns or documents.

Data dictionaries are primarily used to export curation information and to edit it in bulk:

  • Download—You can export the existing curation information for catalog pages in CSV, XML, and JSON formats for analysis and external use.

  • Upload—You can upload a data dictionary to bulk-curate catalog pages of a data source and its child objects. Supported formats for upload include CSV and TSV.

  • Bulk curation best practice workflow—You can use the data dictionary download and upload as a workflow to curate multiple catalog fields. Editing a downloaded data dictionary file is much simpler than creating a source file from scratch:

    1. Download the data dictionary file.

    2. Modify the file by adding, removing, or updating field values.

    3. Upload the updated file to apply changes in bulk to the corresponding catalog pages.
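The modify step of this workflow can be scripted when the same change applies to many rows. The following is a minimal sketch in Python, not an Alation-provided tool: it assumes the downloaded file is a CSV with a header row, and the column names `key` and `description` are hypothetical placeholders. Check the header of your own downloaded file before adapting it. The sketch fills in blank cells of one column and leaves every other value untouched, so the result keeps the structure of the original file.

```python
import csv
import io


def fill_blank_values(csv_text, column, default):
    """Return CSV text in which blank cells of `column` are set to `default`.

    The header row and column order are preserved unchanged.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    if column not in fieldnames:
        raise KeyError(f"column {column!r} not found in header")

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Treat missing or whitespace-only cells as blank.
        if not (row[column] or "").strip():
            row[column] = default
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample standing in for a downloaded data dictionary.
sample = (
    "key,description\n"
    "sales.orders,Customer orders\n"
    "sales.orders.id,\n"
)
updated = fill_blank_values(sample, "description", "TODO: add description")
```

In practice you would read the downloaded file from disk, pass its text to `fill_blank_values`, and write the result to a new file for upload. Keeping the original download as a backup makes it easy to compare the two before uploading.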

Additionally, data dictionaries may be used to migrate curation information between data sources.

Data dictionaries have a fixed structure, and each value in a data dictionary file must conform to the format required for its field type.

Note

NoSQL object types, such as document store folders (docstore_folder), collections (docstore_collection), and schemas (doc_schema), are not currently supported by data dictionaries. When support for complex data types (struct, array, JSON) is enabled on the Alation instance, complex data type columns are stored as a NoSQL object type (doc_schema). Such columns are not supported by data dictionaries and cannot be curated via the data dictionary upload.