Lexicon and Auto-Titling

Applies to: Alation Cloud Service and customer-managed instances of Alation

Important

You are viewing documentation for Classic Alation.

Lexicon is a system dictionary used by Alation’s auto-titling algorithm, ALLIE, to suggest meaningful titles for cataloged data objects. It maps the abbreviations found in object names within your catalog metadata to their corresponding expansions.

Abbreviations and Expansions

Abbreviations are short strings derived from the names of schemas, tables, or columns in the metadata that Alation ingests from data sources.

Expansions are full words or phrases that correspond to these abbreviations and are either algorithmically inferred or added manually.

For example, if a column is named rgnl_sls, Lexicon treats the underscore _ as a word separator and detects two abbreviations: rgnl and sls. These can then be expanded to regional and sales, respectively.
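The detection step in this example can be sketched in Python. This is only an illustration of splitting on underscores, not Alation's actual implementation. The function name detect_abbreviations and the lowercasing of names are assumptions made for the sketch.

```python
import re

def detect_abbreviations(name: str) -> list[str]:
    """Split an object name on underscores and drop empty parts.

    Lowercasing is an assumption of this sketch, not documented behavior.
    """
    return [part for part in re.split(r"_+", name.lower()) if part]

print(detect_abbreviations("rgnl_sls"))  # ['rgnl', 'sls']
```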

How Lexicon Works

When Lexicon runs, it performs the following steps:

  1. Parses catalog metadata to identify abbreviations in schema, table, and column names.

    • Separators like underscores _ are used to break names into individual parts. For example, a name like lu_drg will be split into lu and drg.

    • In cases where names contain concatenated words, like shippingdata, Lexicon will also attempt to split them into word parts, like shipping and data.

  2. Builds a dictionary of abbreviations it detected.

  3. Computes expansions for the abbreviations algorithmically. Lexicon includes a built-in list of common words that it uses to calculate the initial expansions. As your catalog grows, Lexicon also scans text in the catalog to find more expansions and enrich this list, parsing text fields from:

    • Data object descriptions

    • Article content

  4. Matches abbreviations to expansions and updates the Lexicon dictionary. The full list of abbreviation-expansion matches is available on the Lexicon page in Admin Settings, which is accessible to users with the Server Admin or Catalog Admin role.

    Note

    Lexicon also computes suggested words for articles.

  5. Triggers the ALLIE algorithm to generate auto-suggested titles for schemas, tables, and columns using the abbreviation-expansion mappings.
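The parsing and dictionary-building steps above can be sketched in Python. This is a minimal illustration, not Alation's implementation. The word list COMMON_WORDS, the functions segment and build_abbreviation_counts, and the rule of preferring the split with the fewest pieces are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical stand-in for the built-in list of common words.
COMMON_WORDS = {"shipping", "data", "ship", "order"}

def segment(token: str, words: set[str]) -> list[str]:
    """Split a token into known words; return it whole if no full split exists."""
    n = len(token)
    # best[i] holds the shortest known-word split of token[:i], or None.
    best = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            piece = token[start:end]
            if best[start] is not None and piece in words:
                candidate = best[start] + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[n] if best[n] else [token]

def build_abbreviation_counts(names: list[str]) -> Counter:
    """Count the name parts found across schema, table, and column names."""
    counts = Counter()
    for name in names:
        for part in name.lower().split("_"):
            if part:
                counts.update(segment(part, COMMON_WORDS))
    return counts

print(segment("shippingdata", COMMON_WORDS))  # ['shipping', 'data']
counts = build_abbreviation_counts(["lu_drg", "shippingdata", "drg_code"])
print(counts["drg"])  # 2
```

Parts that cannot be fully split into known words, such as lu and drg, are kept whole. In this sketch they are the candidates that would need an expansion.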

Title Generation

Titles generated by Lexicon appear in the Title field on catalog pages. For example, if Lexicon contains the mappings below and a column is named lu_drg, Lexicon suggests Lookup Diagnosis Related Group as its title.

lu = lookup
drg = diagnosis related group
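As a rough illustration of how such mappings could produce a title, the following Python sketch looks up each part of the name and title-cases the result. The function suggest_title is hypothetical, and leaving unknown parts unchanged is an assumption. The actual ALLIE algorithm also weighs confidence, which is not modeled here.

```python
# Mappings taken from the example above.
LEXICON = {"lu": "lookup", "drg": "diagnosis related group"}

def suggest_title(name: str, lexicon: dict[str, str]) -> str:
    """Expand each underscore-separated part and join the result in title case."""
    parts = [part for part in name.lower().split("_") if part]
    # Parts without a mapping are left unchanged (an assumption of this sketch).
    words = [lexicon.get(part, part) for part in parts]
    return " ".join(words).title()

print(suggest_title("lu_drg", LEXICON))  # Lookup Diagnosis Related Group
```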

A robot head icon next to a title indicates that the title was auto-generated by Alation. The color of the icon reflects the confidence of the suggestion: red for a low-confidence guess and yellow for a high-confidence guess. Users can manually confirm or reject auto-suggested titles.

[Figure: Lexicon auto-titles example (Classic Alation)]

Teach ALLIE to Improve Suggestions

Catalog and Server Admins can confirm or reject auto-titles directly on the Lexicon page. These actions help ALLIE learn and make better title suggestions over time. See Confirm or Reject Auto-Titles on Catalog Pages for more details.

Lexicon Schedule

The Lexicon job runs automatically every Sunday at 8:00 AM and performs the following:

  • Re-parses metadata

  • Recalculates abbreviation-expansion mappings

  • Applies all new confirmed or rejected title feedback

  • Recomputes auto-titles

  • Computes suggested words for glossaries

Catalog and Server Admins can also run the Lexicon job on demand to apply changes immediately. See Manage Lexicon.
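To reason about when feedback will take effect without an on-demand run, the weekly schedule can be expressed as a small Python helper. This is a sketch only. The function next_lexicon_run is hypothetical, and it assumes the 8:00 AM time is in the same local time as the datetime passed in, since the time zone is not stated here.

```python
from datetime import datetime, timedelta

def next_lexicon_run(now: datetime) -> datetime:
    """Return the first Sunday 08:00 strictly after `now` (naive local time assumed)."""
    days_ahead = (6 - now.weekday()) % 7  # Monday is 0, Sunday is 6
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=8, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # It is already Sunday at or after 08:00, so move to the following week.
        candidate += timedelta(days=7)
    return candidate

# Wednesday, 3 January 2024, 12:00
print(next_lexicon_run(datetime(2024, 1, 3, 12, 0)))  # 2024-01-07 08:00:00
```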