Bot Configuration¶
Alation Cloud Service Applies to Alation Cloud Service instances of Alation
Customer Managed Applies to customer-managed instances of Alation
Sample Bot Configuration¶
We will start with a sample Bot config. Below is a “Data Source Ownership” Bot Configuration which:
Reads all data sources and associated attributes from the rdbms_datasources Alation Analytics table
Checks if the Steward OR the Data Owner field for a data source is empty
If empty, then start a conversation to notify Alation user #2 with subject Datasource without Steward(s) or Data Owner(s).
{ "bot_name": "Bot - Data Source Ownership", "description": "Bot to check for without Steward(s) or Data Owner(s)", "conn_type": {"sf": {"otype": "DATA", "aa_query": "SELECT * FROM public.rdbms_datasources WHERE deleted = False", "last_etl": "SELECT last_load_time from information_schema.load_history where table_name = 'STG_RDBMS_DATASOURCES' ORDER BY last_load_time DESC LIMIT 1"} }, "create_policy": True, "rules": [ { "fields_to_check": [ {"Steward":{"condition":"IS NULL", "field_type": "OBJECT_SET"}}, {"Data Owner":{"condition":"IS NULL", "field_type": "OBJECT_SET"}} ], "fields_to_check_operator":"or", "notification": {"type": "conversation", "subject": "Datasource without Steward(s) or Data Owner(s)", "frequency": "always", "to": [{"user": 2}]} } ] }
The notified user can be any valid Alation user ID, and is typically an administrator.
Let’s examine each of the parameters used in a Bot configuration.
Bot Parameters¶
Here is a complete list of the Bot configuration parameters:
bot_name |
The name of the Bot in the Bot Manager The name of the Bot Violation Policy within Alation |
description |
This description will be added to the description in the created Bot Violation Policy within Alation |
conn_type |
How your Bot will interact with Alation Analytics. There are two possible settings here. sf represents Snowflake, and this value tells your bot to use the Snowflake connector to communicate with Alation Analytics pg represents Postgres, and this value tells your bot to use the Postgres connector to communicate with Alation Analytics Example:
|
otype |
Alation otype that the Bot will be working with. See supported otypes below |
custom_template_name |
This is required when running a Bot against a BUSINESS_POLICY. The entree needs to be the exact name of the policy template entered into Alation. |
aa_query |
Data that the Bot will be analyzing. See Alation Analytics example queries below |
last_etl |
When a Bot runs, it stores the date_time of that run. Before it runs again, it compares the stored date_time to the last run of Alation ETL to see if there is any new data. If there is no new data, the Bot will not run. See last_etl examples below. |
create_policy |
Bot Violation policies are not always required. Pass true or false to instruct the Bot to create a policy. There will be times when you just want your Bot to run and perform an action if a condition is met. For example, attach a specific policy to an object if a condition is met. This example does not need a Bot Violation Policy; just have the Bot perform the action. |
rules |
A collection of rules your bot will run. Rules are waterfall-based, and the first rule condition satisfied wins. Any following rules will not be examined. Bots have a max limit of 4 rules. |
fields_to_check |
These are the fields that the condition in the rule should check. They are Alation fields and must be typed exactly as they are labeled within Alation. Spaces are acceptable, and the names have to match. Multiple fields_to_check entrees are acceptable. If more than one entry, then the fields_to_check_operator is required. Example: "fields_to_check":[
{"Certified":{"condition":"IS NULL", "field_type": "PICKER"}},
{"Phase One":{"condition":"IS NULL", "field_type": "PICKER"}},
{"Phase Two":{"condition":"IS NULL", "field_type": "PICKER"}},
{"Phase Three":{"condition":"IS NULL", "field_type": "PICKER"}}
],
"fields_to_check_operator": "or",
|
fields_to_check_operator |
If more than one fields_to_check entree is added to a rule, a fields_to_check_operator is required. This can be and or or. |
properties_to_check |
You can create properties to check from your aa_query. These properties are analyzed just like fields_to_check. Multiple properties_to_check are permitted, but if more than one, the properties_to_check_operator is required. Example: "aa_query": "SELECT \*, DATEDIFF(day, verified_date::DATE,
CURRENT_DATE) as date_span FROM public.business_policy
WHERE policy_type_name = 'Compliance' AND deleted = False",
{"date_span":{"condition":"GREATER THAN", "value":"120", "field_type": "TEXT"}}
|
properties_to_check_operator |
If more than one properties_to_check entree is added to a rule, a fields_to_check_operator is required. This can be and or or. |
condition |
Supported conditions are IS_NULL IS NOT NULL CONTAINS DOES NOT CONTAIN GREATER THAN LESS THAN |
value |
Required if the following condition is used CONTAINS DOES NOT CONTAIN GREATER THAN LESS THAN |
field_type |
Support field types are TEXT DATE MULTI_PICKER OBJECT_SET PICKER RICH_TEXT |
actions_to_perform |
If a condition is satisfied, one or more actions can be performed. An action is altering the data within a field. Example: {"Business Policy":{"action":"ADD","field_type": "OBJECT_SET", "values":["business_policy_24"]}}
|
action |
Supported actions are ADD REMOVE |
field_type |
Supported Alation fields are TEXT DATE MULTI_PICKER OBJECT_SET PICKER RICH_TEXT |
value |
The value that will be passed to the field. Notice that this is a list since a MULTI_PICKER field is supported. |
Notifications¶
These are the parameters used when a notification is required:
type |
conversation slack |
subject |
{string} |
frequency |
updates - Any time there is a change in the violation count + or - always - Every time the Bot runs, a notification is sent regardless of the violation count |
Conversation Example "notification": {"type": "conversation",
"subject": "Attach PII Policy to required tables",
"frequency": "always",
"to": [{"user": 2}]}
A Conversation is then created with the specified Alation user ID - user 2 in this example. |
|
Email Example "notification": {
"type": "email",
"subject": "Table Completeness",
"frequency": "always",
"to": [
"[email protected]",
"[email protected]"
]
}
|
|
Slack Example "notification": {
"type": "slack",
"subject": "Table Completeness",
"frequency": "always",
"to": [
"[email protected]",
"[email protected]"
],
"channel": "valid_slack_channel"
}
|
Supported otypes¶
SCHEMA |
rdbms_schemas |
TABLE |
rdbms_tables |
ATTRIBUTE |
rdbms_columns |
DATA |
rdbms_datasources |
BUSINESS_POLICY |
business_policy |
aa_query examples¶
Data Source |
SELECT * FROM public.rdbms_datasources WHERE deleted = False AND EXCLUDED = False |
Schema |
SELECT * FROM public.rdbms_schemas WHERE deleted = False AND EXCLUDED = False |
Table |
SELECT * FROM public.rdbms_tables WHERE deleted = False AND EXCLUDED = False |
Column (Attribute) |
SELECT * FROM public.rdbms_columns WHERE deleted = False AND EXCLUDED = False |
Policy |
SELECT * FROM public.business_policy WHERE deleted = False AND EXCLUDED = False |
last_etl examples¶
You will have noticed a second sql query labelled “last_etl”. Each time the Alation Analytics ETL job runs within your Alation Instance it leaves a “last updated” timestamp in the target database for each object it updates. If you make a catalog change to table metadata for example, the last updated stamp for the rdbms_table table will be updated to that time. When a Bot runs it will check to see if it has a stored timestamp from the last time the Bot ran. If it does, it will compare that timestamp to the one in the data object that the Bot is running against. If the time stamp for the data object is newer then the one stored, it will run the bot because there must be some new data. Otherwise the Bot will not run. If there is no timestamp stored because this is the first time the Bot has been run, it will ignore timestamps and just run. This efficiency feature eliminates unnecessary Bot executions on data that has not changed.
Data Source |
SELECT last_load_time from information_schema.load_history where table_name = ‘STG_RDBMS_DATASOURCES’ ORDER BY last_load_time DESC LIMIT 1 |
Schema |
SELECT last_load_time from information_schema.load_history where table_name = ‘STG_RDBMS_SCHEMAS’ ORDER BY last_load_time DESC LIMIT 1 |
Table |
SELECT last_load_time from information_schema.load_history where table_name = ‘STG_RDBMS_TABLES’ ORDER BY last_load_time DESC LIMIT 1 |
Column (Attribute) |
SELECT last_load_time from information_schema.load_history where table_name = ‘STG_RDBMS_COLUMNS’ ORDER BY last_load_time DESC LIMIT 1 |
Policy |
SELECT last_load_time from information_schema.load_history where table_name = ‘STG_BUSINESS_POLICY’ ORDER BY last_load_time DESC LIMIT 1 |
Further Bot Examples¶
Elapsed Time¶
It is possible to create Bots with rules based on elapsed time. This can be useful for managing content that should be checked on a regular basis, or finding content that may have expired for example.
This is achieved by creating a date diff property in the aa_query we run. The date diff will return the number of days between two dates. So in the query we take the current date and the date stored in a custom field and return the difference in days. Then in the Bot rule we test that number and choose to do something if the number exceeds some threshold.
The following Bot checks if a policy has been verified within a specified (120 days) time range.
{
"bot_name": "Bot Sample - Verified Policy",
"description": "Bot to check if policy has been verified within a time range",
"conn_type": {
"sf":{
"otype": "BUSINESS_POLICY",
"custom_template_name": "Compliance",
"aa_query": "SELECT *, DATEDIFF(day, verified_date::DATE, CURRENT_DATE) as date_span FROM public.business_policy WHERE policy_type_name = 'Compliance' AND deleted = False",
"last_etl": "SELECT last_load_time from information_schema.load_history WHERE table_name = 'STG_BUSINESS_POLICY' ORDER BY last_load_time DESC LIMIT 1"
}
},
"create_policy": true,
"rules": [
{
"properties_to_check": [
{"date_span":{"condition":"GREATER THAN", "value":"120", "field_type": "TEXT"}}
],
"notification": {"type": "conversation",
"subject": "Policy must be verified!",
"frequency": "always",
"to": [{"user": 2}]}
}
]
}