Configure Connection to Data Source¶

Applies to: Alation Cloud Service and customer-managed instances of Alation

After you install the Elasticsearch OCF connector, you must configure the connection to the Elasticsearch data source.

Configuring the Elasticsearch data source connection involves these steps:

Provide Access

Connect to Data Source

Provide Access¶

On the Access tab of the Settings page of your Elasticsearch data source, set the data source visibility using one of these options:

Public Data Source: The data source is visible to all users of the catalog.

Private Data Source: The data source is visible only to users whom Data Source Admins have granted access.

You can add new Data Source Admin users in the Data Source Admins section.

Connect to Data Source¶

To connect to the data source, you must perform these steps:

Application Settings¶

Specify Application Settings if applicable. Click Save to save the changes after providing the information.

Parameter	Description
BI Connection Info	This parameter is used to generate lineage between the current data source and another source in the catalog, for example, a BI source that retrieves data from the underlying database. The parameter accepts host and port information of the corresponding BI data source connection. Use the following format: host:port. You can provide multiple values as a comma-separated list: 10.13.71.216:1541,server.com:1542. Find more details in BI Connection Info.
Disable Automatic Lineage Generation	Select this checkbox to disable automatic lineage generation from QLI, MDE, and Compose queries. By default, automatic lineage generation is enabled.
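As an illustration of the host:port format that BI Connection Info accepts, the following Python sketch splits a comma-separated value into host and port pairs and rejects malformed entries. It is not part of Alation or the connector; the function name parse_bi_connection_info and the host bi.example.com are hypothetical, and the validation rules are an assumption.

```python
def parse_bi_connection_info(value):
    """Split a comma-separated host:port list into (host, port) pairs."""
    pairs = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray or trailing commas
        # Split on the last colon so the host part is kept whole.
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        pairs.append((host, int(port)))
    return pairs


# bi.example.com is a hypothetical host used only for illustration.
print(parse_bi_connection_info("10.13.71.216:1541,bi.example.com:1542"))
```

Checking the value this way before saving can catch a missing port or a stray separator early.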

Connector Settings¶

Datasource Connection¶

Not applicable.

Authentication¶

Specify Authentication Settings. Save the changes after providing the information by clicking Save.

../../../_images/ElasticsearchOCF_01.png

Parameter	Description
Auth Scheme	Select the scheme used to authenticate with Elasticsearch from the drop-down. None: No authentication is performed unless the User and Password properties are set, in which case Basic authentication is performed. Basic: Basic authentication is performed. Negotiate: The provider negotiates an authentication mechanism with the server; select this scheme to use Kerberos authentication. AwsRootKeys: Uses the root user access key and secret. AwsIAMRoles: Uses IAM roles for the connection. APIKey: Uses the API Key and API Key Id for the connection. See Appendix - Authentication Schemes for the configuration fields required by each authentication scheme.
User	Specify the username used to authenticate to Elasticsearch.
Password	Specify the password used to authenticate to Elasticsearch.
Use SSL	Select this checkbox to use SSL/TLS authentication.
Server	The host name or IP address of the Elasticsearch REST server. Alternatively, multiple nodes in a single cluster can be specified, though all such nodes must be able to support REST API calls.
Port	The port for the Elasticsearch REST server.
API Key	Specify the API Key used to authenticate to Elasticsearch.
API Key Id	Specify the API Key ID to authenticate to Elasticsearch.
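To make the relationship between these fields concrete, here is a minimal Python sketch of how Server, Port, Use SSL, and the credentials for the None, Basic, and APIKey schemes could map to an Elasticsearch REST URL and an HTTP Authorization header. It follows the Elasticsearch REST API conventions (Basic credentials as base64 of user:password, API keys as base64 of id:api_key) and is not the connector's own code. The function build_connection and the host es.example.com are hypothetical, and the AWS and Negotiate schemes are deliberately left out.

```python
import base64


def _b64(text):
    """Base64-encode a string the way HTTP auth headers expect."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def build_connection(server, port, use_ssl, auth_scheme,
                     user=None, password=None, api_key_id=None, api_key=None):
    """Return (base_url, headers) for a subset of the auth schemes."""
    scheme = "https" if use_ssl else "http"
    url = f"{scheme}://{server}:{port}/"
    if auth_scheme == "APIKey":
        headers = {"Authorization": "ApiKey " + _b64(f"{api_key_id}:{api_key}")}
    elif auth_scheme == "Basic" or (auth_scheme == "None" and user and password):
        # "None" falls back to Basic when User and Password are both set.
        headers = {"Authorization": "Basic " + _b64(f"{user}:{password}")}
    elif auth_scheme == "None":
        headers = {}
    else:
        raise ValueError(f"scheme not covered by this sketch: {auth_scheme}")
    return url, headers


# Hypothetical values for illustration only.
url, headers = build_connection("es.example.com", 9200, True, "Basic",
                                user="elastic", password="secret")
print(url)
```

The returned URL and headers could be passed to any HTTP client to confirm that the same credentials work outside Alation before entering them in the settings.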

Connection¶

Specify Connection properties. Save the changes after providing the information by clicking Save.

../../../_images/ElasticsearchOCF_02.png

Parameter	Description
Data Model	Specifies the data model to use when parsing Elasticsearch documents and generating the database metadata.
Expose Dot Indices	Select this checkbox to expose the dot indices as tables or views.
Use Lake Formation	Select this checkbox to use the AWS Lake Formation service to retrieve temporary credentials, which enforce access policies against the user based on the configured IAM role. The service can be used when authenticating through OKTA, ADFS, AzureAD, and PingFederate while providing a SAML assertion.

AWS Authentication¶

Specify AWS Authentication properties. Save the changes after providing the information by clicking Save.

../../../_images/ElasticsearchOCF_07.png

Parameter	Description
AWS Access Key	Specify the AWS account access key. This value is accessible from your AWS security credentials page.
AWS Secret Key	Specify the AWS account secret key. This value is accessible from your AWS security credentials page.
AWS Role ARN	Specify the Amazon Resource Name of the role to use when authenticating.
AWS Region	Specify the hosting region for your Amazon Web Services.
AWS Session Token	Specify the AWS session token.
Temporary Token Duration	Specify the amount of time (in seconds) an AWS temporary token will last.
AWS External Id	Specify the unique identifier required when you assume a role in another account.

Kerberos¶

Not supported.

SSL¶

Specify SSL properties. Save the changes after providing the information by clicking Save.

../../../_images/ElasticsearchOCF_03.png

Parameter	Description
SSL Client Cert	Specify the TLS/SSL client certificate store for SSL Client Authentication (2-way SSL).
SSL Client Cert Type	Select the type of key store containing the TLS or SSL client certificate from the drop-down.
SSL Client Cert Password	Specify the password for the TLS or SSL client certificate.
SSL Client Cert Subject	Specify the subject of the TLS or SSL client certificate.
SSL Server Cert	Specify the certificate to be accepted from the server when connecting using TLS or SSL.

Firewall¶

Specify Firewall properties. Save the changes after providing the information by clicking Save.

../../../_images/ElasticsearchOCF_04.png

Parameter	Description
Firewall Type	Specify the protocol used by a proxy-based firewall. Options are None, TUNNEL, SOCKS4, and SOCKS5. TUNNEL: Opens a connection to Elasticsearch, and traffic flows back and forth through the proxy. SOCKS4: Sends data through the SOCKSv4 proxy specified in Firewall Server and Firewall Port. SOCKS5: Sends data through the SOCKSv5 proxy specified in Firewall Server and Firewall Port.
Firewall Server	Specify the hostname, DNS name, or IP address of the proxy-based firewall.
Firewall Port	Specify the TCP port of the proxy-based firewall.
Firewall User	Specify the user name to authenticate with the proxy-based firewall.
Firewall Password	Specify the password to authenticate with the proxy-based firewall.

Proxy¶

Specify Proxy properties. Save the changes after providing the information by clicking Save.

../../../_images/ElasticsearchOCF_05.png

Parameter	Description
Proxy Auto Detect	Select the checkbox to use the system proxy settings. This takes precedence over other proxy settings, so do not select this checkbox to use custom proxy settings.
Proxy Server	Specify the hostname or IP address of a proxy to route HTTP traffic through.
Proxy Port	Specify the TCP port that the proxy specified in Proxy Server is running on.
Proxy Auth Scheme	Select the authentication type used to authenticate to the Proxy Server from the drop-down: BASIC, DIGEST, NONE, NEGOTIATE, NTLM, or PROPRIETARY.
Proxy User	Specify the user name used to authenticate to the Proxy Server.
Proxy Password	Specify the password of the Proxy User.
Proxy SSL Type	Select the SSL type to use when connecting to the Proxy Server from the drop-down: AUTO, ALWAYS, NEVER, or TUNNEL.
Proxy Exceptions	Specify a semicolon-separated list of destination hostnames or IP addresses that are exempt from connecting through the Proxy Server.
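The Proxy Exceptions value can be pictured with a short Python sketch that checks whether a destination host appears in the semicolon-separated list. This is only an illustration: the function name bypasses_proxy and the host names are hypothetical, and the sketch assumes exact, case-insensitive matching, which the connector documentation does not confirm.

```python
def bypasses_proxy(host, proxy_exceptions):
    """Return True if host is listed in a semicolon-separated exception list."""
    # Assumption: entries match exactly, ignoring case and surrounding spaces.
    exempt = {h.strip().lower() for h in proxy_exceptions.split(";") if h.strip()}
    return host.strip().lower() in exempt


# Hypothetical hosts for illustration only.
exceptions = "es-internal.example.com; 10.0.0.15"
print(bypasses_proxy("ES-internal.example.com", exceptions))
print(bypasses_proxy("es-public.example.com", exceptions))
```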

Logging¶

Specify Logging properties. Save the changes after providing the information by clicking Save.

Parameter	Description
Verbosity	Specify a verbosity level from 1 to 5 to control the amount of detail included in the log file.
Log Modules	Specify the core modules to include in the log files as a semicolon-separated list of module names. By default, all modules are included.
Max Log File Count	Specify the maximum number of log files. When the limit is reached, the log file is rolled over with the time appended at the end of the file name, and the oldest log file is deleted. Maximum Value: 2. Default: -1. A negative or zero value indicates unlimited files.

View the connector logs in Admin Settings > Server Admin > Manage Connectors > Elasticsearch OCF connector.

Schema¶

Specify Schema properties. Save the changes after providing the information by clicking Save.

../../../_images/ElasticsearchOCF_06.png

Parameter	Description
Browsable Schemas	Specify the schemas as a subset of the available schemas in a comma-separated list. For example, BrowsableSchemas=SchemaA,SchemaB,SchemaC.
Tables	Specify the fully qualified name of the table as a subset of the available tables in a comma-separated list. Each table must be a valid SQL identifier that might contain special characters escaped using square brackets, double quotes, or backticks. For example, Tables=TableA,[TableB/WithSlash],WithCatalog.WithSchema.`TableC With Space`.
Views	Specify the fully qualified names of the views as a subset of the available views in a comma-separated list. Each view must be a valid SQL identifier that might contain special characters escaped using square brackets, double quotes, or backticks. For example, Views=ViewA,[ViewB/WithSlash],WithCatalog.WithSchema.`ViewC With Space`.
Flatten Objects	Select the Flatten Objects checkbox to flatten object properties into columns of their own. Otherwise, objects nested in arrays are returned as strings of JSON.
Flatten Arrays	Set Flatten Arrays to the number of nested array elements you want to return as table columns. By default, nested arrays are returned as strings of JSON.
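The effect of Flatten Objects and Flatten Arrays can be sketched in Python as follows. This is a conceptual illustration rather than the connector's implementation: the function flatten_document, the dotted column names (address.city, tags.0), and the sample document are all assumptions made for the example.

```python
import json


def flatten_document(doc, flatten_objects=True, flatten_arrays=0):
    """Turn one nested document into a flat {column_name: value} row."""
    row = {}

    def visit(name, value):
        if isinstance(value, dict) and flatten_objects:
            # Object properties become columns of their own.
            for key, child in value.items():
                visit(f"{name}.{key}", child)
        elif isinstance(value, list) and flatten_arrays > 0:
            # Only the first `flatten_arrays` elements become columns.
            for index, item in enumerate(value[:flatten_arrays]):
                visit(f"{name}.{index}", item)
        elif isinstance(value, (dict, list)):
            # Anything left unflattened is kept as a string of JSON.
            row[name] = json.dumps(value)
        else:
            row[name] = value

    for key, value in doc.items():
        visit(key, value)
    return row


# Hypothetical document for illustration only.
doc = {"name": "a", "address": {"city": "Oslo"}, "tags": ["x", "y", "z"]}
print(flatten_document(doc))                    # arrays stay as JSON strings
print(flatten_document(doc, flatten_arrays=2))  # first two array elements become columns
```

In this sketch, a larger Flatten Arrays value produces more columns per array, which is why the setting is a number rather than a checkbox.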

Miscellaneous¶

Specify Miscellaneous properties. Save the changes after providing the information by clicking Save.

Parameter	Description
Batch Size	Specify the maximum size of each batch operation to submit. When BatchSize is set to a value greater than 0, the batch operation will split the entire batch into separate batches of size BatchSize. The split batches will then be submitted to the server individually. This is useful when the server has limitations on the request size that can be submitted. Setting BatchSize to 0 will submit the entire batch as specified.
Client Side Evaluation	Select the Client Side Evaluation checkbox to perform the evaluation client side on nested objects.
Connection LifeTime	The maximum lifetime of a connection in seconds. Once the time has elapsed, the connection object is disposed. The default is 0, indicating no limit to the connection lifetime.
Generate Schema Files	Select the user preference for when schemas should be generated and saved. Never - A schema file will never be generated. OnUse - A schema file will be generated the first time a table is referenced, provided the schema file for the table does not already exist. OnStart - A schema file will be generated at connection time for tables that do not currently have a schema file. OnCreate - A schema file will be generated when running a CREATE TABLE SQL query.
Maximum Results	The maximum number of total results to return from Elasticsearch when using the default Search API.
Max Rows	Limits the number of rows returned when no aggregation or GROUP BY is used in the query. This takes precedence over LIMIT clauses.
Other	This field is for properties that are used only in specific use cases.
Page Size	Specify the maximum number of results to fetch from Elasticsearch per request for a given query.
Pagination Mode	Specifies whether to use PIT with search_after or scroll to page through query results.
PIT Duration	Specifies the keep-alive duration to use when retrieving results via the PIT API.
Pool Idle Timeout	The allowed idle time for a connection before it is closed.
Pool Max Size	The maximum connections in the pool. The default is 100.
Pool Min Size	The minimum number of connections in the pool. The default is 1.
Pool Wait Time	The max seconds to wait for an available connection. The default value is 60 seconds.
Pseudo Columns	Specify whether to include pseudo columns as columns in the table.
Query Passthrough	Select this checkbox to pass exact queries to Elasticsearch.
Read only	Select this checkbox to enforce read-only access to Elasticsearch from the provider.
Row Scan Depth	Specify the maximum number of rows to scan when determining the columns of a table. Setting a high value may decrease performance. Setting a low value may prevent the data type from being determined properly.
Scroll Duration	Specifies the keep-alive duration to use when retrieving results via the Scroll API.
Timeout	Specify the value in seconds until the timeout error is thrown, canceling the operation.
Use Connection Pooling	Select this checkbox to enable connection pooling.
Use Fully Qualified Nested Table Name	Select this checkbox to set the generated table name as the complete source path when flattening nested documents using Relational DataModel.
User Defined Views	Specify the file path pointing to the JSON configuration file containing your custom views.
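The Batch Size behavior described in this table can be sketched in Python: a positive value splits a batch into chunks of at most that size, and 0 submits the whole batch as one request. The function split_batches is hypothetical and only mirrors the description; it does not reflect how the connector submits data.

```python
def split_batches(rows, batch_size):
    """Split rows into batches of at most batch_size; 0 keeps one batch."""
    rows = list(rows)
    if batch_size <= 0:
        # A Batch Size of 0 submits the entire batch as specified.
        return [rows]
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]


print(split_batches(range(5), 2))  # three batches, the last one shorter
print(split_batches(range(5), 0))  # a single batch with all rows
```

A smaller Batch Size means more requests, each of which is more likely to fit within the request size limits of the server.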

Obfuscate Literals¶

Obfuscate Literals: Enable this toggle to hide, on catalog pages, the details of queries that are ingested via QLI or executed in Compose. Disabled by default.

Test Connection¶

Under Test Connection, click Test to validate network connectivity.

Note

You can only test connectivity after providing the authentication information.

Delete the Data Source¶

To delete the data source, see Delete a Data Source.

Metadata Extraction¶

This connector supports metadata extraction (MDE) based on default queries built in the connector code but does not support custom query-based MDE. You can configure metadata extraction on the Metadata Extraction tab of the settings page.

For more information about the available configuration options, see Configure Metadata Extraction for OCF Data Sources.

Compose¶

Not supported.

Sampling and Profiling¶

Sampling and profiling are supported. For details, see Configure Sampling and Profiling for OCF Data Sources.

Query Log Ingestion¶

Not supported.

Troubleshooting¶

Refer to Troubleshooting for information about logs.
