Update HA Pair Configuration (4.7 and above)¶
Customer Managed Applies to customer-managed instances of Alation
Warning
These upgrade steps have been deprecated and no longer work. To upgrade an HA pair, use the information in Update HA Pair with Cluster Splitting.
This article provides the steps for updating Alation releases 4.7 and above when it is installed as an HA Pair configuration.
These instructions describe how to update the HA Pair while preserving replication between the Primary and Secondary nodes.
1. Prerequisites¶
Important
Perform the update only after you have familiarized yourself with and completed the required pre-upgrade steps for your release.
Before updating:
Check system upgradeability¶
On both the Primary and Secondary nodes, confirm system upgradeability by validating that at least 15 GB of space is free at /opt/alation/. On the host, outside the Alation shell, run:
df -h
The output shows the used and available space for each disk. Find the main data disk, which is usually mounted at /data. The value in the Avail column should be 15 GB or more.
[root@C74X ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   44G   17G   27G  38% /
/dev/sdc1                 69G  2.2G   67G   4% /BACKUP
/dev/sdb1                 44G   11G   34G  24% /DATA
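The free-space check above can also be scripted. The following is a minimal sketch, not an Alation-provided tool: the function name has_min_free_gb is hypothetical, and it assumes that df accepts the POSIX -P and -k options.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the filesystem that holds PATH has at
# least MIN_GB gigabytes available (the df "Avail" column).
has_min_free_gb() {
    path=$1
    min_gb=$2
    # -P forces one output line per filesystem; -k reports 1 KiB blocks.
    avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}
```

For example, has_min_free_gb /opt/alation 15 && echo "enough space" would apply the 15 GB minimum from this article. Run it on the host, outside the Alation shell, on both Primary and Secondary.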
Check for backup in progress¶
On Primary, check that there are currently no backup processes in progress. A backup in progress is not compatible with the update. To check, you can:
Version V R3 (5.6.x) and above: In the Alation Catalog UI, navigate to Admin Settings > Monitor > Active Tasks (requires the Server Admin role) and check if there are any active backup tasks in the current queue.
Using Linux commands, check for active processes that have backup in the process name.
Continue with the update only after the backup process has finished.
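One way to do the Linux process check is to filter a process listing for the word backup. This is only a sketch: the helper name filter_backup_procs is hypothetical, and the exact names of Alation backup processes are not stated in this article, so the match is on the word alone.

```shell
#!/bin/sh
# Hypothetical helper: read a "PID ARGS" process listing on stdin and
# print the lines whose command line mentions "backup" (case-insensitive).
# The [b] bracket keeps a live grep process from matching itself.
filter_backup_procs() {
    grep -i '[b]ackup' || true
}
```

A possible invocation is ps -eo pid,args | filter_backup_procs. Empty output suggests that no process with backup in its command line is running. Review any matches by hand, because unrelated processes can also contain the word.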
Check the replication status of your HA Pair¶
To check, open the following URL from the UI of the Primary: <your_alation_URL>/monitor/replication
It will return the byte lag with Postgres (all versions) and Mongo (releases before V R5). If replication is running, it will return some realistic byte lag values, and you can proceed with the update. If replication is not running, it will return “unknown”, and this may be indicative of replication failure. You may need to decide how to proceed: rebuild replication for the HA Pair or update the Primary and Secondary servers as standalone instances and rebuild replication after the update.
Check permissions on extra_config¶
On Primary, check permissions on the folder extra_config at /opt/alation/site/config/extra_config (inside the Alation shell).
The permissions should be set to 755.
ls -al /opt/alation/site/config/
Example output:
If you see other permissions on extra_config, change to 755:
sudo chmod 755 /opt/alation/site/config/extra_config
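The permission check can be expressed as a small test so that chmod runs only when it is needed. This is a sketch under assumptions: the helper name has_mode is hypothetical, and it relies on GNU stat, where -c '%a' prints the permission bits in octal.

```shell
#!/bin/sh
# Hypothetical helper: succeed if DIR has exactly the octal mode EXPECTED.
# Returns 2 if stat itself fails (for example, DIR does not exist).
has_mode() {
    dir=$1
    expected=$2
    actual=$(stat -c '%a' "$dir") || return 2
    [ "$actual" = "$expected" ]
}
```

Inside the Alation shell, has_mode /opt/alation/site/config/extra_config 755 || sudo chmod 755 /opt/alation/site/config/extra_config would change the mode only when it differs from 755.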
2. Configure Network for Reporting Usage Data¶
Recommended by Alation
If you have not done so yet, configure your network to allow communicating with the Alation Cloud. This is required for reporting Alation usage data automatically. For details on reporting usage data and why it is important to Alation, see Reporting Usage.
The period before you run the update is a good time to make this change because it enables usage data reporting in the updated Alation instance.
To configure the network for automatic reporting, make sure the following ports are open on both Primary and Secondary:
Function | Direction | Ports | Destination |
---|---|---|---|
Usage Stats | outbound | TCP 443 | Alation Cloud: 52.4.59.229 |
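To confirm that the outbound rule in the table works, you can try to open a TCP connection from each node. The sketch below is an illustration only: the helper name can_reach is hypothetical, and it assumes that bash (for its /dev/tcp redirection) and the coreutils timeout command are installed on the host.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a TCP connection to HOST:PORT can be
# opened within 5 seconds. bash receives HOST as $0 and PORT as $1.
can_reach() {
    timeout 5 bash -c 'exec 3<>/dev/tcp/$0/$1' "$1" "$2" 2>/dev/null
}
```

For the destination listed above, the call would be can_reach 52.4.59.229 443 && echo "port open". A failure only shows that the connection could not be opened from that node; it does not identify which firewall or proxy is blocking it.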
3. Update the HA Pair¶
Make sure you have a valid backup.
CONDITIONAL STEP
Warning
Perform this step if you are updating:
to 2021.2 from 2020.4.x or 2020.3.x
to 2021.1 from 2020.3.x or VR7 (5.12.x)
The update to 2021.1 or the update that skips 2021.1 requires key rotation to have been performed at least once on the instance. Key rotation must be performed on Primary. If you have not recently done so, rotate the keys. After key rotation is completed, allow some time for replication between Primary and Secondary to catch up.
CONDITIONAL STEP
Warning
Perform this step if you are updating Alation:
to 2021.2 from 2020.3.x
to 2021.1 from 2020.3 or VR7 (5.12.x)
to 2020.4.x from 2020.3 or VR7 (5.12.x)
If you are performing the update to a release older than 2020.4.x, skip this step.
If you are performing the update from 2020.4 to a newer patch release of 2020.4 or a newer version, skip this step.
If you have not done so yet, run the 2020.4 pre-upgrade reindexing script provided by Alation. It should be run on the Primary server. The instructions can be found in:
After running the 2020.4 pre-upgrade script, proceed to step 4 of this instruction.
On Primary: If you haven’t done so yet, check the byte lag and then stop UWSGI to let replication drain to Secondary.
To check the byte lag:
curl -L --insecure http://localhost/monitor/replication/
To stop UWSGI:
sudo /etc/init.d/alation shell
alation_action stop_uwsgi
On Primary, still in the Alation shell: Wait for replication to complete by checking for a byte lag of 0. Because UWSGI is stopped, use a different way to get the lag information:
alation_psql
SELECT client_hostname, client_addr, pg_wal_lsn_diff(pg_stat_replication.sent_lsn,pg_stat_replication.replay_lsn) AS byte_lag FROM pg_stat_replication;
Look for byte_lag = 0.
On Secondary, outside of the Alation shell: Stop Alation services.
sudo /etc/init.d/alation stop
CONDITIONAL STEP
Warning
Perform this step only if you are updating to:
2021.2 from version 2020.3.x
2021.1 from versions 2020.3.x or V R7 (5.12.x).
2020.4.x from versions 2020.3.x or V R7 (5.12.x)
If you are performing the update to a version older than 2020.4, skip this step and proceed to step 7.
If you already are on 2020.4 and are updating to a later patch version of 2020.4 or a newer release, skip this step and proceed to step 7.
Create the following file on the Secondary node from the Alation shell:
# To enter the Alation shell
sudo /etc/init.d/alation shell
# Create the success file
touch /opt/alation/site/site_data/reindex_rosemeta_success
On Primary, outside of the Alation shell: Install the package. The update package can be downloaded from the Alation Customer Portal. Place it on the host in a location outside of the Alation shell.
Important
The sudo /etc/init.d/alation update command can take a few hours. Alation therefore recommends that you run it with nohup or in a Screen session.
Installing the RPM package (RHEL or CentOS)
sudo rpm -Uvh <path_to_the_RPM_package>/<package_name>
sudo /etc/init.d/alation update
Note
If you receive the error headerRead failed: hdr data: BAD, no. of bytes(...) out of range at this step, troubleshoot using the recommendations in RPM Installation Error During Update.
Installing the DEB package (Ubuntu)
sudo dpkg -i <path_to_the_DEB_package>/<package_name>
sudo /etc/init.d/alation update
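The recommendation to run the update with nohup can be sketched as a small wrapper. The function name run_detached and the log file location are assumptions made for illustration; they are not part of the Alation tooling.

```shell
#!/bin/sh
# Hypothetical helper: start a long-running command under nohup so that it
# survives a dropped terminal session, sending all output to LOGFILE.
run_detached() {
    logfile=$1
    shift
    nohup "$@" > "$logfile" 2>&1 &
    echo "started PID $! (output in $logfile)"
}
```

After the package is installed, a possible call is run_detached "$HOME/alation_update.out" sudo /etc/init.d/alation update. Because the command runs in the background, this only works if sudo does not need to prompt for a password. A Screen session avoids that limitation.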
You can monitor the progress using /opt/alation/<alation-####>/var/log/installer.log (path outside of the Alation shell):
Note
#### represents the Alation version number in x.y.z.nnnn format (x = major, y = minor, z = patch, and nnnn = build).
tail -f /opt/alation/<alation-####>/var/log/installer.log
Example:
tail -f /opt/alation/alation-4.14.7.20232/var/log/installer.log
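If you do not want to type the version number by hand, the log path can be derived from the directory names. This sketch assumes that the release being installed has the highest version number among the alation-x.y.z.nnnn directories and that GNU sort (with the -V version sort option) is available. The helper name latest_installer_log is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the installer.log path under the
# highest-versioned alation-x.y.z.nnnn directory inside BASE.
# Returns 1 if no such directory exists.
latest_installer_log() {
    base=$1
    newest=$(ls -d "$base"/alation-[0-9]* 2>/dev/null | sort -V | tail -n 1)
    [ -n "$newest" ] || return 1
    echo "$newest/var/log/installer.log"
}
```

On the host, outside of the Alation shell, tail -f "$(latest_installer_log /opt/alation)" would then follow the log of the newest installed version.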
After the update is complete and the services are started on Primary, log in to Alation as a user and test the system.
Perform the update (Step 8) on Secondary after confirmation from users who tested the system on Primary.
CONDITIONAL STEP
Warning
Perform this step if you are updating to release 2021.3 from a previous release. If not, skip to the next step.
The PostgreSQL upgrade does not run on Secondary. Instead, run the replicate Postgres script to sync with the Primary. On Secondary, run:
alation_action cluster_replicate_postgres
To check that Postgres on Secondary is operational after the replication, on Secondary, run:
alation_psql
This command should return the updated Postgres version 13.1
If this command returns the following error:
“could not connect to server: No such file or directory. Is the server running locally and accepting connections on Unix domain socket “/tmp/.s.PGSQL.5432 Failed to execute command with exception details: Command failed: PGPASSWORD=(hidden) psql rosemeta -h /tmp -p 5432 -U alation”
troubleshoot by running the replicate files action to replicate the configuration, and then re-run the replicate Postgres action:
alation_action cluster_replicate_files
alation_action cluster_replicate_postgres
Check Postgres and Mongo* lags on Primary again to ensure the HA Pair is syncing.
curl -L --insecure http://localhost/monitor/replication/
* Mongo is only available in releases before V R5. In V R5+, it is Postgres only.
Because Secondary may have been stopped for a long time, depending on how long the upgrade takes, replication may not start. If this occurs, rebuild the database on Secondary.
CONDITIONAL STEP
Warning
Do this step if you are updating to release 2021.2 or 2021.3 from a previous release and your Primary server was on the Backup V1 tool before the update.
Update Backup Settings on Primary After Update¶
This section applies if your Primary server was on Backup V1 before the update to 2021.2 or 2021.3.
After the update, decide which backup tool you will use. Backup V2 is the default backup tool starting with release 2021.2. Backup V1 can also be used in versions 2021.2 and 2021.3.
If you decide to use the Backup V2 tool, you need to explicitly enable Backup V2 on Primary. This step must be performed manually because the HA Pair configuration does not allow Backup V2 to be fully enabled programmatically during the update to 2021.2.
If your decision is to continue using Backup V1, you need to manually disable Backup V2 on Primary.
To enable Backup V2, on Primary, run the alation_action given below from the Alation shell. Note that this action will restart the Postgres service on your Alation instance:
sudo /etc/init.d/alation shell
alation_action enable_backupv2
To disable Backup V2 and revert to Backup V1, on Primary, run the alation_action given below from the Alation shell. Note that this action will restart the Postgres service on your Alation instance:
sudo /etc/init.d/alation shell
alation_action disable_backupv2
Next, validate your backup configuration.
Validating Your Backup Configuration¶
To confirm that your Backup configuration is correct, check the set of alation_conf values given below.
Important
Do not attempt to modify the alation_conf values for Backup V2 manually. They are handled by the dedicated
alation_action
commands that enable or disable Backup V2.
Backup V2 fully enabled¶
If you find that the parameters listed below have these values, it means that Backup V2 has been successfully enabled and the instance will be automatically backed up on schedule using the Backup V2 tool:
Parameter name | Value should be: |
---|---|
alation.backup_v2.enabled | True |
pgsql.config.archive_mode | True |
pgsql.config.archive_command | |
Backup V2 disabled¶
If you find that the parameters listed below have these values, it means that Backup V2 has been disabled and the instance may be automatically backed up with the Backup V1 tool.
Parameter name | Value should be: |
---|---|
alation.backup_v2.enabled | False |
pgsql.config.archive_mode | False |
pgsql.config.archive_command | |
Backup V1 enabled¶
If the parameter below is set to True, Backup V1 is enabled. If Backup V2 is also enabled, the instance is backed up using Backup V2 and Backup V1 is ignored. If Backup V2 is disabled, the instance is backed up using the Backup V1 tool.
Parameter name | Value should be: |
---|---|
alation.backup.enabled | True |
Backup V1 disabled¶
If the parameter below is set to False, Backup V1 is disabled. If Backup V2 is enabled, the instance is backed up using Backup V2. If Backup V2 is also disabled, the backup process is completely disabled on your instance and the instance is not backed up automatically on schedule. If you perform a manual backup, it will run using the Backup V1 tool.
Parameter name | Value should be: |
---|---|
alation.backup.enabled | False |
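The four combinations described in this section reduce to a simple rule, which the sketch below makes explicit. The function name scheduled_backup_tool is hypothetical, and the two values are assumed to be passed in by hand after you look them up in alation_conf; the sketch does not read or change any configuration.

```shell
#!/bin/sh
# Hypothetical helper: given the values of alation.backup_v2.enabled ($1)
# and alation.backup.enabled ($2), print the tool used for scheduled backups.
scheduled_backup_tool() {
    v2_enabled=$1
    v1_enabled=$2
    if [ "$v2_enabled" = "True" ]; then
        echo "Backup V2"   # the Backup V1 setting is ignored in this case
    elif [ "$v1_enabled" = "True" ]; then
        echo "Backup V1"
    else
        echo "none"        # no scheduled backups; manual backups use V1
    fi
}
```

For example, scheduled_backup_tool False True prints Backup V1, which matches the case where Backup V2 is disabled and Backup V1 is enabled.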