Troubleshoot your distributed upgrade

Resolve automatic system prerequisites checks issues

Text in CLI

Description

How to resolve

(Log data about prerequisite checks is found in /var/log/algosec-software-upgrade.log.)

Machine [machine IP] does not meet the minimal hardware requirements.
							

Checks system machine appliance specs: cores, memory.

Make sure the machine meets the system requirements. See System requirements.

For details, see Checking cores and memory on [machine IP] in the log.
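To compare a node against the system requirements by hand, the core count and memory can be read directly from the OS. This is a minimal sketch, not the upgrade's own check: the function name `report_specs` is hypothetical, and it only prints what the node has without applying any threshold.

```shell
#!/bin/sh
# Sketch only: prints this node's specs; it does not apply the upgrade's thresholds.

# report_specs: print the CPU core count and total memory in MB.
report_specs() {
    cores=$(nproc)
    # MemTotal in /proc/meminfo is reported in kB; convert to whole MB.
    mem_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
    echo "cores=${cores} memory_mb=${mem_mb}"
}

report_specs
```

Compare the printed values with the figures in System requirements.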

There is less than xx MB free disk space in OS partition on node [machine IP].

 

Insufficient disk space. xxxMB found for installation (Less than the required 5000 MB in the OS partition on node [machine IP])

 

Partition (/data) on local node must have at least <required> MB free space. This includes the amount of space needed to sync the monitor data directory, plus an additional 5 GB. You currently only have <avail> MB free space. 

Checks disk space on system machine.
See Disk space requirements.

Run auto-remove to free up disk space, or delete old run files.

To run auto-remove, in AFA Administration, go to the Options tab, Storage sub-tab, and click Clean-up now.

If the issue persists after running Clean-up now, contact AlgoSec support.
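Before and after running the clean-up, free space can be checked from the shell. The sketch below is an illustration under stated assumptions: the helper names `free_mb` and `has_free_mb` are hypothetical, and treating `/` as the OS partition is an assumption. The 5000 MB figure is the one quoted in the CLI message above.

```shell
#!/bin/sh
# free_mb PATH: print the free space, in MB, of the filesystem holding PATH.
free_mb() {
    # -P keeps each filesystem on one line; column 4 is "Available".
    df -Pm "$1" | awk 'NR == 2 {print $4}'
}

# has_free_mb PATH REQUIRED_MB: succeed if PATH has at least REQUIRED_MB free.
has_free_mb() {
    [ "$(free_mb "$1")" -ge "$2" ]
}

# Assumption: "/" is the OS partition on this node.
if has_free_mb / 5000; then
    echo "OS partition: OK"
else
    echo "OS partition: only $(free_mb /) MB free"
fi
```

The same helper can be pointed at `/data` with the `<required>` value shown in the CLI message.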

Insufficient disk speed. 

Checks source node disk speed.

We recommend a disk write speed of at least 300 MB/s. The minimum allowed is 80 MB/s.

Contact your IT department to determine and adjust, if necessary, your node disk speed.

Tip:

Use the following command to check disk speed:

dd if=/dev/zero of=/data/test-big-file.bin bs=786432000 count=1 oflag=dsync 2>&1 ; rm -f /data/test-big-file.bin

An example of the output is:

786432000 bytes (786 MB) copied, 0.624098 s, 1.3 GB/s

Tip: If you are using an AlgoSec VM, make sure you are following VM best practices. See Best practices for your AlgoSec VMware Deployment. If you make changes, check your disk speed again to see if it has improved.

Tip: If your target machine is an AlgoSec AMI, make sure you are using the recommended deployment. See Deploy ASMS on AWS.
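To judge a `dd` result against the 80 MB/s and 300 MB/s figures above, the throughput at the end of the output line can be converted to MB/s and classified. This is a rough sketch: the helper names `to_mbs` and `classify_speed` are hypothetical, and it assumes `dd` reports decimal units (kB/s, MB/s, GB/s) as in the sample output.

```shell
#!/bin/sh
# to_mbs "VALUE UNIT": convert a dd throughput such as "1.3 GB/s" to whole MB/s.
to_mbs() {
    echo "$1" | awk '{
        v = $1
        if ($2 == "GB/s") v *= 1000
        else if ($2 == "kB/s") v /= 1000
        else if ($2 != "MB/s") exit 1   # unknown unit: give up
        printf "%d\n", v + 0.5          # round to the nearest MB/s
    }'
}

# classify_speed MBS: compare a speed in MB/s with the documented thresholds.
classify_speed() {
    if [ "$1" -lt 80 ]; then
        echo "below minimum"
    elif [ "$1" -lt 300 ]; then
        echo "allowed, below recommended"
    else
        echo "recommended"
    fi
}

# Sample line copied from the example output above.
line='786432000 bytes (786 MB) copied, 0.624098 s, 1.3 GB/s'
# Keep only the text after the last ", " (the throughput).
classify_speed "$(to_mbs "${line##*, }")"   # prints: recommended
```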

Distribution nodes machine time prerequisite check failed.

Compares the time between the system machine and the distribution nodes (Remote Agents and LDUs).

The machines can be in different time zones, but their clocks must match relative to UTC:

  1. Compare the time and date between the CM and the distribution node by running this command on every node mentioned in the message:

    date +%s

    The results should differ by no more than 180 seconds (3 minutes). If a machine exceeds this limit:

    1. Configure an NTP server. Use algosec_conf option 2 on the machine to be updated.

    2. Run this command as the root user to force a time sync:

      ntpdate -u $(awk '$1 == "server" {print $2}' /etc/ntp.conf)

    3. Reboot the machine.

    4. To verify, rerun on the updated node:

      date +%s
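
The comparison of two `date +%s` results can be sketched as a small helper. The function name `drift_ok` and the two epoch values are hypothetical placeholders; paste in the numbers printed on the CM and on the distribution node.

```shell
#!/bin/sh
# drift_ok EPOCH_A EPOCH_B: succeed if two `date +%s` values differ by
# no more than 180 seconds (3 minutes).
drift_ok() {
    diff=$(( $1 - $2 ))
    # Take the absolute value so the argument order does not matter.
    [ "$diff" -lt 0 ] && diff=$(( 0 - diff ))
    [ "$diff" -le 180 ]
}

cm_epoch=1700000000      # placeholder: value printed on the CM
node_epoch=1700000200    # placeholder: value printed on the distribution node

if drift_ok "$cm_epoch" "$node_epoch"; then
    echo "within 3 minutes"
else
    echo "exceeds 3 minutes: sync NTP on the node"
fi
```

Because `date +%s` counts seconds since the epoch in UTC, the result does not depend on each machine's time zone.
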
NAS is configured, but directories are not mounted.

 

NAS mount is disabled due to fault detected.

Checks NAS status on Central Manager and LDUs.

Open the algosec_conf menu on the node with the NAS issue. Run option 11 - Configure NAS. Run option 3 - Re-enable NAS mount.

If the issue persists on an LDU, in the algosec_conf menu, run option 15 - Distributed Architecture configuration.

If the problem persists, contact AlgoSec support.

NAS is suspended

Open the algosec_conf menu on the node with the NAS issue. Run option 11 - Configure NAS. Run option 3 - Re-enable NAS mount.

If the issue persists on an LDU, in the algosec_conf menu, run option 15 - Distributed Architecture configuration.

If the problem persists, contact AlgoSec support.

The services listed below are not OK.

Checks status of services.

First, try to restart the services. Run for each service:

algosec_test_service -n <SERVICE NAME> -f

For example: algosec_test_service -n postgresql -f

If services do not restart, contact AlgoSec support.
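When several services are listed, the restart command above can be run in a loop. This is a sketch under one loud assumption: that `algosec_test_service` returns a non-zero exit status when a service fails to restart, which the source does not confirm. The function name `restart_services` is hypothetical.

```shell
#!/bin/sh
# restart_services NAME...: run the documented restart command once per service
# and list any service for which the command returned a non-zero status.
# Assumption: algosec_test_service signals failure through its exit status.
restart_services() {
    failed=""
    for svc in "$@"; do
        algosec_test_service -n "$svc" -f || failed="$failed $svc"
    done
    if [ -n "$failed" ]; then
        echo "Still failing:$failed" >&2
        return 1
    fi
}

# Usage, with the service names listed in the CLI message:
#   restart_services postgresql mongod
```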

Node: 10.20.8.95
* The path /home/afa/algosec should be non-broken symlink

Checks essential redirect links.

Contact AlgoSec support.
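
Before contacting support, it can help to confirm what state the path is in. The sketch below only inspects the path and changes nothing; the function name `is_live_symlink` is hypothetical.

```shell
#!/bin/sh
# is_live_symlink PATH: succeed only if PATH is a symlink whose target exists.
is_live_symlink() {
    # -L tests the link itself; -e follows the link and tests the target.
    [ -L "$1" ] && [ -e "$1" ]
}

if is_live_symlink /home/afa/algosec; then
    echo "/home/afa/algosec: symlink OK"
else
    echo "/home/afa/algosec: missing, broken, or not a symlink"
fi
```
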

Validation of upgrade files xxx failed. The files may be corrupted. Download the files again.

Checks for corrupted run files.

Download the run files again.

Distribution Architecture is not configured properly.

Checks for improperly configured distribution nodes.

In the algosec_conf menu, run option 15 - Distributed Architecture configuration.

[product] version earlier than [version #] found on this machine.

Checks for product versions earlier than two versions before the version you want to upgrade to.

Remove the product run file /root/Algosec_Upgrade/<product run file> or upgrade the product to a version not earlier than two versions before the version you want to upgrade to.

PostgreSQL is not synced between Cluster machine ([machine IP]) and the Primary machine ([machine IP]).

Checks PostgreSQL sync status between the cluster machine and the Primary.

In the algosec_conf menu, go to option 13 - HA/DR Setup. Select 1 - View cluster status details.

Inconsistencies found between the devices list and database records.

Checks for database inconsistencies.

To fix the inconsistency, follow the procedure in the knowledge base article: www.algosec.com/r/a32.00/42845777.

Excessive RPM removal check failed.

Checks for RPMs that need to be removed.

  1. In the log file, find the following line to get the list of excessive packages that can't be removed:

    -> error: Failed dependencies

  2. Manually remove the packages.

    For example, if the log displays:

    -->      libyajl.so.2()(64bit) is needed by (installed) collectd-5.8.1-1.el7.x86_64
    --> Error: can't remove excessive packages - some other packages are dependent on them

    Manually remove the RPM collectd-5.8.1-1.el7.x86_64.
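
The package names can be pulled out of the log rather than read by eye. This is a minimal sketch: the function name `dependent_packages` is hypothetical, and it assumes the log lines follow the "is needed by (installed)" pattern shown in the example. It only lists packages and does not remove anything.

```shell
#!/bin/sh
# dependent_packages: read log text on stdin and print, one per line, each
# package name that follows "is needed by (installed)".
dependent_packages() {
    sed -n 's/.*is needed by (installed) \([^[:space:]]*\).*/\1/p' | sort -u
}

# Assumption: the prerequisite checks log to the central upgrade log file.
log=/var/log/algosec-software-upgrade.log
if [ -r "$log" ]; then
    dependent_packages < "$log"
fi
```

Each printed name is a candidate for manual removal, for example with `rpm -e`.
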

Unable to connect to mongo admin.
Make sure mongod service is up and responsive.
automatic system prerequisites checks

First, try to restart the service. Run:

algosec_test_service -n mongod -f

If the mongo service does not start, contact AlgoSec support.

Failed to get HA dependent nodes, make sure that the ms-hadr service is up.

Checks that remote HA nodes are responsive.

In the algosec_conf menu of the HA Remote Agent, go to option 13 - HA/DR Setup. Select 1 - View cluster status details. Doing this restarts the service. Make sure that the cluster is now synced.

You are using a custom SSO module: <name of SSO module>

This implementation may be incompatible with the version you're upgrading to.

We recommend that you contact AlgoSec support before continuing with the upgrade.

Checks custom SSO module. Contact AlgoSec support.


Resolve upgrade failures

  • If your distributed upgrade fails for any reason, the system displays an error, as well as the location of specific log files.

    • The central upgrade log file is located at: /var/log/algosec-software-upgrade.log

    • The system also prompts you with options to start the upgrade again.

  • If you have a distributed system and only some nodes failed, the system will show a summary for all the nodes and their status. You can select the nodes you want to reinstall, or rerun the entire upgrade from scratch. Select the option that works best for you and run through the CLI process as prompted and described above.

  • For HA/DR Suspend/Resume Cluster errors: Go to /var/log/algosec_hadr/ms-hadr.log and check the log for errors.

  • For run file errors: Check the log displayed in the error message for details on why the upgrade failed.

  • If you receive the message:

    Upgrade failed during AES 256 encryption update.

    This indicates an AES 256 encryption update failure. For details on how to correct the problem, check the log file /var/log/algosec-software-upgrade.log. After fixing the problem, rerun the upgrade to resume the process.
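
When reading the log files named above, a quick filter can surface the most recent problem lines. This is a rough sketch: the function name `recent_errors` is hypothetical, and it assumes the relevant lines contain the words "error" or "fail", which may not hold for every failure.

```shell
#!/bin/sh
# recent_errors FILE [COUNT]: print the last COUNT (default 20) lines of FILE
# that mention "error" or "fail" (case-insensitive), prefixed by line number.
recent_errors() {
    grep -inE 'error|fail' "$1" | tail -n "${2:-20}"
}

log=/var/log/algosec-software-upgrade.log
if [ -r "$log" ]; then
    recent_errors "$log"
fi
```

The same helper can be pointed at /var/log/algosec_hadr/ms-hadr.log for HA/DR Suspend/Resume Cluster errors.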

Contact AlgoSec Support for additional assistance, and send copies of all supporting log information.