Signiant Support

Emergency measures to recover manager disk space


Summary

    There may come a time when emergency measures are needed in order to free disk space on your manager, and for whatever reason, it is impossible to bring the manager down for maintenance.  This article offers suggestions on how to free disk space for a running manager.

Discussion

Suggestion 1:  Delete unneeded log files

Deleting log files, particularly if they are for a job running in debug mode, can free a significant amount of space.

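For Linux managers, you can use the command below to delete old log files when executed from the Signiant log directory (/usr/signiant/dds/log by default).  Note that find counts age in whole 24-hour periods and discards the fraction, so -ctime +1 matches files that are at least 48 hours old; use -ctime +0 to match everything older than 24 hours.  Also note that -ctime tests the time of the file's last status change, not its creation time: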

find . -name "*.log" -ctime +1 -delete

For Windows managers you can use the following at a command prompt, assuming you are in the Signiant log directory (C:\Program Files\Signiant\Mobilize\log by default).  If you place the command in a batch file, write %%a in place of %a:

for /F %a in ('..\bin\findfiles -nodirs -created gt 1d -name "*.log"') do @del %a
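
Before deleting anything, it can help to preview which log files would be removed and how much space they hold.  The following is a minimal Python sketch of such a dry run, not a Signiant tool.  The names find_old_logs and report are hypothetical, and the sketch assumes that the file modification time is an acceptable stand-in for the age tests used in the commands above.

```python
import time
from pathlib import Path


def find_old_logs(log_dir, min_age_hours=24.0, now=None):
    """Return (path, size_in_bytes) for each *.log file older than min_age_hours.

    Assumption: age is measured from the last modification time (st_mtime),
    which is not identical to the -ctime and -created tests shown above.
    """
    now = time.time() if now is None else now
    cutoff = now - min_age_hours * 3600
    matches = []
    for path in Path(log_dir).rglob("*.log"):
        if not path.is_file():
            continue
        info = path.stat()
        if info.st_mtime < cutoff:
            matches.append((path, info.st_size))
    return sorted(matches)


def report(log_dir, min_age_hours=24.0):
    """Print the candidates and the total space they hold; deletes nothing."""
    matches = find_old_logs(log_dir, min_age_hours)
    for path, size in matches:
        print(f"{size:>12}  {path}")
    total_mb = sum(size for _, size in matches) / (1024 * 1024)
    print(f"{len(matches)} file(s), {total_mb:.1f} MB")
```

For example, report("/usr/signiant/dds/log") lists the candidates in the default Linux log directory without removing them.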

Suggestion 2:  Delete Upgrade in Place staging files

Approximately 1 GB of storage can be recovered by deleting the Upgrade in Place staging files.  These are located in the manager's stage/agents directory (/usr/signiant/dds/stage/agents by default on Linux, C:\Program Files\Signiant\Mobilize\stage\agents by default on Windows).

Note that while these files are not needed for normal manager operations, you will be unable to use the Agent Upgrade feature of the manager GUI until these files are replaced (either by copying the files back or by performing an upgrade on the manager).
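
Because the staging files can later be copied back, one option is to move them to another volume instead of deleting them outright.  The sketch below illustrates that idea in Python under stated assumptions: park_staging_files is a hypothetical helper, and the holding directory is any location you choose on a device or partition with free space.

```python
import shutil
from pathlib import Path


def park_staging_files(stage_dir, holding_dir):
    """Move every entry in stage_dir into holding_dir; return the bytes moved.

    Both paths are supplied by the caller. holding_dir is a hypothetical
    location on a different device or partition with free space.
    """
    stage = Path(stage_dir)
    holding = Path(holding_dir)
    holding.mkdir(parents=True, exist_ok=True)
    moved = 0
    for entry in sorted(stage.iterdir()):
        # Measure the entry before moving it so the total can be reported.
        if entry.is_dir():
            size = sum(f.stat().st_size for f in entry.rglob("*") if f.is_file())
        else:
            size = entry.stat().st_size
        shutil.move(str(entry), str(holding / entry.name))
        moved += size
    return moved
```

Moving the same entries back into stage/agents restores the state required by the Agent Upgrade feature.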

Suggestion 3:  Truncate the interval stats table

For managers that run a large number of jobs at a high frequency, or where the maintenance job is either not running or running with settings that are not aggressive enough, the database table holding the interval stats can consume an extremely large amount of space.  These stats are used only to draw the graph shown in the job details screen.  After removing the stats you will not be able to see any graphs for past runs, but there will be no other effect on the system.

To determine if your interval stats table is consuming a large amount of disk space, do the following:

    Log on to the manager as root and run this command:
         psql DTM_DB -U postgres

    At the DTM_DB=# prompt, enter the following command:
         select relname,relpages from pg_class where relname = 'job_run_stat_interval';

    This will return a table similar to the following:
                relname        |  relpages
        -----------------------+------------
         job_run_stat_interval | 1.1294e+06
        (1 row)

    The value under 'relpages' shows how many 8 KB pages are used to store the table on disk.  Sufficiently large numbers will be expressed in scientific notation.  In the example above, approximately 1,129,400 pages are being used by the table, which consume 9,035,200 KB of disk space, or roughly 9 GB.

    If the table is consuming a large amount of space, truncate it with this command:
         truncate job_run_stat_interval;

    Exit psql with \q.  Disk space is returned within a very short time as the database removes the data.

IMPORTANT:  This is the ONLY table used by the Signiant manager that is considered safe for the truncate operation.  Under no circumstances should any other table be truncated.  Truncating any table other than job_run_stat_interval is NOT a supported operation.
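
The page arithmetic used above can be sketched in a few lines of Python.  This is only an illustration: relpages_to_kb and describe are hypothetical names, the 8 KB page size is the figure stated in this article, and GB is taken as 1,000,000 KB to match the rounding used here.

```python
PAGE_SIZE_KB = 8  # page size stated in this article


def relpages_to_kb(relpages):
    """Convert a relpages value (int, float, or text such as '1.1294e+06') to KB."""
    pages = int(round(float(relpages)))
    return pages * PAGE_SIZE_KB


def describe(relpages):
    """Format the on-disk size of a relation for display."""
    kb = relpages_to_kb(relpages)
    return f"{kb:,} KB (about {kb / 1_000_000:.1f} GB)"


print(describe("1.1294e+06"))  # prints: 9,035,200 KB (about 9.0 GB)
```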

Suggestion 4:  Reindex tables with bloated indexes

Over time, a table's index can grow to exceptional size.  Reindexing these tables can return substantial disk space and potentially improve overall database performance.  To determine if any of your tables are good candidates for reindexing, do the following:

    Log on to the manager as root and run this command:
         psql DTM_DB -U postgres

    At the DTM_DB=# prompt, enter the following command:
         select relname,relpages,reltuples from pg_class order by relpages desc;

    This will return a table similar to the following:

               relname            | relpages | reltuples
    ------------------------------+----------+-----------
     scheduled_job_run_pkey       |   153659 |    582135
     scheduled_job_run            |    28017 |    582135
     job_run_stat                 |    11087 |      8674
     scheduled_job_run_uidx       |    10329 |    582161
     job_run_stat_interval        |     7354 |    161347
     job_run_stat_interval_idx    |     5758 |    161351
     scheduled_job_run_cntrct_idx |     5608 |    582145
     scheduled_job                |     5446 |      1787

    (Note:  The above is a truncated list -- many more will be returned from this command.)

    The value under 'relpages' shows how many 8 KB pages are used to store each table or index on disk.  Sufficiently large numbers will be expressed in scientific notation.  Problems can arise when the space used for an index exceeds the space used for the table itself.  Problems can also arise if the number of pages exceeds the number of tuples (rows) in the table being indexed.

    In the example above, 153659 pages are being used by scheduled_job_run_pkey, the index for scheduled_job_run.  This is more than five times the 28017 pages used by the table itself, and will result in very inefficient lookups when this index is employed.  In this example, 1229272 KB (about 1.2 GB) of space is used to hold the index, where the table itself is only 224136 KB (roughly 220 MB).  Note that in this sample the index's page count does not exceed the 582135 rows in the table; the size of the index relative to its table is what makes it a candidate for reindexing.

    For tables that are good candidates for reindexing, run the following command:
         reindex table TABLENAME;

    where TABLENAME is the name of the table.  In our example we would execute the command:
         reindex table scheduled_job_run;

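    This operation may take a long time to complete.  PostgreSQL blocks writes to the table while it is being reindexed, so on a running manager choose a quiet period if possible.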
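
The comparison described above, index pages against table pages, can be sketched in Python as follows.  This is an illustrative helper, not part of the product: bloated_indexes is a hypothetical name, the page counts are copied from the sample output, and only the pairing of scheduled_job_run_pkey with scheduled_job_run is stated in this article.  The other index-to-table pairings are assumptions inferred from the names.

```python
def bloated_indexes(relpages, index_to_table, min_ratio=1.0):
    """Return (index, table, ratio) for indexes larger than min_ratio times their table."""
    flagged = []
    for index, table in index_to_table.items():
        table_pages = relpages[table]
        if table_pages == 0:
            continue  # an empty table gives no meaningful ratio
        ratio = relpages[index] / table_pages
        if ratio > min_ratio:
            flagged.append((index, table, round(ratio, 2)))
    # Largest ratio first, so the best reindex candidates lead the list.
    return sorted(flagged, key=lambda item: item[2], reverse=True)


# Page counts copied from the sample output above.
sample_pages = {
    "scheduled_job_run_pkey": 153659,
    "scheduled_job_run": 28017,
    "scheduled_job_run_uidx": 10329,
    "job_run_stat_interval": 7354,
    "job_run_stat_interval_idx": 5758,
    "scheduled_job_run_cntrct_idx": 5608,
}

# Only the first pairing is stated in the article; the rest are assumed from the names.
sample_pairs = {
    "scheduled_job_run_pkey": "scheduled_job_run",
    "scheduled_job_run_uidx": "scheduled_job_run",
    "scheduled_job_run_cntrct_idx": "scheduled_job_run",
    "job_run_stat_interval_idx": "job_run_stat_interval",
}
```

Calling bloated_indexes(sample_pages, sample_pairs) flags only scheduled_job_run_pkey, with a ratio of about 5.48.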

Additional Information

Perform any database action only after a good manager backup is available.  When in doubt, run the manager's backup job (from the Administration / Manager / Backup menu) and do not proceed with any database work until the backup has completed successfully.

If disk space is becoming a regular issue for your manager you should consider moving the manager's installation directory to a device or partition with more storage capacity.
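
To notice a shrinking volume before it becomes an emergency, a periodic free-space check is one possibility.  The following Python sketch is an assumption-laden example and not a Signiant feature: free_space_report is a hypothetical name and the 10 percent threshold is arbitrary.

```python
import shutil


def free_space_report(path, warn_below_percent=10.0):
    """Return (free_percent, is_low) for the filesystem that holds path."""
    usage = shutil.disk_usage(path)
    free_percent = usage.free / usage.total * 100
    return round(free_percent, 1), free_percent < warn_below_percent
```

For example, free_space_report("/usr/signiant/dds") reports on the volume holding the default Linux installation directory.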

When it is possible to take a maintenance window, the following recommendations should be considered:

    Apply the settings found in the KB article titled "Recommended Manager Configuration for Large Installs".
    Move the manager's log files to another device or partition by following the steps in the KB article titled "Change the location of a manager or agent's log directory".