Core Health Monitoring

Taking the pulse

Core health applies to all system types. It consists of a number of broad metrics, as appropriate for the system, covering:

  • Average CPU load
  • Memory usage
  • Disk space
  • Failed services
  • Key performance items

The CPU load is not an instantaneous value but an average of samples taken over a 5-minute period, so that brief spikes do not trigger inappropriate responses.
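As a rough illustration of why averaging damps brief spikes, the following is a minimal sketch of a sliding-window average. The class name, the 30-second sampling interval in the usage note and the sample values are assumptions made for the example; the source only states that samples are averaged over a 5-minute period.

```python
from collections import deque


class RollingAverage:
    """Average of the samples that fall inside a sliding time window."""

    def __init__(self, window_seconds=300):  # 300 s = the 5-minute period
        self.window_seconds = window_seconds
        self.samples = deque()  # (timestamp, value) pairs, oldest first

    def add(self, timestamp, value):
        """Record a sample and discard any that are older than the window."""
        self.samples.append((timestamp, value))
        cutoff = timestamp - self.window_seconds
        while self.samples and self.samples[0][0] <= cutoff:
            self.samples.popleft()

    def average(self):
        """Return the mean of the retained samples, or None if there are none."""
        if not self.samples:
            return None
        return sum(value for _, value in self.samples) / len(self.samples)


# Usage: steady 10% load sampled every 30 s, then a single spike to 100%.
cpu = RollingAverage()
for t in range(0, 300, 30):
    cpu.add(t, 10.0)
cpu.add(300, 100.0)
print(cpu.average())  # the spike moves the average far less than the raw value
```

With these assumed samples, a single reading of 100% only lifts the reported load to 19%, which is the effect the averaging is intended to have.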

Memory usage and disk space are each expressed as the percentage of the total available capacity that is in use.
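The percentage calculation can be sketched as below. The function names are hypothetical, and reading disk figures with Python's standard `shutil.disk_usage` is only one possible way to obtain them; the source does not say how the values are collected.

```python
import shutil


def percent_used(used, total):
    """Return used capacity as a percentage of the total capacity."""
    if total <= 0:
        raise ValueError("total capacity must be positive")
    return 100.0 * used / total


def disk_percent_used(path="."):
    """Percentage in use of the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return percent_used(usage.used, usage.total)


# Usage: 50 units used out of 200 is 25% used.
print(percent_used(50, 200))
```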

Threshold-Based Monitoring

Each metric has two thresholds, which divide its range into three levels. When a metric crosses a threshold, an appropriate notification is generated.

The thresholds default to sensible values, but can all be tailored to individual systems as required.
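The threshold behaviour described above might be sketched as follows. Everything named here is hypothetical: the default values of 80 and 90, the level names, the assumption that a higher value is worse, and the notification text are all chosen for illustration and are not taken from the product.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    warning: float = 80.0   # hypothetical default, not the product's value
    critical: float = 90.0  # hypothetical default, not the product's value


def level(value, thresholds):
    """Map a value onto one of the three levels (assumes higher is worse)."""
    if value >= thresholds.critical:
        return "critical"
    if value >= thresholds.warning:
        return "warning"
    return "normal"


class MetricWatcher:
    """Remembers the last level and reports when a threshold is crossed."""

    def __init__(self, name, thresholds=None):
        self.name = name
        self.thresholds = thresholds or Thresholds()
        self.last_level = None

    def update(self, value):
        """Return a notification string if the level changed, else None."""
        new_level = level(value, self.thresholds)
        old_level, self.last_level = self.last_level, new_level
        if old_level is not None and new_level != old_level:
            return f"{self.name}: {old_level} -> {new_level} (value {value})"
        return None


# Usage: default thresholds, then thresholds tailored to one system.
disk = MetricWatcher("disk")
for reading in (50, 85, 86, 95):
    message = disk.update(reading)
    if message:
        print(message)

cpu = MetricWatcher("cpu", Thresholds(warning=60.0, critical=75.0))
```

In this sketch a notification is produced only when the level changes, so a metric that stays above a threshold does not generate a message on every reading.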

Broad Categorisation

Each metric has one of five states, determined by the value of the metric and the thresholds set.

  • Error – the information could not be obtained
  • Critical – a serious problem that should be resolved immediately
  • Warning – a problem that warrants attention
  • Normal – value is within normal limits
  • Not available – the information is not currently available
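One way to picture how the five states could be derived is the sketch below. The distinction drawn here between Error (the collector raised an exception) and Not available (the collector returned no value) is an assumption for the example, as are the function name, the collector callables and the idea that a higher value is worse.

```python
from enum import Enum


class State(Enum):
    ERROR = "Error"
    CRITICAL = "Critical"
    WARNING = "Warning"
    NORMAL = "Normal"
    NOT_AVAILABLE = "Not available"


def classify(read_metric, warning, critical):
    """Derive a state from a collector callable and two thresholds.

    Assumptions for this sketch: an exception means the information could
    not be obtained, None means it is not currently available, and a
    higher value is worse.
    """
    try:
        value = read_metric()
    except Exception:
        return State.ERROR
    if value is None:
        return State.NOT_AVAILABLE
    if value >= critical:
        return State.CRITICAL
    if value >= warning:
        return State.WARNING
    return State.NORMAL


# Usage with stand-in collectors.
print(classify(lambda: 42.0, warning=80.0, critical=90.0).value)
print(classify(lambda: None, warning=80.0, critical=90.0).value)
```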

Colour Coding

Each metric is also colour coded to indicate the severity of the problem, if any, giving an "at a glance" indication of system health.
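A colour coding of this kind amounts to a lookup from state to colour, as in the sketch below. The palette is purely an assumption (a traffic-light scheme with grey and purple for the two non-value states); the source does not say which colours are used.

```python
# Hypothetical palette: the actual colours used are not stated in the source.
STATE_COLOURS = {
    "Error": "purple",
    "Critical": "red",
    "Warning": "amber",
    "Normal": "green",
    "Not available": "grey",
}


def colour_for(state):
    """Return the colour assigned to a state name."""
    return STATE_COLOURS[state]


def render(metric_states):
    """Format one summary line per metric, tagged with its colour."""
    return [
        f"{name}: {state} [{colour_for(state)}]"
        for name, state in metric_states.items()
    ]


# Usage with made-up metric states.
for line in render({"Disk space": "Warning", "Memory usage": "Normal"}):
    print(line)
```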

The core health information also forms a key part of the reporting, both for network overviews and individual systems.
