Core health applies to all system types and consists of a number of broad metrics, covering the following as appropriate for the system:
- Average CPU load
- Memory usage
- Disk space
- Failed services
- Key performance items
The CPU load is not an instantaneous value; it is an average of samples taken over a 5-minute period, so that brief spikes do not trigger inappropriate responses.
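The averaging described above can be sketched as a sliding window over timestamped samples. This is a minimal illustration only: the class name `LoadAverager`, the 30-second sampling interval and the sample values are assumptions, not details of the actual product.

```python
from collections import deque
from typing import Deque, Optional, Tuple

WINDOW_SECONDS = 5 * 60  # the 5-minute averaging period


class LoadAverager:
    """Keeps CPU load samples from the last `window` seconds and averages them.

    Hypothetical helper for illustration; not the product's implementation.
    """

    def __init__(self, window: float = WINDOW_SECONDS) -> None:
        self.window = window
        self.samples: Deque[Tuple[float, float]] = deque()  # (timestamp, load)

    def add(self, timestamp: float, load: float) -> None:
        self.samples.append((timestamp, load))
        # Discard samples that have fallen out of the window.
        while self.samples and self.samples[0][0] <= timestamp - self.window:
            self.samples.popleft()

    def average(self) -> Optional[float]:
        if not self.samples:
            return None  # nothing sampled yet
        return sum(load for _, load in self.samples) / len(self.samples)


averager = LoadAverager()
# Assumed samples every 30 seconds: nine quiet readings, then one spike to 100%.
for i, load in enumerate([10.0] * 9 + [100.0]):
    averager.add(timestamp=i * 30.0, load=load)

print(averager.average())  # 19.0
```

A single reading of 100% moves the average only to 19%, which is why a brief spike on its own would not be expected to cross a threshold.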
Memory usage and disk space are expressed as percentages used of the total available capacity.
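As a worked example of the percentage calculation, the sketch below divides the amount used by the total capacity. The function name `percent_used` is an assumption for illustration; `shutil.disk_usage` is the Python standard library call, used here only to show the same arithmetic against a real filesystem.

```python
import shutil


def percent_used(used: float, total: float) -> float:
    """Return used capacity as a percentage of the total capacity."""
    if total <= 0:
        raise ValueError("total capacity must be positive")
    return 100.0 * used / total


# Worked example: 150 GB used on a 200 GB disk.
print(percent_used(150, 200))  # 75.0

# The same calculation for the root filesystem of the current machine.
usage = shutil.disk_usage("/")
print(round(percent_used(usage.used, usage.total), 1))
```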
Each metric has two thresholds, which divide its range into three levels. When a metric crosses a threshold, an appropriate notification is generated.
The thresholds default to sensible values, but can all be tailored to individual systems as required.
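One way to picture defaults, per-system tailoring and threshold-crossing notifications is sketched below. Everything specific here is an assumption made for illustration: the threshold values, the metric and system names, the level names, and the idea that a higher value is worse.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Thresholds:
    warning: float   # lower threshold (percent)
    critical: float  # upper threshold (percent)


# Hypothetical defaults; the product's real default values are not stated.
DEFAULTS: Dict[str, Thresholds] = {
    "cpu_load": Thresholds(warning=80.0, critical=95.0),
    "memory": Thresholds(warning=80.0, critical=90.0),
    "disk": Thresholds(warning=85.0, critical=95.0),
}

# Hypothetical per-system overrides, keyed by system name, then metric.
OVERRIDES: Dict[str, Dict[str, Thresholds]] = {
    "db-server": {"memory": Thresholds(warning=90.0, critical=97.0)},
}


def thresholds_for(system: str, metric: str) -> Thresholds:
    """Use the system's own thresholds if set, otherwise the defaults."""
    return OVERRIDES.get(system, {}).get(metric, DEFAULTS[metric])


def level(value: float, t: Thresholds) -> str:
    """Map a value onto one of the three levels the two thresholds create."""
    if value >= t.critical:
        return "critical"
    if value >= t.warning:
        return "warning"
    return "normal"


def check(system: str, metric: str, previous: float, current: float) -> Optional[str]:
    """Return a notification message if a threshold was crossed, else None."""
    t = thresholds_for(system, metric)
    old, new = level(previous, t), level(current, t)
    if old != new:
        return f"{system}: {metric} went from {old} to {new} ({current:.0f}%)"
    return None


print(check("web-server", "memory", 75.0, 85.0))
# web-server: memory went from normal to warning (85%)
print(check("db-server", "memory", 75.0, 85.0))
# None
```

The same change in memory usage produces a notification on one system and not on the other, because the second system has been given a higher warning threshold.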
Each metric is in one of five states, determined by its value and the thresholds set:
- Error – the information could not be obtained
- Critical – a serious problem that should be resolved immediately
- Warning – a problem that warrants attention
- Normal – value is within normal limits
- Not available – the information is not currently available
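The five states listed above could be derived along the following lines. This is a sketch under stated assumptions: the `metric_state` function, its `failed` flag for distinguishing a failed collection from a value that is simply not yet available, and the example thresholds are all hypothetical.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    ERROR = "Error"
    CRITICAL = "Critical"
    WARNING = "Warning"
    NORMAL = "Normal"
    NOT_AVAILABLE = "Not available"


def metric_state(value: Optional[float], warning: float, critical: float,
                 failed: bool = False) -> State:
    """Determine a metric's state from its value and its two thresholds.

    Assumes higher values are worse and that `failed` marks a collection error.
    """
    if failed:
        return State.ERROR          # the information could not be obtained
    if value is None:
        return State.NOT_AVAILABLE  # nothing to report at the moment
    if value >= critical:
        return State.CRITICAL
    if value >= warning:
        return State.WARNING
    return State.NORMAL


print(metric_state(42.0, warning=80.0, critical=95.0).value)  # Normal
print(metric_state(88.0, warning=80.0, critical=95.0).value)  # Warning
print(metric_state(None, warning=80.0, critical=95.0).value)  # Not available
print(metric_state(None, warning=80.0, critical=95.0, failed=True).value)  # Error
```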
Each metric is also colour coded to indicate the severity of the problem, if any, giving an "at a glance" indication of system health.
The core health information also forms a key part of the reporting, both for network overviews and individual systems.