Ramsdata

One of the biggest challenges in IT monitoring is keeping the monitoring configuration up to date. New services appear, applications are added and server configurations change, so a configuration that was set up once quickly goes stale and no longer reflects the actual state of the infrastructure. Checkmk addresses this problem with automatic service discovery (Service Discovery), a mechanism that regularly scans monitored hosts and identifies what should be monitored and how, without manual configuration of each service.

Table of contents

  1. What is Service Discovery in Checkmk?
  2. How does the service detection mechanism work?
  3. Automatic detection vs. Checkmk rules
  4. Service Discovery vs. infrastructure changes
  5. Periodic Service Discovery – continuous configuration updates
  6. Checkmk Agent – the foundation of service discovery
  7. Detection of services in agentless monitoring (SNMP, API)
  8. Key findings
  9. FAQ
  10. Summary

What is Service Discovery in Checkmk?

Service Discovery is Checkmk’s mechanism for automatically identifying which services should be monitored on a given host and which parameters should be checked. A “service” in Checkmk is any aspect of a system that can be monitored: a disk, a network interface, a process, a daemon, a Windows service, a Kubernetes resource and hundreds of others.

Without automatic detection, an administrator would have to configure each service on each host by hand, deciding what to check, which alert thresholds to use and how to interpret the results. In an environment with hundreds or thousands of hosts, this is unsustainable. Checkmk Service Discovery removes most of this work: it scans hosts and either suggests services to monitor or adds them automatically. Ramsdata offers Checkmk as part of its portfolio and deploys it with full technical and training support.

How does the service detection mechanism work?

Checkmk’s Service Discovery mechanism is built on so-called check plugins, modules that are each responsible for a specific type of service. Each check plugin “knows” which data it needs from a host for its area and how to interpret that data.

When Checkmk scans a host, it runs all the relevant check plugins and each returns a list of detected instances. The disk monitoring plugin will return a list of all disks found on the host. The process monitoring plugin will return a list of running processes matching defined patterns. The network interface monitoring plugin will return a list of all interfaces. The result of a Service Discovery scan is a list of proposed services – the administrator can accept, reject or configure exceptions.
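
As a rough illustration of this idea (not Checkmk’s actual plugin API), the scan can be modeled as a set of discovery functions that each turn the data collected from a host into a list of proposed services. The section names, row layouts, process patterns and function names below are assumptions made up for this sketch.

```python
from typing import Dict, List, Sequence

# Hypothetical, simplified stand-in for data collected from a host:
# section name -> list of rows (each row is a list of strings).
HostData = Dict[str, List[List[str]]]


def discover_filesystems(data: HostData) -> List[str]:
    """Propose one service per mount point (assumed to be the last column)."""
    return [f"Filesystem {row[-1]}" for row in data.get("df", [])]


def discover_interfaces(data: HostData) -> List[str]:
    """Propose one service per network interface (assumed first column)."""
    return [f"Interface {row[0]}" for row in data.get("interfaces", [])]


def discover_processes(
    data: HostData, patterns: Sequence[str] = ("nginx", "postgres")
) -> List[str]:
    """Propose a service only for running processes matching a defined pattern."""
    running = {row[0] for row in data.get("ps", [])}
    return [f"Process {name}" for name in patterns if name in running]


# Every "plugin" is asked in turn; the combined result is the proposal list.
PLUGINS = [discover_filesystems, discover_interfaces, discover_processes]


def run_discovery(data: HostData) -> List[str]:
    proposed: List[str] = []
    for plugin in PLUGINS:
        proposed.extend(plugin(data))
    return proposed


example_host = {
    "df": [["/dev/sda1", "ext4", "/"], ["/dev/sdb1", "xfs", "/data"]],
    "interfaces": [["eth0", "up"]],
    "ps": [["nginx"], ["cron"]],
}
print(run_discovery(example_host))
```

In this model the administrator’s accept or reject decision would be applied to the returned list; the sketch only covers the detection step.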

Automatic detection vs. Checkmk rules

Rules in Checkmk allow you to configure detection and monitoring parameters in a hierarchical and scalable way. Instead of configuring each disk on each server separately, the administrator defines a rule: “on all production servers, monitor disks with a WARN threshold at 80% usage and a CRIT threshold at 90%.” The rule is automatically applied to all servers in the group.

Service Discovery respects rules when detecting services – if a rule says “don’t monitor temporary disks,” Service Discovery automatically excludes those disks from the list of detected services. The rules can be very granular – different thresholds for different hosts, exclusions for specific processes or interfaces. This combination of automatic detection with rule-based configuration makes Checkmk extremely scalable.
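
The interplay of exclusions and thresholds can be sketched as follows. This is a simplified model, not Checkmk’s rule engine: the host tags, the ignored mount prefixes, the default levels and the “first matching rule wins” evaluation are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class ThresholdRule:
    host_tag: str   # rule applies to every host carrying this tag
    warn: float     # WARN level in percent used
    crit: float     # CRIT level in percent used


@dataclass
class Host:
    name: str
    tags: Set[str]


# Hypothetical rule set: production servers get 80/90, test servers 90/95.
RULES = [ThresholdRule("prod", 80.0, 90.0), ThresholdRule("test", 90.0, 95.0)]

# Hypothetical exclusion rule: do not discover temporary filesystems.
IGNORED_MOUNT_PREFIXES = ("/tmp", "/run")


def is_discoverable(mount_point: str) -> bool:
    """Discovery skips mount points excluded by the exclusion rule."""
    return not mount_point.startswith(IGNORED_MOUNT_PREFIXES)


def thresholds_for(
    host: Host,
    rules: List[ThresholdRule] = RULES,
    default: Tuple[float, float] = (85.0, 95.0),
) -> Tuple[float, float]:
    """Return (warn, crit) from the first rule whose tag matches the host."""
    for rule in rules:
        if rule.host_tag in host.tags:
            return (rule.warn, rule.crit)
    return default


def evaluate(used_percent: float, levels: Tuple[float, float]) -> str:
    warn, crit = levels
    if used_percent >= crit:
        return "CRIT"
    if used_percent >= warn:
        return "WARN"
    return "OK"


web01 = Host("web01", {"prod", "linux"})
print(evaluate(85.0, thresholds_for(web01)))
```

One rule per host group replaces a separate threshold setting for every disk, which is the point of the rule-based approach described above.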

Service Discovery vs. infrastructure changes

Service Discovery is particularly valuable when the infrastructure changes. When a new disk, a new network interface or a new application service appears on a server, Checkmk detects the change during the next Service Discovery scan and flags it as a new service that is not yet monitored.

The administrator sees a list of new, removed and changed services in Checkmk and can decide on each of them: accept it for monitoring, reject it or leave the decision for later. This eliminates the risk that new infrastructure components remain unmonitored, something that happens regularly with the classic manual configuration approach. The opposite situation, the removal of a disk or interface, is also detected: the service is flagged as “vanished,” so the monitoring configuration can be cleaned up. For more information on Checkmk’s capabilities, see Ramsdata’s knowledge base.
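
Conceptually, change detection is a comparison between the services that are currently monitored and those found by the latest scan. The following minimal sketch uses plain set operations; the function name and service names are illustrative and not part of Checkmk.

```python
from typing import Dict, List, Set


def diff_services(monitored: Set[str], discovered: Set[str]) -> Dict[str, List[str]]:
    """Classify services by comparing the monitored set with the latest scan."""
    return {
        "new": sorted(discovered - monitored),        # found, not yet monitored
        "vanished": sorted(monitored - discovered),   # monitored, no longer found
        "unchanged": sorted(monitored & discovered),  # present in both
    }


monitored = {"Filesystem /", "Filesystem /data", "Interface eth0"}
discovered = {"Filesystem /", "Interface eth0", "Interface eth1"}

changes = diff_services(monitored, discovered)
print(changes)
```

Here the newly added interface shows up under "new" and the removed data disk under "vanished" (the lost services mentioned above), which are the two lists the administrator then decides on.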

Periodic Service Discovery – continuous configuration updates

In manual mode, the administrator has to scan hosts regularly and accept new services. Periodic Service Discovery automates this process: Checkmk scans hosts at predefined intervals and, depending on the configuration, automatically accepts new services, removes vanished ones or only flags the changes for manual verification.

This is especially valuable in dynamic environments – cloud, Kubernetes, microservices – where new services appear and disappear regularly. Periodic Service Discovery ensures that monitoring is always in sync with the actual state of the infrastructure without constant manual work by administrators. Configuring automatic mode requires caution – aggressive automation can lead to uncontrolled proliferation of monitoring configurations.
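
The trade-off between the automation levels can be sketched as a function that applies a chosen policy to the result of a scan. The mode names below are invented for this example and are not Checkmk’s own identifiers.

```python
from typing import Set


def apply_discovery(monitored: Set[str], discovered: Set[str], mode: str) -> Set[str]:
    """Return the new set of monitored services for a given (hypothetical) mode."""
    new = discovered - monitored
    vanished = monitored - discovered

    if mode == "flag_only":          # change nothing, leave decisions to a human
        return set(monitored)
    if mode == "add_new":            # accept new services, keep vanished ones
        return monitored | new
    if mode == "remove_vanished":    # drop vanished services, ignore new ones
        return monitored - vanished
    if mode == "add_and_remove":     # fully automatic: mirror the scan result
        return (monitored | new) - vanished
    raise ValueError(f"unknown mode: {mode}")


monitored = {"Filesystem /", "Filesystem /data"}
discovered = {"Filesystem /", "Interface eth1"}

for mode in ("flag_only", "add_new", "remove_vanished", "add_and_remove"):
    print(mode, sorted(apply_discovery(monitored, discovered, mode)))
```

The fully automatic mode always ends up equal to the scan result, which is convenient in fast-changing environments but also means that a faulty scan directly changes what is monitored. This is why the more conservative modes are worth considering.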

Checkmk Agent – the foundation of service discovery

The Checkmk agent is a lightweight program installed on monitored hosts (Linux, Windows, AIX, Solaris and others) that collects system data and makes it available to the Checkmk server. The agent is the foundation of Service Discovery; without it, service discovery capabilities are significantly limited.

The Checkmk agent collects and provides data from dozens of sources: file systems, processes, system services, logs, memory, CPU, network and more. Check plugins on the Checkmk server interpret this data and detect services. A further advantage of the agent is its extensibility through local checks: administrators can add their own check scripts, whose output is collected by the agent and interpreted by Checkmk. This allows applications and custom metrics to be monitored with full integration into Service Discovery.
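
A local check is a script that prints one line per service in the form `<status> "<service name>" <metrics> <text>`, where the status is 0 (OK), 1 (WARN), 2 (CRIT) or 3 (UNKNOWN) and metrics may carry levels as `name=value;warn;crit`. The sketch below builds such a line; the “Backup age” service, its thresholds and the helper names are made up for illustration, and the directory in which the script has to be placed depends on the agent installation.

```python
from typing import Dict


def local_check_line(
    status: int, service_name: str, metrics: Dict[str, str], summary: str
) -> str:
    """Format one local check result line."""
    if status not in (0, 1, 2, 3):
        raise ValueError("status must be 0 (OK), 1 (WARN), 2 (CRIT) or 3 (UNKNOWN)")
    # Multiple metrics are separated by "|"; "-" means "no metrics".
    metric_part = "|".join(f"{name}={value}" for name, value in metrics.items()) or "-"
    return f'{status} "{service_name}" {metric_part} {summary}'


def backup_age_check(age_hours: float, warn: float = 24.0, crit: float = 48.0) -> str:
    """Hypothetical example check: how old is the last backup?"""
    if age_hours >= crit:
        status = 2
    elif age_hours >= warn:
        status = 1
    else:
        status = 0
    metric = f"{age_hours:g};{warn:g};{crit:g}"
    return local_check_line(
        status, "Backup age", {"age_hours": metric}, f"Last backup {age_hours:g} h ago"
    )


# In a real script the age would be measured; here it is a fixed sample value.
print(backup_age_check(30.0))
```

Once the agent delivers such a line, the service can be picked up by the next Service Discovery scan like any built-in service.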

Detection of services in agentless monitoring (SNMP, API)

Not all devices can have an agent installed: network switches, printers, IoT devices and disk arrays typically expose their data via SNMP. Checkmk supports Service Discovery for SNMP-monitored hosts as well. It queries the device’s OIDs (the object identifiers defined in its MIBs) and detects which metrics are available.

Integration with third-party APIs (VMware vCenter, AWS, Azure, Kubernetes) allows dynamic detection of services in virtualized and cloud environments. Checkmk automatically detects new virtual machines, Kubernetes containers, cloud resources and adds them to monitoring without manual configuration of each resource. This is especially important in dynamic environments where the infrastructure changes multiple times a day.
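
For SNMP devices, the detection step can be pictured as matching a few identifying OIDs against conditions declared per plugin. The sketch below only models that idea and is not Checkmk’s detection logic: the plugin names and conditions are hypothetical, while the two OIDs are the standard MIB-2 `sysDescr` and `sysObjectID` objects, and `.1.3.6.1.4.1.9` is Cisco’s enterprise prefix.

```python
from typing import Callable, Dict, List

SYS_DESCR = ".1.3.6.1.2.1.1.1.0"      # MIB-2 sysDescr
SYS_OBJECT_ID = ".1.3.6.1.2.1.1.2.0"  # MIB-2 sysObjectID

# OID -> value, as it might be fetched from a device (fetching is out of scope).
OidValues = Dict[str, str]

# Hypothetical plugins, each with a condition deciding whether it applies.
DETECTORS: Dict[str, Callable[[OidValues], bool]] = {
    "generic_interfaces": lambda oids: True,
    "cisco_specific": lambda oids: oids.get(SYS_OBJECT_ID, "").startswith(
        ".1.3.6.1.4.1.9."
    ),
    "printer_supplies": lambda oids: "printer" in oids.get(SYS_DESCR, "").lower(),
}


def applicable_plugins(oids: OidValues) -> List[str]:
    """Return the names of all plugins whose detection condition matches."""
    return sorted(name for name, matches in DETECTORS.items() if matches(oids))


switch = {SYS_DESCR: "Cisco IOS Software", SYS_OBJECT_ID: ".1.3.6.1.4.1.9.1.516"}
print(applicable_plugins(switch))
```

Only the plugins selected this way would then go on to query their own OIDs and propose services, which keeps the scan of a device short.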

Key findings

  • Service Discovery automatically identifies what should be monitored on each host.
  • Check plugins analyze data from the host and return lists of detected instances (disks, interfaces, processes).
  • Checkmk rules configure detection and monitoring parameters in a hierarchical and scalable manner.
  • Periodic Service Discovery automates the synchronization of monitoring configurations with infrastructure status.
  • The Checkmk agent provides the broadest service discovery capabilities, including support for custom check scripts.
  • Agentless monitoring via SNMP and APIs provides Service Discovery for network devices and cloud environments.

FAQ

How often should Checkmk perform Periodic Service Discovery? For static environments, once a day or once a week is sufficient. For dynamic environments (Kubernetes, cloud), every 30-60 minutes or more often is appropriate, with automatic acceptance configured cautiously.

Can Service Discovery automatically remove vanished services? Yes – Periodic Service Discovery in “fixall” mode automatically accepts new services and removes vanished ones. In production environments, it is a good idea to use “new_only” mode and verify deletions manually.

How does Checkmk handle thousands of services on a large host? Checkmk is designed to scale – monitoring servers can support tens of thousands of services. Distributed Monitoring architecture allows scaling by adding local monitoring servers.

Can Service Discovery be limited to selected types of services? Yes – rules allow you to exclude specific check plugins or service types from Service Discovery for selected groups of hosts.

Summary

Checkmk’s automatic service discovery is one of the platform’s most valuable features: it eliminates manual configuration of each service and keeps monitoring up to date with the actual state of the infrastructure. Combined with rules, Periodic Service Discovery and integration with agents and APIs, Checkmk provides a monitoring system that grows and adapts with your infrastructure. If you want to implement or optimize IT monitoring in your organization, contact Checkmk partner Ramsdata.
