Agent Deploy

The agent watches K8s resources and reports service status every 30 seconds. It discovers services from ConfigMaps labeled with itops.io/config: "true". Install one agent per cluster.

Agent values.yaml

node:
  id: "myorg/myplatform/prod/cluster1"   # 4-level hierarchy path (REQUIRED)
  name: "production-cluster"

itops:
  url: "https://api.yourdomain.com"      # ITOps Core API URL (REQUIRED)
  apiKey:
    value: "your-operator-api-key"        # Must match ITOPS_SECURITY_OPERATOR_API_KEY
    # OR use existing secret:
    # existingSecret: "itops-api-key"
    # existingSecretKey: "api-key"

slaGroups:                                # Optional: define SLA groups from agent
  - name: "payment-system"
    displayName: "Payment System"
    tier: "critical"
    targets:
      uptime: 99.99

watch:
  namespaces: []                          # Empty = watch all namespaces

Important: The apiKey.value must match the ITOPS_SECURITY_OPERATOR_API_KEY value set in the ITOps platform chart. If it is not set, the agent pod fails to start with CreateContainerConfigError.
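The commented-out existingSecret option above can be sketched as follows. This is a minimal example, assuming the Secret lives in the same namespace as the agent (a pod can only reference Secrets in its own namespace); the Secret name and key are the ones shown in the comments above, and the key value is a placeholder.

```yaml
# Secret holding the operator API key (create it before installing the agent)
apiVersion: v1
kind: Secret
metadata:
  name: itops-api-key                    # referenced by itops.apiKey.existingSecret
type: Opaque
stringData:
  api-key: "your-operator-api-key"       # placeholder; must match ITOPS_SECURITY_OPERATOR_API_KEY
---
# Agent values.yaml fragment that points at the Secret instead of an inline value
itops:
  url: "https://api.yourdomain.com"
  apiKey:
    existingSecret: "itops-api-key"
    existingSecretKey: "api-key"
```

The two documents are separate files: the first is applied to the cluster, the second is part of the agent's Helm values.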

Service Config

Each service is configured via a ConfigMap that the agent discovers. The agent watches for ConfigMaps with label <labelPrefix>/config: "true" (prefix defaults to itops.io, configurable via watch.labelPrefix in the agent Helm values) and reads the data under one of these keys: it-ops.yaml, itops.yaml, it-ops.yml, itops.yml.
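As an illustration of the label prefix rule, the sketch below pairs a custom watch.labelPrefix with a matching ConfigMap label. The prefix acme.io, the ConfigMap name, and the service name are example values, not defaults.

```yaml
# Agent values.yaml fragment: change the discovery prefix
watch:
  labelPrefix: "acme.io"                 # example prefix; the default is itops.io
---
# Service ConfigMap: the label must use the same prefix as the agent
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-service-itops                 # hypothetical name
  labels:
    acme.io/config: "true"
data:
  it-ops.yaml: |
    version: "1"
    hierarchy:
      organization: myorg
      platform: myplatform
      environment: prod
      cluster: cluster1
      service: my-service
    service:
      name: my-service
```

A ConfigMap that still carries itops.io/config: "true" would not be discovered by an agent configured this way.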

Required fields: the hierarchy block (all 5 levels) and service.name are mandatory; without them the agent will not register the service. service.workloadType and service.workloadName are additionally needed for health monitoring (status, replicas).

Legacy fallback: A single-line placement.node: "org/platform/env/cluster" is still accepted by the parser for backward compatibility with older manifests, but the structured hierarchy block is preferred.
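For comparison, the legacy form might look like the sketch below. It shows only the contents of the it-ops.yaml key, and it assumes that placement.node is a node key nested under a top-level placement block; the path and service name are example values.

```yaml
# Legacy form: one path string instead of the structured hierarchy block
version: "1"
placement:
  node: "myorg/myplatform/prod/cluster1"   # org/platform/env/cluster
service:
  name: my-service                         # still required
```

New manifests should use the hierarchy block instead.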

SLA groups (agent block vs. service field): The agent-level slaGroups: block (shown above) is where you define groups and their uptime targets, once per cluster. Per-service ConfigMaps then reference an existing group via service.slaGroup: "payment-system" to add that service as a member. If two agents (or an agent plus a push webhook) reference the same group name across clusters, they all merge into the same group row, because sla_groups is UNIQUE(name).

ConfigMap Template (recommended)

# templates/itops-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Chart.Name }}-itops
  labels:
    itops.io/config: "true"          # REQUIRED - agent discovers by this label
data:
  it-ops.yaml: |                     # REQUIRED - key must be one of it-ops.yaml, itops.yaml, it-ops.yml, itops.yml
    version: "1"
    hierarchy:                       # REQUIRED - all 5 levels
      organization: {{ .Values.itops.organization | default "myorg" }}
      platform: {{ .Values.itops.platform | default "myplatform" }}
      environment: {{ .Values.itops.environment | default "prod" }}
      cluster: {{ .Values.itops.cluster | default "cluster1" }}
      service: {{ .Chart.Name }}
    service:
      name: {{ .Chart.Name }}        # REQUIRED - service identifier
      criticality: {{ .Values.itops.criticality | default "medium" }}
      slaGroup: {{ .Values.itops.slaGroup | default "" | quote }}
      workloadType: "deployment"     # REQUIRED for health - deployment/statefulset/daemonset
      workloadName: {{ .Chart.Name }}  # REQUIRED for health - K8s workload name
    operations:
      backup:
        expected: {{ .Values.itops.backup.expected | default false }}
        maxAgeDays: {{ .Values.itops.backup.maxAgeDays | default 1 }}

Service values.yaml

# helmcharts/my-service/values.yaml
itops:
  organization: "myorg"
  platform: "myplatform"
  environment: "prod"
  cluster: "cluster1"
  criticality: "critical"     # critical / high / medium / low
  slaGroup: "payment-system"  # SLA group membership (optional)
  backup:
    expected: true             # backup monitoring enabled
    maxAgeDays: 1              # alert if older than N days
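
With the values above, the ConfigMap template should render to something equivalent to the following. This assumes the chart is named my-service (inferred from the values.yaml path); template comments are omitted and quoting of individual values may differ.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-service-itops
  labels:
    itops.io/config: "true"
data:
  it-ops.yaml: |
    version: "1"
    hierarchy:
      organization: myorg
      platform: myplatform
      environment: prod
      cluster: cluster1
      service: my-service
    service:
      name: my-service
      criticality: critical
      slaGroup: payment-system
      workloadType: "deployment"
      workloadName: my-service
    operations:
      backup:
        expected: true
        maxAgeDays: 1
```

Rendering the chart locally with helm template and comparing the output against this shape is a quick way to catch a missing hierarchy level or label before deploying.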

Common Mistakes

| Mistake | Result | Fix |
|---|---|---|
| Label itops.io/managed: "true" | Agent ignores ConfigMap | Use itops.io/config: "true" |
| Missing hierarchy block AND placement.node | Parser error: "hierarchy or placement.node is required" | Add the hierarchy block (or a placement.node fallback) |
| Missing service.name | Parser error: "service.name is required" | Add name under service: block |
| Missing workloadName | Service status stays UNKNOWN | Add workloadType + workloadName |
| Label prefix mismatch (custom watch.labelPrefix) | Agent doesn't see the ConfigMap | ConfigMap label must use the same prefix, e.g. acme.io/config: "true" |