GitOps Repository Structure

How to organize your ITOps configuration in a GitOps repository. Everything is declarative, version-controlled, and ArgoCD-syncable.

Recommended Repository Layout

infra-gitops-repo/
├── platform/                        # ITOps platform installation
│   ├── itops-values.yaml            # Core platform Helm values
│   ├── itops-agent-values.yaml      # Agent Helm values (per cluster)
│   └── sla-portal-values.yaml       # SLA Portal Helm values
│
├── services/                        # Service definitions (ConfigMaps)
│   ├── postgresql-itops.yaml        # Database service
│   ├── redis-itops.yaml             # Cache service
│   ├── payment-api-itops.yaml       # Application service
│   └── galera-itops.yaml            # Bare metal service
│
├── monitoring/                      # Push webhook CronJobs
│   ├── storage-reporter.yaml        # Storage metrics push
│   ├── backup-reporter.yaml         # Backup completion push
│   └── health-push.yaml             # Bare metal health push
│
└── sla/                             # SLA configuration
    └── sla-portal-values.yaml       # SLA targets (error budgets)

Why Separate Directories?

| Directory | What | Lifecycle | Who changes it |
|---|---|---|---|
| platform/ | Helm values for ITOps core, agent, portal | Rarely (upgrades) | Platform team |
| services/ | ConfigMaps for each monitored service | When services are added/removed | Service owners |
| monitoring/ | CronJobs for storage, backup, health push | When monitoring requirements change | Operations team |
| sla/ | SLA Portal targets and error budgets | When SLA contracts change | Management / compliance |

Security defaults (since v4.1.3 / chart 1.10.0)

The chart ships with defense-in-depth enabled by default: you opt out, not in. The practical consequence for private targets:

If a webhook target or workflow HTTP step needs to reach an in-cluster private service, set the specific ITOPS_SECURITY_WEBHOOK_HOST_ALLOWLIST env var AND add the namespace to networkPolicy.allowedEgressNamespaces. The Installation page has full recipes.
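A minimal sketch of the two settings together. The namespace, host, and value format here are placeholders and assumptions; the Installation page has the authoritative recipes:

```yaml
env:
  # Assumed value format (illustrative) — allow one in-cluster private host
  ITOPS_SECURITY_WEBHOOK_HOST_ALLOWLIST: "grafana.monitoring.svc.cluster.local"

networkPolicy:
  allowedEgressNamespaces:
    - monitoring        # namespace of the private webhook target
```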

1. Platform Installation

Install ITOps from the Helm repo. The values file is your IaC definition.

# platform/itops-values.yaml
imagePullSecrets:
  - name: ghcr-secret

ui:
  apiUrl: "https://api.yourdomain.com"
  wsHost: "api.yourdomain.com"

env:
  ITOPS_SERVER_ENVIRONMENT: production
  ITOPS_FEATURE_LOCAL_AUTH: "true"
  ITOPS_FEATURE_OPERATOR_API: "true"

secretEnv:   # plaintext for demo only — see "External secrets" below for production
  ITOPS_DATABASE_PASSWORD: "strong-password"
  ITOPS_JWT_SECRET: "random-jwt-secret"
  ITOPS_SECURITY_OPERATOR_API_KEY: "random-api-key"
  ITOPS_LICENSE_KEY: "eyJhbGci..."

ingress:
  hosts:
    - host: api.yourdomain.com
      paths: [{ path: /, pathType: Prefix }]

uiIngress:
  hosts:
    - host: app.yourdomain.com
      paths: [{ path: /, pathType: Prefix }]

# Install / upgrade
helm repo add itops https://charts.mlops.hu
helm upgrade --install itops itops/itops -n itops --create-namespace -f platform/itops-values.yaml

1b. Auth providers (GitOps, since v4.1.5 / chart 1.13.0)

Authentication providers — local login, LDAP, future SSO — are now declared in the same values file and bootstrapped into the database on every pod start. The admin UI shows them read-only: to add, edit, or remove a provider you change the Helm values and redeploy. No manual UI clicks, no config drift.

# platform/itops-auth-values.yaml
auth:
  local:
    enabled: true
    isDefault: true
    passwordPolicy:
      minLength: 12
      requireUppercase: true
      requireDigit: true

  ldap:
    enabled: true
    name: "corporate-ldap"
    displayName: "Corporate LDAP"
    host: "ldap.corp.internal"
    port: 389
    bindDn: "cn=service-itops,ou=ServiceAccounts,dc=corp,dc=internal"
    bindPasswordSecret:                 # recommended — external-secrets or similar
      name: my-ldap-creds
      key: bindPassword
    baseDn: "dc=corp,dc=internal"
    userFilter: "(sAMAccountName=%s)"   # AD
    groupFilter: "(member=%s)"
    userAttrs:
      username: "sAMAccountName"
      email: "mail"
      displayName: "displayName"
    # Optional explicit mappings. If omitted, LDAP groups whose CN matches an
    # ITOps group name (e.g. cn=itops-admins) auto-join. The itops- prefix
    # protects you from a generic company "admins" group accidentally
    # granting platform admin.
    groupMappings: []

Apply with:

helm upgrade itops itops/itops -n itops \
  -f platform/itops-values.yaml \
  -f platform/itops-auth-values.yaml

External secrets (Vault / ESO / SOPS)

For production, do not hand LDAP bind passwords or DB credentials to Helm directly. The chart gives you three GitOps-safe knobs:

# 1) Per-key reference. Each ITOPS_* env var can point to an existing Secret.
#    Works great with external-secrets-operator / Vault CSI / SOPS-decoded
#    manifests.
secretRefs:
  ITOPS_DATABASE_PASSWORD:
    name: itops-db-creds         # existing K8s Secret
    key: password
  ITOPS_JWT_SECRET:
    name: itops-jwt
    key: secret
  ITOPS_SECURITY_OPERATOR_API_KEY:
    name: itops-operator-api-key
    key: key

# 2) Bulk envFrom. Every key in the Secret becomes an env var of the same
#    name. Minimum plumbing — perfect when ESO syncs a whole bag at once.
extraEnvFrom:
  - secretRef:
      name: itops-bulk-secrets    # ExternalSecret → K8s Secret → here

# 3) LDAP bind password via existing Secret (no plaintext in values).
auth:
  ldap:
    enabled: true
    host: ldap.corp.internal
    baseDn: dc=corp,dc=internal
    bindDn: cn=svc-itops,ou=Service,dc=corp,dc=internal
    bindPasswordSecret:
      name: itops-ldap-creds     # existing K8s Secret
      key: bindPassword

All three coexist. If the same env var is set in both secretEnv (plain) and secretRefs (reference), the reference wins.

Group semantics

ITOps ships with four admin-relevant built-in groups. The admin UI Groups page shows each one's internal UUID (copy button) — use it in groupMappings when you need explicit control.

| ITOps group | What it grants | LDAP CN to auto-match |
|---|---|---|
| itops-admins | Full platform admin | cn=itops-admins |
| itops-trust-admins | Trusted admin ops (PKI / HSM) | cn=itops-trust-admins |
| itops-operators | Day-to-day service/ticket ops | cn=itops-operators |
| itops-users | Read-only | cn=itops-users |

Auto-match is bidirectional (since 4.1.12): an LDAP user whose groups include cn=itops-admins joins the ITOps itops-admins group at login via JIT provisioning. Remove the LDAP group, and on next login the ITOps membership is revoked — LDAP is the source of truth. The revocation only touches groups considered "managed" (explicitly mapped in groupMappings, or itops-* named in CN auto-match mode); manually-added custom groups are never clobbered.
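When CN auto-match is not enough, an explicit mapping pins an arbitrary LDAP group to an ITOps group by UUID. The field names below are illustrative assumptions (the source leaves groupMappings empty), and the UUID is a placeholder copied from the admin UI Groups page:

```yaml
auth:
  ldap:
    groupMappings:
      # Hypothetical shape — confirm field names against your chart version
      - ldapGroup: "cn=platform-team,ou=Groups,dc=corp,dc=internal"
        itopsGroupId: "9b2d7c1e-0000-0000-0000-placeholder"   # from Groups page copy button
```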

Authoritative reconciliation (since 4.1.14)

GitOps reconciliation is authoritative: the state of the Helm values is the state of the system, including deletions. When you remove an entry from values and redeploy:

| Entity | Behaviour on removal |
|---|---|
| Auth provider (auth.ldap) | GitOps-managed row (config._managedByGitOps=true) is DELETED on next core startup. Admin-created rows without the flag are never touched. |
| Group mapping (groupMappings[]) | Transactional replace — the mapping table is re-synced in full every reconcile cycle. |
| SLA group (agent's slaGroups) | Memberships pruned per-node immediately. When the last node stops contributing to a group, the group row itself is DELETED on the next sync cycle. |
| Group-member link (node X services in group Y) | Pruned per-node every sync. |
| User (LDAP JIT-provisioned) | User row is NOT deleted — historical ownership of tickets / workflows / audit entries is preserved. The user simply can't log in anymore. Group memberships are revoked on the last attempted login. |

Safety: the reconciler only prunes when it successfully processed at least one entry this cycle. If the config file is missing or empty, nothing is deleted — a misconfigured deploy can never wipe login. Historical SLA snapshots are indexed by service_id, not group_id, so deleting an SLA group has no effect on past uptime numbers.

Result on the admin UI: you only ever see what is currently declared in GitOps. No ghost providers, no stale SLA groups carrying over from a previous layout.

2. Service Definitions (ConfigMaps)

Each monitored service gets a ConfigMap. The agent discovers them automatically by the itops.io/config: "true" label.

# services/postgresql-itops.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgresql-itops
  namespace: production
  labels:
    itops.io/config: "true"
data:
  it-ops.yaml: |
    version: "1"
    hierarchy:
      organization: "myorg"
      platform: "myplatform"
      environment: "prod"
      cluster: "cluster1"
      service: "postgresql"
    service:
      name: "postgresql"
      criticality: "critical"
      slaGroup: "payment-system"
      workloadType: "statefulset"
      workloadName: "postgresql"
    tags:
      - database
      - storage
    metadata:
      serviceType: "PostgreSQL"
      usedBy:
        - name: "payment-api"
          displayName: "Payment API"
        - name: "user-service"
          displayName: "User Service"
    operations:
      backup:
        expected: true
        maxAgeDays: 1

Apply via ArgoCD (recommended for production) or kubectl apply -f services/ for one-off testing. Mixing kubectl apply with an ArgoCD-managed path causes drift and is not GitOps — pick one.

Bare-Metal Auto-Register (no ConfigMap)

Services that aren't running in Kubernetes skip the ConfigMap step entirely. The first call to /api/v1/health/report or /api/v1/storage/report auto-creates the service (source=external, hierarchy node from nodeId). Subsequent pushes update status. This is ideal for databases on VMs, external S3 buckets, RDS instances, routers, etc.

# Bare-metal service — CronJob does everything, no ConfigMap needed
curl -X POST https://api.yourdomain.com/api/v1/health/report \
  -H "X-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "service": "galera-node1",
    "nodeId": "myorg/infra/prod/baremetal",
    "status": "OPERATIONAL",
    "criticality": "critical",
    "slaGroup": "database-cluster",
    "tags": ["database", "baremetal"]
  }'

3. Monitoring CronJobs (Push Webhooks)

CronJobs push storage, backup, and health data to the ITOps API. They use the same OPERATOR_API_KEY as the agent.

Storage Reporter

# monitoring/storage-reporter.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: itops-storage-reporter
  namespace: itops
spec:
  schedule: "*/15 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: reporter
              image: curlimages/curl:latest
              env:
                - name: API_URL
                  value: "http://itops-core.itops:8080"
                - name: API_KEY
                  valueFrom:
                    secretKeyRef:
                      name: itops-secrets
                      key: ITOPS_SECURITY_OPERATOR_API_KEY
              command: ["/bin/sh", "-c"]
              args:
                - |
                  # PostgreSQL storage
                  curl -s -X POST "$API_URL/api/v1/storage/report" \
                    -H "X-API-Key: $API_KEY" \
                    -H "Content-Type: application/json" \
                    -d '{"service":"postgresql","nodeId":"myorg/myplatform/prod/cluster1","allocatedBytes":107374182400,"usedBytes":64424509440,"storageType":"pvc"}'
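
In practice the byte counts usually come from the filesystem rather than being hard-coded. A sketch, assuming GNU df and that the PVC is mounted at a known path (pvc_used_bytes is a hypothetical helper, not part of the chart):

```shell
# Hypothetical helper: bytes used on the filesystem backing a given path (GNU df)
pvc_used_bytes() {
  df -B1 --output=used "$1" | tail -n1 | tr -d ' '
}

# Feed this into the storage report payload instead of a hard-coded usedBytes
pvc_used_bytes /
```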

Backup Reporter

# monitoring/backup-reporter.yaml
# Add to your existing backup CronJob (pg_dump, mysqldump, etc.)
# After successful backup, report to ITOps:

curl -s -X POST "$API_URL/api/v1/backup/report" \
  -H "X-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"service\":\"postgresql\",\"nodeId\":\"myorg/myplatform/prod/cluster1\",\"status\":\"success\",\"sizeBytes\":$BACKUP_SIZE}"
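
$BACKUP_SIZE has to be computed before the report call. A sketch, assuming the dump was written to a file (GNU stat; on BSD/macOS use stat -f%z instead):

```shell
# Stand-in for your real dump step (e.g. pg_dump mydb | gzip > "$BACKUP_FILE")
BACKUP_FILE=$(mktemp)
printf 'dummy dump data' | gzip > "$BACKUP_FILE"

BACKUP_SIZE=$(stat -c%s "$BACKUP_FILE")   # size in bytes (GNU stat)
echo "BACKUP_SIZE=$BACKUP_SIZE"
```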

Bare Metal Health Push

# monitoring/health-push.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: itops-health-push
  namespace: itops
spec:
  schedule: "* * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: health
              image: curlimages/curl:latest
              env:
                - name: API_URL
                  value: "http://itops-core.itops:8080"
                - name: API_KEY
                  valueFrom:
                    secretKeyRef:
                      name: itops-secrets
                      key: ITOPS_SECURITY_OPERATOR_API_KEY
              command: ["/bin/sh", "-c"]
              args:
                - |
                  # Galera cluster health
                  curl -s -X POST "$API_URL/api/v1/health/report" \
                    -H "X-API-Key: $API_KEY" \
                    -H "Content-Type: application/json" \
                    -d '{"service":"galera-node1","status":"OPERATIONAL","message":"cluster_size=3","nodeId":"myorg/infra/prod/baremetal","criticality":"critical","slaGroup":"database-cluster","serviceType":"database"}'
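
The cluster_size message above is static; a wrapper can derive both the message and the status from wsrep_cluster_size instead. A sketch — the DEGRADED/OUTAGE status names and the size thresholds are assumptions, and the commented mysql line assumes client credentials are already configured:

```shell
# Map a Galera cluster size to an ITOps status (thresholds are assumptions)
galera_status() {
  size="${1:-0}"
  if [ "$size" -ge 3 ]; then echo OPERATIONAL
  elif [ "$size" -ge 1 ]; then echo DEGRADED
  else echo OUTAGE
  fi
}

# SIZE=$(mysql -N -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'" | awk '{print $2}')
galera_status 3    # → OPERATIONAL
```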

4. SLA Portal Targets

Define SLO objectives in the SLA Portal Helm values. Error budgets are calculated automatically.

# sla/sla-portal-values.yaml
ingress:
  host: sla.yourdomain.com
apiKey: "your-sla-portal-key"
slaTargets:
  payment-system:
    uptime: 99.99
    label: "Payment System SLA"
  infrastructure:
    uptime: 99.9
    label: "Infrastructure SLA"

# Install / upgrade
helm upgrade --install sla-portal itops/sla-portal -n sla-portal --create-namespace \
  -f sla/sla-portal-values.yaml
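
The error budget the portal derives from an uptime target is plain arithmetic; a quick sanity check for a 30-day window:

```shell
# 99.99% over a 30-day window leaves 43200 * 0.0001 = 4.32 minutes of budget
awk -v target=99.99 'BEGIN {
  minutes = 30 * 24 * 60                  # 43200 minutes in the window
  printf "%.2f\n", minutes * (100 - target) / 100
}'
# → 4.32
```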

5. Template Export / Import (workflows, catalog, SLA defs)

Workflows, ticket catalog items and SLA definitions are created via the admin UI (not Helm) but can be round-tripped through YAML for GitOps backup. The core exposes GET /api/v1/templates/export and POST /api/v1/templates/import. Use these in CI so the admin-UI state is reproducible:

# Nightly export to Git (CronJob or CI job)
curl -sX GET https://api.yourdomain.com/api/v1/templates/export \
  -H "X-API-Key: $API_KEY" \
  -o ./gitops/templates/itops-config-$(date +%F).yaml
# Then git add / commit / push

# Restore after rebuild (disaster recovery or fresh env bootstrap)
curl -sX POST https://api.yourdomain.com/api/v1/templates/import \
  -H "X-API-Key: $API_KEY" \
  -H "Content-Type: application/yaml" \
  --data-binary @./gitops/templates/itops-config.yaml

This is the GitOps "escape hatch" for admin-UI state — everything the backend creates at runtime (workflow templates, catalog items, SLA definitions with custom tiers) can be exported and re-applied declaratively.
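
One way to make the nightly export actionable is a drift gate in CI: re-export, diff against the committed file, and fail on mismatch. A sketch with a hypothetical drift_check helper; in CI the live file would come from the export endpoint shown above:

```shell
# Hypothetical CI drift gate: report whether live config matches committed config
drift_check() {
  if diff -q "$1" "$2" >/dev/null; then echo in-sync; else echo drift; fi
}

# In CI, fetch the live file first:
#   curl -s "$API_URL/api/v1/templates/export" -H "X-API-Key: $API_KEY" -o live.yaml
committed=$(mktemp); live=$(mktemp)
echo "a: 1" > "$committed"; echo "a: 1" > "$live"
drift_check "$committed" "$live"    # → in-sync
```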

Data Flow Summary

| Data | Source | Destination | Method |
|---|---|---|---|
| Service registration | ConfigMap (Git) | Agent → Core API | Auto-discovery |
| Service health (K8s) | K8s API | Agent → Core API | Agent sync (30s) |
| Service health (bare metal) | CronJob (Git) | Health webhook → Core API | Push (configurable) |
| Storage metrics | CronJob (Git) | Storage webhook → Core API | Push (15 min) |
| Backup status | Backup script (Git) | Backup webhook → Core API | Push (after backup) |
| SLA targets | Helm values (Git) | SLA Portal env var | Helm install |
| SLA reports | Core API | SLA Portal | Daily push (07:00) |

ArgoCD Setup

Four ArgoCD applications, one for each concern:

# App 1: Platform. Note: "$values" in valueFiles requires a multi-source
# Application — the second source with ref: values exposes the GitOps repo.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: itops-platform
spec:
  project: default
  destination:
    server: https://kubernetes.default.svc
    namespace: itops
  sources:
    - repoURL: https://charts.mlops.hu
      chart: itops
      targetRevision: "1.9.14"
      helm:
        valueFiles: ["$values/platform/itops-values.yaml"]
    - repoURL: https://github.com/myorg/infra-gitops.git
      targetRevision: main
      ref: values

# App 2: Agent (one per K8s cluster — install in each cluster's ArgoCD)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: itops-agent
spec:
  project: default
  destination:
    server: https://kubernetes.default.svc
    namespace: itops
  sources:
    - repoURL: https://charts.mlops.hu
      chart: itops-agent
      targetRevision: "1.2.0"
      helm:
        valueFiles: ["$values/platform/itops-agent-values.yaml"]
    - repoURL: https://github.com/myorg/infra-gitops.git
      targetRevision: main
      ref: values

# App 3: Service ConfigMaps (add a sibling app pointing at monitoring/ for the CronJobs)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: itops-services
spec:
  project: default
  destination:
    server: https://kubernetes.default.svc
  source:
    repoURL: https://github.com/myorg/infra-gitops.git
    path: services/
    targetRevision: main

# App 4: SLA Portal
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sla-portal
spec:
  project: default
  destination:
    server: https://kubernetes.default.svc
    namespace: sla-portal
  sources:
    - repoURL: https://charts.mlops.hu
      chart: sla-portal
      targetRevision: "1.3.1"
      helm:
        valueFiles: ["$values/sla/sla-portal-values.yaml"]
    - repoURL: https://github.com/myorg/infra-gitops.git
      targetRevision: main
      ref: values