1. Deployment metrics
1.1 Alert: Deployment replica count does not match the expected count
min(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_mismatched[{#NAMESPACE}/{#NAME}],{$KUBE.REPLICA.MISMATCH.EVAL_PERIOD:"deployment:{#NAMESPACE}:{#NAME}"})>0
and last(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_desired[{#NAMESPACE}/{#NAME}])>=0
and last(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_available[{#NAMESPACE}/{#NAME}])>=0
Explanation:
1)min(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_mismatched[{#NAMESPACE}/{#NAME}],{$KUBE.REPLICA.MISMATCH.EVAL_PERIOD:"deployment:{#NAMESPACE}:{#NAME}"})>0
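kube.deployment.replicas_mismatched is the number of mismatched replicas of the deployment. {$KUBE.REPLICA.MISMATCH.EVAL_PERIOD} is a macro defined in the template and set to #5, i.e. 5 collection cycles. The server's default collection interval is 30s, but the master item Kubernetes: Get state metrics sets an interval of 1m, which overrides that default, so 5 collection cycles correspond to 5 minutes.
The macro {$KUBE.REPLICA.MISMATCH.EVAL_PERIOD} can be given a context to configure different detection windows. For example, {$KUBE.REPLICA.MISMATCH.EVAL_PERIOD:regex:"deployment:.*:.*"} = #5 sets the detection window for all deployments to 5 minutes, and {$KUBE.REPLICA.MISMATCH.EVAL_PERIOD:"deployment:default:nginx"} = #3 sets it to 3 minutes for the deployment named nginx in the default namespace.
So the first clause means that the minimum mismatch count over the last 5 minutes is greater than 0, i.e. every sample in the window showed a mismatch.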
2)last(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_desired[{#NAMESPACE}/{#NAME}])>=0
kube.deployment.replicas_desired is the number of replicas the deployment requires; its latest value must be greater than or equal to 0.
3)last(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_available[{#NAMESPACE}/{#NAME}])>=0
kube.deployment.replicas_available is the number of available replicas of the deployment; its latest value must be greater than or equal to 0.
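The three clauses above can be sketched in Python to make the evaluation concrete. This is only an illustration, not Zabbix's implementation: the MACROS table, DEFAULT_PERIOD, and both function names are hypothetical, the lookup order (exact context, then regex context, then the macro without context) is an assumption, and the handling of short history is a simplification.

```python
import re

# Hypothetical macro table mirroring the two examples in the text.
# A value such as "#5" means "the last 5 collected values".
MACROS = {
    ("exact", "deployment:default:nginx"): "#3",
    ("regex", r"deployment:.*:.*"): "#5",
}
DEFAULT_PERIOD = "#5"  # assumed value of the macro without any context


def resolve_eval_period(namespace, name):
    """Pick the evaluation period for one deployment (simplified lookup)."""
    context = f"deployment:{namespace}:{name}"
    # Assumption: an exact context match wins over a regex context.
    if ("exact", context) in MACROS:
        return MACROS[("exact", context)]
    for (kind, pattern), value in MACROS.items():
        if kind == "regex" and re.fullmatch(pattern, context):
            return value
    return DEFAULT_PERIOD


def problem_fires(mismatched, desired, available, period):
    """min(mismatched, #N)>0 and last(desired)>=0 and last(available)>=0."""
    count = int(period.lstrip("#"))
    # Simplification: with fewer than N samples this sketch does not fire.
    if len(mismatched) < count or not desired or not available:
        return False
    window = mismatched[-count:]  # the last N collected values
    return min(window) > 0 and desired[-1] >= 0 and available[-1] >= 0
```

With a 1m collection interval, `resolve_eval_period("default", "nginx")` selects a 3-sample (about 3-minute) window, while any other deployment falls back to the 5-sample window. A single sample with a mismatch of 0 inside the window is enough to keep the alert from firing.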
1.2 Recovery: Deployment replica count matches the expected count again
max(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_mismatched[{#NAMESPACE}/{#NAME}],{$KUBE.REPLICA.MISMATCH.EVAL_PERIOD:"deployment:{#NAMESPACE}:{#NAME}"})=0
and last(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_desired[{#NAMESPACE}/{#NAME}])>=0
and last(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_available[{#NAMESPACE}/{#NAME}])>=0
Explanation:
1)max(/Kubernetes_test cluster state by HTTP/kube.deployment.replicas_mismatched[{#NAMESPACE}/{#NAME}],{$KUBE.REPLICA.MISMATCH.EVAL_PERIOD:"deployment:{#NAMESPACE}:{#NAME}"})=0
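The maximum number of mismatched replicas over the last 5 minutes is 0, i.e. no sample in the window showed a mismatch.
2) kube.deployment.replicas_desired is the number of replicas the deployment requires; its latest value must be greater than or equal to 0.
3) kube.deployment.replicas_available is the number of available replicas of the deployment; its latest value must be greater than or equal to 0.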
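Because the alert uses min(...)>0 and the recovery uses max(...)=0, the trigger changes state only when the whole window agrees. The following Python sketch illustrates that behaviour under stated assumptions: next_state and simulate are hypothetical names, the desired and available clauses are omitted because they only require non-negative values, and windows with fewer than N samples leave the state unchanged.

```python
def next_state(in_problem, mismatched, count):
    """Return the new trigger state after the latest sample was appended."""
    window = mismatched[-count:]
    if len(window) < count:
        return in_problem  # simplification: wait for a full window
    if not in_problem and min(window) > 0:
        return True   # alert: every sample in the window shows a mismatch
    if in_problem and max(window) == 0:
        return False  # recovery: no sample in the window shows a mismatch
    return in_problem  # mixed window: keep the current state


def simulate(samples, count):
    """Feed samples one by one and record the state after each of them."""
    history, states, in_problem = [], [], False
    for value in samples:
        history.append(value)
        in_problem = next_state(in_problem, history, count)
        states.append(in_problem)
    return states


# One sample per collection cycle, window of 3 samples.
states = simulate([0, 1, 1, 1, 0, 1, 0, 0, 0], count=3)
```

In this run the alert opens on the fourth sample, once three consecutive samples show a mismatch, and closes only on the last sample, once three consecutive samples are 0. The isolated 0 values in between do not close it, which is what prevents flapping.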