Kubernetes (K8S) Monitoring with Prometheus + Grafana


Monitoring Metrics

Cluster monitoring

  • Node resource utilization (see the PromQL sketch after this list)
  • Node count
  • Running Pods

Pod monitoring

  • Container metrics
  • Application metrics
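
As a sketch, the utilization bullets above translate into PromQL once node-exporter data is flowing (metric names follow node-exporter 0.16+; older releases drop the _bytes suffix):

# CPU busy percentage per node over the last 5 minutes
100 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100
# Memory utilization percentage per node
(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100
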
Prometheus

Open source
Monitoring, alerting, and a time-series database in one
Periodically scrapes the state of monitored components over HTTP
No complex integration required; a component only needs to expose an HTTP metrics endpoint
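
Because scraping is plain HTTP, any exporter can be inspected by hand. The snippet below is a representative sketch of what a /metrics endpoint returns (node-exporter on its default port 9100 is assumed; the sample value is illustrative):

curl -s http://localhost:9100/metrics | head
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 15292.3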

Grafana

An open-source data analysis and visualization tool
Supports many data sources


YAML Files

Upload the YAML files to the Linux server; the directory layout is:
node-exporter.yaml
prometheus
--configmap.yaml
--prometheus.deploy.yml
--prometheus.svc.yml
--rbac-setup.yaml
grafana
--grafana-deploy.yaml
--grafana-ing.yaml
--grafana-svc.yaml

Prometheus YAML Files
  • node-exporter.yaml
    Runs node-exporter as a DaemonSet, one Pod per node
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: kube-system
  labels:
    k8s-app: node-exporter
spec:
  selector:
    matchLabels:
      k8s-app: node-exporter
  template:
    metadata:
      labels:
        k8s-app: node-exporter
    spec:
      containers:
      - image: prom/node-exporter
        name: node-exporter
        ports:
        - containerPort: 9100
          protocol: TCP
          name: http
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: node-exporter
  name: node-exporter
  namespace: kube-system
spec:
  ports:
  - name: http
    port: 9100
    nodePort: 31672
    protocol: TCP
  type: NodePort
  selector:
    k8s-app: node-exporter
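
After applying this file, every node runs a node-exporter Pod and its metrics are exposed on NodePort 31672. A quick sanity check, replacing <node-ip> with any node's address:

curl -s http://<node-ip>:31672/metrics | grep node_load1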


  • configmap.yaml
    The Prometheus scrape configuration, mounted into the Prometheus Pod
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: kube-system
data:
  prometheus.yml: |
    global:
      scrape_interval:     15s
      evaluation_interval: 15s
    scrape_configs:

    - job_name: 'kubernetes-apiservers'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https

    - job_name: 'kubernetes-nodes'
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics

    - job_name: 'kubernetes-cadvisor'
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor

    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name

    - job_name: 'kubernetes-services'
      kubernetes_sd_configs:
      - role: service
      metrics_path: /probe
      params:
        module: [http_2xx]
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
        action: keep
        regex: true
      - source_labels: [__address__]
        target_label: __param_target
      - target_label: __address__
        replacement: blackbox-exporter.example.com:9115
      - source_labels: [__param_target]
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        target_label: kubernetes_name

    - job_name: 'kubernetes-ingresses'
      kubernetes_sd_configs:
      - role: ingress
      relabel_configs:
      - source_labels: [__meta_kubernetes_ingress_annotation_prometheus_io_probe]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_ingress_scheme,__address__,__meta_kubernetes_ingress_path]
        regex: (.+);(.+);(.+)
        replacement: ${1}://${2}${3}
        target_label: __param_target
      - target_label: __address__
        replacement: blackbox-exporter.example.com:9115
      - source_labels: [__param_target]
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_ingress_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_ingress_name]
        target_label: kubernetes_name

    - job_name: 'kubernetes-pods'
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
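
Note that the 'kubernetes-pods' job above keeps only Pods that opt in via annotations. A minimal sketch of a Pod this job would discover (the name, image, and port are hypothetical):

apiVersion: v1
kind: Pod
metadata:
  name: demo-app                      # hypothetical name
  annotations:
    prometheus.io/scrape: "true"      # required by the keep rule above
    prometheus.io/path: "/metrics"    # optional; __metrics_path__ defaults to /metrics
    prometheus.io/port: "8080"        # rewrites __address__ to this port
spec:
  containers:
  - name: demo-app
    image: example/demo-app:latest    # hypothetical image
    ports:
    - containerPort: 8080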

  • prometheus.deploy.yml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    name: prometheus-deployment
  name: prometheus
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
      - image: prom/prometheus:v2.0.0
        name: prometheus
        command:
        - "/bin/prometheus"
        args:
        - "--config.file=/etc/prometheus/prometheus.yml"
        - "--storage.tsdb.path=/prometheus"
        - "--storage.tsdb.retention=24h"
        ports:
        - containerPort: 9090
          protocol: TCP
        volumeMounts:
        - mountPath: "/prometheus"
          name: data
        - mountPath: "/etc/prometheus"
          name: config-volume
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 500m
            memory: 2500Mi
      serviceAccountName: prometheus    
      volumes:
      - name: data
        emptyDir: {}
      - name: config-volume
        configMap:
          name: prometheus-config   
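
Once the Deployment is running, it is worth confirming that Prometheus loaded the ConfigMap and is discovering targets. One way, as a sketch, is port-forward plus the HTTP API (the Status > Targets page in the web UI shows the same information):

# in one terminal:
kubectl -n kube-system port-forward deploy/prometheus 9090:9090
# in another, list the active scrape targets:
curl -s http://localhost:9090/api/v1/targets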

  • prometheus.svc.yml
---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: prometheus
  name: prometheus
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - port: 9090
    targetPort: 9090
    nodePort: 30303
  selector:
    app: prometheus
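
The NodePort makes the Prometheus HTTP API reachable from outside the cluster; for example, an instant query for up returns one sample per healthy scrape target (replace <node-ip> with any node's address):

curl -s 'http://<node-ip>:30303/api/v1/query?query=up'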

  • rbac-setup.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: kube-system
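
A quick way to verify the binding took effect is to impersonate the ServiceAccount with kubectl auth can-i:

kubectl auth can-i list nodes --as=system:serviceaccount:kube-system:prometheus
# expected output: yes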

Grafana YAML Files
  • grafana-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana-core
  namespace: kube-system
  labels:
    app: grafana
    component: core
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
      component: core
  template:
    metadata:
      labels:
        app: grafana
        component: core
    spec:
      containers:
      - image: grafana/grafana:4.2.0
        name: grafana-core
        imagePullPolicy: IfNotPresent
        # env:
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 100m
            memory: 100Mi
          requests:
            cpu: 100m
            memory: 100Mi
        env:
          # The following env variables set up basic auth with the default admin user and admin password.
          - name: GF_AUTH_BASIC_ENABLED
            value: "true"
          - name: GF_AUTH_ANONYMOUS_ENABLED
            value: "false"
          # - name: GF_AUTH_ANONYMOUS_ORG_ROLE
          #   value: Admin
          # does not really work, because of template variables in exported dashboards:
          # - name: GF_DASHBOARDS_JSON_ENABLED
          #   value: "true"
        readinessProbe:
          httpGet:
            path: /login
            port: 3000
          # initialDelaySeconds: 30
          # timeoutSeconds: 1
        volumeMounts:
        - name: grafana-persistent-storage
          mountPath: /var
      volumes:
      - name: grafana-persistent-storage
        emptyDir: {}
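
The default admin/admin credentials can be changed at deploy time: Grafana maps environment variables of the form GF_<section>_<key> onto its configuration, so a sketch like the following could be added to the env list above (the value is a placeholder; a Secret is preferable in production):

          - name: GF_SECURITY_ADMIN_PASSWORD
            value: "change-me"        # placeholder password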

  • grafana-ing.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: grafana
  namespace: kube-system
spec:
  rules:
  - host: k8s.grafana
    http:
      paths:
      - path: /
        backend:
          serviceName: grafana
          servicePort: 3000
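
The Ingress matches on the host name k8s.grafana, so testing it from a workstation requires name resolution for that host. A sketch (the ingress controller address is a placeholder):

echo '<ingress-controller-ip> k8s.grafana' | sudo tee -a /etc/hosts
curl -s http://k8s.grafana/login | head -c 100
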
  • grafana-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: kube-system
  labels:
    app: grafana
    component: core
spec:
  type: NodePort
  ports:
    - port: 3000
  selector:
    app: grafana
    component: core

Deployment

[root@k8smaster monitor]# ls
grafana  node-exporter.yaml  prometheus
# Install Prometheus
[root@k8smaster monitor]# kubectl create -f prometheus/rbac-setup.yaml 
clusterrole.rbac.authorization.k8s.io/prometheus created
serviceaccount/prometheus created
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
[root@k8smaster monitor]# kubectl create -f node-exporter.yaml
daemonset.apps/node-exporter created
service/node-exporter created
[root@k8smaster monitor]# kubectl create -f prometheus/configmap.yaml
configmap/prometheus-config created
[root@k8smaster monitor]# kubectl create -f prometheus/prometheus.deploy.yml
deployment.apps/prometheus created
[root@k8smaster monitor]# kubectl create -f prometheus/prometheus.svc.yml
service/prometheus created
# Install Grafana
[root@k8smaster monitor]# kubectl create -f grafana/grafana-deploy.yaml 
deployment.apps/grafana-core created
[root@k8smaster monitor]# kubectl create -f grafana/grafana-svc.yaml 
service/grafana created
[root@k8smaster monitor]# kubectl create -f grafana/grafana-ing.yaml 
ingress.extensions/grafana created
# Check status
[root@k8smaster monitor]# kubectl get pod,svc -n kube-system
NAME                                    READY   STATUS    RESTARTS   AGE
pod/coredns-7ff77c879f-jzlk2            1/1     Running   3          46d
pod/coredns-7ff77c879f-lkcdc            1/1     Running   1          7d
pod/etcd-k8smaster                      1/1     Running   2          46d
pod/grafana-core-768b6bf79c-cql42       1/1     Running   0          2m17s
pod/kube-apiserver-k8smaster            1/1     Running   2          46d
pod/kube-controller-manager-k8smaster   1/1     Running   3          46d
pod/kube-proxy-2245f                    1/1     Running   11         46d
pod/kube-proxy-4rlp8                    1/1     Running   7          46d
pod/kube-proxy-c8fq4                    1/1     Running   0          5d2h
pod/kube-proxy-gtfts                    1/1     Running   2          46d
pod/kube-scheduler-k8smaster            1/1     Running   4          46d
pod/metrics-server-655cb9c58b-xcxzw     1/1     Running   17         5d2h
pod/node-exporter-52hwv                 1/1     Running   0          34m
pod/node-exporter-mj8tk                 1/1     Running   0          34m
pod/node-exporter-n2ctw                 1/1     Running   0          34m
pod/prometheus-7486bf7f4b-4f7vt         1/1     Running   0          3m

NAME                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                  AGE
service/grafana          NodePort    10.107.69.123    <none>        3000:31116/TCP           2m8s
service/kube-dns         ClusterIP   10.96.0.10       <none>        53/UDP,53/TCP,9153/TCP   46d
service/metrics-server   ClusterIP   10.104.178.240   <none>        443/TCP                  11d
service/node-exporter    NodePort    10.104.117.209   <none>        9100:31672/TCP           34m
service/prometheus       NodePort    10.106.76.46     <none>        9090:30303/TCP           13m
[root@k8smaster monitor]# 

Prometheus is now reachable at http://<node-ip>:30303, and Grafana at http://<node-ip>:31116 (the NodePorts shown in the service listing above).

Default Grafana username: admin, password: admin

Open Grafana and configure the data source: add a new data source of type Prometheus, point its URL at the Prometheus service (http://prometheus:9090 works from inside the cluster, since both run in the kube-system namespace), and click Add.
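
The same data source can also be created through Grafana's HTTP API instead of the UI (a sketch; the admin credentials and NodePort are the ones shown above):

curl -s -u admin:admin -X POST http://<node-ip>:31116/api/datasources \
  -H 'Content-Type: application/json' \
  -d '{"name":"prometheus","type":"prometheus","url":"http://prometheus:9090","access":"proxy","isDefault":true}'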

Import a dashboard template

With the data source saved, import a dashboard to visualize node and Pod metrics; Grafana.com dashboard 315, "Kubernetes cluster monitoring (via Prometheus)", is a common choice for this stack. Select the Prometheus data source when prompted.
