[Cluster] KubeSphere Setup Notes: Dissecting ks-installer


Fre5h1nd

💡 Introduction

After KubeSphere was deployed, Prometheus kept failing to come up. Troubleshooting traced the failure to a fault in the underlying NFS storage, but there was no obvious way to change Prometheus's storage configuration.
This post therefore dissects ks-installer, the component that automates KubeSphere's installation.

🫎 How to Modify the Configuration

Based on an analysis of the ks-installer component and its core scripts, the configuration of the Prometheus component in the monitoring module can be changed as follows.

Assume the cluster-configuration described by /root/wxl/cluster-configuration.yaml has already been deployed, with these key attributes:

kind: ClusterConfiguration   # "cc" for short
name: ks-installer
namespace: kubesphere-system

You can edit it with: kubectl edit cc -n kubesphere-system ks-installer

  1. Delete the PVC left behind by the broken Prometheus, because the monitoring role reads the old configuration back from it
    (/kubesphere/roles/roles/ks-monitor/tasks/get_old_config.yaml:4).

  2. In the deployed cluster-configuration, delete status.monitoring (or change it so that it satisfies "status.monitoring is not defined or status.monitoring.status is not defined or status.monitoring.status != 'enabled'").

  3. Modify the desired parameters. Once ks-installer detects the ClusterConfiguration change, it redeploys. The sketch after this list shows the corresponding commands.

P.S. To turn off Prometheus persistence entirely, just delete the whole storage block.
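The three steps map onto the commands below. This is a minimal sketch, assuming the default single-replica PVC name prometheus-k8s-db-prometheus-k8s-0 (the name seen later in get_old_config.yaml; verify yours with kubectl get pvc) and that status.monitoring currently exists:

    # Step 1: delete the broken PVC so get_old_config.yaml cannot read the old size back.
    kubectl delete pvc -n kubesphere-monitoring-system prometheus-k8s-db-prometheus-k8s-0

    # Step 2: drop status.monitoring so the status-gated tasks run again.
    kubectl patch cc ks-installer -n kubesphere-system --type json \
      -p '[{"op": "remove", "path": "/status/monitoring"}]'

    # Step 3: edit the monitoring parameters; saving triggers a redeploy.
    kubectl edit cc -n kubesphere-system ks-installer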

🧠 Analysis: How Prometheus Installation Is Automated [1]

ks-installer is essentially a script executor (a shell-operator) that runs deployment scripts automatically when something changes: scripts deployed inside the shell-operator subscribe to predefined hooks, and when a hook fires, the corresponding script is triggered.

shell-operator supports three kinds of hooks:

  1. onStartup: runs once after startup
  2. schedule: scheduled tasks in crontab format
  3. kubernetes: watches Kubernetes resources and reacts to the configured event types (a hook sketch follows this list)
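For context, here is a minimal, hypothetical shell-operator hook (not taken from ks-installer) that subscribes to ClusterConfiguration events. shell-operator calls every hook once with --config to learn its bindings, then re-runs it whenever a matching event occurs:

    #!/usr/bin/env bash
    if [[ $1 == "--config" ]]; then
      # Print the binding definition when shell-operator asks for it.
      cat <<EOF
    configVersion: v1
    kubernetes:
    - apiVersion: installer.kubesphere.io/v1alpha1
      kind: ClusterConfiguration
      executeHookOnEvent: ["Added", "Modified"]
    EOF
    else
      # Otherwise the hook was triggered by an event: run the deployment logic.
      echo "ClusterConfiguration changed; redeploying..."
    fi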

The /hooks/kubesphere directory inside the ks-installer pod contains two files:

  • installRunner.py: performs the KubeSphere deployment itself (see the sketch below)
  • schedule.sh: runs periodically to check component status and handle registration
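installRunner.py drives the Ansible playbooks programmatically. A minimal sketch of that pattern, assuming the ansible-runner Python library; the real file is more involved, running several component playbooks, and the paths here are illustrative:

    import ansible_runner

    # Run one component playbook and report its outcome.
    result = ansible_runner.run(
        private_data_dir="/kubesphere/results/monitoring",  # working dir for run artifacts (hypothetical)
        playbook="/kubesphere/playbooks/monitoring.yaml",
    )
    print(result.status)  # "successful" or "failed"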

To tail the installation log:

  kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l 'app in (ks-install, ks-installer)' -o jsonpath='{.items[0].metadata.name}') -f

To understand the Prometheus installation logic [2], exec into the pod and inspect the ks-installer files:

  kubectl -n kubesphere-system exec -it $(kubectl get pod -n kubesphere-system -l app=ks-installer -o jsonpath='{.items[0].metadata.name}') -- bash

Following [2], the Prometheus-related configuration lives under /kubesphere/kubesphere/prometheus. The file behind our error is /kubesphere/kubesphere/prometheus/prometheus/prometheus-prometheus.yaml.
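That manifest is a standard prometheus-operator Prometheus custom resource. A hedged excerpt of what its persistence settings look like (field values are illustrative, not copied from the file):

  apiVersion: monitoring.coreos.com/v1
  kind: Prometheus
  metadata:
    name: k8s
    namespace: kubesphere-monitoring-system
  spec:
    retention: 7d
    storage:                             # deleting this whole block disables persistence
      volumeClaimTemplate:
        spec:
          storageClassName: nfs-client   # illustrative; a broken NFS class fails right here
          resources:
            requests:
              storage: 20Gi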

🖼️ Analysis: ks-installer Core Scripts

  • Locating installRunner.py:
  1. Exec into the ks-installer pod:
    kubectl exec -it -n kubesphere-system $(kubectl get pod -n kubesphere-system -l 'app in (ks-install, ks-installer)' -o jsonpath='{.items[0].metadata.name}') -- bash
  2. Change into the hooks directory:
    cd /hooks/kubesphere

Logic walkthrough: the monitoring component as an example

  1. Enable the component in the config file (some components are enabled by default and need no explicit entry):

    # /kubesphere/config/ks-config.json
    monitoring:
      enabled: true
  2. The Ansible playbook defines the flow:

    # /kubesphere/playbooks/monitoring.yaml
    ---
    - hosts: localhost      # the playbook's tasks run on localhost
      gather_facts: false   # skip collecting facts (OS, network interfaces, hardware, env vars, ...) since they are not needed; this speeds up the run
      roles:                # the roles applied to localhost; a role bundles related tasks, variables, and templates
        - kubesphere-defaults
        - ks-monitor
  3. Inspect the tasks each role defines (the core is in 3.2.3).

  • 3.1 kubesphere-defaults

    # /kubesphere/roles/kubesphere-defaults/tasks/main.yaml
    ---
    - name: KubeSphere | Setting images' namespace override
      set_fact:     # set namespace_override when local_registry is the Beijing Aliyun registry, or zone is "cn"
        namespace_override: "kubesphereio"
      when: (local_registry is defined and local_registry == "registry.cn-beijing.aliyuncs.com") or (zone is defined and zone == "cn")

    - name: KubeSphere | Configuring defaults
      debug:        # print msg to the log
        msg: "Check roles/kubesphere-defaults/defaults/main.yml"
      tags:
        - always
  • 3.2 ks-monitor

    • 3.2.1 main

      # /kubesphere/roles/ks-monitor/tasks/main.yaml
      # Imports a series of other task files (import_tasks) and runs one shell command.
      - import_tasks: prometheus-stack.yaml   # runs unless common.monitoring.type is defined and equals 'external'
        when:
          - "common.monitoring.type is not defined or common.monitoring.type != 'external'"

      - import_tasks: monitoring-dashboard.yaml   # initializes monitoring when no monitoring status has been recorded yet
        when:
          - "status.monitoring is not defined or status.monitoring.status is not defined or status.monitoring.status != 'enabled'"

      - import_tasks: ks-istio-monitoring.yaml
        when:
          - "servicemesh.enabled is defined and servicemesh.enabled"

      - import_tasks: gpu-monitoring.yaml
        when:
          - "status.monitoring is not defined or status.monitoring.status is not defined or status.monitoring.status != 'enabled'"

      - name: Monitoring | Importing ks-monitoring status
        shell: >
          {{ bin_dir }}/kubectl patch cc ks-installer
          --type merge
          -p '{"status": {"monitoring": {"status": "enabled", "enabledTime": "{{ lookup('pipe','date +%Y-%m-%dT%H:%M:%S%Z') }}"}}}'
          -n kubesphere-system
        register: cc_result
        failed_when: "cc_result.stderr and 'Warning' not in cc_result.stderr"
        until: cc_result is succeeded
        retries: 5
        delay: 3

      - import_tasks: thanos-ruler.yaml
        when:
          - alerting is defined
          - alerting.enabled is defined
          - alerting.enabled == true
          - "status.alerting is not defined or status.alerting.status is not defined or status.alerting.status != 'enabled'"

      - import_tasks: alert-migrate.yaml
        when:
          - alerting is defined and alerting.enabled is defined and alerting.enabled == true
          - "status.alerting is not defined or status.alerting.status is not defined or status.alerting.status != 'enabled'"
    • 3.2.2 prometheus-stack

      # /kubesphere/roles/ks-monitor/tasks/prometheus-stack.yaml
      ---
      - import_tasks: cleanup.yaml

      - import_tasks: generate_manifests.yaml

      - import_tasks: prometheus-operator.yaml
        when:
          - "status.monitoring is not defined or status.monitoring.status is not defined or status.monitoring.status != 'enabled'"

      - import_tasks: node-exporter.yaml
        when:
          - "status.monitoring is not defined or status.monitoring.status is not defined or status.monitoring.status != 'enabled'"

      - import_tasks: kube-state-metrics.yaml
        when:
          - "status.monitoring is not defined or status.monitoring.status is not defined or status.monitoring.status != 'enabled'"

      - import_tasks: grafana.yaml
        when:
          - monitoring.grafana is defined
          - monitoring.grafana.enabled is defined
          - monitoring.grafana.enabled == true

      - import_tasks: prometheus.yaml
        when:
          - "status.monitoring is not defined or status.monitoring.status is not defined or status.monitoring.status != 'enabled'"

      - import_tasks: etcd.yaml

      - import_tasks: k8s-monitor.yaml
        when:
          - "status.monitoring is not defined or status.monitoring.status is not defined or status.monitoring.status != 'enabled'"

      - import_tasks: ks-core-monitor.yaml
        when:
          - "status.monitoring is not defined or status.monitoring.status is not defined or status.monitoring.status != 'enabled'"

      - import_tasks: alertmanager.yaml
        when:
          - "status.monitoring is not defined or status.monitoring.status is not defined or status.monitoring.status != 'enabled'"

      - import_tasks: notification-manager.yaml
        when:
          - "status.monitoring is not defined or status.monitoring.status is not defined or status.monitoring.status != 'enabled'"
    • 3.2.3 generate_manifests

      # /kubesphere/roles/ks-monitor/tasks/generate_manifests.yaml
      ---
      - name: Monitoring | Getting ks-monitoring installation files
        copy:
          src: "{{ item }}"
          dest: "{{ kubesphere_dir }}/"
        loop:
          - "prometheus"

      - import_tasks: get_old_config.yaml

      - name: Monitoring | Creating manifests
        template:
          src: "{{ item.file }}.j2"
          dest: "{{ kubesphere_dir }}/{{ item.path }}/{{ item.file }}"
        with_items:
          - { path: prometheus/prometheus-operator, file: prometheus-operator-deployment.yaml }
          - { path: prometheus/prometheus, file: prometheus-prometheus.yaml }
          - { path: prometheus/prometheus, file: prometheus-podDisruptionBudget.yaml }
          - { path: prometheus/kube-state-metrics, file: kube-state-metrics-deployment.yaml }
          - { path: prometheus/node-exporter, file: node-exporter-daemonset.yaml }
          - { path: prometheus/alertmanager, file: alertmanager-alertmanager.yaml }
          - { path: prometheus/alertmanager, file: alertmanager-podDisruptionBudget.yaml }
          - { path: prometheus/grafana, file: grafana-deployment.yaml }
          - { path: prometheus/etcd, file: prometheus-serviceMonitorEtcd.yaml }
          - { path: prometheus/etcd, file: prometheus-endpointsEtcd.yaml }
          - { path: prometheus/thanos-ruler, file: thanos-ruler-thanosRuler.yaml }
          - { path: prometheus/thanos-ruler, file: thanos-ruler-podDisruptionBudget.yaml }
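      The rendered files land under kubesphere_dir which, judging by the paths seen earlier, resolves to /kubesphere/kubesphere inside the pod, so you can inspect the output of a run directly:

        # Inside the ks-installer pod: list and read the rendered Prometheus manifests.
        ls /kubesphere/kubesphere/prometheus/prometheus/
        cat /kubesphere/kubesphere/prometheus/prometheus/prometheus-prometheus.yaml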
    • 3.2.4 get_old_config

      # /kubesphere/roles/roles/ks-monitor/tasks/get_old_config.yaml
      ---
      - name: Monitoring | Checking Prometheus PersistentVolumeClaim
        shell: >
          {{ bin_dir }}/kubectl get pvc -n kubesphere-monitoring-system prometheus-k8s-db-prometheus-k8s-0 -o jsonpath='{.spec.resources.requests.storage}'
        register: prometheus_pvc
        failed_when: false   # even if the command fails, the playbook keeps running

      - name: Monitoring | Setting Prometheus data pv size
        set_fact:
          prometheus_pv_size: "{{ prometheus_pvc.stdout }}"
        when:
          - prometheus_pvc.rc == 0
          - prometheus_pvc.stdout != ""
        failed_when: false

      - name: Monitoring | Checking Prometheus retention days
        shell: >
          {{ bin_dir }}/kubectl get prometheuses.monitoring.coreos.com -n kubesphere-monitoring-system k8s -o jsonpath='{.spec.retention}'
        register: prometheus_retention
        failed_when: false

      - name: Monitoring | Setting Prometheus retention days
        set_fact:
          prometheus_retention_duration: "{{ prometheus_retention.stdout }}"
        when:
          - prometheus_retention.rc == 0
          - prometheus_retention.stdout != ""
        failed_when: false

      - name: Monitoring | Checking Prometheus node selector
        shell: |
          {{ bin_dir }}/kubectl get prometheuses.monitoring.coreos.com -n kubesphere-monitoring-system k8s -o go-template --template="{{ '{{' }}range \$key, \$value := .spec.nodeSelector{{ '}}' }} {{ '{{' }}\$key{{ '}}' }}: {{ '{{' }}\$value{{ '}}' }}
          {{ '{{' }}end{{ '}}' }}"
        register: prometheus_node_selector
        failed_when: false

      - name: Monitoring | Setting Prometheus node selector
        set_fact:
          prometheus_node_selector_map: "{{ prometheus_node_selector.stdout }}"
        when:
          - prometheus_node_selector.rc == 0
          - prometheus_node_selector.stdout != ""
        failed_when: false
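      This is why step 1 of the fix matters: generate_manifests.yaml, and with it get_old_config.yaml, runs on every pass, so as long as the old PVC exists its size wins over whatever you set in the ClusterConfiguration. Below is a hedged sketch of how a template such as prometheus-prometheus.yaml.j2 (see [3]) might consume these facts; the real template differs in detail:

        # Hypothetical excerpt of prometheus-prometheus.yaml.j2
        spec:
          retention: {{ prometheus_retention_duration | default('7d') }}
        {% if prometheus_pv_size is defined %}
          storage:
            volumeClaimTemplate:
              spec:
                resources:
                  requests:
                    storage: {{ prometheus_pv_size }}
        {% endif %}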
    • 3.2.5 monitoring-dashboard

      # /kubesphere/roles/ks-monitor/tasks/monitoring-dashboard.yaml
      ---

      - name: Monitoring | Getting monitoring-dashboard installation files
        copy:
          src: "{{ item }}"   # iterates over the loop values; here only monitoring-dashboard
          dest: "{{ kubesphere_dir }}/"
        loop:
          - "monitoring-dashboard"

      - name: Monitoring | Installing monitoring-dashboard
        shell: >
          {{ bin_dir }}/kubectl apply -f {{ kubesphere_dir }}/monitoring-dashboard
🏥 Reflections



  • I hope this post helps! If you have questions or need further help, feel free to ask.
  • If you enjoyed this article, a follow or a star would be much appreciated.

🗺 References

[1] An introduction to KubeSphere's ks-installer

[2] Integrate Your Own Prometheus (KubeSphere documentation)

[3] The template that generates prometheus-prometheus.yaml
