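Contents
I. Experiment
1. Environment
2. Deploying Jenkins on Kubernetes 1.29
3. Installing the Kubernetes plugin in Jenkins
II. Problems
1. Pod creation fails
2. Viewing logs with journalctl
3. Querying the Jenkins initial password inside the container
4. Error when installing the Chinese language pack offline
5. Jenkins plugin errors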
I. Experiment
1. Environment
(1) Hosts
Table 1: Hosts
Host | Role | Version | IP | Notes |
master | K8S master node | 1.29.0 | 192.168.204.8 | |
node1 | K8S worker node | 1.29.0 | 192.168.204.9 | |
node2 | K8S worker node | 1.29.0 | 192.168.204.10 | Kuboard deployed |
(2) View the cluster from the master node
1) List the nodes
kubectl get node
2) List the nodes with details
kubectl get node -o wide
(3) View the pods
[root@master ~]# kubectl get pod -A
(4) Access Kuboard
http://192.168.204.10:30080/kuboard/cluster
View the nodes.
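2. Deploying Jenkins on Kubernetes 1.29
(1) Create the namespace on the master node
[root@master jenkins]# kubectl create ns jenkins
(2) Check the namespace in Kuboard
The jenkins namespace has been added.
http://192.168.204.10:30080/kubernetes/K8S-1.29/cluster/namespace
(3) Create the ServiceAccount
A ServiceAccount gives the processes (containers) running in a Pod an identity for accessing the Kubernetes API; the ClusterRole and ClusterRoleBinding in the same file grant that identity its permissions.
[root@master jenkins]# vim serviceAccount.yaml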
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: jenkins-admin
rules:
  - apiGroups: [""]
    resources: ["*"]
    verbs: ["*"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: jenkins-admin
  namespace: jenkins
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: jenkins-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: jenkins-admin
subjects:
  - kind: ServiceAccount
    name: jenkins-admin
    namespace: jenkins
(4) Create the resources
[root@master jenkins]# kubectl apply -f serviceAccount.yaml
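To confirm that the binding took effect, one option is to ask the API server what the service account is allowed to do. The following is a minimal sketch that assumes the names from serviceAccount.yaml; the helper name sa_can_i is hypothetical, while kubectl auth can-i and its --as flag are standard kubectl.

```shell
# Hypothetical helper: check whether the jenkins-admin service account
# may perform VERB ($1) on RESOURCE ($2) in the jenkins namespace.
sa_can_i() {
    kubectl auth can-i "$1" "$2" \
        --as=system:serviceaccount:jenkins:jenkins-admin \
        -n jenkins
}
```

For example, sa_can_i create pods prints yes or no. Because the ClusterRole covers every resource in the core API group, yes is the expected answer for pods.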
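(5) Create the persistence manifest
Define a PV named jenkins-pv-volume with a capacity of 5Gi, and a PVC named jenkins-pv-claim that requests 3Gi from it. The backing directory is /home/jenkins, and the volume is pinned to node2.
[root@master jenkins]# vim volume.yaml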
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: jenkins-pv-volume
  labels:
    type: local
spec:
  storageClassName: local-storage
  claimRef:
    name: jenkins-pv-claim
    namespace: jenkins
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  local:
    path: /home/jenkins
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node2
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jenkins-pv-claim
  namespace: jenkins
spec:
  storageClassName: local-storage
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 3Gi
(6) Create the backing directory on node2
[root@node2 ~]# mkdir /home/jenkins
[root@node2 ~]# chmod 777 /home/jenkins
(7) Create the PV resources
[root@master jenkins]# kubectl apply -f volume.yaml
Check:
[root@master jenkins]# kubectl get pv -n jenkins
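A PV is cluster-scoped while a PVC is namespaced, so it can help to query the two separately. The sketch below uses the standard kubectl jsonpath output to print only the phase of each object; the helper names pv_phase and pvc_phase are hypothetical. Because the StorageClass uses WaitForFirstConsumer, the claim may report Pending until the Jenkins pod is scheduled.

```shell
# Hypothetical helpers: print only the phase (for example Available,
# Bound, or Pending) of a PV or a PVC.
pv_phase() {
    # PersistentVolumes are cluster-scoped, so no namespace is needed.
    kubectl get pv "$1" -o jsonpath='{.status.phase}'
}

pvc_phase() {
    # $1 = claim name, $2 = namespace
    kubectl get pvc "$1" -n "$2" -o jsonpath='{.status.phase}'
}
```

Usage: pv_phase jenkins-pv-volume and pvc_phase jenkins-pv-claim jenkins.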
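(8) Look up the Jenkins image on Docker Hub
https://hub.docker.com/r/jenkins/jenkins/tags
(9) Pull the image on the worker node in advance
On node2:
[root@node2 ~]# docker pull jenkins/jenkins:2.414.1
(10) Create the Deployment manifest
[root@master jenkins]# vim deployment.yaml
The pod is pinned to node2, where the PV lives, and the image tag is 2.440.3-lts-jdk17. Note that the tag pulled in step (9) is 2.414.1, so the 2.440.3-lts-jdk17 image is still pulled when the pod starts unless that tag is pulled in advance as well.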
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jenkins
  namespace: jenkins
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      nodeSelector:
        kubernetes.io/hostname: node2
      securityContext:
        fsGroup: 1000
        runAsUser: 1000
      serviceAccountName: jenkins-admin
      containers:
        - name: jenkins
          image: jenkins/jenkins:2.440.3-lts-jdk17
          resources:
            limits:
              memory: "2Gi"
              cpu: "1000m"
            requests:
              memory: "500Mi"
              cpu: "500m"
          ports:
            - name: httpport
              containerPort: 8080
            - name: jnlpport
              containerPort: 50000
          livenessProbe:
            httpGet:
              path: "/login"
              port: 8080
            initialDelaySeconds: 90
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 5
          readinessProbe:
            httpGet:
              path: "/login"
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          volumeMounts:
            - name: jenkins-data
              mountPath: /var/jenkins_home
      volumes:
        - name: jenkins-data
          persistentVolumeClaim:
            claimName: jenkins-pv-claim
(11) Create the resources
[root@master jenkins]# kubectl apply -f deployment.yaml
Watch the pod:
[root@master jenkins]# kubectl get pods -n jenkins -o wide -w
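As an alternative to watching the pod list by hand, kubectl rollout status blocks until the Deployment is ready or a timeout expires. This is a small sketch, assuming the Deployment name and namespace from deployment.yaml; the helper name wait_for_jenkins and the 300s default are illustrative choices, not values from the original setup.

```shell
# Hypothetical helper: wait until the jenkins Deployment has rolled out.
# $1 is an optional timeout (default 300s, an arbitrary choice).
wait_for_jenkins() {
    kubectl rollout status deployment/jenkins -n jenkins \
        --timeout="${1:-300s}"
}
```

The command exits with a non-zero status if the rollout does not finish in time, which makes it convenient in scripts.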
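(12) Create the Service
[root@master jenkins]# vim service.yaml
Kubernetes monitoring annotations (they tell Prometheus how to scrape metrics):
annotations: the annotation field of a Kubernetes resource, used to attach non-identifying metadata.
prometheus.io/scrape: a conventional annotation key that indicates whether this service's metrics should be scraped. It takes effect only when the Prometheus scrape configuration is set up to read it.
true: the value of prometheus.io/scrape, meaning the metrics should be scraped.
prometheus.io/port: the port Prometheus uses to scrape metrics.
The manifest: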
apiVersion: v1
kind: Service
metadata:
  name: jenkins
  namespace: jenkins
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/path: /
    prometheus.io/port: '8080'
spec:
  selector:
    app: jenkins
  type: NodePort
  ports:
    - port: 8080
      targetPort: 8080
      nodePort: 32000
(13) Create the Service and check it
[root@master jenkins]# kubectl apply -f service.yaml
Check the svc:
[root@master jenkins]# kubectl get svc -n jenkins
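Before opening a browser, the NodePort can be probed from any machine that reaches the node. The sketch below prints only the HTTP status code of the login page, which the Deployment's probes also use; the helper name jenkins_http_code and the 5-second limit are illustrative assumptions.

```shell
# Hypothetical helper: print the HTTP status code returned by a URL.
# -s silences progress output, -o /dev/null discards the body,
# -w '%{http_code}' prints just the status code.
jenkins_http_code() {
    curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1"
}
```

Usage: jenkins_http_code http://192.168.204.10:32000/login. A 2xx code suggests Jenkins is up; a 000 code means curl could not connect at all.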
(14) Check in Kuboard
Workloads
Pods
(15) Access Jenkins
http://192.168.204.10:32000
(16) Get the password
[root@master jenkins]# kubectl logs -f jenkins-69758d74c9-6tc8s -n jenkins
The password appears in this part of the log:
Jenkins initial setup is required. An admin user has been created and a password generated.
Please use the following password to proceed to installation:
df904552c24d46999f2bfb44b0aa916e
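The password can also be picked out of the log text automatically. The sketch below assumes, based on the sample above, that the password is a standalone token of 32 lowercase hexadecimal characters; the helper name extract_initial_password is hypothetical.

```shell
# Hypothetical helper: read log text on stdin and print the first
# whitespace-separated token that is exactly 32 lowercase hex characters.
extract_initial_password() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if (length($i) == 32 && $i ~ /^[0-9a-f]+$/) {
                print $i
                exit
            }
        }
    }'
}
```

Usage: kubectl logs jenkins-69758d74c9-6tc8s -n jenkins | extract_initial_password (without -f, so that the command terminates).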
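(17) Enter the password
Plugin installation can be skipped: click the X in the upper-right corner.
Start using Jenkins.
(18) Enter the system
3. Installing the Kubernetes plugin in Jenkins
(1) Click the link in the lower-right corner of the page
"Website" links to the Chinese version of the official site:
https://www.jenkins.io/zh/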
(2) Management page
http://192.168.204.10:32000/manage
Plugin management:
http://192.168.204.10:32000/manage/pluginManager/advanced
Set a mirror in China as the update site:
https://mirrors.tuna.tsinghua.edu.cn/jenkins/updates/update-center.json
The Resource Root URL can also be set under System.
(3) Change the password and time zone
(4) Log in again
(5) Restart Jenkins from the master node
Deleting the pod makes the Deployment recreate it.
[root@master jenkins]# kubectl delete pods jenkins-69758d74c9-6tc8s -n jenkins
(6) Check the pod
[root@master jenkins]# kubectl get pod -n jenkins
(7) Check on node2
[root@node2 var]# cd /home/jenkins/
[root@node2 jenkins]# ls
(8) Check inside the container
The contents match /home/jenkins/ on node2.
[root@master jenkins]# kubectl exec -it jenkins-69758d74c9-hxb89 -n jenkins -- /bin/bash
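(9) Offline plugin packages
https://updates.jenkins-ci.org/download/plugins/
Download the Chinese language pack:
https://updates.jenkins-ci.org/download/plugins/localization-zh-cn/
(10) Install the Chinese language pack offline
Install:
Done (the pod must be restarted):
(11) Install the Kubernetes plugin
Done:
(12) Connect Jenkins to the K8S cluster
Create:
Look up the Kubernetes API server: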
[root@master ~]# kubectl cluster-info
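kubectl cluster-info prints a human-readable summary. If only the API server URL is needed for pasting into the Jenkins form, it can be read from the active kubeconfig instead. This is a sketch using standard kubectl config view flags; the helper name api_server_url is hypothetical.

```shell
# Hypothetical helper: print the API server URL of the current
# kubeconfig context. --minify limits output to the active context.
api_server_url() {
    kubectl config view --minify \
        -o jsonpath='{.clusters[0].cluster.server}'
}
```

Since Jenkins here runs inside the cluster, the in-cluster address https://kubernetes.default.svc is also commonly used for this field.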
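Test the connection (Jenkins runs inside the K8S cluster and has a ServiceAccount, so no key needs to be entered):
Test succeeded:
Connection complete:
(13) Finally, check Kuboard again
jenkins namespace
kube-system namespace
(14) Other ways to deploy Jenkins
See the author's blog post:
持续集成交付CICD:Jenkins部署-CSDN博客 (CI/CD: Deploying Jenkins, CSDN blog)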
II. Problems
1. Pod creation fails
(1) Error
The pod stays stuck in ContainerCreating and never becomes ready; READY remains 0/1.
(2) Root cause analysis
① Describe the pod
[root@master jenkins]# kubectl describe pod jenkins-69758d74c9-6tc8s -n jenkins
The last events show FailedCreatePodSandBox:
Warning FailedCreatePodSandBox 7m18s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "7d71985b3886817eb93f1885835a0bb869f67a4de34797266ff850f53f62af1c" network for pod "jenkins-69758d74c9-6tc8s": networkPlugin cni failed to set up pod "jenkins-69758d74c9-6tc8s_jenkins" network: plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized, failed to clean up sandbox container "7d71985b3886817eb93f1885835a0bb869f67a4de34797266ff850f53f62af1c" network for pod "jenkins-69758d74c9-6tc8s": networkPlugin cni failed to teardown pod "jenkins-69758d74c9-6tc8s_jenkins" network: plugin type="calico" failed (delete): error getting ClusterInformation: connection is unauthorized: Unauthorized]
Normal SandboxChanged 2m1s (x25 over 7m18s) kubelet Pod sandbox changed, it will be killed and re-created.
② On node2, search the journal for CNI messages
sudo journalctl -xe | grep cni
The last kubelet entries show failed to "KillPodSandbox":
4月 19 16:50:55 node2 kubelet[51899]: E0419 16:50:55.608296 51899 kubelet.go:2032] failed to "KillPodSandbox" for "2e3c9e42-396b-4b9b-980a-b8275991b8a8" with KillPodSandboxError: "rpc error: code = Unknown desc = networkPlugin cni failed to teardown pod \"jenkins-696cf86678-jx477_jenkins\" network: plugin type=\"calico\" failed (delete): error getting ClusterInformation: connection is unauthorized: Unauthorized"
4月 19 16:50:55 node2 kubelet[51899]: E0419 16:50:55.608390 51899 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"KillPodSandbox\" for \"2e3c9e42-396b-4b9b-980a-b8275991b8a8\" with KillPodSandboxError: \"rpc error: code = Unknown desc = networkPlugin cni failed to teardown pod \\\"jenkins-696cf86678-jx477_jenkins\\\" network: plugin type=\\\"calico\\\" failed (delete): error getting ClusterInformation: connection is unauthorized: Unauthorized\"" pod="jenkins/jenkins-696cf86678-jx477" podUID="2e3c9e42-396b-4b9b-980a-b8275991b8a8"
4月 19 16:50:56 node2 sudo[73408]: root : TTY=pts/1 ; PWD=/etc/cni/net.d ; USER=root ; COMMAND=/bin/journalctl -xe
③ The CNI configuration files live in /etc/cni/net.d/ by default; go there and inspect them
[root@node2 net.d]# cd /etc/cni/net.d/
[root@node2 net.d]# ls
The nodename in the configuration is node2, which is correct.
[root@node2 net.d]# vim 10-calico.conflist
④ Check the kubelet logs
[root@node2 net.d]# journalctl --since="2024-04-19 16:00:00" --until="2024-04-19 17:00:00" -fu kubelet
The log shows Failed to stop sandbox:
4月 19 16:56:55 node2 kubelet[51899]: E0419 16:56:55.626079 51899 kuberuntime_manager.go:1381] "Failed to stop sandbox" podSandboxID={"Type":"docker","ID":"2958227182cb84e9c4bc0d44a662316ab58355f1cb9bb8a1923225d9b37247fc"}
The last entry shows failed to "KillPodSandbox":
4月 19 16:56:55 node2 kubelet[51899]: E0419 16:56:55.626182 51899 kubelet.go:2032] failed to "KillPodSandbox" for "18e3512f-846a-42c3-a10b-6bb0a2a33533" with KillPodSandboxError: "rpc error: code = Unknown desc = networkPlugin cni failed to teardown pod \"jenkins-69758d74c9-br846_jenkins\" network: plugin type=\"calico\" failed (delete): error getting ClusterInformation: connection is unauthorized: Unauthorized"
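⑤ Check cri-docker on each node and restart the service
systemctl status cri-docker
systemctl restart cri-docker
⑥ Conclusion
The Calico CNI plugin on node2 was in a bad state: its requests to the API server were rejected as Unauthorized, so it could not set up the pod network or assign a pod IP, and the pod stayed in ContainerCreating.
(3) Solution
Delete the calico-node pod on the affected node so that it is recreated and resynchronizes its data.
① Delete calico-node-zwfqf
② It has been recreated
Check:
③ The Jenkins pod is deployed successfully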
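The deletion step above can be expressed without looking up the generated pod name. The sketch below selects the calico-node pod by node name; it assumes the DaemonSet pods carry the label k8s-app=calico-node in kube-system, as in the standard Calico manifests, and the helper name restart_calico_on_node is hypothetical.

```shell
# Hypothetical helper: delete the calico-node pod running on node $1
# so that its DaemonSet recreates it.
# Assumption: pods are labelled k8s-app=calico-node in kube-system.
restart_calico_on_node() {
    kubectl delete pod -n kube-system \
        -l k8s-app=calico-node \
        --field-selector "spec.nodeName=$1"
}
```

Usage: restart_calico_on_node node2, followed by kubectl get pods -n kube-system -o wide to confirm that a new calico-node pod is Running on that node.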
2. Viewing logs with journalctl
(1) Commands
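1) Follow the journal in real time
journalctl -f
2) View kernel messages
journalctl -k
3) View the logs of a specific service (add -f to follow new entries in real time)
journalctl -u kubelet
4) View logs for a given time range
journalctl --since="2024-04-19 16:00:00" -fu kubelet
journalctl --since="2024-04-19 16:00:00" --until="2024-04-19 17:00:00" -fu kubelet   # also: --until "1 hour ago" / --until now
journalctl --since "10 min ago"   # logs from the last 10 minutes
journalctl --since today          # logs since the start of today
journalctl --since yesterday      # logs since the start of yesterday
5) Show the disk space used by the journal
journalctl --disk-usage
6) Shrink the journal to a given size
journalctl --vacuum-size=500M
7) Remove journal entries older than a given age
journalctl --vacuum-time=1month
8) Verify the consistency of the journal files
journalctl --verify
9) Show the last num lines; if num is omitted, the last 10 lines are shown
journalctl -n [num]
10) Set the output format
journalctl -o <mode>
mode is one of: short, short-iso, short-precise, short-monotonic, verbose, export, json, json-pretty, json-sse, cat
11) Plain standard output
Output is paged by default; --no-pager writes it straight to standard output.
journalctl --no-pager
12) Show the logs of a given process ID
journalctl _PID=22856
13) Show the logs of a given user
journalctl _UID=33 --since=today
14) Match by unit and priority
journalctl _SYSTEMD_UNIT=cron.service PRIORITY=6
15) Show the help
man journalctl
journalctl -h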
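Several of these options combine naturally for the kind of CNI troubleshooting done in problem 1. The sketch below limits the kubelet journal to a time window, disables the pager, and keeps only lines mentioning cni; the helper name kubelet_cni_errors is hypothetical. It omits -f on purpose, because following the journal never terminates and does not fit a bounded --until window.

```shell
# Hypothetical helper: print kubelet journal lines that mention "cni"
# between two timestamps. $1 = since, $2 = until.
kubelet_cni_errors() {
    journalctl -u kubelet --since "$1" --until "$2" --no-pager \
        | grep -i 'cni'
}
```

Usage: kubelet_cni_errors "2024-04-19 16:00:00" "2024-04-19 17:00:00".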
3. Querying the Jenkins initial password inside the container
(1) Query it from the master node with kubectl
[root@master jenkins]# kubectl exec -it jenkins-69758d74c9-lb96b -n jenkins -- /bin/bash
jenkins@jenkins-69758d74c9-lb96b:/$ cat /var/jenkins_home/secrets/initialAdminPassword
jenkins@jenkins-69758d74c9-lb96b:/$ exit
(2) On node2, find the ID of the running container, then enter it to read the initial password
[root@node2 ~]# docker ps -a
The jenkins image ID is 4e586344183a.
[root@node2 ~]# docker ps -a | grep jenkins
Look up the container ID:
docker ps -a --filter ancestor=4e586344183a --format "{{.ID}}"
Enter the container:
docker exec -it 0821261b4091 bash
cat /var/jenkins_home/secrets/initialAdminPassword
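Both approaches above need a pod name or container ID that changes on every restart. A shorter variant, sketched here, lets kubectl pick a pod of the Deployment and runs cat directly without an interactive shell. It assumes the Deployment is named jenkins in the jenkins namespace as in deployment.yaml; the helper name jenkins_initial_password is hypothetical.

```shell
# Hypothetical helper: print the initial admin password by running cat
# in a pod of the jenkins Deployment (no interactive shell needed).
jenkins_initial_password() {
    kubectl exec deploy/jenkins -n jenkins -- \
        cat /var/jenkins_home/secrets/initialAdminPassword
}
```

Because /var/jenkins_home is backed by the PV, the same file should also be readable on node2 under /home/jenkins/secrets/.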
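4. Error when installing the Chinese language pack offline
(1) Error
(2) Root cause analysis
Localization Support must be installed first.
(3) Solution
First install Localization Support offline:
Then install the Chinese language pack:
Restart Jenkins:
[root@master jenkins]# kubectl delete pods jenkins-69758d74c9-hxb89 -n jenkins
Watch the pod (the restart completed in 68s):
Success: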
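5. Jenkins plugin errors
(1) Error
The update site reports an error.
Installing the Kubernetes plugin from the offline package also fails.
(2) Root cause analysis
The Jenkins pod running in the K8s cluster cannot reach external domain names (ping fails), which causes the network errors.
(3) Solution
① Check the CoreDNS pods
[root@master ~]# kubectl get pods -n kube-system -o wide |grep coredns
② Check in Kuboard
③ Check the DNS server information
[root@master ~]# kubectl get svc -n kube-system -o wide
The DNS server IP is 10.96.0.10.
④ Making the change from the node fails with a permission error
echo "$(sed 's/10.96.0.10/10.244.166.133/g' /etc/resolv.conf)" > /etc/resolv.conf
⑤ Making the change from inside the container via docker also fails with a permission error
⑥ Scale out CoreDNS
CoreDNS was originally running only on node1; scale it out to 3 replicas.
Done: the CoreDNS pod now running on node2 has the IP 10.244.104.10.
Check the pods: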
[root@master ~]# kubectl get pods -n kube-system -o wide |grep coredns
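The original steps do not show how the scale-out was performed. From the command line it could look like the sketch below, which assumes CoreDNS runs as a Deployment named coredns in kube-system, as is usual for kubeadm-built clusters; the helper name scale_coredns is hypothetical.

```shell
# Hypothetical helper: scale the CoreDNS Deployment to $1 replicas
# (default 3, matching the replica count used in this step).
# Assumption: the Deployment is named coredns in kube-system.
scale_coredns() {
    kubectl scale deployment coredns -n kube-system \
        --replicas="${1:-3}"
}
```

The scheduler decides where the new replicas land, so a replica on node2 is likely with three nodes but not guaranteed.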
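⑦ Try to modify the file via docker cp
Copy the file out of the container:
[root@node2 /]# sudo docker cp 0821261b4091:/etc/resolv.conf ~
View the configuration file
Edit the configuration file
Copy the file back into the container; this still fails with a permission error:
[root@node2 ~]# sudo docker cp resolv.conf 0821261b4091:/etc/
⑧ Enter the docker container as root
Edit the file: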
echo "$(sed 's/10.96.0.10/10.244.104.10/g' /etc/resolv.conf)" > /etc/resolv.conf
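The sed expression above treats each dot as "any character" and replaces the address anywhere in the file. A slightly stricter variant is sketched below: it escapes the dots and only rewrites a matching nameserver line. It works as a filter on standard input, and the helper name rewrite_nameserver is hypothetical.

```shell
# Hypothetical helper: read resolv.conf text on stdin and replace the
# nameserver entry $1 with $2, leaving all other lines untouched.
rewrite_nameserver() {
    # Escape dots so that "10.96.0.10" matches only that literal address.
    old_escaped=$(printf '%s' "$1" | sed 's/\./\\./g')
    sed "s/^nameserver[[:space:]]\{1,\}${old_escaped}[[:space:]]*\$/nameserver $2/"
}
```

Usage inside the container: rewrite_nameserver 10.96.0.10 10.244.104.10 < /etc/resolv.conf > /tmp/resolv.conf.new, then cat /tmp/resolv.conf.new > /etc/resolv.conf. Keep in mind that a CoreDNS pod IP such as 10.244.104.10 is not stable: it changes when that pod is recreated, so this is a workaround rather than a permanent fix.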
⑨ Restart the Jenkins service
Watch the pod come back up
⑩ Jenkins now retrieves the plugin information successfully
Install: