Output of kubectl get pods:
[root@centos-master ~]# kubectl get pods
NAME READY STATUS RESTARTS AGE
nfs-server-h6nw8 1/1 Running 0 1h
nfs-web-07rxz 0/1 CrashLoopBackOff 8 16m
nfs-web-fdr9h 0/1 CrashLoopBackOff 8 16m
Output of kubectl describe pods:
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
16m 16m 1 {default-scheduler } Normal Scheduled Successfully assigned nfs-web-fdr9h to centos-minion-2
16m 16m 1 {kubelet centos-minion-2} spec.containers{web} Normal Created Created container with docker id 495fcbb06836
16m 16m 1 {kubelet centos-minion-2} spec.containers{web} Normal Started Started container with docker id 495fcbb06836
16m 16m 1 {kubelet centos-minion-2} spec.containers{web} Normal Started Started container with docker id d56f34ae4e8f
16m 16m 1 {kubelet centos-minion-2} spec.containers{web} Normal Created Created container with docker id d56f34ae4e8f
16m 16m 2 {kubelet centos-minion-2} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "web" with CrashLoopBackOff: "Back-off 10s restarting failed container=web pod=nfs-web-fdr9h_default(461c937d-d870-11e6-98de-005056040cc2)"
I have two pods, nfs-web-07rxz and nfs-web-fdr9h, but if I run kubectl logs nfs-web-07rxz, with or without the -p option, I don't see any logs for either pod:
[root@centos-master ~]# kubectl logs nfs-web-07rxz -p
[root@centos-master ~]# kubectl logs nfs-web-07rxz
This is my replicationController.yaml:
apiVersion: v1
kind: ReplicationController
metadata:
  name: nfs-web
spec:
  replicas: 2
  selector:
    role: web-frontend
  template:
    metadata:
      labels:
        role: web-frontend
    spec:
      containers:
      - name: web
        image: eso-cmbu-docker.artifactory.eng.vmware.com/demo-container:demo-version3.0
        ports:
          - name: web
            containerPort: 80
        securityContext:
          privileged: true
My Docker image was built from this simple Dockerfile:
FROM ubuntu
RUN apt-get update
RUN apt-get install -y nginx
RUN apt-get install -y nfs-common
I am running my Kubernetes cluster on CentOS-1611, and the kubectl version is:
[root@centos-master ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.0", GitCommit:"86dc49aa137175378ac7fba7751c3d3e7f18e5fc", GitTreeState:"clean", BuildDate:"2016-12-15T16:57:18Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.0", GitCommit:"86dc49aa137175378ac7fba7751c3d3e7f18e5fc", GitTreeState:"clean", BuildDate:"2016-12-15T16:57:18Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
If I run the Docker image with docker run, it runs without any problem; it only crashes under Kubernetes.
How can I debug this when I can't see any logs?
You need to either give your Dockerfile a command to run or specify a command in your ReplicationController. For example:
Dockerfile:
FROM centos
...
# Script to execute when the container starts
CMD ["/run.sh"]
k8s:
...
containers:
- name: api
  image: localhost:5000/image-name
  command: [ "sleep" ]
  args: [ "infinity" ]
...
The pod crashes because the container has no long-running command: it starts and exits immediately, Kubernetes restarts it, and the cycle repeats.
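When kubectl logs prints nothing, the recorded termination state of the previous container run can still show how it ended. The following is a minimal sketch, not a definitive recipe: `last_exit` is a hypothetical helper name, and it assumes the container of interest is the first one listed in the pod's status.

```shell
# Hypothetical helper: print the recorded termination state (exit code,
# reason, timestamps) of the previous run of a pod's first container.
# Assumes kubectl is already configured to talk to the cluster.
last_exit() {
  local pod=$1
  kubectl get pod "$pod" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated}'
}
```

Calling `last_exit nfs-web-07rxz` would print whatever the kubelet recorded for the last terminated run. An exit code of 0 there would suggest that the main process simply finished rather than failed, which fits a container that has nothing to keep it running.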
Thanks, it was indeed the missing start command!
If the application is slow to start, the crash loop may be related to the initial values of the readiness/liveness probes. I solved my problem by increasing initialDelaySeconds to 120s, because my SpringBoot application does a lot of initialization at startup.
service:
  livenessProbe:
    httpGet:
      path: /health/local
      scheme: HTTP
      port: 8888
    initialDelaySeconds: 120
    periodSeconds: 5
    timeoutSeconds: 5
    failureThreshold: 10
  readinessProbe:
    httpGet:
      path: /admin/health
      scheme: HTTP
      port: 8642
    initialDelaySeconds: 150
    periodSeconds: 5
    timeoutSeconds: 5
    failureThreshold: 10
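A very good explanation of these values is given in "What is the default value of initialDelaySeconds". The health or readiness check algorithm works as follows:
1. Wait for initialDelaySeconds.
2. Perform the check and wait up to timeoutSeconds for a response.
3. If the number of consecutive successes reaches successThreshold, return success.
4. If the number of consecutive failures reaches failureThreshold, return failure; otherwise wait periodSeconds and start a new check.
In my case, the application was affected by these checks.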
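To make the interaction between initialDelaySeconds, periodSeconds and failureThreshold concrete, here is a minimal sketch that simulates the check loop with plain arithmetic. It is not how the kubelet is implemented: `simulate_probe` and its READY_AT argument are hypothetical, each check is assumed to return instantly (timeoutSeconds is ignored), and successThreshold is assumed to be 1.

```shell
# Hypothetical helper:
#   simulate_probe READY_AT INITIAL_DELAY PERIOD FAILURE_THRESHOLD
# READY_AT is the assumed second at which the app starts passing its check.
simulate_probe() {
  local ready_at=$1 initial_delay=$2 period=$3 failure_threshold=$4
  local t=$initial_delay   # wait initialDelaySeconds before the first check
  local failures=0
  while true; do
    if [ "$t" -ge "$ready_at" ]; then
      # The check passes (successThreshold assumed to be 1).
      echo "success at ${t}s"
      return 0
    fi
    failures=$((failures + 1))
    if [ "$failures" -ge "$failure_threshold" ]; then
      # Too many consecutive failures: the probe reports failure.
      echo "failure at ${t}s"
      return 1
    fi
    t=$((t + period))      # wait periodSeconds, then check again
  done
}
```

Under these assumptions, an application that needs 100 seconds to come up passes with `simulate_probe 100 120 5 10` (the first check only happens at 120s), whereas `simulate_probe 100 0 5 10` reports a failure at 45s, long before the application is ready.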