This repository has been archived by the owner on Jun 6, 2024. It is now read-only.

Create kubeadm token for joining nodes with 24h expiration fails #5648

Open
zsh4614 opened this issue Nov 4, 2021 · 10 comments

zsh4614 commented Nov 4, 2021

Organization Name: HIT

Short summary about the issue/question:

fatal: [master -> 10.10.8.87]: FAILED! => {"attempts": 5, "changed": true, "cmd": ["/usr/local/bin/kubeadm", "--kubeconfig", "/etc/kubernetes/admin.conf", "token", "create"], "delta": "0:01:15.017698", "end": "2021-11-04 14:19:20.079711", "msg": "non-zero return code", "rc": 1, "start": "2021-11-04 14:18:05.062013", "stderr": "timed out waiting for the condition", "stderr_lines": ["timed out waiting for the condition"], "stdout": "", "stdout_lines": []}

Brief what process you are following: follow #5592

  • Operating type: Initial deployment

OpenPAI Environment:

  • OpenPAI version: 1.8.0
  • OS (e.g. from /etc/os-release): Ubuntu 18.04.6 LTS
  • Kernel (e.g. uname -a): 4.15.0-161-generic

TASK [kubernetes/master : kubeadm | Create kubeadm config] **********************************************************************************************************************************************************
Thursday 04 November 2021  22:11:22 +0800 (0:00:00.031)       0:02:29.371 ***** 
changed: [master]

TASK [kubernetes/master : Backup old certs and keys] ****************************************************************************************************************************************************************
Thursday 04 November 2021  22:11:22 +0800 (0:00:00.563)       0:02:29.934 ***** 
ok: [master] => (item={'src': 'apiserver.crt', 'dest': 'apiserver.crt.old'})
ok: [master] => (item={'src': 'apiserver.key', 'dest': 'apiserver.key.old'})
ok: [master] => (item={'src': 'apiserver-kubelet-client.crt', 'dest': 'apiserver-kubelet-client.crt.old'})
ok: [master] => (item={'src': 'apiserver-kubelet-client.key', 'dest': 'apiserver-kubelet-client.key.old'})
ok: [master] => (item={'src': 'front-proxy-client.crt', 'dest': 'front-proxy-client.crt.old'})
ok: [master] => (item={'src': 'front-proxy-client.key', 'dest': 'front-proxy-client.key.old'})

TASK [kubernetes/master : kubeadm | Initialize first master] ********************************************************************************************************************************************************
Thursday 04 November 2021  22:11:24 +0800 (0:00:01.101)       0:02:31.035 ***** 

TASK [kubernetes/master : set kubeadm certificate key] **************************************************************************************************************************************************************
Thursday 04 November 2021  22:11:24 +0800 (0:00:00.026)       0:02:31.062 ***** 

TASK [kubernetes/master : Create hardcoded kubeadm token for joining nodes with 24h expiration (if defined)] ********************************************************************************************************
Thursday 04 November 2021  22:11:24 +0800 (0:00:00.033)       0:02:31.095 ***** 

TASK [kubernetes/master : Create kubeadm token for joining nodes with 24h expiration (default)] *********************************************************************************************************************
Thursday 04 November 2021  22:11:24 +0800 (0:00:00.040)       0:02:31.135 ***** 
FAILED - RETRYING: Create kubeadm token for joining nodes with 24h expiration (default) (5 retries left).
FAILED - RETRYING: Create kubeadm token for joining nodes with 24h expiration (default) (4 retries left).
FAILED - RETRYING: Create kubeadm token for joining nodes with 24h expiration (default) (3 retries left).
FAILED - RETRYING: Create kubeadm token for joining nodes with 24h expiration (default) (2 retries left).
FAILED - RETRYING: Create kubeadm token for joining nodes with 24h expiration (default) (1 retries left).
fatal: [master -> 10.10.8.87]: FAILED! => {"attempts": 5, "changed": true, "cmd": ["/usr/local/bin/kubeadm", "--kubeconfig", "/etc/kubernetes/admin.conf", "token", "create"], "delta": "0:01:15.017698", "end": "2021-11-04 14:19:20.079711", "msg": "non-zero return code", "rc": 1, "start": "2021-11-04 14:18:05.062013", "stderr": "timed out waiting for the condition", "stderr_lines": ["timed out waiting for the condition"], "stdout": "", "stdout_lines": []}

NO MORE HOSTS LEFT **************************************************************************************************************************************************************************************************

PLAY RECAP **********************************************************************************************************************************************************************************************************
localhost                  : ok=1    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
master                     : ok=229  changed=23   unreachable=0    failed=1    skipped=339  rescued=0    ignored=1   
worker1                    : ok=147  changed=11   unreachable=0    failed=0    skipped=200  rescued=0    ignored=0   
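
"timed out waiting for the condition" from kubeadm token create usually means the API server on the master is not answering. A minimal check on the master, assuming the same kubeconfig path as in the log above and that the control plane runs under Docker, might be:

# Ask the API server for its health status with the same kubeconfig kubeadm uses
kubectl --kubeconfig /etc/kubernetes/admin.conf get --raw=/healthz

# This also fails if the API server is down
kubectl --kubeconfig /etc/kubernetes/admin.conf get nodes

# Check whether the kube-apiserver container is actually running
docker ps | grep kube-apiserver

# Look at recent kubelet logs for errors starting the control-plane static pods
journalctl -u kubelet --no-pager -n 50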

zsh4614 commented Nov 4, 2021

TASK [Configure | Check if etcd cluster is healthy] *****************************************************************************************************************************************************************
Thursday 04 November 2021  22:32:48 +0800 (0:00:00.070)       0:01:01.100 ***** 
fatal: [master]: FAILED! => {"changed": false, "cmd": "/usr/local/bin/etcdctl --endpoints=https://10.10.8.87:2379 cluster-health | grep -q 'cluster is healthy'", "delta": "0:00:00.006207", "end": "2021-11-04 14:32:48.419183", "msg": "non-zero return code", "rc": 1, "start": "2021-11-04 14:32:48.412976", "stderr": "Error:  client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 10.10.8.87:2379: connect: connection refused\n\nerror #0: dial tcp 10.10.8.87:2379: connect: connection refused", "stderr_lines": ["Error:  client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 10.10.8.87:2379: connect: connection refused", "", "error #0: dial tcp 10.10.8.87:2379: connect: connection refused"], "stdout": "", "stdout_lines": []}
...ignoring

TASK [Configure | Check if etcd-events cluster is healthy] **********************************************************************************************************************************************************
Thursday 04 November 2021  22:32:48 +0800 (0:00:00.232)       0:01:01.332 ***** 

Another error.

siaimes commented Nov 4, 2021

This problem may be caused by leftover files from a previous failed Kubernetes installation. You can run the following command on every node to reset kubeadm, and then it should be resolved.

kubeadm reset
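
A fuller sketch of that cleanup, assuming you want to wipe all Kubernetes state from each node and that the standard kubeadm paths are in use (adjust to your setup), might look like this:

# Stop the kubelet so it does not recreate the static pods while cleaning up
systemctl stop kubelet

# Tear down any state left by a previous kubeadm run (-f skips the confirmation prompt)
kubeadm reset -f

# Remove leftover configs, certificates, CNI state and etcd data from the previous attempt
rm -rf /etc/kubernetes /var/lib/kubelet /etc/cni/net.d /var/lib/etcd

# Restart Docker so no stale containers keep ports such as 10250/10251 busy
systemctl restart docker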

siaimes commented Nov 4, 2021

That etcd health check failure is simply ignored by the playbook (note the "...ignoring" line right after it), so it is not the real problem.
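
If you still want to check etcd by hand on the master, a quick look, reusing the exact command from the log above, might be:

# See whether anything is listening on the etcd client port
ss -lntp | grep 2379

# In a kubespray-based deployment etcd usually runs as a container, so check it is up
docker ps | grep etcd

# Re-run the same health check the playbook uses
/usr/local/bin/etcdctl --endpoints=https://10.10.8.87:2379 cluster-health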

zsh4614 commented Nov 5, 2021

Thank you very much for your answer, but after doing this, another problem appeared:

TASK [kubernetes/master : kubeadm | Create kubeadm config] **********************************************************************************************************************************************************
Friday 05 November 2021  12:05:54 +0800 (0:00:00.032)       0:01:39.996 ******* 
changed: [master]

TASK [kubernetes/master : Backup old certs and keys] ****************************************************************************************************************************************************************
Friday 05 November 2021  12:05:55 +0800 (0:00:00.589)       0:01:40.586 ******* 

TASK [kubernetes/master : kubeadm | Initialize first master] ********************************************************************************************************************************************************
Friday 05 November 2021  12:05:55 +0800 (0:00:00.062)       0:01:40.648 ******* 
FAILED - RETRYING: kubeadm | Initialize first master (3 retries left).
FAILED - RETRYING: kubeadm | Initialize first master (2 retries left).
FAILED - RETRYING: kubeadm | Initialize first master (1 retries left).
fatal: [master]: FAILED! => {"attempts": 3, "changed": true, "cmd": ["timeout", "-k", "300s", "300s", "/usr/local/bin/kubeadm", "init", "--config=/etc/kubernetes/kubeadm-config.yaml", "--ignore-preflight-errors=all", "--skip-phases=addon/coredns", "--upload-certs"], "delta": "0:05:00.004794", "end": "2021-11-05 04:26:11.139309", "failed_when_result": true, "msg": "non-zero return code", "rc": 124, "start": "2021-11-05 04:21:11.134515", "stderr": "\t[WARNING Port-10251]: Port 10251 is in use\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists\n\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". Please follow the guide at https://kubernetes.io/docs/setup/cri/\n\t[WARNING Port-10250]: Port 10250 is in use", "stderr_lines": ["\t[WARNING Port-10251]: Port 10251 is in use", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists", "\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". Please follow the guide at https://kubernetes.io/docs/setup/cri/", "\t[WARNING Port-10250]: Port 10250 is in use"], "stdout": "[init] Using Kubernetes version: v1.15.11\n[preflight] Running pre-flight checks\n[preflight] Pulling images required for setting up a Kubernetes cluster\n[preflight] This might take a minute or two, depending on the speed of your internet connection\n[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'\n[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"\n[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"\n[kubelet-start] Activating the kubelet service\n[certs] Using certificateDir folder \"/etc/kubernetes/ssl\"\n[certs] Using existing ca certificate authority\n[certs] Using existing apiserver certificate and key on disk\n[certs] Using existing apiserver-kubelet-client certificate and key on disk\n[certs] External etcd mode: Skipping etcd/ca certificate authority generation\n[certs] External etcd mode: Skipping etcd/healthcheck-client certificate authority generation\n[certs] External etcd mode: Skipping etcd/server certificate authority generation\n[certs] External etcd mode: Skipping etcd/peer certificate authority generation\n[certs] External etcd mode: Skipping apiserver-etcd-client certificate authority generation\n[certs] Using existing front-proxy-ca certificate authority\n[certs] Using existing front-proxy-client certificate and key on disk\n[certs] Using the existing \"sa\" key\n[kubeconfig] Using kubeconfig folder \"/etc/kubernetes\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/admin.conf\"\n[kubeconfig] Using existing kubeconfig file: 
\"/etc/kubernetes/kubelet.conf\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/controller-manager.conf\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/scheduler.conf\"\n[control-plane] Using manifest folder \"/etc/kubernetes/manifests\"\n[control-plane] Creating static Pod manifest for \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"\n[control-plane] Creating static Pod manifest for \"kube-controller-manager\"\n[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"\n[control-plane] Creating static Pod manifest for \"kube-scheduler\"\n[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"\n[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory \"/etc/kubernetes/manifests\". This can take up to 5m0s\n[kubelet-check] Initial timeout of 40s passed.", "stdout_lines": ["[init] Using Kubernetes version: v1.15.11", "[preflight] Running pre-flight checks", "[preflight] Pulling images required for setting up a Kubernetes cluster", "[preflight] This might take a minute or two, depending on the speed of your internet connection", "[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'", "[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"", "[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"", "[kubelet-start] Activating the kubelet service", "[certs] Using certificateDir folder \"/etc/kubernetes/ssl\"", "[certs] Using existing ca certificate authority", "[certs] Using existing apiserver certificate and key on disk", "[certs] Using existing apiserver-kubelet-client certificate and key on disk", "[certs] External etcd mode: Skipping etcd/ca certificate authority generation", "[certs] External etcd mode: Skipping etcd/healthcheck-client certificate authority generation", "[certs] External etcd mode: Skipping etcd/server certificate authority generation", "[certs] External etcd mode: Skipping etcd/peer certificate authority generation", "[certs] External etcd mode: Skipping apiserver-etcd-client certificate authority generation", "[certs] Using existing front-proxy-ca certificate authority", "[certs] Using existing front-proxy-client certificate and key on disk", "[certs] Using the existing \"sa\" key", "[kubeconfig] Using kubeconfig folder \"/etc/kubernetes\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/admin.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/kubelet.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/controller-manager.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/scheduler.conf\"", "[control-plane] Using manifest folder \"/etc/kubernetes/manifests\"", "[control-plane] Creating static Pod manifest for \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"", "[control-plane] Creating static Pod manifest for \"kube-controller-manager\"", "[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"", "[control-plane] Creating static Pod manifest for \"kube-scheduler\"", "[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"", "[wait-control-plane] Waiting for the kubelet to boot up the control plane as 
static Pods from directory \"/etc/kubernetes/manifests\". This can take up to 5m0s", "[kubelet-check] Initial timeout of 40s passed."]}

NO MORE HOSTS LEFT **************************************************************************************************************************************************************************************************

PLAY RECAP **********************************************************************************************************************************************************************************************************
localhost                  : ok=1    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
master                     : ok=218  changed=14   unreachable=0    failed=1    skipped=336  rescued=0    ignored=0   
worker1                    : ok=148  changed=18   unreachable=0    failed=0    skipped=199  rescued=0    ignored=0   
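
The warnings show that ports 10250 and 10251 are already in use and that old static pod manifests still exist, so something from the previous attempt is apparently still running. A quick way to check, assuming Docker as the runtime (per the IsDockerSystemdCheck warning in the log), might be:

# Find which processes hold the ports mentioned in the warnings
ss -lntp | grep -E ':10250|:10251'

# List the static pod manifests kubeadm complained about
ls -l /etc/kubernetes/manifests/

# See which Kubernetes containers are still running from the earlier install
docker ps | grep -E 'kube-apiserver|kube-controller-manager|kube-scheduler'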

siaimes commented Nov 5, 2021

It seems that there is already a Kubernetes service running; you can try to reset it with kubelet reset.

zsh4614 commented Nov 5, 2021

It seems that there is already a Kubernetes service running; you can try to reset it with kubelet reset.

Thank you very much for your patience!
kubelet reset is an invalid command. I reset k8s on the master and worker according to this, but the problem still exists.

siaimes commented Nov 7, 2021

If your machine has been used for other purposes before, it is recommended that you format it and reinstall a fresh Ubuntu 18.04 LTS.

zsh4614 commented Nov 8, 2021

If your machine has been used for other purposes before, it is recommended that you format it and reinstall a fresh Ubuntu 18.04 LTS.

Thanks, does it have anything to do with the dev-box environment? Both my master and worker are freshly reinstalled Ubuntu 18.04.6 server. I used another physical machine running Ubuntu 16.04 desktop as the dev-box for the installation. That machine was previously the dev-box for other clusters, and following the same process the installation eventually succeeded; the difference is that all the machines were Ubuntu 16.04 then. I have not changed the dev-box environment since. Should I reinitialize the environment of the dev-box machine?

siaimes commented Nov 8, 2021

The dev-box also needs Ubuntu 18.04; a brand new install is of course better.

zsh4614 commented Nov 17, 2021

I started an Ubuntu 18.04 container as a dev-box on the Ubuntu 16.04 machine. After installing the required environment, the following error occurred:

TASK [set ansible control host IP fact] ****************************************************************************
fatal: [master]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_default_ipv4'\n\nThe error appears to be in '/home/openpai-87-test/pai/contrib/kubespray/environment-check.yml': line 14, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n    - name: set ansible control host IP fact\n      ^ here\n"}
fatal: [worker1]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_default_ipv4'\n\nThe error appears to be in '/home/openpai-87-test/pai/contrib/kubespray/environment-check.yml': line 14, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n    - name: set ansible control host IP fact\n      ^ here\n"}

PLAY RECAP *********************************************************************************************************
localhost                  : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
master                     : ok=3    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   
worker1                    : ok=3    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   
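
The ansible_default_ipv4 fact is only available when fact gathering succeeds and the host has a default route. A minimal way to see which hosts are missing it, assuming you run this from the dev-box with your own inventory path (the path below is a placeholder), might be:

# Gather only the default-IPv4 fact from every host
ansible -i /path/to/your/inventory all -m setup -a 'filter=ansible_default_ipv4'

# On any host where the fact comes back empty, check that a default route exists
ip route show default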
