
Issue with Registering the Member Cluster #4963

Open

amacharya opened this issue May 19, 2024 · 27 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@amacharya commented May 19, 2024

I have created AWS EKS K8s clusters (K8s v1.29)

git clone https://github.com/karmada-io/karmada.git

helm install karmada -n karmada-system --create-namespace --dependency-update ./charts/karmada
Getting updates for unmanaged Helm repositories...
...Successfully got an update from the "https://charts.bitnami.com/bitnami" chart repository
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "jetstack" chart repository
Update Complete. ⎈Happy Helming!⎈
Saving 1 charts
Downloading common from repo https://charts.bitnami.com/bitnami
Deleting outdated charts
NAME: karmada
LAST DEPLOYED: Sat May 18 21:52:59 2024
NAMESPACE: karmada-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
karmadactl version
karmadactl version: version.Info{GitVersion:"v1.9.1", GitCommit:"b57bff17d6133deb26d9c319714170a915d4fa54", GitTreeState:"clean", BuildDate:"2024-04-30T02:03:53Z", GoVersion:"go1.20.11", Compiler:"gc", Platform:"darwin/arm64"}

Cluster 1: karmada-main

kubectl -n karmada-system get pods
NAME                                               READY   STATUS    RESTARTS      AGE
etcd-0                                             1/1     Running   0             20m
karmada-aggregated-apiserver-79f6bdb5b9-qhbwd      1/1     Running   2 (20m ago)   20m
karmada-apiserver-6d97b54c47-lxv7w                 1/1     Running   0             20m
karmada-controller-manager-6965d94dc4-d6psr        1/1     Running   3 (20m ago)   20m
karmada-kube-controller-manager-5d4795ff87-24sn5   1/1     Running   2 (20m ago)   20m
karmada-scheduler-85bcf46665-gzl6l                 1/1     Running   0             20m
karmada-webhook-7bbb7ddb98-5mmx2                   1/1     Running   0             20m

Created CRDs

-path-/KARMADA/karmada/charts/karmada/_crds

Changed caBundle: {{caBundle}} to caBundle: '' in both files: webhook_in_clusterresourcebindings.yaml and webhook_in_resourcebindings.yaml
kubectl apply -k .                        
# Warning: 'patchesStrategicMerge' is deprecated. Please use 'patches' instead. Run 'kustomize edit fix' to update your Kustomization automatically.
customresourcedefinition.apiextensions.k8s.io/clusteroverridepolicies.policy.karmada.io configured
customresourcedefinition.apiextensions.k8s.io/clusterpropagationpolicies.policy.karmada.io configured
customresourcedefinition.apiextensions.k8s.io/clusterresourcebindings.work.karmada.io configured
customresourcedefinition.apiextensions.k8s.io/cronfederatedhpas.autoscaling.karmada.io configured
customresourcedefinition.apiextensions.k8s.io/federatedhpas.autoscaling.karmada.io configured
customresourcedefinition.apiextensions.k8s.io/federatedresourcequotas.policy.karmada.io configured
customresourcedefinition.apiextensions.k8s.io/multiclusteringresses.networking.karmada.io configured
customresourcedefinition.apiextensions.k8s.io/multiclusterservices.networking.karmada.io configured
customresourcedefinition.apiextensions.k8s.io/overridepolicies.policy.karmada.io configured
customresourcedefinition.apiextensions.k8s.io/propagationpolicies.policy.karmada.io configured
customresourcedefinition.apiextensions.k8s.io/remedies.remedy.karmada.io configured
customresourcedefinition.apiextensions.k8s.io/resourcebindings.work.karmada.io configured
customresourcedefinition.apiextensions.k8s.io/resourceinterpretercustomizations.config.karmada.io configured
customresourcedefinition.apiextensions.k8s.io/resourceinterpreterwebhookconfigurations.config.karmada.io configured
customresourcedefinition.apiextensions.k8s.io/serviceexports.multicluster.x-k8s.io unchanged
customresourcedefinition.apiextensions.k8s.io/serviceimports.multicluster.x-k8s.io unchanged
customresourcedefinition.apiextensions.k8s.io/workloadrebalancers.apps.karmada.io created
customresourcedefinition.apiextensions.k8s.io/works.work.karmada.io configured

kubectl -n karmada-system get all                                                        
NAME                                                   READY   STATUS    RESTARTS      AGE
pod/etcd-0                                             1/1     Running   0             48m
pod/karmada-aggregated-apiserver-79f6bdb5b9-qhbwd      1/1     Running   2 (48m ago)   48m
pod/karmada-apiserver-6d97b54c47-lxv7w                 1/1     Running   0             48m
pod/karmada-controller-manager-6965d94dc4-d6psr        1/1     Running   3 (48m ago)   48m
pod/karmada-kube-controller-manager-5d4795ff87-24sn5   1/1     Running   2 (48m ago)   48m
pod/karmada-scheduler-85bcf46665-gzl6l                 1/1     Running   0             48m
pod/karmada-webhook-7bbb7ddb98-5mmx2                   1/1     Running   0             48m

NAME                                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/etcd                           ClusterIP   None             <none>        2379/TCP,2380/TCP   48m
service/etcd-client                    ClusterIP   xxx.xx.xxx.xxx   <none>        2379/TCP            48m
service/karmada-aggregated-apiserver   ClusterIP   xxx.xx.xxx.xxx   <none>        443/TCP             48m
service/karmada-apiserver              ClusterIP   xxx.xx.xxx.xxx   <none>        5443/TCP            48m
service/karmada-webhook                ClusterIP   xxx.xx.xxx.xx      <none>        443/TCP             48m

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/karmada-aggregated-apiserver      1/1     1            1           48m
deployment.apps/karmada-apiserver                 1/1     1            1           48m
deployment.apps/karmada-controller-manager        1/1     1            1           48m
deployment.apps/karmada-kube-controller-manager   1/1     1            1           48m
deployment.apps/karmada-scheduler                 1/1     1            1           48m
deployment.apps/karmada-webhook                   1/1     1            1           48m

NAME                                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/karmada-aggregated-apiserver-79f6bdb5b9      1         1         1       48m
replicaset.apps/karmada-apiserver-6d97b54c47                 1         1         1       48m
replicaset.apps/karmada-controller-manager-6965d94dc4        1         1         1       48m
replicaset.apps/karmada-kube-controller-manager-5d4795ff87   1         1         1       48m
replicaset.apps/karmada-scheduler-85bcf46665                 1         1         1       48m
replicaset.apps/karmada-webhook-7bbb7ddb98                   1         1         1       48m

NAME                    READY   AGE
statefulset.apps/etcd   1/1     48m

Cluster 2: karmada-member-1

 kubectl get pods -A
NAMESPACE     NAME                           READY   STATUS    RESTARTS   AGE
kube-system   aws-node-5rdwc                 2/2     Running   0          2m25s
kube-system   aws-node-js5t6                 2/2     Running   0          2m29s
kube-system   coredns-54d6f577c6-kp72z       1/1     Running   0          9m7s
kube-system   coredns-54d6f577c6-ngfhs       1/1     Running   0          9m7s
kube-system   eks-pod-identity-agent-dfm7d   1/1     Running   0          2m29s
kube-system   eks-pod-identity-agent-qfzq9   1/1     Running   0          2m29s
kube-system   kube-proxy-9c9cp               1/1     Running   0          3m40s
kube-system   kube-proxy-zjxzc               1/1     Running   0          3m41s

kubectl config use-context arn:aws:eks:us-east-1:xxxxx:cluster/karmada-main

karmadactl join karmada-member-1 --cluster-kubeconfig=/xxxx/xxxx/.kube/config

Error from server (NotFound): the server could not find the requested resource (get clusters.cluster.karmada.io)

Any help would be greatly appreciated

@amacharya (Author)

Hello @RainbowMango

Can you help me with this?

@RainbowMango (Member)

Error from server (NotFound): the server could not find the requested resource (get clusters.cluster.karmada.io)

That's because you are trying to join the cluster to the host cluster where the Karmada control plane is running. You should use the karmada-apiserver's kubeconfig, which you can export per Export kubeconfig.
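For reference, the export boils down to something like the following (a minimal sketch; the full steps, including rewriting the server address, are given later in this thread):

# dump the kubeconfig that the chart stores in a secret
kubectl get secret -n karmada-system karmada-kubeconfig -o jsonpath={.data.kubeconfig} | base64 -d >~/.kube/karmada-apiserver.config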

@amacharya (Author) commented May 20, 2024

Error from server (NotFound): the server could not find the requested resource (get clusters.cluster.karmada.io)

That's because you are trying to join the cluster to the host cluster where the Karmada control plane is running. You should use the karmada-apiserver's kubeconfig, which you can export per Export kubeconfig.

@RainbowMango Thank you for helping me!
I followed it, but ran into an issue: Unable to connect to the server: dial tcp: lookup karmada-apiserver.karmada-system.svc.cluster.local: no such host

@RainbowMango (Member)

That is because the karmada-apiserver service wasn't exposed outside the cluster. It will work if you run karmadactl join on any node of the host cluster.

For the case of running this command outside the cluster: @chaosi-zju, is there any document on how to expose karmada-apiserver?
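(One possible way to expose it, as a sketch rather than an official method: switch the Service type so the cloud provider provisions an external endpoint. Note that the certificate SANs must cover whatever address you end up using.)

# on EKS this should provision an ELB for the service
kubectl -n karmada-system patch svc karmada-apiserver -p '{"spec":{"type":"LoadBalancer"}}'
# read back the EXTERNAL-IP and put it into the kubeconfig's server: field
kubectl -n karmada-system get svc karmada-apiserver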

@amacharya (Author)

I tried changing the service to NodePort and updated the config file with https://node-ip:port, but:

Unable to connect to the server: dial tcp 10.0.2.39:32629: i/o timeout

@chaosi-zju (Member) commented May 21, 2024

Hi @amacharya, it seems that you are trying to install Karmada via the helm chart.

For an efficient installation, I recommend following these steps:


Prerequisite: this assumes you have already installed the host k8s cluster and the member k8s clusters, and that the clusters have no network problems pulling images.

Step 1: get the host network ip (node ip) from kube-apiserver, then add this ip to values.yaml as a SAN of the certificate (you can do this by executing the following commands at the Karmada repo root dir):

export KUBECONFIG=~/.kube/karmada-host.config
HOST_IP=$(kubectl get ep kubernetes -o jsonpath='{.subsets[0].addresses[0].ip}')

sed -i'' -e "/localhost/{n;s/      \"127.0.0.1/      \"${HOST_IP}\",\n&/g}" ./charts/karmada/values.yaml

You will see your karmada host node ip added to values.yaml.
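You can verify the insertion like this (a small sketch; 10.0.x.x stands in for your node ip):

# print the line following "localhost" in the certificate hosts list
grep -A1 '"localhost"' ./charts/karmada/values.yaml
# expected (illustrative):
#       "localhost",
#       "10.0.x.x",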

Step 2: install karmada on the host cluster via helm:

helm install karmada -n karmada-system \
  --kubeconfig ~/.kube/karmada-host.config \
  --create-namespace \
  --dependency-update \
  --set apiServer.hostNetwork=true \
  ./charts/karmada

Pay attention: we set apiServer.hostNetwork=true; installing in this mode will help you avoid many small problems.

Step 3: export the kubeconfig of karmada-apiserver to a local path:

kubectl get secret -n karmada-system karmada-kubeconfig -o jsonpath={.data.kubeconfig} | base64 -d >~/.kube/karmada-apiserver.config
KARMADA_APISERVER_ADDR=$(kubectl get ep karmada-apiserver -n karmada-system | tail -n 1 | awk '{print $2}')
sed -i'' -e "s/karmada-apiserver.karmada-system.svc.cluster.local:5443/${KARMADA_APISERVER_ADDR}/g" ~/.kube/karmada-apiserver.config

The kubeconfig of karmada-apiserver will be written to ~/.kube/karmada-apiserver.config; you can check whether this kubeconfig works.
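For example (the same sanity check used later in this thread):

# should print the Karmada control plane address instead of timing out
kubectl --kubeconfig ~/.kube/karmada-apiserver.config cluster-info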

Step 4: join member clusters:

# download karmadactl if not exist
if ! which karmadactl >/dev/null 2>&1; then
  curl -s https://raw.githubusercontent.com/karmada-io/karmada/master/hack/install-cli.sh | sudo bash
fi

# join member1 and member2 to karmada in push mode
karmadactl join member1 --kubeconfig ~/.kube/karmada-apiserver.config --karmada-context karmada-apiserver --cluster-kubeconfig ~/.kube/members.config --cluster-context member1
karmadactl join member2 --kubeconfig ~/.kube/karmada-apiserver.config --karmada-context karmada-apiserver --cluster-kubeconfig ~/.kube/members.config --cluster-context member2

Replace the cluster names member1 and member2 (and the context names) with your real ones; a sketch for listing them follows.
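To find the real context names, you can list what the kubeconfig contains (a small sketch):

# list the context names available in your members kubeconfig
kubectl config get-contexts --kubeconfig ~/.kube/members.config -o name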

Step 5: check whether the clusters are ready:

kubectl --kubeconfig ~/.kube/karmada-apiserver.config get cluster -o wide

My installation method can help you avoid many minor problems, so I strongly recommend trying the steps above. If you still have any questions, please feel free to keep asking me.

@amacharya (Author)

Hi @RainbowMango @chaosi-zju

@chaosi-zju - On a fresh AWS EKS environment, I redeployed and followed your steps.

Karmada deployed correctly on the main cluster.

 kubectl get pods -A
NAMESPACE        NAME                                               READY   STATUS    RESTARTS        AGE
karmada-system   etcd-0                                             1/1     Running   3 (2m49s ago)   41m
karmada-system   karmada-aggregated-apiserver-79f6bdb5b9-sm467      1/1     Running   2 (41m ago)     41m
karmada-system   karmada-apiserver-5bd55dfcff-vd56t                 1/1     Running   0               41m
karmada-system   karmada-controller-manager-6965d94dc4-lbzlj        1/1     Running   2 (41m ago)     41m
karmada-system   karmada-kube-controller-manager-5d4795ff87-qfqbf   1/1     Running   2 (41m ago)     41m
karmada-system   karmada-scheduler-85bcf46665-q6rqq                 1/1     Running   0               41m
karmada-system   karmada-webhook-7bbb7ddb98-zlhd2                   1/1     Running   0               41m
kube-system      aws-node-vqcx2                                     2/2     Running   0               65m
kube-system      aws-node-xbw75                                     2/2     Running   0               65m
kube-system      coredns-dd6c5d584-4792q                            1/1     Running   0               67m
kube-system      coredns-dd6c5d584-zlgt4                            1/1     Running   0               72m
kube-system      eks-pod-identity-agent-8gts7                       1/1     Running   0               65m
kube-system      eks-pod-identity-agent-8nw2m                       1/1     Running   0               65m
kube-system      kube-proxy-f48mk                                   1/1     Running   0               65m
kube-system      kube-proxy-h4m7p                                   1/1     Running   0               65m
kubectl get endpoints -n karmada-system karmada-apiserver
NAME                ENDPOINTS          AGE
karmada-apiserver   MY_KARMADA_APISERVER_ADDR:5443   27m

but I still have an issue:

 karmadactl join karmada-member-1 --kubeconfig ~/.kube/karmada-apiserver.config --karmada-context karmada-apiserver --cluster-kubeconfig ~/.kube/members.config --cluster-context arn:aws:eks:eu-central-1:613829453723:cluster/karmada-member-1
 
Unable to connect to the server: dial tcp MY_KARMADA_APISERVER_ADDR:5443: i/o timeout

@amacharya (Author)

@RainbowMango @chaosi-zju

If I ssh into my main-cluster or member-cluster node and test the connection from there, it works.

telnet 10.0.19.109 5443
Trying 10.0.19.109...
Connected to 10.0.19.109..
...
....

I changed the karmada-apiserver svc to NodePort, which also did not work.

At last I tried port forwarding, and it worked.

kubectl port-forward -n karmada-system svc/karmada-apiserver 5443:5443

In ~/.kube/karmada-apiserver.config, I changed server to https://localhost:5443.

kubectl cluster-info
Kubernetes control plane is running at https://localhost:5443

It is a temporary solution, but nothing else is working apart from this.

Is it a potential bug?

Or are there specific steps we need to follow to deploy Karmada on AWS EKS clusters to join them and perform workload testing?

@chaosi-zju (Member)

Hi @amacharya

Before executing

 karmadactl join karmada-member-1 --kubeconfig ~/.kube/karmada-apiserver.config --karmada-context karmada-apiserver --cluster-kubeconfig ~/.kube/members.config --cluster-context arn:aws:eks:eu-central-1:613829453723:cluster/karmada-member-1

can you check whether you can successfully connect to the karmada-apiserver with ~/.kube/karmada-apiserver.config, and to the member cluster with ~/.kube/members.config?

just like:

# check whether you can connect to karmada-apiserver
kubectl --kubeconfig ~/.kube/karmada-apiserver.config --context karmada-apiserver cluster-info

# check whether you can connect to member cluster apiserver
kubectl --kubeconfig ~/.kube/members.config --context arn:aws:eks:eu-central-1:613829453723:cluster/karmada-member-1 cluster-info

@amacharya (Author)

Hello @chaosi-zju

`kubectl --kubeconfig ~/.kube/karmada-apiserver.config --context karmada-apiserver cluster-info`
E0521 21:39:53.825351   95910 memcache.go:265] couldn't get current server API group list: Get xxxx..xx..xxxxxxxxx:5443: i/o timeout
kubectl --kubeconfig ~/.kube/members.config --context arn:aws:eks:eu-central-1:613829453723:cluster/karmada-member-1 cluster-info
Kubernetes control plane is running at xxxxxxxxxxx
CoreDNS is running at xxxxxxxxxxxx/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

@chaosi-zju (Member)

So it seems that your ~/.kube/karmada-apiserver.config is not valid; maybe the apiserver address in it (server: https://xx.xx.xx.xx:5443) is not right.

Let's review Step 3 mentioned above (export the kubeconfig of karmada-apiserver to a local path):

kubectl get secret -n karmada-system karmada-kubeconfig -o jsonpath={.data.kubeconfig} | base64 -d >~/.kube/karmada-apiserver.config
KARMADA_APISERVER_ADDR=$(kubectl get ep karmada-apiserver -n karmada-system | tail -n 1 | awk '{print $2}')
sed -i'' -e "s/karmada-apiserver.karmada-system.svc.cluster.local:5443/${KARMADA_APISERVER_ADDR}/g" ~/.kube/karmada-apiserver.config

Maybe we made some mistake in this step.

Since we installed the karmada-apiserver using host network mode, can you execute:

kubectl --context karmada-host get po -A -o wide

and check:

  • whether the IP of your karmada-apiserver pod is your node ip?
  • whether this ip is the same as the apiserver address in your ~/.kube/karmada-apiserver.config?
  • whether you can ping this ip from where you execute karmadactl join? (A compact version of these checks is sketched below.)
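A compact version of these three checks (a sketch; replace <karmada-apiserver-node-ip> with the ip reported by the first two checks):

# 1) the pod IP of karmada-apiserver should equal the node ip (hostNetwork mode)
kubectl --context karmada-host get po -n karmada-system -o wide | grep apiserver
# 2) the server address in the exported kubeconfig should use that same ip
grep 'server:' ~/.kube/karmada-apiserver.config
# 3) run this from wherever you execute karmadactl join
ping -c 3 <karmada-apiserver-node-ip>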

@amacharya (Author) commented May 22, 2024

So it seems that your ~/.kube/karmada-apiserver.config is not valid; maybe the apiserver address in it (server: https://xx.xx.xx.xx:5443) is not right.

Let's review Step 3 mentioned above (export the kubeconfig of karmada-apiserver to a local path):

kubectl get secret -n karmada-system karmada-kubeconfig -o jsonpath={.data.kubeconfig} | base64 -d >~/.kube/karmada-apiserver.config
KARMADA_APISERVER_ADDR=$(kubectl get ep karmada-apiserver -n karmada-system | tail -n 1 | awk '{print $2}')
sed -i'' -e "s/karmada-apiserver.karmada-system.svc.cluster.local:5443/${KARMADA_APISERVER_ADDR}/g" ~/.kube/karmada-apiserver.config

Maybe we made some mistake in this step.

I re-ran it, but I am not seeing any error; I am getting the same result as before.

Since we installed the karmada-apiserver using host network mode, can you execute:

kubectl --context karmada-host get po -A -o wide

and check:

  • whether the IP of your karmada-apiserver pod is your node ip?

Yes, it is the same value that we added in karmada-apiserver.config (https://<node-ip>:5443).

  • whether this ip is the same as the apiserver address in your ~/.kube/karmada-apiserver.config?

Yes, it is the same value that we added in karmada-apiserver.config (https://<node-ip>:5443).

  • whether you can ping this ip from where you execute karmadactl join?

No; I tried with telnet too (ref: #4963 (comment)).

@chaosi-zju (Member) commented May 22, 2024

  • whether you can ping this ip from where you execute karmadactl join?

No; I tried with telnet too (ref: #4963 (comment)).

I don't know much about your network situation; maybe it's a network problem? In that case, maybe you should copy karmada-apiserver.config to a machine where you can ping the ip of karmada-apiserver and execute karmadactl join from there, as sketched below.
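For example (a sketch reusing the kubeconfig paths from this thread; the SSH user, node address, and member context are placeholders):

# copy both kubeconfigs onto a host-cluster node that can reach the apiserver
scp ~/.kube/karmada-apiserver.config ~/.kube/members.config ec2-user@<host-node-ip>:~/
ssh ec2-user@<host-node-ip>

# then join from that node
karmadactl join karmada-member-1 \
  --kubeconfig ~/karmada-apiserver.config --karmada-context karmada-apiserver \
  --cluster-kubeconfig ~/members.config --cluster-context <member-context>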


Take my test environment as an example:

$ kubectl --context karmada-host get po -A -o wide
NAMESPACE            NAME                                                          READY   STATUS    RESTARTS      AGE   IP            NODE                         NOMINATED NODE   READINESS GATES
karmada-system       etcd-0                                                        1/1     Running   0             12h   10.230.0.65   karmada-host-control-plane   <none>           <none>
karmada-system       karmada-9a89z2j4bu-aggregated-apiserver-6b45b5ddcf-bm6r7      1/1     Running   2 (12h ago)   12h   10.230.0.63   karmada-host-control-plane   <none>           <none>
karmada-system       karmada-9a89z2j4bu-apiserver-54bb5b7c95-frt7w                 1/1     Running   0             12h   172.18.0.2    karmada-host-control-plane   <none>           <none>
karmada-system       karmada-9a89z2j4bu-controller-manager-74bfdb44c8-pnk8h        1/1     Running   3 (12h ago)   12h   10.230.0.62   karmada-host-control-plane   <none>           <none>
karmada-system       karmada-9a89z2j4bu-kube-controller-manager-684f8f7949-ggsv9   1/1     Running   2 (12h ago)   12h   10.230.0.64   karmada-host-control-plane   <none>           <none>
karmada-system       karmada-9a89z2j4bu-scheduler-698ff8bdf7-rlsdg                 1/1     Running   0             12h   10.230.0.61   karmada-host-control-plane   <none>           <none>
karmada-system       karmada-9a89z2j4bu-webhook-6d8cd98fbf-wlgps                   1/1     Running   0             12h   10.230.0.60   karmada-host-control-plane   <none>           <none>
kube-system          coredns-5d78c9869d-74224                                      1/1     Running   0             17h   10.230.0.2    karmada-host-control-plane   <none>           <none>
kube-system          coredns-5d78c9869d-q5scn                                      1/1     Running   0             17h   10.230.0.4    karmada-host-control-plane   <none>           <none>
kube-system          etcd-karmada-host-control-plane                               1/1     Running   0             17h   172.18.0.2    karmada-host-control-plane   <none>           <none>
kube-system          kindnet-5m2fv                                                 1/1     Running   0             17h   172.18.0.2    karmada-host-control-plane   <none>           <none>
kube-system          kube-apiserver-karmada-host-control-plane                     1/1     Running   0             17h   172.18.0.2    karmada-host-control-plane   <none>           <none>
kube-system          kube-controller-manager-karmada-host-control-plane            1/1     Running   0             17h   172.18.0.2    karmada-host-control-plane   <none>           <none>
kube-system          kube-proxy-pl4vt                                              1/1     Running   0             17h   172.18.0.2    karmada-host-control-plane   <none>           <none>
kube-system          kube-scheduler-karmada-host-control-plane                     1/1     Running   0             17h   172.18.0.2    karmada-host-control-plane   <none>           <none>
local-path-storage   local-path-provisioner-6bc4bddd6b-2gg5r                       1/1     Running   0             17h   10.230.0.3    karmada-host-control-plane   <none>           <none>

My karmada-host cluster has only a single node; its node ip is 172.18.0.2. You can see that the pod IP of my karmada-9a89z2j4bu-apiserver-54bb5b7c95-frt7w is also 172.18.0.2.

$ cat ~/.kube/karmada-apiserver.config 
apiVersion: v1
kind: Config
clusters:
  - cluster:
      certificate-authority-data: LS0tLS1CRUd...
      server: https://172.18.0.2:5443
    name: karmada-9a89z2j4bu-apiserver
...

you can see server: https://172.18.0.2:5443

$ ping 172.18.0.2
PING 172.18.0.2 (172.18.0.2) 56(84) bytes of data.
64 bytes from 172.18.0.2: icmp_seq=1 ttl=64 time=0.049 ms
^C
--- 172.18.0.2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1006ms
rtt min/avg/max/mdev = 0.039/0.044/0.049/0.005 ms

$ curl -k https://172.18.0.2:5443/version
{
  "major": "1",
  "minor": "27",
  "gitVersion": "v1.27.11",
  "gitCommit": "b9e2ad67ad146db566be5a6db140d47e52c8adb2",
  ...
}# 

You can see that my ip 172.18.0.2 is reachable.

@amacharya (Author) commented May 22, 2024

@chaosi-zju

Regarding the network: I selected everything as public for this initial trial, and my worker nodes (in both clusters) are also running on a public subnet.

Have you created the CRDs as well?

-path-/KARMADA/karmada/charts/karmada/_crds

Changed caBundle: {{caBundle}} to caBundle: '' in both files: webhook_in_clusterresourcebindings.yaml and webhook_in_resourcebindings.yaml

Have you set up an AWS Client VPN or a bastion host to be able to ping the private IP of the node where the karmada-apiserver is running?

Would it be possible for me to give you the exact steps of how I deploy the AWS EKS clusters (main and member, including VPC and subnets)? It would be helpful for checking what exactly is missing.

@amacharya (Author)

@chaosi-zju

Btw, would it be possible to schedule a call? It would be really helpful! - TIA

@chaosi-zju (Member)

Have you created the CRDs as well?

-path-/KARMADA/karmada/charts/karmada/_crds
Changed caBundle: {{caBundle}} to caBundle: '' in both files: webhook_in_clusterresourcebindings.yaml and webhook_in_resourcebindings.yaml

No. Please note: do not manually apply the CRDs, and do not change caBundle.

Those actions are done automatically by charts/karmada/templates/pre-install-job.yaml and charts/karmada/templates/post-install-job.yaml. (The files webhook_in_clusterresourcebindings.yaml and webhook_in_resourcebindings.yaml are actually not used in the helm install method.)

- /bin/sh
- -c
- |
  bash <<'EOF'
  set -ex
  kubectl apply -k /crds --kubeconfig /etc/kubeconfig
  kubectl apply -f /static-resources --kubeconfig /etc/kubeconfig
  EOF
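If the CRDs seem to be missing, one way to check is to inspect those jobs (a sketch; the exact job names depend on the chart release):

# list the install jobs created by the chart, then read the relevant log
kubectl -n karmada-system get jobs
kubectl -n karmada-system logs job/<pre-install-job-name>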

Have you set up an AWS Client VPN or a bastion host to be able to ping the private IP of the node where the karmada-apiserver is running?

No, my environment is local; the cluster is on a local network. I installed Karmada on my local PC using this method.


What makes me curious is why you can use karmada-host.config (the kubeconfig of the karmada host cluster) normally, but get a network error when using karmada-apiserver.config. Not to mention that you have a public IP.

Can you please do the following checks again:

# check whether you can connect to karmada-host
kubectl --kubeconfig ~/.kube/karmada-host.config --context karmada-host cluster-info

# check whether you can connect to karmada-apiserver
kubectl --kubeconfig ~/.kube/karmada-apiserver.config --context karmada-apiserver cluster-info

Then we need to check the following things:

  1. What is the output of these two commands? (For IP sensitivity, you can replace the IPs with x.x.x.x or y.y.y.y.)
  2. Is the apiserver ip in ~/.kube/karmada-apiserver.config the same as the apiserver ip in ~/.kube/karmada-host.config? In general they are the same ip and only the port differs (the former is x.x.x.x:5443 while the latter is x.x.x.x:6443). So if the ip is different, modify the ip in ~/.kube/karmada-apiserver.config to match the ip in ~/.kube/karmada-host.config and try again.
  3. Check your firewall to see whether port 5443 is open. (A quick probe is sketched below.)
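A quick way to run checks 2 and 3 (a sketch; nc is assumed to be available, and x.x.x.x stands for the apiserver ip):

# compare the server: lines of the two kubeconfigs side by side
grep 'server:' ~/.kube/karmada-host.config ~/.kube/karmada-apiserver.config
# probe port 5443 from the machine where you run karmadactl
nc -vz x.x.x.x 5443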

@amacharya (Author)

kubectl --kubeconfig ~/.kube/karmada-main.config --context arn:aws:eks:eu-central-1:xxxxxx:cluster/karmada-main cluster-info

Kubernetes control plane is running at xxxxx
CoreDNS is running at xxxxxx/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
kubectl --kubeconfig ~/.kube/karmada-apiserver.config --context karmada-apiserver cluster-info

E0522 22:54:10.392360   51999 memcache.go:265] couldn't get current server API group list: Get "https://xxxxx:5443/api?timeout=32s": dial tcp xxxxx:5443: i/o timeout

My AWS EKS cluster has been deployed on a public subnet with no restriction on traffic (inbound or outbound SG).

I am able to ping the public IPv4 address of my instance, but I am not able to ping the private IPv4 address of the instance where the karmada-apiserver is running.
As I mentioned above, with the temporary port-forwarding solution joining clusters works, but I again have issues with unjoining clusters.

Since you mentioned you deployed on a local machine, that is what I suspected.

Do you consider this a potential bug when deploying on the AWS EKS env?

@amacharya (Author)

Regarding the CRDs:

On a fresh new deployment, the karmada-controller-manager pod is crashing:

I0523 04:11:43.314805       1 detector.go:188] Stopped as stopCh closed.
I0523 04:11:43.314853       1 controller.go:240] "Shutdown signal received, waiting for all workers to finish" controller="federatedresourcequota" controllerGroup="policy.karmada.io" controllerKind="FederatedResourceQuota"
I0523 04:11:43.314885       1 controller.go:242] "All workers finished" controller="federatedresourcequota" controllerGroup="policy.karmada.io" controllerKind="FederatedResourceQuota"
I0523 04:11:43.314944       1 internal.go:546] "Stopping and waiting for caches"
I0523 04:11:43.317516       1 internal.go:550] "Stopping and waiting for webhooks"
I0523 04:11:43.319506       1 internal.go:553] "Stopping and waiting for HTTP servers"
I0523 04:11:43.319592       1 server.go:43] "shutting down server" kind="health probe" addr="[::]:10357"
I0523 04:11:43.319653       1 server.go:251] "Shutting down metrics server with timeout of 1 minute" logger="controller-runtime.metrics"
I0523 04:11:43.319880       1 internal.go:557] "Wait completed, proceeding to shutdown the manager"
E0523 04:11:43.319971       1 controllermanager.go:199] controller manager exits unexpectedly: failed to wait for workload-rebalancer caches to sync: timed out waiting for cache to be synced for Kind *v1alpha1.WorkloadRebalancer
E0523 04:11:43.320462       1 run.go:74] "command failed" err="failed to wait for workload-rebalancer caches to sync: timed out waiting for cache to be synced for Kind *v1alpha1.WorkloadRebalancer"

@chaosi-zju (Member) commented May 23, 2024

On a fresh new deployment, the karmada-controller-manager pod is crashing:
failed to wait for workload-rebalancer caches to sync: timed out waiting for cache to be synced for Kind *v1alpha1.WorkloadRebalancer

Hi, some other users have encountered the same problem; you can refer to #4927.

@amacharya (Author)

I am already using v1.9.1

@amacharya (Author)

Also, regarding this issue: do you consider this a potential bug when deploying on the AWS EKS env? (My connectivity details are in the earlier comment above.)

@chaosi-zju (Member) commented May 23, 2024

I am already using v1.9.1

Hi, as I said in #4927 (comment), the workload-rebalancer CRD was introduced in v1.10; it shouldn't appear in the v1.9.1 controller-manager image.
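One way to double-check what is actually running (a sketch using resource names that appear earlier in this thread):

# is the WorkloadRebalancer CRD present in the Karmada control plane?
kubectl --kubeconfig ~/.kube/karmada-apiserver.config get crd workloadrebalancers.apps.karmada.io
# which controller-manager image tag is the host cluster really running?
kubectl -n karmada-system get deploy karmada-controller-manager -o jsonpath='{.spec.template.spec.containers[0].image}'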

@chaosi-zju (Member)

Do you consider this a potential bug when deploying on the AWS EKS env?

Maybe, since we may not have tested on the AWS EKS env. So we attach great importance to your case and hope to find our shortcomings through it.

@chaosi-zju (Member)

As you said above:

If I ssh into my main-cluster or member-cluster node and test the connection from there, it works.

and

My AWS EKS cluster has been deployed on a public subnet with no restriction on traffic (inbound or outbound SG). I am able to ping the public IPv4 address of my instance, but I am not able to ping the private IPv4 address of the instance where the karmada-apiserver is running.

So I wonder: if you ssh into your main-cluster node, from there you can connect to the private IPv4 address where the karmada-apiserver is running, and you can also connect to the public IPv4 address where the member cluster apiserver is running. So if you copy karmada-apiserver.config and members.config to your main-cluster node and execute the karmadactl join command right there, will it succeed? As I see it, there shouldn't be any network error that way, am I right?

@amacharya (Author)

I am already using v1.9.1

Hi, as I said in #4927 (comment), the workload-rebalancer CRD was introduced in v1.10; it shouldn't appear in the v1.9.1 controller-manager image.

@chaosi-zju
I tried the steps you gave in #4927 (comment), but the karmada-controller-manager pod is crashing with the same issue!

@amacharya (Author)

Do you consider this a potential bug when deploying on the AWS EKS env?

Maybe, since we may not have tested on the AWS EKS env. So we attach great importance to your case and hope to find our shortcomings through it.

Many thanks @chaosi-zju - It would be a really great help! 
I'm looking forward to installing Karmada on an AWS EKS env with the upcoming release version.

@RainbowMango (Member)

@chaosi-zju

Btw, would it be possible to schedule a call? It would be really helpful! - TIA

@amacharya Maybe we can have a chat at the community meeting. Please add an item to the meeting slot.
