IK.AM

@making's tech note


Tanzu Kubernetes Grid on vSphereを1.3.0 -> 1.3.1-patch1にアップデートするメモ

🗃 {Dev/CaaS/Kubernetes/TKG/vSphere}
🏷 Kubernetes 🏷 vSphere 🏷 TKG 🏷 Tanzu 🏷 Cluster API 
🗓 Updated at 2021-07-06T14:38:52Z  🗓 Created at 2021-07-06T14:35:48Z   🌎 English Page

⚠️ 本記事の内容はVMwareによってサポートされていません。 記事の内容で生じた問題については自己責任で対応し、 VMwareサポート窓口には問い合わせないでください

"Tanzu Kubernetes Grid on vSphereを1.2.1 -> 1.3.0にアップデートするメモ"で作成した環境をアップデートします。

アップデートの手順はこちら

1.3.0 -> 1.3.1へアップデートするのが遅れているうちに1.3.1-patch1というイレギュラーなバージョンがリリースされていました。 こちらへのアップデート方法は以下のKnowledge Baseでも説明されています。

https://kb.vmware.com/s/article/83781

Release Noteはこちら

https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid/1.3.1/rn/VMware-Tanzu-Kubernetes-Grid-131-Release-Notes.html

Table of contents

CLIのインストール

https://my.vmware.com/en/web/vmware/downloads/details?downloadGroup=TKG-131&productId=988&rPId=65946

から

  • VMware Tanzu CLI for Mac

をダウンロードして次のようにインストールしてください。1.3からCLIがtkgからtanzuに変わりました。

tar xvf tanzu-cli-bundle-v1.3.1-darwin-amd64.tar

install cli/core/v1.3.1/tanzu-core-darwin_amd64 /usr/local/bin/tanzu
tanzu plugin clean
tanzu plugin install -u --local cli all

次のバージョンで動作確認します。

$ tanzu version
version: v1.3.1
buildDate: 2021-05-07
sha: e5c37c4

プラグイン一覧を確認します。

$ tanzu plugin list
  NAME                LATEST VERSION  DESCRIPTION                                                        REPOSITORY  VERSION  STATUS         
  alpha               v1.3.1          Alpha CLI commands                                                 core                 not installed  
  cluster             v1.3.1          Kubernetes cluster operations                                      core        v1.3.1   installed      
  kubernetes-release  v1.3.1          Kubernetes release operations                                      core        v1.3.1   installed      
  login               v1.3.1          Login to the platform                                              core        v1.3.1   installed      
  management-cluster  v1.3.1          Kubernetes management cluster operations                           core        v1.3.1   installed      
  pinniped-auth       v1.3.1          Pinniped authentication operations (usually not directly invoked)  core        v1.3.1   installed   

vCenterにOVAファイルをアップロード

https://my.vmware.com/en/web/vmware/downloads/details?downloadGroup=TKG-131&productId=988&rPId=65946

から

  • Photon v3 Kubernetes v1.20.5 vmware.2 OVA

をダウンロードし、~/Downloadsに保存します。

TKGをインストールする環境の情報を次の例のように設定してください。

cat <<EOF > tkg-env.sh
export GOVC_URL=vcsa.v.maki.lol
export GOVC_USERNAME=administrator@vsphere.local
export GOVC_PASSWORD=VMware1!
export GOVC_DATACENTER=Datacenter
export GOVC_NETWORK="VM Network"
export GOVC_DATASTORE=datastore1
export GOVC_RESOURCE_POOL=/Datacenter/host/Cluster/Resources/tkg
export GOVC_INSECURE=1
export TEMPLATE_FOLDER=/Datacenter/vm/tkg
EOF

次のコマンドでOVAファイルをvCenterにアップロードします。

source tkg-env.sh
govc import.ova -folder $TEMPLATE_FOLDER ~/Downloads/photon-3-kube-v1.20.5-vmware.2-tkg.1-3176963957469777230.ova
govc vm.markastemplate $TEMPLATE_FOLDER/photon-3-kube-v1.20.5

Management Clusterをアップデート

次のコマンドを実行し、アップデート対象のManagement Clusterにログインします。

tanzu login

初回コマンド実行時に、

the old providers folder /Users/toshiaki/.tanzu/tkg/providers is backed up to /Users/toshiaki/.tanzu/tkg/providers-20210623113623-wb2lqf65

というようなログが出力されます。TKG 1.3.0の時に使用していた$HOME/.tanzu/tkg/providersはバックアップフォルダにrenameされ、 1.3.1用のファイルが$HOME/.tanzu/tkg/providersに用意されます。 この時、$HOME/.tanzu/tkg/providers/infrastructure-vsphere/ytt/vsphere-overlay.yamlのようなクラスタ構築時にカスタマイズしたytt overlayファイルは1.3.1に引き継がれないので、 1.3.0->1.3.1においても引き続き利用したいOverlayファイルは改めて$HOME/.tanzu/tkg/providersへ設定する必要があります。

cp -r $HOME/.tanzu/tkg/providers-20210623113623-wb2lqf65/infrastructure-vsphere/ytt/vsphere-overlay.yaml $HOME/.tanzu/tkg/providers/infrastructure-vsphere/ytt/vsphere-overlay.yaml

アップデート前のクラスタ一覧を確認します。Kubernetesのバージョンはv1.20.4+vmware.1です。

$ tanzu cluster list --include-management-cluster

  NAME    NAMESPACE   STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES       PLAN  
  apple   default     running  1/1           2/2      v1.20.4+vmware.1  <none>      dev   
  grape   default     running  1/1           3/3      v1.20.4+vmware.1  <none>      prod  
  lemon   default     running  1/1           1/1      v1.20.4+vmware.1  <none>      dev   
  orange  default     running  1/1           2/2      v1.20.4+vmware.1  <none>      dev   
  carrot  tkg-system  running  1/1           1/1      v1.20.4+vmware.1  management  dev    

次のコマンドでManagement Clusterをアップデートします。

今回は1.3.1ではなく1.3.1-patch1へのアップデートであるため、微調整が必要です。

sed -i.bak 's/v1.20.5+vmware.1-tkg.1/v1.20.5+vmware.2-tkg.1/'  $HOME/.tanzu/tkg/bom/tkg-bom-v1.3.1.yaml
export TKG_BOM_CUSTOM_IMAGE_TAG="v1.3.1-patch1"
tanzu management-cluster upgrade -v6 -y

次のようなログが出力されます。

Upgrading management cluster providers...
clusterctl upgrade apply options: {Kubeconfig:{Path:/Users/toshiaki/.kube-tkg/config Context:carrot-admin@carrot} ManagementGroup:capi-system/cluster-api Contract: CoreProvider:capi-system/cluster-api:v0.3.14 BootstrapProviders:[capi-kubeadm-bootstrap-system/kubeadm:v0.3.14] ControlPlaneProviders:[capi-kubeadm-control-plane-system/kubeadm:v0.3.14] InfrastructureProviders:[capv-system/vsphere:v0.7.7]}
Checking cert-manager version...
Deleting cert-manager Version="v0.11.0"
Installing cert-manager Version="v0.16.1"
Waiting for cert-manager to be available...
Performing upgrade...
Deleting Provider="cluster-api" Version="" TargetNamespace="capi-system"
Installing Provider="cluster-api" Version="v0.3.14" TargetNamespace="capi-system"
Deleting Provider="bootstrap-kubeadm" Version="" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="bootstrap-kubeadm" Version="v0.3.14" TargetNamespace="capi-kubeadm-bootstrap-system"
Deleting Provider="control-plane-kubeadm" Version="" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="control-plane-kubeadm" Version="v0.3.14" TargetNamespace="capi-kubeadm-control-plane-system"
Deleting Provider="infrastructure-vsphere" Version="" TargetNamespace="capv-system"
Installing Provider="infrastructure-vsphere" Version="v0.7.7" TargetNamespace="capv-system"
Waiting for provider infrastructure-vsphere
Waiting for provider cluster-api
Waiting for provider bootstrap-kubeadm
Waiting for provider control-plane-kubeadm
Waiting for resource capi-kubeadm-control-plane-controller-manager of type *v1.Deployment to be up and running
pods are not yet running for deployment 'capi-kubeadm-control-plane-controller-manager' in namespace 'capi-kubeadm-control-plane-system', retrying
Waiting for resource capv-controller-manager of type *v1.Deployment to be up and running
pods are not yet running for deployment 'capv-controller-manager' in namespace 'capv-system', retrying
Waiting for resource capi-kubeadm-bootstrap-controller-manager of type *v1.Deployment to be up and running
Waiting for resource capi-controller-manager of type *v1.Deployment to be up and running
Waiting for resource capi-kubeadm-bootstrap-controller-manager of type *v1.Deployment to be up and running
pods are not yet running for deployment 'capi-controller-manager' in namespace 'capi-system', retrying
Passed waiting on provider bootstrap-kubeadm after 100.585982ms
Waiting for resource capi-kubeadm-control-plane-controller-manager of type *v1.Deployment to be up and running
pods are not yet running for deployment 'capv-controller-manager' in namespace 'capv-system', retrying
Passed waiting on provider control-plane-kubeadm after 5.061307114s
Waiting for resource capi-controller-manager of type *v1.Deployment to be up and running
Passed waiting on provider cluster-api after 5.114968188s
pods are not yet running for deployment 'capv-controller-manager' in namespace 'capv-system', retrying
...
Waiting for resource capv-controller-manager of type *v1.Deployment to be up and running
Passed waiting on provider infrastructure-vsphere after 30.069141159s
Success waiting on all providers.
Management cluster providers upgraded successfully...
Upgrading management cluster kubernetes version...
Creating management cluster client...
Verifying kubernetes version...
Unable to detect current OS for the cluster. Using name:photon version:3 arch:amd64
Using OS options, name:photon version:3 arch:amd64
Retrieving configuration for upgrade cluster...
Create InfrastructureTemplate for upgrade...
Upgrading control plane nodes...
Patching KubeadmControlPlane with the kubernetes version v1.20.5+vmware.2...
Applying KubeadmControlPlane Patch: {
    "spec": {
      "version": "v1.20.5+vmware.2",
      "infrastructureTemplate": {
      "name": "carrot-control-plane-v1-20-5-vmware-2-21xat",
      "namespace": "tkg-system"
      },
      "kubeadmConfigSpec": {
      "clusterConfiguration": {
        "imageRepository" : "projects.registry.vmware.com/tkg",
        "dns": {
        "imageRepository": "projects.registry.vmware.com/tkg",
        "imageTag": "v1.7.0_vmware.8"
        },
        "etcd": {
        "local": {
          "dataDir": "/var/lib/etcd",
          "imageRepository": "projects.registry.vmware.com/tkg",
          "imageTag": "v3.4.13_vmware.7"
        }
        }
      }
      }
    }
    }
Applying patch to resource carrot-control-plane of type *v1alpha3.KubeadmControlPlane ...
patch cluster object with operation status: 
  {
    "metadata": {
      "annotations": {
        "TKGOperationInfo" : "{\"Operation\":\"Upgrade\",\"OperationStartTimestamp\":\"2021-07-06 12:32:39.968017 +0000 UTC\",\"OperationTimeout\":900}",
        "TKGOperationLastObservedTimestamp" : "2021-07-06 12:32:39.968017 +0000 UTC"
      }
    }
  }
Waiting for kubernetes version to be updated for control plane nodes
waiting for kubernetes version update, current kubernetes version v1.20.4+vmware.1 but expecting v1.20.5+vmware.2, retrying
control-plane is still being upgraded, reason:'RollingUpdateInProgress', message:'Rolling 1 replicas with outdated spec (1 replicas up to date)' , retrying
etcdserver: request timed out, retrying
control-plane is still being upgraded, reason:'RollingUpdateInProgress', message:'Rolling 1 replicas with outdated spec (1 replicas up to date)' , retrying
...
control-plane is still being upgraded, reason:'ScalingDown', message:'Scaling down control plane to 1 replicas (actual 2)' , retrying
...
Upgrading worker nodes...
Patching MachineDeployment with the kubernetes version v1.20.5+vmware.2...
Applying MachineDeployment Patch: {
      "spec": {
        "template": {
        "spec": {
          "version": "v1.20.5+vmware.2",
          "infrastructureRef": {
          "name": "carrot-worker-v1-19-3-vmware-1-muxc1-v1-20-4-vmware-1-5mlj6-v1-20-5-vmware-2-4xqok",
          "namespace": "tkg-system"
          }
        }
        }
      }
      }
Applying patch to resource carrot-md-0 of type *v1alpha3.MachineDeployment ...
Waiting for kubernetes version to be updated for worker nodes...
worker nodes are still being upgraded for MachineDeployment 'carrot-md-0', DesiredReplicas=1 Replicas=2 ReadyReplicas=1 UpdatedReplicas=1, retrying
...
worker machines [carrot-md-0-6fcdff5b49-qttmt] are still not upgraded, retrying
...
updating additional components: 'metadata/tkg' ...
updating additional components: 'addons-management/kapp-controller' ...
updating additional components: 'addons-management/tanzu-addons-manager' ...
updating additional components: 'tkr/tkr-controller' ...
Waiting for additional components to be up and running...
Waiting for resource tanzu-addons-controller-manager of type *v1.Deployment to be up and running
Waiting for resource tkr-controller-manager of type *v1.Deployment to be up and running
Waiting for resource kapp-controller of type *v1.Deployment to be up and running
Management cluster 'carrot' successfully upgraded to TKG version 'v1.3.1-patch1' with kubernetes version 'v1.20.5+vmware.2'

次のコマンドでManagement ClusterのみKubernetes v1.20.5+vmware.2にバージョンアップされたことがわかります。

$ tanzu cluster list --include-management-cluster

  NAME    NAMESPACE   STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES       PLAN  
  apple   default     running  1/1           2/2      v1.20.4+vmware.1  <none>      dev   
  grape   default     running  1/1           3/3      v1.20.4+vmware.1  <none>      prod  
  lemon   default     running  1/1           1/1      v1.20.4+vmware.1  <none>      dev   
  orange  default     running  1/1           2/2      v1.20.4+vmware.1  <none>      dev   
  carrot  tkg-system  running  1/1           1/1      v1.20.5+vmware.2  management  dev 

Management Clusterにkubectlコマンドでアクセスしてみます。

$ kubectl get node -o wide
NAME                         STATUS   ROLES                  AGE     VERSION            INTERNAL-IP     EXTERNAL-IP     OS-IMAGE                 KERNEL-VERSION   CONTAINER-RUNTIME
carrot-control-plane-kbfpl   Ready    control-plane,master   18m     v1.20.5+vmware.2   192.168.11.93   192.168.11.93   VMware Photon OS/Linux   4.19.189-5.ph3   containerd://1.4.4
carrot-md-0-cf8849f4-skldd   Ready    <none>                 9m42s   v1.20.5+vmware.2   192.168.11.94   192.168.11.94   VMware Photon OS/Linux   4.19.189-5.ph3   containerd://1.4.4

アップデート後はControlPlaneのDHCP予約の設定を再度行う必要があります。

Workload Clusterをアップデート

次にWorkload Clusterをアップデートします。

次のコマンドでWorkload Clusterを作成します。

tanzu cluster upgrade lemon -v6 -y 

次のようなログが出力されます。

Creating management cluster client...
Validating configuration...
Creating workload cluster client...
Retrieving credentials for workload cluster lemon 
Getting secret for cluster
Waiting for resource lemon-kubeconfig of type *v1.Secret to be up and running
Merging credentials into kubeconfig file /Users/toshiaki/.kube-tkg/config 
Verifying kubernetes version...
Unable to detect current OS for the cluster. Using name:photon version:3 arch:amd64
Using OS options, name:photon version:3 arch:amd64
Retrieving configuration for upgrade cluster...
Create InfrastructureTemplate for upgrade...
Upgrading control plane nodes...
Patching KubeadmControlPlane with the kubernetes version v1.20.5+vmware.2...
Applying KubeadmControlPlane Patch: {
    "spec": {
      "version": "v1.20.5+vmware.2",
      "infrastructureTemplate": {
      "name": "lemon-control-plane-v1-20-5-vmware-2-hqtn7",
      "namespace": "default"
      },
      "kubeadmConfigSpec": {
      "clusterConfiguration": {
        "imageRepository" : "projects.registry.vmware.com/tkg",
        "dns": {
        "imageRepository": "projects.registry.vmware.com/tkg",
        "imageTag": "v1.7.0_vmware.8"
        },
        "etcd": {
        "local": {
          "dataDir": "/var/lib/etcd",
          "imageRepository": "projects.registry.vmware.com/tkg",
          "imageTag": "v3.4.13_vmware.7"
        }
        }
      }
      }
    }
    }
Applying patch to resource lemon-control-plane of type *v1alpha3.KubeadmControlPlane ...
patch cluster object with operation status: 
  {
    "metadata": {
      "annotations": {
        "TKGOperationInfo" : "{\"Operation\":\"Upgrade\",\"OperationStartTimestamp\":\"2021-07-06 13:04:51.372895 +0000 UTC\",\"OperationTimeout\":900}",
        "TKGOperationLastObservedTimestamp" : "2021-07-06 13:04:51.372895 +0000 UTC"
      }
    }
  }
Waiting for kubernetes version to be updated for control plane nodes
waiting for kubernetes version update, current kubernetes version v1.20.4+vmware.1 but expecting v1.20.5+vmware.2, retrying
control-plane is still being upgraded, reason:'RollingUpdateInProgress', message:'Rolling 1 replicas with outdated spec (1 replicas up to date)' , retrying
Get "https://192.168.11.111:6443/version?timeout=32s": net/http: request canceled (Client.Timeout exceeded while awaiting headers), retrying
...
control-plane is still being upgraded, reason:'RollingUpdateInProgress', message:'Rolling 1 replicas with outdated spec (1 replicas up to date)' , retrying
...
control-plane is still being upgraded, reason:'ScalingDown', message:'Scaling down control plane to 1 replicas (actual 2)' , retrying
...
Upgrading worker nodes...
Patching MachineDeployment with the kubernetes version v1.20.5+vmware.2...
Applying MachineDeployment Patch: {
      "spec": {
        "template": {
        "spec": {
          "version": "v1.20.5+vmware.2",
          "infrastructureRef": {
          "name": "lemon-worker-v1-20-4-vmware-1-lp5nh-v1-20-5-vmware-2-vqfft",
          "namespace": "default"
          }
        }
        }
      }
      }
Applying patch to resource lemon-md-0 of type *v1alpha3.MachineDeployment ...
Waiting for kubernetes version to be updated for worker nodes...
worker nodes are still being upgraded for MachineDeployment 'lemon-md-0', DesiredReplicas=1 Replicas=2 ReadyReplicas=1 UpdatedReplicas=1, retrying
...
worker machines [lemon-md-0-75f769b4-bdvsw] are still not upgraded, retrying
...
updating additional components: 'metadata/tkg' ...
updating additional components: 'addons-management/kapp-controller' ...
cluster autoscaler is not enabled for cluster lemon
Cluster 'lemon' successfully upgraded to kubernetes version 'v1.20.5+vmware.2'

tkg get clusterでWorkload Clusterを確認できます。grapeのみKubernetes 1.19.3にバージョンアップされたことがわかります

$ tanzu cluster list --include-management-cluster
  NAME    NAMESPACE   STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES       PLAN  
  apple   default     running  1/1           2/2      v1.20.4+vmware.1  <none>      dev   
  grape   default     running  1/1           3/3      v1.20.4+vmware.1  <none>      prod  
  lemon   default     running  1/1           1/1      v1.20.5+vmware.2  <none>      dev   
  orange  default     running  1/1           2/2      v1.20.4+vmware.1  <none>      dev   
  carrot  tkg-system  running  1/1           1/1      v1.20.5+vmware.2  management  dev

アップデート後はControlPlaneのDHCP予約の設定を再度行う必要があります。

ここまででKubernetesのアップデートは完了しました。

Add-onの確認

TKG 1.2 -> 1.3の時のようにAdd-onの登録は1.3.0 -> 1.3.1では不要です。 自動でアップデートされます。tkg-system namespaceのAppリソースが全てReconcile succeededになっていることを確認してください。

Management Cluster

$ kubectl get app -n tkg-system 
NAME                   DESCRIPTION           SINCE-DEPLOY   AGE
antrea                 Reconcile succeeded   3m28s          101d
metrics-server         Reconcile succeeded   3m26s          101d
tanzu-addons-manager   Reconcile succeeded   3m42s          102d
vsphere-cpi            Reconcile succeeded   3m10s          78d
vsphere-csi            Reconcile succeeded   3m39s          101d

Workload Cluster

$ kubectl get app -n tkg-system 
NAME             DESCRIPTION           SINCE-DEPLOY   AGE
antrea           Reconcile succeeded   74s            101d
metrics-server   Reconcile succeeded   81s            101d
vsphere-cpi      Reconcile succeeded   85s            93d
vsphere-csi      Reconcile succeeded   1s             101d

このAppオブジェクトで使われているtemplate imageのtagがv1.3.1になっていれば1.3.0 -> 1.3.1のアップデートが成功しています。

$ kubectl get app -n tkg-system vsphere-csi  --template="{{(index .spec.fetch 0).image.url}}"
projects.registry.vmware.com/tkg/tanzu_core/addons/vsphere-csi-templates:v1.3.1

TKG Extensionのアップデート

TKG Extensionを使用している場合はこちらもアップデートが必要です。 ドキュメントの通りなので割愛します。

https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid/1.3/vmware-tanzu-kubernetes-grid-13/GUID-upgrade-tkg-extensions.html


TKG 1.3.0 -> 1.3.1-patch1へのアップデートを行いました。

TKG 1.2 -> 1.3に比べれば簡単で特にハマりポイントはありませんでした。


✒️️ Edit  ⏰ History  🗑 Delete