"Tanzu Kubernetes Grid on vSphereを1.2.1 -> 1.3.0にアップデートするメモ"で作成した環境をアップデートします。
アップデートの手順はこちら
- https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid/1.3/vmware-tanzu-kubernetes-grid-13/GUID-upgrade-tkg-index.html
- https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid/1.3/vmware-tanzu-kubernetes-grid-13/GUID-upgrade-tkg-management-cluster.html
- https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid/1.3/vmware-tanzu-kubernetes-grid-13/GUID-upgrade-tkg-workload-clusters.html
1.3.0 -> 1.3.1へアップデートするのが遅れているうちに1.3.1-patch1というイレギュラーなバージョンがリリースされていました。 こちらへのアップデート方法は以下のKnowledge Baseでも説明されています。
https://kb.vmware.com/s/article/83781
Release Noteはこちら
Table of contents
- CLIのインストール
- vCenterにOVAファイルをアップロード
- Management Clusterをアップデート
- Workload Clusterをアップデート
- Add-onの確認
- TKG Extensionのアップデート
CLIのインストール
https://my.vmware.com/en/web/vmware/downloads/details?downloadGroup=TKG-131&productId=988&rPId=65946
から
- VMware Tanzu CLI for Mac
をダウンロードして次のようにインストールしてください。1.3からCLIがtkg
からtanzu
に変わりました。
tar xvf tanzu-cli-bundle-v1.3.1-darwin-amd64.tar
install cli/core/v1.3.1/tanzu-core-darwin_amd64 /usr/local/bin/tanzu
tanzu plugin clean
tanzu plugin install -u --local cli all
次のバージョンで動作確認します。
$ tanzu version
version: v1.3.1
buildDate: 2021-05-07
sha: e5c37c4
プラグイン一覧を確認します。
$ tanzu plugin list
NAME LATEST VERSION DESCRIPTION REPOSITORY VERSION STATUS
alpha v1.3.1 Alpha CLI commands core not installed
cluster v1.3.1 Kubernetes cluster operations core v1.3.1 installed
kubernetes-release v1.3.1 Kubernetes release operations core v1.3.1 installed
login v1.3.1 Login to the platform core v1.3.1 installed
management-cluster v1.3.1 Kubernetes management cluster operations core v1.3.1 installed
pinniped-auth v1.3.1 Pinniped authentication operations (usually not directly invoked) core v1.3.1 installed
vCenterにOVAファイルをアップロード
https://my.vmware.com/en/web/vmware/downloads/details?downloadGroup=TKG-131&productId=988&rPId=65946
から
- Photon v3 Kubernetes v1.20.5 vmware.2 OVA
をダウンロードし、~/Downloads
に保存します。
TKGをインストールする環境の情報を次の例のように設定してください。
cat <<EOF > tkg-env.sh
export GOVC_URL=vcsa.v.maki.lol
export GOVC_USERNAME=administrator@vsphere.local
export GOVC_PASSWORD=VMware1!
export GOVC_DATACENTER=Datacenter
export GOVC_NETWORK="VM Network"
export GOVC_DATASTORE=datastore1
export GOVC_RESOURCE_POOL=/Datacenter/host/Cluster/Resources/tkg
export GOVC_INSECURE=1
export TEMPLATE_FOLDER=/Datacenter/vm/tkg
EOF
次のコマンドでOVAファイルをvCenterにアップロードします。
source tkg-env.sh
govc import.ova -folder $TEMPLATE_FOLDER ~/Downloads/photon-3-kube-v1.20.5-vmware.2-tkg.1-3176963957469777230.ova
govc vm.markastemplate $TEMPLATE_FOLDER/photon-3-kube-v1.20.5
Management Clusterをアップデート
次のコマンドを実行し、アップデート対象のManagement Clusterにログインします。
tanzu login
初回コマンド実行時に、
the old providers folder /Users/toshiaki/.tanzu/tkg/providers is backed up to /Users/toshiaki/.tanzu/tkg/providers-20210623113623-wb2lqf65
というようなログが出力されます。TKG 1.3.0の時に使用していた$HOME/.tanzu/tkg/providers
はバックアップフォルダにrenameされ、
1.3.1用のファイルが$HOME/.tanzu/tkg/providers
に用意されます。
この時、$HOME/.tanzu/tkg/providers/infrastructure-vsphere/ytt/vsphere-overlay.yaml
のようなクラスタ構築時にカスタマイズしたytt overlayファイルは1.3.1に引き継がれないので、
1.3.0->1.3.1においても引き続き利用したいOverlayファイルは改めて$HOME/.tanzu/tkg/providers
へ設定する必要があります。
cp -r $HOME/.tanzu/tkg/providers-20210623113623-wb2lqf65/infrastructure-vsphere/ytt/vsphere-overlay.yaml $HOME/.tanzu/tkg/providers/infrastructure-vsphere/ytt/vsphere-overlay.yaml
アップデート前のクラスタ一覧を確認します。Kubernetesのバージョンはv1.20.4+vmware.1
です。
$ tanzu cluster list --include-management-cluster
NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES PLAN
apple default running 1/1 2/2 v1.20.4+vmware.1 <none> dev
grape default running 1/1 3/3 v1.20.4+vmware.1 <none> prod
lemon default running 1/1 1/1 v1.20.4+vmware.1 <none> dev
orange default running 1/1 2/2 v1.20.4+vmware.1 <none> dev
carrot tkg-system running 1/1 1/1 v1.20.4+vmware.1 management dev
次のコマンドでManagement Clusterをアップデートします。
今回は1.3.1ではなく1.3.1-patch1へのアップデートであるため、微調整が必要です。
sed -i.bak 's/v1.20.5+vmware.1-tkg.1/v1.20.5+vmware.2-tkg.1/' $HOME/.tanzu/tkg/bom/tkg-bom-v1.3.1.yaml
export TKG_BOM_CUSTOM_IMAGE_TAG="v1.3.1-patch1"
tanzu management-cluster upgrade -v6 -y
次のようなログが出力されます。
Upgrading management cluster providers...
clusterctl upgrade apply options: {Kubeconfig:{Path:/Users/toshiaki/.kube-tkg/config Context:carrot-admin@carrot} ManagementGroup:capi-system/cluster-api Contract: CoreProvider:capi-system/cluster-api:v0.3.14 BootstrapProviders:[capi-kubeadm-bootstrap-system/kubeadm:v0.3.14] ControlPlaneProviders:[capi-kubeadm-control-plane-system/kubeadm:v0.3.14] InfrastructureProviders:[capv-system/vsphere:v0.7.7]}
Checking cert-manager version...
Deleting cert-manager Version="v0.11.0"
Installing cert-manager Version="v0.16.1"
Waiting for cert-manager to be available...
Performing upgrade...
Deleting Provider="cluster-api" Version="" TargetNamespace="capi-system"
Installing Provider="cluster-api" Version="v0.3.14" TargetNamespace="capi-system"
Deleting Provider="bootstrap-kubeadm" Version="" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="bootstrap-kubeadm" Version="v0.3.14" TargetNamespace="capi-kubeadm-bootstrap-system"
Deleting Provider="control-plane-kubeadm" Version="" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="control-plane-kubeadm" Version="v0.3.14" TargetNamespace="capi-kubeadm-control-plane-system"
Deleting Provider="infrastructure-vsphere" Version="" TargetNamespace="capv-system"
Installing Provider="infrastructure-vsphere" Version="v0.7.7" TargetNamespace="capv-system"
Waiting for provider infrastructure-vsphere
Waiting for provider cluster-api
Waiting for provider bootstrap-kubeadm
Waiting for provider control-plane-kubeadm
Waiting for resource capi-kubeadm-control-plane-controller-manager of type *v1.Deployment to be up and running
pods are not yet running for deployment 'capi-kubeadm-control-plane-controller-manager' in namespace 'capi-kubeadm-control-plane-system', retrying
Waiting for resource capv-controller-manager of type *v1.Deployment to be up and running
pods are not yet running for deployment 'capv-controller-manager' in namespace 'capv-system', retrying
Waiting for resource capi-kubeadm-bootstrap-controller-manager of type *v1.Deployment to be up and running
Waiting for resource capi-controller-manager of type *v1.Deployment to be up and running
Waiting for resource capi-kubeadm-bootstrap-controller-manager of type *v1.Deployment to be up and running
pods are not yet running for deployment 'capi-controller-manager' in namespace 'capi-system', retrying
Passed waiting on provider bootstrap-kubeadm after 100.585982ms
Waiting for resource capi-kubeadm-control-plane-controller-manager of type *v1.Deployment to be up and running
pods are not yet running for deployment 'capv-controller-manager' in namespace 'capv-system', retrying
Passed waiting on provider control-plane-kubeadm after 5.061307114s
Waiting for resource capi-controller-manager of type *v1.Deployment to be up and running
Passed waiting on provider cluster-api after 5.114968188s
pods are not yet running for deployment 'capv-controller-manager' in namespace 'capv-system', retrying
...
Waiting for resource capv-controller-manager of type *v1.Deployment to be up and running
Passed waiting on provider infrastructure-vsphere after 30.069141159s
Success waiting on all providers.
Management cluster providers upgraded successfully...
Upgrading management cluster kubernetes version...
Creating management cluster client...
Verifying kubernetes version...
Unable to detect current OS for the cluster. Using name:photon version:3 arch:amd64
Using OS options, name:photon version:3 arch:amd64
Retrieving configuration for upgrade cluster...
Create InfrastructureTemplate for upgrade...
Upgrading control plane nodes...
Patching KubeadmControlPlane with the kubernetes version v1.20.5+vmware.2...
Applying KubeadmControlPlane Patch: {
"spec": {
"version": "v1.20.5+vmware.2",
"infrastructureTemplate": {
"name": "carrot-control-plane-v1-20-5-vmware-2-21xat",
"namespace": "tkg-system"
},
"kubeadmConfigSpec": {
"clusterConfiguration": {
"imageRepository" : "projects.registry.vmware.com/tkg",
"dns": {
"imageRepository": "projects.registry.vmware.com/tkg",
"imageTag": "v1.7.0_vmware.8"
},
"etcd": {
"local": {
"dataDir": "/var/lib/etcd",
"imageRepository": "projects.registry.vmware.com/tkg",
"imageTag": "v3.4.13_vmware.7"
}
}
}
}
}
}
Applying patch to resource carrot-control-plane of type *v1alpha3.KubeadmControlPlane ...
patch cluster object with operation status:
{
"metadata": {
"annotations": {
"TKGOperationInfo" : "{\"Operation\":\"Upgrade\",\"OperationStartTimestamp\":\"2021-07-06 12:32:39.968017 +0000 UTC\",\"OperationTimeout\":900}",
"TKGOperationLastObservedTimestamp" : "2021-07-06 12:32:39.968017 +0000 UTC"
}
}
}
Waiting for kubernetes version to be updated for control plane nodes
waiting for kubernetes version update, current kubernetes version v1.20.4+vmware.1 but expecting v1.20.5+vmware.2, retrying
control-plane is still being upgraded, reason:'RollingUpdateInProgress', message:'Rolling 1 replicas with outdated spec (1 replicas up to date)' , retrying
etcdserver: request timed out, retrying
control-plane is still being upgraded, reason:'RollingUpdateInProgress', message:'Rolling 1 replicas with outdated spec (1 replicas up to date)' , retrying
...
control-plane is still being upgraded, reason:'ScalingDown', message:'Scaling down control plane to 1 replicas (actual 2)' , retrying
...
Upgrading worker nodes...
Patching MachineDeployment with the kubernetes version v1.20.5+vmware.2...
Applying MachineDeployment Patch: {
"spec": {
"template": {
"spec": {
"version": "v1.20.5+vmware.2",
"infrastructureRef": {
"name": "carrot-worker-v1-19-3-vmware-1-muxc1-v1-20-4-vmware-1-5mlj6-v1-20-5-vmware-2-4xqok",
"namespace": "tkg-system"
}
}
}
}
}
Applying patch to resource carrot-md-0 of type *v1alpha3.MachineDeployment ...
Waiting for kubernetes version to be updated for worker nodes...
worker nodes are still being upgraded for MachineDeployment 'carrot-md-0', DesiredReplicas=1 Replicas=2 ReadyReplicas=1 UpdatedReplicas=1, retrying
...
worker machines [carrot-md-0-6fcdff5b49-qttmt] are still not upgraded, retrying
...
updating additional components: 'metadata/tkg' ...
updating additional components: 'addons-management/kapp-controller' ...
updating additional components: 'addons-management/tanzu-addons-manager' ...
updating additional components: 'tkr/tkr-controller' ...
Waiting for additional components to be up and running...
Waiting for resource tanzu-addons-controller-manager of type *v1.Deployment to be up and running
Waiting for resource tkr-controller-manager of type *v1.Deployment to be up and running
Waiting for resource kapp-controller of type *v1.Deployment to be up and running
Management cluster 'carrot' successfully upgraded to TKG version 'v1.3.1-patch1' with kubernetes version 'v1.20.5+vmware.2'
次のコマンドでManagement ClusterのみKubernetes v1.20.5+vmware.2
にバージョンアップされたことがわかります。
$ tanzu cluster list --include-management-cluster
NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES PLAN
apple default running 1/1 2/2 v1.20.4+vmware.1 <none> dev
grape default running 1/1 3/3 v1.20.4+vmware.1 <none> prod
lemon default running 1/1 1/1 v1.20.4+vmware.1 <none> dev
orange default running 1/1 2/2 v1.20.4+vmware.1 <none> dev
carrot tkg-system running 1/1 1/1 v1.20.5+vmware.2 management dev
Management Clusterにkubectl
コマンドでアクセスしてみます。
$ kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
carrot-control-plane-kbfpl Ready control-plane,master 18m v1.20.5+vmware.2 192.168.11.93 192.168.11.93 VMware Photon OS/Linux 4.19.189-5.ph3 containerd://1.4.4
carrot-md-0-cf8849f4-skldd Ready <none> 9m42s v1.20.5+vmware.2 192.168.11.94 192.168.11.94 VMware Photon OS/Linux 4.19.189-5.ph3 containerd://1.4.4
アップデート後はControlPlaneのDHCP予約の設定を再度行う必要があります。
Workload Clusterをアップデート
次にWorkload Clusterをアップデートします。
次のコマンドでWorkload Clusterを作成します。
tanzu cluster upgrade lemon -v6 -y
次のようなログが出力されます。
Creating management cluster client...
Validating configuration...
Creating workload cluster client...
Retrieving credentials for workload cluster lemon
Getting secret for cluster
Waiting for resource lemon-kubeconfig of type *v1.Secret to be up and running
Merging credentials into kubeconfig file /Users/toshiaki/.kube-tkg/config
Verifying kubernetes version...
Unable to detect current OS for the cluster. Using name:photon version:3 arch:amd64
Using OS options, name:photon version:3 arch:amd64
Retrieving configuration for upgrade cluster...
Create InfrastructureTemplate for upgrade...
Upgrading control plane nodes...
Patching KubeadmControlPlane with the kubernetes version v1.20.5+vmware.2...
Applying KubeadmControlPlane Patch: {
"spec": {
"version": "v1.20.5+vmware.2",
"infrastructureTemplate": {
"name": "lemon-control-plane-v1-20-5-vmware-2-hqtn7",
"namespace": "default"
},
"kubeadmConfigSpec": {
"clusterConfiguration": {
"imageRepository" : "projects.registry.vmware.com/tkg",
"dns": {
"imageRepository": "projects.registry.vmware.com/tkg",
"imageTag": "v1.7.0_vmware.8"
},
"etcd": {
"local": {
"dataDir": "/var/lib/etcd",
"imageRepository": "projects.registry.vmware.com/tkg",
"imageTag": "v3.4.13_vmware.7"
}
}
}
}
}
}
Applying patch to resource lemon-control-plane of type *v1alpha3.KubeadmControlPlane ...
patch cluster object with operation status:
{
"metadata": {
"annotations": {
"TKGOperationInfo" : "{\"Operation\":\"Upgrade\",\"OperationStartTimestamp\":\"2021-07-06 13:04:51.372895 +0000 UTC\",\"OperationTimeout\":900}",
"TKGOperationLastObservedTimestamp" : "2021-07-06 13:04:51.372895 +0000 UTC"
}
}
}
Waiting for kubernetes version to be updated for control plane nodes
waiting for kubernetes version update, current kubernetes version v1.20.4+vmware.1 but expecting v1.20.5+vmware.2, retrying
control-plane is still being upgraded, reason:'RollingUpdateInProgress', message:'Rolling 1 replicas with outdated spec (1 replicas up to date)' , retrying
Get "https://192.168.11.111:6443/version?timeout=32s": net/http: request canceled (Client.Timeout exceeded while awaiting headers), retrying
...
control-plane is still being upgraded, reason:'RollingUpdateInProgress', message:'Rolling 1 replicas with outdated spec (1 replicas up to date)' , retrying
...
control-plane is still being upgraded, reason:'ScalingDown', message:'Scaling down control plane to 1 replicas (actual 2)' , retrying
...
Upgrading worker nodes...
Patching MachineDeployment with the kubernetes version v1.20.5+vmware.2...
Applying MachineDeployment Patch: {
"spec": {
"template": {
"spec": {
"version": "v1.20.5+vmware.2",
"infrastructureRef": {
"name": "lemon-worker-v1-20-4-vmware-1-lp5nh-v1-20-5-vmware-2-vqfft",
"namespace": "default"
}
}
}
}
}
Applying patch to resource lemon-md-0 of type *v1alpha3.MachineDeployment ...
Waiting for kubernetes version to be updated for worker nodes...
worker nodes are still being upgraded for MachineDeployment 'lemon-md-0', DesiredReplicas=1 Replicas=2 ReadyReplicas=1 UpdatedReplicas=1, retrying
...
worker machines [lemon-md-0-75f769b4-bdvsw] are still not upgraded, retrying
...
updating additional components: 'metadata/tkg' ...
updating additional components: 'addons-management/kapp-controller' ...
cluster autoscaler is not enabled for cluster lemon
Cluster 'lemon' successfully upgraded to kubernetes version 'v1.20.5+vmware.2'
tkg get cluster
でWorkload Clusterを確認できます。grape
のみKubernetes 1.19.3にバージョンアップされたことがわかります
$ tanzu cluster list --include-management-cluster
NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES PLAN
apple default running 1/1 2/2 v1.20.4+vmware.1 <none> dev
grape default running 1/1 3/3 v1.20.4+vmware.1 <none> prod
lemon default running 1/1 1/1 v1.20.5+vmware.2 <none> dev
orange default running 1/1 2/2 v1.20.4+vmware.1 <none> dev
carrot tkg-system running 1/1 1/1 v1.20.5+vmware.2 management dev
アップデート後はControlPlaneのDHCP予約の設定を再度行う必要があります。
ここまででKubernetesのアップデートは完了しました。
Add-onの確認
TKG 1.2 -> 1.3の時のようにAdd-onの登録は1.3.0 -> 1.3.1では不要です。
自動でアップデートされます。tkg-system
namespaceのApp
リソースが全てReconcile succeeded
になっていることを確認してください。
Management Cluster
$ kubectl get app -n tkg-system
NAME DESCRIPTION SINCE-DEPLOY AGE
antrea Reconcile succeeded 3m28s 101d
metrics-server Reconcile succeeded 3m26s 101d
tanzu-addons-manager Reconcile succeeded 3m42s 102d
vsphere-cpi Reconcile succeeded 3m10s 78d
vsphere-csi Reconcile succeeded 3m39s 101d
Workload Cluster
$ kubectl get app -n tkg-system
NAME DESCRIPTION SINCE-DEPLOY AGE
antrea Reconcile succeeded 74s 101d
metrics-server Reconcile succeeded 81s 101d
vsphere-cpi Reconcile succeeded 85s 93d
vsphere-csi Reconcile succeeded 1s 101d
このApp
オブジェクトで使われているtemplate imageのtagがv1.3.1
になっていれば1.3.0 -> 1.3.1のアップデートが成功しています。
$ kubectl get app -n tkg-system vsphere-csi --template="{{(index .spec.fetch 0).image.url}}"
projects.registry.vmware.com/tkg/tanzu_core/addons/vsphere-csi-templates:v1.3.1
TKG Extensionのアップデート
TKG Extensionを使用している場合はこちらもアップデートが必要です。 ドキュメントの通りなので割愛します。
TKG 1.3.0 -> 1.3.1-patch1へのアップデートを行いました。
TKG 1.2 -> 1.3に比べれば簡単で特にハマりポイントはありませんでした。