OSD pods are not coming up on AKS
I am new to AKS. I need to provision NFS with Rook on a Ceph cluster. I am following this guide, https://medium.com/@dmitrio_/installing-rook-v1-0-on-aks-f8c22a75d93d, to bring up the Rook/Ceph cluster. The three source files are common.yaml, operator.yaml, and cluster.yaml. The expected output after applying cluster.yaml is:
NAME                                  READY   STATUS    RESTARTS   AGE
rook-ceph-agent-nd2cc                 1/1     Running   0          18h
rook-ceph-agent-sf97n                 1/1     Running   0          18h
rook-ceph-agent-vk8r2                 1/1     Running   0          18h
rook-ceph-agent-vslnn                 1/1     Running   0          18h
rook-ceph-mgr-a-748cd69795-td9fs      1/1     Running   0          42h
rook-ceph-mon-b-869cd5d477-bhbmm      1/1     Running   0          42h
rook-ceph-mon-g-67998d888d-q9xws      1/1     Running   0          9h
rook-ceph-mon-h-6f4cd95f5c-bqc86      1/1     Running   0          3h41m
rook-ceph-operator-864d76cf49-mpkxk   1/1     Running   0          18h
rook-ceph-osd-0-5d8fc96495-252vd      1/1     Running   0          40h
rook-ceph-osd-1-cfbb46894-r6wzs       1/1     Running   0          40h
rook-ceph-osd-2-f59fc7dbd-9vdjf       1/1     Running   0          40h
That is what should happen. In my case, however, this is what I get after applying the YAML files above:
NAME                                            READY   STATUS    RESTARTS   AGE
csi-cephfsplugin-7jfnb                          3/3     Running   0          4h31m
csi-cephfsplugin-klclq                          3/3     Running   0          4h31m
csi-cephfsplugin-provisioner-67f9c99b5f-cfslb   6/6     Running   0          4h31m
csi-cephfsplugin-provisioner-67f9c99b5f-gvl8b   6/6     Running   0          4h31m
csi-cephfsplugin-rpd54                          3/3     Running   0          4h31m
csi-cephfsplugin-vztfm                          3/3     Running   0          4h31m
csi-rbdplugin-4z4c9                             3/3     Running   0          4h31m
csi-rbdplugin-bwbv7                             3/3     Running   0          4h31m
csi-rbdplugin-l8665                             3/3     Running   0          4h31m
csi-rbdplugin-provisioner-5d5cfb887b-wsnlr      6/6     Running   0          4h31m
csi-rbdplugin-provisioner-5d5cfb887b-zh5ps      6/6     Running   0          4h31m
csi-rbdplugin-x9sc2                             3/3     Running   0          4h31m
rook-ceph-operator-78f46865d8-8ln89             1/1     Running   0          4h34m
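
For reference, this is roughly how I apply the manifests and list the pods (file names as in the guide, assumed to be in the current directory). Note that my listing has no mon, mgr, or OSD pods at all, only the CSI plugins and the operator:

# Apply the Rook manifests in order
kubectl apply -f common.yaml
kubectl apply -f operator.yaml
kubectl apply -f cluster.yaml

# List everything in the rook-ceph namespace (this is the output shown above)
kubectl -n rook-ceph get pods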
Here is my operator.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rook-ceph-operator
  namespace: rook-ceph
  labels:
    operator: rook
    storage-backend: ceph
spec:
  selector:
    matchLabels:
      app: rook-ceph-operator
  replicas: 1
  template:
    metadata:
      labels:
        app: rook-ceph-operator
    spec:
      serviceAccountName: rook-ceph-system
      containers:
      - name: rook-ceph-operator
        image: rook/ceph:master
        args: ["ceph", "operator"]
        volumeMounts:
        - mountPath: /var/lib/rook
          name: rook-config
        - mountPath: /etc/ceph
          name: default-config-dir
        env:
        - name: ROOK_CURRENT_NAMESPACE_ONLY
          value: "false"
        - name: FLEXVOLUME_DIR_PATH
          value: "/etc/kubernetes/volumeplugins"
        - name: ROOK_ALLOW_MULTIPLE_FILESYSTEMS
          value: "false"
        - name: ROOK_LOG_LEVEL
          value: "INFO"
        - name: ROOK_CEPH_STATUS_CHECK_INTERVAL
          value: "60s"
        - name: ROOK_MON_HEALTHCHECK_INTERVAL
          value: "45s"
        - name: ROOK_MON_OUT_TIMEOUT
          value: "600s"
        - name: ROOK_DISCOVER_DEVICES_INTERVAL
          value: "60m"
        - name: ROOK_HOSTPATH_REQUIRES_PRIVILEGED
          value: "false"
        - name: ROOK_ENABLE_SELINUX_RELABELING
          value: "true"
        - name: ROOK_ENABLE_FSGROUP
          value: "true"
        - name: ROOK_DISABLE_DEVICE_HOTPLUG
          value: "false"
        - name: ROOK_ENABLE_FLEX_DRIVER
          value: "false"
        # Whether to start the discovery daemon to watch for raw storage devices on nodes in the cluster.
        # This daemon does not need to run if you are only going to create your OSDs based on StorageClassDeviceSets with PVCs. --> CHANGED to false
        - name: ROOK_ENABLE_DISCOVERY_DAEMON
          value: "false"
        - name: ROOK_CSI_ENABLE_CEPHFS
          value: "true"
        - name: ROOK_CSI_ENABLE_RBD
          value: "true"
        - name: ROOK_CSI_ENABLE_GRPC_METRICS
          value: "true"
        - name: CSI_ENABLE_SNAPSHOTTER
          value: "true"
        - name: CSI_PROVISIONER_TOLERATIONS
          value: |
            - effect: NoSchedule
              key: storage-node
              operator: Exists
        - name: CSI_PLUGIN_TOLERATIONS
          value: |
            - effect: NoSchedule
              key: storage-node
              operator: Exists
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      volumes:
      - name: rook-config
        emptyDir: {}
      - name: default-config-dir
        emptyDir: {}
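
To understand why no cluster pods are being created, I have mostly been looking at the operator log and the CephCluster status, roughly like this (the grep filter is just my guess at the relevant messages):

# Follow the operator log and filter for OSD / cluster related messages
kubectl -n rook-ceph logs deploy/rook-ceph-operator -f | grep -iE "osd|cephcluster|error"

# Check the status the operator has recorded on the CephCluster resource
kubectl -n rook-ceph get cephcluster rook-ceph -o yaml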
And here is my cluster.yaml:
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  dataDirHostPath: /var/lib/rook
  mon:
    count: 3
    allowMultiplePerNode: false
    volumeClaimTemplate:
      spec:
        storageClassName: managed-premium
        resources:
          requests:
            storage: 10Gi
  cephVersion:
    image: ceph/ceph:v14.2.4-20190917
    allowUnsupported: false
  dashboard:
    enabled: true
    ssl: true
  network:
    hostNetwork: false
  storage:
    storageClassDeviceSets:
    - name: set1
      # The number of OSDs to create from this device set
      count: 4
      # IMPORTANT: If volumes specified by the storageClassName are not portable across nodes
      # this needs to be set to false. For example, if using the local storage provisioner
      # this should be false.
      portable: true
      # Since the OSDs could end up on any node, an effort needs to be made to spread the OSDs
      # across nodes as much as possible. Unfortunately the pod anti-affinity breaks down
      # as soon as you have more than one OSD per node. If you have more OSDs than nodes, K8s may
      # choose to schedule many of them on the same node. What we need is the Pod Topology
      # Spread Constraints, which is alpha in K8s 1.16. This means that a feature gate must be
      # enabled for this feature, and Rook also still needs to add support for this feature.
      # Another approach for a small number of OSDs is to create a separate device set for each
      # zone (or other set of nodes with a common label) so that the OSDs will end up on different
      # nodes. This would require adding nodeAffinity to the placement here.
      placement:
        tolerations:
        - key: storage-node
          operator: Exists
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: agentpool
                operator: In
                values:
                - npstorage
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - rook-ceph-osd
                - key: app
                  operator: In
                  values:
                  - rook-ceph-osd-prepare
              topologyKey: kubernetes.io/hostname
      resources:
        limits:
          cpu: "500m"
          memory: "4Gi"
        requests:
          cpu: "500m"
          memory: "2Gi"
      volumeClaimTemplates:
      - metadata:
          name: data
        spec:
          resources:
            requests:
              storage: 100Gi
          storageClassName: managed-premium
          volumeMode: Block
          accessModes:
          - ReadWriteOnce
  disruptionManagement:
    managePodBudgets: false
    osdMaintenanceTimeout: 30
    manageMachineDisruptionBudgets: false
    machineDisruptionBudgetNamespace: openshift-machine-api
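
Since the device set requires nodes from the npstorage agent pool (tolerating the storage-node taint) and PVCs from the managed-premium storage class, I also checked, roughly along these lines, whether those nodes, claims, and prepare pods actually exist:

# Nodes matching the nodeAffinity of the storageClassDeviceSet
kubectl get nodes -l agentpool=npstorage

# PVCs and osd-prepare pods that the device set should have created
kubectl -n rook-ceph get pvc
kubectl -n rook-ceph get pods -l app=rook-ceph-osd-prepare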
Any help or guidance would be appreciated. Thanks in advance.