Rookv1.2 не добавляет ярлыки на CrushMap
В настоящее время я работаю с Rook v1.2.2 для создания кластера Ceph в моем кластере Kubernetes (v1.16.3), и мне не удается добавить уровень стойки на моем CrushMap.
Я хочу уйти с:
ID CLASS WEIGHT TYPE NAME
-1 0.02737 root default
-3 0.01369 host test-w1
0 hdd 0.01369 osd.0
-5 0.01369 host test-w2
1 hdd 0.01369 osd.1
примерно так:
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.01358 root default
-5 0.01358 zone zone1
-4 0.01358 rack rack1
-3 0.01358 host mynode
0 hdd 0.00679 osd.0 up 1.00000 1.00000
1 hdd 0.00679 osd.1 up 1.00000 1.00000
Как описано в официальном документе о ладьях (https://rook.io/docs/rook/v1.2/ceph-cluster-crd.html).
Шаги, которые я сделал:
У меня есть кластер Kubernetes v1.16.3 с 1 мастером (test-m1) и двумя рабочими (test-w1 и test-w2). Я установил этот кластер, используя конфигурацию Kubespray по умолчанию (https://kubespray.io/).
Я пометил свой узел:
kubectl label node test-w1 topology.rook.io/rack=rack1
kubectl label node test-w2 topology.rook.io/rack=rack2
Я добавил этикетку role=storage-node
и портить storage-node=true:NoSchedule
чтобы заставить Rook работать на определенных узлах хранения, вот полный пример меток и заражений для одного узла хранения:
Name: test-w1
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=test-w1
kubernetes.io/os=linux
role=storage-node
topology.rook.io/rack=rack1
Annotations: csi.volume.kubernetes.io/nodeid: {"rook-ceph.cephfs.csi.ceph.com":"test-w1","rook-ceph.rbd.csi.ceph.com":"test-w1"}
kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Wed, 29 Jan 2020 03:38:52 +0100
Taints: storage-node=true:NoSchedule
Я начал развертывать common.yml Rook: https://github.com/rook/rook/blob/master/cluster/examples/kubernetes/ceph/common.yaml
Я применил специальный файл operator.yml, чтобы иметь возможность запускать оператор, csi-plugin и агент на узлах с меткой "role=storage-node":
#################################################################################################################
# The deployment for the rook operator
# Contains the common settings for most Kubernetes deployments.
# For example, to create the rook-ceph cluster:
# kubectl create -f common.yaml
# kubectl create -f operator.yaml
# kubectl create -f cluster.yaml
#
# Also see other operator sample files for variations of operator.yaml:
# - operator-openshift.yaml: Common settings for running in OpenShift
#################################################################################################################
# OLM: BEGIN OPERATOR DEPLOYMENT
apiVersion: apps/v1
kind: Deployment
metadata:
name: rook-ceph-operator
namespace: rook-ceph
labels:
operator: rook
storage-backend: ceph
spec:
selector:
matchLabels:
app: rook-ceph-operator
replicas: 1
template:
metadata:
labels:
app: rook-ceph-operator
spec:
serviceAccountName: rook-ceph-system
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: role
operator: In
values:
- storage-node
tolerations:
- key: "storage-node"
operator: "Exists"
effect: "NoSchedule"
containers:
- name: rook-ceph-operator
image: rook/ceph:v1.2.2
args: ["ceph", "operator"]
volumeMounts:
- mountPath: /var/lib/rook
name: rook-config
- mountPath: /etc/ceph
name: default-config-dir
env:
# If the operator should only watch for cluster CRDs in the same namespace, set this to "true".
# If this is not set to true, the operator will watch for cluster CRDs in all namespaces.
- name: ROOK_CURRENT_NAMESPACE_ONLY
value: "false"
# To disable RBAC, uncomment the following:
# - name: RBAC_ENABLED
# value: "false"
# Rook Agent toleration. Will tolerate all taints with all keys.
# Choose between NoSchedule, PreferNoSchedule and NoExecute:
# - name: AGENT_TOLERATION
# value: "NoSchedule"
# (Optional) Rook Agent toleration key. Set this to the key of the taint you want to tolerate
# - name: AGENT_TOLERATION_KEY
# value: "storage-node"
# (Optional) Rook Agent tolerations list. Put here list of taints you want to tolerate in YAML format.
- name: AGENT_TOLERATIONS
value: |
- effect: NoSchedule
key: storage-class
operator: Exists
# (Optional) Rook Agent priority class name to set on the pod(s)
# - name: AGENT_PRIORITY_CLASS_NAME
# value: "<PriorityClassName>"
# (Optional) Rook Agent NodeAffinity.
- name: AGENT_NODE_AFFINITY
value: "role=storage-node"
# (Optional) Rook Agent mount security mode. Can by `Any` or `Restricted`.
# `Any` uses Ceph admin credentials by default/fallback.
# For using `Restricted` you must have a Ceph secret in each namespace storage should be consumed from and
# set `mountUser` to the Ceph user, `mountSecret` to the Kubernetes secret name.
# to the namespace in which the `mountSecret` Kubernetes secret namespace.
# - name: AGENT_MOUNT_SECURITY_MODE
# value: "Any"
# Set the path where the Rook agent can find the flex volumes
# - name: FLEXVOLUME_DIR_PATH
# value: "<PathToFlexVolumes>"
# Set the path where kernel modules can be found
# - name: LIB_MODULES_DIR_PATH
# value: "<PathToLibModules>"
# Mount any extra directories into the agent container
# - name: AGENT_MOUNTS
# value: "somemount=/host/path:/container/path,someothermount=/host/path2:/container/path2"
# Rook Discover toleration. Will tolerate all taints with all keys.
# Choose between NoSchedule, PreferNoSchedule and NoExecute:
# - name: DISCOVER_TOLERATION
# value: "NoSchedule"
# (Optional) Rook Discover toleration key. Set this to the key of the taint you want to tolerate
# - name: DISCOVER_TOLERATION_KEY
# value: "storage-node"
# (Optional) Rook Discover tolerations list. Put here list of taints you want to tolerate in YAML format.
- name: DISCOVER_TOLERATIONS
value: |
- effect: NoSchedule
key: storage-node
operator: Exists
# (Optional) Rook Discover priority class name to set on the pod(s)
# - name: DISCOVER_PRIORITY_CLASS_NAME
# value: "<PriorityClassName>"
# (Optional) Discover Agent NodeAffinity.
- name: DISCOVER_AGENT_NODE_AFFINITY
value: "role=storage-node"
# Allow rook to create multiple file systems. Note: This is considered
# an experimental feature in Ceph as described at
# http://docs.ceph.com/docs/master/cephfs/experimental-features/#multiple-filesystems-within-a-ceph-cluster
# which might cause mons to crash as seen in https://github.com/rook/rook/issues/1027
- name: ROOK_ALLOW_MULTIPLE_FILESYSTEMS
value: "false"
# The logging level for the operator: INFO | DEBUG
- name: ROOK_LOG_LEVEL
value: "INFO"
# The interval to check the health of the ceph cluster and update the status in the custom resource.
- name: ROOK_CEPH_STATUS_CHECK_INTERVAL
value: "60s"
# The interval to check if every mon is in the quorum.
- name: ROOK_MON_HEALTHCHECK_INTERVAL
value: "45s"
# The duration to wait before trying to failover or remove/replace the
# current mon with a new mon (useful for compensating flapping network).
- name: ROOK_MON_OUT_TIMEOUT
value: "600s"
# The duration between discovering devices in the rook-discover daemonset.
- name: ROOK_DISCOVER_DEVICES_INTERVAL
value: "60m"
# Whether to start pods as privileged that mount a host path, which includes the Ceph mon and osd pods.
# This is necessary to workaround the anyuid issues when running on OpenShift.
# For more details see https://github.com/rook/rook/issues/1314#issuecomment-355799641
- name: ROOK_HOSTPATH_REQUIRES_PRIVILEGED
value: "false"
# In some situations SELinux relabelling breaks (times out) on large filesystems, and doesn't work with cephfs ReadWriteMany volumes (last relabel wins).
# Disable it here if you have similar issues.
# For more details see https://github.com/rook/rook/issues/2417
- name: ROOK_ENABLE_SELINUX_RELABELING
value: "true"
# In large volumes it will take some time to chown all the files. Disable it here if you have performance issues.
# For more details see https://github.com/rook/rook/issues/2254
- name: ROOK_ENABLE_FSGROUP
value: "true"
# Disable automatic orchestration when new devices are discovered
- name: ROOK_DISABLE_DEVICE_HOTPLUG
value: "false"
# Provide customised regex as the values using comma. For eg. regex for rbd based volume, value will be like "(?i)rbd[0-9]+".
# In case of more than one regex, use comma to seperate between them.
# Default regex will be "(?i)dm-[0-9]+,(?i)rbd[0-9]+,(?i)nbd[0-9]+"
# Add regex expression after putting a comma to blacklist a disk
# If value is empty, the default regex will be used.
- name: DISCOVER_DAEMON_UDEV_BLACKLIST
value: "(?i)dm-[0-9]+,(?i)rbd[0-9]+,(?i)nbd[0-9]+"
# Whether to enable the flex driver. By default it is enabled and is fully supported, but will be deprecated in some future release
# in favor of the CSI driver.
- name: ROOK_ENABLE_FLEX_DRIVER
value: "false"
# Whether to start the discovery daemon to watch for raw storage devices on nodes in the cluster.
# This daemon does not need to run if you are only going to create your OSDs based on StorageClassDeviceSets with PVCs.
- name: ROOK_ENABLE_DISCOVERY_DAEMON
value: "true"
# Enable the default version of the CSI CephFS driver. To start another version of the CSI driver, see image properties below.
- name: ROOK_CSI_ENABLE_CEPHFS
value: "true"
# Enable the default version of the CSI RBD driver. To start another version of the CSI driver, see image properties below.
- name: ROOK_CSI_ENABLE_RBD
value: "true"
- name: ROOK_CSI_ENABLE_GRPC_METRICS
value: "true"
# Enable deployment of snapshotter container in ceph-csi provisioner.
- name: CSI_ENABLE_SNAPSHOTTER
value: "true"
# Enable Ceph Kernel clients on kernel < 4.17 which support quotas for Cephfs
# If you disable the kernel client, your application may be disrupted during upgrade.
# See the upgrade guide: https://rook.io/docs/rook/v1.2/ceph-upgrade.html
- name: CSI_FORCE_CEPHFS_KERNEL_CLIENT
value: "true"
# CSI CephFS plugin daemonset update strategy, supported values are OnDelete and RollingUpdate.
# Default value is RollingUpdate.
#- name: CSI_CEPHFS_PLUGIN_UPDATE_STRATEGY
# value: "OnDelete"
# CSI Rbd plugin daemonset update strategy, supported values are OnDelete and RollingUpdate.
# Default value is RollingUpdate.
#- name: CSI_RBD_PLUGIN_UPDATE_STRATEGY
# value: "OnDelete"
# The default version of CSI supported by Rook will be started. To change the version
# of the CSI driver to something other than what is officially supported, change
# these images to the desired release of the CSI driver.
#- name: ROOK_CSI_CEPH_IMAGE
# value: "quay.io/cephcsi/cephcsi:v1.2.2"
#- name: ROOK_CSI_REGISTRAR_IMAGE
# value: "quay.io/k8scsi/csi-node-driver-registrar:v1.1.0"
#- name: ROOK_CSI_PROVISIONER_IMAGE
# value: "quay.io/k8scsi/csi-provisioner:v1.4.0"
#- name: ROOK_CSI_SNAPSHOTTER_IMAGE
# value: "quay.io/k8scsi/csi-snapshotter:v1.2.2"
#- name: ROOK_CSI_ATTACHER_IMAGE
# value: "quay.io/k8scsi/csi-attacher:v1.2.0"
# kubelet directory path, if kubelet configured to use other than /var/lib/kubelet path.
#- name: ROOK_CSI_KUBELET_DIR_PATH
# value: "/var/lib/kubelet"
# (Optional) Ceph Provisioner NodeAffinity.
- name: CSI_PROVISIONER_NODE_AFFINITY
value: "role=storage-node"
# (Optional) CEPH CSI provisioner tolerations list. Put here list of taints you want to tolerate in YAML format.
# CSI provisioner would be best to start on the same nodes as other ceph daemons.
- name: CSI_PROVISIONER_TOLERATIONS
value: |
- effect: NoSchedule
key: storage-node
operator: Exists
# (Optional) Ceph CSI plugin NodeAffinity.
- name: CSI_PLUGIN_NODE_AFFINITY
value: "role=storage-node"
# (Optional) CEPH CSI plugin tolerations list. Put here list of taints you want to tolerate in YAML format.
# CSI plugins need to be started on all the nodes where the clients need to mount the storage.
- name: CSI_PLUGIN_TOLERATIONS
value: |
- effect: NoSchedule
key: storage-node
operator: Exists
# Configure CSI cephfs grpc and liveness metrics port
#- name: CSI_CEPHFS_GRPC_METRICS_PORT
# value: "9091"
#- name: CSI_CEPHFS_LIVENESS_METRICS_PORT
# value: "9081"
# Configure CSI rbd grpc and liveness metrics port
#- name: CSI_RBD_GRPC_METRICS_PORT
# value: "9090"
#- name: CSI_RBD_LIVENESS_METRICS_PORT
# value: "9080"
# Time to wait until the node controller will move Rook pods to other
# nodes after detecting an unreachable node.
# Pods affected by this setting are:
# mgr, rbd, mds, rgw, nfs, PVC based mons and osds, and ceph toolbox
# The value used in this variable replaces the default value of 300 secs
# added automatically by k8s as Toleration for
# <node.kubernetes.io/unreachable>
# The total amount of time to reschedule Rook pods in healthy nodes
# before detecting a <not ready node> condition will be the sum of:
# --> node-monitor-grace-period: 40 seconds (k8s kube-controller-manager flag)
# --> ROOK_UNREACHABLE_NODE_TOLERATION_SECONDS: 5 seconds
- name: ROOK_UNREACHABLE_NODE_TOLERATION_SECONDS
value: "5"
# The name of the node to pass with the downward API
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
# The pod name to pass with the downward API
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
# The pod namespace to pass with the downward API
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
# Uncomment it to run rook operator on the host network
#hostNetwork: true
volumes:
- name: rook-config
emptyDir: {}
- name: default-config-dir
emptyDir: {}
# OLM: END OPERATOR DEPLOYMENT
Затем я применил свой собственный файл ceph-cluster.yml, чтобы модули могли запускаться на узлах с меткой "role=storage-node"
#################################################################################################################
# Define the settings for the rook-ceph cluster with settings that should only be used in a test environment.
# A single filestore OSD will be created in the dataDirHostPath.
# For example, to create the cluster:
# kubectl create -f common.yaml
# kubectl create -f operator.yaml
# kubectl create -f ceph-cluster.yaml
#################################################################################################################
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
name: rook-ceph
namespace: rook-ceph
spec:
cephVersion:
image: ceph/ceph:v14.2.5
allowUnsupported: true
dataDirHostPath: /var/lib/rook
skipUpgradeChecks: false
mon:
count: 1
allowMultiplePerNode: true
dashboard:
enabled: true
ssl: true
monitoring:
enabled: false # requires Prometheus to be pre-installed
rulesNamespace: rook-ceph
network:
hostNetwork: false
rbdMirroring:
workers: 0
placement:
all:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: role
operator: In
values:
- storage-node
tolerations:
- key: "storage-node"
operator: "Exists"
effect: "NoSchedule"
mgr:
modules:
# the pg_autoscaler is only available on nautilus or newer. remove this if testing mimic.
- name: pg_autoscaler
enabled: true
storage:
useAllNodes: false
useAllDevices: false
nodes:
- name: "test-w1"
directories:
- path: /var/lib/rook
- name: "test-w2"
directories:
- path: /var/lib/rook
С этой конфигурацией Ладья не применяет надписи на карте сокрушения. Если я установлю toolbox.yml (https://rook.io/docs/rook/v1.2/ceph-toolbox.html), зайдите в него и запустите
ceph osd tree
ceph osd crush tree
У меня такой вывод:
ID CLASS WEIGHT TYPE NAME
-1 0.02737 root default
-3 0.01369 host test-w1
0 hdd 0.01369 osd.0
-5 0.01369 host test-w2
1 hdd 0.01369 osd.1
Как видите, стойка не определена. Даже если я правильно пометил свои узлы.
Что удивительно, так это то, что pods prepare-osd могут извлекать информацию из первой строки следующих журналов:
$ kubectl logs rook-ceph-osd-prepare-test-w1-7cp4f -n rook-ceph
2020-01-29 09:59:07.272649 I | cephcmd: crush location of osd: root=default host=test-w1 rack=rack1
[couppayy@test-m1 test_local]$ cat preposd.txt
2020-01-29 09:59:07.155656 I | cephcmd: desired devices to configure osds: [{Name: OSDsPerDevice:1 MetadataDevice: DatabaseSizeMB:0 DeviceClass: IsFilter:false IsDevicePathFilter:false}]
2020-01-29 09:59:07.185024 I | rookcmd: starting Rook v1.2.2 with arguments '/rook/rook ceph osd provision'
2020-01-29 09:59:07.185069 I | rookcmd: flag values: --cluster-id=c9ee638a-1d02-4ad9-95c9-cb796f61623a, --data-device-filter=, --data-device-path-filter=, --data-devices=, --data-directories=/var/lib/rook, --encrypted-device=false, --force-format=false, --help=false, --location=, --log-flush-frequency=5s, --log-level=INFO, --metadata-device=, --node-name=test-w1, --operator-image=, --osd-database-size=0, --osd-journal-size=5120, --osd-store=, --osd-wal-size=576, --osds-per-device=1, --pvc-backed-osd=false, --service-account=
2020-01-29 09:59:07.185108 I | op-mon: parsing mon endpoints: a=10.233.35.212:6789
2020-01-29 09:59:07.272603 I | op-osd: CRUSH location=root=default host=test-w1 rack=rack1
2020-01-29 09:59:07.272649 I | cephcmd: crush location of osd: root=default host=test-w1 rack=rack1
2020-01-29 09:59:07.313099 I | cephconfig: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2020-01-29 09:59:07.313397 I | cephconfig: generated admin config in /var/lib/rook/rook-ceph
2020-01-29 09:59:07.322175 I | cephosd: discovering hardware
2020-01-29 09:59:07.322228 I | exec: Running command: lsblk --all --noheadings --list --output KNAME
2020-01-29 09:59:07.365036 I | exec: Running command: lsblk /dev/sda --bytes --nodeps --pairs --output SIZE,ROTA,RO,TYPE,PKNAME
2020-01-29 09:59:07.416812 W | inventory: skipping device sda: Failed to complete 'lsblk /dev/sda': exit status 1. lsblk: /dev/sda: not a block device
2020-01-29 09:59:07.416873 I | exec: Running command: lsblk /dev/sda1 --bytes --nodeps --pairs --output SIZE,ROTA,RO,TYPE,PKNAME
2020-01-29 09:59:07.450851 W | inventory: skipping device sda1: Failed to complete 'lsblk /dev/sda1': exit status 1. lsblk: /dev/sda1: not a block device
2020-01-29 09:59:07.450892 I | exec: Running command: lsblk /dev/sda2 --bytes --nodeps --pairs --output SIZE,ROTA,RO,TYPE,PKNAME
2020-01-29 09:59:07.457890 W | inventory: skipping device sda2: Failed to complete 'lsblk /dev/sda2': exit status 1. lsblk: /dev/sda2: not a block device
2020-01-29 09:59:07.457934 I | exec: Running command: lsblk /dev/sr0 --bytes --nodeps --pairs --output SIZE,ROTA,RO,TYPE,PKNAME
2020-01-29 09:59:07.503758 W | inventory: skipping device sr0: Failed to complete 'lsblk /dev/sr0': exit status 1. lsblk: /dev/sr0: not a block device
2020-01-29 09:59:07.503793 I | cephosd: creating and starting the osds
2020-01-29 09:59:07.543504 I | cephosd: configuring osd devices: {"Entries":{}}
2020-01-29 09:59:07.543554 I | exec: Running command: ceph-volume lvm batch --prepare
2020-01-29 09:59:08.906271 I | cephosd: no more devices to configure
2020-01-29 09:59:08.906311 I | exec: Running command: ceph-volume lvm list --format json
2020-01-29 09:59:10.841568 I | cephosd: 0 ceph-volume osd devices configured on this node
2020-01-29 09:59:10.841595 I | cephosd: devices = []
2020-01-29 09:59:10.847396 I | cephosd: configuring osd dirs: map[/var/lib/rook:-1]
2020-01-29 09:59:10.848011 I | exec: Running command: ceph osd create 652071c9-2cdb-4df9-a20e-813738c4e3f6 --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/851021116
2020-01-29 09:59:14.213679 I | cephosd: successfully created OSD 652071c9-2cdb-4df9-a20e-813738c4e3f6 with ID 0
2020-01-29 09:59:14.213744 I | cephosd: osd.0 appears to be new, cleaning the root dir at /var/lib/rook/osd0
2020-01-29 09:59:14.214417 I | cephconfig: writing config file /var/lib/rook/osd0/rook-ceph.config
2020-01-29 09:59:14.214653 I | exec: Running command: ceph auth get-or-create osd.0 -o /var/lib/rook/osd0/keyring osd allow * mon allow profile osd --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format plain
2020-01-29 09:59:17.189996 I | cephosd: Initializing OSD 0 file system at /var/lib/rook/osd0...
2020-01-29 09:59:17.194681 I | exec: Running command: ceph mon getmap --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/298283883
2020-01-29 09:59:20.936868 I | exec: got monmap epoch 1
2020-01-29 09:59:20.937380 I | exec: Running command: ceph-osd --mkfs --id=0 --cluster=rook-ceph --conf=/var/lib/rook/osd0/rook-ceph.config --osd-data=/var/lib/rook/osd0 --osd-uuid=652071c9-2cdb-4df9-a20e-813738c4e3f6 --monmap=/var/lib/rook/osd0/tmp/activate.monmap --keyring=/var/lib/rook/osd0/keyring --osd-journal=/var/lib/rook/osd0/journal
2020-01-29 09:59:21.324912 I | mkfs-osd0: 2020-01-29 09:59:21.323 7fc7e2a8ea80 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to
force use of aio anyway
2020-01-29 09:59:21.386136 I | mkfs-osd0: 2020-01-29 09:59:21.384 7fc7e2a8ea80 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to
force use of aio anyway
2020-01-29 09:59:21.387553 I | mkfs-osd0: 2020-01-29 09:59:21.384 7fc7e2a8ea80 -1 journal do_read_entry(4096): bad header magic
2020-01-29 09:59:21.387585 I | mkfs-osd0: 2020-01-29 09:59:21.384 7fc7e2a8ea80 -1 journal do_read_entry(4096): bad header magic
2020-01-29 09:59:21.450639 I | cephosd: Config file /var/lib/rook/osd0/rook-ceph.config:
[global]
fsid = a19423a1-f135-446f-b4d9-f52da10a935f
mon initial members = a
mon host = v1:10.233.35.212:6789
public addr = 10.233.95.101
cluster addr = 10.233.95.101
mon keyvaluedb = rocksdb
mon_allow_pool_delete = true
mon_max_pg_per_osd = 1000
debug default = 0
debug rados = 0
debug mon = 0
debug osd = 0
debug bluestore = 0
debug filestore = 0
debug journal = 0
debug leveldb = 0
filestore_omap_backend = rocksdb
osd pg bits = 11
osd pgp bits = 11
osd pool default size = 1
osd pool default pg num = 100
osd pool default pgp num = 100
osd max object name len = 256
osd max object namespace len = 64
osd objectstore = filestore
rbd_default_features = 3
fatal signal handlers = false
[osd.0]
keyring = /var/lib/rook/osd0/keyring
osd journal size = 5120
2020-01-29 09:59:21.450723 I | cephosd: completed preparing osd &{ID:0 DataPath:/var/lib/rook/osd0 Config:/var/lib/rook/osd0/rook-ceph.config Cluster:rook-ceph KeyringPath:/var/lib/rook/osd0/keyring UUID:652071c9-2cdb-4df9-a20e-813738c4e3f6 Journal:/var/lib/rook/osd0/journal IsFileStore:true IsDirectory:true DevicePartUUID: CephVolumeInitiated:false LVPath: SkipLVRelease:false Location: LVBackedPV:false}
2020-01-29 09:59:21.450743 I | cephosd: 1/1 osd dirs succeeded on this node
2020-01-29 09:59:21.450755 I | cephosd: saving osd dir map
2020-01-29 09:59:21.479301 I | cephosd: device osds:[]
dir osds: [{ID:0 DataPath:/var/lib/rook/osd0 Config:/var/lib/rook/osd0/rook-ceph.config Cluster:rook-ceph KeyringPath:/var/lib/rook/osd0/keyring UUID:652071c9-2cdb-4df9-a20e-813738c4e3f6 Journal:/var/lib/rook/osd0/journal IsFileStore:true IsDirectory:true DevicePartUUID: CephVolumeInitiated:false LVPath: SkipLVRelease:false Location: LVBackedPV:false}]
Вы хоть представляете, в чем проблема и как я могу ее решить?
1 ответ
Я поговорил с Rook Dev об этой проблеме в этом посте: https://groups.google.com/forum/
Он смог воспроизвести проблему:
Йохан, я также могу воспроизвести эту проблему, когда ярлыки не захватываются OSD, даже если ярлыки обнаруживаются в модуле подготовки OSD, как вы видите. Не могли бы вы открыть для этого проблему с GitHub? Я ищу исправление.
Но похоже, что проблема касалась только OSD, использующих каталоги, и проблема не существует, когда вы используете устройства (например, устройства RAW):
Йохан, я обнаружил, что это влияет только на OSD, созданные в каталогах. Я бы порекомендовал вам протестировать создание OSD на необработанных устройствах, чтобы правильно заполнить карту CRUSH. В выпуске v1.3 также важно отметить, что поддержка каталогов в OSD удаляется. Ожидается, что после этого выпуска OSD будут созданы на необработанных устройствах или разделах. Подробнее см. В этом выпуске: https://github.com/rook/rook/issues/4724
Поскольку поддержка OSD в каталогах будет удалена в следующем выпуске, я не собираюсь исправлять эту проблему.
Как видите, проблема не будет устранена, потому что использование каталогов скоро станет устаревшим.
Я перезапустил свои тесты с использованием устройств RAW вместо каталогов, и это сработало отлично.
Я хочу поблагодарить Трэвиса за оказанную помощь и быстрые ответы!