Kubernetes: mysql, rabbitmq, and etcd pods failing, possibly because of a misconfigured resolv.conf?
I am running a k8s cluster on vSphere and am trying to run etcd, mongodb, and mysql components. When running kubeadm init, the etcd container failed to come up, so the initialization could not complete. I noticed that resolv.conf listed four different names under search, and commenting out the search line entirely let the etcd container start and got kubernetes up and running on the cluster.
On each node, resolv.conf looks like this:
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 127.0.1.1
#search den.solidfire.net one.den.solidfire.net ten.den.solidfire.net solidfire.net
so there is no active search line. When I try to run an application that uses etcd, mysql, and rabbitmq pods, all three run into problems, even though the same application works fine on cloud providers such as Azure and AWS.
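To double-check which directives are actually active in a file like the one above, something along these lines could be used. This is only a minimal sketch, assuming Python 3 is available on the node; the helper names `active_directives` and `loopback_nameservers` are made up for illustration, and the sample text is the file shown above with its wrapped lines joined.

```python
import ipaddress


def active_directives(text):
    """Return the uncommented directives of a resolv.conf-style text.

    The result maps each keyword (for example "nameserver" or "search")
    to the list of values found for it.
    """
    directives = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments; '#' and ';' both start a comment.
        if not line or line[0] in "#;":
            continue
        keyword, *values = line.split()
        directives.setdefault(keyword, []).extend(values)
    return directives


def loopback_nameservers(text):
    """Return the configured nameservers that are loopback addresses."""
    servers = active_directives(text).get("nameserver", [])
    return [s for s in servers if ipaddress.ip_address(s).is_loopback]


# Sample copied from the question, with wrapped lines joined (an assumption
# about how the file looks on disk).
SAMPLE = """\
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 127.0.1.1
#search den.solidfire.net one.den.solidfire.net ten.den.solidfire.net solidfire.net
"""

if __name__ == "__main__":
    print("active directives:", active_directives(SAMPLE))
    print("loopback nameservers:", loopback_nameservers(SAMPLE))
```

On the sample, the only active directive is the nameserver entry, and that entry is a loopback address. On a real node the text would come from reading /etc/resolv.conf instead of SAMPLE.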
The mysql pod reports the following events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 52m (x3 over 52m) default-scheduler pod has unbound PersistentVolumeClaims (repeated 3 times)
Normal Scheduled 52m default-scheduler Successfully assigned mysql-0 to sde-slave-test3
Normal SuccessfulAttachVolume 52m attachdetach-controller AttachVolume.Attach succeeded for volume "default-datadir-mysql-0-25e1f"
Normal SuccessfulMountVolume 52m kubelet, sde-slave-test3 MountVolume.SetUp succeeded for volume "config-emptydir"
Normal SuccessfulMountVolume 52m kubelet, sde-slave-test3 MountVolume.SetUp succeeded for volume "config"
Normal SuccessfulMountVolume 52m kubelet, sde-slave-test3 MountVolume.SetUp succeeded for volume "default-token-x2fsd"
Normal SuccessfulMountVolume 52m kubelet, sde-slave-test3 MountVolume.SetUp succeeded for volume "default-datadir-mysql-0-25e1f"
Normal Started 52m kubelet, sde-slave-test3 Started container
Normal Pulled 52m kubelet, sde-slave-test3 Container image "registry.qstack.com/qstack/mariadb-cluster:10.3.1" already present on machine
Normal Created 52m kubelet, sde-slave-test3 Created container
Normal Pulled 52m kubelet, sde-slave-test3 Container image "registry.qstack.com/qstack/mysqld-exporter:1.1" already present on machine
Normal Created 52m kubelet, sde-slave-test3 Created container
Normal Started 52m kubelet, sde-slave-test3 Started container
Warning Unhealthy 51m (x2 over 51m) kubelet, sde-slave-test3 Liveness probe failed: Get http://10.42.0.9:9104/metrics: dial tcp 10.42.0.9:9104: getsockopt: connection refused
Warning Unhealthy 51m (x3 over 52m) kubelet, sde-slave-test3 Readiness probe failed: Get http://10.42.0.9:9104/metrics: dial tcp 10.42.0.9:9104: getsockopt: connection refused
Warning Unhealthy 51m kubelet, sde-slave-test3 Liveness probe failed: Get http://10.42.0.9:9104/metrics: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 51m (x3 over 51m) kubelet, sde-slave-test3 Readiness probe failed: Get http://10.42.0.9:9104/metrics: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 7m (x267 over 51m) kubelet, sde-slave-test3 Readiness probe failed: ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
Warning BackOff 2m (x148 over 45m) kubelet, sde-slave-test3 Back-off restarting failed container
Rabbitmq logs the following before going into CrashLoopBackOff:
2018-08-08 13:59:02.268 [info] <0.198.0> Will try to lock with peer discovery backend rabbit_peer_discovery_k8s
2018-08-08 13:59:02.268 [info] <0.198.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping randomized startup delay.
2018-08-08 13:59:07.275 [info] <0.198.0> Failed to get nodes from k8s - {failed_connect,[{to_address,{"kubernetes.default.svc.cluster.local",443}},
{inet,[inet],nxdomain}]}
2018-08-08 13:59:07.276 [error] <0.197.0> CRASH REPORT Process <0.197.0> with 0 neighbours exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"kubernetes.default.svc.cluster.local\",443}},\n {inet,[inet],nxdomain}]}"} in rabbit_mnesia:init_from_config/0 line 163 in application_master:init/4 line 134
2018-08-08 13:59:07.276 [info] <0.33.0> Application rabbit exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"kubernetes.default.svc.cluster.local\",443}},\n {inet,[inet],nxdomain}]}"} in rabbit_mnesia:init_from_config/0 line 163
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,\"{failed_connect,[{to_address,{\\"kubernetes.default.svc.cluster.local\\",443}},\n {inet,[inet],nxdomain}]}\"}},[{rabbit_mnesia,init_from_config,0,[{file,\"src/rabbit_mnesia.erl\"},{line,163}]},{rabbit_mnesia,init_with_lock,3,[{file,\"src/rabbit_mnesia.erl\"},{line,143}]},{rabbit_mnesia,init,0,[{file,\"src/rabbit_mnesia.erl\"},{line,111}]},{rabbit_boot_steps,'-run_step/2-lc$^1/1-1-',1,[{file,\"src/rabbit_boot_steps.erl\"},{line,49}]},{rabbit_boot_steps,run_step,2,[{file,\"src/rabbit_boot_steps.erl\"},{line,49}]},{rabbit_boot_steps,'-run_boot_steps/1-lc$^0/1-0-',1,[{file,\"src/rabbit_boot_steps.erl\"},{line,26}]},{rabbit_boot_steps,run_boot_steps,1,[{file,\"src/rabbit_boot_steps.erl\"},{line,26}]},{rabbit,start,2,[{file,\"src/rabbit.erl\"},{line,792}]}]}}}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,"{failed_connect,[{to_address,{\"kubernetes.defau
Crash dump is being written to: /var/log/rabbitmq/erl_crash.dump...done
And etcd hangs indefinitely on:
Waiting for etcd-0.etcd to come up
ping: bad address 'etcd-0.etcd'
Waiting for etcd-0.etcd to come up
ping: bad address 'etcd-0.etcd'
Waiting for etcd-0.etcd to come up
ping: bad address 'etcd-0.etcd'
Waiting for etcd-0.etcd to come up
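Both the rabbitmq and etcd logs fail on a name lookup, so one way to narrow this down might be to try resolving the same names from inside a pod. The following is a rough sketch, assuming a container image that ships Python 3 (the images above may not); `check_names` is a hypothetical helper, and the two names are taken from the logs above.

```python
import socket


def check_names(names, resolve=socket.getaddrinfo):
    """Try to resolve each name.

    Returns a dict mapping each name to a sorted list of addresses,
    or to an error string if the lookup failed.
    """
    results = {}
    for name in names:
        try:
            infos = resolve(name, None)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] holds the resolved address.
            results[name] = sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            results[name] = f"lookup failed: {exc}"
    return results


if __name__ == "__main__":
    # Names that appear in the rabbitmq and etcd logs.
    names = ["kubernetes.default.svc.cluster.local", "etcd-0.etcd"]
    for name, outcome in check_names(names).items():
        print(name, "->", outcome)
```

The `resolve` parameter only exists so the function can be exercised without a network. If both names fail inside a pod while ordinary external names resolve on the node, that would point at cluster DNS rather than at the individual applications.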
I now suspect this is related to resolv.conf having been modified. Has anyone run into this before?
If needed, I can post more logs or the k8s specs for each component.