Kubernetes: mysql, rabbitmq and etcd pods failing, possible resolv.conf misconfiguration?

I am running a k8s cluster on vSphere and trying to launch the etcd, mongodb and mysql components. When running kubeadm init, the etcd container failed to come up, so the initialization could not complete. I noticed that resolv.conf listed four different names on its search line. Commenting out the search line entirely let the etcd container start and got Kubernetes up and running on the cluster.

On every node, resolv.conf now looks like this:

# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 127.0.1.1
#search den.solidfire.net one.den.solidfire.net ten.den.solidfire.net solidfire.net

so there is no active search line. When I try to launch an application that runs etcd, mysql and rabbitmq pods, all three run into problems, even though the same application works fine on cloud providers such as Azure and AWS.
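
To make the state of the file explicit, here is a minimal Python sketch that parses resolv.conf-style text and flags two things that may be worth checking on the nodes: loopback nameservers such as 127.0.1.1, which are generally not reachable from inside a pod's network namespace, and a search list that could overflow once cluster domains are added. The helper names, the limit of six search domains (the classic glibc value) and the count of three cluster search domains are assumptions for illustration, not values confirmed for this cluster.

```python
import ipaddress

# Assumptions for illustration only; verify against your glibc and kubelet versions.
SEARCH_LIMIT = 6             # classic glibc cap on search domains
CLUSTER_SEARCH_DOMAINS = 3   # <ns>.svc.<domain>, svc.<domain>, <domain>


def parse_resolv_conf(text):
    """Return (nameservers, search_domains) from resolv.conf-style text."""
    nameservers, search = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":   # skip blanks and comment lines
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) > 1:
            nameservers.append(fields[1])
        elif fields[0] == "search":
            search = fields[1:]           # a later search line replaces an earlier one
    return nameservers, search


def loopback_nameservers(nameservers):
    """Return the nameservers that are loopback addresses (127.0.0.0/8 or ::1)."""
    flagged = []
    for ns in nameservers:
        try:
            if ipaddress.ip_address(ns).is_loopback:
                flagged.append(ns)
        except ValueError:
            pass                          # not a plain IP literal; ignore here
    return flagged


def exceeds_search_limit(search):
    """True if node search domains plus assumed cluster domains exceed the limit."""
    return len(search) + CLUSTER_SEARCH_DOMAINS > SEARCH_LIMIT


sample = """\
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
nameserver 127.0.1.1
#search den.solidfire.net one.den.solidfire.net ten.den.solidfire.net solidfire.net
"""

servers, search = parse_resolv_conf(sample)
print("nameservers:", servers)
print("search domains:", search)
print("loopback nameservers:", loopback_nameservers(servers))
print("search list may overflow:", exceeds_search_limit(search))
```

On a node, the same functions could be fed the contents of /etc/resolv.conf instead of the inline sample.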

The MySQL pod reports the following events:

Events:
  Type     Reason                  Age                 From                      Message
  ----     ------                  ----                ----                      -------
  Warning  FailedScheduling        52m (x3 over 52m)   default-scheduler         pod has unbound PersistentVolumeClaims (repeated 3 times)
  Normal   Scheduled               52m                 default-scheduler         Successfully assigned mysql-0 to sde-slave-test3
  Normal   SuccessfulAttachVolume  52m                 attachdetach-controller   AttachVolume.Attach succeeded for volume "default-datadir-mysql-0-25e1f"
  Normal   SuccessfulMountVolume   52m                 kubelet, sde-slave-test3  MountVolume.SetUp succeeded for volume "config-emptydir"
  Normal   SuccessfulMountVolume   52m                 kubelet, sde-slave-test3  MountVolume.SetUp succeeded for volume "config"
  Normal   SuccessfulMountVolume   52m                 kubelet, sde-slave-test3  MountVolume.SetUp succeeded for volume "default-token-x2fsd"
  Normal   SuccessfulMountVolume   52m                 kubelet, sde-slave-test3  MountVolume.SetUp succeeded for volume "default-datadir-mysql-0-25e1f"
  Normal   Started                 52m                 kubelet, sde-slave-test3  Started container
  Normal   Pulled                  52m                 kubelet, sde-slave-test3  Container image "registry.qstack.com/qstack/mariadb-cluster:10.3.1" already present on machine
  Normal   Created                 52m                 kubelet, sde-slave-test3  Created container
  Normal   Pulled                  52m                 kubelet, sde-slave-test3  Container image "registry.qstack.com/qstack/mysqld-exporter:1.1" already present on machine
  Normal   Created                 52m                 kubelet, sde-slave-test3  Created container
  Normal   Started                 52m                 kubelet, sde-slave-test3  Started container
  Warning  Unhealthy               51m (x2 over 51m)   kubelet, sde-slave-test3  Liveness probe failed: Get http://10.42.0.9:9104/metrics: dial tcp 10.42.0.9:9104: getsockopt: connection refused
  Warning  Unhealthy               51m (x3 over 52m)   kubelet, sde-slave-test3  Readiness probe failed: Get http://10.42.0.9:9104/metrics: dial tcp 10.42.0.9:9104: getsockopt: connection refused
  Warning  Unhealthy               51m                 kubelet, sde-slave-test3  Liveness probe failed: Get http://10.42.0.9:9104/metrics: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
  Warning  Unhealthy               51m (x3 over 51m)   kubelet, sde-slave-test3  Readiness probe failed: Get http://10.42.0.9:9104/metrics: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
  Warning  Unhealthy               7m (x267 over 51m)  kubelet, sde-slave-test3  Readiness probe failed: ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  Warning  BackOff                 2m (x148 over 45m)  kubelet, sde-slave-test3  Back-off restarting failed container

RabbitMQ exits with the following log before going into CrashLoopBackOff:

2018-08-08 13:59:02.268 [info] <0.198.0> Will try to lock with peer discovery backend rabbit_peer_discovery_k8s
2018-08-08 13:59:02.268 [info] <0.198.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping randomized startup delay.
2018-08-08 13:59:07.275 [info] <0.198.0> Failed to get nodes from k8s - {failed_connect,[{to_address,{"kubernetes.default.svc.cluster.local",443}},
                 {inet,[inet],nxdomain}]}
2018-08-08 13:59:07.276 [error] <0.197.0> CRASH REPORT Process <0.197.0> with 0 neighbours exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"kubernetes.default.svc.cluster.local\",443}},\n                 {inet,[inet],nxdomain}]}"} in rabbit_mnesia:init_from_config/0 line 163 in application_master:init/4 line 134
2018-08-08 13:59:07.276 [info] <0.33.0> Application rabbit exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"kubernetes.default.svc.cluster.local\",443}},\n                 {inet,[inet],nxdomain}]}"} in rabbit_mnesia:init_from_config/0 line 163
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,\"{failed_connect,[{to_address,{\\"kubernetes.default.svc.cluster.local\\",443}},\n                 {inet,[inet],nxdomain}]}\"}},[{rabbit_mnesia,init_from_config,0,[{file,\"src/rabbit_mnesia.erl\"},{line,163}]},{rabbit_mnesia,init_with_lock,3,[{file,\"src/rabbit_mnesia.erl\"},{line,143}]},{rabbit_mnesia,init,0,[{file,\"src/rabbit_mnesia.erl\"},{line,111}]},{rabbit_boot_steps,'-run_step/2-lc$^1/1-1-',1,[{file,\"src/rabbit_boot_steps.erl\"},{line,49}]},{rabbit_boot_steps,run_step,2,[{file,\"src/rabbit_boot_steps.erl\"},{line,49}]},{rabbit_boot_steps,'-run_boot_steps/1-lc$^0/1-0-',1,[{file,\"src/rabbit_boot_steps.erl\"},{line,26}]},{rabbit_boot_steps,run_boot_steps,1,[{file,\"src/rabbit_boot_steps.erl\"},{line,26}]},{rabbit,start,2,[{file,\"src/rabbit.erl\"},{line,792}]}]}}}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,"{failed_connect,[{to_address,{\"kubernetes.defau

Crash dump is being written to: /var/log/rabbitmq/erl_crash.dump...done
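
The nxdomain in the log above means the lookup of kubernetes.default.svc.cluster.local failed inside the pod. A minimal Python sketch for reproducing that lookup from a container with Python installed is shown below. The function name check_names is hypothetical, and the two hostnames are simply the ones that appear in the logs in this post.

```python
import socket


def check_names(names, port=443):
    """Try to resolve each name; map it to its addresses or to an error string."""
    results = {}
    for name in names:
        try:
            infos = socket.getaddrinfo(name, port, proto=socket.IPPROTO_TCP)
            # info[4] is the socket address tuple; its first item is the IP.
            results[name] = sorted({info[4][0] for info in infos})
        except (socket.gaierror, UnicodeError) as exc:
            results[name] = f"lookup failed: {exc}"
    return results


if __name__ == "__main__":
    names = ["kubernetes.default.svc.cluster.local", "etcd-0.etcd"]
    for name, outcome in check_names(names).items():
        print(name, "->", outcome)
```

If both names fail inside a pod but resolve through the cluster DNS service directly, the pod's own /etc/resolv.conf would be the next thing to inspect.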

And etcd hangs indefinitely on:

Waiting for etcd-0.etcd to come up
ping: bad address 'etcd-0.etcd'
Waiting for etcd-0.etcd to come up
ping: bad address 'etcd-0.etcd'
Waiting for etcd-0.etcd to come up
ping: bad address 'etcd-0.etcd'
Waiting for etcd-0.etcd to come up
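
A short name like etcd-0.etcd only resolves if the pod's resolver appends a search domain to it. The sketch below shows, in Python, the order in which a glibc-style resolver would typically try candidate names given a search list and an ndots setting. The search list and ndots value used in the example are assumptions about a typical pod in the default namespace with the cluster.local domain, not values read from this cluster.

```python
def candidate_names(name, search, ndots=1):
    """List the fully qualified names a resolver would try, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as absolute; the search list is skipped.
        return [name.rstrip(".")]
    expanded = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [name] + expanded
    # Too few dots: try the search list first, then the name as given.
    return expanded + [name]


# Assumed pod-side settings (hypothetical for this cluster).
pod_search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

for candidate in candidate_names("etcd-0.etcd", pod_search, ndots=5):
    print(candidate)

# With an empty search list, only the bare name is ever tried.
print(candidate_names("etcd-0.etcd", [], ndots=5))
```

Under these assumptions the first candidate is etcd-0.etcd.default.svc.cluster.local, so if the search line is missing from the pod's resolv.conf, only the bare name is queried and a "bad address" error like the one above would be expected.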

I now suspect this is related to resolv.conf having been modified. Has anyone run into this before?

If needed, I can post more logs or the k8s specs for each component.
