Anthos Multi Cluster Ingress: intermittent connectivity and disappearing backend service

I'm running 2 private GKE clusters set up in europe-west2. I have a dedicated config cluster for MCI and a worker cluster for workloads. Both clusters are registered to the Anthos hub, and the ingress feature is enabled on the config cluster. In addition, the worker cluster runs the latest ASM, 1.12.2.

As far as MCI is concerned, my deployment is 'standard', i.e. based on the available docs (terraform-example-foundation repo etc.).

Everything works, but I'm hitting an intermittent connectivity issue no matter how many times I redeploy the entire stack. My eyes are bleeding from staring at the logging dashboard, and I've run out of dots to connect.

I'm probing some endpoints exposed from my cluster. Most of the time they return 200, with the following logged under resource.type="http_load_balancer":

httpRequest: {
 latency: "0.081658s"
 remoteIp: ""
 requestMethod: "GET"
 requestSize: "360"
 requestUrl: ""
 responseSize: "1054"
 serverIp: ""
 status: 200
}
insertId: "10mjvz4e8g0nq"
jsonPayload: {
 @type: ""
 statusDetails: "response_sent_by_backend"
}
resource: {
 labels: {
  backend_service_name: "mci-4z8mmz-80-asm-ingress-mcs-istio"
  forwarding_rule_name: "mci-4z8mmz-fws-asm-ingress-mci-istio"
  project_id: "prj-foo-bar"
  target_proxy_name: "mci-4z8mmz-asm-ingress-mci-istio"
  url_map_name: "mci-4z8mmz-asm-ingress-mci-istio"
  zone: "global"
 }
 type: "http_load_balancer"
}
severity: "INFO"
spanId: "2a986abfc69bef6f"
timestamp: "2022-02-04T15:24:14.160642Z"

At random intervals, anywhere between 1 and 5 hours, the probes start failing with 404 for a period of 5 to 10 minutes, and the following is logged:

httpRequest: {
 requestMethod: "GET"
 requestUrl: ""
 status: 404
}
insertId: "10mjvz4e8g0nq"
jsonPayload: {
 @type: ""
 statusDetails: "internal_error"
}
resource: {
 labels: {
  backend_service_name: ""
  forwarding_rule_name: "mci-4z8mmz-fws-asm-ingress-mci-istio"
  project_id: "prj-foo-bar"
  target_proxy_name: "mci-4z8mmz-asm-ingress-mci-istio"
  url_map_name: "mci-4z8mmz-asm-ingress-mci-istio"
  zone: "global"
 }
 type: "http_load_balancer"
}
severity: "WARNING"

backend_service_name and serverIp disappear, and the external LB provisioned via MCI goes for an extended nap. If I try to access the endpoints in a browser during that period, I get a 404 and eventually a "connection was closed" error.
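
To line the 404 periods up against other logs, it can help to reduce the raw probe results to outage windows. Below is a minimal sketch, assuming the probe results are available as (timestamp, HTTP status) pairs; the function name `outage_windows` and the sample data are hypothetical and made up purely for illustration, not taken from my actual probes.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

# Hypothetical sample shape: (probe time, HTTP status returned by the LB).
Sample = Tuple[datetime, int]


def outage_windows(samples: Iterable[Sample]) -> List[Tuple[datetime, datetime]]:
    """Group consecutive non-200 samples into (first_failure, last_failure) windows."""
    windows: List[Tuple[datetime, datetime]] = []
    start = None
    last_bad = None
    for ts, status in sorted(samples):
        if status != 200:
            if start is None:
                start = ts  # first failing probe of a new window
            last_bad = ts
        elif start is not None:
            # A healthy probe closes the current window.
            windows.append((start, last_bad))
            start = None
    if start is not None:
        # The series ended while probes were still failing.
        windows.append((start, last_bad))
    return windows


if __name__ == "__main__":
    # Made-up data: one probe per minute.
    t0 = datetime(2022, 2, 4, 15, 0, 0)
    statuses = [200, 200, 404, 404, 404, 200, 200, 404]
    samples = [(t0 + timedelta(minutes=i), s) for i, s in enumerate(statuses)]
    for first, last in outage_windows(samples):
        print(f"{first.isoformat()} -> {last.isoformat()} ({last - first})")
```

The resulting windows can then be used as time ranges when searching the load balancer and MCI controller logs.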

I've searched logs far and wide and cannot find any leads.

Has anyone experienced a similar issue? Could this be a regional thing? I have yet to try deploying to another region.

Any info/links/ideas much appreciated.


I also confirmed that the health checks are fine and there are no transitions. The pods never receive the requests, so the 404s are coming from the external LB.

1 Answer

I had the same or a similar issue when using HTTPS with MultiClusterIngress.

Google support suggested using a literal static IP address for the annotation: STATIC_IP_ADDRESS

Try using a literal IP address, for example


as described in

If that doesn't fix the problem, try contacting Google support.
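
To check which form an existing annotation value takes, you can test whether it parses as an IP address. This is only a minimal sketch: it assumes the annotation in question is `networking.gke.io/static-ip` (the key is not shown above) and that you have its value as a string. The helper name `is_literal_ip` and the sample values are hypothetical; the address is from the 203.0.113.0/24 documentation range.

```python
import ipaddress


def is_literal_ip(value: str) -> bool:
    """Return True if value is a bare IPv4/IPv6 address, not a name or resource URL."""
    try:
        ipaddress.ip_address(value.strip())
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    # Hypothetical annotation values, for illustration only.
    for candidate in ("203.0.113.10", "my-ip-name", "https://example.invalid/addresses/my-ip-name"):
        print(candidate, is_literal_ip(candidate))
```

If the check returns False for the value in your manifest, replace it with the reserved address itself.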