
[4] Using NodeLocalDNS with Cilium

한명 2025. 8. 2. 21:43

In this post, we'll look at how pod DNS queries behave by default and how to use NodeLocalDNS.

 

Contents

  1. Pod DNS queries and CoreDNS
  2. Using NodeLocalDNS
  3. Cilium's Local Redirect Policy

 

1. Pod DNS queries and CoreDNS

Let's look at how DNS is handled in a Kubernetes environment, alongside the CoreDNS configuration.

# Check the pod's DNS settings (search domains, nameserver, and ndots)
kubectl exec -it curl-pod -- cat /etc/resolv.conf

(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl exec -it curl-pod -- cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.96.0.10
options ndots:5

# These values come from the kubelet configuration.
cat /var/lib/kubelet/config.yaml | grep cluster -A1

(⎈|HomeLab:N/A) root@k8s-ctr:~# cat /var/lib/kubelet/config.yaml | grep cluster -A1
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
containerRuntimeEndpoint: ""

# 10.96.0.10 is the ClusterIP of the kube-dns Service, which fronts the coredns pods.
kubectl get svc,ep -n kube-system kube-dns
kubectl get pod -n kube-system -l k8s-app=kube-dns

(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl get svc,ep -n kube-system kube-dns
Warning: v1 Endpoints is deprecated in v1.33+; use discovery.k8s.io/v1 EndpointSlice
NAME               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
service/kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   4h40m

NAME                 ENDPOINTS                                                     AGE
endpoints/kube-dns   172.20.0.224:53,172.20.1.107:53,172.20.0.224:53 + 3 more...   4h40m
(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl get pod -n kube-system -l k8s-app=kube-dns -owide
NAME                       READY   STATUS    RESTARTS   AGE    IP             NODE      NOMINATED NODE   READINESS GATES
coredns-674b8bbfcf-j48sp   1/1     Running   0          3h8m   172.20.1.107   k8s-ctr   <none>           <none>
coredns-674b8bbfcf-zfdq8   1/1     Running   0          3h8m   172.20.0.224   k8s-w1    <none>           <none>

# CoreDNS reads its configuration from a ConfigMap named coredns.
kubectl describe pod -n kube-system -l k8s-app=kube-dns
...
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      coredns
    Optional:  false
...

# Inspect the configuration defined in the coredns ConfigMap.
kubectl describe cm -n kube-system coredns
...
Corefile:
----
.:53 {              # listen on port 53 for all domains
    errors          # log errors that occur while serving DNS responses
    health {        # expose a health endpoint for status checks
       lameduck 5s  # on shutdown, drain traffic in lameduck mode for 5s (graceful shutdown)
    }
    ready           # readiness endpoint; the HTTP endpoint on :8181 returns 200 OK once all plugins signal ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {    # kubernetes plugin: serves in-cluster records; cluster.local is the cluster domain
       pods insecure                         # allow pod-IP lookups without verification (insecure)
       fallthrough in-addr.arpa ip6.arpa     # if no result in these zones, fall through to the next plugin
       ttl 30                                # record TTL (30 seconds)
    }
    prometheus :9153 # expose Prometheus metrics on :9153
    forward . /etc/resolv.conf {             # anything CoreDNS doesn't own ('.' = all remaining queries) is forwarded upstream (usually external DNS)
       max_concurrent 1000                   # at most 1000 concurrent forwarded queries
    }
    cache 30 {                        # cache DNS responses, default TTL cap 30 seconds
       disable success cluster.local  # don't cache successful answers for cluster.local
       disable denial cluster.local   # don't cache denials (NXDOMAIN) for cluster.local either
    }
    loop         # detect simple forwarding loops and halt the CoreDNS process if one is found
    reload       # re-apply the Corefile automatically on change; a ConfigMap edit takes about 2 minutes to propagate
    loadbalance  # round-robin DNS load balancer that randomizes the order of A, AAAA, and MX records in responses
}
}
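As a quick check, the health and ready plugins above expose plain HTTP endpoints on the coredns pod itself (:8080 and :8181 are the plugin defaults; 172.20.1.107 is the coredns pod IP from the earlier output):

# probe the health (:8080) and ready (:8181) endpoints on the coredns pod
kubectl exec -it curl-pod -- curl -s 172.20.1.107:8080/health
kubectl exec -it curl-pod -- curl -s 172.20.1.107:8181/ready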

# Check the host's /etc/resolv.conf (comments stripped)
(⎈|HomeLab:N/A) root@k8s-ctr:~# cat /etc/resolv.conf |grep -v "#"

nameserver 127.0.0.53
options edns0 trust-ad
search .

# 127.0.0.53 is the local systemd-resolved stub; internally it refers to the servers below.
resolvectl 

(⎈|HomeLab:N/A) root@k8s-ctr:~# resolvectl
Global
         Protocols: -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
  resolv.conf mode: stub

Link 2 (eth0)
    Current Scopes: DNS
         Protocols: +DefaultRoute -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 10.0.2.3
       DNS Servers: 10.0.2.3

 

Putting this together: the pod sends DNS queries to coredns, coredns answers for the cluster domain itself, and for any other domain the answer is resolved according to the node's /etc/resolv.conf.

 

Let's test DNS queries in the lab environment.

# Monitoring 1
cilium hubble port-forward&
hubble observe -f --port 53
hubble observe -f --port 53 --protocol UDP
hubble observe -f --pod curl-pod --port 53

# Monitoring 2
tcpdump -i any udp port 53 -nn

# Scale coredns down to one pod to make observation easier
kubectl scale deployment -n kube-system coredns --replicas 1
kubectl get pod -n kube-system -l k8s-app=kube-dns -owide

(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl scale deployment -n kube-system coredns --replicas 1
deployment.apps/coredns scaled
(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl get pod -n kube-system -l k8s-app=kube-dns -owide
NAME                       READY   STATUS        RESTARTS   AGE     IP             NODE      NOMINATED NODE   READINESS GATES
coredns-674b8bbfcf-j48sp   1/1     Running       0          3h22m   172.20.1.107   k8s-ctr   <none>           <none>

# Check coredns cache metrics
kubectl exec -it curl-pod -- curl kube-dns.kube-system.svc:9153/metrics | grep coredns_cache_ | grep -v ^#

(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl exec -it curl-pod -- curl kube-dns.kube-system.svc:9153/metrics | grep coredns_cache_ | grep -v ^#
coredns_cache_entries{server="dns://:53",type="denial",view="",zones="."} 1
coredns_cache_entries{server="dns://:53",type="success",view="",zones="."} 0
coredns_cache_misses_total{server="dns://:53",view="",zones="."} 31
coredns_cache_requests_total{server="dns://:53",view="",zones="."} 31


# Query domains
kubectl exec -it curl-pod -- nslookup -debug webpod
kubectl exec -it curl-pod -- nslookup -debug google.com

(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl exec -it curl-pod -- nslookup -debug webpod
;; Got recursion not available from 10.96.0.10
Server:         10.96.0.10
Address:        10.96.0.10#53

------------
    QUESTIONS:
        webpod.default.svc.cluster.local, type = A, class = IN
    ANSWERS:
    ->  webpod.default.svc.cluster.local
        internet address = 10.96.195.112
        ttl = 30
    AUTHORITY RECORDS:
    ADDITIONAL RECORDS:
------------
Name:   webpod.default.svc.cluster.local
Address: 10.96.195.112
;; Got recursion not available from 10.96.0.10
------------
    QUESTIONS:
        webpod.default.svc.cluster.local, type = AAAA, class = IN
    ANSWERS:
    AUTHORITY RECORDS:
    ->  cluster.local
        origin = ns.dns.cluster.local
        mail addr = hostmaster.cluster.local
        serial = 1754132506
        refresh = 7200
        retry = 1800
        expire = 86400
        minimum = 30
        ttl = 30
    ADDITIONAL RECORDS:
------------

(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl exec -it curl-pod -- nslookup -debug google.com
;; Got recursion not available from 10.96.0.10
Server:         10.96.0.10
Address:        10.96.0.10#53

------------
    QUESTIONS:
        google.com.default.svc.cluster.local, type = A, class = IN # queried with the search domain appended
    ANSWERS:
    AUTHORITY RECORDS:
    ->  cluster.local
        origin = ns.dns.cluster.local
        mail addr = hostmaster.cluster.local
        serial = 1754132506
        refresh = 7200
        retry = 1800
        expire = 86400
        minimum = 30
        ttl = 30
    ADDITIONAL RECORDS:
------------
** server can't find google.com.default.svc.cluster.local: NXDOMAIN
;; Got recursion not available from 10.96.0.10
Server:         10.96.0.10
Address:        10.96.0.10#53

------------
    QUESTIONS:
        google.com.svc.cluster.local, type = A, class = IN # queried with the search domain appended
    ANSWERS:
    AUTHORITY RECORDS:
    ->  cluster.local
        origin = ns.dns.cluster.local
        mail addr = hostmaster.cluster.local
        serial = 1754132506
        refresh = 7200
        retry = 1800
        expire = 86400
        minimum = 30
        ttl = 30
    ADDITIONAL RECORDS:
------------
** server can't find google.com.svc.cluster.local: NXDOMAIN
;; Got recursion not available from 10.96.0.10
Server:         10.96.0.10
Address:        10.96.0.10#53

------------
    QUESTIONS:
        google.com.cluster.local, type = A, class = IN # queried with the search domain appended
    ANSWERS:
    AUTHORITY RECORDS:
    ->  cluster.local
        origin = ns.dns.cluster.local
        mail addr = hostmaster.cluster.local
        serial = 1754132506
        refresh = 7200
        retry = 1800
        expire = 86400
        minimum = 30
        ttl = 30
    ADDITIONAL RECORDS:
------------
** server can't find google.com.cluster.local: NXDOMAIN
Server:         10.96.0.10
Address:        10.96.0.10#53

------------
    QUESTIONS:
        google.com, type = A, class = IN # final query, name as-is
    ANSWERS:
    ->  google.com
        internet address = 172.217.175.14
        ttl = 30
    AUTHORITY RECORDS:
    ADDITIONAL RECORDS:
------------
Non-authoritative answer:
Name:   google.com
Address: 172.217.175.14
------------
    QUESTIONS:
        google.com, type = AAAA, class = IN
    ANSWERS:
    ->  google.com
        has AAAA address 2404:6800:4004:823::200e
        ttl = 30
    AUTHORITY RECORDS:
    ADDITIONAL RECORDS:
------------
Name:   google.com
Address: 2404:6800:4004:823::200e

command terminated with exit code 1
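The cascade above is ndots:5 at work: google.com has fewer than five dots, so each search domain is tried first, producing three NXDOMAIN round trips before the final absolute query succeeds. Appending a trailing dot marks the name fully qualified and skips the search list:

# the trailing dot makes the name absolute, so only the final query is sent
kubectl exec -it curl-pod -- nslookup google.com.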

# Enable coredns logging and debugging
k9s → configmap → select coredns → E (edit) → add log and debug as below, then save and exit
    .:53 {
        log
        debug
        errors

# Log monitoring 3
kubectl -n kube-system logs -l k8s-app=kube-dns -f


# Query the domains again
kubectl exec -it curl-pod -- nslookup webpod
kubectl exec -it curl-pod -- nslookup google.com


# coredns logs
(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl -n kube-system logs -l k8s-app=kube-dns -f
linux/amd64, go1.23.3, 51e11f1
...
# webpod
[INFO] 172.20.1.218:38704 - 5586 "A IN webpod.default.svc.cluster.local. udp 50 false 512" NOERROR qr,aa,rd 98 0.000898836s
[INFO] 172.20.1.218:38693 - 50432 "AAAA IN webpod.default.svc.cluster.local. udp 50 false 512" NOERROR qr,aa,rd 143 0.003870525s
# google.com
[INFO] 172.20.1.218:52681 - 2613 "A IN google.com.default.svc.cluster.local. udp 54 false 512" NXDOMAIN qr,aa,rd 147 0.001374837s
[INFO] 172.20.1.218:38214 - 31588 "A IN google.com.svc.cluster.local. udp 46 false 512" NXDOMAIN qr,aa,rd 139 0.006958878s
[INFO] 172.20.1.218:48197 - 4129 "A IN google.com.cluster.local. udp 42 false 512" NXDOMAIN qr,aa,rd 135 0.00506481s
[INFO] 172.20.1.218:56039 - 7882 "A IN google.com. udp 28 false 512" NOERROR qr,rd,ra 54 0.02799622s
[INFO] 172.20.1.218:36634 - 19787 "AAAA IN google.com. udp 28 false 512" NOERROR qr,rd,ra 66 0.021259972s

# hubble observe output
(⎈|HomeLab:N/A) root@k8s-ctr:~# hubble observe -f --port 53 --protocol UDP
Aug  2 11:15:28.720: default/curl-pod (ID:13646) <> 10.96.0.10:53 (world) pre-xlate-fwd TRACED (UDP)
Aug  2 11:15:28.721: default/curl-pod (ID:13646) <> kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) post-xlate-fwd TRANSLATED (UDP)
Aug  2 11:15:28.722: default/curl-pod:38704 (ID:13646) -> kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) to-endpoint FORWARDED (UDP)
Aug  2 11:15:28.746: default/curl-pod:38704 (ID:13646) <- kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) to-endpoint FORWARDED (UDP)
Aug  2 11:15:28.746: kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) <> default/curl-pod (ID:13646) pre-xlate-rev TRACED (UDP)
Aug  2 11:15:28.746: 10.96.0.10:53 (world) <> default/curl-pod (ID:13646) post-xlate-rev TRANSLATED (UDP)
Aug  2 11:15:28.795: default/curl-pod (ID:13646) <> 10.96.0.10:53 (world) pre-xlate-fwd TRACED (UDP)
Aug  2 11:15:28.795: default/curl-pod (ID:13646) <> kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) post-xlate-fwd TRANSLATED (UDP)
Aug  2 11:15:28.799: default/curl-pod:38693 (ID:13646) -> kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) to-endpoint FORWARDED (UDP)
Aug  2 11:15:28.800: default/curl-pod:38693 (ID:13646) <- kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) to-endpoint FORWARDED (UDP)
Aug  2 11:15:28.805: kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) <> default/curl-pod (ID:13646) pre-xlate-rev TRACED (UDP)
Aug  2 11:15:28.805: 10.96.0.10:53 (world) <> default/curl-pod (ID:13646) post-xlate-rev TRANSLATED (UDP)

Aug  2 11:15:51.609: default/curl-pod (ID:13646) <> 10.96.0.10:53 (world) pre-xlate-fwd TRACED (UDP)
Aug  2 11:15:51.609: default/curl-pod (ID:13646) <> kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) post-xlate-fwd TRANSLATED (UDP)
Aug  2 11:15:51.610: default/curl-pod:52681 (ID:13646) -> kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) to-endpoint FORWARDED (UDP)
Aug  2 11:15:51.625: default/curl-pod:52681 (ID:13646) <- kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) to-endpoint FORWARDED (UDP)
Aug  2 11:15:51.625: kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) <> default/curl-pod (ID:13646) pre-xlate-rev TRACED (UDP)
Aug  2 11:15:51.625: 10.96.0.10:53 (world) <> default/curl-pod (ID:13646) post-xlate-rev TRANSLATED (UDP)
Aug  2 11:15:51.639: default/curl-pod (ID:13646) <> 10.96.0.10:53 (world) pre-xlate-fwd TRACED (UDP)
Aug  2 11:15:51.640: default/curl-pod (ID:13646) <> kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) post-xlate-fwd TRANSLATED (UDP)
Aug  2 11:15:51.645: default/curl-pod:38214 (ID:13646) -> kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) to-endpoint FORWARDED (UDP)
Aug  2 11:15:51.646: default/curl-pod:38214 (ID:13646) <- kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) to-endpoint FORWARDED (UDP)
Aug  2 11:15:51.650: kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) <> default/curl-pod (ID:13646) pre-xlate-rev TRACED (UDP)
Aug  2 11:15:51.650: 10.96.0.10:53 (world) <> default/curl-pod (ID:13646) post-xlate-rev TRANSLATED (UDP)
Aug  2 11:15:51.683: default/curl-pod (ID:13646) <> 10.96.0.10:53 (world) pre-xlate-fwd TRACED (UDP)
Aug  2 11:15:51.683: default/curl-pod (ID:13646) <> kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) post-xlate-fwd TRANSLATED (UDP)
Aug  2 11:15:51.688: default/curl-pod:48197 (ID:13646) -> kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) to-endpoint FORWARDED (UDP)
Aug  2 11:15:51.699: default/curl-pod:48197 (ID:13646) <- kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) to-endpoint FORWARDED (UDP)
Aug  2 11:15:51.706: kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) <> default/curl-pod (ID:13646) pre-xlate-rev TRACED (UDP)
Aug  2 11:15:51.710: 10.96.0.10:53 (world) <> default/curl-pod (ID:13646) post-xlate-rev TRANSLATED (UDP)
Aug  2 11:15:51.745: default/curl-pod (ID:13646) <> 10.96.0.10:53 (world) pre-xlate-fwd TRACED (UDP)
Aug  2 11:15:51.745: default/curl-pod (ID:13646) <> kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) post-xlate-fwd TRANSLATED (UDP)
Aug  2 11:15:51.747: default/curl-pod:56039 (ID:13646) -> kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) to-endpoint FORWARDED (UDP)
Aug  2 11:15:51.751: 10.0.2.3:53 (world) <> kube-system/coredns-674b8bbfcf-j48sp (ID:8066) pre-xlate-rev TRACED (UDP)
Aug  2 11:15:51.751: 10.0.2.3:53 (world) <> kube-system/coredns-674b8bbfcf-j48sp (ID:8066) pre-xlate-rev TRACED (UDP)
Aug  2 11:15:51.754: 10.0.2.3:53 (world) <> kube-system/coredns-674b8bbfcf-j48sp (ID:8066) pre-xlate-rev TRACED (UDP)
Aug  2 11:15:51.754: kube-system/coredns-674b8bbfcf-j48sp:55313 (ID:8066) -> 10.0.2.3:53 (world) to-network FORWARDED (UDP)
Aug  2 11:15:51.757: 10.0.2.3:53 (world) <> kube-system/coredns-674b8bbfcf-j48sp (ID:8066) pre-xlate-rev TRACED (UDP)
Aug  2 11:15:51.757: 10.0.2.3:53 (world) <> kube-system/coredns-674b8bbfcf-j48sp (ID:8066) pre-xlate-rev TRACED (UDP)
Aug  2 11:15:51.777: default/curl-pod:56039 (ID:13646) <- kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) to-endpoint FORWARDED (UDP)
Aug  2 11:15:51.777: kube-system/coredns-674b8bbfcf-j48sp:55313 (ID:8066) <- 10.0.2.3:53 (world) to-endpoint FORWARDED (UDP)
Aug  2 11:15:51.803: kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) <> default/curl-pod (ID:13646) pre-xlate-rev TRACED (UDP)
Aug  2 11:15:51.803: 10.96.0.10:53 (world) <> default/curl-pod (ID:13646) post-xlate-rev TRANSLATED (UDP)
Aug  2 11:15:51.824: default/curl-pod (ID:13646) <> 10.96.0.10:53 (world) pre-xlate-fwd TRACED (UDP)
Aug  2 11:15:51.824: default/curl-pod (ID:13646) <> kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) post-xlate-fwd TRANSLATED (UDP)
Aug  2 11:15:51.824: default/curl-pod:36634 (ID:13646) -> kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) to-endpoint FORWARDED (UDP)
Aug  2 11:15:51.837: default/curl-pod:36634 (ID:13646) <- kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) to-endpoint FORWARDED (UDP)
Aug  2 11:15:51.837: kube-system/coredns-674b8bbfcf-j48sp:53 (ID:8066) <> default/curl-pod (ID:13646) pre-xlate-rev TRACED (UDP)
Aug  2 11:15:51.837: 10.96.0.10:53 (world) <> default/curl-pod (ID:13646) post-xlate-rev TRANSLATED (UDP)

# tcpdump output
(⎈|HomeLab:N/A) root@k8s-ctr:~# tcpdump -i any udp port 53 -nn
tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
# webpod
20:15:28.721334 lxcba4acff7647e In  IP 172.20.1.218.38704 > 172.20.1.107.53: 5586+ A? webpod.default.svc.cluster.local. (50)
20:15:28.736149 lxc624d897a501a In  IP 172.20.1.107.53 > 172.20.1.218.38704: 5586*- 1/0/0 A 10.96.195.112 (98)
20:15:28.796141 lxcba4acff7647e In  IP 172.20.1.218.38693 > 172.20.1.107.53: 50432+ AAAA? webpod.default.svc.cluster.local. (50)
20:15:28.799300 lxc624d897a501a In  IP 172.20.1.107.53 > 172.20.1.218.38693: 50432*- 0/1/0 (143)
# google.com
20:15:51.610506 lxcba4acff7647e In  IP 172.20.1.218.52681 > 172.20.1.107.53: 2613+ A? google.com.default.svc.cluster.local. (54)
20:15:51.612065 lxc624d897a501a In  IP 172.20.1.107.53 > 172.20.1.218.52681: 2613 NXDomain*- 0/1/0 (147)
20:15:51.640965 lxcba4acff7647e In  IP 172.20.1.218.38214 > 172.20.1.107.53: 31588+ A? google.com.svc.cluster.local. (46)
20:15:51.645524 lxc624d897a501a In  IP 172.20.1.107.53 > 172.20.1.218.38214: 31588 NXDomain*- 0/1/0 (139)
20:15:51.687149 lxcba4acff7647e In  IP 172.20.1.218.48197 > 172.20.1.107.53: 4129+ A? google.com.cluster.local. (42)
20:15:51.692518 lxc624d897a501a In  IP 172.20.1.107.53 > 172.20.1.218.48197: 4129 NXDomain*- 0/1/0 (135)
20:15:51.745120 lxcba4acff7647e In  IP 172.20.1.218.56039 > 172.20.1.107.53: 7882+ A? google.com. (28)
20:15:51.753722 lxc624d897a501a In  IP 172.20.1.107.55313 > 10.0.2.3.53: 13520+ A? google.com. (28)
20:15:51.754903 eth0  Out IP 10.0.2.15.55313 > 10.0.2.3.53: 13520+ A? google.com. (28)
20:15:51.774891 eth0  In  IP 10.0.2.3.53 > 10.0.2.15.55313: 13520 1/0/0 A 142.250.196.142 (44)
20:15:51.775751 lxc624d897a501a In  IP 172.20.1.107.53 > 172.20.1.218.56039: 7882 1/0/0 A 142.250.196.142 (54)
20:15:51.824473 lxcba4acff7647e In  IP 172.20.1.218.36634 > 172.20.1.107.53: 19787+ AAAA? google.com. (28)
20:15:51.825237 lxc624d897a501a In  IP 172.20.1.107.55313 > 10.0.2.3.53: 10205+ AAAA? google.com. (28)
20:15:51.825271 eth0  Out IP 10.0.2.15.55313 > 10.0.2.3.53: 10205+ AAAA? google.com. (28)
20:15:51.834639 eth0  In  IP 10.0.2.3.53 > 10.0.2.15.55313: 10205 1/0/0 AAAA 2404:6800:4004:818::200e (56)
20:15:51.835809 lxc624d897a501a In  IP 172.20.1.107.53 > 172.20.1.218.36634: 19787 1/0/0 AAAA 2404:6800:4004:818::200e (66)


# If CoreDNS runs the prometheus plugin, cache metrics are exposed on the metrics port (:9153).
## coredns_cache_entries          current number of cached entries; type is success or denial (positive answers vs NXDOMAIN etc.)
## coredns_cache_hits_total       total cache hits
## coredns_cache_misses_total     total cache misses
## coredns_cache_requests_total   total cache lookups

kubectl exec -it curl-pod -- curl kube-dns.kube-system.svc:9153/metrics | grep coredns_cache_ | grep -v ^#

(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl exec -it curl-pod -- curl kube-dns.kube-system.svc:9153/metrics | grep coredns_cache_ | grep -v ^#
coredns_cache_entries{server="dns://:53",type="denial",view="",zones="."} 1
coredns_cache_entries{server="dns://:53",type="success",view="",zones="."} 2
coredns_cache_misses_total{server="dns://:53",view="",zones="."} 44
coredns_cache_requests_total{server="dns://:53",view="",zones="."} 44
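If you want a single hit-ratio number, the hits and requests counters can be combined; a small sketch (note that coredns_cache_hits_total only appears once at least one hit has occurred, which is why it's absent above):

# rough cache hit ratio from the prometheus metrics (hits / requests)
kubectl exec -it curl-pod -- curl -s kube-dns.kube-system.svc:9153/metrics \
  | awk '/^coredns_cache_hits_total/ {h+=$2} /^coredns_cache_requests_total/ {r+=$2} END {if (r) printf "hit ratio: %.1f%%\n", 100*h/r; else print "no requests yet"}'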

 

Through this exercise, we've seen how pod DNS queries behave and how to monitor them.

 

2. Using NodeLocalDNS

NodeLocalDNS addresses the case where heavy DNS query volume degrades performance: an agent that caches DNS query results runs on each node as a DaemonSet, providing a local DNS caching service.

 

Looking at the pod DNS query flow in the figure below, the client pod queries the local DNS cache first, and only on a cache miss does the query go on to coredns. This reduces the load on coredns, and even when coredns sits on another node, it cuts the communication latency.

Source: https://kubernetes.io/docs/tasks/administer-cluster/nodelocaldns/

 

Let's deploy NodeLocalDNS and walk through it.

# Snapshot the current iptables rules
iptables-save | tee before.txt

# Download nodelocaldns.yaml
wget https://github.com/kubernetes/kubernetes/raw/master/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml

# Set kubedns to the ClusterIP of the kube-dns Service
kubedns=`kubectl get svc kube-dns -n kube-system -o jsonpath={.spec.clusterIP}`
domain='cluster.local'    ## default value
localdns='169.254.20.10'  ## default value
echo $kubedns $domain $localdns

# kube-proxy runs in iptables mode, so substitute the placeholders as follows
sed -i "s/__PILLAR__LOCAL__DNS__/$localdns/g; s/__PILLAR__DNS__DOMAIN__/$domain/g; s/__PILLAR__DNS__SERVER__/$kubedns/g" nodelocaldns.yaml
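As a sanity check on the substitution (my note, based on the upstream NodeLocalDNS docs): the only placeholders left should be the two that node-local-dns itself fills in at startup.

# __PILLAR__CLUSTER__DNS__ and __PILLAR__UPSTREAM__SERVERS__ may remain; the
# node-cache binary populates them at runtime, so nothing else should match here
grep __PILLAR__ nodelocaldns.yaml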

# Install nodelocaldns
kubectl apply -f nodelocaldns.yaml

(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl apply -f nodelocaldns.yaml
serviceaccount/node-local-dns created
service/kube-dns-upstream created
configmap/node-local-dns created
daemonset.apps/node-local-dns created
service/node-local-dns created

# Verify the installation
kubectl get pod -n kube-system -l k8s-app=node-local-dns -owide

(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl get pod -n kube-system -l k8s-app=node-local-dns -owide
NAME                   READY   STATUS    RESTARTS   AGE   IP               NODE      NOMINATED NODE   READINESS GATES
node-local-dns-56cxv   1/1     Running   0          29s   192.168.10.100   k8s-ctr   <none>           <none>
node-local-dns-bgqpq   1/1     Running   0          29s   192.168.10.101   k8s-w1    <none>           <none>

# Enable logging on node-local-dns and verify the config
kubectl edit cm -n kube-system node-local-dns # add log and debug under 'cluster.local' and '.:53'
kubectl -n kube-system rollout restart ds node-local-dns
kubectl describe cm -n kube-system node-local-dns

(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl edit cm -n kube-system node-local-dns
configmap/node-local-dns edited
(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl -n kube-system rollout restart ds node-local-dns
daemonset.apps/node-local-dns restarted
(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl describe cm -n kube-system node-local-dns
Name:         node-local-dns
Namespace:    kube-system
Labels:       addonmanager.kubernetes.io/mode=Reconcile
Annotations:  <none>

Data
====
Corefile:
----
cluster.local:53 {
    log
    debug
    errors
    cache {
            success 9984 30
            denial 9984 5
    }
    reload
    loop
    bind 169.254.20.10 10.96.0.10 # listen on both the link-local cache IP and the kube-dns ClusterIP
    forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
    }
    prometheus :9253
    health 169.254.20.10:8080
    }
in-addr.arpa:53 {
    errors
    cache 30
    reload
    loop
    bind 169.254.20.10 10.96.0.10
    forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
    }
    prometheus :9253
    }
ip6.arpa:53 {
    errors
    cache 30
    reload
    loop
    bind 169.254.20.10 10.96.0.10
    forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
    }
    prometheus :9253
    }
.:53 {
    errors
    cache 30
    reload
    loop
    bind 169.254.20.10 10.96.0.10
    forward . __PILLAR__UPSTREAM__SERVERS__
    prometheus :9253
    }



BinaryData
====

Events:  <none>
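In addition to the bind addresses above, node-local-dns in this (default) mode sets up a dummy interface on each node that carries both IPs; nodelocaldns is the upstream default interface name (my note, not shown in the original output):

# the cache listens via a dummy interface holding 169.254.20.10 and 10.96.0.10
ip addr show nodelocaldns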


# Check iptables again: it takes a little while for the rules to be updated
iptables-save | tee after.txt
diff before.txt after.txt

## filter table
iptables -t filter -S | grep -i dns
-A INPUT -d 10.96.0.10/32 -p udp -m udp --dport 53 -m comment --comment "NodeLocal DNS Cache: allow DNS traffic" -j ACCEPT
-A INPUT -d 10.96.0.10/32 -p tcp -m tcp --dport 53 -m comment --comment "NodeLocal DNS Cache: allow DNS traffic" -j ACCEPT
-A INPUT -d 169.254.20.10/32 -p udp -m udp --dport 53 -m comment --comment "NodeLocal DNS Cache: allow DNS traffic" -j ACCEPT
-A INPUT -d 169.254.20.10/32 -p tcp -m tcp --dport 53 -m comment --comment "NodeLocal DNS Cache: allow DNS traffic" -j ACCEPT
-A OUTPUT -s 10.96.0.10/32 -p udp -m udp --sport 53 -m comment --comment "NodeLocal DNS Cache: allow DNS traffic" -j ACCEPT
-A OUTPUT -s 10.96.0.10/32 -p tcp -m tcp --sport 53 -m comment --comment "NodeLocal DNS Cache: allow DNS traffic" -j ACCEPT
-A OUTPUT -s 169.254.20.10/32 -p udp -m udp --sport 53 -m comment --comment "NodeLocal DNS Cache: allow DNS traffic" -j ACCEPT
-A OUTPUT -s 169.254.20.10/32 -p tcp -m tcp --sport 53 -m comment --comment "NodeLocal DNS Cache: allow DNS traffic" -j ACCEPT

## raw table
iptables -t raw -S | grep -i dns
-A PREROUTING -d 10.96.0.10/32 -p udp -m udp --dport 53 -m comment --comment "NodeLocal DNS Cache: skip conntrack" -j NOTRACK
-A PREROUTING -d 10.96.0.10/32 -p tcp -m tcp --dport 53 -m comment --comment "NodeLocal DNS Cache: skip conntrack" -j NOTRACK
-A PREROUTING -d 169.254.20.10/32 -p udp -m udp --dport 53 -m comment --comment "NodeLocal DNS Cache: skip conntrack" -j NOTRACK
-A PREROUTING -d 169.254.20.10/32 -p tcp -m tcp --dport 53 -m comment --comment "NodeLocal DNS Cache: skip conntrack" -j NOTRACK
-A OUTPUT -s 10.96.0.10/32 -p tcp -m tcp --sport 8080 -m comment --comment "NodeLocal DNS Cache: skip conntrack" -j NOTRACK
-A OUTPUT -d 10.96.0.10/32 -p tcp -m tcp --dport 8080 -m comment --comment "NodeLocal DNS Cache: skip conntrack" -j NOTRACK
...

 

However, even after deploying nodelocaldns, DNS queries still turn out not to use it.

# Check whether the DNS query path has changed
kubectl exec -it curl-pod -- nslookup webpod
kubectl exec -it curl-pod -- nslookup google.com

# Logs: entries still appear only on the coredns side (i.e., nodelocaldns is not handling queries)
kubectl -n kube-system logs -l k8s-app=kube-dns -f
kubectl -n kube-system logs -l k8s-app=node-local-dns -f

# Even restarting the pod does not actually change /etc/resolv.conf
kubectl exec -it curl-pod -- cat /etc/resolv.conf

 

The reason is tied to the iptables rules we saw above: when nodelocaldns is deployed, it rewrites iptables so that calls headed for the kube-dns ClusterIP are intercepted and answered by nodelocaldns on the node.

In a Cilium environment, however, traffic does not follow what those iptables rules define: Cilium's kube-proxy replacement translates the Service address in eBPF, so packets destined for 10.96.0.10 never match the intercept rules.
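A quick way to see this (my addition, not from the original lab steps) is to confirm that kube-proxy replacement is active:

# with kube-proxy replacement on, 10.96.0.10 is rewritten by eBPF, so the
# NodeLocal DNS intercept rules in iptables never see matching packets
kubectl -n kube-system exec ds/cilium -c cilium-agent -- cilium-dbg status | grep KubeProxyReplacement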

 

 

3. Cilium's Local Redirect Policy

The option you can use in this situation is Cilium's Local Redirect Policy.

You enable it by installing Cilium with --set localRedirectPolicy=true and then defining a CiliumLocalRedirectPolicy CRD. This configures Cilium's local redirect policy, which uses eBPF to redirect pod traffic destined for an IP address and port/protocol tuple, or for a Kubernetes Service, to backend pods local to the node.

 

Let's continue the hands-on below.

# Apply the option
helm upgrade cilium cilium/cilium --namespace kube-system --version 1.17.6 --reuse-values \
  --set localRedirectPolicy=true
kubectl rollout restart deploy cilium-operator -n kube-system
kubectl rollout restart ds cilium -n kube-system

(⎈|HomeLab:N/A) root@k8s-ctr:~# helm upgrade cilium cilium/cilium --namespace kube-system --version 1.17.6 --reuse-values \
  --set localRedirectPolicy=true
Release "cilium" has been upgraded. Happy Helming!
NAME: cilium
LAST DEPLOYED: Sat Aug  2 20:48:04 2025
NAMESPACE: kube-system
STATUS: deployed
REVISION: 4
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble Relay and Hubble UI.

Your release version is 1.17.6.

For any further help, visit https://docs.cilium.io/en/v1.17/gettinghelp
(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl rollout restart deploy cilium-operator -n kube-system
deployment.apps/cilium-operator restarted
(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl rollout restart ds cilium -n kube-system
daemonset.apps/cilium restarted


# Download the node-local-dns manifest for use with local redirect
wget https://raw.githubusercontent.com/cilium/cilium/1.17.6/examples/kubernetes-local-redirect/node-local-dns.yaml

# Substitute the kube-dns ClusterIP
kubedns=$(kubectl get svc kube-dns -n kube-system -o jsonpath={.spec.clusterIP})
sed -i "s/__PILLAR__DNS__SERVER__/$kubedns/g;" node-local-dns.yaml

# A diff shows that some settings differ
vi -d nodelocaldns.yaml node-local-dns.yaml

## before
args: [ "-localip", "169.254.20.10,10.96.0.10", "-conf", "/etc/Corefile", "-upstreamsvc", "kube-dns-upstream" ]

## after
args: [ "-localip", "169.254.20.10,10.96.0.10", "-conf", "/etc/Corefile", "-upstreamsvc", "kube-dns-upstream", "-skipteardown=true", "-setupinterface=false", "-setupiptables=false" ]

# Deploy. Compared to the upstream manifest, the local-redirect variant adds:
## -skipteardown=true, -setupinterface=false, and -setupiptables=false,
## hostNetwork: false on the daemonset, so the pods run outside the host network namespace,
## and bind 0.0.0.0 in the Corefile instead of the static IPs.
kubectl apply -f node-local-dns.yaml

(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl apply -f node-local-dns.yaml
serviceaccount/node-local-dns configured
service/kube-dns-upstream configured
configmap/node-local-dns configured
daemonset.apps/node-local-dns configured

# Add the logging settings again
kubectl edit cm -n kube-system node-local-dns # add log and debug
kubectl -n kube-system rollout restart ds node-local-dns

(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl edit cm -n kube-system node-local-dns
configmap/node-local-dns edited
(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl -n kube-system rollout restart ds node-local-dns
daemonset.apps/node-local-dns restarted

kubectl describe cm -n kube-system node-local-dns

(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl describe cm -n kube-system node-local-dns
Name:         node-local-dns
Namespace:    kube-system
Labels:       <none>
Annotations:  <none>

Data
====
Corefile:
----
cluster.local:53 {
    log
    debug
    errors
    cache {
            success 9984 30
            denial 9984 5
    }
    reload
    loop
    bind 0.0.0.0
    forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
    }
    prometheus :9253
    health
    }
in-addr.arpa:53 {
    errors
    cache 30
    reload
    loop
    bind 0.0.0.0
    forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
    }
    prometheus :9253
    }
ip6.arpa:53 {
    errors
    cache 30
    reload
    loop
    bind 0.0.0.0
    forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
    }
    prometheus :9253
    }
.:53 {
    errors
    cache 30
    reload
    loop
    bind 0.0.0.0
    forward . __PILLAR__UPSTREAM__SERVERS__
    prometheus :9253
    }



BinaryData
====

Events:  <none>


# Inspect the CiliumLocalRedirectPolicy manifest
wget https://raw.githubusercontent.com/cilium/cilium/1.17.6/examples/kubernetes-local-redirect/node-local-dns-lrp.yaml
cat node-local-dns-lrp.yaml | yq
apiVersion: "cilium.io/v2"
kind: CiliumLocalRedirectPolicy
metadata:
  name: "nodelocaldns"
  namespace: kube-system
spec:
  redirectFrontend:
    serviceMatcher:
      serviceName: kube-dns
      namespace: kube-system
  redirectBackend: # the redirect backend is pointed at node-local-dns!
    localEndpointSelector:
      matchLabels:
        k8s-app: node-local-dns
    toPorts:
      - port: "53"
        name: dns
        protocol: UDP
      - port: "53"
        name: dns-tcp
        protocol: TCP

kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/1.17.6/examples/kubernetes-local-redirect/node-local-dns-lrp.yaml

# Verify it was created
kubectl get CiliumLocalRedirectPolicy -A

(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl get CiliumLocalRedirectPolicy -A
NAMESPACE     NAME           AGE
kube-system   nodelocaldns   5s

# Check the local redirect configuration in cilium
kubectl exec -it -n kube-system ds/cilium -c cilium-agent -- cilium-dbg lrp list

(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl exec -it -n kube-system ds/cilium -c cilium-agent -- cilium-dbg lrp list
LRP namespace   LRP name       FrontendType                Matching Service
kube-system     nodelocaldns   clusterIP + all svc ports   kube-system/kube-dns
                |              10.96.0.10:9153/TCP ->
                |              10.96.0.10:53/UDP -> 172.20.0.188:53(kube-system/node-local-dns-zcg6v),
                |              10.96.0.10:53/TCP -> 172.20.0.188:53(kube-system/node-local-dns-zcg6v),

kubectl exec -it -n kube-system ds/cilium -c cilium-agent -- cilium-dbg service list | grep LocalRedirect

(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl exec -it -n kube-system ds/cilium -c cilium-agent -- cilium-dbg service list | grep LocalRedirect
16   10.96.0.10:53/TCP          LocalRedirect   1 => 172.20.0.188:53/TCP (active)
18   10.96.0.10:53/UDP          LocalRedirect   1 => 172.20.0.188:53/UDP (active)

# Calls to coredns are redirected to the node-local-dns pod on each node
(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl get po -n kube-system -owide -l k8s-app=node-local-dns
NAME                   READY   STATUS    RESTARTS   AGE   IP             NODE      NOMINATED NODE   READINESS GATES
node-local-dns-chsl8   1/1     Running   0          20m   172.20.1.246   k8s-ctr   <none>           <none>
node-local-dns-zcg6v   1/1     Running   0          20m   172.20.0.188   k8s-w1    <none>           <none>

# Re-check the logs
kubectl -n kube-system logs -l k8s-app=kube-dns -f
kubectl -n kube-system logs -l k8s-app=node-local-dns -f

# Test a lookup
kubectl exec -it curl-pod -- nslookup www.google.com

# coredns logs
(⎈|HomeLab:N/A) root@k8s-ctr:~# kubectl -n kube-system logs -l k8s-app=kube-dns -f
[INFO] 172.20.1.246:36924 - 64493 "A IN www.google.com.default.svc.cluster.local. tcp 58 false 65535" NXDOMAIN qr,aa,rd 151 0.011807205s
[INFO] 172.20.1.246:36924 - 34068 "A IN www.google.com.svc.cluster.local. tcp 50 false 65535" NXDOMAIN qr,aa,rd 143 0.003313177s
[INFO] 172.20.1.246:36924 - 6838 "A IN www.google.com.cluster.local. tcp 46 false 65535" NXDOMAIN qr,aa,rd 139 0.000671685s
# nothing more after this point

# nodelocaldns logs
kubectl -n kube-system logs -l k8s-app=node-local-dns -f
# first lookup: cache misses are forwarded upstream to coredns (hence the entries in the coredns log above)
[INFO] 172.20.1.218:56602 - 64493 "A IN www.google.com.default.svc.cluster.local. udp 58 false 512" NXDOMAIN qr,aa,rd 151 0.055108912s
[INFO] 172.20.1.218:44023 - 34068 "A IN www.google.com.svc.cluster.local. udp 50 false 512" NXDOMAIN qr,aa,rd 143 0.014067275s
[INFO] 172.20.1.218:33608 - 6838 "A IN www.google.com.cluster.local. udp 46 false 512" NXDOMAIN qr,aa,rd 139 0.022431702s
# a repeat lookup is answered by nodelocaldns itself (note the much shorter response times)
[INFO] 172.20.1.218:40290 - 20639 "A IN www.google.com.default.svc.cluster.local. udp 58 false 512" NXDOMAIN qr,aa,rd 151 0.001162012s
[INFO] 172.20.1.218:57376 - 55001 "A IN www.google.com.svc.cluster.local. udp 50 false 512" NXDOMAIN qr,aa,rd 143 0.000293324s
[INFO] 172.20.1.218:60696 - 6755 "A IN www.google.com.cluster.local. udp 46 false 512" NXDOMAIN qr,aa,rd 139 0.000283324s

 

Hubble UI also shows that requests now go to node-local-dns instead of coredns.

 

This example is one of the use cases from the documentation below; more generally, the policy can be used as a way to swap out, via configuration, the backend pods that receive traffic bound for a particular service.

Reference: https://docs.cilium.io/en/stable/network/kubernetes/local-redirect-policy/
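As a further illustration, the docs above also describe an addressMatcher frontend instead of a Service. The sketch below is hypothetical (the IP, name, and labels are illustrative, following the schema in those docs) and redirects traffic bound for a fixed IP:port to pods on the same node:

# hypothetical addressMatcher-based policy; adjust the IP, ports, and labels to your use case
cat <<'EOF' | kubectl apply -f -
apiVersion: cilium.io/v2
kind: CiliumLocalRedirectPolicy
metadata:
  name: redirect-fixed-ip   # illustrative name
  namespace: default
spec:
  redirectFrontend:
    addressMatcher:         # match an IP:port tuple rather than a Service
      ip: "169.254.169.254" # example destination IP
      toPorts:
        - port: "8080"
          protocol: TCP
  redirectBackend:
    localEndpointSelector:
      matchLabels:
        app: local-proxy    # illustrative backend label
    toPorts:
      - port: "8080"
        protocol: TCP
EOF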

 

 

Wrapping up

In this post, we looked at how coredns handles pod DNS queries, along with the background on NodeLocalDNS.

In a stock Cilium environment, NodeLocalDNS may not work as-is; in that case, you can run NodeLocalDNS on top of Cilium's Local Redirect Policy.

 

In the next post, we'll look at another topic in pod networking with Cilium.