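Overview

Goal: keep Pods from concentrating on a single node or a single zone, which lowers availability. Topology spread constraints and anti-affinity balance replicas across failure domains and improve disaster tolerance.

Applies to: high-availability deployments of stateful or stateless services on multi-zone, multi-node clusters.

Core Concepts and Practice

TopologySpreadConstraints example:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web
      namespace: prod
    spec:
      replicas: 6
      selector:
        matchLabels: { app: web }
      template:
        metadata:
          labels: { app: web }
        spec:
          topologySpreadConstraints:
            # Keep the per-zone Pod counts within 1 of each other.
            - maxSkew: 1
              topologyKey: topology.kubernetes.io/zone
              whenUnsatisfiable: DoNotSchedule
              labelSelector:
                matchLabels: { app: web }
            # Keep the per-node Pod counts within 1 of each other.
            - maxSkew: 1
              topologyKey: kubernetes.io/hostname
              whenUnsatisfiable: DoNotSchedule
              labelSelector:
                matchLabels: { app: web }
          containers:
            - name: web
              image: repo/web:1.0.0

Anti-affinity rule (avoids piling Pods onto the same node). This fragment belongs under the Pod template's `spec`:

    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
                - key: app
                  operator: In
                  values: [web]
            topologyKey: kubernetes.io/hostname

Because this rule is required and keyed on `kubernetes.io/hostname`, each node can hold at most one `app=web` Pod, so the cluster needs at least as many schedulable nodes as replicas.

Check node zone labels:

    # findstr is Windows-only; on Linux/macOS pipe to grep instead.
    kubectl get nodes --show-labels | findstr topology.kubernetes.io/zone

Verify the distribution:

    kubectl -n prod get pods -l app=web -o wide

Verification and Monitoring

Balance: check whether the Pod counts in each zone and on each node are close to one another; adjust `maxSkew` if they are not.

Failed scheduling: when a constraint cannot be satisfied, `DoNotSchedule` leaves the Pod Pending. `ScheduleAnyway` turns the constraint into a soft preference, so the Pod is still scheduled; use it with caution, because the distribution can then become uneven.

Rolling updates and scaling: confirm that the distribution stays balanced during rollouts and scale-up or scale-down, and monitor scheduling events.

Common Pitfalls

Nodes have no zone label, so the zone constraint has no effect. Make sure the cloud provider sets `topology.kubernetes.io/zone`, or set it manually.

Constraints are too strict and scheduling fails. Relax them appropriately or add capacity.

Constraints conflict with other policies such as PDBs or resource requests. Evaluate the scheduling policies together rather than in isolation.

Conclusion

Topology spread and anti-affinity constraints can significantly improve service availability in multi-domain environments, reduce single-point and same-domain risk, and keep the distribution stable during scaling and rolling updates.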
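The balance check described under Verification and Monitoring can be sketched in a few lines of Python. This is a simplified illustration, not the scheduler's actual implementation: it assumes you have already collected the zone (or node) of each `app=web` Pod, for example by reading the NODE column of `kubectl get pods -o wide` and looking up each node's zone label. The function names `domain_counts`, `skew`, and `can_place` are hypothetical. The placement rule used here is the documented one for `DoNotSchedule`: a Pod may land in a domain only if that domain's count plus one, minus the global minimum count, does not exceed `maxSkew`. It ignores node affinity, taints, and other filters that the real scheduler also applies.

```python
from collections import Counter


def domain_counts(pod_domains, all_domains):
    """Count matching Pods per topology domain, keeping empty domains at 0.

    pod_domains: one entry per Pod, naming the zone or node it runs in.
    all_domains: every eligible domain (assumed non-empty), so that a
                 domain with no Pods still counts toward the minimum.
    """
    counts = Counter(pod_domains)
    return {domain: counts.get(domain, 0) for domain in all_domains}


def skew(counts):
    """Current skew: largest per-domain count minus smallest."""
    return max(counts.values()) - min(counts.values())


def can_place(counts, domain, max_skew):
    """Return True if one more Pod in `domain` keeps skew within max_skew."""
    global_min = min(counts.values())
    return counts[domain] + 1 - global_min <= max_skew


if __name__ == "__main__":
    # Hypothetical cluster: three zones, five Pods already running.
    zones = ["zone-a", "zone-b", "zone-c"]
    pods = ["zone-a", "zone-a", "zone-b", "zone-b", "zone-c"]
    counts = domain_counts(pods, zones)
    print("counts:", counts)
    print("skew:", skew(counts))
    for zone in zones:
        print(zone, "accepts a 6th Pod with maxSkew=1:",
              can_place(counts, zone, 1))
```

With these inputs only `zone-c` can take the sixth replica under `maxSkew: 1`, which is why a full or unavailable zone leaves the Pod Pending under `DoNotSchedule`.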
