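SD-WAN Troubleshooting

SD-WAN Logs

Health Checks

  1. Effect of health checks on routing:

    • The health check detects an outage, and the static route on the corresponding interface is removed.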

      date=2024-01-20 time=17:06:31 eventtime=1618963591590008160 tz="-0700" logid="0100022921" type="event" subtype="system" level="critical" vd="root" logdesc="Routing information changed" name="test" interface="R150" status="down" msg="Static route on interface R150 may be removed by health-check test. Route:  (10.100.1.2->10.100.2.22 ping-down)"
      
    • The health check detects recovery, and the static route on the corresponding interface is restored.

      date=2024-01-20 time=17:11:46 eventtime=1618963906950174240 tz="-0700" logid="0100022921" type="event" subtype="system" level="critical" vd="root" logdesc="Routing information changed" name="test" interface="R150" status="up" msg="Static route on interface R150 may be added by health-check test. Route:  (10.100.1.2->10.100.2.22 ping-up)"
      
  2. Status changes of an SD-WAN member in a health check:

    • The health check fails (dead), and the member stops forwarding traffic.

      date=2024-01-20 time=23:04:32 eventtime=1618985072898756700 tz="-0700" logid="0113022923" type="event" subtype="sdwan" level="notice" vd="root" logdesc="SDWAN status" eventtype="Service" interface="R150" member="1" serviceid=1 service="test" gateway=10.100.1.1 msg="Member link is unreachable or miss threshold. Stop forwarding traffic. "
      
    • The health check recovers from failed (dead) to alive, and the member resumes forwarding traffic.

      date=2024-01-20 time=23:06:08 eventtime=1618985168018789600 tz="-0700" logid="0113022923" type="event" subtype="sdwan" level="notice" vd="root" logdesc="SDWAN status" eventtype="Service" interface="R150" member="1" serviceid=1 service="test" gateway=10.100.1.1 msg="Member link is available. Start forwarding traffic. "
      
  3. SLA targets in a health check:

    • The member no longer meets the SLA target.

      date=2024-01-20 time=21:32:33 eventtime=1618979553388763760 tz="-0700" logid="0113022923" type="event" subtype="sdwan" level="notice" vd="root" logdesc="SDWAN status" eventtype="Health Check" healthcheck="test" slatargetid=1 oldvalue="2" newvalue="1" msg="Number of pass member changed."
      date=2024-01-20 time=21:32:33 eventtime=1618979553388751880 tz="-0700" logid="0113022923" type="event" subtype="sdwan" level="notice" vd="root" logdesc="SDWAN status" eventtype="Health Check" healthcheck="test" slatargetid=1 member="1" msg="Member status changed. Member out-of-sla."
      
    • The member recovers from failing the SLA target to meeting it.

      date=2024-01-20 time=21:38:49 eventtime=1618979929908765200 tz="-0700" logid="0113022923" type="event" subtype="sdwan" level="notice" vd="root" logdesc="SDWAN status" eventtype="Health Check" healthcheck="test" slatargetid=1 oldvalue="1" newvalue="2" msg="Number of pass member changed."
      date=2024-01-20 time=21:38:49 eventtime=1618979929908754060 tz="-0700" logid="0113022923" type="event" subtype="sdwan" level="information" vd="root" logdesc="SDWAN status" eventtype="Health Check" healthcheck="test" slatargetid=1 member="1" msg="Member status changed. Member in sla."
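
The event logs above are flat key=value records, so they can be filtered by field once parsed. The following is a minimal Python sketch, not a Fortinet tool: the function name parse_fortigate_log is hypothetical, and it assumes that values containing spaces are always wrapped in double quotes, as in the samples shown here.

```python
import shlex


def parse_fortigate_log(line: str) -> dict:
    """Split a key=value log line into a dict; quoted values keep their spaces."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore any token that is not in key=value form
            fields[key] = value
    return fields


# Shortened copy of one of the sample log lines above.
sample = (
    'date=2024-01-20 time=21:32:33 logid="0113022923" type="event" '
    'subtype="sdwan" level="notice" eventtype="Health Check" healthcheck="test" '
    'slatargetid=1 member="1" msg="Member status changed. Member out-of-sla."'
)

event = parse_fortigate_log(sample)
print(event["healthcheck"], event["member"], event["msg"])
```

With the records in dict form, all out-of-sla events for one health check can be selected by testing event["eventtype"] and event["msg"].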
      

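Member Forwarding State

  1. Bandwidth usage of a link:

    • Usage has reached the value configured for the member; the member enters conservative status and stops taking new traffic.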

      date=2024-01-20 time=21:55:14 eventtime=1618980914728863220 tz="-0700" logid="0113022924" type="event" subtype="sdwan" level="notice" vd="root" logdesc="SDWAN volume status" eventtype="Volume" interface="R160" member="2" msg="Member enters into conservative status with limited ablity to receive new sessions for too much traffic."
      
    • Usage has dropped back below the configured value, and the member resumes forwarding traffic.

      date=2024-01-20 time=22:12:52 eventtime=1618981972698753360 tz="-0700" logid="0113022924" type="event" subtype="sdwan" level="notice" vd="root" logdesc="SDWAN volume status" eventtype="Volume" interface="R160" member="2" msg="Member resume normal status to receive new sessions for internal adjustment"
      
  2. With an SLA-type SD-WAN rule (Lowest Cost/Maximize Bandwidth), the forwarding member order changes because an SLA check failed.

    date=2024-01-20 time=22:40:46 eventtime=1618983646428803040 tz="-0700" logid="0113022923" type="event" subtype="sdwan" level="notice" vd="root" logdesc="SDWAN status" eventtype="Service" serviceid=1 service="test" seq="2,1" msg="Service prioritized by SLA will be redirected in sequence order."
    
  3. With a Lowest Cost SD-WAN rule, the forwarding member order changes because the SLA check recovered from failed to passed.

    date=2024-01-20 time=22:41:51 eventtime=1618983711678827920 tz="-0700" logid="0113022923" type="event" subtype="sdwan" level="notice" vd="root" logdesc="SDWAN status" eventtype="Service" serviceid=1 service="test" seq="1,2" msg="Service prioritized by SLA will be redirected in sequence order."
    
  4. With a Best Quality SD-WAN rule, the forwarding member order changes.

    date=2024-01-20 time=22:56:55 eventtime=1618984615708804760 tz="-0700" logid="0113022923" type="event" subtype="sdwan" level="notice" vd="root" logdesc="SDWAN status" eventtype="Service" serviceid=1 service="test" metric="packet-loss" seq="2,1" msg="Service prioritized by performance metric will be redirected in sequence order."
    
    date=2024-01-20 time=22:56:58 eventtime=1618984618278852140 tz="-0700" logid="0113022923" type="event" subtype="sdwan" level="notice" vd="root" logdesc="SDWAN status" eventtype="Service" serviceid=1 service="test" metric="packet-loss" seq="1,2" msg="Service prioritized by performance metric will be redirected in sequence order."
    
  5. With a Maximize Bandwidth SD-WAN rule:

    • A forwarding member fails the SLA criteria and stops forwarding traffic.

      date=2024-01-20 time=23:10:24 eventtime=1618985425048820800 tz="-0700" logid="0113022923" type="event" subtype="sdwan" level="notice" vd="root" logdesc="SDWAN status" eventtype="Service" serviceid=1 service="test" member="2(R160)" msg="Service will be load balanced among members with available routing."
      
    • The forwarding member recovers from failing the SLA criteria to meeting them and can forward traffic again.

      date=2024-01-20 time=23:11:34 eventtime=1618985494478807100 tz="-0700" logid="0113022923" type="event" subtype="sdwan" level="notice" vd="root" logdesc="SDWAN status" eventtype="Service" serviceid=1 service="test" member="2(R160),1(R150)" msg="Service will be load balanced among members with available routing."
      
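  6. Periodic SLA logging is enabled in the health check configuration:

    • With sla-fail-log-period enabled in the health check configuration, the health check periodically generates SLA failure logs.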

      date=2024-01-20 time=23:18:10 eventtime=1618985890469018260 tz="-0700" logid="0113022925" type="event" subtype="sdwan" level="notice" vd="root" logdesc="SDWAN SLA information" eventtype="SLA" healthcheck="test" slatargetid=1 interface="R150" status="up" latency="0.061" jitter="0.004" packetloss="2.000%" inbandwidthavailable="0kbps" outbandwidthavailable="200.00Mbps" bibandwidthavailable="200.00Mbps" inbandwidthused="1kbps" outbandwidthused="1kbps" bibandwidthused="2kbps" slamap="0x0" metric="packetloss" msg="Health Check SLA status. SLA failed due to being over the performance metric threshold."
      
    • With sla-pass-log-period enabled in the health check configuration, the health check periodically generates SLA pass logs.

      date=2024-01-20 time=23:18:12 eventtime=1618985892509027220 tz="-0700" logid="0113022925" type="event" subtype="sdwan" level="information" vd="root" logdesc="SDWAN SLA information" eventtype="SLA" healthcheck="test" slatargetid=1 interface="R150" status="up" latency="0.060" jitter="0.003" packetloss="0.000%" inbandwidthavailable="0kbps" outbandwidthavailable="200.00Mbps" bibandwidthavailable="200.00Mbps" inbandwidthused="1kbps" outbandwidthused="1kbps" bibandwidthused="2kbps" slamap="0x1" msg="Health Check SLA status."
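
In the two periodic logs above, slamap="0x0" accompanies the SLA failure and slamap="0x1" accompanies the pass for slatargetid=1. This suggests that slamap is a bitmask in which bit n-1 is set when SLA target n is met. The Python sketch below decodes the field under that assumption; the bit layout is inferred from the samples, and the function name sla_targets_met is hypothetical.

```python
def sla_targets_met(sla_map: str) -> list:
    """Return the SLA target IDs whose bit is set in a hex sla_map string.

    Assumption: bit 0 corresponds to SLA target 1, bit 1 to target 2, and so on.
    """
    mask = int(sla_map, 16)  # accepts strings such as "0x1"
    return [bit + 1 for bit in range(mask.bit_length()) if (mask >> bit) & 1]


print(sla_targets_met("0x0"))  # no target met
print(sla_targets_met("0x1"))  # target 1 met
```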
      

SD-WAN Diagnostic Commands

SD-WAN Health Checks

View the status of SD-WAN health checks.

FGT # diagnose sys sdwan health-check
Health Check(server):
Seq(1 R150): state(alive), packet-loss(0.000%) latency(0.110), jitter(0.024) sla_map=0x0
Seq(2 R160): state(alive), packet-loss(0.000%) latency(0.068), jitter(0.009) sla_map=0x0

Health Check(ping):
Seq(1 R150): state(alive), packet-loss(0.000%) latency(0.100), jitter(0.017) sla_map=0x0
Seq(2 R160): state(dead), packet-loss(100.000%) sla_map=0x0

FGT # diagnose sys sdwan health-check ping
Health Check(ping):
Seq(1 R150): state(alive), packet-loss(0.000%) latency(0.100), jitter(0.017) sla_map=0x0
Seq(2 R160): state(dead), packet-loss(100.000%) sla_map=0x0
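
When this output is collected from many devices, the dead members can be picked out with a script. The following Python sketch parses the Seq(...) lines with a regular expression. It is written against the sample output above only, so the pattern and the function name dead_members are assumptions and may need adjusting for other FortiOS versions.

```python
import re

# Latency and jitter are optional because dead members do not report them.
SEQ_RE = re.compile(
    r"Seq\((?P<seq>\d+) (?P<intf>[^)]+)\): state\((?P<state>\w+)\), "
    r"packet-loss\((?P<loss>[\d.]+)%\)"
    r"(?: latency\((?P<latency>[\d.]+)\), jitter\((?P<jitter>[\d.]+)\))?"
    r" sla_map=(?P<sla_map>0x[0-9a-fA-F]+)"
)


def dead_members(output: str) -> list:
    """Return the interface names of all members reported as dead."""
    return [m["intf"] for m in SEQ_RE.finditer(output) if m["state"] == "dead"]


sample = """\
Health Check(ping):
Seq(1 R150): state(alive), packet-loss(0.000%) latency(0.100), jitter(0.017) sla_map=0x0
Seq(2 R160): state(dead), packet-loss(100.000%) sla_map=0x0
"""

print(dead_members(sample))
```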

SD-WAN Member Status

  1. View SD-WAN member status when the load-balance mode is source-ip-based or source-dest-ip-based.

    FGT # diagnose sys sdwan member
    Member(1): interface: R150, gateway: 10.100.1.1 2000:10:100:1::1, priority: 0 1024, weight: 0
    Member(2): interface: R160, gateway: 10.100.1.5 2000:10:100:1::5, priority: 0 1024, weight: 0
    
  2. View SD-WAN member status when the load-balance mode is weight-based.

    FGT # diagnose sys sdwan member
    Member(1): interface: R150, gateway: 10.100.1.1 2000:10:100:1::1, priority: 0 1024, weight: 33
      Session count: 15
    Member(2): interface: R160, gateway: 10.100.1.5 2000:10:100:1::5, priority: 0 1024, weight: 66
      Session count: 1
    
  3. When the load-balance mode is measured-volume-based:

    • View SD-WAN member status when every member still has volume room.

      FGT # diagnose sys sdwan member
      Member(1): interface: R150, gateway: 10.100.1.1 2000:10:100:1::1, priority: 0 1024, weight: 33
        Config volume ratio: 33, last reading: 218067B, volume room 33MB
      Member(2): interface: R160, gateway: 10.100.1.5 2000:10:100:1::5, priority: 0 1024, weight: 66
        Config volume ratio: 66, last reading: 202317B, volume room 66MB
      
    • One member has exhausted its volume.

      FGT # diagnose sys sdwan member
      Member(1): interface: R150, gateway: 10.100.1.1 2000:10:100:1::1, priority: 0 1024, weight: 0
        Config volume ratio: 33, last reading: 1287767633B, overload volume 517MB
      Member(2): interface: R160, gateway: 10.100.1.5 2000:10:100:1::5, priority: 0 1024, weight: 63
        Config volume ratio: 66, last reading: 1686997898B, volume room 63MB
      
  4. When the load-balance mode is usage-based (spillover):

    • View SD-WAN member status when no spillover has occurred.

      FGT # diagnose sys sdwan member
      Member(1): interface: R150, gateway: 10.100.1.1 2000:10:100:1::1, priority: 0 1024, weight: 255
        Egress-spillover-threshold: 400kbit/s, ingress-spillover-threshold: 300kbit/s
        Egress-overbps=0, ingress-overbps=0
      Member(2): interface: R160, gateway: 10.100.1.5 2000:10:100:1::5, priority: 0 1024, weight: 254
        Egress-spillover-threshold: 0kbit/s, ingress-spillover-threshold: 0kbit/s
        Egress-overbps=0, ingress-overbps=0
      
    • A member has spilled over.

      FGT # diagnose sys sdwan member
      Member(1): interface: R150, gateway: 10.100.1.1 2000:10:100:1::1, priority: 0 1024, weight: 255
        Egress-spillover-threshold: 400kbit/s, ingress-spillover-threshold: 300kbit/s
        Egress-overbps=1, ingress-overbps=0
      Member(2): interface: R160, gateway: 10.100.1.5 2000:10:100:1::5, priority: 0 1024, weight: 254
        Egress-spillover-threshold: 0kbit/s, ingress-spillover-threshold: 0kbit/s
        Egress-overbps=0, ingress-overbps=0
      
    • The diagnose netlink dstmac list command can also show whether spillover has occurred.

      FGT # diagnose netlink dstmac list R150
      dev=R150 mac=00:00:00:00:00:00 vwl rx_tcp_mss=0 tx_tcp_mss=0 egress_overspill_threshold=50000 egress_bytes=100982 egress_over_bps=1 ingress_overspill_threshold=37500 ingress_bytes=40 ingress_over_bps=0 sampler_rate=0 vwl_zone_id=1 intf_qua=0
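
The thresholds in the two outputs look like the same values in different units: 400kbit/s and 300kbit/s in diagnose sys sdwan member, and 50000 and 37500 in diagnose netlink dstmac list. The Python sketch below checks that reading by converting kbit/s to bytes per second. The unit interpretation is inferred from the sample numbers rather than stated by the output, and the function name is hypothetical.

```python
def kbit_to_bytes_per_sec(kbit_per_sec: int) -> int:
    """Convert a rate in kbit/s to bytes per second (1 kbit = 1000 bits)."""
    return kbit_per_sec * 1000 // 8


# Thresholds of Member(1) on R150 in the sample output.
print(kbit_to_bytes_per_sec(400))  # compare with egress_overspill_threshold
print(kbit_to_bytes_per_sec(300))  # compare with ingress_overspill_threshold
```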
      

SD-WAN Rule Status

  1. An SD-WAN rule in manual mode.

    FGT # diagnose sys sdwan service
    Service(1): Address Mode(IPV4) flags=0x200
      Gen(1), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(manual)
      Members(2):
        1: Seq_num(1 R150), alive, selected
        2: Seq_num(2 R160), alive, selected
      Dst address(1):
            10.100.21.0-10.100.21.255
    
  2. An SD-WAN rule in auto mode.

    FGT # diagnose sys sdwan service
    Service(1): Address Mode(IPV4) flags=0x200
      Gen(1), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(auto), link-cost-factor(latency), link-cost-threshold(10), heath-check(ping)
      Members(2):
        1: Seq_num(2 R160), alive, latency: 0.066, selected
        2: Seq_num(1 R150), alive, latency: 0.093
      Dst address(1):
            10.100.21.0-10.100.21.255
    
  3. An SD-WAN rule in priority mode (Best Quality).

    FGT # diagnose sys sdwan service
    Service(1): Address Mode(IPV4) flags=0x200
      Gen(1), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(priority), link-cost-factor(latency), link-cost-threshold(10), heath-check(ping)
      Members(2):
        1: Seq_num(2 R160), alive, latency: 0.059, selected
        2: Seq_num(1 R150), alive, latency: 0.077, selected
      Dst address(1):
            10.100.21.0-10.100.21.255
    
  4. An SD-WAN rule in sla mode (Lowest Cost).

    FGT # diagnose sys sdwan service
    Service(1): Address Mode(IPV4) flags=0x200
      Gen(1), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(sla), sla-compare-order
      Members(2):
        1: Seq_num(1 R150), alive, sla(0x1), gid(0), cfg_order(0), cost(0), selected
        2: Seq_num(2 R160), alive, sla(0x1), gid(0), cfg_order(1), cost(0), selected
      Dst address(1):
            10.100.21.0-10.100.21.255
    
  5. An SD-WAN rule in load-balance mode (Maximize Bandwidth).

    FGT # diagnose sys sdwan service
    Service(1): Address Mode(IPV4) flags=0x200
      Gen(1), TOS(0x0/0x0), Protocol(0: 1->65535), Mode(load-balance hash-mode=round-robin)
      Members(2):
        1: Seq_num(1 R150), alive, sla(0x1), gid(2), num of pass(1), selected
        2: Seq_num(2 R160), alive, sla(0x1), gid(2), num of pass(1), selected
      Dst address(1):
            10.100.21.0-10.100.21.255
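
In each of these outputs, the Members list gives the current preference order, and the members marked selected are the ones the rule can use. The Python sketch below extracts the selected interfaces in listed order. It assumes the line layout of the samples above, and the function name selected_members is hypothetical.

```python
import re

# Matches lines such as "1: Seq_num(2 R160), alive, latency: 0.066, selected".
MEMBER_RE = re.compile(
    r"^\s*(?P<order>\d+): Seq_num\((?P<seq>\d+) (?P<intf>[^)]+)\), (?P<rest>.*)$",
    re.MULTILINE,
)


def selected_members(output: str) -> list:
    """Return interface names of members flagged 'selected', in listed order."""
    return [
        m["intf"]
        for m in MEMBER_RE.finditer(output)
        if m["rest"].rstrip().endswith("selected")
    ]


sample = """\
Service(1): Address Mode(IPV4) flags=0x200
  Members(2):
    1: Seq_num(2 R160), alive, latency: 0.066, selected
    2: Seq_num(1 R150), alive, latency: 0.093
"""

print(selected_members(sample))
```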
    

SD-WAN Statistics

  1. SD-WAN interface status statistics log for the past 15 minutes.

    FGT (root) # diagnose sys sdwan intf-sla-log R150
    Timestamp: Wed Apr 21 16:58:27 2021, used inbandwidth: 655bps, used outbandwidth: 81655306bps, used bibandwidth: 81655961bps, tx bys: 3413479982bytes, rx bytes: 207769bytes.
    Timestamp: Wed Apr 21 16:58:37 2021, used inbandwidth: 649bps, used outbandwidth: 81655540bps, used bibandwidth: 81656189bps, tx bys: 3515590414bytes, rx bytes: 208529bytes.
    Timestamp: Wed Apr 21 16:58:47 2021, used inbandwidth: 655bps, used outbandwidth: 81655546bps, used bibandwidth: 81656201bps, tx bys: 3617700886bytes, rx bytes: 209329bytes.
    Timestamp: Wed Apr 21 16:58:57 2021, used inbandwidth: 620bps, used outbandwidth: 81671580bps, used bibandwidth: 81672200bps, tx bys: 3719811318bytes, rx bytes: 210089bytes.
    Timestamp: Wed Apr 21 16:59:07 2021, used inbandwidth: 620bps, used outbandwidth: 81671580bps, used bibandwidth: 81672200bps, tx bys: 3821921790bytes, rx bytes: 210889bytes.
    Timestamp: Wed Apr 21 16:59:17 2021, used inbandwidth: 665bps, used outbandwidth: 81688152bps, used bibandwidth: 81688817bps, tx bys: 3924030936bytes, rx bytes: 211926bytes.
    Timestamp: Wed Apr 21 16:59:27 2021, used inbandwidth: 671bps, used outbandwidth: 81688159bps, used bibandwidth: 81688830bps, tx bys: 4026141408bytes, rx bytes: 212726bytes.
    ......
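
The reported bandwidth can be cross-checked against the byte counters. Assuming the tx counter is cumulative and the samples are 10 seconds apart, as the timestamps indicate, the difference between two consecutive readings gives the average rate over that interval. The following Python sketch works through the first two samples; the function name average_bps is hypothetical.

```python
def average_bps(bytes_start: int, bytes_end: int, seconds: float) -> float:
    """Average bit rate between two cumulative byte-counter readings."""
    return (bytes_end - bytes_start) * 8 / seconds


# tx counters at 16:58:27 and 16:58:37 in the sample output.
rate = average_bps(3413479982, 3515590414, 10)
print(f"{rate / 1e6:.2f} Mbps")  # close to the reported used outbandwidth of about 81.66 Mbps
```

The small difference from the reported value is expected if the device averages over a slightly different window than the sampling interval.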
    
  2. SLA statistics log for the past 10 minutes.

    FGT (root) # diagnose sys sdwan sla-log ping 1
    Timestamp: Wed Apr 21 17:10:11 2021, vdom root, health-check ping, interface: R150, status: up, latency: 0.079, jitter: 0.023, packet loss: 0.000%.
    Timestamp: Wed Apr 21 17:10:12 2021, vdom root, health-check ping, interface: R150, status: up, latency: 0.079, jitter: 0.023, packet loss: 0.000%.
    Timestamp: Wed Apr 21 17:10:12 2021, vdom root, health-check ping, interface: R150, status: up, latency: 0.081, jitter: 0.024, packet loss: 0.000%.
    Timestamp: Wed Apr 21 17:10:13 2021, vdom root, health-check ping, interface: R150, status: up, latency: 0.081, jitter: 0.025, packet loss: 0.000%.
    Timestamp: Wed Apr 21 17:10:13 2021, vdom root, health-check ping, interface: R150, status: up, latency: 0.082, jitter: 0.026, packet loss: 0.000%.
    Timestamp: Wed Apr 21 17:10:14 2021, vdom root, health-check ping, interface: R150, status: up, latency: 0.083, jitter: 0.026, packet loss: 0.000%.
    Timestamp: Wed Apr 21 17:10:14 2021, vdom root, health-check ping, interface: R150, status: up, latency: 0.084, jitter: 0.026, packet loss: 0.000%.
    ......
    

SD-WAN State Information

  1. State of the entries learned by application control after an SD-WAN rule references application control signatures.

    FGT # diagnose sys sdwan internet-service-app-ctrl-list
    Gmail(15817 4294836957): 64.233.191.19 6 443 Thu Apr 22 10:10:34 2021
    Gmail(15817 4294836957): 142.250.128.83 6 443 Thu Apr 22 10:06:47 2021
    Facebook(15832 4294836806): 69.171.250.35 6 443 Thu Apr 22 10:12:00 2021
    Amazon(16492 4294836342): 3.226.60.231 6 443 Thu Apr 22 10:10:57 2021
    Amazon(16492 4294836342): 52.46.135.211 6 443 Thu Apr 22 10:10:58 2021
    Amazon(16492 4294836342): 52.46.141.85 6 443 Thu Apr 22 10:10:58 2021
    Amazon(16492 4294836342): 52.46.155.13 6 443 Thu Apr 22 10:10:58 2021
    Amazon(16492 4294836342): 54.82.242.32 6 443 Thu Apr 22 10:10:59 2021
    YouTube(31077 4294838537): 74.125.202.138 6 443 Thu Apr 22 10:06:51 2021
    YouTube(31077 4294838537): 108.177.121.119 6 443 Thu Apr 22 10:08:24 2021
    YouTube(31077 4294838537): 142.250.136.119 6 443 Thu Apr 22 10:02:02 2021
    YouTube(31077 4294838537): 142.250.136.132 6 443 Thu Apr 22 10:08:16 2021
    YouTube(31077 4294838537): 142.250.148.100 6 443 Thu Apr 22 10:07:28 2021
    YouTube(31077 4294838537): 142.250.148.132 6 443 Thu Apr 22 10:10:32 2021
    YouTube(31077 4294838537): 172.253.119.91 6 443 Thu Apr 22 10:02:01 2021
    YouTube(31077 4294838537): 184.150.64.211 6 443 Thu Apr 22 10:04:36 2021
    YouTube(31077 4294838537): 184.150.168.175 6 443 Thu Apr 22 10:02:26 2021
    YouTube(31077 4294838537): 184.150.168.211 6 443 Thu Apr 22 10:02:26 2021
    YouTube(31077 4294838537): 184.150.186.141 6 443 Thu Apr 22 10:02:26 2021
    YouTube(31077 4294838537): 209.85.145.190 6 443 Thu Apr 22 10:10:36 2021
    YouTube(31077 4294838537): 209.85.200.132 6 443 Thu Apr 22 10:02:03 2021
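
A long list like this is easier to review when grouped by application. The Python sketch below collects the learned IP addresses per application name. It assumes each entry has the form name(id id): ip protocol port timestamp, as in the sample. The meaning of the two numbers in parentheses is not stated here, so the code does not interpret them. The function name ips_by_app is hypothetical.

```python
import re
from collections import defaultdict

ENTRY_RE = re.compile(
    r"^\s*(?P<app>[^(\n]+)\(\d+ \d+\): (?P<ip>\S+) (?P<proto>\d+) (?P<port>\d+) ",
    re.MULTILINE,
)


def ips_by_app(output: str) -> dict:
    """Group learned destination IPs by application name."""
    grouped = defaultdict(list)
    for m in ENTRY_RE.finditer(output):
        grouped[m["app"]].append(m["ip"])
    return dict(grouped)


sample = """\
Gmail(15817 4294836957): 64.233.191.19 6 443 Thu Apr 22 10:10:34 2021
Gmail(15817 4294836957): 142.250.128.83 6 443 Thu Apr 22 10:06:47 2021
Facebook(15832 4294836806): 69.171.250.35 6 443 Thu Apr 22 10:12:00 2021
"""

for app, ips in ips_by_app(sample).items():
    print(app, len(ips), ips)
```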
    
  2. View the health check status of a shortcut tunnel created by IPsec (diagnose sys link-monitor interface <name> <name>_0).

    Spoke1 # diagnose sys link-monitor interface Spoke1_WAN2 
    Interface(Spoke1_WAN2): state(down, since Fri Jan 19 16:58:13 2024), bandwidth(up:288bps, down:0bps), session count(IPv4:7, IPv6:0), tx(26595 bytes), rx(21568 bytes).
    
    Spoke1 # diagnose sys link-monitor interface Spoke1_WAN2 Spoke1_WAN2_0
    Interface(Spoke1_WAN2_0): state(up, since Fri Jan 19 16:58:08 2024), bandwidth(up:320bps, down:320bps), session count(IPv4:0, IPv6:0), tx(12360 bytes), rx(12240 bytes), latency(2.15), jitter(0.53), packet-loss(0.00).
    
  3. View the BGP route-tag used by SD-WAN.

    FGT # get router info bgp network 10.100.11.0/24
    VRF 0 BGP routing table entry for 10.100.11.0/24
    Paths: (2 available, best #2, table Default-IP-Routing-Table)
      Advertised to non peer-group peers:
       10.100.1.1
      Original VRF 0
      20 10
        10.100.1.1 from 10.100.1.1 (5.5.5.5)
          Origin incomplete metric 0, route tag 15, localpref 100, valid, external, best
          Community: 30:5
          Advertised Path ID: 2
           Last update: Thu Apr 22 10:27:27 2021
    
      Original VRF 0
      20 10
        10.100.1.5 from 10.100.1.5 (6.6.6.6)
          Origin incomplete metric 0, route tag 15, localpref 100, valid, external, best
          Community: 30:5
          Advertised Path ID: 1
           Last update: Thu Apr 22 10:25:50 2021
    
    FGT # diagnose sys sdwan route-tag-list
    Route-tag: 15, address: v4(1), v6(0)Last write/now: 6543391 6566007
            service(1), last read route-tag 15 at 6543420
    Prefix(24): Address list(1):
            10.100.11.0-10.100.11.255 oif: 50 48
    
    FGT # diagnose firewall proute list
    list route policy info(vf=root):
    id=2133196801(0x7f260001) vwl_service=1(DataCenter) vwl_mbr_seq=1 2 dscp_tag=0xff 0xff flags=0x40 order-addr tos=0x00 tos_mask=0x00 protocol=0 sport=0-65535 iif=0 dport=1-65535 oif=48(R150) oif=50(R160)
    destination(1): 10.100.11.0-10.100.11.255
    source wildcard(1): 0.0.0.0/0.0.0.0
    hit_count=0 last_used=2021-04-22 10:25:10
    

Copyright © 2024 Fortinet Inc. All rights reserved. Powered by Fortinet TAC Team.
Page last revised: 2024-01-19 17:26:03
