How to identify the cause of network flooding using network monitor?

    Question

  • Hi experts,

      I am a new Netmon user. I sometimes see an intermittent link-down between 3:00am and 3:02am. Using Windows perfmon, I found that the network is being flooded at those times. Could you tell me how to identify the root cause using Netmon?

     

    As you can see from Bytes Total/sec on the four 10GbE links (screenshot 1) below, traffic went from 10MB/s to 140MB/s on the four links at 3:00am and caused "no buffers" errors (screenshot 2).

     I checked output.zip and I can see that packets/sec increased at 3:02am, but there is no suspicious IP address. Please help.

    You can find the file in the netmon directory:

    ftp://ftp01.quantatw.com/

    user: sapftp

    password: wju123

     

    Thanks for your help. Any ideas will be really appreciated.

     

    Thursday, September 27, 2012 7:40 AM

Answers

  • If you can capture the data, Network Analyzer can parse the SQL traffic and you might be able to understand what is going on.  But other than that, you'll probably need to find some SQL expertise on a different forum.

    Paul

    • Marked as answer by dennislee124 Friday, October 05, 2012 1:48 AM
    Thursday, October 04, 2012 9:38 PM
    Owner

All replies

  • This issue might be tricky to troubleshoot with Network Monitor.  The problem is that high bandwidth might also mean that we can't capture that much traffic.  Your best bet for capturing the data would be to use NMCap.  And if there are high-performance or blob filters you can use as a capture filter, that might help too.

    Assuming you can capture the traffic, I would probably use the TopUsers expert from CodePlex to see who this machine is communicating with the most.  Then you can narrow down and try to understand exactly which traffic is causing the issues.  From this you might be able to tie it back to a process.

    Paul

    Wednesday, October 03, 2012 9:00 PM
    Owner
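
The "who is this machine talking to the most" step can be sketched in a few lines of Python. This is not the TopUsers expert itself, only a minimal illustration of the same idea, assuming the capture has already been reduced to (source, destination, frame length) records. The addresses and sizes below are hypothetical.

```python
from collections import Counter

def top_talkers(frames, n=5):
    """Return the n (source, destination) pairs that moved the most bytes."""
    totals = Counter()
    for src, dst, length in frames:
        totals[(src, dst)] += length
    return totals.most_common(n)

# Hypothetical sample records: (source, destination, frame length in bytes).
frames = [
    ("10.0.0.5", "10.0.0.9", 1500),
    ("10.0.0.5", "10.0.0.9", 1500),
    ("10.0.0.7", "10.0.0.9", 400),
    ("10.0.0.5", "10.0.0.8", 900),
]

for (src, dst), total in top_talkers(frames, n=2):
    print(f"{src} -> {dst}: {total} bytes")
```

Once the heaviest pair is known, the capture can be filtered down to that conversation before looking at individual requests.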
  • Hi Paul,

      Thanks. I tried Wireshark before, and I already know that it is SQL Server and the SAP application servers that sent a huge amount of data out on the 4 x 10GbE NICs and exhausted my network resources. But there are too many SAP scheduled jobs and remote function calls. Are there any tools to analyze SQL Server packets, e.g. to identify which spid, stored procedure, or ad-hoc query is responsible?

    Burst rate:

      NIC1: 142MB
      NIC2: 104MB
      NIC3: 309MB (no buffers)
      NIC4: 45MB

    Thursday, October 04, 2012 4:48 AM
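
To put the burst figures in context, here is a rough utilisation calculation. It assumes the figures above are in MB per second (the post only says "MB") and takes a 10GbE link as 1250 MB/s, ignoring protocol overhead.

```python
# 10 Gbit/s divided by 8 bits per byte = 1250 MB/s (decimal units, no overhead).
LINK_CAPACITY_MB_S = 10_000 / 8

def utilisation(rate_mb_s, capacity_mb_s=LINK_CAPACITY_MB_S):
    """Fraction of link capacity used at the given rate."""
    return rate_mb_s / capacity_mb_s

# Burst rates from the post, assumed to be MB/s.
bursts = {"NIC1": 142, "NIC2": 104, "NIC3": 309, "NIC4": 45}

for nic, rate in bursts.items():
    print(f"{nic}: {rate} MB/s is {utilisation(rate):.1%} of line rate")
```

Under these assumptions every link stays below a quarter of line rate. Perfmon counters are averages over the sample interval, though, so shorter bursts inside an interval could be considerably higher.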
  • Not sure if you have already solved the problem. A 'no buffer' error basically means the NIC cannot transfer the traffic to main memory. It is more likely a driver issue, or there is not enough memory for the driver to use.

    I do not think it matters which spid or stored procedure it is. Any DB query can create a huge burst.

    As a temporary mitigation, you could reduce the TCP receive window size in Windows (or whichever operating system you use), or lower the number of connections for a service, to make it less aggressive. Also make the DB server side less aggressive and have it respond to packet loss. From the diagram you attached, the server obviously did not respond to TCP packet loss at all, unless you are not using TCP. The server may have too large a value set for 'packet lose burst'.

    By the way, I found no file at the FTP location, so the above is only my guess.

    Monday, October 22, 2012 5:11 AM
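
The reason a smaller receive window makes a sender less aggressive is that a single TCP connection can have at most one window of unacknowledged data in flight, so its throughput is bounded by window size divided by round-trip time. A rough sketch of that arithmetic follows. The 1 ms round-trip time and the window sizes are hypothetical, not measurements from this thread.

```python
def max_throughput_mb_s(window_bytes, rtt_seconds):
    """Upper bound on one TCP connection's throughput, in MB/s."""
    return window_bytes / rtt_seconds / 1e6

RTT_SECONDS = 0.001  # hypothetical LAN round-trip time

for window in (64 * 1024, 256 * 1024, 1024 * 1024):
    cap = max_throughput_mb_s(window, RTT_SECONDS)
    print(f"window {window // 1024} KiB: at most {cap:.1f} MB/s per connection")
```

The bound applies per connection, which is why lowering the number of connections is listed as the other lever.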
  • Hi zekwg,

      Thanks. It is probably not solved yet, but it has not occurred for 2 weeks. Yes, it is probably a driver issue. I now run HP network survey periodically. I want to show HP that each time it occurred, the no buffers errors on the NIC increased.

       Also, I increased the receive buffers from 896 to 8960 packets. It is weird that it often occurred at 3:01am, so I suspect that some program or spid caused the problem.

    Thursday, October 25, 2012 2:33 AM
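
One way to check whether the bursts really cluster around 3:01am is to count, per minute of the day, how many samples exceed a threshold. The sketch below assumes the perfmon data has been reduced to (timestamp, bytes/sec) pairs. The timestamp format, sample values, and the 100 MB/s threshold are all hypothetical.

```python
from collections import Counter
from datetime import datetime

def spike_minutes(samples, threshold):
    """Count, per HH:MM of the day, how many samples exceed the threshold."""
    hits = Counter()
    for stamp, value in samples:
        if value > threshold:
            when = datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S")
            hits[when.strftime("%H:%M")] += 1
    return hits

# Hypothetical samples: (timestamp, bytes per second).
samples = [
    ("09/25/2012 02:59:00", 10e6),
    ("09/25/2012 03:01:00", 140e6),
    ("09/26/2012 03:01:00", 135e6),
    ("09/26/2012 14:30:00", 12e6),
    ("09/27/2012 03:00:00", 120e6),
]

for minute, count in spike_minutes(samples, threshold=100e6).most_common():
    print(f"{minute}: {count} spike(s)")
```

A minute that shows up on several different days points toward a scheduled job rather than a random burst, which narrows the list of jobs worth checking.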