none
Number of packets RRS feed

  • Question

  • Hello!

    Been puzzled by the following issue: after enabling ESP I sometimes get strange results when tracing ping command. The problem is that the single command "ping somehost" which usually produces 8 packets (4 requests and 4 replies) may produce only 6 or even 4 esp packets, for example:

    1) ping somehost

    - that's as expected.

    2) ping somehost

    - only 6 out of 8

    3) ping somehost

    - only 4 out of 8

    I can hardly imagine how the presence of ESP may affect the number of packets sent/received - what am I missing here?

    Thank you in advcance,
    Michael

    Thursday, October 31, 2019 2:29 PM

Answers

  • Hello Micheal,

    I think that this is a perfectly normal trace of 4 echo requests and 4 echo replies, sent and received in order with the expected time differences (a few milliseconds between request and reply, one second between successive requests).

    Two things combine to make the trace interpretation difficult. Firstly, the display seems to have been sorted by the DiagnosisTypes column (indicated by the "down arrow" in the image extract below):

    Mentally sorting the messages by MessageNumber or Timestamp gives a better picture of what is happening.

    Secondly, MMA has incorrectly decoded the first ESP packet carrying an echo request as a TCP segment. If one looks at the MMA OPN file that handles ESP, one can see that it uses heuristics to decide whether (in the captured trace data) the ESP payload is encrypted or not - in this case it seems to have got it wrong. There might be room for improvement in the OPN, but it would take a while to work out which decoding paths pass through this heuristic routine and what range of input might be expected. 

    Gary

    • Marked as answer by MF47 Monday, November 18, 2019 3:25 PM
    Thursday, November 14, 2019 10:16 AM

All replies

  • Hello Michael,

    It may seem like a silly/insulting question, but are you sure that the ping command is reporting 0% packet loss (i.e. 4 sent and 4 received)?

    What is noticeable in all three of the screen snapshots that you sent is that the ESP sequence number is low (single digit) - it looks as though a new security association has just been established, its establishment perhaps triggered by the first echo request. Also, the interval between each pair of packets is one second, so if packets are lost then they are contiguous (the last packets or, more likely, the first packets).

    If ping is reporting 0% loss, then I would suggest creating a trace using Event Tracing for Windows (ETW) to capture both the raw packet on the wire (ESP) and the encapsulated packet (ICMP) - Microsoft Message Analyzer can do that. This is what the resulting trace looks like:

    Gary

    Thursday, October 31, 2019 4:38 PM
  • Hello Gary,

    If this question does really insult you, you might've decided not to answer it. For me it is this question that sounds "silly/insulting" - "are you sure that the ping command is reporting 0% packet loss (i.e. 4 sent and 4 received)?" If you think I may be asking a question regarding tracing packets without understanding which packets does really get sent and which not - no comments.

    As to the question asked, here's one more - and maybe more strange - example of the issue: as you know for the authenticated/encrypted connection to be established the initial ipsec negotiation phase must be completed and in 99% I see the following packets in my trace:

    1-1) Restart the host

    1-2) Confirm no SA exist

    1-3) Ping some host - the first packet will have much longer response time due to the ipsec negotianion phase:

    1-4) The new SA gets created

    1-5) The trace:

    This trace does contain both phases and 8 ESP packets that I think are the 4 request and 4 reply packets.

    Sometimes I see this:

    2-1)  Restart the host
    2-2) Confirm no SA exist - the same as 1-2
    2-3) Ping some host - the same as 1-3

    2-4) The new SA gets created - the same as 1-4

    2-5) The trace:

    This time 1) there's no negotiation phase 2) the number of ESP packets is 6 instead of 8. I find it strange and would appreciate any help on the matter.

    Regards,
    Michael



    • Edited by MF47 Friday, November 1, 2019 10:17 AM
    Friday, November 1, 2019 10:15 AM
  • Hello Michael,

    Sorry, you misunderstood my comment. I was suggesting that my question about the packet loss might sound silly/insulting - not the original question. However, I still think that the "packet loss" question needed to be asked because it seems like there were three possibilities:

    1. ICMP packets were lost.
    2. Packets were lost from the trace (still a possibility, even after your last post).
    3. Something for which there is no obvious explanation is taking place.

    As in my previous message, I still think that you need to capture what is happening from different perspectives concurrently. I would suggest using the commands:

    netsh trace start scenario=lan capture=yes provider=Microsoft-Windows-TCPIP provider=Microsoft-Windows-WFP provider=Microsoft-Windows-Ras-NdisWanPacketCapture provider={106B464D-8043-46B1-8CB8-E92A0CD7A560} keywords=0xFFFFFFFFFFFF level=255 tracefile=ping.etl

    netsh trace stop

    {106B464D-8043-46B1-8CB8-E92A0CD7A560} is the "IKEEXT Trace Provider" (a WPP provider).

    Gary

    Friday, November 1, 2019 11:40 AM
  • Gary, please excuse me for my answer - I didn't really understand what you were saying - sorry for that!

    Thank you very much for your suggestion - I'll try it (if I'll be able to catch it once again - I've seen that issue just 3 or 4 times by now).

    Regards,
    Michael

    Friday, November 1, 2019 2:49 PM
  • Hello Michael,

    No problem. On reflection, I think that we can omit the "IKEEXT Trace Provider" from the trace; the Microsoft-Windows-WFP provider summarizes the information and we don't need to analyse the AuthIP contents in detail.

    You will probably need to use Microsoft Message Analyzer (MMA) to view the data - I don't think that Network Monitor will nicely format all of the ETW data.

    Here is a single "ping" traced at 3 different points:

    Frames 759 and 775 are from the TCP/IP driver, recording the sending/receiving of the echo.

    Frames 764 and 774 are the echo reply/request traced in the tunnel.

    Frames 769 and 770 are (presumably) the ESP packets carrying the echo request/reply.

    If the 4 pings show as OK in the command window and there is not a trace of them in this data, then I will be very surprised.

    If you install MMA, then it can be used to capture the data instead of "netsh trace"; as you might notice, "netsh trace stop" can take a long time while it gathers information that we do not need.

    Gary


    Friday, November 1, 2019 3:35 PM
  • Hello,
    Thank you for posting in our TechNet forum.

    If this question has any update or is this issue solved? Also, for the question, is there any other assistance we could provide?



    Best Regards,
    Daisy Zhou


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.

    Tuesday, November 5, 2019 10:06 AM
    Moderator
  • Hello Gary,

    Thank you so much for the detailed example.

    I haven't been able to reproduce the issue and catch it in MMA yet but would you please explain me the following: it is this -  "Frames 764 and 774 are the echo reply/request traced in the tunnel."  (the presence of all 8 echo requests/replies inside a ESP)- that would define whether the trace looks "normal" or not. In this case what's the difference which network analyzer is being used?

    In other words, if your screenshot did not contain one of the esp packets (say 770) what should be the next step in troubleshooting?

    For example, I've seen this issue in Network Monitor once again:

    - the trace above looks normal except for the single missing ESP packet - at least I assume that if I do see 4 replies that means there could not be less than 8 ESP packets - exactly as I've captured in MMA (as you see there's no ipsec negotiation phase in this trace - I think that's because all ipsec keys still apply for the current session).

    So if I manage to capture the issue in MMA it will look exactly as in the Network Monitor (I guess) - there will be one (or two or three...) packet(s) missing and providing we can not look inside ESP packets I don't quite understand what I should do next.

    Regards,
    Michael

    Tuesday, November 5, 2019 12:57 PM
  • Hello Michael,

    The difference between Network Monitor (Netmon) and Microsoft Message Analyzer (MMA), in this case, is that MMA can capture and merge network related events from various different points in the processing of network traffic. Here is a suitable MMA capture configuration:

    We can "look inside" the ESP packets by capturing their contents before/after encryption/decryption - that is what the Microsoft-Windows-Ras-NdisWanPacketCapture provider does. The six frames that I mentioned in my previous post (3 outbound and 3 inbound) are one echo request/reply pair captured at different points in their processing.

    One thing that can happen is that a packet gets lost/dropped - then ping will report less than 100% success. Another thing that can happen is that the tracing mechanism fails to capture/record a packet, perhaps because of a shortage or resources. When tracing at all of these points in the processing of network traffic, one would hope that some points in the processing of a "seemingly" lost packet would be recorded.

    In your last post, the trace with 7 ESP packets does contain hints that Netmon just failed to capture/record a packet that was probably sent, since the outbound "Seq = 0x2" packet is missing. The presence of an improbably large (38150 byte) UDP packet at just the time the ICMP packet was expected (1 second after the previous send and a few milliseconds before the presumed ICMP reply) also hints at a tracing problem. The trace events from the Microsoft-Windows-TCPIP provider would help confirm/explain what is actually happening.

    Gary

    Tuesday, November 5, 2019 1:57 PM
  • Hello Gary,

    Thank you once again for the great clarification - I'll try "to catch" the issue in MMY and post back the screenshot.

    Frankly speeking I've never come accross such issues - I thought it's not possible for network sniffers to miss the packets...

    Regards,
    Michael

    P.S. MMA is to be depricated at 25.11.2019 - that's rather strange to me...

    Wednesday, November 6, 2019 9:48 AM
  • Hello Michael,

    Here is an extract from one of the images that you posted of Netmon:

    The number "Dropped" refers to the number of packets that were "seen" by the Netmon kernel driver but had to be "dropped" (not recorded) because there were not enough resources available.

    ETW also records the number of events "seen" that could not be recorded in the trace header:

    This trace (of a high volume event generating provider) lost 23943 events. Event loss is common, but one would expect the rate of packet send/receive events on a client PC to be manageable without packet loss.

    Gary

    Wednesday, November 6, 2019 1:17 PM
  • "The number "Dropped" refers to the number of packets that were "seen" by the Netmon kernel driver but had to be "dropped" (not recorded) because there were not enough resources available." - yes, I see - but in my previous test the number of dropped packets = 0...
    Thursday, November 7, 2019 2:42 PM
  • Hello Michael,

    Yes, I noticed. If the number was non-zero, we could immediately say that the missing packets were simply dropped from the trace data. Since the number is zero, we are investigating further - with the ETW/MMA trace.

    Gary

    Thursday, November 7, 2019 2:46 PM
  • Hello Gary,

    Thank you for the clarification!

    I still haven't pinpointed the issue in MMA - I must first learn it a bit before trying to use it as a troubleshooting tool (at least a couple of days).

    May I ask you one more question regarding the MMA configuration? According to MMA Operating Guide the Microsoft-Windows-NDIS-PacketCapture provider is enough to capture packets at Link Layer and above (that will include TCP packets as well) - what's the purpose of adding this explicit TCP provider? Please excuse me for the off-topic question - I was unable to find a good explanation for the purpose of each provider so far.

    Regards,
    Michael


    • Edited by MF47 Friday, November 8, 2019 3:17 PM
    Friday, November 8, 2019 3:17 PM
  • Hello Michael,

    Event Tracing for Windows (ETW) is a general mechanism for tracing activity in Windows - it can be used to trace file and registry operations, process and thread activity (including thread context switches) and many other things; there are roughly one thousand "manifest" based event providers (easy to interpret the capture data) and a few thousand "WPP" providers (need access to private symbols of DLLs to interpret the capture data).

    Picking a suitable set of providers to understand what is happening in any given scenario is something that requires experience.

    The Microsoft-Windows-NDIS-PacketCapture provider (ndiscap.sys) is the equivalent of Netmon's nm.sys - it captures the data seen by the network adapters.

    Microsoft-Windows-Ras-NdisWanPacketCapture captures packets too, but in a different way. The kernel driver that is responsible for encapsulating packets in ESP/AH is "instrumented" (contains explicit logging calls) to record the data that it processes.

    Microsoft-Windows-TCPIP is a bit different - it records events in the kernel TCP/IP drivers (tcpip.sys, netio.sys, etc.), but from a different perspective: it does not record packets, but does record that RTT (round trip times) were calculated/updated, memory was allocated, routing tables were consulted etc..

    Microsoft-Windows-WFP will record sending/receiving of IKE/IKEv2/AuthIP messages, security association establishment, etc.

    So, in summary, the reason for adding Microsoft-Windows-TCPIP is that it adds a different perspective/view on the activity that is complementary to the raw packet data.

    Gary

    Friday, November 8, 2019 3:44 PM
  • Hello Gary,

    Thank you so much for the great theory!!! I'll try to trace the issue asap and post back the results!

    Regards,
    Michael

    Friday, November 8, 2019 6:40 PM
  • Hello Gary,

    While trying to trace the ping that produces only 6 or 4 ESP packets I got 12 ESP packets after running the single ping (I mean all the packets from the trace have been captured during the ping only):

    Session configuration:

    Here's the overall ESP trace (only network frames):

    Looking at the screenshot above I thought the trace contains the 8 "correct" ESP packets (4 requests and 4 replies) and 4 "incorrect" ones but seems I was wrong. The biggest problem in this trace for me that it does not contain most packets from "other levels" (TCP, ICPM -764/774 on your picture,...) - 8 ESP packets have neither preceding corresponding MS_Windows_TCPIP\ICMP SendMessage frames nor subsequent MS_Windows_TCPIP\ICMP ReceivedMessage frames, for example:

    There're only four ESP packets in this trace that do have the subsequent MS_Windows_TCPIP\ICMP ReceiveMessage frames (I'll post just one - last -screenshot):

    So I probably may say that the trace contain 4 icmp replies and 8 unknown ESP packets...

    Regards,
    Michael




    • Edited by MF47 Wednesday, November 13, 2019 12:40 PM
    Wednesday, November 13, 2019 12:38 PM
  • Hello Michael,

    If you look closely at your fourth picture, you should see this:

    Message 21360 - a KeepAliveTimer for connection X has expired

    Message 21361 - a KeepAliveTimer for connection Y has expired

    Message 21362 - send keep-alive for connection X

    Message 21363 - restart KeepAliveTimer for connection X

    Message 21364 - An ESP packet - presumably the TCP keep-alive mentioned in Message 21362

    Message 21365 - send keep-alive for connection Y

    Message 21366 - restart KeepAliveTimer for connection Y

    Message 21367 - An ESP packet - presumably the TCP keep-alive mentioned in Message 21365

    I think that your 12 packets are 4 echo requests, 4 echo replies and 4 TCP keep-alives.

    Gary


    Wednesday, November 13, 2019 1:12 PM
  • "Message 21360 - a KeepAliveTimer for connection X has expired" - yes, I thought about it, but why there's no MS_Windows_TCPIP\ICMP SENDMessage and ICMP frames as in your first post?
    Wednesday, November 13, 2019 2:17 PM
  • Hello Michael,

    I don't understand your question.

    In your 5th picture, one can see an ICMP ping and Messages 23252 and 23259 show the TCP/IP driver announcing "ICMP: Sendmessage" and "ICMP: Receivemessage".

    The 4th picture does not show that text because the messages being processed by the TCP/IP driver are TCP keep-alives (and not ICMP messages); the text "send keep-alive" is shown to reflect that fact.

    Gary

    Wednesday, November 13, 2019 2:31 PM
  • Hello Gary,

    "On your 5th picture, one can see an ICMP ping and Messages 23252 and 23259" - awfully sorry...missed it - I thought these frames should be "closer" (~23256) to the network ESP packets (as the ICMP receive packets follows the ESP packets). Thank you once again for the clarification!

    Regards,
    Michael

    Wednesday, November 13, 2019 7:56 PM
  • Hello Gary,

    Suppose this trace may be conceded as the one with the missing ESP packet - MMA first has logged only 7 out of 8 ESP packets and after ~0.5 second delay has added the eighth packet (the first! one):

    The complete view:

    As far as I see MMA might have not have enough time to capture the first two packets thus resulting in (almost) missing ESP packet... or am I wrong here?

    Regards,
    Michael


    • Edited by MF47 Thursday, November 14, 2019 9:24 AM
    Thursday, November 14, 2019 9:23 AM
  • Hello Micheal,

    I think that this is a perfectly normal trace of 4 echo requests and 4 echo replies, sent and received in order with the expected time differences (a few milliseconds between request and reply, one second between successive requests).

    Two things combine to make the trace interpretation difficult. Firstly, the display seems to have been sorted by the DiagnosisTypes column (indicated by the "down arrow" in the image extract below):

    Mentally sorting the messages by MessageNumber or Timestamp gives a better picture of what is happening.

    Secondly, MMA has incorrectly decoded the first ESP packet carrying an echo request as a TCP segment. If one looks at the MMA OPN file that handles ESP, one can see that it uses heuristics to decide whether (in the captured trace data) the ESP payload is encrypted or not - in this case it seems to have got it wrong. There might be room for improvement in the OPN, but it would take a while to work out which decoding paths pass through this heuristic routine and what range of input might be expected. 

    Gary

    • Marked as answer by MF47 Monday, November 18, 2019 3:25 PM
    Thursday, November 14, 2019 10:16 AM
  • Hello, Gary!

    Thank you so much for your explanations!

    I failed to capture the "wrong-numbered" ping in MMA but I now know how to troubleshoot such issues (maybe MMA captures packets more "correctly"). Thank you once again!

    Regards,
    Michael Firsov

    Monday, November 18, 2019 3:25 PM