Kernel Memory leak? Related to Resource Monitor?

  • Question

  • perfmon.exe memory usage

    I've noticed recently that the Resource Monitor app uses a lot of RAM, up to ~12 GB. Here's a screenshot showing 2.9 GB usage:

    [Screenshot: TaskManager-Details]

    Moreover, 25 minutes later it already uses 3.4 GB:

    [Screenshot: TaskManager-Details 2]

    The only major activity currently happening is the Windows Defender Security Center "Full Scan".

    1. Is it possible that this large RAM usage is associated, say, with the fact that a lot of files are being processed during the "Full Scan", and Resource Monitor keeps "the full log" or something?
    2. Is it possible that this is normal behavior? As far as I can recall, I've not seen anything like this before, even though Resource Monitor has sometimes been running for weeks.

    -------------------------------------------------

    The backing-file for the real-time session

    I'm not sure if this is important, but I've also noticed a number of "The backing-file for the real-time session "WDC.BE95A9B1-DE15-4B78-B923-A12AB70BE951" has reached its maximum size" messages under Administrative Events in Event Viewer.

    AFAIU, this Event Trace Session is started for Resource Monitor's needs, and these messages mean that, from time to time, the Resource Monitor UI is not able to process all the events generated. On the second screenshot, Resource Monitor uses ~3% CPU, which is about one full CPU core. Since it is effectively a single-threaded app, it looks like it's working at its maximum possible throughput. This probably explains why these messages are generated. But... I still can't say whether or not this is significant for the "whole picture".
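
    The "~3% CPU is about one full core" estimate can be checked with simple arithmetic. This is only a sketch: the number of logical processors is not stated anywhere in the thread, so the value 32 below is an assumption chosen for illustration.

```python
def one_core_share_percent(logical_processors: int) -> float:
    """Total-CPU percentage shown when exactly one thread is fully busy."""
    if logical_processors < 1:
        raise ValueError("logical_processors must be at least 1")
    return 100.0 / logical_processors


# Assumption: 32 logical processors (hypothetical, not taken from the thread).
share = one_core_share_percent(32)
print(f"One saturated thread shows up as about {share:.1f}% total CPU")
```

    With that assumed core count, a single saturated thread lands close to the ~3% seen in Task Manager, which fits the single-thread bottleneck theory.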

    ------------------------------------

    General memory usage

    It looks like Resource Monitor is only a small part of the story, actually. For some reason, Windows is currently consuming ~40 GB of RAM:

    [Screenshot: General memory usage]

    User apps definitely do not use that much. Apart from Resource Monitor, I can see no apps with abnormal RAM usage (see the screenshots of the "Details" tab of Task Manager above).

    Just for reference, here's what Resource Monitor shows:

    ---------------------------------------------

    Restarting Resource Monitor

    If I restart Resource Monitor (via "Monitor->Stop Monitoring" and "Monitor->Start Monitoring"), its memory usage goes back to a normal ~100 MB... for a few minutes.

    [Screenshot: Memory usage by Resource Monitor after restarting monitoring]

    But on the whole, memory usage does not decrease much:

    [Screenshot: Overall memory usage after restarting monitoring]

    This feels like higher-than-normal memory usage. But... well... I'm not sure whether it could be considered "normal" or not. Maybe Windows is just trying to utilize empty RAM by caching some stuff, for example by not closing some memory-mapped files. Usually, I use the RamMap tool to get more insight into what's actually happening. But... it looks like RamMap is not working for me any more.

    It works OK for a few minutes after a reboot, but then it fails like this:

    ---------------------------
    RamMap
    ---------------------------
    Error refreshing database
    ---------------------------
    OK   
    ---------------------------

    This does not look nice. And it raises my suspicion: it looks like something is very wrong.

    Ah... Yep, the final thing... I had 3 BSODs on this machine within the last 7 days. Maybe this is what actually raises my suspicion. :)

    So... Maybe it's a memory leak in "iaStorA.sys"? Any idea how to check this?

    PS

    It looks like I have to try PoolMon..

    • Edited by i3v, April 14, 2018, 18:43: fixed image, added "PS"
    April 14, 2018, 18:33

All replies

  • A non-paged pool size of 7 GB is not normal; usually it's < 200 MB.
    If you have a "Killer" network card: their drivers are known to cause this.
    Otherwise, see: Using PoolMon to Find a Kernel-Mode Memory Leak

    The high memory usage of Resource Monitor could be related, but I am not sure.

    • Edited by EckiS, April 14, 2018, 20:58
    April 14, 2018, 20:58
  • What is perfmon running for? If you open perfmon, does it have any monitoring running?

    What is this system, 384 GB? Is that correct?

    April 14, 2018, 21:00
  • @Mr Happy - good catch, I missed that.
    Then a non-paged pool of 7 GB is probably OK and needed by the memory manager.

    • Edited by EckiS, April 14, 2018, 21:47
    April 14, 2018, 21:46
  • If you have a "Killer" network card: their drivers are known to cause this.

    Nope, I don't have anything like that... Only the built-in Intel NIC.

    What is perfmon running for? If you open perfmon, does it have any monitoring running?

    What is this system, 384 GB? Is that correct?

    I use perfmon in many cases to monitor possible performance bottlenecks. I've never noticed it using that much RAM before. But in this case I was just trying to find something suspicious after those BSODs related to "iaStorA.sys", like 100% HDD load for no reason, which usually indicates that an HDD is failing. (Even though the only device attached to the Intel SATA controller is an unused DVD drive.)

    The system is a Dell T7910, and 384 GB is correct.

    @Mr Happy - good catch, I missed that.
    Then a non-paged pool of 7 GB is probably OK and needed by the memory manager.

    It's true that the non-paged pool is typically larger than 200 MB on this system. But still, in my experience, both the non-paged pool and the paged pool are usually at least 2x-3x smaller than now.

    Otherwise, see: Using PoolMon to Find a Kernel-Mode Memory Leak

    I've just tried PoolMon. And... found nothing. The top rows are:

    Top nonpaged:

    Tag    Size     Description
    File   1.9 GB   related to some general FileIO?
    Ntfx   1.7 GB   NTFS -> general
    MmCa   1.3 GB   Mm control areas for mapped files
    EtwB   1.1 GB   Event tracing buffer
    FMsl   0.9 GB   STREAM_LIST_CTRL structure
    FSro   0.4 GB   File System Run Time
    ReTa   0.2 GB   Resource Extended Table

    Top paged:

    Tag    Size     Description
    Ntff   5.9 GB   NTFS -> FCB_DATA
    MmSt   3.3 GB   Mm section object prototype ptes.
    FMfn   2.5 GB   NAME_CACHE_NODE structure
    IoNm   1.3 GB   Io parsing names

    (The basic info is from here)

    So:

    1. Almost everything seems to be related to "heavy IO", "NTFS", "Files". No 3rd party drivers. Just core Windows stuff. I doubt there's an actual memory leak in there... And I doubt that their high memory usage might result in a BSOD... Well, I hope so...
    2. All the "heavy IO activity" is caused by the Windows Defender Full Scan, which is still running. It's reading from one HDD at a time, at ~7-20 MB/s. Well, I'm not sure if it actually benefits much from all those "caches". Probably this is just how Windows works. I'm not blaming it for trying to cache some stuff when that is possible.
    3. There's nothing related to iaStorA.sys. That's a pity...

    The only questionable thing, IMHO, is the EtwB. It sounds like something that might be connected with the Resource Monitor. Or it's just working with poolmon itself, or something else. Not sure. Probably that's OK state as well..

    So... I still don't get

    1. why RamMap is broken,
    2. why ResourceMonitor is eating more and more RAM if it's left running
    3. what's the reason behind those BSODs :)

    Maybe I've missed something?

    April 15, 2018, 1:43
  • Do you have Bit9 installed?
    Can you try this again without any 3rd party antivirus installed, or even in Safe Boot?

    As for the BSODs: Intel has not answered yet, but I guess they would need the dump files anyway (instead of your HTML reports).

    According to your Detailed-System-Report.html you are still on Windows 10 1703, so you could update to a newer version; perhaps this has already been fixed.

    April 15, 2018, 6:28
  • According to your Detailed-System-Report.html you are still on Windows 10 1703, so you could update to a newer version; perhaps this has already been fixed.

    Upgrading the Windows version sounds like a good idea, but the in-place upgrade fails for some reason. I’ve tried a few things and it looks like there’s some progress (the setupact.log is longer now). Maybe I’ll start a separate thread for that in a while.

    I guess they would need the dump files anyway (instead of your HTML reports).

    I do keep all the minidumps (and one larger 30 GB “automatic dump”); I’ve copied them to a separate folder. But they’ve not asked me to share them yet.

    Do you have Bit9 installed? Can you try this again without any 3rd party antivirus installed, or even in Safe Boot?

    Booting into Safe Mode and attempting to run the Windows Defender Full Scan from there is another possible experiment, yep. I’m not sure how to interpret either outcome, but maybe I’ll try this a bit later.

    I’ve also noticed that some users report that this driver (but a different version) might cause high paged pool usage under the "MmSt" poolmon tag:

    RST 11.5 removal instructions (since it doesn't clean up properly)
    This driver is actually based on the RSTe branch, it installs a filter driver that registers to the host lower filter reg keys.
    SMART is not available in this driver and it causes MmSt (see poolmon/pooltag) to rapidly consume paged pool when heavy file access (searching for instance) is performed.
    The removal of this driver is not as simple as uninstalling the driver (which is also performed when doing an RST upgrade install). The iastorA and iastorF (filter) services are not disabled and removed during the update, and the iaStorF filter is not removed from the HDD bus lowerfilter, which means the driver is still loaded even with a different version of the IRST driver. This being the case, it results in the MmSt paging out heavily when buffering PTEs during a search. To remove this driver properly, the user must do the following exactly in this order <....>

    There are also other reports about memory leaks related to these drivers. This time, iaStorV.sys (SpDN tag).

    So, for now, I’ve tried to test the same thing without this Intel driver first.

    I’ve:

    1. manually “updated” the driver to the Microsoft “Standard SATA AHCI Controller” instead.
    2. run pnputil /enum-drivers and found iastora.inf, iastorac.inf, iastorb.inf, iastors.inf (there was no iastorF).
    3. run pnputil /delete-driver oem??????.inf for each of them.
    4. rebooted.
    5. done a sanity check:
      1. pnputil /enum-drivers does not see them any longer.
      2. driverquery /v does see them, but reports that they are stopped and in “Manual” start mode.
      3. The DriverView tool does not see them.
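
    Steps 2 and 3 above can be sketched as a small script that maps the original .inf names to their published oem*.inf names and prints the matching delete commands without running them. This is only a sketch under assumptions: it expects English-language pnputil /enum-drivers output with "Published Name:" and "Original Name:" fields, and the oem11.inf / oem12.inf names and the network driver entry in the sample are hypothetical, not taken from this machine.

```python
import fnmatch

# Hypothetical sample in the assumed English `pnputil /enum-drivers` layout.
SAMPLE_OUTPUT = """\
Microsoft PnP Utility

Published Name:     oem11.inf
Original Name:      iastora.inf
Provider Name:      Intel Corporation
Class Name:         Storage controllers

Published Name:     oem12.inf
Original Name:      e1d65x64.inf
Provider Name:      Intel
Class Name:         Network adapters
"""


def parse_enum_drivers(text):
    """Return one dict of 'Field: value' pairs per driver package."""
    packages = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # banner or blank line
        key, value = key.strip(), value.strip()
        if key == "Published Name":
            packages.append({})  # each package starts with this field
        if packages:
            packages[-1][key] = value
    return packages


def delete_commands(packages, pattern="iastor*.inf"):
    """Build (but do not run) delete commands for matching original names."""
    return [
        f"pnputil /delete-driver {pkg['Published Name']}"
        for pkg in packages
        if fnmatch.fnmatch(pkg.get("Original Name", "").lower(), pattern)
    ]


for command in delete_commands(parse_enum_drivers(SAMPLE_OUTPUT)):
    print(command)
```

    Printing the commands instead of executing them leaves room to review the list before removing any storage driver.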

    So, for now I cannot see any significant changes in memory consumption:

    1. Shortly-after-restart memory usage is about the same as before (it’s a pity I don’t have an explicit “before” screenshot):
      [Screenshot: TaskManager - RAM usage shortly after reboot]
    2. RamMap works shortly after the reboot, but fails after a few hours of “Windows Defender Full Scan”.
    3. After ~14 hours of “Full Scan”, Resource Monitor RAM usage (Working Set) is 1.2 GB. Paged pool is 5.5 GB, non-paged pool is 3.1 GB.
    4. It’s interesting to note that EtwB is currently on top, at 1.1 GB. Maybe this is the maximum, and it’s not going to grow beyond that. And, thus, there’s no leak.

    Top nonpaged

    Tag   Type     Allocs     Frees     Diff     Bytes Per Alloc
    EtwB  Nonp       1469       448     1021  1.11E+09   1085857
    File  Nonp   25816115  24376256  1439859  5.99E+08       415
    Ntfx  Nonp    1527131     77408  1449723  5.09E+08       351
    MmCa  Nonp    1774504    409197  1365307  3.93E+08       288
    FMsl  Nonp    1500573     51637  1448936  2.78E+08       192
    FSro  Nonp    1306576     55833  1250743     1E+08        80
    CcSc  Nonp    2457219   2369808    87411  48950160       560
    MmPb  Nonp          3         0        3  36069376  12023125
    FMsc  Nonp      16909      3231    13678  33483744      2448
    ReTa  Nonp     515745      8729   507016  32474048        64

    Top paged

    Tag   Type     Allocs     Frees     Diff     Bytes Per Alloc
    Ntff  Paged   1295587     33167  1262420  1.72E+09      1360
    MmSt  Paged   1864154    492493  1371661   1.1E+09       801
    FMfn  Paged  11269973   9726292  1543681  7.27E+08       470
    IoNm  Paged  23101047  21835342  1265705  3.42E+08       270
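
    The PoolMon figures above can be sanity-checked and ranked with a few lines of Python. This is a minimal sketch, assuming the rows are pasted as whitespace-separated text in the column order Tag, Type, Allocs, Frees, Diff, Bytes, Per Alloc; only a subset of the rows from this post is included, and the 2% tolerance for the rounded Bytes column is an arbitrary choice.

```python
from typing import NamedTuple

# A subset of the rows posted above, copied as whitespace-separated text.
ROWS = """\
EtwB Nonp 1469 448 1021 1.11E+09 1085857
File Nonp 25816115 24376256 1439859 5.99E+08 415
Ntfx Nonp 1527131 77408 1449723 5.09E+08 351
Ntff Paged 1295587 33167 1262420 1.72E+09 1360
MmSt Paged 1864154 492493 1371661 1.1E+09 801
"""


class PoolTag(NamedTuple):
    tag: str
    pool: str
    allocs: int
    frees: int
    diff: int
    bytes_used: float
    per_alloc: int


def parse_rows(text):
    """Parse 7-column rows; anything else (headers, blanks) is skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 7:
            continue
        tag, pool, allocs, frees, diff, bytes_used, per_alloc = parts
        rows.append(PoolTag(tag, pool, int(allocs), int(frees), int(diff),
                            float(bytes_used), int(per_alloc)))
    return rows


def inconsistent_tags(rows, tolerance=0.02):
    """Tags where Diff != Allocs - Frees, or Bytes is far from Diff * Per Alloc."""
    bad = []
    for row in rows:
        expected_bytes = row.diff * row.per_alloc
        if (row.diff != row.allocs - row.frees
                or abs(expected_bytes - row.bytes_used) > tolerance * row.bytes_used):
            bad.append(row.tag)
    return bad


rows = parse_rows(ROWS)
for row in sorted(rows, key=lambda r: r.bytes_used, reverse=True):
    print(f"{row.tag} {row.pool:5} {row.bytes_used / 2**30:5.2f} GiB "
          f"in {row.diff} outstanding allocations")
print("Inconsistent rows:", inconsistent_tags(rows))
```

    Saving such snapshots at intervals and comparing the Diff and Bytes columns between them is one way to tell steady growth from a cache that has reached a plateau.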

    I’ll probably wait until the “Full Scan” is over, at least. Maybe I’ll even run it twice, to check whether there is a BSOD or not, and how much RAM Resource Monitor is consuming by then.

    April 17, 2018, 16:32
  • I have only seen high memory usage by EtwB (= Event Tracing buffer) when Windows Performance Recorder was running with Logging Mode = Memory.
    Did you perhaps enable some tracing and forget to set it back to the defaults?
    April 17, 2018, 16:46
  • That's an interesting idea. I may have played with "Windows Performance Recorder" on this machine about a month ago.

    For now, it shows "Recording not started":

    But... can I trust this? Maybe it's in some broken half-started state? Is there anything else to check?

    UPD:

    Here's a post saying that WPR was still active in the background, even though "wpr -status" was reporting "WPR is not recording".

    I don't have any "WPR_initiated_" keys in "HKLM\SYSTEM\CurrentControlSet\Control\WMI\Autologger", but I do have several other keys with Start=1. And, indeed, it looks like the "Start" property is the one that controls the "auto-start on boot".
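
    Listing the Autologger sessions that have Start=1 can be scripted instead of clicking through regedit. The following is a minimal sketch, assuming Python on Windows with the standard winreg module; the filtering is kept as a pure function so it can be tried on any machine, and the session names other than "WiFiSession" in the usage example are hypothetical.

```python
from typing import Dict, List, Optional

AUTOLOGGER_PATH = r"SYSTEM\CurrentControlSet\Control\WMI\Autologger"


def autostart_sessions(start_values: Dict[str, Optional[int]]) -> List[str]:
    """Names of sessions whose 'Start' value is 1 (started at boot)."""
    return sorted(name for name, start in start_values.items() if start == 1)


def read_autologger_start_values() -> Dict[str, Optional[int]]:
    """Read the 'Start' value of every Autologger subkey (Windows only)."""
    import winreg  # imported here so the module still loads on other systems

    values: Dict[str, Optional[int]] = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, AUTOLOGGER_PATH) as root:
        subkey_count = winreg.QueryInfoKey(root)[0]
        for index in range(subkey_count):
            name = winreg.EnumKey(root, index)
            with winreg.OpenKey(root, name) as session:
                try:
                    values[name], _ = winreg.QueryValueEx(session, "Start")
                except FileNotFoundError:
                    values[name] = None  # subkey has no 'Start' value
    return values


# Usage on non-Windows machines, with hypothetical data:
example = {"WiFiSession": 1, "HypotheticalDisabled": 0, "HypotheticalNoStart": None}
print(autostart_sessions(example))
```

    On the affected machine, calling autostart_sessions(read_autologger_start_values()) would give a list to compare against another PC.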

    E.g. "WiFiSession" (even though there's no WiFi adapter here). And there's a path that points to a fresh 2.5 MB "WiFi.etl" file. This is strange... "NtfsLog.etl" (4 KB) is also there.

    Hm... I don't remember starting that myself.

    I'll definitely try to remove all those keys tomorrow.

    • Edited by i3v, April 17, 2018, 17:41
    April 17, 2018, 17:14
  • Some ETL traces are running all the time:

    xperf -loggers

    So I don't know whether they simply use larger buffers on a system with 384 GB, in which case it's working as expected.

    April 17, 2018, 19:11
  • Yep, you're right. I've compared the keys I see there with another PC and it looks like there's nothing unexpected, actually.

    A few more things I've noticed during the last few weeks:

    1. I've installed a few updates; now I'm on 1703 15063.1088. Nothing changed.
    2. Resource Monitor memory usage does not grow above Commit Size = 13.6 GB. So, it's hardly a real memory leak. Maybe that's some hard-coded percentage of the amount of RAM installed or something.
    3. Resource Monitor Commit Size never decreases, though. Working Set goes down easily when the system is low on memory. And if there's not much load, it does not grow again. (That is, Resource Monitor is not actually using the memory it allocated, AFAIU.) This looks a bit like a memory leak. Maybe that's "as designed". Not a big deal, anyway.
    4. I've had no BSODs since I uninstalled that Intel SATA AHCI driver.
    5. The Paged/NonPaged Pool easily goes up to ~45 GB / 18 GB (and probably even more), and not only when a Windows Defender "full scan" is running. Simply unpacking a large archive (a few million files) gives the same result. It easily goes down if some app (e.g. a dummy memory test app) allocates a lot of RAM. So, there's no leak.

    So, AFAIU, there's nothing wrong. Everything is working as intended.

    Thanks for your help!

    • Marked as answer by i3v, May 14, 2018, 23:06
    May 14, 2018, 23:06