Please help solve ntoskrnl high CPU usage

Discussion in 'Windows & Other Software' started by Splurgeworthy, 31 Jan 2018.

  1. Splurgeworthy

    Associate

    Joined: 31 Jan 2018

    Posts: 7

    First post, and desperately looking for a way of solving this issue, so I hope you don't mind me asking here, where wise users may know obscure solutions!

    Right, here are the basics: my system recently started becoming unusable. At first I thought I'd found the cause: in Windows 10, SleepStudy was endlessly writing itself to disk and locking up my system.

    I've contacted Microsoft about that, and they're aware of it, but have no solution. I have to keep setting the file to Read Only every time and restart to make it stick.

    But I was still noticing problems with my machine over time. Only now it's CPU usage, with CPU 0 in particular being hammered to 100% constantly. It doesn't happen every time; once or twice I've started the machine and it has run clean, but it's there roughly 9 times out of 10. I've googled around and found instructions for narrowing down the cause, and as far as I can see, it's ntoskrnl and the Service Host Diagnostic Policy reporting some sort of error endlessly.

    What I can't tell is what is causing that. And here's where it gets frustrating. This is what I've done since:
    • Updated every driver I can find
    • Rolled Windows back to a repair
    • Done a clean reinstall of Windows
    • Done a Virus and Rootkit scan on the clean install, nothing.
    • Disabled everything except core parts of Windows under Device Manager, problem persists.
    • Checked connections, looked for damaged parts on the motherboard
    • Ran RAM check, no issues
    • Ran drive checks, no issues
    • Watched temperature sensors, machine is running cool
    • Checked voltages; I'm not an expert, but HWiNFO64 seems to show they're normal.
    Both the SleepStudy issue, and the CPU one persist even with that fresh Windows install. I'm leaning towards it being a hardware fault, but I have no idea how to go further from here.

    Thinking about it, it could also be that, with SleepStudy set to Read Only, it's endlessly reporting errors from its attempts to write to the file... Hnnngh.

    I've run a High CPU trace, which can be downloaded here, but I've no idea how to read it deeper to locate where that CPU use is coming from. Can anyone help, please?
     
  2. Orcvader

    Capodecina

    Joined: 11 Oct 2009

    Posts: 14,332

    Location: Greater London

    What are your full system specs?

    The fact that you did a clean reinstall should rule out software being the cause, so maybe a driver for something is causing the issue to kick up.
     
  3. Splurgeworthy

    Associate

    Joined: 31 Jan 2018

    Posts: 7

    Processor: (CPU) Intel® Core™i5 Quad Core Processor i5-4690 (3.5GHz) 6MB Cache
    Motherboard: ASUS® Z97-P: ATX, LGA1150, USB 3.0, SATA 6Gb/s
    Memory (RAM): 8GB HyperX BEAST DUAL-DDR3 2133MHz X.M.P (2 x 4GB)
    Graphics Card: 4GB NVIDIA GEFORCE GTX 970 - DVI, HDMI, mDP - 3D Vision Ready
    1st Hard Disk: 500GB Samsung 850 EVO SSD, SATA 6Gb/s (upto 540MB/sR | 520MB/sW)
    Power Supply: CORSAIR 550W VS SERIES™ VS-550 POWER SUPPLY
    OS: Windows 10 Pro Build 16299
     
  4. jpaul

    Capodecina

    Joined: 1 Mar 2010

    Posts: 11,992

    If you have the HTML report rather than the .etl, folks may be able to help more easily, e.g.

    tracerpt test.etl -report out.html -f html

    Did you create the report whilst the problem was occurring?
     
  5. Splurgeworthy

    Associate

    Joined: 31 Jan 2018

    Posts: 7

    That command opens the .etl, but then fails with an "Unspecified Error" when attempting to write the output. I've not yet found another way to convert the report to HTML that works.

    Edit: I found I could get an HTML report from Performance Monitor. Download here.

    I can see System Interrupts is probably the issue again, but not what is causing it. Any ideas, folks?
     
    Last edited: 31 Jan 2018
  6. jpaul

    Capodecina

    Joined: 1 Mar 2010

    Posts: 11,992

    The sample looks as though it covered just 60 s during which nothing much was happening? Only ~20% of the CPU was being used.

    when the problem is happening, maybe try the
    tracelog -start -f test.etl -dpcisr -UsePerfCounter -b 64
    from the link I gave.

    [incidentally I would never use mediafire - you have to enable scripts on the site before it will give you a download and it pops up an advertising tab]
     
  7. Splurgeworthy

    Associate

    Joined: 31 Jan 2018

    Posts: 7

    System, Diagnosis Service and Policy are all taking 20% each, which is the issue. I've got a flat usage of 60% at least constantly, which means I often struggle to run anything else. It should be visible in that trace.

    I've tried installing Windows Driver Kit to run that trace program, but it just closes with an error that I've got the wrong version. The download page however is Windows 10, v1709. Are the tools out of date now?
     
  8. jpaul

    Capodecina

    Joined: 1 Mar 2010

    Posts: 11,992

    I don't have 1709, so I have not tried to install it.

    However, for the DPC traces you can use LatencyMon or the DPC latency tool instead, and they will also identify a bad driver sucking up CPU resources.
     
  9. Splurgeworthy

    Associate

    Joined: 31 Jan 2018

    Posts: 7

    LatencyMon indicates it's the Microsoft Storage Port Driver (storport.sys), and it's hammering CPU 0, as I had already noticed. Does anyone know how to fix this? The driver for it is up to date, but it may be conflicting with something else.

    _________________________________________________________________________________________________________
    CONCLUSION
    _________________________________________________________________________________________________________
    Your system appears to be suitable for handling real-time audio and other tasks without dropouts.
    LatencyMon has been analyzing your system for 0:05:07 (h:mm:ss) on all processors.


    _________________________________________________________________________________________________________
    SYSTEM INFORMATION
    _________________________________________________________________________________________________________
    Computer name: DESKTOP-1DO8HSE
    OS version: Windows 10 , 10.0, build: 16299 (x64)
    Hardware: All Series, ASUS, ASUSTeK COMPUTER INC., Z97-P
    CPU: GenuineIntel Intel(R) Core(TM) i5-4690 CPU @ 3.50GHz
    Logical processors: 4
    Processor groups: 1
    RAM: 8133 MB total


    _________________________________________________________________________________________________________
    CPU SPEED
    _________________________________________________________________________________________________________
    Reported CPU speed: 3498 MHz
    Measured CPU speed: 1 MHz (approx.)

    Note: reported execution times may be calculated based on a fixed reported CPU speed. Disable variable speed settings like Intel Speed Step and AMD Cool N Quiet in the BIOS setup for more accurate results.

    WARNING: the CPU speed that was measured is only a fraction of the CPU speed reported. Your CPUs may be throttled back due to variable speed settings and thermal issues. It is suggested that you run a utility which reports your actual CPU frequency and temperature.



    _________________________________________________________________________________________________________
    MEASURED INTERRUPT TO USER PROCESS LATENCIES
    _________________________________________________________________________________________________________
    The interrupt to process latency reflects the measured interval that a usermode process needed to respond to a hardware request from the moment the interrupt service routine started execution. This includes the scheduling and execution of a DPC routine, the signaling of an event and the waking up of a usermode thread from an idle wait state in response to that event.

    Highest measured interrupt to process latency (µs): 231.557513
    Average measured interrupt to process latency (µs): 4.150377

    Highest measured interrupt to DPC latency (µs): 229.508331
    Average measured interrupt to DPC latency (µs): 2.301960


    _________________________________________________________________________________________________________
    REPORTED ISRs
    _________________________________________________________________________________________________________
    Interrupt service routines are routines installed by the OS and device drivers that execute in response to a hardware interrupt signal.

    Highest ISR routine execution time (µs): 153.046598
    Driver with highest ISR routine execution time: ACPI.sys - ACPI Driver for NT, Microsoft Corporation

    Highest reported total ISR routine time (%): 5.372115
    Driver with highest ISR total time: ACPI.sys - ACPI Driver for NT, Microsoft Corporation

    Total time spent in ISRs (%) 5.459783

    ISR count (execution time <250 µs): 7283653
    ISR count (execution time 250-500 µs): 0
    ISR count (execution time 500-999 µs): 0
    ISR count (execution time 1000-1999 µs): 0
    ISR count (execution time 2000-3999 µs): 0
    ISR count (execution time >=4000 µs): 0


    _________________________________________________________________________________________________________
    REPORTED DPCs
    _________________________________________________________________________________________________________
    DPC routines are part of the interrupt servicing dispatch mechanism and disable the possibility for a process to utilize the CPU while it is interrupted until the DPC has finished execution.

    Highest DPC routine execution time (µs): 465.799886
    Driver with highest DPC routine execution time: storport.sys - Microsoft Storage Port Driver, Microsoft Corporation

    Highest reported total DPC routine time (%): 8.113133
    Driver with highest DPC total execution time: ACPI.sys - ACPI Driver for NT, Microsoft Corporation

    Total time spent in DPCs (%) 12.632027

    DPC count (execution time <250 µs): 22806528
    DPC count (execution time 250-500 µs): 0
    DPC count (execution time 500-999 µs): 23
    DPC count (execution time 1000-1999 µs): 0
    DPC count (execution time 2000-3999 µs): 0
    DPC count (execution time >=4000 µs): 0


    _________________________________________________________________________________________________________
    REPORTED HARD PAGEFAULTS
    _________________________________________________________________________________________________________
    Hard pagefaults are events that get triggered by making use of virtual memory that is not resident in RAM but backed by a memory mapped file on disk. The process of resolving the hard pagefault requires reading in the memory from disk while the process is interrupted and blocked from execution.


    Process with highest pagefault count: none

    Total number of hard pagefaults 0
    Hard pagefault count of hardest hit process: 0
    Highest hard pagefault resolution time (µs): 0.0
    Total time spent in hard pagefaults (%): 0.0
    Number of processes hit: 0


    _________________________________________________________________________________________________________
    PER CPU DATA
    _________________________________________________________________________________________________________
    CPU 0 Interrupt cycle time (s): 178.927373
    CPU 0 ISR highest execution time (µs): 153.046598
    CPU 0 ISR total execution time (s): 65.982936
    CPU 0 ISR count: 7239113
    CPU 0 DPC highest execution time (µs): 403.420240
    CPU 0 DPC total execution time (s): 115.877141
    CPU 0 DPC count: 16392392
    _________________________________________________________________________________________________________
    CPU 1 Interrupt cycle time (s): 19.783292
    CPU 1 ISR highest execution time (µs): 142.811035
    CPU 1 ISR total execution time (s): 1.054278
    CPU 1 ISR count: 34666
    CPU 1 DPC highest execution time (µs): 465.799886
    CPU 1 DPC total execution time (s): 12.090988
    CPU 1 DPC count: 1921672
    _________________________________________________________________________________________________________
    CPU 2 Interrupt cycle time (s): 18.853544
    CPU 2 ISR highest execution time (µs): 53.887078
    CPU 2 ISR total execution time (s): 0.009949
    CPU 2 ISR count: 4665
    CPU 2 DPC highest execution time (µs): 456.897656
    CPU 2 DPC total execution time (s): 12.107321
    CPU 2 DPC count: 1980390
    _________________________________________________________________________________________________________
    CPU 3 Interrupt cycle time (s): 21.396493
    CPU 3 ISR highest execution time (µs): 9.337907
    CPU 3 ISR total execution time (s): 0.009235
    CPU 3 ISR count: 5209
    CPU 3 DPC highest execution time (µs): 449.919097
    CPU 3 DPC total execution time (s): 15.069596
    CPU 3 DPC count: 2512097
    _________________________________________________________________________________________________________
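
A rough way to read the per-CPU figures above is to turn the ISR and DPC totals into a share of each core's time. This is only a sketch: it assumes the 0:05:07 run length (307 s) from the report is the right denominator, and the names `per_cpu` and `interrupt_share` are made up for illustration.

```python
# Figures copied from the PER CPU DATA section of the LatencyMon report.
RUN_SECONDS = 5 * 60 + 7  # report duration 0:05:07 (assumed denominator)

per_cpu = {
    0: {"isr_s": 65.982936, "dpc_s": 115.877141},
    1: {"isr_s": 1.054278, "dpc_s": 12.090988},
    2: {"isr_s": 0.009949, "dpc_s": 12.107321},
    3: {"isr_s": 0.009235, "dpc_s": 15.069596},
}


def interrupt_share(stats, run_seconds=RUN_SECONDS):
    """Return ISR + DPC time as a percentage of one core's wall-clock time."""
    return 100.0 * (stats["isr_s"] + stats["dpc_s"]) / run_seconds


shares = {cpu: interrupt_share(stats) for cpu, stats in per_cpu.items()}
for cpu, pct in shares.items():
    print(f"CPU {cpu}: {pct:.1f}% of the run spent in ISRs and DPCs")
```

On these numbers CPU 0 comes out at roughly 59%, against about 4-5% for each of the other cores, which lines up with the flat ~60% load described earlier in the thread.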
     
    Last edited: 1 Feb 2018
  10. Splurgeworthy

    Associate

    Joined: 31 Jan 2018

    Posts: 7

    Damn it, I thought I had it. Via Samsung Magician I found a problem where updating Intel Rapid Storage Technology via Windows actually installs an older version of the file. I had to manually download the drivers and point to the folder to get the correct version.

    However the issue is still there, and now LatencyMon is just reporting it as ACPI.sys again instead. So I'm back where I started. Arrrghhhh! Any ideas anyone?
     
  11. jpaul

    Capodecina

    Joined: 1 Mar 2010

    Posts: 11,992

    That's a coincidence - I used to have that until I opted for glitch-free audio, see

    and others

    I installed it when I went to acpi ... but it was not needed, as I do not do RAID or striping.
     
  12. Splurgeworthy

    Associate

    Joined: 31 Jan 2018

    Posts: 7

    Samsung Magician needs it for certain processes, or so it claims...

    OK, I've finally got the answer, but it's FANTASTICALLY obscure and insane. Someone in the comments says they put their machine to sleep to get rid of it, an expert says to clear the Power settings via the command line and... it worked. I've restarted 5 times now and I'm back down to 1-5% CPU instead of 75%. Even though I've done entire clean installs over the last few days, something, god knows what, was carrying a broken power setting from install to install. I had tried resetting the Power modes via Windows previously, and that hadn't worked. It was only once I chose sleep, then cleared them via cmd, that it stuck.

    But I have no idea why. UGH so much stress, but freeeee now! Hopefully the solution will help others!
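
The post does not say which command was used to clear the power settings. One command that fits the description is `powercfg -restoredefaultschemes`, which restores the default Windows power schemes. Treat that choice as an assumption rather than the confirmed fix. A minimal sketch that only prints the command unless explicitly told to run it (the name `reset_power_schemes` is hypothetical):

```python
import platform
import subprocess

# Assumption: the thread never names the command; this is one candidate.
RESET_COMMAND = ["powercfg", "-restoredefaultschemes"]


def reset_power_schemes(dry_run=True):
    """Return the command; execute it only on Windows when dry_run is False."""
    if not dry_run and platform.system() == "Windows":
        # Typically run from an elevated (administrator) prompt.
        subprocess.run(RESET_COMMAND, check=True)
    return RESET_COMMAND


# Dry run by default: shows what would be executed without changing anything.
print(" ".join(reset_power_schemes()))
```

Going by the description above, the order mattered: put the machine to sleep first, then clear the settings from the command line, then restart and check CPU usage again.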
     
  13. jpaul

    Capodecina

    Joined: 1 Mar 2010

    Posts: 11,992

    Not sure about your Samsung Magician comment (I use an Evo SSD), but the link you posted says sleeping is a workaround and also suggests getting rid of IRST (though the whole article is a bit confused).