I just did a fresh install of a Windows Server 2008 Datacenter VM (4 vCPUs, Q9450) on an ESX 3.5 server and put it through a stress test with OCCT. Unfortunately, after about 30 minutes I already get the following bluescreen:
"A clock interrupt was not received on a secondary processor"
Not good. I examined the crash dump (see below), and it says: "Probably caused by : ntoskrnl.exe". That's even worse, because it suggests the crash isn't happening in some flaky driver or third-party program.
I then booted the server directly into Windows Server 2008 (no VM), and I could run OCCT without problems for several hours. So it isn't down to instability of my system. It looks more like the VMkernel scheduler can't quite keep up.
This is pretty serious, because a VMware bug of this kind is obviously not something I can fix myself. Still, before I draw conclusions: does anyone here have experience stressing all 4 CPUs inside a VM for a long time (with OCCT/Orthos and the like)? I'd hate to think you can run ESX only as long as you don't make your VMs work too hard.
Thanks in any case.
---------
Microsoft (R) Windows Debugger Version 6.9.0003.113 X86
Copyright (c) Microsoft Corporation. All rights reserved.
Loading Dump File [C:\Windows\Minidump\Mini081508-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available
Symbol search path is: *** Invalid ***
****************************************************************************
* Symbol loading may be unreliable without a symbol search path. *
* Use .symfix to have the debugger choose a symbol path. *
* After setting your symbol path, use .reload to refresh symbol locations. *
****************************************************************************
Executable search path is:
*********************************************************************
* Symbols can not be loaded because symbol path is not initialized. *
* *
* The Symbol Path can be set by: *
* using the _NT_SYMBOL_PATH environment variable. *
* using the -y <symbol_path> argument when starting the debugger. *
* using .sympath and .sympath+ *
*********************************************************************
Unable to load image \SystemRoot\system32\ntoskrnl.exe, Win32 error 0n2
*** WARNING: Unable to verify timestamp for ntoskrnl.exe
*** ERROR: Module load completed but symbols could not be loaded for ntoskrnl.exe
Windows Server 2008 Kernel Version 6001 (Service Pack 1) MP (4 procs) Free x64
Product: LanManNt, suite: TerminalServer DataCenter SingleUserTS
Kernel base = 0xfffff800`01661000 PsLoadedModuleList = 0xfffff800`01826db0
Debug session time: Fri Aug 15 10:15:15.208 2008 (GMT+2)
System Uptime: 0 days 3:18:58.927
*********************************************************************
* Symbols can not be loaded because symbol path is not initialized. *
* *
* The Symbol Path can be set by: *
* using the _NT_SYMBOL_PATH environment variable. *
* using the -y <symbol_path> argument when starting the debugger. *
* using .sympath and .sympath+ *
*********************************************************************
Unable to load image \SystemRoot\system32\ntoskrnl.exe, Win32 error 0n2
*** WARNING: Unable to verify timestamp for ntoskrnl.exe
*** ERROR: Module load completed but symbols could not be loaded for ntoskrnl.exe
Loading Kernel Symbols
.......................................................................................................................
Loading User Symbols
Loading unloaded module list
....
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
Use !analyze -v to get detailed debugging information.
BugCheck 101, {30, 0, fffffa6001963180, 2}
*** WARNING: Unable to verify timestamp for hal.dll
*** ERROR: Module load completed but symbols could not be loaded for hal.dll
***** Kernel symbols are WRONG. Please fix symbols to do analysis.
*********************************************************************
Probably caused by : ntoskrnl.exe
Followup: MachineOwner
---------
0: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
CLOCK_WATCHDOG_TIMEOUT (101)
An expected clock interrupt was not received on a secondary processor in an
MP system within the allocated interval. This indicates that the specified
processor is hung and not processing interrupts.
Arguments:
Arg1: 0000000000000030, Clock interrupt time out interval in nominal clock ticks.
Arg2: 0000000000000000, 0.
Arg3: fffffa6001963180, The PRCB address of the hung processor.
Arg4: 0000000000000002, 0.
Debugging Details:
------------------
*********************************************************************
* Symbols can not be loaded because symbol path is not initialized. *
* *
* The Symbol Path can be set by: *
* using the _NT_SYMBOL_PATH environment variable. *
* using the -y <symbol_path> argument when starting the debugger. *
* using .sympath and .sympath+ *
*********************************************************************
*********************************************************************
* Symbols can not be loaded because symbol path is not initialized. *
* *
* The Symbol Path can be set by: *
* using the _NT_SYMBOL_PATH environment variable. *
* using the -y <symbol_path> argument when starting the debugger. *
* using .sympath and .sympath+ *
*********************************************************************
MODULE_NAME: nt
FAULTING_MODULE: fffff80001661000 nt
DEBUG_FLR_IMAGE_TIMESTAMP: 479192b7
BUGCHECK_STR: CLOCK_WATCHDOG_TIMEOUT
CUSTOMER_CRASH_COUNT: 1
STACK_COMMAND: .bugcheck ; kb
FOLLOWUP_NAME: MachineOwner
IMAGE_NAME: ntoskrnl.exe
BUCKET_ID: WRONG_SYMBOLS
Followup: MachineOwner
---------
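For anyone who wants to pull the numbers out of the bugcheck summary line in the dump above, here is a minimal Python sketch. The helper name `parse_bugcheck` is made up for illustration, and it assumes the values in that line are hexadecimal without a `0x` prefix, which matches the `Arg1`..`Arg4` lines printed by `!analyze -v`. It only decodes the line; it says nothing about the cause of the hang.

```python
import re

# Bugcheck summary line copied from the debugger output above.
LINE = "BugCheck 101, {30, 0, fffffa6001963180, 2}"


def parse_bugcheck(line):
    """Split a WinDbg 'BugCheck' summary line into its code and arguments.

    Hypothetical helper: assumes every value is hexadecimal without a prefix.
    """
    match = re.match(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}", line.strip())
    if match is None:
        raise ValueError("not a BugCheck summary line: %r" % line)
    code = int(match.group(1), 16)
    args = [int(part.strip(), 16) for part in match.group(2).split(",")]
    return code, args


code, args = parse_bugcheck(LINE)
print("bugcheck code: 0x%X" % code)
# Arg1 is described in the dump as the timeout interval in nominal clock ticks.
print("timeout interval: %d nominal clock ticks" % args[0])
# Arg3 is described in the dump as the PRCB address of the hung processor.
print("PRCB of hung processor: 0x%016x" % args[2])
```

So the watchdog fired after 0x30 = 48 nominal clock ticks without a clock interrupt on the processor whose PRCB is at fffffa6001963180.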