Thursday, April 10, 2008

How to read and understand a Kernel Panic screen


Here are some example panic messages and how to read them:


Example 1:
Unresolved kernel trap(cpu 0): 0x300 - Data access DAR=0x0000000030D6334E PC=0x0000000027B5CD3C
Latest crash info for cpu 0:
Exception state (sv=0x27CA4500)
PC=0x27B5CD3C; MSR=0x00009030; DAR=0x30D6334E; DSISR=0x40000000; LR=0x27B5CD24; R1=0x0D80BAE0; XCP=0x0000000C (0x300 - Data access)
Backtrace:
0x27B5E6C4 0x27B5D82C 0x27B5607C 0x27B45C74 0x002E9A80 0x002EB94C
0x0008C248 0x00029234 0x000233F8 0x000ABEAC 0x8001016C

Kernel loadable modules in backtrace (with dependencies):
com.apple.GeForce(4.1.8)@0x27b3a000
dependency: com.apple.iokit.IOPCIFamily(1.7)@0x1d8f7000
dependency: com.apple.iokit.IOGraphicsFamily(1.4.2)@0x27867000
dependency: com.apple.iokit.IONDRVSupport(1.4.2)@0x2788b000
dependency: com.apple.NVDAResman(4.1.8)@0x278a1000

Proceeding back via exception chain:
Exception state (sv=0x27CA4500)
previously dumped as "Latest" state. skipping...
Exception state (sv=0x1D92D280)
PC=0x9000B348; MSR=0x0200F030; DAR=0x02A8A000; DSISR=0x42000000; LR=0x9000B29C; R1=0xBFFFE900; XCP=0x00000030 (0xC00 - System call)



Kernel version:
Darwin Kernel Version 8.11.0: Wed Oct 10 18:26:00 PDT 2007; root:xnu-792.24.17~1/RELEASE_PPC

panic(cpu 0 caller 0xFFFF0003): 0x300 - Data access
Latest stack backtrace for cpu 0:
Backtrace:
0x000954F8 0x00095A10 0x00026898 0x000A8204 0x000ABB80
Proceeding back via exception chain:
Exception state (sv=0x27CA4500)
PC=0x27B5CD3C; MSR=0x00009030; DAR=0x30D6334E; DSISR=0x40000000; LR=0x27B5CD24; R1=0x0D80BAE0; XCP=0x0000000C (0x300 - Data access)
Backtrace:
0x27B5E6C4 0x27B5D82C 0x27B5607C 0x27B45C74 0x002E9A80 0x002EB94C
0x0008C248 0x00029234 0x000233F8 0x000ABEAC 0x8001016C
Kernel loadable modules in backtrace (with dependencies):
com.apple.GeForce(4.1.8)@0x27b3a000
dependency: com.apple.iokit.IOPCIFamily(1.7)@0x1d8f7000
dependency: com.apple.iokit.IOGraphicsFamily(1.4.2)@0x27867000
dependency: com.apple.iokit.IONDRVSupport(1.4.2)@0x2788b000
dependency: com.apple.NVDAResman(4.1.8)@0x278a1000
Exception state (sv=0x1D92D280)
PC=0x9000B348; MSR=0x0200F030; DAR=0x02A8A000; DSISR=0x42000000; LR=0x9000B29C; R1=0xBFFFE900; XCP=0x00000030 (0xC00 - System call)
*********


OK, what happened here is that the OS hit a problem it could not recover from and stopped working. The first thing it does is look for a debugger to hand control to (since this is not a Mac OS X developer station, it will not find one). Next, it dumps all the data it can about the incident to the screen, so that you or a qualified technician can work out what the problem is.

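The first part (the blue part, from "Latest crash info for cpu 0" through the "Exception state" lines) shows the CPU registers and exception state for CPU number 0 (your first CPU) at the time of the crash. The values are hexadecimal register contents and memory addresses: PC is the address of the instruction that was executing, and DAR is the address it tried to access. This will not do you much good unless you are a very advanced user.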
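If you do want to look at those register values, it is easier once they are pulled apart. Here is a minimal Python sketch that splits one "Exception state" line from Example 1 into its registers. The function name `parse_exception_state` is my own, and the regular expressions only assume the `NAME=0xHEX;` layout seen in the logs in this post.

```python
import re

# One "Exception state" line copied from Example 1.
LINE = ("PC=0x27B5CD3C; MSR=0x00009030; DAR=0x30D6334E; DSISR=0x40000000; "
        "LR=0x27B5CD24; R1=0x0D80BAE0; XCP=0x0000000C (0x300 - Data access)")


def parse_exception_state(line):
    """Return ({register name: value}, exception description) for one line.

    Hypothetical helper: it only assumes the NAME=0xHEX layout shown above.
    """
    # Collect every NAME=0xHEX pair, e.g. "PC=0x27B5CD3C".
    registers = {name: int(value, 16)
                 for name, value in re.findall(r"(\w+)=0x([0-9A-Fa-f]+)", line)}
    # The trailing "(0x300 - Data access)" names the exception type.
    match = re.search(r"\((0x[0-9A-Fa-f]+) - ([^)]+)\)", line)
    description = match.group(2).strip() if match else None
    return registers, description


registers, description = parse_exception_state(LINE)
print(description)              # the kind of exception
print(hex(registers["PC"]))     # instruction that was executing
print(hex(registers["DAR"]))    # data address it tried to access
```

For this line the description comes out as "Data access", which matches the first line of the panic: the instruction at PC tried to touch the address in DAR and was not allowed to.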

The next part, Backtrace (the green part), is also a list of hex memory addresses. They trace the chain of function calls that led up to the crash, most recent first. Again, on their own they will not do us much good.

The next part (bold black), "Kernel loadable modules in backtrace", is still about the backtrace (what happened before the crash), but it tells us which modules (usually kexts) own the code that shows up there. This part can usually tell us a lot, since this chain of calls ended with the OS crashing. In this section we see the module involved (in this example, GeForce) and the modules it depends on. The dependencies had to be loaded before it, so they probably did not cause the crash (it is still possible that they are at fault, but they rarely are).
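To see why the module list points at GeForce, you can compare the backtrace addresses with the load address after each @ sign. The sketch below does that for Example 1. The log does not say how big each kext is, so `guess_owner` is only a heuristic of mine: it picks the listed kext with the highest load address at or below the backtrace address, and the 16 MB cutoff in `MAX_KEXT_SPAN` is an assumption, not something taken from the log.

```python
from collections import Counter

# Backtrace addresses and kext load addresses copied from Example 1.
BACKTRACE = [0x27B5E6C4, 0x27B5D82C, 0x27B5607C, 0x27B45C74, 0x002E9A80,
             0x002EB94C, 0x0008C248, 0x00029234, 0x000233F8, 0x000ABEAC,
             0x8001016C]
KEXTS = {
    "com.apple.GeForce": 0x27B3A000,
    "com.apple.iokit.IOPCIFamily": 0x1D8F7000,
    "com.apple.iokit.IOGraphicsFamily": 0x27867000,
    "com.apple.iokit.IONDRVSupport": 0x2788B000,
    "com.apple.NVDAResman": 0x278A1000,
}

# Assumption: no kext spans more than 16 MB (the log gives no sizes).
MAX_KEXT_SPAN = 0x1000000


def guess_owner(address, kexts):
    """Guess which listed kext contains `address`, or None if none fits.

    Heuristic: the kext with the highest load address that is still at or
    below `address` and within MAX_KEXT_SPAN of it.
    """
    candidates = [(base, name) for name, base in kexts.items()
                  if base <= address < base + MAX_KEXT_SPAN]
    return max(candidates)[1] if candidates else None


owners = Counter(guess_owner(address, KEXTS) for address in BACKTRACE)
for name, count in owners.most_common():
    print(name or "not in a listed kext", count)
```

With the numbers from Example 1, the first four backtrace addresses land in com.apple.GeForce and the other seven are not in any listed kext (most likely the kernel itself), so the most recent calls before the crash were inside the GeForce driver.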

The next part (the red part) is the exception chain. Again it gives the data as hex memory addresses and CPU registers, and it will not help us.

The last interesting part is the kernel version (the orange part). It states which kernel you are using: name, version, build date and platform. In this example:
Darwin
8.11.0
Oct 10, 2007
PPC (PowerPC)
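If you collect several panic logs, the kernel version line is easy to pull apart automatically. This is a small Python sketch. The pattern is my own and only assumes the layout of the two "Darwin Kernel Version" lines that appear in this post, so other kernel builds may need a different pattern.

```python
import re

# Kernel version lines copied from Example 1 and Example 3.
LINE_1 = ("Darwin Kernel Version 8.11.0: Wed Oct 10 18:26:00 PDT 2007; "
          "root:xnu-792.24.17~1/RELEASE_PPC")
LINE_3 = ("Darwin Kernel Version 8.8.0: Fri Sep 8 17:18:57 PDT 2006; "
          "root:xnu-792.12.6.obj~1/RELEASE_PPC")

# Assumed layout: "Darwin Kernel Version X: <build date>; root:xnu-<build>/RELEASE_<arch>"
PATTERN = re.compile(
    r"Darwin Kernel Version (?P<version>[\d.]+): "
    r"(?P<built>[^;]+); "
    r"root:xnu-(?P<xnu>[^/]+)/RELEASE_(?P<arch>\w+)"
)


def parse_kernel_version(line):
    """Return a dict with version, built, xnu and arch, or None if no match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


for line in (LINE_1, LINE_3):
    print(parse_kernel_version(line))
```

The `arch` field is the quickest way to tell a PowerPC log like Example 1 from an Intel one.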

The rest is the same data from the caller's point of view. You might find different data there (I never have).

Conclusion 1: the problem is with the GeForce kext.

Example 2:
panic(cpu 0 caller 0x001A429B): Unresolved kernel trap (CPU 0, Type 14=page fault), registers:
CR0: 0x8001003b, CR2: 0x00000024, CR3: 0x00d7b000, CR4: 0x000006e0
EAX: 0x00000000, EBX: 0x02bacc00, ECX: 0x025dc9a4, EDX: 0x00000000
CR2: 0x00000024, EBP: 0x14053ef8, ESI: 0x00841684, EDI: 0x0083df64
EFL: 0x00010206, EIP: 0x003bd3b3, CS: 0x00000008, DS: 0x14050010

Backtrace, Format - Frame : Return Address (4 potential args on stack) 
0x14053d38 : 0x128d08 (0x3cb134 0x14053d5c 0x131de5 0x0) 
0x14053d78 : 0x1a429b (0x3d0e4c 0x0 0xe 0x3d0670) 
0x14053e88 : 0x19ada4 (0x14053e98 0x14053ea8 0xe 0x48) 
0x14053ef8 : 0x83df81 (0x2bacc00 0x841684 0x14053f28 0x38073e) 
0x14053f28 : 0x39c536 (0x2bacc00 0x28b9880 0x8 0x2) 
0x14053f78 : 0x13d7d9 (0x28b9880 0x2686021 0x0 0xbffff378) 
0x14053fc8 : 0x19ac1c (0x0 0x0 0x4 0x207) 
Backtrace terminated-invalid frame pointer 0x0
Kernel loadable modules in backtrace (with dependencies):
com.apple.driver.IOBluetoothHIDDriver(1.7.2b2)@0x837000
dependency: com.apple.iokit.IOBluetoothFamily(1.7.14f14)@0x6be000
dependency: com.apple.iokit.IOHIDFamily(1.4.10)@0x531000

Conclusion 2: we probably have a problem with the Bluetooth kext.

So far it has been easy and straightforward. The next example doesn't have the loaded modules part.
Example 3:
Unresolved kernel trap(cpu 0): 0x300 - Data access DAR=0x0000000000000010 PC=0x00000000000819E8
Latest crash info for cpu 0:
Exception state (sv=0x3D849280)
PC=0x000819E8; MSR=0x00009030; DAR=0x00000010; DSISR=0x40000000; LR=0x000819CC; R1=0x2720BB00; XCP=0x0000000C (0x300 - Data access)
Backtrace:
0x00032AC8 0x000823DC 0x00075F58 0x00075918 0x0006B45C 0x0006B730 
0x000578A0 0x0002921C 0x000233F8 0x000ABAAC 0x414C5945 
Proceeding back via exception chain:
Exception state (sv=0x3D849280)
previously dumped as "Latest" state. skipping...
Exception state (sv=0x42AF9280)
PC=0x9000AB48; MSR=0x0000F030; DAR=0x011DB004; DSISR=0x42000000; LR=0x9000AA9C; R1=0xF0101080; XCP=0x00000030 (0xC00 - System call)

Kernel version:
Darwin Kernel Version 8.8.0: Fri Sep 8 17:18:57 PDT 2006; root:xnu-792.12.6.obj~1/RELEASE_PPC
panic(cpu 0 caller 0xFFFF0003): 0x300 - Data access
Latest stack backtrace for cpu 0:
Backtrace:
0x00095138 0x00095650 0x00026898 0x000A7E04 0x000AB780 
Proceeding back via exception chain:
Exception state (sv=0x3D849280)
PC=0x000819E8; MSR=0x00009030; DAR=0x00000010; DSISR=0x40000000; LR=0x000819CC; R1=0x2720BB00; XCP=0x0000000C (0x300 - Data access)
Backtrace:
0x00032AC8 0x000823DC 0x00075F58 0x00075918 0x0006B45C 0x0006B730 
0x000578A0 0x0002921C 0x000233F8 0x000ABAAC 0x414C5945 
Exception state (sv=0x42AF9280)
PC=0x9000AB48; MSR=0x0000F030; DAR=0x011DB004; DSISR=0x42000000; LR=0x9000AA9C; R1=0xF0101080; XCP=0x00000030 (0xC00 - System call)

Kernel version:
Darwin Kernel Version 8.8.0: Fri Sep 8 17:18:57 PDT 2006; root:xnu-792.12.6.obj~1/RELEASE_PPC

*********


Conclusion 3: this can happen for several reasons:
1. The first option is a random memory access error, meaning that memory was accessed in an area that wasn't expected or allowed. Maybe some piece of software wrote into memory that was not its own space and by that caused the OS/kernel to crash. Check which applications/utilities/kexts/bundles/plugins/login items you have installed recently, and remove them or disable their launch for a while until you can sort it out.
2. The second option is usually the case for real Macs, but is also possible on a PC: a plain hardware problem, such as a bad memory module. It may misbehave only when it is cold (immediately after booting the machine) or when it is hot (after several hours of work, although on a sunny day a few minutes can be enough). Any other hardware that accesses memory asynchronously, like I/O devices (Bluetooth card, modem, Wi-Fi, network card, etc.), can also be the culprit.
3. Another option is that your combination of kexts and bundles does not work together (maybe some of them are older versions than they should be?). For such a case I always keep a spare bootable/loadable System/Library/Extensions folder on the disk.
4. This option is rarely the case, but it can happen: if the main boot partition doesn't have enough free space, that could cause the problem. All you need to do in this case is boot in safe mode and free some space.
5. Another simple option is a kext/application/kernel trying to access a file that it doesn't have permission to access. This can be caused by a wrong Unix file mode. In this case, boot into single-user mode and fix the permissions.
6. The last option (that I can think of) is a bad kernel. Since the kernel itself is misbehaving, no kexts are loaded yet, because the kernel hasn't finished loading its core. This is why I keep a spare bootable/loadable copy of the kernel on the disk, so I can boot from it on a rainy kernel-problem day.
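A quick way to decide whether a log is the easy kind (Examples 1 and 2) or the harder kind (Example 3) is to check whether it has a "Kernel loadable modules in backtrace" section at all. Here is a rough Python sketch of that check. The helper name `kexts_in_backtrace` and the shortened sample texts are mine, and the pattern only assumes the `name(version)@0xaddress` layout shown in this post.

```python
import re

# Shortened samples in the shape of Example 2 and Example 3.
EXAMPLE_2 = """\
Backtrace terminated-invalid frame pointer 0x0
Kernel loadable modules in backtrace (with dependencies):
com.apple.driver.IOBluetoothHIDDriver(1.7.2b2)@0x837000
dependency: com.apple.iokit.IOBluetoothFamily(1.7.14f14)@0x6be000
dependency: com.apple.iokit.IOHIDFamily(1.4.10)@0x531000
"""
EXAMPLE_3 = """\
Backtrace:
0x00032AC8 0x000823DC 0x00075F58 0x00075918 0x0006B45C 0x0006B730
Proceeding back via exception chain:
"""


def kexts_in_backtrace(panic_text):
    """Return the kexts named in the modules section, without dependencies.

    Hypothetical helper: an empty list means the log has no such section.
    """
    names = []
    in_section = False
    for raw_line in panic_text.splitlines():
        line = raw_line.strip()
        if line.startswith("Kernel loadable modules in backtrace"):
            in_section = True
            continue
        if in_section:
            # Matches "name(version)@0x..." with an optional "dependency:" prefix.
            match = re.match(r"(dependency:\s*)?([\w.]+)\(([^)]*)\)@0x", line)
            if match is None:
                in_section = False  # the section ended
            elif match.group(1) is None and match.group(2) not in names:
                names.append(match.group(2))
    return names


print(kexts_in_backtrace(EXAMPLE_2))
print(kexts_in_backtrace(EXAMPLE_3))
```

If the list comes back empty, as it does for the Example 3 sample, no kext is named in the backtrace and you are left with the reasons listed above.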

Here are some articles, along the lines of "10 things to do", on getting rid of the panic:


Enjoy (from a Panic? how can you?).
