Anyone who has been using Sun hardware and the Solaris operating system has no doubt felt the effects of Sun Burn® at some point in his or her career. Most recently, we've been seeing more and more "E-cache parity errors" as a result of the lack of ECC on the processor cache. Sun claims this is a result of "naturally occurring ionizing radiation" and "bad computer room environments" while at the same time blaming the manufacturer of the cache for not keeping up with the speed specs required by Sun.
I have come to believe that neutrinos, which commonly zip through the earth without touching a single atom, are attracted to Sun processors. The ionizing radiation they leave in their paths flips a bit or two in the processor cache as it zings through the server, causing it to crash (remember, there's no Error Correcting Code on a Sun processor cache, so a single bit flip causes a parity error that can be detected but not corrected). Cosmic humor is at its highest when Sun Microsystems admits to engineering without ECC because they preferred speed over precision, then turns around and blames those hurt most by that lack of thought by saying their computer rooms are "bad environments" in which to run Sun equipment. It has been suggested by a close circle of computing professionals that the best environment in which to run a Sun server is encased in plastic and immersed in a bath of mercury to keep ionizing radiation from causing undesirable and unexpected e-cache parity errors. Not enough research has been done on room-sized neutrino shields yet to justify the health risk of the mercury. Ironically, the sun is the source of most naturally occurring ionizing radiation in our solar system.
In Sun's defense, they have created a software patch ("cache scrubber") in an attempt to correct their hardware problem. If anyone is having these problems, I'd recommend patching to the latest MU or at least the recommended patch bundle; it does help! The kicker? The "ecache parity error" in crash dumps and logs has been made cryptic, so you have to submit them to Sun for decryption. The second kicker? It takes more CPU cycles to scrub the cache and recache data until the next scrub, thus nullifying the speed vs. precision justification.
Instead of fixing the lack of ECC on the existing UltraSPARC II and re-engineering the new UltraSPARC III processors, Sun has decided to mirror the cache to lessen the probability that a bit will be flipped in both sets of cache (I wonder what will happen if just the right number of neutrinos flip enough bits to make parity come out correct, yet leave the bits in cache A not matching cache B). And what happens when the mirror doesn't match the prime? Wouldn't it be less costly to engineer the ECC instead of doubling the expensive cache memory? You'd think so, but I suppose "redundant processor cache" looks better than "industry-standard ECC" on the nice, slick, glossy Sun propaganda and to investors.
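To make the parity question above concrete, here is a toy sketch in Python. It is an assumption-laden illustration, not Sun's actual cache logic: it uses one even-parity bit per byte, and the helper names (`parity`, `flip`, `parity_ok`, `check_mirrored`) are made up for this example. It shows that a single flipped bit is detected but cannot be repaired, that two flipped bits in the same byte pass the parity check, and that a mirrored copy can then only report a mismatch without knowing which side is right.

```python
def parity(byte: int) -> int:
    """Even-parity bit for an 8-bit value: 1 if the number of set bits is odd."""
    return bin(byte & 0xFF).count("1") % 2


def flip(byte: int, *bits: int) -> int:
    """Return byte with the given bit positions inverted (simulated hits)."""
    for b in bits:
        byte ^= 1 << b
    return byte & 0xFF


def parity_ok(byte: int, stored_parity: int) -> bool:
    """True if the byte still agrees with the parity bit stored alongside it."""
    return parity(byte) == stored_parity


def check_mirrored(copy_a, copy_b) -> str:
    """Classify a read from two mirrored copies, each given as (byte, stored_parity)."""
    a_ok = parity_ok(*copy_a)
    b_ok = parity_ok(*copy_b)
    if a_ok and b_ok:
        if copy_a[0] == copy_b[0]:
            return "ok"
        return "mismatch, cannot tell which copy is right"
    if a_ok:
        return "use copy A"
    if b_ok:
        return "use copy B"
    return "both copies bad"


original = 0b10110010
p = parity(original)

one_flip = flip(original, 3)       # one hit
two_flips = flip(original, 3, 5)   # two hits in the same byte

print(parity_ok(one_flip, p))      # False: detected, but parity cannot say which bit
print(parity_ok(two_flips, p))     # True: the corruption goes unnoticed
print(check_mirrored((one_flip, p), (original, p)))
print(check_mirrored((two_flips, p), (original, p)))
```

ECC avoids the ambiguity in the last case because the extra check bits identify which bit flipped, so a single-bit error can be corrected in place rather than merely noticed.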
Update!! In the new Sun Fire "midframe" servers running the UltraSPARC III processors, they've finally begun to put ECC on the cache! Woo hoo! When that was announced in a meeting I attended with Sun for their "dog and pony" show of the new Sun Fire line, they made it a very specific point of their presentation. Of the nine or so people in the room, four of us shouted "yaaaaaay!" simultaneously.
I can't remember all the instances that have incremented the neutrino score, but a bunch of them came from our Veritas cluster both before and after processor replacements. The rest came from various other machines I help maintain. I've also started to count reported crashes from trusted sources.
Crash Log (the first 10 were not logged - from then on, here are the accounts):
August 15, 2001 - one of my E420's was hit with it twice this morning, resulting in the new and improved error message. The reason neutrinos got two points is that Sun claims the cache scrubber will accommodate a single hit without crashing the server (I still haven't figured out the logic in that statement). If it hits twice, however, be prepared to pick through crash dumps for an explanation! It also looks bad if Sun gets too far behind... therefore, Sun gets one point because the cache scrubber may have saved it from the first hit.
November 20, 2001 - one of my E450's took a couple hits and died (Neutrinos +2, Sun +1). Sun opted to replace both processors in the box.
April 16, 2002 - I accidentally deleted several scores and didn't have a recent backup of the board, so I lost a couple hits. No matter - this will make up for it! One of our Netra T1 AC200's went unchecked in its thrashing as it was in a pool of "toasters" each serving the same function in a round-robin DNS fashion, but we finally decided to check it out. Between March 16, 2002 and this morning, the system has taken 59 self-reboots due to "Uncorrectable Memory Errors on CPU0." I will spare you the details but as is traditional, I will update neutrinos by 59x2 and Sun by 59.
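For anyone keeping score at home, the tally rule used in these entries (two points to neutrinos and one to Sun per crash) works out as follows. This is only a sketch of the arithmetic; `score_event` is a hypothetical helper name, and bonus points awarded in other entries are not modelled.

```python
def score_event(crashes: int, neutrino_pts: int = 2, sun_pts: int = 1):
    """Return (neutrino points, Sun points) for a number of crashes."""
    return crashes * neutrino_pts, crashes * sun_pts


# The Netra T1 that rebooted itself 59 times:
neutrinos, sun = score_event(59)
print(neutrinos, sun)  # 118 59
```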
April 17, 2002 - One of our E420's took a digger following a neutrino hit this morning at 2:00am spitting out the following error as it sputtered to its death (server name changed to protect the innocent -- again two for neutrinos one for Sun due to the scrubber patch):
Apr 17 02:15:38 hostname unix: WARNING: [AFT1] Uncorrectable Memory Error on CPU2 Data access at TL=0, errID 0x000f8133.1d0afba9
Apr 17 02:15:38 hostname unix: AFSR 0x00000000.80200000 AFAR 0x00000000.ffab6ce8
Apr 17 02:15:38 hostname unix: AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x10066cc4
Apr 17 02:15:38 hostname unix: UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0203 UDBL.ESYND 0x03
Apr 17 02:15:38 hostname unix: UDBL Syndrome 0x3 Memory Module U1404 U0404 U1403 U0403
Apr 17 02:15:38 hostname unix: WARNING: [AFT1] errID 0x000f8133.1d0afba9 Syndrome 0x3 indicates that this may not be a memory module problem
Apr 17 02:15:38 hostname unix: [AFT2] errID 0x000f8133.1d0afba9 PA=0x00000000.ffab6ce8
Apr 17 02:15:38 hostname unix: E$tag 0x00000000.0a401ff5 E$State: Shared E$parity 0x05
Apr 17 02:15:38 hostname unix: [AFT2] E$Data (0x00): 0x726b7265.6b657925
Apr 17 02:15:38 hostname unix: [AFT2] E$Data (0x08): 0x6467756c.6674656c
Apr 17 02:15:38 hostname unix: [AFT2] E$Data (0x10): 0x2564636f.6d00000a
Apr 17 02:15:38 hostname unix: [AFT2] E$Data (0x18): 0x00000000.1eda4445
Apr 17 02:15:38 hostname unix: [AFT2] E$Data (0x20): 0x6173c17c.618142fc
Apr 17 02:15:38 hostname unix: [AFT2] E$Data (0x28): 0x00020000.62ce3da4 *Bad* PSYND=0x00ff
Apr 17 02:15:38 hostname unix: [AFT2] E$Data (0x30): 0x626338d4.39327075
Apr 17 02:15:38 hostname unix: [AFT2] E$Data (0x38): 0x6e67652e.6e657750
Apr 17 02:15:38 hostname unix: WARNING: [AFT1] CP event on CPU1 (caused Data access error on CPU2), errID 0x000f8133.1d0afba9
Apr 17 02:15:38 hostname unix: AFSR 0x00000000.01000040 AFAR 0x00000000.ffab6ce8
Apr 17 02:15:38 hostname unix: AFSR.PSYND 0x0040(Score 95) AFSR.ETS 0x00
Apr 17 02:15:38 hostname unix: UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
Apr 17 02:15:38 hostname unix: [AFT2] errID 0x000f8133.1d0afba9 PA=0x00000000.ffab6ce8
Apr 17 02:15:38 hostname unix: E$tag 0x00000000.0a401ff5 E$State: Shared E$parity 0x05
Apr 17 02:15:38 hostname unix: [AFT2] E$Data (0x00): 0x726b7265.6b657925
Apr 17 02:15:38 hostname unix: [AFT2] E$Data (0x08): 0x6467756c.6674656c
Apr 17 02:15:38 hostname unix: [AFT2] E$Data (0x10): 0x2564636f.6d00000a
Apr 17 02:15:38 hostname unix: [AFT2] E$Data (0x18): 0x00000000.1eda4445
Apr 17 02:15:38 hostname unix: [AFT2] E$Data (0x20): 0x6173c17c.618142fc
Apr 17 02:15:38 hostname unix: [AFT2] E$Data (0x28): 0x00020000.62ce3da4 *Bad* PSYND=0x0040
Apr 17 02:15:38 hostname unix: [AFT2] E$Data (0x30): 0x626338d4.39327075
Apr 17 02:15:38 hostname unix: [AFT2] E$Data (0x38): 0x6e67652e.6e657750
Apr 17 02:15:38 hostname unix: panic[cpu2]/thread=0x62d8a440: [AFT1] errID 0x000f8133.1d0afba9 UE Error(s)
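Picking the interesting bits out of a dump like this by eye gets old quickly. The following is a minimal Python sketch, assuming only the `[AFT1]` message shapes seen in the logs on this page (an "Uncorrectable Memory Error" or an "XX event" followed by a CPU number and an errID). The pattern and the `aft1_events` helper are illustrative inventions, not a Sun tool, and other Solaris releases may word these messages differently.

```python
import re

# Matches "[AFT1] <event> on CPU<n> ... errID 0x........ . ........"
AFT1 = re.compile(
    r"\[AFT1\]\s+(?P<event>Uncorrectable Memory Error|[A-Z]+ event)\s+on\s+CPU(?P<cpu>\d+)"
    r".*?errID\s+(?P<errid>0x[0-9a-f]+\.[0-9a-f]+)"
)


def aft1_events(text: str):
    """Return a list of (event, cpu, errID) tuples found in messages-file text."""
    return [(m["event"], int(m["cpu"]), m["errid"]) for m in AFT1.finditer(text)]


# Two entries copied from the April 17 dump above.
sample = (
    "Apr 17 02:15:38 hostname unix: WARNING: [AFT1] Uncorrectable Memory Error on CPU2 "
    "Data access at TL=0, errID 0x000f8133.1d0afba9 "
    "Apr 17 02:15:38 hostname unix: WARNING: [AFT1] CP event on CPU1 "
    "(caused Data access error on CPU2), errID 0x000f8133.1d0afba9"
)

for event, cpu, errid in aft1_events(sample):
    print(f"CPU{cpu}: {event} ({errid})")
```

Grouping the results by CPU number is one way to spot a processor that keeps getting hit, as happens with "hostname12" further down.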
September 3, 2002 - One of our E420's took a hit this morning at 3:46am. The following was in the messages file (again, hostname modified to protect the innocent but give me an idea of which server it was :)...
Sep 3 03:46:12 hostname12 SUNW,UltraSPARC-II: [ID 230320 kern.info] NOTICE: [AFT2] errID 0x000d5a5e.c5d9ca05 CBI event on CPU3
Sep 3 03:46:12 hostname12 SUNW,UltraSPARC-II: [ID 433929 kern.info] [AFT2] errID 0x000d5a5e.c5d9ca05 PA=0x00000000.003d27c0
Sep 3 03:46:12 hostname12 E$tag 0x00000000.0c400007 E$State: Shared E$parity 0x06
Sep 3 03:46:12 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0xfffeff03.00c20932
Sep 3 03:46:12 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x0300c216.20010045
Sep 3 03:46:12 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x04003032.39380400
Sep 3 03:46:12 hostname12 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x18): 0x30353535.0900303a *Bad* PSYND=0x0001a (0x08): 0x0300c216.20010045
September 6, 2002 - The same server that took a hit on 9/3 took another one but on a different processor...
Sep 6 08:28:51 hostname12 SUNW,UltraSPARC-II: [ID 868287 kern.info] NOTICE: [AFT2] errID 0x000e5588.ba1953de DBI event on CPU1
Sep 6 08:28:51 hostname12 SUNW,UltraSPARC-II: [ID 128561 kern.info] [AFT2] errID 0x000e5588.ba1953de PA=0x00000000.d936c7c0
Sep 6 08:28:51 hostname12 E$tag 0x00000000.0dc01b26 E$State: Modified E$parity 0x06
Sep 6 08:28:51 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000300.0b2bc738
Sep 6 08:28:51 hostname12 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x08): 0x00000004.10021a1f *Bad* PSYND=0x0008
Sep 6 08:28:51 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x003a0000.0000003a
Sep 6 08:28:51 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x140000fe.baddcafe
Sep 6 08:28:51 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000300.0a90aee0
Sep 6 08:28:51 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x00000300.0b17fdb0
Sep 6 08:28:51 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000300.05688288
Sep 6 08:28:51 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x00000004.00025905
Sep 6 08:32:07 hostname12 genunix: [ID 540533 kern.notice] SunOS Release 5.8 Version Generic_108528-15 64-bit
Sep 6 08:32:07 hostname12 genunix: [ID 913631 kern.notice] Copyright 1983-2001 Sun Microsystems, Inc. All rights reserved.
September 10, 2002 - Yet another 420 took two hits almost simultaneously seconds before midnight on the first anniversary of the "Day of Infamy." Score updated Sun +1, Neutrinos +3 (since they weren't even a second apart, I'll give the customary 1 point to Sun for the cache scrubber patch, but the Neutrinos get a bonus point for aiming extremely well to hit a job just as it was jumping processors to cause both to fail). As an afterthought, I find it rather interesting that the Neutrinos took out two processors seconds before the anniversary... can you say "Neutraliban?" The entries in the messages log are as follows:
Sep 10 23:59:43 hostname9 unix: WARNING: [AFT1] Uncorrectable Memory Error on CPU0 Data access at TL=0, errID 0x000fc296.66ce49ef
Sep 10 23:59:43 hostname9 unix: AFSR 0x00000000.80200000 AFAR 0x00000000.cfda1350
Sep 10 23:59:43 hostname9 unix: AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1002e0cc
Sep 10 23:59:43 hostname9 unix: UDBH 0x0203 UDBH.ESYND 0x03 UDBL 0x0000 UDBL.ESYND 0x00
Sep 10 23:59:43 hostname9 unix: UDBH Syndrome 0x3 Memory Module U1402 U0402 U1401 U0401
Sep 10 23:59:43 hostname9 unix: WARNING: [AFT1] errID 0x000fc296.66ce49ef Syndrome 0x3 indicates that this may not be a memory module problem
Sep 10 23:59:43 hostname9 unix: [AFT2] errID 0x000fc296.66ce49ef PA=0x00000000.cfda1350
Sep 10 23:59:43 hostname9 unix: E$tag 0x00000000.084019fb E$State: Shared E$parity 0x04
Sep 10 23:59:43 hostname9 unix: [AFT2] E$Data (0x00): 0x00000000.00000000
Sep 10 23:59:43 hostname9 unix: [AFT2] E$Data (0x08): 0x30922000.00004000
Sep 10 23:59:43 hostname9 unix: [AFT2] E$Data (0x10): 0x0400001a.00000000 *Bad* PSYND=0xff00
Sep 10 23:59:43 hostname9 unix: [AFT2] E$Data (0x18): 0x603b5244.63828120
Sep 10 23:59:43 hostname9 unix: [AFT2] E$Data (0x20): 0x00000004.00000964
Sep 10 23:59:43 hostname9 unix: [AFT2] E$Data (0x28): 0x003a0000.0000003a
Sep 10 23:59:43 hostname9 unix: [AFT2] E$Data (0x30): 0x14000000.62ac09c0
Sep 10 23:59:43 hostname9 unix: [AFT2] E$Data (0x38): 0x6704ff40.6799dba0
Sep 10 23:59:43 hostname9 unix: WARNING: [AFT1] CP event on CPU1 (caused Data access error on CPU0), errID 0x000fc296.66ce49ef
Sep 10 23:59:43 hostname9 unix: AFSR 0x00000000.01008000 AFAR 0x00000000.cfda1350
Sep 10 23:59:43 hostname9 unix: AFSR.PSYND 0x8000(Score 95) AFSR.ETS 0x00
Sep 10 23:59:43 hostname9 unix: UDBH 0x012f UDBH.ESYND 0x2f UDBL 0x0000 UDBL.ESYND 0x00
Sep 10 23:59:43 hostname9 unix: [AFT2] errID 0x000fc296.66ce49ef PA=0x00000000.cfda1350
Sep 10 23:59:43 hostname9 unix: E$tag 0x00000000.194019fb E$State: Owner E$parity 0x0c
Sep 10 23:59:43 hostname9 unix: [AFT2] E$Data (0x00): 0x00000000.00000000
Sep 10 23:59:43 hostname9 unix: [AFT2] E$Data (0x08): 0x30922000.00004000
Sep 10 23:59:43 hostname9 unix: [AFT2] E$Data (0x10): 0x0400001a.00000000 *Bad* PSYND=0x8000
Sep 10 23:59:43 hostname9 unix: [AFT2] E$Data (0x18): 0x603b5244.63828120
Sep 10 23:59:43 hostname9 unix: [AFT2] E$Data (0x20): 0x00000004.00000964
Sep 10 23:59:43 hostname9 unix: [AFT2] E$Data (0x28): 0x003a0000.0000003a
Sep 10 23:59:43 hostname9 unix: [AFT2] E$Data (0x30): 0x14000000.62ac09c0
Sep 10 23:59:43 hostname9 unix: [AFT2] E$Data (0x38): 0x6704ff40.6799dba0
Sep 10 23:59:43 hostname9 unix: panic[cpu0]/thread=0x3003fe80: [AFT1] errID 0x000fc296.66ce49ef UE Error(s)
September 11, 2002 - The most amazing thing has happened! One of our E6500's took a hit and recovered itself! This log shows it clearly and within a few seconds the box scrubbed itself and actually continued running without a reboot or data loss. The Neutraliban cannot completely shake the stability! For this, Sun gets 1 point, Neutrinos 0. This makes the score so far Neutrinos 141, Sun 67.
Sep 11 05:47:46 hostname8 unix: WARNING: [AFT1] WP event on CPU0, errID 0x00064c5c.b4922ad0
Sep 11 05:47:46 hostname8 AFSR 0x00000000.00800800 AFAR 0x000001dd.20000000
Sep 11 05:47:46 hostname8 AFSR.PSYND 0x0800(Score 95) AFSR.ETS 0x00 Fault_PC 0x101259e0
Sep 11 05:47:46 hostname8 UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
Sep 11 05:48:10 hostname8 unix: WARNING: [AFT1] Uncorrectable Memory Error on CPU28 Data access at TL=0, errID 0x00064c62.22dc5992
Sep 11 05:48:10 hostname8 AFSR 0x00000000.80200000 AFAR 0x00000001.dd70e410
Sep 11 05:48:10 hostname8 AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x100272cc
Sep 11 05:48:10 hostname8 UDBH 0x0203 UDBH.ESYND 0x03 UDBL 0x0000 UDBL.ESYND 0x00
Sep 11 05:48:10 hostname8 UDBH Syndrome 0x3 Memory Module Board 0 J3100 J3200 J3300 J3400 J3500 J3600 J3700 J3800
Sep 11 05:48:10 hostname8 unix: WARNING: [AFT1] errID 0x00064c62.22dc5992 Syndrome 0x3 indicates that this may not be a memory module problem
Sep 11 05:48:10 hostname8 unix: [AFT2] errID 0x00064c62.22dc5992 PA=0x00000001.dd70e410
Sep 11 05:48:10 hostname8 E$tag 0x00000000.18c03bae E$State: Exclusive E$parity 0x0c
Sep 11 05:48:10 hostname8 unix: [AFT2] E$Data (0x00): 0x00000000.00000000
Sep 11 05:48:10 hostname8 unix: [AFT2] E$Data (0x08): 0x00000000.00000000
Sep 11 05:48:10 hostname8 unix: [AFT2] E$Data (0x10): 0x00000000.02000000 *Bad* PSYND=0xff00
Sep 11 05:48:10 hostname8 unix: [AFT2] E$Data (0x18): 0x00000000.00000000
Sep 11 05:48:10 hostname8 unix: [AFT2] E$Data (0x20): 0x00000000.00000000
Sep 11 05:48:10 hostname8 unix: [AFT2] E$Data (0x28): 0x00000000.00000000
Sep 11 05:48:10 hostname8 unix: [AFT2] E$Data (0x30): 0x00000000.00000000
Sep 11 05:48:10 hostname8 unix: [AFT2] E$Data (0x38): 0x00000000.00000000
Sep 11 05:48:10 hostname8 unix: NOTICE: Scheduling clearing of error on page 0x00000001.dd70e000
Sep 11 05:48:22 hostname8 unix: NOTICE: Previously reported error on page 0x00000001.dd70e000 cleared
Sep 11 05:48:22 hostname8 unix: [AFT3] errID 0x00064c62.22dc5992 Above Error detected by protected Kernel code
Sep 11 05:48:22 hostname8 that will try to clear error from system
August 19, 2002 - I must backtrack about a month due to a discovery made on an E5500... I noticed a processor listed as dead in prtdiag and the all-too-familiar error in 'messages' saying why it died. But - just as the box stayed alive on 9/11, this one, too, survived the hit and ran for a month on a crippled processor (shame on me for not noticing for a month, but this server usually runs without much excitement or attention). For this, Sun gets one point, Neutrinos get nothing. Way to go Sun! Now if you could keep your stock above $3 per share and maybe even push it back up to where I bought it so I can dump it, you'll make me even happier. Here's the smoking gun:
Aug 19 07:10:14 hostname6 unix: WARNING: [AFT1] WP event on CPU5, errID 0x0008cc2e.ad14d0af
Aug 19 07:10:14 hostname6 unix: AFSR 0x00000000.00800400 AFAR 0x00000000.002e47a0
Aug 19 07:10:14 hostname6 unix: AFSR.PSYND 0x0400(Score 95) AFSR.ETS 0x00 Fault_PC 0x10007a7c
Aug 19 07:10:14 hostname6 unix: UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
Aug 19 07:10:21 hostname6 unix: WARNING: [AFT1] Uncorrectable Memory Error on CPU1 Data access at TL=0, errID 0x0008cc30.50825ba9
Aug 19 07:10:21 hostname6 unix: AFSR 0x00000000.80200000 AFAR 0x00000000.b8a20b80
Aug 19 07:10:21 hostname6 unix: AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x10020fa8
Aug 19 07:10:21 hostname6 unix: UDBH 0x0203 UDBH.ESYND 0x03 UDBL 0x0000 UDBL.ESYND 0x00
Aug 19 07:10:21 hostname6 unix: UDBH Syndrome 0x3 Memory Module Board 4 J3100 J3200 J3300 J3400 J3500 J3600 J3700 J3800
Aug 19 07:10:21 hostname6 unix: WARNING: [AFT1] errID 0x0008cc30.50825ba9 Syndrome 0x3 indicates that this may not be a memory module problem
Aug 19 07:10:21 hostname6 unix: [AFT2] errID 0x0008cc30.50825ba9 PA=0x00000000.b8a20b80
Aug 19 07:10:21 hostname6 unix: E$tag 0x00000000.1ec01714 E$State: Exclusive E$parity 0x0f
Aug 19 07:10:21 hostname6 unix: [AFT2] E$Data (0x00): 0xff0a3330.39163933 *Bad* PSYND=0xff00
Aug 19 07:10:21 hostname6 unix: [AFT2] E$Data (0x08): 0x32363332.ffffff02
Aug 19 07:10:21 hostname6 unix: [AFT2] E$Data (0x10): 0xc102ff03.c25e1e04
Aug 19 07:10:21 hostname6 unix: [AFT2] E$Data (0x18): 0xc3024a5c.ffffff07
Aug 19 07:10:21 hostname6 unix: [AFT2] E$Data (0x20): 0x78650517.10101f07
Aug 19 07:10:21 hostname6 unix: [AFT2] E$Data (0x28): 0x78650517.10101fff
Aug 19 07:10:21 hostname6 unix: [AFT2] E$Data (0x30): 0xffff02c1.0203c203
Aug 19 07:10:21 hostname6 unix: [AFT2] E$Data (0x38): 0x0703c202.1002c11f
Aug 19 07:10:21 hostname6 unix: NOTICE: Scheduling clearing of error on page 0x00000000.b8a20000
Aug 19 07:10:31 hostname6 unix: NOTICE: Previously reported error on page 0x00000000.b8a20000 cleared
Aug 19 07:10:31 hostname6 unix: [AFT3] errID 0x0008cc30.50825ba9 Above Error detected by protected Kernel code
Aug 19 07:10:31 hostname6 unix: that will try to clear error from system
September 22, 2002 - Yet another hit on "hostname12." This time the message is stranger than usual, indicating that the box would reboot... N+2, S+1.
Sep 22 09:13:50 hostname12 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x30): 0x00000100.00000000 *Bad* PSYND=0x2000
Sep 22 09:13:50 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x00000000.00000000
Sep 22 09:13:50 hostname12 SUNW,UltraSPARC-II: [ID 195282 kern.info] [AFT2] errID 0x0004ebab.26d314dc AFAR was derived from E$Tag
Sep 22 09:13:50 hostname12 unix: [ID 321153 kern.notice] NOTICE: Scheduling clearing of error on page 0x00000000.0378e000
Sep 22 09:13:50 hostname12 SUNW,UltraSPARC-II: [ID 511200 kern.info] [AFT3] errID 0x0004ebab.26d314dc Above Error is due to Kernel access
Sep 22 09:13:50 hostname12 to User space and is fatal: will reboot
Sep 22 09:13:50 hostname12 SUNW,UltraSPARC-II: [ID 164355 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU1 Data access at TL=0, errID 0x0004ebab.2ee99825
Sep 22 09:13:50 hostname12 AFSR 0x00000000.80200000 AFAR 0x00000000.0378e8f0
Sep 22 09:13:50 hostname12 AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1000ba7c
Sep 22 09:13:50 hostname12 UDBH 0x0203 UDBH.ESYND 0x03 UDBL 0x0000 UDBL.ESYND 0x00
Sep 22 09:13:50 hostname12 UDBH Syndrome 0x3 Memory Module U1404 U0404 U1403 U0403
Sep 22 09:13:51 hostname12 SUNW,UltraSPARC-II: [ID 460529 kern.warning] WARNING: [AFT1] errID 0x0004ebab.2ee99825 Syndrome 0x3 indicates that this may not be a memory module problem
Sep 22 09:13:51 hostname12 SUNW,UltraSPARC-II: [ID 107686 kern.info] [AFT2] errID 0x0004ebab.2ee99825 PA=0x00000000.0378e8f0
Sep 22 09:13:51 hostname12 E$tag 0x00000000.0fc0006f E$State: Modified E$parity 0x07
Sep 22 09:13:51 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000000.00000000
Sep 22 09:13:51 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000000.00000000
Sep 22 09:13:51 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x00000000.00000000
Sep 22 09:13:51 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x00000000.00000000
Sep 22 09:13:51 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
Sep 22 09:13:51 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x00000000.00000000
Sep 22 09:13:51 hostname12 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x30): 0x00000100.00000000 *Bad* PSYND=0xff00
Sep 22 09:13:51 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x00000000.00000000
Sep 22 09:13:51 hostname12 unix: [ID 836849 kern.notice]
Sep 22 09:13:51 hostname12 panic[cpu1]/thread=3000b7cb180:
Sep 22 09:13:51 hostname12 unix: [ID 498001 kern.notice] [AFT1] errID 0x0004ebab.2ee99825 UE Error(s)
Sep 22 09:13:51 hostname12 See previous message(s) for details
Sep 22 09:13:51 hostname12 unix: [ID 100000 kern.notice]
Sep 22 09:13:51 hostname12 genunix: [ID 723222 kern.notice] 000002a101f1d350 SUNW,UltraSPARC-II:cpu_aflt_log+4e0 (2a101f1d40e, 1, 101484e0, 2a101f1d598, 2a101f1d45b, 10148508)
Sep 22 09:13:51 hostname12 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 000002a101f1d660 0000000000000003 0000000000000010
Sep 22 09:13:51 hostname12 %l4-7: 0000030001aa7180 000000000000003c 00000300002ac508 0000000000000000
Sep 22 09:13:51 hostname12 genunix: [ID 723222 kern.notice] 000002a101f1d5a0 SUNW,UltraSPARC-II:cpu_async_error+868 (104598b0, 2a101f1d660, 80200000, 0, 640040680200000, 2a101f1d820)
Sep 22 09:13:51 hostname12 genunix: [ID 179002 kern.notice] %l0-3: 000000001040dae4 0000000000000032 0000000000000000 0000000000000203
Sep 22 09:13:51 hostname12 %l4-7: 000000000378e8c0 0000000000400000 0000000000400000 0000000000000001
Sep 22 09:13:51 hostname12 genunix: [ID 723222 kern.notice] 000002a101f1d770 unix:prom_rtt+0 (3000b6da8c0, 30, 20,100, 30000032000, 0)
Sep 22 09:13:51 hostname12 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 0000000000001400 0000000080001607 000000001013fc94
Sep 22 09:13:51 hostname12 %l4-7: 00000000000000b0 0000000010412a78 0000000000000000 000002a101f1d820
Sep 22 09:13:52 hostname12 genunix: [ID 723222 kern.notice] 000002a101f1d8c0 genunix:core+ec (b, 1042e000, 3000b6da780, 1fff, 9fbff057, b)
Sep 22 09:13:52 hostname12 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000001 00000300033eca98 0000000000000001 0000000000000000
Sep 22 09:13:52 hostname12 %l4-7: 0000000000000000 0000000000000000 000003000b6da8f8 0000000000000000
Sep 22 09:13:52 hostname12 genunix: [ID 723222 kern.notice] 000002a101f1d970 genunix:psig+310 (1045a800, 0, 68, e,2, feb9b720)
Sep 22 09:13:52 hostname12 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 00000300033eca98 0000000000000400 000003000b6da780
Sep 22 09:13:52 hostname12 %l4-7: 000000000000000b 0000000000000000 000000000000000e 000002a101f1da10
Sep 22 09:13:52 hostname12 genunix: [ID 723222 kern.notice] 000002a101f1da20 genunix:post_syscall+3ec (3000b7cb180, 35, 1, ffbee704, 4, 0)
Sep 22 09:13:52 hostname12 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 000002a101f1dba0 000003000b6da780 000000000000005b
Sep 22 09:13:52 hostname12 %l4-7: 0000000000000000 00000300033eca98 0000000000000004 00000000018a9d70
Sep 22 09:13:52 hostname12 unix: [ID 100000 kern.notice]
Sep 22 09:13:52 hostname12 genunix: [ID 672855 kern.notice] syncing file systems...
Sep 22 09:13:53 hostname12 genunix: [ID 904073 kern.notice] done
October 3, 2002 - And yet another hit on "hostname12." This time we replaced the processor since this was the third hit on the same CPU. Again, the log indicated the box would reboot itself. We did recently patch these servers, so perhaps the log text has been augmented or modified. N+2, S+1.
Oct 3 10:34:54 hostname12 SUNW,UltraSPARC-II: [ID 186118 kern.info] [AFT2] errID 0x000364bc.b6995dcd PA=0x00000000.d7445030
Oct 3 10:34:54 hostname12 E$tag 0x00000000.0bc01ae8 E$State: Modified E$parity 0x05 Badlines found=9
Oct 3 10:34:54 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00010000.00000000
Oct 3 10:34:54 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000000.00000000
Oct 3 10:34:54 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x00000000.00000000
Oct 3 10:34:54 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x00000000.00000000
Oct 3 10:34:54 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
Oct 3 10:34:54 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x00000000.00000000
Oct 3 10:34:54 hostname12 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x30): 0x00000100.00000000 *Bad* PSYND=0x2000
Oct 3 10:34:54 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x00000000.00000000
Oct 3 10:34:54 hostname12 SUNW,UltraSPARC-II: [ID 220329 kern.info] [AFT2] errID 0x000364bc.b6995dcd AFAR was derived from E$Tag
Oct 3 10:34:54 hostname12 unix: [ID 321153 kern.notice] NOTICE: Scheduling clearing of error on page 0x00000000.d7444000
Oct 3 10:34:54 hostname12 SUNW,UltraSPARC-II: [ID 406729 kern.info] [AFT3] errID 0x000364bc.b6995dcd Above Error is due to Kernel access
Oct 3 10:34:54 hostname12 to User space and is fatal: will reboot
Oct 3 10:34:54 hostname12 SUNW,UltraSPARC-II: [ID 304463 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU1 Data access at TL=0, errID 0x000364bc.becfb1b9
Oct 3 10:34:54 hostname12 AFSR 0x00000000.80200000 AFAR 0x00000000.d7445fe0
Oct 3 10:34:54 hostname12 AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1001bef4
Oct 3 10:34:54 hostname12 UDBH 0x0203 UDBH.ESYND 0x03 UDBL 0x0000 UDBL.ESYND 0x00
Oct 3 10:34:54 hostname12 UDBH Syndrome 0x3 Memory Module U1404 U0404 U1403 U0403
Oct 3 10:34:55 hostname12 SUNW,UltraSPARC-II: [ID 817842 kern.warning] WARNING: [AFT1] errID 0x000364bc.becfb1b9 Syndrome 0x3 indicates that this may not be a memory module problem
Oct 3 10:34:55 hostname12 SUNW,UltraSPARC-II: [ID 592722 kern.info] [AFT2] errID 0x000364bc.becfb1b9 PA=0x00000000.d7445fe0
Oct 3 10:34:55 hostname12 E$tag 0x00000000.0bc01ae8 E$State: Modified E$parity 0x05
Oct 3 10:34:55 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000000.00000820
Oct 3 10:34:55 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000007.00000000
Oct 3 10:34:55 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x000002a1.0083dda0
Oct 3 10:34:55 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x00100100.00000000
Oct 3 10:34:55 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
Oct 3 10:34:55 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x05040017.00000003
Oct 3 10:34:55 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00c00022.179d9b46
Oct 3 10:34:55 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x00000000.01000000
Oct 3 10:34:55 hostname12 unix: [ID 836849 kern.notice]
Oct 3 10:34:55 hostname12 panic[cpu1]/thread=300086c2cc0:
Oct 3 10:34:55 hostname12 unix: [ID 579625 kern.notice] [AFT1] errID 0x000364bc.becfb1b9 UE Error(s)
Oct 3 10:34:55 hostname12 See previous message(s) for details
Oct 3 10:34:55 hostname12 unix: [ID 836849 kern.notice]
Oct 3 10:34:55 hostname12 panic[cpu1]/thread=300086c2cc0:
Oct 3 10:34:55 hostname12 unix: [ID 799565 kern.notice] BAD TRAP: type=31 rp=10422be0 addr=100fffe5bd0 mmu_fsr=0
Oct 3 10:34:55 hostname12 unix: [ID 100000 kern.notice]
Oct 3 10:34:55 hostname12 genunix: [ID 672855 kern.notice] syncing file systems...
Oct 3 10:34:55 hostname12 unix: [ID 836849 kern.notice]
Oct 3 10:34:55 hostname12 panic[cpu1]/thread=300086c2cc0:
Oct 3 10:34:55 hostname12 unix: [ID 799565 kern.notice] BAD TRAP: type=31 rp=10421b60 addr=100fffe5bd0 mmu_fsr=0
Oct 3 10:34:55 hostname12 unix: [ID 100000 kern.notice]
November 23, 2002 - One of our Netra T1 "prod-dev" (don't ask) servers took a hit this evening forcing an fsck of two filesystems resulting in 2 disconnected inodes that had to be cleared. This is the first real data loss we've suffered from a Neutrino, so I'm gonna give them a bonus point. N+3, S+1.
Nov 23 17:48:11 hostname4dev SUNW,UltraSPARC-IIi: [ID 339143 kern.warning] WARNING: [AFT1] EDP event on CPU0 Instruction access at TL=0, errID 0x00031a2e.df7d7174
Nov 23 17:48:11 hostname4dev AFSR 0x00000000.00400080 AFAR 0x00000000.11262418
Nov 23 17:48:11 hostname4dev AFSR.PSYND 0x0080(Score 95) AFSR.ETS 0x00 Fault_PC 0xfec8a3fc
Nov 23 17:48:11 hostname4dev UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
Nov 23 17:48:11 hostname4dev SUNW,UltraSPARC-IIi: [ID 559225 kern.info] [AFT2] errID 0x00031a2e.df7d7174 PA=0x00000000.11262418
Nov 23 17:48:11 hostname4dev E$tag 0x00000000.00028449 E$State: Exclusive E$parity 0x02
Nov 23 17:48:11 hostname4dev SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x12800003.80a5e000
Nov 23 17:48:11 hostname4dev SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x9610220c.d625a004
Nov 23 17:48:11 hostname4dev SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x22bffe50.84102000
Nov 23 17:48:11 hostname4dev SUNW,UltraSPARC-IIi: [ID 989652 kern.info] [AFT2] E$Data (0x18): 0xd523a074.8410200a *Bad* PSYND=0x0080
Nov 23 17:48:11 hostname4dev SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x9600e004.d023a064
Nov 23 17:48:11 hostname4dev SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x94102000.98102177
Nov 23 17:48:11 hostname4dev SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0xd023a068.9010001a
Nov 23 17:48:11 hostname4dev SUNW,UltraSPARC-IIi: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0xd223a070.92100018
Nov 23 17:48:11 hostname4dev SUNW,UltraSPARC-IIi: [ID 511021 kern.info] [AFT2] errID 0x00031a2e.df7d7174 AFAR was derived from E$Tag
Nov 23 17:48:11 hostname4dev unix: [ID 836849 kern.notice]
Nov 23 17:48:11 hostname4dev panic[cpu0]/thread=30001978da0:
Nov 23 17:48:11 hostname4dev unix: [ID 124028 kern.notice] [AFT1] errID 0x00031a2e.df7d7174 EDP Error(s)
Nov 23 17:48:11 hostname4dev unix: [ID 100000 kern.notice]
Nov 23 17:48:11 hostname4dev genunix: [ID 723222 kern.notice] 000002a1010456d0 SUNW,UltraSPARC-IIi:cpu_aflt_log+4e0 (2a10104578e, 1, 10146398, 2a101045918, 2a1010457db, 101463c0)
Nov 23 17:48:11 hostname4dev genunix: [ID 179002 kern.notice] %l0-3: 0000000000000001 000002a1010459e0 0000000000000003 0000000000000010
Nov 23 17:48:11 hostname4dev %l4-7: 0000000000200000 0000000000400000 0000000000000000 0000000000000000
Nov 23 17:48:11 hostname4dev genunix: [ID 723222 kern.notice] 000002a101045920 SUNW,UltraSPARC-IIi:cpu_async_error+830 (1, 2a1010459e0, 400080, 0, 0, 140000000400080)
Nov 23 17:48:12 hostname4dev genunix: [ID 179002 kern.notice] %l0-3: 000002a101045ba0 000000000000000a 0000000000000000 0000000000000000
Nov 23 17:48:12 hostname4dev %l4-7: 0000000004004208 0000000000000000 000001efd777dd68 0000000000000000
Nov 23 17:48:12 hostname4dev unix: [ID 100000 kern.notice]
Nov 23 17:48:12 hostname4dev genunix: [ID 672855 kern.notice] syncing file systems...
Nov 23 17:48:12 hostname4dev genunix: [ID 904073 kern.notice] done
November 25, 2002 - Another 4proc/4gb E420 nosedived with some strange output. The informational blurb was short and sweet in /var/adm/messages this time, but dmesg reports something different...
/var/adm/messages
Nov 25 12:53:32 hostname-cstlml unix: WARNING: [AFT1] WP event on CPU0, errID 0x00302915.5a1d3a9e
Nov 25 12:53:32 hostname-cstlml unix: AFSR.PSYND 0x0008(Score 95) AFSR.ETS 0x00 Fault_PC 0x1007356c
Nov 25 12:53:32 hostname-cstlml unix: UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
Nov 25 13:28:30 hostname-cstlml unix: 87 dynamic kernel data pages
Nov 25 13:28:30 hostname-cstlml unix: 274 kernel-pageable pages
Nov 25 13:28:30 hostname-cstlml unix: 0 segkmap kernel pages
Nov 25 13:28:30 hostname-cstlml unix: 0 segvn kernel pages
Nov 25 13:28:30 hostname-cstlml unix: 195 current user process pages
Nov 25 13:28:30 hostname-cstlml unix: 35081 total pages (35081 chunks)
Nov 25 13:28:30 hostname-cstlml unix: dumping to vp 621e3a94, offset 3632847
Nov 25 13:28:30 hostname-cstlml unix: panic[cpu1]/thread=0x30053e80: panic dump timeout
Nov 25 13:28:30 hostname-cstlml unix: Dump Aborted.
dmesg output
WARNING: [AFT1] Uncorrectable Memory Error on CPU1 Data access at TL=0, errID 0x00302915.62e20799
AFSR 0x00000000.80200000 AFAR 0x00000000.fa4e2bb8
AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1003a8f4
UDBH 0x004a UDBH.ESYND 0x4a UDBL 0x0203 UDBL.ESYND 0x03
UDBL Syndrome 0x3 Memory Module U1304 U0304 U1303 U0303
WARNING: [AFT1] errID 0x00302915.62e20799 Syndrome 0x3 indicates that this may not be a memory module problem
[AFT2] errID 0x00302915.62e20799 PA=0x00000000.fa4e2bb8
E$tag 0x00000000.18c01f49 E$State: Exclusive E$parity 0x0c
[AFT2] E$Data (0x00): 0x6e2a8ca8.00000000
[AFT2] E$Data (0x08): 0x1186cb80.1186cb80
[AFT2] E$Data (0x10): 0x1186cb80.1186cb80
[AFT2] E$Data (0x18): 0x00000000.00000000
[AFT2] E$Data (0x20): 0x00000000.00000000
[AFT2] E$Data (0x28): 0x00000000.00000000
[AFT2] E$Data (0x30): 0x00000000.0003f5fc
[AFT2] E$Data (0x38): 0x02010000.00000000 *Bad* PSYND=0x00ff
panic[cpu1]/thread=0x6a3502e0: [AFT1] errID 0x00302915.62e20799 UE Error(s)
panic[cpu1]/thread=0x6a3502e0: [AFT1] errID 0x00302915.62e20799 UE Error(s)
See previous message(s) for details
syncing file systems... [45] 7 done
34525 static and sysmap kernel pages
87 dynamic kernel data pages
274 kernel-pageable pages
0 segkmap kernel pages
0 segvn kernel pages
195 current user process pages
35081 total pages (35081 chunks)
dumping to vp 621e3a94, offset 3632847
panic[cpu1]/thread=0x30053e80: panic dump timeout
Dump Aborted.
Notice how CPU0 is listed in /var/adm/messages but the dmesg output shows that an uncorrectable memory error also occurred on CPU1... interesting. Also notice how cpu1 panics in the middle of dumping the CPU0 output. It is my belief that CPU0 took a hit, forcing the box to drop to the PROM prompt, and while it was in the process of dumping to reboot, CPU1 took a hit. That was roughly half an hour after the box dropped to an 'ok' prompt (I was at lunch, sue me!): CPU0 panics at 12:53 and the reboot lists the dump output starting at 13:28.
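The comparison above (one CPU named in /var/adm/messages, another in dmesg) can be done with a few lines of Python rather than by eye. This is a minimal sketch, assuming the log has one entry per line; `aft1_events` is a hypothetical helper name, and the pattern only covers the two message shapes seen in this entry ("WP event on CPU0" and "Uncorrectable Memory Error on CPU1").

```python
import re

# "[AFT1] <event text> [event] on CPU<n>"; the event text is limited to
# letters and spaces so that "[AFT1] errID 0x..." lines are not matched.
AFT1_EVENT = re.compile(
    r"\[AFT1\]\s+(?P<event>[A-Za-z ]+?)\s+(?:event\s+)?on\s+CPU(?P<cpu>\d+)"
)


def aft1_events(log_text):
    """Return (event, cpu) pairs for every [AFT1] event line in log_text."""
    return [(m.group("event"), int(m.group("cpu")))
            for m in AFT1_EVENT.finditer(log_text)]


# Lines copied from the /var/adm/messages and dmesg output above.
messages = ("Nov 25 12:53:32 hostname-cstlml unix: WARNING: [AFT1] WP event "
            "on CPU0, errID 0x00302915.5a1d3a9e")
dmesg = ("WARNING: [AFT1] Uncorrectable Memory Error on CPU1 Data access at "
         "TL=0, errID 0x00302915.62e20799\n"
         "panic[cpu1]/thread=0x6a3502e0: [AFT1] errID 0x00302915.62e20799 "
         "UE Error(s)")

print(aft1_events(messages))
print(aft1_events(dmesg))
```

Running it over both sources side by side makes the CPU0 versus CPU1 mismatch obvious at a glance.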
I would like to hereby award Neutrinos 5 points and Sun 1. Why? Neutrinos were able to hit the system to knock it on its butt and then send a second squadron to kick it down just as it was coming back to its feet. Why should I give Sun a point? I'm not sure. I guess I just feel sorry for it. Neutrinos 155, Sun 73.
February 6, 2003 - Our mongo 20proc/20GB E6500 was smacked down by a neutrino as I was on my way to lunch. It rebooted and came back alive, but only after every billing process fell flat on its butt and knocked every user offline. I guess the server tried to shirk responsibility because it blamed Oracle (near end of message - PID 5803). Poor Oracle! Let me punish the mean OS for you. Bad Solaris! Bad! Bad! Don't do that ever again! For doing that, you can't swap for 2 days. Now go to your room!
Hey - that was amusing! To add insult to this posting, as I was typing it I was IM-alerted that Sun's stock is at a breathtaking $3.24 per share, up 5% from the previous day's close. How sad is that? I paid $10.61 for it. I suck.
Feb 6 12:19:08 hostname8 unix: WARNING: [AFT1] EDP event on CPU9 Instruction access at TL=0, errID 0x0001cbe4.1e434616
Feb 6 12:19:08 hostname8 AFSR 0x00000000.00400100 AFAR 0x00000004.b6522070
Feb 6 12:19:08 hostname8 AFSR.PSYND 0x0100(Score 95) AFSR.ETS 0x00 Fault_PC 0x52205c
Feb 6 12:19:08 hostname8 UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
Feb 6 12:19:08 hostname8 unix: [AFT2] errID 0x0001cbe4.1e434616 PA=0x00000004.b6522070
Feb 6 12:19:08 hostname8 E$tag 0x00000000.0e4096ca E$State: Shared E$parity 0x07
Feb 6 12:19:08 hostname8 unix: [AFT2] E$Data (0x00): 0xd0066000.90102020
Feb 6 12:19:08 hostname8 unix: [AFT2] E$Data (0x08): 0xd037bfc2.d007bfac
Feb 6 12:19:08 hostname8 unix: [AFT2] E$Data (0x10): 0xd027be60.d0066000
Feb 6 12:19:08 hostname8 unix: [AFT2] E$Data (0x18): 0x808a2040.02800009
Feb 6 12:19:08 hostname8 unix: [AFT2] E$Data (0x20): 0x90102001.d027be64
Feb 6 12:19:08 hostname8 unix: [AFT2] E$Data (0x28): 0x90102071.d027be5c
Feb 6 12:19:08 hostname8 unix: [AFT2] E$Data (0x30): 0xc027be68.c027bfc8 *Bad* PSYND=0x0100
Feb 6 12:19:08 hostname8 unix: [AFT2] E$Data (0x38): 0x10800009.d0066000
Feb 6 12:19:08 hostname8 unix: [AFT2] errID 0x0001cbe4.1e434616 AFAR was derived from E$Tag
Feb 6 12:19:08 hostname8 unix: NOTICE: Scheduling clearing of error on page 0x00000004.b6522000
Feb 6 12:19:20 hostname8 unix: NOTICE: Previously reported error on page 0x00000004.b6522000 cleared
Feb 6 12:19:20 hostname8 unix: [AFT3] errID 0x0001cbe4.1e434616 Above Error is in User Mode
Feb 6 12:19:20 hostname8 and is fatal: will reboot
Feb 6 12:19:20 hostname8 unix: WARNING: [AFT1] initiating reboot due to above error in pid 5803 (oracle)
Feb 6 12:23:46 hostname8 syslogd: going down on signal 15
Feb 6 12:24:26 hostname8 unix: syncing file systems...
Feb 6 12:24:26 hostname8 unix: done
June 6, 2003 - Another E420 ate some subatomic energy that whisked its crucial bit from ecache into some electron orbit, never to be seen again. You can see this one has an A1000 attached to it if you have a keen eye (or if you can read English).
Jun 6 09:18:37 hostname-glfml unix: WARNING: [AFT1] EDP event on CPU2 Instruction access at TL=0, errID 0x002bb4ba.557337d6
Jun 6 09:18:37 hostname-glfml unix: AFSR 0x00000000.00408000 AFAR 0x00000000.da8126b0
Jun 6 09:18:37 hostname-glfml unix: AFSR.PSYND 0x8000(Score 95) AFSR.ETS 0x00 Fault_PC 0xef42e6a0
Jun 6 09:18:37 hostname-glfml unix: UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
Jun 6 09:18:37 hostname-glfml unix: [AFT2] errID 0x002bb4ba.557337d6 PA=0x00000000.da8126b0
Jun 6 09:18:37 hostname-glfml unix: E$tag 0x00000000.0e401b50 E$State: Shared E$parity 0x07
Jun 6 09:18:37 hostname-glfml unix: [AFT2] E$Data (0x00): 0xd006e008.80a23fff
Jun 6 09:18:37 hostname-glfml unix: [AFT2] E$Data (0x08): 0x32800006.d004a000
Jun 6 09:18:37 hostname-glfml unix: [AFT2] E$Data (0x10): 0x40007c48.90072008
Jun 6 09:18:37 hostname-glfml unix: [AFT2] E$Data (0x18): 0x1080000e.01000000
Jun 6 09:18:37 hostname-glfml unix: [AFT2] E$Data (0x20): 0x92222001.80a22000
Jun 6 09:18:37 hostname-glfml unix: [AFT2] E$Data (0x28): 0x0480000a.d224a000
Jun 6 09:18:37 hostname-glfml unix: [AFT2] E$Data (0x30): 0xbb072008.40007c42 *Bad* PSYND=0x8000
Jun 6 09:18:37 hostname-glfml unix: [AFT2] E$Data (0x38): 0x9010001d.e004a000
Jun 6 09:18:37 hostname-glfml unix: [AFT2] errID 0x002bb4ba.557337d6 AFAR was derived from E$Tag
Jun 6 09:18:37 hostname-glfml unix: NOTICE: Scheduling clearing of error on page 0x00000000.da812000
Jun 6 09:18:41 hostname-glfml unix: NOTICE: Previously reported error on page 0x00000000.da812000 cleared
Jun 6 09:18:41 hostname-glfml unix: [AFT3] errID 0x002bb4ba.557337d6 Above Error is in User Mode
Jun 6 09:18:41 hostname-glfml unix: and is fatal: will reboot
Jun 6 09:18:41 hostname-glfml unix: WARNING: [AFT1] initiating reboot due to above error in pid 12305 (smtpd)
Jun 6 09:18:43 hostname-glfml syslogd: going down on signal 15
Jun 06 09:18:44 hostname-glfml Array Monitor stopped
Jun 06 09:18:54 hostname-glfml RDAC daemons stopped
June 8, 2003 - Our monster E6500 (now up to 24 procs/24 GB) was hit yet again by a neutrino in the middle of running "treatment letters" for people who can't pay their bills. See what happens when you get behind in payments, you deadbeat customers? You crash my freakin' servers! Pay up! I feel like giving the neutrinos an extra point this time. I'm not sure what happened, but it looks like the event on CPU19 was trying to do something with CPU20 at the time it hit. I'm not sure what that means, but I'm just feeling particularly evil today, so "Huzzah, Neutrinos!" It kinda looks like the same thing that happened to this exact same server on Sept 10, 2002, but on different processors.
Jun 8 08:59:53 hostname8 unix: WARNING: [AFT1] Uncorrectable Memory Error on CPU20 Data access at TL=0, errID 0x00168bfa.8bfb58bc Jun 8 08:59:53 hostname8 AFSR 0x00000000.80200000AFAR 0x00000005.d488e028 Jun 8 08:59:53 hostname8 AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x101d16b0 Jun 8 08:59:53 hostname8 UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0203 UDBL.ESYND 0x03 Jun 8 08:59:53 hostname8 UDBL Syndrome 0x3 Memory Module Board 6 J3101 J3201 J3301 J3401 J3501 J3601 J3701 J3801 Jun 8 08:59:53 hostname8 unix: WARNING: [AFT1] errID 0x00168bfa.8bfb58bc Syndrome 0x3 indicates that this may not be a memory module problem Jun 8 08:59:53 hostname8 unix: [AFT2] errID 0x00168bfa.8bfb58bc PA=0x00000005.d488e028 Jun 8 08:59:53 hostname8 E$tag 0x00000000.09c0ba91 E$State: Modified E$parity 0x04 Jun 8 08:59:53 hostname8 unix: [AFT2] E$Data (0x00): 0x00000ffa.00000000 Jun 8 08:59:53 hostname8 unix: [AFT2] E$Data (0x08): 0x00000000.1048c0e8 Jun 8 08:59:53 hostname8 unix: [AFT2] E$Data (0x10): 0x00000ffb.00000000 Jun 8 08:59:53 hostname8 unix: [AFT2] E$Data (0x18): 0x00000300.08fe4e80 Jun 8 08:59:53 hostname8 unix: [AFT2] E$Data (0x20): 0x00000000.00000000 Jun 8 08:59:53 hostname8 unix: [AFT2] E$Data (0x28): 0x08000000.00007d72 *Bad* PSYND=0x00ff Jun 8 08:59:53 hostname8 unix: [AFT2] E$Data (0x30): 0x00000000.0007828e Jun 8 08:59:53 hostname8 unix: [AFT2] E$Data (0x38): 0x00000000.00000000 Jun 8 08:59:53 hostname8 unix: WARNING: [AFT1] CP event on CPU19 (caused Data access error on CPU20), errID 0x00168bfa.8bfb58bc Jun 8 08:59:53 hostname8 AFSR 0x00000000.01000080 AFAR 0x00000005.d488e028 Jun 8 08:59:53 hostname8 AFSR.PSYND 0x0080(Score 95) AFSR.ETS 0x00 Jun 8 08:59:53 hostname8 UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00 Jun 8 08:59:53 hostname8 unix: WARNING: [AFT2] errID 0x00168bfa.8bfb58bc No cache dump available Jun 8 08:59:53 hostname8 unix: panic[cpu20]/thread=30008fe4e80: Jun 8 08:59:53 hostname8 unix: [AFT1] errID 0x00168bfa.8bfb58bc UE Error(s) Jun 8 
08:59:53 hostname8 See previous message(s) for details Jun 8 08:59:53 hostname8 unix: Jun 8 08:59:54 hostname8 unix: syncing file systems... Jun 8 09:00:14 hostname8 unix: done Jun 8 09:00:14 hostname8 unix: panic[cpu20]/thread=2a1000abd60: Jun 8 09:00:14 hostname8 unix: panic sync timeout
June 14, 2004 - Our financial system fell down, went boom. The OS tried to blame Veritas NetBackup for the stumble, just as it did to Oracle a few entries ago. Bad Solaris! Bad! Bad! Here's the proof.
Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 171966 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU8 Data access at TL=0, errID 0x00404337.95862752 Jun 14 00:16:45 hostname1 AFSR 0x00000000.80200000AFAR 0x00000002.b8a1c628 Jun 14 00:16:45 hostname1 AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x10148e5c Jun 14 00:16:45 hostname1 UDBH 0x002f UDBH.ESYND 0x2f UDBL 0x0203 UDBL.ESYND 0x03 Jun 14 00:16:45 hostname1 UDBL Syndrome 0x3 Memory Module Board 4 J3101 J3201 J3301 J3401 J3501 J3601 J3701 J3801 Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 125565 kern.warning] WARNING: [AFT1] errID 0x00404337.95862752 Syndrome 0x3 indicates that this may not be a memory module problem Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 495010 kern.info] [AFT2] errID 0x00404337.95862752 PA=0x00000002.b8a1c628 Jun 14 00:16:45 hostname1 E$tag 0x00000000.0a405714 E$State: Shared E$parity 0x05 Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0xc1032020.30313032 Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x30303030.30303030 Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x30303030.30303030 Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x30303030.30303030 Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x30304f4e.47524320 Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x28): 0x20202020.20202024 *Bad* PSYND=0x00ff Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x2020204c.c1035445 Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0xc2194e3a.c1613320 Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 242399 kern.warning] WARNING: [AFT1] AFAR was derived from UE report, CP event on CPU13 (caused Data access error on CPU8), errID 
0x00404337.95862752 Jun 14 00:16:45 hostname1 AFSR 0x00000000.01000001 AFAR 0x00000002.b8a1c628 Jun 14 00:16:45 hostname1 AFSR.PSYND 0x0001(Score 95) AFSR.ETS 0x00 Jun 14 00:16:45 hostname1 UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00 Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 495010 kern.info] [AFT2] errID 0x00404337.95862752 PA=0x00000002.b8a1c628 Jun 14 00:16:45 hostname1 E$tag 0x00000000.1b405714 E$State: Owner E$parity 0x0d Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0xc1032020.30313032 Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x30303030.30303030 Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x30303030.30303030 Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x30303030.30303030 Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x30304f4e.47524320 Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x28): 0x20202020.20202024 *Bad* PSYND=0x0001 Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x2020204c.c1035445 Jun 14 00:16:45 hostname1 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0xc2194e3a.c1613320 Jun 14 00:16:45 hostname1 unix: [ID 321153 kern.notice] NOTICE: Scheduling clearing of error on page 0x00000002.b8a1c000 Jun 14 00:16:57 hostname1 unix: [ID 221039 kern.notice] NOTICE: Previously reported error on page 0x00000002.b8a1c000 cleared Jun 14 00:16:57 hostname1 SUNW,UltraSPARC-II: [ID 171070 kern.info] [AFT3] errID 0x00404337.95862752 Above Error is due to Kernel access Jun 14 00:16:57 hostname1 to User space and is fatal: will reboot Jun 14 00:16:57 hostname1 unix: [ID 855177 kern.warning] WARNING: [AFT1] initiating reboot due to above error in pid 11901 (bpbkar)
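Both this entry and the February 6, 2003 one end with the kernel naming the process that happened to touch the bad cache line. A small sketch for pulling those scapegoats out of a log, assuming the message wording stays exactly as shown in these dumps (`blamed_processes` is an illustrative name, not an existing tool):

```python
import re

# Wording copied from the "[AFT1] initiating reboot ..." lines in the dumps.
BLAME = re.compile(
    r"initiating reboot due to above error in pid (\d+) \(([^)]+)\)"
)


def blamed_processes(log_text):
    """Return (pid, process name) for each reboot the kernel pinned on a process."""
    return [(int(pid), name) for pid, name in BLAME.findall(log_text)]


sample = (
    "Feb 6 12:19:20 hostname8 unix: WARNING: [AFT1] initiating reboot due "
    "to above error in pid 5803 (oracle)\n"
    "Jun 14 00:16:57 hostname1 unix: [ID 855177 kern.warning] WARNING: [AFT1] "
    "initiating reboot due to above error in pid 11901 (bpbkar)"
)
print(blamed_processes(sample))
```

Because the pattern does not depend on line breaks, it works on the flattened single-line dumps as well as on one-entry-per-line logs.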
Jul 10 13:44:59 HOSTNAME2 unix: WARNING: [AFT1] EDP event on CPU0 Data access at TL=0, errID 0x0000025b.0b2535a8
Jul 10 13:44:59 HOSTNAME2 AFSR 0x00000000.80402000 AFAR 0x00000000.bf382000
Jul 10 13:44:59 HOSTNAME2 AFSR.PSYND 0x2000(Score 95) AFSR.ETS 0x00 Fault_PC 0x10034788
Jul 10 13:44:59 HOSTNAME2 UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
Jul 10 13:44:59 HOSTNAME2 unix: [AFT2] errID 0x0000025b.0b2535a8 PA=0x00000000.bf382000
Jul 10 13:44:59 HOSTNAME2 E$tag 0x00000000.0fc017e7 E$State: Modified E$parity 0x07
Jul 10 13:44:59 HOSTNAME2 unix: [AFT2] E$Data (0x00): 0x00000100.00000000 *Bad* PSYND=0x2000
Jul 10 13:44:59 HOSTNAME2 unix: [AFT2] E$Data (0x08): 0x70ce4238.0003e00b
Jul 10 13:44:59 HOSTNAME2 unix: [AFT2] E$Data (0x10): 0x0003e00b.00000000
Jul 10 13:44:59 HOSTNAME2 unix: [AFT2] E$Data (0x18): 0x3fc0f360.7ffff930
Jul 10 13:44:59 HOSTNAME2 unix: [AFT2] E$Data (0x20): 0x00000000.00000000
Jul 10 13:44:59 HOSTNAME2 unix: [AFT2] E$Data (0x28): 0x00000000.00000000
Jul 10 13:44:59 HOSTNAME2 unix: [AFT2] E$Data (0x30): 0x00000000.00000000
Jul 10 13:44:59 HOSTNAME2 unix: [AFT2] E$Data (0x38): 0x00000000.00000000
Jul 10 13:44:59 HOSTNAME2 unix: [AFT2] errID 0x0000025b.0b2535a8 AFAR was derived from E$Tag
Jul 10 13:44:59 HOSTNAME2 unix: panic[cpu0]/thread=4002be60:
Jul 10 13:44:59 HOSTNAME2 unix: [AFT1] errID 0x0000025b.0b2535a8 EDP Error(s)
Jul 10 13:44:59 HOSTNAME2 See previous message(s) for details
Jul 10 13:44:59 HOSTNAME2 unix:
Jul 10 13:45:00 HOSTNAME2 unix: syncing file systems...
Jul 10 13:45:00 HOSTNAME2 unix: 2
Jul 10 13:45:20 HOSTNAME2 unix: done
Jul 10 13:45:20 HOSTNAME2 unix: panic[cpu0]/thread=4003fe60:
Jul 10 13:45:20 HOSTNAME2 unix: panic sync timeout
Jul 10 13:45:20 HOSTNAME2 unix:
Aug 7 17:36:10 financial SUNW,UltraSPARC-II: [ID 787124 kern.info] [AFT0] Corrected Memory Error detected by CPU12, errID 0x0010cc0b.2c6e8d55
Aug 7 17:36:10 financial AFSR 0x00000000.00100000 AFAR 0x00000001.4b79c0f0
Aug 7 17:36:10 financial AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x10025290
Aug 7 17:36:10 financial UDBH Syndrome 0x89 Memory Module Board 6 J3300
Aug 7 17:36:10 financial SUNW,UltraSPARC-II: [ID 919014 kern.info] [AFT0] errID 0x0010cc0b.2c6e8d55 Corrected Memory Error on Board 6 J3300 is Persistent
Aug 7 17:36:10 financial SUNW,UltraSPARC-II: [ID 658497 kern.info] [AFT0] errID 0x0010cc0b.2c6e8d55 ECC Data Bit 6 was in error and corrected
Aug 12 23:07:30 financial SUNW,UltraSPARC-II: [ID 198611 kern.info] NOTICE: [AFT2] errID 0x00126707.7c63ce3f CBI event on CPU4
Aug 12 23:07:30 financial SUNW,UltraSPARC-II: [ID 181335 kern.info] [AFT2] errID 0x00126707.7c63ce3f PA=0x00000000.00d4cd40
Aug 12 23:07:30 financial E$tag 0x00000000.0c40001a E$State: Shared E$parity 0x06
Aug 12 23:07:30 financial SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x20077564.01010101
Aug 12 23:07:30 financial SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x08): 0x0101800a.64656c6f *Bad* PSYND=0x0002
Aug 12 23:07:30 financial SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x6c617973.20200778
Aug 12 23:07:30 financial SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x64090e01.010104c3
Aug 12 23:07:30 financial SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x12352001.4502c102
Aug 12 23:07:30 financial SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x05202020.20200180
Aug 12 23:07:30 financial SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x014e2c00.1202c15c
Aug 12 23:07:30 financial SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x03c2170a.018003c2
Aug 14 10:55:37 hosthelp unix: [ID 836849 kern.notice] Aug 14 10:55:37 hosthelp panic[cpu0]/thread=2a100071d40: Aug 14 10:55:37 hosthelp unix: [ID 695590 kern.notice] CPU0 Ecache SRAM Data Parity Error: AFSR 0x00000000.80400004 AFAR 0x00000000.34bbb678 Aug 14 10:55:37 hosthelp unix: [ID 100000 kern.notice] Aug 14 10:55:37 hosthelp genunix: [ID 723222 kern.notice] 000002a100071460 SUNW,UltraSPARC-IIi:check_misc_err+104 (80400004, 34bbb678, 20, 0, 300064c5d38, 30005e55f00) Aug 14 10:55:37 hosthelp genunix: [ID 179002 kern.notice] %l0-3: 0000000000061217 0000030001093550 0000000000000029 0000000000000000 Aug 14 10:55:37 hosthelp %l4-7: 00000300079aab80 0000000000000001 0000000000000001 8000000000000012 Aug 14 10:55:38 hosthelp genunix: [ID 723222 kern.notice] 000002a100071520 SUNW,UltraSPARC-IIi:cpu_async_error+f0 (34bbb678, 0, 80400004, 0, 0, 0) Aug 14 10:55:38 hosthelp genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 0000000000000000 0000000000000000 000002a100071710 Aug 14 10:55:38 hosthelp %l4-7: 0000000000000000 0000000000000000 0000000010031598 0000000000000000 Aug 14 10:55:38 hosthelp genunix: [ID 723222 kern.notice] 000002a100071660 unix:prom_rtt+0 (3ffffffe0466d4d8, 30001059da8, 64, 284ed79fe9c7c9, 0, 0) Aug 14 10:55:38 hosthelp genunix: [ID 179002 kern.notice] %l0-3: 0000000000000005 0000000000001400 00000044f0001604 00000000101329a8 Aug 14 10:55:38 hosthelp %l4-7: 0000000010456000 000000003b9aca00 0000000000000000 000002a100071710 Aug 14 10:55:38 hosthelp genunix: [ID 723222 kern.notice] 000002a1000717b0 genunix:qtimeout+30 (30000fe94c8, 102553cc, 0, 64, 30004763510, 100710d8) Aug 14 10:55:38 hosthelp genunix: [ID 179002 kern.notice] %l0-3: 0000030001059da8 3ffffffe0466d4d8 0000000000000000 000003000006a4ac Aug 14 10:55:38 hosthelp %l4-7: 00000300000598c8 000003000020bea8 0000000000000000 000003000020bed0 Aug 14 10:55:38 hosthelp genunix: [ID 723222 kern.notice] 000002a100071860 tcp:tcp_time_wait_collector+110 (10482c00, 2a100071d40, 1047a268, 
30005cdbd38, 0, 10141b1c) Aug 14 10:55:39 hosthelp genunix: [ID 179002 kern.notice] %l0-3: 0000000000000001 00000300013fd020 0000000010461428 0000000000000000 Aug 14 10:55:39 hosthelp %l4-7: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 Aug 14 10:55:39 hosthelp genunix: [ID 723222 kern.notice] 000002a100071920 genunix:qcallbwrapper+17c (30001059e08, 8800, 7fff, 30000fe94c8, 30001059e08, 0) Aug 14 10:55:39 hosthelp genunix: [ID 179002 kern.notice] %l0-3: 00000000102553cc 0000000000000016 000000000000000a 000002a10001fd40 Aug 14 10:55:39 hosthelp %l4-7: 0000000000000000 0000000000000007 0000000000000000 000002a10001fa00 Aug 14 10:55:39 hosthelp genunix: [ID 723222 kern.notice] 000002a1000719d0 genunix:callout_execute+90 (bffffffe0466d5e8, 1, 300001b1038, 99afc249, 300001b0038, 0) Aug 14 10:55:39 hosthelp genunix: [ID 179002 kern.notice] %l0-3: 00000000100e587c 8000000000000000 0000000000000004 00000300001b1280 Aug 14 10:55:39 hosthelp %l4-7: 0000000099afc249 00000300001b0000 0000030004888f28 000002a1000719e0 Aug 14 10:55:39 hosthelp genunix: [ID 723222 kern.notice] 000002a100071a80 genunix:taskq_thread+18c (300007a5e80, 0, 10423df8, 10000, 300007a5eb2, 300007a5ed8) Aug 14 10:55:39 hosthelp genunix: [ID 179002 kern.notice] %l0-3: 000000001006fec0 00000300007a5eb0 00000300007a5ea8 00000300007a5e80 Aug 14 10:55:39 hosthelp %l4-7: 00000300007a5ea0 000003000079ffa8 000000001041be18 0000000000000540 Aug 14 10:55:39 hosthelp unix: [ID 100000 kern.notice] Aug 14 10:55:39 hosthelp genunix: [ID 672855 kern.notice] syncing file systems... Aug 14 10:55:40 hosthelp genunix: [ID 904073 kern.notice] done
Aug 22 02:33:06 financial AFSR 0x00000000.00100000 AFAR 0x00000000.8771e5b8
Aug 22 02:33:06 financial AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x10025288
Aug 22 02:33:06 financial UDBL Syndrome 0x19 Memory Module Board 0 J3601
Aug 22 02:33:06 financial SUNW,UltraSPARC-II: [ID 477896 kern.info] [AFT0] errID 0x0015357a.7d7bf561 Corrected Memory Error on Board 0 J3601 is Persistent
Aug 22 02:33:06 financial SUNW,UltraSPARC-II: [ID 695901 kern.info] [AFT0] errID 0x0015357a.7d7bf561 ECC Data Bit 37 was in error and corrected
Dec 10 09:43:56 hostname8 unix: WARNING: [AFT1] Uncorrectable Memory Error on CPU8 Data access at TL=0, errID 0x000b3550.f68ffd85 Dec 10 09:43:56 hostname8 AFSR 0x00000000.00200000AFAR 0x00000002.d2b378c0 Dec 10 09:43:56 hostname8 AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x62809c Dec 10 09:43:56 hostname8 UDBH 0x0203 UDBH.ESYND 0x03 UDBL 0x0000 UDBL.ESYND 0x00 Dec 10 09:43:56 hostname8 UDBH Syndrome 0x3 Memory Module Board 5 J3100 J3200 J3300 J3400 J3500 J3600 J3700 J3800 Dec 10 09:43:56 hostname8 unix: WARNING: [AFT1] errID 0x000b3550.f68ffd85 Syndrome 0x3 indicates that this may not be a memory module problem Dec 10 09:43:56 hostname8 unix: [AFT2] errID 0x000b3550.f68ffd85 PA=0x00000002.d2b378c0 Dec 10 09:43:56 hostname8 E$tag 0x00000000.0e405a56 E$State: Shared E$parity 0x07 Dec 10 09:43:56 hostname8 unix: [AFT2] E$Data (0x00): 0x497f4454.315f436e *Bad* PSYND=0xff00 Dec 10 09:43:56 hostname8 unix: [AFT2] E$Data (0x08): 0x74260778.640b060e Dec 10 09:43:56 hostname8 unix: [AFT2] E$Data (0x10): 0x371cffff.0777c401 Dec 10 09:43:56 hostname8 unix: [AFT2] E$Data (0x18): 0x01010101.07c7c70c Dec 10 09:43:56 hostname8 unix: [AFT2] E$Data (0x20): 0x1e183c3c.018002c1 Dec 10 09:43:56 hostname8 unix: [AFT2] E$Data (0x28): 0x0402c104.02c1042c Dec 10 09:43:56 hostname8 unix: [AFT2] E$Data (0x30): 0x000b05c4.02011542 Dec 10 09:43:56 hostname8 unix: [AFT2] E$Data (0x38): 0x1073434e.5f4d5249 Dec 10 09:43:56 hostname8 unix: WARNING: [AFT1] CP event on CPU17 (caused Data access error on CPU8), errID 0x000b3550.f68ffd85 Dec 10 09:43:56 hostname8 AFSR 0x00000000.01004000 AFAR 0x00000002.d2b378c0 Dec 10 09:43:56 hostname8 AFSR.PSYND 0x4000(Score 95) AFSR.ETS 0x00 Dec 10 09:43:56 hostname8 UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00 Dec 10 09:43:56 hostname8 unix: [AFT2] errID 0x000b3550.f68ffd85 PA=0x00000002.d2b378c0 Dec 10 09:43:56 hostname8 E$tag 0x00000000.0e405a56 E$State: Shared E$parity 0x07 Dec 10 09:43:56 hostname8 unix: [AFT2] E$Data (0x00): 
0x497f4454.315f436e *Bad* PSYND=0x4000 Dec 10 09:43:56 hostname8 unix: [AFT2] E$Data (0x08): 0x74260778.640b060e Dec 10 09:43:56 hostname8 unix: [AFT2] E$Data (0x10): 0x371cffff.0777c401 Dec 10 09:43:56 hostname8 unix: [AFT2] E$Data (0x18): 0x01010101.07c7c70c Dec 10 09:43:56 hostname8 unix: [AFT2] E$Data (0x20): 0x1e183c3c.018002c1 Dec 10 09:43:56 hostname8 unix: [AFT2] E$Data (0x28): 0x0402c104.02c1042c Dec 10 09:43:56 hostname8 unix: [AFT2] E$Data (0x30): 0x000b05c4.02011542 Dec 10 09:43:56 hostname8 unix: [AFT2] E$Data (0x38): 0x1073434e.5f4d5249 Dec 10 09:43:56 hostname8 unix: NOTICE: Scheduling clearing of error on page 0x00000002.d2b36000 Dec 10 09:43:56 hostname8 unix: [AFT3] errID 0x000b3550.f68ffd85 Above Error is in User Mode Dec 10 09:43:56 hostname8 and is fatal: will reboot Dec 10 09:43:56 hostname8 unix: WARNING: [AFT1] initiating reboot due to above error in pid 9459 (oracle) Dec 10 09:48:41 hostname8 unix: NOTICE: Previously reported error on page 0x00000002.d2b36000 cleared Dec 10 09:50:00 hostname8 syslogd: going down on signal 15 Dec 10 09:50:02 hostname8 /usr/sbin/vold[22133]: problem unmounting /vol; Interrupted system call Dec 10 09:50:43 hostname8 unix: syncing file systems...
Jan 14 15:42:52 hostname2 SUNW,UltraSPARC-II: [ID 116611 kern.warning] WARNING: [AFT1] EDP event on CPU0 Data access at TL=0, errID 0x0002141b.1ed7bc07 Jan 14 15:42:52 hostname2 AFSR 0x00000000.80402000AFAR 0x00000000.b4e67900 Jan 14 15:42:52 hostname2 AFSR.PSYND 0x2000(Score 95) AFSR.ETS 0x00 Fault_PC 0x10264548 Jan 14 15:42:52 hostname2 UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00 Jan 14 15:42:52 hostname2 SUNW,UltraSPARC-II: [ID 508604 kern.info] [AFT2] errID 0x0002141b.1ed7bc07 PA=0x00000000.b4e67900 Jan 14 15:42:52 hostname2 E$tag 0x00000000.0bc0169c E$State: Modified E$parity 0x05 Badlines found=3 Jan 14 15:42:52 hostname2 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x00): 0x00000100.00000000 *Bad* PSYND=0x2000 Jan 14 15:42:52 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000000.00000000 Jan 14 15:42:52 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x00000000.00000000 Jan 14 15:42:52 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x00000000.00000000 Jan 14 15:42:52 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000 Jan 14 15:42:52 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x00000000.00000000 Jan 14 15:42:52 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000000.00000000 Jan 14 15:42:52 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x00000000.00000000 Jan 14 15:42:52 hostname2 SUNW,UltraSPARC-II: [ID 490108 kern.info] [AFT2] errID 0x0002141b.1ed7bc07 AFAR was derived from E$Tag Jan 14 15:42:52 hostname2 unix: [ID 836849 kern.notice] Jan 14 15:42:52 hostname2 ^Mpanic[cpu0]/thread=77fc9380: Jan 14 15:42:52 hostname2 unix: [ID 106827 kern.notice] [AFT1] errID 0x0002141b.1ed7bc07 EDP Error(s) Jan 14 15:42:52 hostname2 See previous message(s) for details Jan 14 15:42:52 hostname2 unix: [ID 100000 kern.notice] Jan 14 15:42:52 
hostname2 genunix: [ID 872817 kern.notice] 40267640 SUNW,UltraSPARC-II:cpu_aflt_log+548 (40267830, 402676f3, 40267898, 10, 3, 1013f4a0) Jan 14 15:42:52 hostname2 genunix: [ID 645206 kern.notice] %l0-7: 40267898 1013f47c 10449295 104070c4 030d18b0 00400000 00400000 40267960 Jan 14 15:42:52 hostname2 genunix: [ID 872817 kern.notice] 40267830 SUNW,UltraSPARC-II:cpu_async_error+9e8 (0, 0, 80402000, 1, 0, 402679e8) Jan 14 15:42:52 hostname2 genunix: [ID 645206 kern.notice] %l0-7: 00004208 80402000 00000000 00000000 01801000 40267898 030d38a0 00000032 Jan 14 15:42:53 hostname2 genunix: [ID 872817 kern.notice] 40267988 unix:prom_rtt+0 (0, 0, 0, 100, 10000010, 0) Jan 14 15:42:53 hostname2 genunix: [ID 645206 kern.notice] %l0-7: 00000005 00001c00 00001e04 10136470 40267bf0 00000000 00000000 402679e8 Jan 14 15:42:53 hostname2 genunix: [ID 872817 kern.notice] 40267a78 sockfs:sorecvmsg+150 (1, 0, 0, 0, ffbedde7, 0) Jan 14 15:42:53 hostname2 genunix: [ID 645206 kern.notice] %l0-7: 030da2a8 00000502 00000810 7351c720 7351c6d8 40267b6c 00000000 40267bf0 Jan 14 15:42:53 hostname2 genunix: [ID 872817 kern.notice] 40267b08 sockfs:sock_read+54 (7351c6d8, 40267bf0, 40267bf0, 7068d4a0, 7351c6d8, 77f66000) Jan 14 15:42:53 hostname2 genunix: [ID 645206 kern.notice] %l0-7: 00000000 030d21c0 0303dbb0 00000001 030d18b0 ffbef347 00000000 00000000 Jan 14 15:42:53 hostname2 genunix: [ID 872817 kern.notice] 40267b88 genunix:read+270 (f, 3, 810, 10b05b, 77fdfb08, 7351c6d8) Jan 14 15:42:53 hostname2 genunix: [ID 645206 kern.notice] %l0-7: 1025ecdc 00000810 00000810 00000000 00000000 00000002 77fc9380 40267ce0 Jan 14 15:42:53 hostname2 unix: [ID 100000 kern.notice] Jan 14 15:42:53 hostname2 genunix: [ID 672855 kern.notice] syncing file systems... Jan 14 15:42:54 hostname2 genunix: [ID 904073 kern.notice] done
Jan 15 00:35:50 hostname2 pcipsy: [ID 139652 kern.warning] WARNING: uncorrectable error detected by pci0 (upa mid 1f) during
Jan 15 00:35:50 hostname2 DVMA read transaction
Jan 15 00:35:50 hostname2 pcipsy: [ID 475334 kern.info] Transaction was a block operation.
Jan 15 00:35:50 hostname2 pcipsy: [ID 750218 kern.info] AFSR=40000000.1f800000 AFAR=00000000.b821a6c0,
Jan 15 00:35:50 hostname2 double word offset=0, Memory Module U1001 U1002 U1003 U1004 id 31.
Jan 15 00:35:50 hostname2 SUNW,UltraSPARC-II: [ID 208304 kern.warning] WARNING: [AFT1] AFAR was derived from UE report, CP event on CPU0 (caused access error on IOBUS 31), errID 0x00001d0a.05f58461
Jan 15 00:35:50 hostname2 AFSR 0x00000000.01002000 AFAR 0x00000000.b821a6c0
Jan 15 00:35:50 hostname2 AFSR.PSYND 0x2000(Score 95) AFSR.ETS 0x00
Jan 15 00:35:50 hostname2 UDBH 0x0003 UDBH.ESYND 0x03 UDBL 0x0000 UDBL.ESYND 0x00
Jan 15 00:35:50 hostname2 SUNW,UltraSPARC-II: [ID 781852 kern.info] [AFT2] errID 0x00001d0a.05f58461 PA=0x00000000.b821a6c0
Jan 15 00:35:50 hostname2 E$tag 0x00000000.0dc01704 E$State: Modified E$parity 0x06
Jan 15 00:35:50 hostname2 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x00): 0xe3e08dcd.06685075 *Bad* PSYND=0x2000
Jan 15 00:35:50 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x03b7d377.277d0dfa
Jan 15 00:35:50 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0xdeafab07.afe2d0a9
Jan 15 00:35:50 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0xe16b20b1.b7d1eb47
Jan 15 00:35:50 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0xe81bbea4.ab3f6b59
Jan 15 00:35:50 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x6f6fcafe.baddcafe
Jan 15 00:35:50 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0xbaddcafe.baddcafe
Jan 15 00:35:50 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0xbaddcafe.baddcafe
Jan 15 00:35:50 hostname2 unix: [ID 836849 kern.notice]
Jan 15 00:35:50 hostname2 ^Mpanic[cpu2]/thread=400abe40:
Jan 15 00:35:50 hostname2 unix: [ID 261965 kern.notice] Fatal PCI UE Error
Jan 15 00:35:50 hostname2 unix: [ID 100000 kern.notice]
Jan 15 00:35:50 hostname2 last message repeated 1 time
Jan 15 00:35:50 hostname2 genunix: [ID 672855 kern.notice] syncing file systems...
Jan 15 00:35:51 hostname2 genunix: [ID 904073 kern.notice] done
Jan 15 00:35:52 hostname2 genunix: [ID 353387 kern.notice] dumping to /dev/dsk/c0t0d0s1, offset 65536
Jan 15 00:36:08 hostname2 genunix: [ID 409368 kern.notice] ^M100% done: 24054 pages dumped, compression ratio 3.34,
Jan 15 00:36:08 hostname2 genunix: [ID 851671 kern.notice] dump succeeded
Jan 30 11:11:59 gtmailhost2 SUNW,UltraSPARC-II: [ID 706703 kern.warning] WARNING: [AFT1] EDP event on CPU2 Instruction access at TL=0, errID 0x0012ea28.f9916107
Jan 30 11:11:59 gtmailhost2 AFSR 0x00000000.80400001 AFAR 0x00000000.ffa6bb28
Jan 30 11:11:59 gtmailhost2 AFSR.PSYND 0x0001(Score 95) AFSR.ETS 0x00 Fault_PC 0x1026bb1c
Jan 30 11:11:59 gtmailhost2 UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
Jan 30 11:12:00 gtmailhost2 SUNW,UltraSPARC-II: [ID 819439 kern.info] [AFT2] errID 0x0012ea28.f9916107 PA=0x00000000.ffa6bb28
Jan 30 11:12:00 gtmailhost2 E$tag 0x00000000.08401ff4 E$State: Shared E$parity 0x04
Jan 30 11:12:00 gtmailhost2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0xa2100008.80a48019
Jan 30 11:12:00 gtmailhost2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x1240002a.853a2000
Jan 30 11:12:00 gtmailhost2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0xb007a6ff.9528b003
Jan 30 11:12:00 gtmailhost2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x9010000b.7ff84718
Jan 30 11:12:00 gtmailhost2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x9207a6ff.80900008
Jan 30 11:12:00 gtmailhost2 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x28): 0x02400007.9007a57f *Bad* PSYND=0x0001
Jan 30 11:12:00 gtmailhost2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x7ff8182d.9010200e
Jan 30 11:12:00 gtmailhost2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x91322000.81c7e008
Jan 30 11:12:00 gtmailhost2 SUNW,UltraSPARC-II: [ID 402937 kern.info] [AFT2] errID 0x0012ea28.f9916107 AFAR was derived from E$Tag
Jan 30 11:12:00 gtmailhost2 unix: [ID 836849 kern.notice]
Jan 30 11:12:00 gtmailhost2 ^Mpanic[cpu2]/thread=300022ef3e0:
Jan 30 11:12:00 gtmailhost2 unix: [ID 500536 kern.notice] [AFT1] errID 0x0012ea28.f9916107 EDP Error(s)
Jan 30 11:12:00 gtmailhost2 See previous message(s) for details
Jan 30 11:12:00 gtmailhost2 unix: [ID 100000 kern.notice]
Jan 30 11:12:00 gtmailhost2 genunix: [ID 723222 kern.notice] 000002a10053d290 SUNW,UltraSPARC-II:cpu_aflt_log+568 (2a10053d34e, 1, 10154068, 2a10053d4d8, 2a10053d39b, 10154090)
Jan 30 11:12:00 gtmailhost2 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 0000000000000003 000002a10053d5a0 0000000000000010
Jan 30 11:12:00 gtmailhost2 %l4-7: 0000000000400000 0000000000400000 0000000000000000 0000000000000000
Jan 30 11:12:00 gtmailhost2 genunix: [ID 723222 kern.notice] 000002a10053d4e0 SUNW,UltraSPARC-II:cpu_async_error+868 (1, 2a10053d5a0, 80400001, 0, 140000080400001, 2a10053d760)
Jan 30 11:12:00 gtmailhost2 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000001 000000000000000a 0000000000000000 0000000000000000
Jan 30 11:12:00 gtmailhost2 %l4-7: 0000000014000208 0000000000000000 0000000000000000 0000000000000000
Jan 30 11:12:00 gtmailhost2 genunix: [ID 723222 kern.notice] 000002a10053d6b0 unix:prom_rtt+0 (ff0d1b70, 0, 8, ff0d1b70, 0, 0)
Jan 30 11:12:00 gtmailhost2 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000002 0000000000001400 0000004400001601 000000001014b678
Jan 30 11:12:00 gtmailhost2 %l4-7: 0000000000000000 0000000000000000 0000000000000000 000002a10053d760
Jan 30 11:12:00 gtmailhost2 genunix: [ID 723222 kern.notice] 000002a10053d800 sockfs:recvmsg+b0 (2a10053d9f0, 100000, 100000, 14, 0, ff0d1b58)
Jan 30 11:12:01 gtmailhost2 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 0000000000000001 0000000000100000 0000000000000000
Jan 30 11:12:01 gtmailhost2 %l4-7: 00000000ff0d09d8 0000000010411618 000000000016f800 00000000ff26cd38
Jan 30 11:12:01 gtmailhost2 unix: [ID 100000 kern.notice]
Jan 30 11:12:01 gtmailhost2 genunix: [ID 672855 kern.notice] syncing file systems...
Nov 12 06:12:21 hostname2 SUNW,UltraSPARC-II: [ID 332087 kern.warning] WARNING: [AFT1] EDP event on CPU0 Data access at TL=0, errID 0x0000b7ff.3abd1f67
Nov 12 06:12:21 hostname2 AFSR 0x00000000.80400008 AFAR 0x00000000.bb582008
Nov 12 06:12:21 hostname2 AFSR.PSYND 0x0008(Score 95) AFSR.ETS 0x00 Fault_PC 0x10033a08
Nov 12 06:12:21 hostname2 UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
Nov 12 06:12:21 hostname2 SUNW,UltraSPARC-II: [ID 258533 kern.info] [AFT2] errID 0x0000b7ff.3abd1f67 PA=0x00000000.bb582008
Nov 12 06:12:21 hostname2 E$tag 0x00000000.0dc0176b E$State: Modified E$parity 0x06
Nov 12 06:12:21 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x40033e40.00000000
Nov 12 06:12:21 hostname2 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x08): 0x70cb65c8.1134972b *Bad* PSYND=0x0008
Nov 12 06:12:21 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x0134972c.70687f20
Nov 12 06:12:21 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x3e6079d8.7fff9428
Nov 12 06:12:21 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
Nov 12 06:12:21 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x00000000.00000000
Nov 12 06:12:21 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000000.00000000
Nov 12 06:12:21 hostname2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x00000000.00000000
Nov 12 06:12:21 hostname2 SUNW,UltraSPARC-II: [ID 673557 kern.info] [AFT2] errID 0x0000b7ff.3abd1f67 AFAR was derived from E$Tag
Nov 12 06:12:21 hostname2 unix: [ID 836849 kern.notice]
Nov 12 06:12:21 hostname2 ^Mpanic[cpu0]/thread=40033e40:
Nov 12 06:12:21 hostname2 unix: [ID 303753 kern.notice] [AFT1] errID 0x0000b7ff.3abd1f67 EDP Error(s)
Nov 12 06:12:21 hostname2 See previous message(s) for details
Nov 12 06:12:21 hostname2 unix: [ID 100000 kern.notice]
Nov 12 06:12:21 hostname2 genunix: [ID 872817 kern.notice] 40033668 SUNW,UltraSPARC-II:cpu_aflt_log+548 (40033858, 4003371b, 400338c0, 10, 3, 1013f4a0)
Nov 12 06:12:21 hostname2 genunix: [ID 645206 kern.notice] %l0-7: 400338c0 1013f47c 10449295 104070c4 71bff8f8 00400000 00400000 40033988
Nov 12 06:12:21 hostname2 genunix: [ID 872817 kern.notice] 40033858 SUNW,UltraSPARC-II:cpu_async_error+9e8 (0, 0, 80400008, 1, 0, 40033a10)
Nov 12 06:12:21 hostname2 genunix: [ID 645206 kern.notice] %l0-7: 00000219 80400008 00000000 00000000 bf3a4b30 400338c0 702090a8 00000032
Nov 12 06:12:21 hostname2 genunix: [ID 872817 kern.notice] 400339b0 unix:prom_rtt+0 (7067e000, 0, 80000000, 134972d, 0, 1042cc38)
Nov 12 06:12:21 hostname2 genunix: [ID 645206 kern.notice] %l0-7: 00000003 00001c00 00001e02 10136470 10411228 0000001e 0000000a 40033a10
Nov 12 06:12:21 hostname2 genunix: [ID 872817 kern.notice] 40033aa0 genunix:callout_schedule_1+8 (7067e000, 40033e40, 7067e000, 3b9aca00, 1, 0)
Nov 12 06:12:22 hostname2 genunix: [ID 645206 kern.notice] %l0-7: 00000004 00000002 00000001 10411518 10411228 0000001e 10411710 40033a90
Nov 12 06:12:22 hostname2 genunix: [ID 872817 kern.notice] 40033b00 genunix:callout_schedule+4c (10426ecc, 1, 10426e88, 8, 1, 0)
Nov 12 06:12:22 hostname2 genunix: [ID 645206 kern.notice] %l0-7: 10406c00 00000000 0134972c fffded80 00000000 70062000 10411710 ff21f120
Nov 12 06:12:22 hostname2 genunix: [ID 872817 kern.notice] 40033b60 genunix:clock+488 (1044a45c, 10420400, 0, 0, 0, 0)
Nov 12 06:12:22 hostname2 genunix: [ID 645206 kern.notice] %l0-7: 000052cc 00000001 104496f0 00000000 10411228 0000001e 10411710 70093ba8
Nov 12 06:12:22 hostname2 genunix: [ID 872817 kern.notice] 40033bd8 genunix:cyclic_softint+7c (10411228, 70093c44, 3, 7006c870, 1006f280, 70093c48)
Nov 12 06:12:22 hostname2 genunix: [ID 645206 kern.notice] %l0-7: 01349726 00000000 7006c878 70093c08 70093b88 7006c860 00000001 70093ba8
Nov 12 06:12:22 hostname2 genunix: [ID 872817 kern.notice] 40033c40 unix:cbe_level10+8 (0, 803, 10411228, 40033e40, 8030, 1000aff4)
Nov 12 06:12:22 hostname2 genunix: [ID 645206 kern.notice] %l0-7: 00001e05 00000001 00000001 1000966c 403abc80 00000000 00000000 4001fb50
Nov 12 06:12:22 hostname2 unix: [ID 100000 kern.notice]
Nov 12 06:12:22 hostname2 genunix: [ID 672855 kern.notice] syncing file systems...
Nov 12 06:12:23 hostname2 genunix: [ID 904073 kern.notice] done
Nov 12 15:43:35 hostname6 unix: [ID 596940 kern.warning] WARNING: [AFT0] 2375 soft errors in less than 24:00 (hh:mm) detected from Memory Module Board 0 J3500
Nov 12 15:43:35 hostname6 SUNW,UltraSPARC-II: [ID 509127 kern.info] [AFT0] errID 0x00248963.2ba6dbbd Corrected Memory Error on Board 0 J3500 is Persistent
Nov 12 15:43:35 hostname6 SUNW,UltraSPARC-II: [ID 261526 kern.info] [AFT0] errID 0x00248963.2ba6dbbd ECC Data Bit 26 was in error and corrected
Nov 12 15:46:24 hostname6 SUNW,UltraSPARC-II: [ID 621828 kern.info] [AFT0] Corrected Memory Error detected by CPU9, errID 0x0024898a.aac0cdfc
Nov 12 15:46:24 hostname6 AFSR 0x00000000.00100000 AFAR 0x00000000.8cd62038
Nov 12 15:46:24 hostname6 AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x24df4c
Nov 12 15:46:24 hostname6 UDBL Syndrome 0xb9 Memory Module Board 0 J3500
Nov 12 15:46:24 hostname6 unix: [ID 596940 kern.warning] WARNING: [AFT0] 2376 soft errors in less than 24:00 (hh:mm) detected from Memory Module Board 0 J3500
Nov 12 15:46:24 hostname6 unix: [ID 618185 kern.notice] NOTICE: Scheduling removal of page 0x00000000.8cd62000
Nov 12 15:46:24 hostname6 SUNW,UltraSPARC-II: [ID 894452 kern.info] [AFT0] errID 0x0024898a.aac0cdfc Corrected Memory Error on Board 0 J3500 is Persistent
Nov 12 15:46:24 hostname6 SUNW,UltraSPARC-II: [ID 904179 kern.info] [AFT0] errID 0x0024898a.aac0cdfc ECC Data Bit 26 was in error and corrected
Nov 12 15:46:24 hostname6 SUNW,UltraSPARC-II: [ID 175402 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU9 Data access at TL>0, errID 0x0024898a.ad930d7a
Nov 12 15:46:24 hostname6 AFSR 0x00000000.00200000 AFAR 0x00000000.8cc68038
Nov 12 15:46:24 hostname6 AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x24df4c
Nov 12 15:46:24 hostname6 UDBH 0x0083 UDBH.ESYND 0x83 UDBL 0x02ed UDBL.ESYND 0xed
Nov 12 15:46:24 hostname6 UDBL Syndrome 0xed Memory Module Board 0 J3100 J3200 J3300 J3400 J3500 J3600 J3700 J3800
Nov 12 15:46:25 hostname6 SUNW,UltraSPARC-II: [ID 635909 kern.info] [AFT2] errID 0x0024898a.ad930d7a PA=0x00000000.8cc68038
Nov 12 15:46:25 hostname6 E$tag 0x00000000.1cc01198 E$State: Exclusive E$parity 0x0e
Nov 12 15:46:25 hostname6 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x06020000.7440963d
Nov 12 15:46:25 hostname6 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x5fd04cde.00000106
Nov 12 15:46:25 hostname6 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x4ed20000.01000000
Nov 12 15:46:25 hostname6 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x0045a89d.5fd04cdd
Nov 12 15:46:25 hostname6 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x0000fa78.00023200
Nov 12 15:46:25 hostname6 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x74409605.00050031
Nov 12 15:46:25 hostname6 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00045a9b.0080cc8b
Nov 12 15:46:25 hostname6 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x38): 0x04365800.2c2a0000 *Bad* PSYND=0x00ff
Nov 12 15:46:25 hostname6 unix: [ID 836849 kern.notice]
Nov 12 15:46:25 hostname6 ^Mpanic[cpu9]/thread=30018c902c0:
Nov 12 15:46:25 hostname6 unix: [ID 766774 kern.notice] [AFT1] errID 0x0024898a.ad930d7a UE Error(s)
Nov 12 15:46:25 hostname6 See previous message(s) for details
Nov 12 15:46:25 hostname6 unix: [ID 100000 kern.notice]
Nov 12 15:46:25 hostname6 genunix: [ID 723222 kern.notice] 000002a1006656d0 SUNW,UltraSPARC-II:cpu_aflt_log+568 (2a10066578e, 1, 10151d68, 2a100665918, 2a1006657db, 10151d90)
Nov 12 15:46:25 hostname6 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 0000000000000003 000002a1006659e0 0000000000000010
Nov 12 15:46:25 hostname6 %l4-7: 0000000000000000 0000000000800000 000000000075ff40 0000000000000000
Nov 12 15:46:25 hostname6 genunix: [ID 723222 kern.notice] 000002a100665920 SUNW,UltraSPARC-II:cpu_async_error+868 (1046a470, 2a1006659e0, 200000, 0, 4657690600200000, 2a100665ba0)
Nov 12 15:46:25 hostname6 genunix: [ID 179002 kern.notice] %l0-3: 000000001040db3c 0000000000000032 00000000000002ed 0000000000000083
Nov 12 15:46:25 hostname6 %l4-7: 000000008cc68000 0000000000800000 0000000000800000 0000000000000001
Nov 12 15:46:25 hostname6 unix: [ID 100000 kern.notice]
Nov 12 15:46:25 hostname6 genunix: [ID 672855 kern.notice] syncing file systems...
Nov 12 15:46:26 hostname6 genunix: [ID 904073 kern.notice] done
Fri May 25 09:55:05 EDT 2007
May 25 09:40:48 hostname12 SUNW,UltraSPARC-II: [ID 574036 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU3 Data access at TL=0, errID 0x00035245.72bad7c8
May 25 09:40:48 hostname12 AFSR 0x00000000.00200000 AFAR 0x00000000.6c25db68
May 25 09:40:48 hostname12 AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0xfe890de8
May 25 09:40:48 hostname12 UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0203 UDBL.ESYND 0x03
May 25 09:40:48 hostname12 UDBL Syndrome 0x3 Memory Module U1402 U0402 U1401 U0401
May 25 09:40:48 hostname12 SUNW,UltraSPARC-II: [ID 510931 kern.warning] WARNING: [AFT1] errID 0x00035245.72bad7c8 Syndrome 0x3 indicates that this may not be a memory module problem
May 25 09:40:48 hostname12 SUNW,UltraSPARC-II: [ID 215430 kern.info] [AFT2] errID 0x00035245.72bad7c8 PA=0x00000000.6c25db68
May 25 09:40:48 hostname12 E$tag 0x00000000.1ac00d84 E$State: Exclusive E$parity 0x0d
May 25 09:40:48 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0xfd58d63f.1abc7a99
May 25 09:40:48 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0xaa3d582d.b45bca43
May 25 09:40:48 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x82acd9a6.2692f4b1
May 25 09:40:48 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x36deab7d.ac29caf6
May 25 09:40:48 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x635bcad5.1b7f0427
May 25 09:40:48 hostname12 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x28): 0x9ac7482f.a2ca13e4 *Bad* PSYND=0x00ff
May 25 09:40:48 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x4b04d78c.9ad1421c
May 25 09:40:48 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x495cfdc9.bd42232c
May 25 09:40:48 hostname12 unix: [ID 321153 kern.notice] NOTICE: Scheduling clearing of error on page 0x00000000.6c25c000
May 25 09:40:48 hostname12 SUNW,UltraSPARC-II: [ID 252112 kern.info] [AFT3] errID 0x00035245.72bad7c8 Above Error is in User Mode
May 25 09:40:48 hostname12 and is fatal: will reboot
May 25 09:40:48 hostname12 unix: [ID 855177 kern.warning] WARNING: [AFT1] initiating reboot due to above error in pid 436 (dbsnmp)
May 25 09:40:52 hostname12 unix: [ID 221039 kern.notice] NOTICE: Previously reported error on page 0x00000000.6c25c000 cleared
Mar 13 01:54:50 ldap3 SUNW,UltraSPARC-II: [ID 430098 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU2 Data access at TL=0, errID 0x0073357d.48356a80
Mar 13 01:54:50 ldap3 AFSR 0x00000000.80200000 AFAR 0x00000000.ddec6b28
Mar 13 01:54:50 ldap3 AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x10034b44
Mar 13 01:54:50 ldap3 UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0203 UDBL.ESYND 0x03
Mar 13 01:54:50 ldap3 UDBL Syndrome 0x3 Memory Module U1302 U0302 U1301 U0301
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 306615 kern.warning] WARNING: [AFT1] errID 0x0073357d.48356a80 Syndrome 0x3 indicates that this may not be a memory module problem
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 456232 kern.info] [AFT2] errID 0x0073357d.48356a80 PA=0x00000000.ddec6b28
Mar 13 01:54:51 ldap3 E$tag 0x00000000.0fc01bbd E$State: Modified E$parity 0x07
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000300.02a9eb64
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000300.02a9eb78
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x00000300.02a9eb64
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0xf83e0000.00000000
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x28): 0x00000001.00000000 *Bad* PSYND=0x00ff
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000000.ffffffff
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x0000000c.00000010
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 591976 kern.warning] WARNING: [AFT1] AFAR was derived from UE report, CP event on CPU3 (caused Data access error on CPU2), errID 0x0073357d.48356a80
Mar 13 01:54:51 ldap3 AFSR 0x00000000.01000010 AFAR 0x00000000.ddec6b28
Mar 13 01:54:51 ldap3 AFSR.PSYND 0x0010(Score 95) AFSR.ETS 0x00
Mar 13 01:54:51 ldap3 UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 456232 kern.info] [AFT2] errID 0x0073357d.48356a80 PA=0x00000000.ddec6b28
Mar 13 01:54:51 ldap3 E$tag 0x00000000.1f801bbd E$State: Invalid E$parity 0x0f
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000300.02a9eb79
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000300.02a9ec16
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x00000300.02a9eb79
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x57a60000.00000000
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x28): 0x00000001.00000000 *Bad* PSYND=0x0010
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000000.ffffffff
Mar 13 01:54:51 ldap3 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x0000000c.00000010
Mar 13 01:54:51 ldap3 unix: [ID 836849 kern.notice]
Mar 13 01:54:51 ldap3 ^Mpanic[cpu2]/thread=2a1001cbd20:
Mar 13 01:54:51 ldap3 unix: [ID 917101 kern.notice] [AFT1] errID 0x0073357d.48356a80 UE Error(s)
Mar 13 01:54:51 ldap3 See previous message(s) for details
Mar 13 01:54:51 ldap3 unix: [ID 100000 kern.notice]
Mar 13 01:54:51 ldap3 genunix: [ID 723222 kern.notice] 000002a1001cb100 SUNW,UltraSPARC-II:cpu_aflt_log+568 (2a1001cb1be, 1, 10154068, 2a1001cb348, 2a1001cb20b, 10154090)
Mar 13 01:54:51 ldap3 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 0000000000000003 000002a1001cb410 0000000000000010
Mar 13 01:54:51 ldap3 %l4-7: 00000000fd3d1d98 0000000000000000 0000000000000000 0000000000000000
Mar 13 01:54:52 ldap3 genunix: [ID 723222 kern.notice] 000002a1001cb350 SUNW,UltraSPARC-II:cpu_async_error+868 (1046b130, 2a1001cb410, 80200000, 0, 650180080200000, 2a1001cb5d0)
Mar 13 01:54:52 ldap3 genunix: [ID 179002 kern.notice] %l0-3: 0000000010475b30 0000000000000032 0000000000000203 0000000000000000
Mar 13 01:54:52 ldap3 %l4-7: 00000000ddec6b00 0000000000400000 0000000000400000 0000000000000001
Mar 13 01:54:52 ldap3 genunix: [ID 723222 kern.notice] 000002a1001cb520 unix:prom_rtt+0 (3000191fc28, 2a1001cbd20, 30002a9eb64, 2f83c, 30002a9eac0, 8)
Mar 13 01:54:52 ldap3 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000003 0000000000001400 0000004400001602 000000001014b678
Mar 13 01:54:52 ldap3 %l4-7: 00000000fd3d19b8 000002a100c8baf0 0000000000000006 000002a1001cb5d0
Mar 13 01:54:52 ldap3 genunix: [ID 723222 kern.notice] 000002a1001cb670 ip:ip_rput_local+800 (0, 80e30185, 3000191fc28, 0, 3000517e300, 30001d64bf8)
Mar 13 01:54:52 ldap3 genunix: [ID 179002 kern.notice] %l0-3: 000003000191fc20 0000030002a9eb50 0000030001d61688 0000030000156628
Mar 13 01:54:52 ldap3 %l4-7: 000003000517e300 00000000d8e7a32f 0000000000000000 000000000000ffff
Mar 13 01:54:52 ldap3 genunix: [ID 723222 kern.notice] 000002a1001cb760 ip:ip_rput+12c4 (6, 30000156628, 30001d64bf8, 30002a9eb50, 30001d61688, 3000517e300)
Mar 13 01:54:52 ldap3 genunix: [ID 179002 kern.notice] %l0-3: 0000000000061281 0000000000000000 0000000000000000 0000000000000028
Mar 13 01:54:52 ldap3 %l4-7: 0000000000000028 0000000000000001 0000000000000001 9000000000000012
Mar 13 01:54:53 ldap3 genunix: [ID 723222 kern.notice] 000002a1001cb830 unix:putnext+1cc (30001b55ec0, 30001b5f160, 30001d64bf8, 3000517e300, 30001b55ec8, 30001b55ec0)
Mar 13 01:54:53 ldap3 genunix: [ID 179002 kern.notice] %l0-3: 0000030001d64bf8 0000030001b5dea8 0000030001d65578 0000000000000000
Mar 13 01:54:53 ldap3 %l4-7: 00000000101a0f00 0000000000000000 0000000000000000 0000000000000000
Mar 13 01:54:53 ldap3 genunix: [ID 723222 kern.notice] 000002a1001cb8e0 hme:hmeread+33c (0, 30001d65578, 30001d5d4d0, 30001d5d498, ae, 370)
Mar 13 01:54:53 ldap3 genunix: [ID 179002 kern.notice] %l0-3: 0000030001d5d4a0 0000030001d5cc40 0000000000001400 0000030001d5c000
Mar 13 01:54:53 ldap3 %l4-7: 000000000000003c 000003000517e300 000003000034a570 0000000000001498
Mar 13 01:54:53 ldap3 genunix: [ID 723222 kern.notice] 000002a1001cb9b0 hme:hmeintr+374 (80000000, 1498, 0, 3010101, 14c0, 14a0)
Mar 13 01:54:53 ldap3 genunix: [ID 179002 kern.notice] %l0-3: 000003000034a370 0000000000001400 0000030001d5c000 0000030001d5c3e8
Mar 13 01:54:53 ldap3 %l4-7: 0000000000000001 0000030001d5c000 0000000000000000 0000000000000000
Mar 13 01:54:53 ldap3 genunix: [ID 723222 kern.notice] 000002a1001cba60 pcipsy:pci_intr_wrapper+80 (104a2028, 104a2060, 3000013f550, 30000eb5e48, 3000016b748, 0)
Mar 13 01:54:54 ldap3 genunix: [ID 179002 kern.notice] %l0-3: 0000000010345d7c 0000000000000000 00000300000a0578 0000000000172688
Mar 13 01:54:54 ldap3 %l4-7: 00000000fd3d1d98 0000000000000000 0000000000000000 0000000000000000
Mar 13 01:54:54 ldap3 unix: [ID 100000 kern.notice]
Mar 13 01:54:54 ldap3 genunix: [ID 672855 kern.notice] syncing file systems...
Mar 13 01:54:54 ldap3 genunix: [ID 733762 kern.notice] 10
Mar 13 01:54:55 ldap3 genunix: [ID 733762 kern.notice] 9
Mar 13 01:55:22 ldap3 last message repeated 20 times
Mar 13 01:55:23 ldap3 genunix: [ID 622722 kern.notice] done (not all i/o completed)
Jul 21 18:31:05 coast-mail2 SUNW,UltraSPARC-II: [ID 614794 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU1 Data access at TL=0, errID 0x001cf75c.56a368d4
Jul 21 18:31:05 coast-mail2 AFSR 0x00000000.80200000 AFAR 0x00000000.f4b08478
Jul 21 18:31:05 coast-mail2 AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1007169c
Jul 21 18:31:05 coast-mail2 UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0203 UDBL.ESYND 0x03
Jul 21 18:31:05 coast-mail2 UDBL Syndrome 0x3 Memory Module U1402 U0402 U1401 U0401
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 301971 kern.warning] WARNING: [AFT1] errID 0x001cf75c.56a368d4 Syndrome 0x3 indicates that this may not be a memory module problem
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 938084 kern.info] [AFT2] errID 0x001cf75c.56a368d4 PA=0x00000000.f4b08478
Jul 21 18:31:06 coast-mail2 E$tag 0x00000000.0fc01e96 E$State: Modified E$parity 0x07
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000300.01f528a0
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000000.00000000
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x00000000.00000000
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x00000000.00000000
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x00000000.00000000
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000000.00000000
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x38): 0x00010000.00000000 *Bad* PSYND=0x00ff
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 351707 kern.warning] WARNING: [AFT1] AFAR was derived from UE report, CP event on CPU0 (caused Data access error on CPU1), errID 0x001cf75c.56a368d4
Jul 21 18:31:06 coast-mail2 AFSR 0x00000000.01000040 AFAR 0x00000000.f4b08478
Jul 21 18:31:06 coast-mail2 AFSR.PSYND 0x0040(Score 95) AFSR.ETS 0x00
Jul 21 18:31:06 coast-mail2 UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 938084 kern.info] [AFT2] errID 0x001cf75c.56a368d4 PA=0x00000000.f4b08478
Jul 21 18:31:06 coast-mail2 E$tag 0x00000000.1f801e96 E$State: Invalid E$parity 0x0f
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x00000000.00000000
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000000.00000000
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x00000000.00000000
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x00000000.00000000
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x00000000.00000000
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x00000000.00000000
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000000.00000000
Jul 21 18:31:06 coast-mail2 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x38): 0x00010000.00000000 *Bad* PSYND=0x0040
Jul 21 18:31:06 coast-mail2 unix: [ID 836849 kern.notice]
Jul 21 18:31:06 coast-mail2 ^Mpanic[cpu1]/thread=30001f528a0:
Jul 21 18:31:06 coast-mail2 unix: [ID 224311 kern.notice] [AFT1] errID 0x001cf75c.56a368d4 UE Error(s)
Jul 21 18:31:06 coast-mail2 See previous message(s) for details
Jul 21 18:31:06 coast-mail2 unix: [ID 100000 kern.notice]
Jul 21 18:31:06 coast-mail2 genunix: [ID 723222 kern.notice] 000002a10048b4b0 SUNW,UltraSPARC-II:cpu_aflt_log+568 (2a10048b56e, 1, 10154068, 2a10048b6f8, 2a10048b5bb, 10154090)
Jul 21 18:31:06 coast-mail2 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 0000000000000003 000002a10048b7c0 0000000000000010
Jul 21 18:31:06 coast-mail2 %l4-7: 00000000001788c0 00000000ff041c28 000000000015a000 0000000000159400
Jul 21 18:31:07 coast-mail2 genunix: [ID 723222 kern.notice] 000002a10048b700 SUNW,UltraSPARC-II:cpu_async_error+868 (1046b130, 2a10048b7c0, 80200000, 0, 650180080200000, 2a10048b980)
Jul 21 18:31:07 coast-mail2 genunix: [ID 179002 kern.notice] %l0-3: 0000000010475b30 0000000000000032 0000000000000203 0000000000000000
Jul 21 18:31:07 coast-mail2 %l4-7: 00000000f4b08440 0000000000400000 0000000000400000 0000000000000001
Jul 21 18:31:07 coast-mail2 genunix: [ID 723222 kern.notice] 000002a10048b8d0 unix:prom_rtt+0 (30000efc440, 0, 2, 2a10048bae8, 16, 30001c8fc08)
Jul 21 18:31:07 coast-mail2 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000003 0000000000001400 0000000080001602 000000001014b678
Jul 21 18:31:07 coast-mail2 %l4-7: 0000000000000001 0000000000020000 0000000000000000 000002a10048b980
Jul 21 18:31:07 coast-mail2 genunix: [ID 723222 kern.notice] 000002a10048ba20 genunix:post_syscall+304 (30001f528a0, 93, 1, 1, 0, 0)
Jul 21 18:31:07 coast-mail2 genunix: [ID 179002 kern.notice] %l0-3: 0000000000000000 000002a10048bba0 0000030001fe74c0 0000000000000000
Jul 21 18:31:07 coast-mail2 %l4-7: 0000000000000000 0000030001fbcac0 0000000000000000 00000000ffffffff
Jul 21 18:31:07 coast-mail2 unix: [ID 100000 kern.notice]
Jul 21 18:31:07 coast-mail2 genunix: [ID 672855 kern.notice] syncing file systems...
Jul 21 18:31:09 coast-mail2 genunix: [ID 733762 kern.notice] 52
Jul 21 18:31:10 coast-mail2 genunix: [ID 733762 kern.notice] 42
Jul 21 18:31:11 coast-mail2 genunix: [ID 733762 kern.notice] 37
Jul 21 18:31:38 coast-mail2 last message repeated 20 times
Jul 21 18:31:39 coast-mail2 genunix: [ID 622722 kern.notice] done (not all i/o completed)
Jul 21 18:31:40 coast-mail2 genunix: [ID 353387 kern.notice] dumping to /dev/dsk/c0t0d0s1, offset 859373568
Jul 21 18:33:42 quickbeam genunix: [ID 409368 kern.notice] ^M100% done: 90404 pages dumped, compression ratio 3.05,
Jul 21 18:33:42 quickbeam genunix: [ID 851671 kern.notice] dump succeeded
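If you have a pile of these dumps and just want to see which E-cache word got flagged in each one, a few lines of Python will pull the `*Bad*` entries out of the noise. This is only a rough sketch: the function name `find_bad_words` and the regular expression are my own, written against the line layout seen in the dumps above, and I am assuming other Solaris releases print the same layout. It matches on text alone and makes no attempt to interpret the syndrome value.

```python
import re

# Pattern for the flagged cache word as printed in the [AFT2] lines above:
#   E$Data (0x28): 0x9ac7482f.a2ca13e4 *Bad* PSYND=0x00ff
# Assumption: lowercase hex digits and this exact spacing, as in these dumps.
BAD_WORD = re.compile(
    r"E\$Data \((0x[0-9a-f]{2})\): "
    r"(0x[0-9a-f]{8}\.[0-9a-f]{8}) "
    r"\*Bad\* PSYND=(0x[0-9a-f]{4})"
)


def find_bad_words(log_text):
    """Return a list of (offset, data, psynd) strings, one per flagged word."""
    return BAD_WORD.findall(log_text)


# Two lines copied from the May 25 dump above; only the second is flagged.
sample = (
    "May 25 09:40:48 hostname12 SUNW,UltraSPARC-II: [ID 359263 kern.info] "
    "[AFT2] E$Data (0x20): 0x635bcad5.1b7f0427\n"
    "May 25 09:40:48 hostname12 SUNW,UltraSPARC-II: [ID 989652 kern.info] "
    "[AFT2] E$Data (0x28): 0x9ac7482f.a2ca13e4 *Bad* PSYND=0x00ff\n"
)

for offset, data, psynd in find_bad_words(sample):
    print(offset, data, psynd)
```

Because the pattern does not depend on line breaks, it works the same whether the log is one message per line or has been mashed into a single run of text.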