Sun Fire X4140, X4240, and X4440 Servers Diagnostics Guide
Sun Microsystems, Inc.
August 2008, Revision A
3. Troubleshooting DIMM Problems
Contents
2. Using SunVTS Diagnostic Software
Preface
Error Handling
Status Indicator LEDs
Using the ILOM Service Processor GUI to View System Information
A. Event Logs and POST Codes
Handling of Parity Errors (PERR)
Handling of Uncorrectable Errors
Handling of Correctable Errors
Handling of System Errors (SERR)
vi Sun Fire X4140, X4240, and X4440 Servers Diagnostics Guide August
Preface
Before You Read This Document
Related Documentation
http://docs.sun.com
Typographic Conventions
Third-Party Web Sites
Sun Welcomes Your Comments
“Gathering Service Information” on page
“System Inspection” on page
Service Troubleshooting Flowchart
“Service Troubleshooting Flowchart” on page
Initial Inspection of the Server
Gathering Service Information
1. Collect information about the following items
2. Document the server settings before you make any changes
4. Check for potential device conflicts before you add a new device
Troubleshooting Power Problems
System Inspection
Externally Inspecting the Server
Internally Inspecting the Server
2. Remove the server cover
Using SunVTS Diagnostic Software
Running SunVTS Diagnostic Tests
Diagnosing Server Problems With the Bootable Diagnostics CD
SunVTS Documentation
Requirements
Using the Bootable Diagnostics CD
a. Click the Log button
c. With the three lower buttons you can perform the following actions
Close the Log file window - The window is closed
Troubleshooting DIMM Problems
“DIMM Population Rules” on page
“DIMM Replacement Policy” on page
“How DIMM Errors Are Handled by the System” on page
“Isolating and Correcting DIMM ECC Errors” on page
DIMM Replacement Policy
How DIMM Errors Are Handled by the System
Uncorrectable DIMM Errors
# ipmitool -H 10.6.77.249 -U root -P changeme -I lanplus sel list
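The listing produced by the command above can be long, so it may help to filter it for memory-related entries. The following is a minimal sketch in Python, assuming the SEL listing is pipe-delimited text with the sensor description in the fourth field. The sample records, the keyword choice ("memory", "ecc"), and the function names are hypothetical illustrations, not output captured from these servers.

```python
def parse_sel_line(line):
    """Split one pipe-delimited SEL line into stripped fields."""
    return [field.strip() for field in line.split("|")]


def memory_events(sel_text):
    """Return the SEL records whose description fields mention memory or ECC.

    Assumption: fields 0-2 are record ID, date, and time; the remaining
    fields describe the sensor and the event.
    """
    events = []
    for line in sel_text.splitlines():
        fields = parse_sel_line(line)
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        description = " ".join(fields[3:]).lower()
        if "memory" in description or "ecc" in description:
            events.append(fields)
    return events


# Hypothetical sample text, for illustration only.
SAMPLE = """\
   1 | 08/22/2008 | 10:04:12 | Memory | Uncorrectable ECC | Asserted
   2 | 08/22/2008 | 10:05:40 | Fan #0x30 | Lower Critical going low
"""

if __name__ == "__main__":
    for record in memory_events(SAMPLE):
        print(" | ".join(record))
```

In practice the text passed to `memory_events` would be the saved output of the `sel list` command rather than the sample string.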
Correctable DIMM Errors
BIOS DIMM Error Messages
DIMM Fault LEDs
DIMM fault LED is off - The DIMM is operating properly
Chapter 3 Troubleshooting DIMM Problems
FIGURE 3-1 DIMMs and LEDs on Motherboard
Isolating and Correcting DIMM ECC Errors
4. Disconnect the AC power cords from the server
8. Dust off the DIMMs, clean the contacts, and reseat them
9. If there is no obvious damage, replace any failed DIMMs
10. Reconnect AC power cords to the server
11. Power on the server and run the diagnostics test again
12. Review the log file
Event Logs and POST Codes
“Viewing Event Logs” on page
“Power-On Self-Test (POST)” on page
Viewing Event Logs
Main Advanced PCIPnP Boot Security Chipset Exit
The Advanced Menu Event Logging Details screen is displayed
Appendix A Event Logs and POST Codes
Power-On Self-Test (POST)
How BIOS POST Memory Testing Works
Redirecting Console Output
11. Click the Start Redirection button
Changing POST Options
3. Select Boot Settings Configuration
POST Codes
POST Code Checkpoints
Initializes NUM-LOCK status and programs the KBD typematic rate
Status Indicator LEDs
External Status Indicator LEDs
Front Panel LEDs
Back Panel LEDs
Rear PS LED Amber Power supply fault
Internal Status Indicator LEDs
Hard Drive LEDs
FIGURE B-4 DIMMs and LEDs on Motherboard
Appendix B Status Indicator LEDs
FIGURE B-5 DIMMs and LEDs on Mezzanine Board
Using the ILOM Service Processor GUI to View System Information
“Making a Serial Connection to the SP” on page
“Viewing Replaceable Component Information” on page
“Viewing ILOM SP Event Logs” on page
Making a Serial Connection to the SP
cd /SP/console
start
“Viewing Replaceable Component Information” on page
Viewing ILOM SP Event Logs
You can select from the following types of events
Interpreting Event Log Time Stamps
Viewing Replaceable Component Information
2. From the System Information tab, select Components
Viewing Sensors
4. Click a sensor to display its thresholds
FIGURE C-3 Sensor Readings Page
FIGURE C-4 Sensor Details Page
“Handling of Uncorrectable Errors” on page
Error Handling
Handling of Uncorrectable Errors
“Handling of Correctable Errors” on page
Note the following considerations for this revision
FIGURE D-1 DMI Log Screen, Uncorrectable Error
Appendix D Error Handling
Handling of Correctable Errors
FIGURE D-2 DMI Log Screen, Correctable Error
The BIOS logs an SEL record
The BIOS logs an event in DMI
EXAMPLE D-1 DMI Log Screen, Correctable Error, Memory Decreased
Handling of Parity Errors (PERR)
FIGURE D-3 DMI Log Screen, PCI Parity Error
Handling of System Errors (SERR)
DMI Log Screen with Error
No SEL or DMI event is recorded
Handling Mismatching Processors
The BIOS performs a complete POST
The system enters Halt mode and the following message is displayed
Hardware Error Handling Summary
sync flood error occurred on last boot, press F1 to continue
tach signals
The Front Fan Fault, Service Action Required
Multiple fan
Index