August 2008, Revision A
Sun Fire X4140, X4240, and X4440 Servers Diagnostics Guide
Sun Microsystems, Inc.
Contents
Preface
2. Using SunVTS Diagnostic Software
3. Troubleshooting DIMM Problems
A. Event Logs and POST Codes
B. Status Indicator LEDs
C. Using the ILOM Service Processor GUI to View System Information
D. Error Handling
Handling of Uncorrectable Errors
Handling of Correctable Errors
Handling of Parity Errors (PERR)
Handling of System Errors (SERR)
vi Sun Fire X4140, X4240, and X4440 Servers Diagnostics Guide August
Preface
Before You Read This Document
http://docs.sun.com
Related Documentation
Typographic Conventions
Third-Party Web Sites
Sun Welcomes Your Comments
“Service Troubleshooting Flowchart”
Service Troubleshooting Flowchart
“Gathering Service Information”
“System Inspection”
Initial Inspection of the Server
Gathering Service Information
1. Collect information about the following items
2. Document the server settings before you make any changes
4. Check for potential device conflicts before you add a new device
System Inspection
Troubleshooting Power Problems
Externally Inspecting the Server
Internally Inspecting the Server
2. Remove the server cover
Using SunVTS Diagnostic Software
Running SunVTS Diagnostic Tests
Requirements
Diagnosing Server Problems With the Bootable Diagnostics CD
SunVTS Documentation
Using the Bootable Diagnostics CD
a. Click the Log button
c. With the three lower buttons you can perform the following actions:
Close the Log file window - The window is closed
Troubleshooting DIMM Problems
“DIMM Population Rules”
“DIMM Replacement Policy”
“How DIMM Errors Are Handled by the System”
“Isolating and Correcting DIMM ECC Errors”
Uncorrectable DIMM Errors
DIMM Replacement Policy
How DIMM Errors Are Handled by the System
# ipmitool -H 10.6.77.249 -U root -P changeme -I lanplus sel list
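The system event log returned by the command above can be long, so it may help to narrow it to memory-related entries. The following is a minimal sketch, not part of the original guide: the function name filter_dimm_events and the keywords it matches are assumptions chosen for illustration, and the wording of SEL entries on a given server may differ.

```shell
#!/bin/sh
# Hypothetical helper (not from the guide): keep only SEL lines that
# mention memory, ECC, or DIMM. The keyword list is an assumption.
# It reads SEL text on standard input, so it works on saved logs too.
filter_dimm_events() {
    grep -i -E 'memory|ecc|dimm'
}

# Example use against a service processor, reusing the host and
# credentials shown in the command above:
#   ipmitool -H 10.6.77.249 -U root -P changeme -I lanplus sel list | filter_dimm_events
```

Reading from standard input keeps the filter separate from the ipmitool call, so the same function can be applied to a log file captured earlier.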
Correctable DIMM Errors
DIMM Fault LEDs
BIOS DIMM Error Messages
DIMM fault LED is off - The DIMM is operating properly
FIGURE 3-1 DIMMs and LEDs on Motherboard
Chapter 3 Troubleshooting DIMM Problems
Isolating and Correcting DIMM ECC Errors
4. Disconnect the AC power cords from the server
8. Dust off the DIMMs, clean the contacts, and reseat them
9. If there is no obvious damage, replace any failed DIMMs
10. Reconnect AC power cords to the server
11. Power on the server and run the diagnostics test again
12. Review the log file
Event Logs and POST Codes
“Viewing Event Logs”
“Power-On Self-Test (POST)”
Viewing Event Logs
Main Advanced PCIPnP Boot Security Chipset Exit
Appendix A Event Logs and POST Codes
The Advanced Menu Event Logging Details screen is displayed
Power-On Self-Test (POST)
How BIOS POST Memory Testing Works
Redirecting Console Output
11. Click the Start Redirection button
Changing POST Options
3. Select Boot Settings Configuration
POST Codes
POST Code Checkpoints
Initializes NUM-LOCK status and programs the KBD typematic rate
Status Indicator LEDs
External Status Indicator LEDs
Back Panel LEDs
Front Panel LEDs
Rear PS LED Amber Power supply fault
Hard Drive LEDs
Internal Status Indicator LEDs
FIGURE B-4 DIMMs and LEDs on Motherboard
FIGURE B-5 DIMMs and LEDs on Mezzanine Board
Appendix B Status Indicator LEDs
Using the ILOM Service Processor GUI to View System Information
“Making a Serial Connection to the SP”
“Viewing ILOM SP Event Logs”
“Viewing Replaceable Component Information”
Making a Serial Connection to the SP
cd /SP/console
start
Viewing ILOM SP Event Logs
You can select from the following types of events
Interpreting Event Log Time Stamps
Viewing Replaceable Component Information
2. From the System Information tab, select Components
Viewing Sensors
FIGURE C-3 Sensor Readings Page
4. Click a sensor to display its thresholds
FIGURE C-4 Sensor Details Page
Error Handling
“Handling of Uncorrectable Errors”
“Handling of Correctable Errors”
Handling of Uncorrectable Errors
Note the following considerations for this revision
Appendix D Error Handling
FIGURE D-1 DMI Log Screen, Uncorrectable Error
Handling of Correctable Errors
FIGURE D-2 DMI Log Screen, Correctable Error
The BIOS logs an SEL record
The BIOS logs an event in DMI
EXAMPLE D-1 DMI Log Screen, Correctable Error, Memory Decreased
Handling of Parity Errors (PERR)
FIGURE D-3 DMI Log Screen, PCI Parity Error
Handling of System Errors (SERR)
DMI Log Screen with Error
The BIOS performs a complete POST
Handling Mismatching Processors
No SEL or DMI event is recorded
The system enters Halt mode and the following message is displayed
Hardware Error Handling Summary
sync flood error occurred on last boot, press F1 to continue
The Front Fan Fault, Service Action Required
Multiple fan tach signals
Index