Sun Microsystems, Inc.
Sun Fire X4140, X4240, and X4440 Servers Diagnostics Guide
August 2008, Revision A
Preface
Contents
2. Using SunVTS Diagnostic Software
3. Troubleshooting DIMM Problems
A. Event Logs and POST Codes
B. Status Indicator LEDs
C. Using the ILOM Service Processor GUI to View System Information
D. Error Handling
Handling of System Errors (SERR)
Handling of Uncorrectable Errors
Handling of Correctable Errors
Handling of Parity Errors (PERR)
vi Sun Fire X4140, X4240, and X4440 Servers Diagnostics Guide August
Before You Read This Document
Preface
http://docs.sun.com
Related Documentation
Typographic Conventions
Third-Party Web Sites
Sun Welcomes Your Comments
Initial Inspection of the Server
Service Troubleshooting Flowchart
“Service Troubleshooting Flowchart”
“Gathering Service Information”
“System Inspection”
4. Check for potential device conflicts before you add a new device
Gathering Service Information
1. Collect information about the following items
2. Document the server settings before you make any changes
System Inspection
Troubleshooting Power Problems
Externally Inspecting the Server
Internally Inspecting the Server
2. Remove the server cover
Running SunVTS Diagnostic Tests
Using SunVTS Diagnostic Software
SunVTS Documentation
Diagnosing Server Problems With the Bootable Diagnostics CD
Requirements
Using the Bootable Diagnostics CD
c. With the three lower buttons you can perform the following actions
a. Click the Log button
Close the Log file window - The window is closed
“Isolating and Correcting DIMM ECC Errors”
Troubleshooting DIMM Problems
“DIMM Population Rules”
“DIMM Replacement Policy”
“How DIMM Errors Are Handled by the System”
How DIMM Errors Are Handled by the System
DIMM Replacement Policy
Uncorrectable DIMM Errors
# ipmitool -H 10.6.77.249 -U root -P changeme -I lanplus sel list
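The ipmitool invocation above can also be assembled and run from a script. The following is a minimal sketch, not part of the original guide: the helper names build_sel_list_command and read_sel are hypothetical, the host address and credentials are the example values from the command above, and it assumes ipmitool is installed on the machine running the script.

```python
import subprocess
from typing import List


def build_sel_list_command(host: str, user: str, password: str) -> List[str]:
    """Return the argument list for the 'sel list' call shown in the guide.

    Hypothetical helper: it only mirrors the flags used in the example
    command (-H, -U, -P, -I lanplus) and adds nothing else.
    """
    return [
        "ipmitool",
        "-H", host,        # service processor address
        "-U", user,        # SP user name
        "-P", password,    # SP password
        "-I", "lanplus",   # interface used in the guide's example
        "sel", "list",     # list the System Event Log
    ]


def read_sel(host: str, user: str, password: str, timeout: float = 30.0) -> str:
    """Run the command and return its raw text output.

    Assumes ipmitool is on PATH; raises CalledProcessError on a
    non-zero exit status and TimeoutExpired if the SP does not answer.
    """
    result = subprocess.run(
        build_sel_list_command(host, user, password),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Example values copied from the command in the guide.
    print(" ".join(build_sel_list_command("10.6.77.249", "root", "changeme")))
```

Passing the arguments as a list avoids shell quoting problems if the password contains special characters.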
Correctable DIMM Errors
DIMM Fault LEDs
BIOS DIMM Error Messages
DIMM fault LED is off - The DIMM is operating properly
FIGURE 3-1 DIMMs and LEDs on Motherboard
Chapter 3 Troubleshooting DIMM Problems
Isolating and Correcting DIMM ECC Errors
4. Disconnect the AC power cords from the server
8. Dust off the DIMMs, clean the contacts, and reseat them
9. If there is no obvious damage, replace any failed DIMMs
10. Reconnect AC power cords to the server
11. Power on the server and run the diagnostics test again
12. Review the log file
Event Logs and POST Codes
“Viewing Event Logs”
“Power-On Self-Test (POST)”
Viewing Event Logs
Main Advanced PCIPnP Boot Security Chipset Exit
Appendix A Event Logs and POST Codes
The Advanced Menu Event Logging Details screen is displayed
How BIOS POST Memory Testing Works
Power-On Self-Test (POST)
Redirecting Console Output
11. Click the Start Redirection button
Changing POST Options
3. Select Boot Settings Configuration
POST Codes
POST Code Checkpoints
Initializes NUM-LOCK status and programs the KBD typematic rate
External Status Indicator LEDs
Status Indicator LEDs
Rear PS LED Amber Power supply fault
Front Panel LEDs
Back Panel LEDs
Hard Drive LEDs
Internal Status Indicator LEDs
FIGURE B-4 DIMMs and LEDs on Motherboard
FIGURE B-5 DIMMs and LEDs on Mezzanine Board
Appendix B Status Indicator LEDs
“Viewing Replaceable Component Information”
Using the ILOM Service Processor GUI to View System Information
“Making a Serial Connection to the SP”
“Viewing ILOM SP Event Logs”
“Viewing Replaceable Component Information”
Making a Serial Connection to the SP
cd /SP/console
start
“Viewing ILOM SP Event Logs”
Viewing ILOM SP Event Logs
You can select from the following types of events
Interpreting Event Log Time Stamps
Viewing Replaceable Component Information
2. From the System Information tab, select Components
Viewing Sensors
FIGURE C-3 Sensor Readings Page
4. Click a sensor to display its thresholds
FIGURE C-4 Sensor Details Page
“Handling of Correctable Errors”
Error Handling
Handling of Uncorrectable Errors
“Handling of Uncorrectable Errors”
Note the following considerations for this revision
Appendix D Error Handling
FIGURE D-1 DMI Log Screen, Uncorrectable Error
Handling of Correctable Errors
The BIOS logs an SEL record
The BIOS logs an event in DMI
FIGURE D-2 DMI Log Screen, Correctable Error
EXAMPLE D-1 DMI Log Screen, Correctable Error, Memory Decreased
Handling of Parity Errors (PERR)
FIGURE D-3 DMI Log Screen, PCI Parity Error
Handling of System Errors (SERR)
DMI Log Screen with Error
The system enters Halt mode and the following message is displayed
Handling Mismatching Processors
The BIOS performs a complete POST
No SEL or DMI event is recorded
Hardware Error Handling Summary
sync flood error occurred on last boot, press F1 to continue
Multiple fan
The Front Fan Fault, Service Action Required
tach signals
Index