Sun Microsystems X4140, X4240, X4440 manual Hardware Error Handling Summary


Hardware Error Handling Summary

TABLE D-1 summarizes the most common hardware errors that you might encounter with these servers.

TABLE D-1 Hardware Error Handling Summary

Error: SP failure
Description: The SP fails to boot upon application of system power.
Handling: The SP controls the system reset, so the system may power on, but will not come out of reset.
• During power up, the SP's boot loader turns on the power LED.
• During SP boot, Linux startup, and SP sanity check, the power LED blinks.
• The LED is turned off when SP management code (the IPMI stack) is started.
• At exit of BIOS POST, the LED goes to STEADY ON state.
Logged (DMI Log or SP SEL): Not logged
Fatal?: Fatal

Error: SP failure
Description: SP boots but fails POST.
Handling: The SP controls the system RESET, so the system will not come out of reset.
Logged (DMI Log or SP SEL): Not logged
Fatal?: Fatal

Error: BIOS POST failure
Description: Server BIOS does not pass POST.
Handling: There are fatal and non-fatal errors in POST. The BIOS does detect some errors that are announced during POST as POST codes on the bottom right corner of the display on the serial console and on the video display. Some POST codes are forwarded to the SP for logging.
The POST codes do not come out in sequential order and some are repeated, because some POST codes are issued by code in add-in card BIOS expansion ROMs.
In the case of early POST failures (for example, the BSP fails to operate correctly), BIOS just halts without logging.
For some other POST failures subsequent to memory and SP initialization, the BIOS logs a message to the SP's SEL.
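The power LED sequence described for an SP boot failure can be turned into a small lookup that suggests how far the boot got from what the LED is doing. The following Python sketch is illustrative only: the names Led, BOOT_SEQUENCE, and stages_for are hypothetical and are not part of any Sun or ILOM tool, and the stage labels are paraphrased from TABLE D-1. Note that a lit LED is ambiguous on its own, because the LED is on both at the start (boot loader) and at the end (exit of BIOS POST) of the sequence.

```python
from enum import Enum


class Led(Enum):
    """Observed state of the front-panel power LED (hypothetical names)."""
    OFF = "off"
    BLINKING = "blinking"
    ON = "on"


# Boot stages in the order given in TABLE D-1, each paired with the
# LED state the table describes for that stage.
BOOT_SEQUENCE = [
    ("SP boot loader running after power up", Led.ON),
    ("SP boot, Linux startup, and SP sanity check", Led.BLINKING),
    ("SP management code (IPMI stack) started", Led.OFF),
    ("BIOS POST exited", Led.ON),
]


def stages_for(led):
    """Return every boot stage whose LED state matches the observation."""
    return [stage for stage, state in BOOT_SEQUENCE if state is led]


if __name__ == "__main__":
    for observed in Led:
        print(observed.value, "->", stages_for(observed))
```

Returning a list rather than a single stage keeps the ambiguity visible: a steadily lit LED has to be combined with other evidence, such as whether the system came out of reset, before concluding that POST completed.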

 

 

64 Sun Fire X4140, X4240, and X4440 Servers Diagnostics Guide • August 2008

