Sun Microsystems T6320 service manual Memory Fault Handling, Troubleshooting Memory Faults


TABLE 2-2 FB-DIMM Configuration and Installation (Continued)

Branch Name   Channel Name   FRU Name                  Motherboard FB-DIMM Connector   FB-DIMM Installation Order*   FB-DIMM Pair†
                             /SYS/MB/CMP0/BR3/CH0/D1   J2501                           3                             H
              Channel 1      /SYS/MB/CMP0/BR3/CH1/D0   J2601                           2                             G
                             /SYS/MB/CMP0/BR3/CH1/D1   J2701                           3                             H

* Upgrade path: DIMMs should be added with each group populated in the order shown.
† Fault replacement path: Each pair is addressed as a unit, and each pair must be identical.
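The FRU names in the table encode the DIMM location (CMP, branch, channel, DIMM slot), and the pair letter identifies which DIMMs are replaced together. The following Python sketch illustrates both ideas. The function and variable names are hypothetical, and the lookup table holds only the three rows shown on this page, so a pair whose other members are listed on an earlier page appears incomplete here.

```python
import re
from typing import List, NamedTuple


class DimmLocation(NamedTuple):
    cmp: int      # CMPn
    branch: int   # BRn
    channel: int  # CHn
    dimm: int     # Dn


# Pattern for FRU names of the form /SYS/MB/CMP0/BR3/CH1/D0
_FRU_RE = re.compile(r"^/SYS/MB/CMP(\d+)/BR(\d+)/CH(\d+)/D(\d+)$")


def parse_fru_name(fru_name: str) -> DimmLocation:
    """Split an FB-DIMM FRU name into its numeric location fields."""
    match = _FRU_RE.match(fru_name.strip())
    if match is None:
        raise ValueError(f"not an FB-DIMM FRU name: {fru_name!r}")
    return DimmLocation(*(int(group) for group in match.groups()))


# Rows from this page only: FRU name -> (connector, installation order, pair)
TABLE_2_2_ROWS = {
    "/SYS/MB/CMP0/BR3/CH0/D1": ("J2501", 3, "H"),
    "/SYS/MB/CMP0/BR3/CH1/D0": ("J2601", 2, "G"),
    "/SYS/MB/CMP0/BR3/CH1/D1": ("J2701", 3, "H"),
}


def pair_members(pair: str) -> List[str]:
    """Return the FRU names (from the rows above) that belong to a pair."""
    return sorted(
        name for name, (_, _, row_pair) in TABLE_2_2_ROWS.items()
        if row_pair == pair
    )
```

For example, `pair_members("H")` returns both D1 DIMMs of branch 3, which matches the rule that a pair is addressed as a unit during fault replacement.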

2.2.1.2 Memory Fault Handling

The Sun Blade T6320 server module uses advanced ECC technology, also called chipkill, that corrects up to 4 bits in error on nibble boundaries, as long as they are all in the same DRAM. If a DRAM fails, the DIMM continues to function.

Note – The chipkill function is only supported on DIMMs that use “x4” DRAMs.
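The "nibble boundary" condition can be illustrated with a simplified model. The Python sketch below assumes that each x4 DRAM contributes one aligned 4-bit nibble to a memory word, so an error pattern is confined to a single DRAM exactly when all flipped bits fall inside one aligned nibble. This is only an illustration of the condition stated above, not the actual ECC code used by the hardware, and the function names are hypothetical.

```python
from typing import Set


def error_nibbles(error_mask: int) -> Set[int]:
    """Return the indexes of aligned 4-bit nibbles containing a flipped bit.

    error_mask has a 1 in every bit position that is in error.
    """
    if error_mask < 0:
        raise ValueError("error_mask must be non-negative")
    nibbles = set()
    index = 0
    while error_mask:
        if error_mask & 0xF:      # any flipped bit in this nibble?
            nibbles.add(index)
        error_mask >>= 4          # move to the next aligned nibble
        index += 1
    return nibbles


def is_single_dram_error(error_mask: int) -> bool:
    """True if all flipped bits sit in exactly one aligned nibble.

    In this simplified model that means one x4 DRAM is affected.
    Returns False when no bit is in error.
    """
    return len(error_nibbles(error_mask)) == 1
```

Under this model, `is_single_dram_error(0xF0)` is true (four flipped bits, all in nibble 1), while `is_single_dram_error(0x18)` is false because the two flipped bits straddle a nibble boundary.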

The following server module features manage memory faults independently.

POST – Runs when the server module is powered on (based on configuration variables) and thoroughly tests the memory subsystem.

If a memory fault is detected, POST displays the fault with the FRU name of the faulty DIMMs, logs the fault, and disables the faulty DIMMs by placing them in the Automatic System Recovery (ASR) blacklist. For a given memory fault, POST disables half of the physical memory in the system. When this occurs, you must replace the faulty DIMMs based on the fault message and then enable the disabled DIMMs with the ILOM command set /SYS/component component_state=enabled.
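As a minimal sketch of the re-enable step, the following Python helper builds the ILOM command line for each disabled DIMM. It assumes that /SYS/component in the command above is replaced by the full FRU name of the DIMM (for example /SYS/MB/CMP0/BR3/CH1/D0). The helper only produces strings to type at the ILOM prompt, it does not connect to the service processor, and its names are hypothetical.

```python
from typing import Iterable, List


def enable_command(fru_name: str) -> str:
    """Build the ILOM command that re-enables one component.

    Assumption: the component path is the full FRU name, starting with /SYS/.
    """
    fru_name = fru_name.strip()
    if not fru_name.startswith("/SYS/"):
        raise ValueError(f"expected a /SYS/... FRU name, got {fru_name!r}")
    return f"set {fru_name} component_state=enabled"


def enable_commands(fru_names: Iterable[str]) -> List[str]:
    """Build one enable command per FRU name, preserving input order."""
    return [enable_command(name) for name in fru_names]


if __name__ == "__main__":
    # Example: both DIMMs of one pair after replacement
    for line in enable_commands(
        ["/SYS/MB/CMP0/BR3/CH0/D1", "/SYS/MB/CMP0/BR3/CH1/D1"]
    ):
        print(line)
```

Because DIMMs are replaced in pairs, generating the commands for both members of the pair at once helps avoid leaving one DIMM disabled.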

Solaris Predictive Self-Healing (PSH) technology – A feature of the Solaris OS that uses the fault manager daemon (fmd) to watch for various kinds of faults. When a fault occurs, it is assigned a unique fault ID (UUID) and logged. PSH reports the fault and recommends proactive replacement of the DIMMs associated with the fault.

2.2.1.3 Troubleshooting Memory Faults

If you suspect that the server module has a memory problem, follow the flowchart (see FIGURE 2-1). Type the ILOM command show /SP/faultmgmt. The faultmgmt command lists memory faults and lists the specific DIMMs that are

Chapter 2 Sun Blade T6320 Server Module Diagnostics 2-11


T6320 specifications

The Sun Microsystems T6320 is a high-performance server module designed to meet the demands of modern data centers and enterprise applications. As part of the Sun Blade series, the T6320 is built for scalability, efficient resource utilization, and reliability, making it a good fit for businesses looking to optimize their IT infrastructure.

One of the key features of the T6320 is its support for the UltraSPARC T2 processor architecture. This multicore processor has up to eight cores and handles eight threads per core, so the T6320 can run up to 64 simultaneous threads. This threading capability is particularly beneficial for virtualization and multi-threaded applications.

The T6320 also comes equipped with a high-speed memory subsystem, supporting up to 256 GB of DDR2 memory. With a memory bandwidth of up to 17 GB/s, the server ensures that data transfer rates do not become a bottleneck, facilitating faster processing and smoother operation for demanding applications. Moreover, the server supports multi-tier storage configurations, enabling organizations to choose the right balance of performance, capacity, and cost.

In terms of connectivity, the T6320 offers multiple gigabit Ethernet ports, creating a resilient network architecture capable of handling the high data loads typical in enterprise environments. Its redundancy features, including hot-swappable components and mirrored disks, further add to its reliability, ensuring continuous service even during maintenance.

The T6320 is built with energy efficiency in mind, minimizing power consumption without compromising performance. This characteristic is increasingly critical for organizations focused on sustainability and cost savings in their energy expenditures.

Additionally, Sun Microsystems has integrated advanced security features into the T6320, such as hardware-based security mechanisms to protect sensitive data and applications. This feature is vital for businesses operating in regulated industries or those that prioritize data integrity.

Finally, the server supports a variety of operating systems, including Solaris, Linux, and various UNIX flavors. This flexibility allows organizations to run their preferred software environments, making the T6320 a versatile option for diverse IT needs.

Overall, the Sun Microsystems T6320 stands out as a powerful, flexible, and efficient server solution, adept at handling the complexities of today's enterprise workloads while paving the way for future growth and technological advancements.