Cisco Systems UBR10012 manual Bus Errors


Chapter 4 Troubleshooting Line Cards

General Information for Troubleshooting Line Card Crashes

Hard parity errors occur when a hardware defect in the DRAM or processor board causes data to be repeatedly corrupted at the same address. In general, treat a parity error as hard when more than one parity error occurs in a particular memory region within a relatively short period of time (several weeks to months).

When a parity error occurs, take the following steps to resolve the problem:

Step 1 Determine whether this is a soft parity error or a hard parity error. Soft parity errors are 10 to 100 times more frequent than hard parity errors. Therefore, wait for a second parity error before taking any action. Monitor the router for several weeks after the first incident, and if the problem recurs, assume that the problem is a hard parity error and proceed to the next step.

Step 2 When a hard parity error occurs (two or more parity errors at the same memory location), try removing and reinserting the line card, making sure to fully insert the card and to securely tighten the restraining screws on the front panel.

Step 3 If this does not resolve the problem, remove and reseat the DRAM chips. If the problem continues, replace the DRAM chips.

Step 4 If parity errors continue to occur, the problem is with either the line card or the router chassis. Try removing the line card and reinserting it. If the problem persists, try removing the line card from its current slot and reinserting it in another slot, if one is available. If that does not fix the problem, replace the line card.

Step 5 If the problems continue, collect the following information and contact Cisco TAC:

All relevant information about the problem that you have available, including any troubleshooting you have performed.

Any console output that was generated at the time of the problem.

Output of the show tech-support command.

Output of the show log command (or the log that was captured by your SYSLOG server, if available).

For information on contacting TAC and opening a case, see the “Obtaining Technical Assistance” section on page x.
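The soft-versus-hard decision in Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration, not a Cisco tool: the function name, the event format (a timestamp plus a memory region label), and the 90-day window are assumptions chosen to model "several weeks to months".

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: "a relatively short period of time" is modeled as 90 days.
WINDOW = timedelta(days=90)

def classify_parity_errors(events, window=WINDOW):
    """Classify each memory region as "soft" or "hard".

    events: iterable of (timestamp, region) tuples, where region is any
    hashable label for the memory location reported in the error
    (hypothetical format, not the router's actual log layout).
    """
    by_region = defaultdict(list)
    for timestamp, region in events:
        by_region[region].append(timestamp)

    result = {}
    for region, stamps in by_region.items():
        stamps.sort()
        # Two errors in the same region within the window suggest a hard error.
        repeated = any(b - a <= window for a, b in zip(stamps, stamps[1:]))
        result[region] = "hard" if repeated else "soft"
    return result

if __name__ == "__main__":
    log = [
        (datetime(2004, 1, 5), "dram-bank-0"),
        (datetime(2004, 2, 1), "dram-bank-0"),  # recurrence within the window
        (datetime(2004, 1, 20), "dram-bank-1"),  # isolated event
    ]
    print(classify_parity_errors(log))
```

A region classified as "soft" calls only for continued monitoring, while a region classified as "hard" is the point at which the reseat and replace steps above apply.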

Bus Errors

Bus errors (SIG type is 10) occur when the line card tries to access a memory location that either does not exist (which indicates a software error) or that does not respond (which indicates a hardware error). Use the following procedure to determine the cause of a bus error and to resolve the problem.

Perform these steps as soon as possible after the bus error. In particular, perform these steps before manually reloading or power cycling the router, or before performing an Online Insertion/Removal (OIR) of the line card, because doing so eliminates much of the information that is useful in debugging line card crashes.

Step 1 Capture the output of the show stacks, show context, and show tech-support commands. Registered Cisco.com users can decode the output of these commands by using the Output Interpreter tool, which is at the following URL:

https://www.cisco.com/cgi-bin/Support/OutputInterpreter/home.pl
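The distinction drawn above between a nonexistent address (software) and an address that does not respond (hardware) can be expressed as a simple range check once the faulting address is known from the captured output. The following Python sketch is a heuristic under stated assumptions: the function name is hypothetical, and the memory ranges are placeholder values, not the actual memory map of any line card.

```python
SIG_TYPE_BUS_ERROR = 10  # from the text: bus errors are SIG type 10

# Hypothetical (start, end) address ranges, inclusive. Real ranges must be
# taken from the affected card's own memory region information.
EXAMPLE_REGIONS = [
    (0x0A000000, 0x0BFFFFFF),
    (0x60000000, 0x67FFFFFF),
]

def likely_bus_error_cause(fault_address, regions):
    """Return "hardware" or "software" as a first guess at the cause.

    An address inside a known region exists, so a failure to respond
    points toward hardware; an address outside every region does not
    exist, which points toward a software error.
    """
    for start, end in regions:
        if start <= fault_address <= end:
            return "hardware"
    return "software"

if __name__ == "__main__":
    for addr in (0x60001000, 0x00000010):
        print(hex(addr), likely_bus_error_cause(addr, EXAMPLE_REGIONS))
```

The result is only a starting point for the procedure; the captured command output remains the authoritative input for further diagnosis.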

Cisco uBR10012 Universal Broadband Router Troubleshooting Guide

 

OL-1237-01
