ii March, 2003 Developers Manual
Intel 80200 Processor based on Intel XScale Microarchitecture
Contents
3 Memory Management
8 System Management
12 Performance Monitoring
14 Performance Considerations
A Compatibility: Intel 80200 Processor vs. SA-110
Figures
Tables
Introduction
1.1 Intel 80200 Processor based on Intel XScale Microarchitecture High-Level Overview
1.1.1 ARM* Architecture Compliance
1.1.2 Features
1.1.2.1 Multiply/Accumulate (MAC)
Figure 1-1. Intel 80200 Processor based on Intel XScale Microarchitecture Features
1.1.2.2 Memory Management
1.1.2.3 Instruction Cache
1.1.2.4 Branch Target Buffer
1.1.2.5 Data Cache
1.1.2.6 Power Management
1.1.2.7 Interrupt Controller
1.1.2.8 Bus Controller
1.1.2.9 Performance Monitoring
1.1.2.10 Debug
1.2 Terminology and Conventions
1.2.1 Number Representation
1.2.2 Terminology and Acronyms
1.3 Other Relevant Documents
Programming Model
2.1 ARM* Architecture Compliance
2.2 ARM* Architecture Implementation Options
2.2.1 Big Endian versus Little Endian
2.2.2 26-Bit Code
2.2.4 ARM* DSP-Enhanced Instruction Set
2.2.5 Base Register Update
2.3 Extensions to ARM* Architecture
2.3.1 DSP Coprocessor 0 (CP0)
2.3.1.1 Multiply With Internal Accumulate Format
2.3.1.2 Internal Accumulator Access Format
2.3.2 New Page Attributes
2.3.3 Additions to CP15 Functionality
2.3.4 Event Architecture
2.3.4.1 Exception Summary
2.3.4.2 Event Priority
Table 2-11. Exception Summary
Table 2-12. Event Priority
2.3.4.3 Prefetch Aborts
2.3.4.4 Data Aborts
Precise Data Aborts
Imprecise data aborts
Multiple Data Aborts
2.3.4.5 Events from Preload Instructions
2.3.4.6 Debug Events
Memory Management
3.1 Overview
3.2 Architecture Model
3.2.1 Version 4 vs. Version 5
3.2.2 Memory Attributes
3.2.2.1 Page (P) Attribute Bit
3.2.2.4 Data Cache and Write Buffer
Table 3-2. Data Cache and Buffer Behavior when X = 1
3.2.2.5 Details on Data Cache and Write Buffer Behavior
3.2.2.6 Memory Operation Ordering
3.2.3 Exceptions
3.3 Interaction of the MMU, Instruction Cache, and Data Cache
3.4 Control
3.4.1 Invalidate (Flush) Operation
3.4.2 Enabling/Disabling
3.4.3 Locking Entries
Example 3-3. Locking Entries into the Data TLB
3.4.4 Round-Robin Replacement Algorithm
Instruction Cache
4.1 Overview
4.2 Operation
4.2.1 Operation When Instruction Cache is Enabled
4.2.2 Operation When The Instruction Cache Is Disabled
4.2.3 Fetch Policy
4.2.4 Round-Robin Replacement Algorithm
4.2.5 Parity Protection
4.2.6 Instruction Fetch Latency
4.2.7 Instruction Cache Coherency
4.3 Instruction Cache Control
4.3.1 Instruction Cache State at RESET
4.3.2 Enabling/Disabling
4.3.3 Invalidating the Instruction Cache
4.3.4 Locking Instructions in the Instruction Cache
4.3.5 Unlocking Instructions in the Instruction Cache
Branch Target Buffer
5.1 Branch Target Buffer (BTB) Operation
5.1.1 Reset
5.1.2 Update Policy
5.2 BTB Control
5.2.1 Disabling/Enabling
5.2.2 Invalidation
Data Cache
6.1 Overviews
6.1.1 Data Cache Overview
Figure 6-1. Data Cache Organization
6.1.2 Mini-Data Cache Overview
6.1.3 Write Buffer and Fill Buffer Overview
6.2 Data Cache and Mini-Data Cache Operation
6.2.1 Operation When Caching is Enabled
6.2.2 Operation When Data Caching is Disabled
6.2.3 Cache Policies
6.2.3.1 Cacheability
6.2.3.2 Read Miss Policy
6.2.3.3 Write Miss Policy
6.2.3.4 Write-Back Versus Write-Through
6.2.4 Round-Robin Replacement Algorithm
6.2.5 Parity Protection
6.2.6 Atomic Accesses
6.3 Data Cache and Mini-Data Cache Control
6.3.1 Data Memory State After Reset
6.3.2 Enabling/Disabling
6.3.3 Invalidate & Clean Operations
6.3.3.1 Global Clean and Invalidate Operation
Example 6-2. Global Clean Operation
6.4 Re-configuring the Data Cache as Data RAM
Example 6-3. Locking Data into the Data Cache
Example 6-4. Creating Data RAM
6.5 Write Buffer/Fill Buffer Operation and Control
Configuration
7.1 Overview
7.2 CP15 Registers
7.2.1 Register 0: ID and Cache Type Registers
Table 7-4. ID Register
Table 7-5. Cache Type Register (Sheet 1 of 2)
7.2.2 Register 1: Control and Auxiliary Control Registers
7.2.3 Register 2: Translation Table Base Register
Table 7-8. Translation Table Base Register
7.2.4 Register 3: Domain Access Control Register
7.2.5 Register 4: Reserved
Register 4 is reserved. Reading and writing this register yields unpredictable results.
7.2.6 Register 5: Fault Status Register
7.2.7 Register 6: Fault Address Register
7.2.8 Register 7: Cache Functions
7.2.9 Register 8: TLB Operations
7.2.10 Register 9: Cache Lock Down
7.2.11 Register 10: TLB Lock Down
7.2.12 Registers 11-12: Reserved
7.2.13 Register 13: Process ID
7.2.13.1 The PID Register Effect on Addresses
7.2.14 Register 14: Breakpoint Registers
7.2.15 Register 15: Coprocessor Access Register
Table 7-20. Coprocessor Access Register
7.3 CP14 Registers
7.3.1 Registers 0-3: Performance Monitoring
7.3.2 Registers 4-5: Reserved
7.3.3 Registers 6-7: Clock and Power Management
7.3.4 Registers 8-15: Software Debug
System Management
8.1 Clocking
8.2 Processor Reset
8.2.1 Reset Sequence
8.2.2 Reset Effect on Outputs
8.3 Power Management
8.3.1 Invocation
8.3.2 Signals Associated with Power Management
Interrupts
9.1 Introduction
9.2 External Interrupts
9.3 Programmer Model
9.3.1 INTCTL
9.3.2 INTSRC
9.3.3 INTSTR
External Bus
10.1 General Description
10.2 Signal Description
Table 10-1. Intel 80200 Processor based on Intel XScale Microarchitecture Bus Signals
10.2.1 Request Bus
10.2.1.1 Intel 80200 Processor Use of the Request Bus
10.2.2 Data Bus
10.2.3 Critical Word First
10.2.4 Configuration Pins
10.2.5 Multimaster Support
10.2.6 Abort
10.2.7 ECC
10.2.8 Big Endian System Configuration
10.3 Examples
10.3.1 Simple Read Word
10.3.2 Read Burst, No Critical Word First
10.3.3 Read Burst, Critical Word First Data Return
10.3.4 Word Write
10.3.5 Two Word Coalesced Write
10.3.5.1 Write Burst
10.3.6 Write Burst, Coalesced
10.3.7 Pipelined Accesses
10.3.8 Locked Access
10.3.9 Aborted Access
10.3.10 Hold
Bus Controller
11.1 Introduction
11.2 ECC
11.3 Error Handling
11.3.1 Bus Aborts
11.3.2 ECC Errors
11.4 Programmer Model
11.4.1 BCU Control Registers
11.4.2 ECC Error Registers
Performance Monitoring
12.1 Overview
12.2 Clock Counter (CCNT; CP14 - Register 1)
12.3 Performance Count Registers (PMN0 - PMN1; CP14 - Register 2 and 3, Respectively)
12.3.1 Extending Count Duration Beyond 32 Bits
12.4 Performance Monitor Control Register (PMNC)
12.4.1 Managing PMNC
The following are a few notes about controlling the performance monitoring mechanism:
Table 12-3. Performance Monitor Control Register (CP14, register 0) (Sheet 2 of 2)
12.5 Performance Monitoring Events
12.5.1 Instruction Cache Efficiency Mode
12.5.2 Data Cache Efficiency Mode
12.5.3 Instruction Fetch Latency Mode
12.5.4 Data/Bus Request Buffer Full Mode
12.5.5 Stall/Writeback Statistics
12.5.6 Instruction TLB Efficiency Mode
12.5.7 Data TLB Efficiency Mode
12.6 Multiple Performance Monitoring Run Statistics
12.7 Examples
Software Debug
13.1 Definitions
13.2 Debug Registers
13.3 Introduction
13.3.1 Halt Mode
13.3.2 Monitor Mode
13.4 Debug Control and Status Register (DCSR)
DCSR fields: GE, H, TF, TI, TD, TA, TS, TU, TR, SA, MOE, M, E
13.4.1 Global Enable Bit (GE)
13.4.2 Halt Mode Bit (H)
The Halt Mode bit configures the debug unit for either halt mode or monitor mode.
Table 13-1. Debug Control and Status Register (DCSR) (Sheet 2 of 2)
13.4.3 Vector Trap Bits (TF,TI,TD,TA,TS,TU,TR)
13.4.4 Sticky Abort Bit (SA)
13.4.5 Method of Entry Bits (MOE)
13.4.6 Trace Buffer Mode Bit (M)
13.5 Debug Exceptions
13.5.1 Halt Mode
13.5.2 Monitor Mode
13.6 HW Breakpoint Resources
13.6.1 Instruction Breakpoints
13.6.2 Data Breakpoints
13.7 Software Breakpoints
13.8 Transmit/Receive Control Register (TXRXCTRL)
13.8.1 RX Register Ready Bit (RR)
13.8.2 Overflow Flag (OV)
13.8.3 Download Flag (D)
13.8.4 TX Register Ready Bit (TR)
13.8.5 Conditional Execution Using TXRXCTRL
13.9 Transmit Register (TX)
13.10 Receive Register (RX)
13.11 Debug JTAG Access
13.11.1 SELDCSR JTAG Command
13.11.2 SELDCSR JTAG Register
13.11.2.1 DBG.HLD_RST
13.11.2.2 DBG.BRK
13.11.2.3 DBG.DCSR
13.11.3 DBGTX JTAG Command
13.11.4 DBGTX JTAG Register
13.11.5 DBGRX JTAG Command
13.11.6 DBGRX JTAG Register
13.11.6.1 RX Write Logic
13.11.6.2 DBGRX Data Register
13.11.6.3 DBG.RR
13.11.6.4 DBG.V
13.11.7 Debug JTAG Data Register Reset Values
13.12 Trace Buffer
13.12.1 Trace Buffer CP Registers
13.12.1.1 Checkpoint Registers
13.12.1.2 Trace Buffer Register (TBREG)
13.13 Trace Buffer Entries
13.13.1 Message Byte
13.13.1.1 Exception Message Byte
13.13.1.2 Non-exception Message Byte
13.13.1.3 Address Bytes
13.13.2 Trace Buffer Usage
13.14 Downloading Code in the ICache
13.14.1 LDIC JTAG Command
13.14.2 LDIC JTAG Data Register
13.14.3 LDIC Cache Functions
13.14.4 Loading IC During Reset
13.14.4.1 Loading IC During Cold Reset for Debug
13.14.4.2 Loading IC During a Warm Reset for Debug
13.14.5 Dynamically Loading IC After Reset
Debugger Actions
Debug Handler Actions
13.14.5.1 Dynamic Code Download Synchronization
13.14.6 Mini Instruction Cache Overview
13.15 Halt Mode Software Protocol
13.15.1 Starting a Debug Session
13.15.1.1 Setting up Override Vector Tables
13.15.1.2 Placing the Handler in Memory
13.15.2 Implementing a Debug Handler
13.15.2.1 Debug Handler Entry
13.15.2.2 Debug Handler Restrictions
13.15.2.3 Dynamic Debug Handler
13.15.2.4 High-Speed Download
13.15.3 Ending a Debug Session
13.16 Software Debug Notes/Errata
Performance Considerations
14.1 Interrupt Latency
14.2 Branch Prediction
14.3 Addressing Modes
14.4 Instruction Latencies
14.4.1 Performance Terms
14.4.2 Branch Instruction Timings
14.4.3 Data Processing Instruction Timings
Table 14-5. Branch Instruction Timings (Those not predicted by the BTB)
Table 14-6. Data Processing Instruction Timings
14.4.4 Multiply Instruction Timings
Table 14-7. Multiply Instruction Timings (Sheet 1 of 2)
Table 14-8. Multiply Implicit Accumulate Instruction Timings
Table 14-9. Implicit Accumulator Access Instruction Timings
Table 14-7. Multiply Instruction Timings (Sheet 2 of 2)
14.4.5 Saturated Arithmetic Instructions
14.4.6 Status Register Access Instructions
14.4.7 Load/Store Instructions
Table 14-10. Saturated Data Processing Instruction Timings
Table 14-11. Status Register Access Instruction Timings
Table 14-12. Load and Store Instruction Timings
14.4.8 Semaphore Instructions
14.4.9 Coprocessor Instructions
14.4.10 Miscellaneous Instruction Timing
14.4.11 Thumb* Instructions
A Compatibility: Intel 80200 Processor vs. SA-110
A.1 Introduction
A.2 Summary
A.3 Architecture Deviations
A.3.1 Read Buffer
A.3.2 26-bit Mode
A.3.3 Cacheable (C) and Bufferable (B) Encoding
A.3.4 Write Buffer Behavior
A.3.5 External Aborts
A.3.6 Performance Differences
A.3.7 System Control Coprocessor
A.3.8 New Instructions and Instruction Formats
B Optimization Guide
B.1 Introduction
B.1.1 About This Guide
B.2 Intel 80200 Processor Pipeline
B.2.1 General Pipeline Characteristics
B.2.1.1. Number of Pipeline Stages
B.2.1.2. Intel 80200 Processor Pipeline Organization
Table B-1 gives a brief description of each pipe stage.
Figure B-1. Intel 80200 Processor RISC Superpipeline
Pipe stages shown: F1, F2, ID, RF, X1, X2, XWB; M1, M2, Mx; D1, D2
B.2.1.3. Out Of Order Completion
B.2.1.4. Register Scoreboarding
B.2.1.5. Use of Bypassing
B.2.2 Instruction Flow Through the Pipeline
B.2.2.1. ARM* V5 Instruction Execution
B.2.2.2. Pipeline Stalls
B.2.3 Main Execution Pipeline
B.2.4 Memory Pipeline
B.2.4.1. D1 and D2 Pipestage
B.2.5 Multiply/Multiply Accumulate (MAC) Pipeline
B.2.5.1. Behavioral Description
B.3 Basic Optimizations
B.3.1 Conditional Instructions
B.3.1.1. Optimizing Condition Checks
B.3.1.2. Optimizing Branches
B.3.1.3. Optimizing Complex Expressions
B.3.2 Bit Field Manipulation
B.3.3 Optimizing the Use of Immediate Values
B.3.4 Optimizing Integer Multiply and Divide
B.3.5 Effective Use of Addressing Modes
B.4 Cache and Prefetch Optimizations
B.4.1 Instruction Cache
B.4.1.1. Cache Miss Cost
B.4.1.2. Round Robin Replacement Cache Policy
B.4.1.3. Code Placement to Reduce Cache Misses
B.4.2 Data and Mini Cache
B.4.2.3. Read Allocate and Read-write Allocate Memory Regions
B.4.2.4. Creating On-chip RAM
B.4.2.5. Mini-data Cache
B.4.2.6. Data Alignment
B.4.2.7. Literal Pools
B.4.3 Cache Considerations
B.4.3.1. Cache Conflicts, Pollution and Pressure
B.4.3.2. Memory Page Thrashing
B.4.4 Prefetch Considerations
B.4.4.1. Prefetch Distances in the Intel 80200 Processor
B.4.4.2. Prefetch Loop Scheduling
B.4.4.3. Prefetch Loop Limitations
B.4.4.4. Compute vs. Data Bus Bound
B.4.4.5. Low Number of Iterations
B.4.4.6. Bandwidth Limitations
B.4.4.7. Cache Memory Considerations
B.4.4.8. Cache Blocking
B.4.4.9. Prefetch Unrolling
B.4.4.10. Pointer Prefetch
B.4.4.11. Loop Interchange
B.4.4.12. Loop Fusion
B.4.4.13. Prefetch to Reduce Register Pressure
B.5 Instruction Scheduling
B.5.1 Scheduling Loads
B.5.1.1. Scheduling Load and Store Double (LDRD/STRD)
B.5.1.2. Scheduling Load and Store Multiple (LDM/STM)
B.5.2 Scheduling Data Processing Instructions
B.5.3 Scheduling Multiply Instructions
B.5.4 Scheduling SWP and SWPB Instructions
B.5.5 Scheduling the MRA and MAR Instructions (MRRC/MCRR)
B.5.6 Scheduling the MIA and MIAPH Instructions
B.5.7 Scheduling MRS and MSR Instructions
B.5.8 Scheduling CP15 Coprocessor Instructions
B.6 Optimizing C Libraries
B.7 Optimizations for Size
B.7.1 Space/Performance Trade Off
B.7.1.1. Multiple Word Load and Store
B.7.1.2. Use of Conditional Instructions
C Test Features
C.1 Introduction
C.2 JTAG - IEEE1149.1
C.2.1 Boundary Scan Architecture
C.2.2 TAP Pins
C.2.3 Instruction Register (IR)
C.2.3.1. Boundary-Scan Instruction Set
Table C-3. IEEE Instructions
C.2.4 TAP Test Data Registers
C.2.4.1. Device Identification Register
C.2.4.2. Bypass Register
C.2.4.3. Boundary-Scan Register
C.2.5 TAP Controller
C.2.5.1. Test Logic Reset State
C.2.5.2. Run-Test/Idle State
C.2.5.3. Select-DR-Scan State
C.2.5.4. Capture-DR State
C.2.5.5. Shift-DR State
C.2.5.6. Exit1-DR State
C.2.5.7. Pause-DR State
C.2.5.8. Exit2-DR State
C.2.5.9. Update-DR State
C.2.5.10. Select-IR Scan State
C.2.5.11. Capture-IR State
C.2.5.12. Shift-IR State
C.2.5.13. Exit1-IR State
C.2.5.14. Pause-IR State
C.2.5.15. Exit2-IR State
C.2.5.16. Update-IR State
C.2.5.17. Boundary-Scan Example
Figure C-3. JTAG Example
Figure C-4. Timing Diagram Illustrating the Loading of Instruction Register
Figure C-5. Timing Diagram Illustrating the Loading of Data Register