Introduction to Memory and System Architectures
ARM DDI 0100E Copyright © 1996-2000 ARM Limited. All rights reserved. B1-3

Note

Because of the wide variety of systems based on ARM processors, some of the functionality described in Part B might be inappropriate to any given system. Furthermore, some ARM processors have implemented functions in a different manner to the one described here. Because of this, the datasheet or Technical Reference Manual for a particular ARM processor is the definitive source for its memory and system control facilities.

Part B therefore does not attempt to specify absolute requirements on the functionality of the System Control coprocessor or other memory system components. Instead, it contains guidelines which, if followed:

• mean that the system is more likely to be compatible with existing and future ARM software
• probably make it easier to port incompatible software to the system.

In order to provide an adequate description of the range of memory and system facilities on existing ARM implementations, Part B describes a number of options that will not be used on new ARM implementations. For information on the rules that must be followed by new implementations of the memory and system architectures, contact ARM Ltd.

The fact that Part B describes a broad range of facilities, many of which are used only on some existing ARM implementations, also means that architecture version numbers for the memory and system architectures would not be helpful or descriptive. They are therefore not used.

1.2 System-level issues

This section lists a number of general and operating-system issues that the system designer needs to address when using an ARM processor.
1.2.1 Memory systems, write buffers and caches

ARM processors and software are designed to be connected to a byte-addressed memory. Word and halfword accesses to the memory ignore the alignment of the address and access the naturally-aligned value that is addressed (so a memory access ignores address bits 0 and 1 for word accesses, and ignores bit 0 for halfword accesses).

The endianness of the ARM processor should normally match that of the memory system, or be configured to match it before any non-word accesses occur (when the endianness is configurable and CP15 is implemented, bit[7] of CP15 register 1 controls the endianness).

Memory that is used to hold programs and data should be marked as follows:

• Main (RAM) memory is normally set as cachable and bufferable.
• ROM memory is normally set as cachable, and should be marked as read only, so the bufferable attribute is not used and should be 1.

Write buffers

Some ARM implementations incorporate a merging write buffer that subsumes multiple writes to the same location into a single write to main memory. Furthermore, some write buffers re-order writes, so that writes are issued to memory in a different order to the order in which they are issued by the processor. Therefore, I/O locations should not normally be marked as bufferable, to ensure all writes are issued to the I/O device in the correct order.

For writes to bufferable areas of memory, memory aborts can only be signaled to the processor as a result of conditions that are detectable at the time the data is placed in the write buffer. Conditions that can only be detected when the data is later written to main memory (such as a parity error from main memory) must be handled by other methods (typically by raising an interrupt).

Caches

Frame buffers can be cachable, but frame buffers on writeback cache implementations must be copied back to memory after the frame buffer has been updated.
Frame buffers can be bufferable, but again the write buffer must be written back to memory after the frame buffer has been updated.

ARM processors do not normally support cache coherence between the ARM and other system bus masters. Bus snooping is not supported. If memory data is to be shared between multiple bus masters without taking special software measures to ensure coherency, then the data must be mapped as:

• uncachable to ensure that all reads access main memory
• unbufferable to ensure that all writes access main memory.

Alternatively, using software, you can manage the coherence of data buffers that are read or written by another bus master by:

• cleaning data from writeback caches and write buffers to memory when the processor has written to the data buffer and before the other bus master reads the buffer
• flushing relevant data from caches when the buffer is being read after the other bus master has written the buffer.

You can use an uncached, unbuffered semaphore to maintain synchronization between multiple bus masters (see Semaphores on page B1-6).

For implementations with writeback caches, all dirty cache data must be written back before any alterations are made to the MMU page tables, to ensure that cache line write back can use the page tables to form the correct physical address for the transfer.

You can index caches using either virtual or physical addresses. Physical pages must only be mapped into a single virtual page, otherwise the result is UNPREDICTABLE. ARM processors do not normally provide coherence between multiple virtual copies of a single physical page.

Some ARM implementations support separate instruction and data caches.
Coherence between the data and instruction caches is not necessarily maintained in hardware, so if the instruction stream is written, the instruction cache and data cache must be made coherent. This can entail:

• cleaning the data cache (storing dirty data to memory)
• draining the write buffer (completing all buffered writes)
• flushing the instruction cache.

Instruction and data memory incoherence occurs after a program has been loaded (and therefore treated as data) and is about to be executed. It also occurs if self-modifying code is used or generated.

1.2.2 Interrupts

ARM processors implement fast and normal levels of interrupt. Both interrupts are signaled externally, and many implementations synchronize interrupts before an exception is raised.

Fast interrupt request (FIQ)
    Disables subsequent normal and fast interrupts by setting the I and F bits in the CPSR.

Normal interrupt request (IRQ)
    Disables subsequent normal interrupts by setting the I bit in the CPSR.

For more information, see Exceptions on page A2-13.

Canceling interrupts

It is the responsibility of software (the interrupt handler) to ensure that the cause of an interrupt is canceled (no longer signaled to the processor) before interrupts are re-enabled (by clearing the I and/or F bit in the CPSR). Interrupts can be canceled with any instruction that might make an external data bus access, meaning any load or store, a swap, or any coprocessor instruction.

Canceling an interrupt via an instruction fetch is UNPREDICTABLE. Canceling an interrupt with a load multiple that restores the CPSR and re-enables interrupts is UNPREDICTABLE.
Devices that do not instantaneously cancel an interrupt (that is, they do not cancel the interrupt before letting the access complete) must be probed by software to ensure that interrupts have been canceled before interrupts are re-enabled. This allows a device connected to a remote I/O bus to operate correctly.

1.2.3 Semaphores

The Swap and Swap Byte instructions have predictable behavior when used in two ways:

• Systems with multiple bus masters that use the Swap instructions to implement semaphores to control interaction between different bus masters. In this case, the semaphores must be placed in an uncached and unbufferable region of memory. The Swap instruction then causes a (locked) read-write bus transaction. This type of semaphore can be externally aborted.
• Systems with multiple threads running on a uniprocessor that use the Swap instructions to implement semaphores to control interaction of the threads. In this case, the semaphores can be placed in a cached and bufferable region of memory, and a (locked) read-write bus transaction might or might not occur. The Swap and Swap Byte instructions are likely to have better performance on such a system than they do on a system with multiple bus masters (as described above). This type of semaphore has UNPREDICTABLE behavior if it is externally aborted.

Semaphores placed in uncachable/bufferable memory regions have UNPREDICTABLE results. Semaphores placed in cachable/unbufferable memory regions have UNPREDICTABLE results.

Chapter B2 The System Control Coprocessor

This chapter describes coprocessor 15, the System Control coprocessor. It contains the following sections:

• About the System Control coprocessor on page B2-2
• Registers on page B2-3
• Register 0: ID codes on page B2-6
• Register 1: Control register on page B2-13
• Registers 2-15 on page B2-17.
2.1 About the System Control coprocessor

All of the standard memory and system facilities are controlled by coprocessor 15 (CP15), which is therefore called the System Control coprocessor. Some also use other methods of control, which are described in the chapters describing the facilities concerned. For example, the Memory Management Unit described in Chapter B3 Memory Management Unit is also controlled by page tables in memory.

If none of the standard memory and system facilities are implemented in a system, the System Control coprocessor might not be present. In this case, no coprocessor accepts CP15 instructions, and so all such instructions are UNDEFINED.

However, new implementations of the memory and system architectures must implement the System Control coprocessor, and must follow some additional rules about which facilities are implemented. For details of these rules, contact ARM Ltd.

This chapter describes the overall design of the System Control coprocessor and how its registers are accessed. Detailed information is given on some of its registers. Other registers are allocated to facilities described in detail in other chapters and are only summarized in this chapter.

2.2 Registers

The System Control coprocessor can contain up to 16 primary registers, each of which is 32 bits long. For some of these, additional bits in the register access instructions are used to identify a specific version of the register and/or specific types of access to the register, so the number of physical 32-bit registers in CP15 can be more than 16.
However, the 4-bit primary register number is used to identify registers in descriptions of the System Control coprocessor, because it is the primary factor determining the function of the register.

CP15 registers can be read-only, write-only or read/write. The detailed descriptions of the registers specify:

• what types of access are allowed
• what functionality is invoked by each type of access
• whether a primary register identifies more than one physical register, and if so, how they are distinguished
• any other details that are relevant to the use of the register.

2.2.1 Register access instructions

The only defined System Control coprocessor instructions are:

• MCR instructions to write an ARM register to a CP15 register
• MRC instructions to read the value of a CP15 register into an ARM register.

All CP15 CDP, LDC and STC instructions are UNDEFINED.

The MCR and MRC instructions to access the CP15 registers use the generic syntax for those instructions:

    MCR{<cond>} p15, 0, <Rd>, <CRn>, <CRm>{, <opcode2>}
    MRC{<cond>} p15, 0, <Rd>, <CRn>, <CRm>{, <opcode2>}

    Bits      31-28  27-24  23-21  20  19-16  15-12  11-8  7-5      4  3-0
    Contents  cond   1110   SBZ    L   CRn    Rd     1111  opcode2  1  CRm

where:

<cond>
    This is the condition under which the instruction is executed. The conditions are defined in The condition field on page A3-5. If <cond> is omitted, the AL (always) condition is used.

Bits[23:21]
    These bits of the instruction, which are the <opcode1> field in generic MRC and MCR instructions, are always 0b000 in valid CP15 instructions. If they are not 0b000, the instruction is UNPREDICTABLE.

<Rd>
    This is the ARM register involved in the transfer (the source register for MCR and the destination register for MRC). This register must not be R15, even though MRC instructions normally allow it to be R15. If R15 is specified for <Rd> in a CP15 MRC or MCR instruction, the instruction is UNPREDICTABLE.
<CRn>
    This is the primary CP15 register involved in the transfer (the destination register for MCR and the source register for MRC). The standard generic coprocessor register names are c0, c1, ..., c15.

<CRm>
    This is an additional coprocessor register name which is used for accesses to some primary registers to specify additional information about the version of the register and/or the type of access. When the description of a primary register does not specify <CRm>, c0 must be specified. If another register is specified, the instruction is UNPREDICTABLE.

<opcode2>
    This is an optional 3-bit number which is used for accesses to some primary registers to specify additional information about the version of the register and/or the type of access. If it is omitted, 0 is used. When the description of a primary register does not specify <opcode2>, it must be omitted or 0 must be specified. If another value is specified, the instruction is UNPREDICTABLE.

These MCR and MRC instructions can only be used while the processor is in a privileged mode. If they are executed while the processor is in User mode, an Undefined Instruction exception occurs.

Note

If access to some System Control coprocessor functionality by User mode programs is required, the usual solution is that the operating system defines one or more SWIs to supply it. As the precise set of memory and system facilities available on different processors can vary considerably, it is recommended that all such SWIs are implemented in an easily replaceable module and that the SWI interface of this module is defined to be as independent of processor details as possible.

The IMB and IMB_Range SWIs described in Instruction Memory Barriers (IMBs) on page A2-28 are examples of such SWIs.
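The field layout of the CP15 register access instructions in section 2.2.1 can be illustrated with a small C sketch that assembles the 32-bit encoding of an MCR or MRC instruction and rejects the operand combinations that the text calls UNPREDICTABLE. This is only an illustration under stated assumptions: the names cp15_encode and CP15_COND_AL are hypothetical and belong to no ARM-supplied interface, the AL condition is assumed to be encoded as 0b1110, and bit[20] (L) is assumed to be 1 for MRC and 0 for MCR, as in the generic coprocessor register transfer encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for this sketch; not part of any ARM-supplied API. */
#define CP15_COND_AL 0xEu   /* assumed encoding of the AL (always) condition */

/*
 * Build the encoding of
 *     MCR/MRC{<cond>} p15, 0, <Rd>, <CRn>, <CRm>, <opcode2>
 * is_mrc selects MRC (L = 1, assumed) or MCR (L = 0, assumed).
 * Bits[23:21] (<opcode1>) are always 0b000 for CP15 accesses.
 * Returns 1 and stores the encoding in *out, or returns 0 if a field
 * is out of range or <Rd> is R15 (UNPREDICTABLE for CP15 accesses).
 */
static int cp15_encode(int is_mrc, unsigned cond, unsigned rd, unsigned crn,
                       unsigned crm, unsigned opcode2, uint32_t *out)
{
    assert(out != NULL);

    if (cond > 15u || crn > 15u || crm > 15u || opcode2 > 7u)
        return 0;               /* field does not fit its bit range */
    if (rd >= 15u)
        return 0;               /* R15 must not be used as <Rd> */

    *out = ((uint32_t)cond << 28)                   /* bits[31:28] cond    */
         | ((uint32_t)0xEu << 24)                   /* bits[27:24] 1110    */
         | ((uint32_t)(is_mrc ? 1u : 0u) << 20)     /* bit[20]     L       */
         | ((uint32_t)crn << 16)                    /* bits[19:16] CRn     */
         | ((uint32_t)rd << 12)                     /* bits[15:12] Rd      */
         | ((uint32_t)0xFu << 8)                    /* bits[11:8]  1111    */
         | ((uint32_t)opcode2 << 5)                 /* bits[7:5]   opcode2 */
         | ((uint32_t)1u << 4)                      /* bit[4]      1       */
         | (uint32_t)crm;                           /* bits[3:0]   CRm     */
    return 1;
}
```

Under these assumptions, cp15_encode(1, CP15_COND_AL, 0, 0, 0, 0, &insn) stores 0xEE100F10, which corresponds to MRC p15, 0, R0, c0, c0, 0, and a request with <Rd> equal to 15 is refused rather than encoded.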
2.2.2 Primary register allocation

Table 2-1 shows the allocation of the primary registers of the System Control coprocessor.

Table 2-1 Primary register allocation

Reg  Generic use                    Specific uses                Details in
0    ID codes (read-only)           ID and Cache type            Register 0: ID codes on page B2-6
1    Control bits (read/write)      Miscellaneous control bits   Register 1: Control register on page B2-13
2    Memory protection and control  MMU: Translation table base  Register 2: Translation table base on page B3-23
                                    PU: Cachability bits         Register 2: Cachability bits on page B4-6
3    Memory protection and control  MMU: Domain access control   Register 3: Domain access control on page B3-24
                                    PU: Bufferability bits       Register 3: Bufferability bits on page B4-6
4    Memory protection and control  MMU: Reserved                Register 4: Reserved on page B3-24
                                    PU: Reserved                 Registers 4, 8, 10: Reserved on page B4-7
5    Memory protection and control  MMU: Fault status            Register 5: Fault status on page B3-24
                                    PU: Access permission bits   Register 5: Access permission bits on page B4-7
6    Memory protection and control  MMU: Fault address           Register 6: Fault address on page B3-25
                                    PU: Protection area control  Register 6: Protection area control on page B4-8
7    Cache and write buffer         Cache/write buffer control   Register 7: Cache functions on page B5-15
8    Memory protection and control  MMU: TLB control             Register 8: TLB functions on page B3-25
                                    PU: Reserved                 Registers 4, 8, 10: Reserved on page B4-7
9    Cache and write buffer         Cache lockdown               Register 9: Cache lockdown on page B5-18
10   Memory protection and control  MMU: TLB lockdown            Register 10: TLB lockdown on page B3-27
                                    PU: Reserved                 Registers 4, 8, 10: Reserved on page B4-7
11   Reserved                       -                            -
12   Reserved                       -                            -
13   Process ID                     Process ID                   Register 13: Process ID on page B6-6
14   Reserved                       -                            -
15   IMPLEMENTATION DEFINED         IMPLEMENTATION DEFINED       Implementation documents

2.3 Register 0: ID codes

CP15 register 0 contains one or more identification codes for the ARM and system implementation. When this register is read, the opcode2 field of the MRC instruction selects which identification code is wanted, as shown in Table 2-2, and the CRm field must be specified as c0 (if it is not, the instruction is UNPREDICTABLE). Writing to CP15 register 0 is UNPREDICTABLE.

It is recommended that all the ID registers in Table 2-2 are implemented, but only the main ID register (<opcode2> == 0) is mandatory. Whether or not other ID registers are implemented is IMPLEMENTATION DEFINED.

If an <opcode2> value corresponding to an unimplemented or reserved ID register is encountered, the System Control coprocessor returns the value of the main ID register. ID registers other than the main ID register are defined so that when implemented, their value cannot be equal to that of the main ID register. Software can therefore determine whether they exist by reading both the main ID register and the desired register and comparing their values. If the two values are not equal, the desired register exists.

2.3.1 Main ID register

When CP15 register 0 is read with <opcode2> == 0, an identification code is returned from which, among other things, the ARM architecture version number can be determined, as well as whether or not the Thumb instruction set has been implemented.

Note

Only some of the fields in CP15 register 0 are architecturally defined. The rest are IMPLEMENTATION DEFINED and provide more detailed information about the exact processor variant. Consult individual datasheets for the precise identification codes used for each processor.

For historical reasons, there are three distinct ways in which the CP15 register 0 ID code might need to be interpreted.
To determine which to use, look at bits[15:12] of the ID code:

• if they are 0x0, this indicates a pre-ARM7 processor
• if they are 0x7, this indicates that the processor is in the ARM7 family
• otherwise, a more recent processor family than ARM7 is involved.

Table 2-2 System Control coprocessor ID registers

opcode2  Register                  Details in
0b000    Main ID register          Main ID register
0b001    Cache Type register       Cache Type register on page B2-9
other    Reserved (see main text)  -

[...]

... ID 3 0 Revision

The processor ID values are as follows:

0x4156030  ARM3 (Architecture 2)
0x4156060  ARM600 (Architecture 3)
0x4156061  ARM610 (Architecture 3)
0x4156062  ARM620 (Architecture 3)

2.3.2 Cache Type register

If present,...

... processor The top four bits of this number are not allowed to be 0x0 or 0x7.

Bits[19:16]
    Contain an architecture code. The following architecture codes are defined (all other values of the architecture code are reserved by ARM Ltd):

    0x1  Architecture 4
    0x2  Architecture 4T
    0x3  Architecture 5
    0x4  Architecture 5T
    0x5  Architecture 5TE

Bits[23:20]
    Contain an IMPLEMENTATION DEFINED variant number. This is typically...

... Contain an IMPLEMENTATION DEFINED variant number.

Bit[23]
    Indicates which of the two possible architectures for an ARM7-based processor is involved:

    0  Architecture 3
    1  Architecture 4T

Bits[31:24]
    Contain an implementor code. See Post-ARM7 processors for these codes.

Pre-ARM7 processors

Four processors prior to ARM7 use ID codes in which bits[15:12] are 0x0, and no further processors will be allocated such...
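The selection rule on bits[15:12] and the field positions quoted in the fragments above can be sketched in C as follows. This is a minimal illustration, not a complete decoder: the type and function names (id_family, id_arch, main_id_family and so on) are hypothetical, only the fields named in the surviving text are interpreted (bits[19:16] for post-ARM7 parts, bit[23] for ARM7 parts, bits[31:24] for the implementor code), and pre-ARM7 values are simply reported as not decoded.

```c
#include <assert.h>
#include <stdint.h>

/* All names below are hypothetical and exist only for this sketch. */
typedef enum { ID_FAMILY_PRE_ARM7, ID_FAMILY_ARM7, ID_FAMILY_POST_ARM7 } id_family;

typedef enum {
    ID_ARCH_UNKNOWN,            /* reserved code, or not decoded here */
    ID_ARCH_3, ID_ARCH_4, ID_ARCH_4T, ID_ARCH_5, ID_ARCH_5T, ID_ARCH_5TE
} id_arch;

/* Choose the interpretation of the main ID register from bits[15:12]. */
static id_family main_id_family(uint32_t id)
{
    unsigned sel = (unsigned)((id >> 12) & 0xFu);
    if (sel == 0x0u) return ID_FAMILY_PRE_ARM7;
    if (sel == 0x7u) return ID_FAMILY_ARM7;
    return ID_FAMILY_POST_ARM7;
}

/* Implementor code, bits[31:24] (described for ARM7 and later parts). */
static unsigned main_id_implementor(uint32_t id)
{
    return (unsigned)(id >> 24);
}

/* Architecture, using bit[23] for ARM7 and bits[19:16] for later parts. */
static id_arch main_id_architecture(uint32_t id)
{
    switch (main_id_family(id)) {
    case ID_FAMILY_ARM7:
        return ((id >> 23) & 1u) ? ID_ARCH_4T : ID_ARCH_3;
    case ID_FAMILY_POST_ARM7:
        switch ((id >> 16) & 0xFu) {
        case 0x1: return ID_ARCH_4;
        case 0x2: return ID_ARCH_4T;
        case 0x3: return ID_ARCH_5;
        case 0x4: return ID_ARCH_5T;
        case 0x5: return ID_ARCH_5TE;
        default:  return ID_ARCH_UNKNOWN;   /* reserved architecture code */
        }
    default:
        return ID_ARCH_UNKNOWN;             /* pre-ARM7: not decoded here */
    }
}

/* An optional ID register exists only if it differs from the main ID. */
static int id_register_implemented(uint32_t main_id, uint32_t other_id)
{
    return other_id != main_id;
}
```

As a usage illustration with made-up values, an ID word of 0x41807000 has bits[15:12] equal to 0x7 and bit[23] set, so the sketch classifies it as an ARM7 family part implementing Architecture 4T with implementor code 0x41. The last helper restates the existence test from section 2.3: a read of an unimplemented ID register returns the main ID value, so equality means the register is absent.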
following codes are defined (all other values of the implementor code are reserved by ARM Ltd):

0x41 = A (ARM Ltd)
0x44 = D (Digital Equipment Corporation)
0x69 = i (Intel Corporation)

ARM7 family processors

If bits[15:12] of the...

... strategy has a reasonably easily predictable worst-case performance.

L4 (bit[15])
    For some ARM processors that support architecture version 5 or above, this bit controls a backwards-compatibility feature with previous versions of the architecture.

    0 = normal behavior for the architecture

... enter ARM or Thumb state (that is, to set the new value of the CPSR T bit) instead ignore bit[0] and stay in the current execution state. The instructions affected by this are:

• LDM (1) on page A4-30
• LDR on page A4-37
• POP on page A7-75

For ARM processors that support architecture versions prior to version 5, this bit should be treated as UNP/SBZP. For ARM processors that support architecture...

... to configure the ARM processor to the endianness of the memory system:

0 = Configured for little-endian memory system
1 = Configured for big-endian memory system

For details, see Endianness on page A2-23.

On ARM processors...
IMPLEMENTATION DEFINED purposes. See the technical reference manual for the implementation or other implementation-specific documentation for details of the facilities available through this register.

• CP15 registers 11, 12 and 14 are reserved for future expansion. Accessing (reading or writing) any of these registers is UNPREDICTABLE.

Chapter B3 Memory Management Unit

This chapter describes the memory system architecture based on a Memory Management Unit (MMU). It contains the following sections:

• About the MMU architecture on page B3-2
• Memory access sequence...
... Domains on page B3-17
• Aborts on page B3-18
• CP15 registers on page B3-23

3.1 About the MMU architecture

The Memory Management Unit (MMU) memory system architecture allows fine-grained control of a memory system. Most of the detailed ...