

PART C: PERFORMANCE MONITORS EXTENSION

M, bit [26], when EL3 is implemented, and accessed as a system register in AArch64 state or by the OPTIONAL external debug interface

1 Allow EL0 to read PMCCNTR_EL0


CheckForPMUOverflow()
    // Signal Performance Monitors overflow IRQ and CTI overflow events
    pmuirq = (PMCR_EL0.E == '1' && PMINTENSET_EL1<31> == '1' && PMOVSSET_EL0<31> == '1');
    for n = 0 to UInt(PMCR_EL0.N) - 1
        E = (if !HaveEL(EL2) || n < UInt(MDCR_EL2.HPMN) then PMCR_EL0.E else MDCR_EL2.HPME);
        if E == '1' && PMINTENSET_EL1<n> == '1' && PMOVSSET_EL0<n> == '1' then pmuirq = TRUE;

    // These functions are not defined here. For details of the GIC, see the Generic Interrupt
    // Controller specifications. SetInterruptRequestLevel only makes a request for an interrupt.
    // It is the job of the GIC to generate an interrupt request to the processor (if enabled
    // and prioritized).
    SetInterruptRequestLevel(InterruptID_PMUIRQ, if pmuirq then HIGH else LOW);

    // For details of the CTI, see Embedded Cross-Trigger.
    CTI_SetEventLevel(CrossTriggerIn_PMUOverflow, if pmuirq then HIGH else LOW);

    // The request remains set until the condition is cleared. (For example, an interrupt handler
    // or cross-triggered event handler clears the overflow status flag by writing to PMOVSCLR_EL0.)
    return;
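The overflow check above can be modelled in a few lines of Python. This is a minimal sketch for experimentation, not part of the architecture: the function name `pmu_irq_asserted` and its parameters are hypothetical, register fields are passed in as plain integers, and the GIC and CTI signalling is reduced to a boolean return value.

```python
def pmu_irq_asserted(pmcr_e, pmcr_n, intenset, ovsset,
                     have_el2=False, hpmn=0, hpme=0):
    """Return True if the PMU overflow interrupt request would be HIGH.

    pmcr_e, hpme : single-bit enables (0 or 1), standing in for PMCR_EL0.E
                   and MDCR_EL2.HPME
    pmcr_n       : number of event counters (PMCR_EL0.N)
    intenset     : 32-bit value of PMINTENSET_EL1
    ovsset       : 32-bit value of PMOVSSET_EL0
    hpmn         : value of MDCR_EL2.HPMN (only used when have_el2 is True)
    """
    def bit(value, i):
        return (value >> i) & 1

    # Bit 31 is the cycle counter; it is gated by PMCR_EL0.E only.
    irq = pmcr_e == 1 and bit(intenset, 31) == 1 and bit(ovsset, 31) == 1

    for n in range(pmcr_n):
        # Counters at or above HPMN are gated by MDCR_EL2.HPME when EL2 exists.
        enable = pmcr_e if (not have_el2 or n < hpmn) else hpme
        if enable == 1 and bit(intenset, n) == 1 and bit(ovsset, n) == 1:
            irq = True
    return irq
```

For example, with EL2 implemented and HPMN = 2, an overflow on counter 3 raises the request when MDCR_EL2.HPME is 1 even if PMCR_EL0.E is 0.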

6.10 Configuring Performance Monitors registers

In this section:

• N is the IMPLEMENTATION DEFINED number of counters.

  p is the smallest number of bits required for a value in the range [0 .. N); p = ceil(log2(N)).

  h is the smallest number of bits required for a value in the range [0 .. N]; h = ceil(log2(N + 1)).

  Note: If N is a power of two, then h = p + 1; otherwise h = p.

• M is the number of counters available in the current EL and state:

  — M = N if any of the following applies:
      • EL2 is not implemented.
      • The processor is in Secure state.
      • The processor is at EL2 in Non-secure state.

  — Otherwise, M = MDCR_EL2.HPMN, or as otherwise defined below.

• H is the value written to MDCR_EL2.HPMN by software.

• P is the value written to PMSELR_EL0.SEL by software.
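The definitions of p, h, and M can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, N is assumed to be at least 1, and the "as otherwise defined below" cases for invalid HPMN values are not modelled. Integer bit lengths are used instead of floating-point logarithms so the results are exact.

```python
def p_bits(n_counters):
    """p = ceil(log2(N)): bits needed for a value in the range [0 .. N)."""
    return (n_counters - 1).bit_length()


def h_bits(n_counters):
    """h = ceil(log2(N + 1)): bits needed for a value in the range [0 .. N]."""
    return n_counters.bit_length()


def available_counters(n_counters, have_el2, secure, el, hpmn):
    """M: number of counters available in the current EL and state.

    Only covers a valid MDCR_EL2.HPMN value (non-zero and <= N).
    """
    if not have_el2 or secure or el == 2:
        return n_counters
    return hpmn
```

For N = 6 this gives p = 3 and h = 3; for N = 4 (a power of two) it gives p = 2 and h = 3, matching the Note.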

6.10.1 MDCR_EL2.HPMN

Rule: Hypervisors must set MDCR_EL2.HPMN to a non-zero value ≤ N.

If H > N or H = 0 then:

• The behavior in Non-secure EL1 and EL0 is CONSTRAINED UNPREDICTABLE, and is one of:

  — M has an UNKNOWN non-zero value ≤ N.

  — No access to any counters, that is, M = 0.

  Note: In these cases the behavior includes the values returned by reads of PMCR_EL0.N within the guest, the value of M as used below, and the effect on reads and writes of PMCNTENSET_EL0, et al., as defined in [v7A]. If the behavior is such that M = 0, then “x ≥ M” is true for all possible values of x.

• Direct reads of MDCR_EL2.HPMN return an UNKNOWN value.

Note: The ARM preferred behaviors are based on either:

i. A (simple) implementation where all of HPMN[4:0] are implemented and:

  — In Non-secure EL1 and EL0:
      • If H > N then M = N.
      • If H = 0 then M = 0.

  — Reads of MDCR_EL2.HPMN return H.

ii. A (lower-cost) implementation where HPMN[4:h] are RAZ/WI:

  — In Non-secure EL1 and EL0:
      • If (H modulo 2^h) > N then M = N.
      • If (H modulo 2^h) = 0 then M = 0, and the guest has no access to any counters.
      • Otherwise M = (H modulo 2^h).

  — Reads of MDCR_EL2.HPMN return (H modulo 2^h), which may be zero.
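The two preferred behaviors can be compared with a small model. This is a sketch under stated assumptions: `guest_counters` is a hypothetical name, H is taken to be a 5-bit value, and only the two ARM preferred options are modelled, not the full CONSTRAINED UNPREDICTABLE space.

```python
def guest_counters(h_written, n_counters, lower_cost=False):
    """Return (M, readback) for a value H written to MDCR_EL2.HPMN.

    lower_cost=False models option (i): all of HPMN[4:0] implemented.
    lower_cost=True  models option (ii): HPMN[4:h] are RAZ/WI, so only
    the low h bits of H are kept, where h = ceil(log2(N + 1)).
    """
    if lower_cost:
        h = n_counters.bit_length()
        value = h_written % (1 << h)
    else:
        value = h_written

    if value > n_counters:
        m = n_counters      # too large: guest sees all N counters
    elif value == 0:
        m = 0               # guest has no access to any counters
    else:
        m = value
    return m, value
```

With N = 6 (so h = 3), writing H = 9 yields M = 6 in the simple implementation but M = 1 in the lower-cost one, and writing H = 8 to the lower-cost implementation leaves the guest with no counters.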

6.10.2 PMSELR_EL0.SEL

Rule: Software must set PMSELR_EL0.SEL to either a value < M or 31 (to select PMCCFILTR_EL0).

If P ≥ M and P ≠ 31 then:

• Direct reads of PMSELR_EL0.SEL return an UNKNOWN value.

• Direct reads and writes of PMXEVTYPER_EL0 and PMXEVCNTR_EL0 are CONSTRAINED UNPREDICTABLE, and must behave as one of:

  — UNDEFINED.

  — If P < N, generate a Trap exception taken to EL2 with the appropriate syndrome.

  — RAZ/WI.

  — No-op.

  — As if PMSELR_EL0.SEL has an UNKNOWN value < M.

  — As if PMSELR_EL0.SEL is 31.

If P = 31 then reads and writes of PMXEVCNTR_EL0 must also have one of the behaviors listed above (other than the last).

If PMUSERENR_EL0.{ER,EN} or MDCR_EL2.TPM enable a trap on the access then:

• If the CONSTRAINED UNPREDICTABLE behavior means the access is UNDEFINED, then this is a higher priority exception than a trap to EL1 or EL2 (see [v8Exception]).

• Otherwise the trap is taken.

Note: The ARM preferred behaviors are based on implementing either:

i. A (simple) implementation where all of SEL[4:0] are implemented, and if P ≥ M and P ≠ 31 then the register is RAZ/WI.

ii. A (lower-cost) implementation where SEL[4:h] are RAZ/WI, and the “all ones” value for SEL[(h - 1):0] is used to encode SEL=31:

  — If (P modulo 2^h) = (2^h - 1) then the behavior is as if SEL=31.

  — Else if (P modulo 2^h) ≥ M then PMXEVTYPER_EL0 and PMXEVCNTR_EL0 are RAZ/WI.

  — Otherwise the behavior is as if SEL = (P modulo 2^h).

For SEL=31, PMXEVCNTR_EL0 is RAZ/WI.
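The selector decoding in the two preferred implementations can be sketched as follows. The function name `select_target` and the string return values are hypothetical, chosen only to make the three outcomes visible. The result describes which register PMXEVTYPER_EL0 and PMXEVCNTR_EL0 behave as if they addressed.

```python
def select_target(p_written, n_counters, m_available, lower_cost=False):
    """Decode a value P written to PMSELR_EL0.SEL.

    Returns "SEL=31" (cycle counter filter), "RAZ/WI", or the selected
    event counter index as an int.
    """
    if not lower_cost:
        # Option (i): all of SEL[4:0] are implemented.
        if p_written == 31:
            return "SEL=31"
        return p_written if p_written < m_available else "RAZ/WI"

    # Option (ii): only SEL[h-1:0] is implemented; all ones encodes SEL=31.
    h = n_counters.bit_length()          # h = ceil(log2(N + 1))
    sel = p_written % (1 << h)
    if sel == (1 << h) - 1:
        return "SEL=31"
    if sel >= m_available:
        return "RAZ/WI"
    return sel
```

With N = 6 and M = 4, a write of P = 10 is RAZ/WI in the simple implementation but aliases to counter 2 in the lower-cost one, while P = 7 aliases to SEL=31.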

6.10.3 PMEVCNTRn_EL0 and PMEVTYPERn_EL0 (direct access)

Rule: Software must only use the direct access registers PMEVCNTRn_EL0 and PMEVTYPERn_EL0 for n < M.

If n ≥ M then direct reads and direct writes of PMEVTYPERn_EL0 and PMEVCNTRn_EL0 are CONSTRAINED UNPREDICTABLE, and must behave as one of:

• UNDEFINED.

• If n < N, generate a Trap exception taken to EL2 with the appropriate syndrome.

• RAZ/WI.

• No-op.

If PMUSERENR_EL0.{ER,EN} or MDCR_EL2.TPM enable a trap on the access then:

• If the CONSTRAINED UNPREDICTABLE behavior means the access is UNDEFINED, then this is a higher priority exception than a trap to EL1 or EL2 (see [v8Exception]).

• Otherwise the trap is taken.

Note: The ARM preferred behaviors are based on implementing:

— If n ≥ N then the instruction is UNDEFINED.

— Otherwise, if n ≥ M then the register is RAZ/WI.

This does give a virtualization hole, as software at EL1 can discover the value of N.
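The virtualization hole can be made concrete with a short model of the preferred behavior. This is a sketch only: the names are hypothetical, and the probe assumes that register indices 0 to 30 exist in the encoding space and that an UNDEFINED access is observable by the guest.

```python
def direct_access(n, n_counters, m_available):
    """ARM preferred behavior for a direct access to PMEVCNTRn_EL0/PMEVTYPERn_EL0."""
    if n >= n_counters:
        return "UNDEFINED"
    if n >= m_available:
        return "RAZ/WI"
    return "ACCESS"


def discover_n(n_counters, m_available):
    """What EL1 software could infer: the first index whose access is UNDEFINED."""
    return next(
        (n for n in range(31)
         if direct_access(n, n_counters, m_available) == "UNDEFINED"),
        31,  # no index was UNDEFINED, so all 31 counters are implemented
    )
```

Even when the hypervisor exposes only M = 2 counters, a guest probing an implementation with N = 6 sees RAZ/WI for n = 2 to 5 and UNDEFINED from n = 6, which reveals N.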

6.10.4 PMEVTYPERn_EL0[31:26] and PMCCFILTR_EL0[31:26]

Rule: Software must not program {P, U, NSK, NSU, NSH, M} such that they select one of the following nonsensical use cases:

— Not counting.

— Counting at EL1 in one security state and at EL0 in the other state (but not vice versa).

— Counting at EL0 and an EL higher than EL1.

P   U   NSK  NSU  NSH  M    ELs in which to count
1   1   0    0    0    0    None
(Do not count in EL1 in one state and in EL0 in the other state)
0   1   1    1    X    X    Secure EL1 and Non-secure EL0
1   0   1    1    X    X    Secure EL0 and Non-secure EL1
(Do not count EL0 plus any EL higher than EL1)
1   0   0    0    not 00    All EL0 plus either or both of EL2 and EL3
1   0   0    1    not 00    Secure EL0 plus either or both of EL2 and EL3
1   1   0    1    not 00    Non-secure EL0 plus either or both of EL2 and EL3

Table 40: Reserved encodings, software must not use

These combinations are not UNPREDICTABLE encodings and processors must implement the six filtering bits as described.
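The reserved encodings can be derived from the meaning of the filter bits rather than listed by hand. The sketch below assumes the usual bit semantics from the register description, which this section does not reproduce: P and U disable counting in Secure EL1 and EL0, NSK and NSU enable Non-secure EL1 and EL0 when equal to P and U respectively, NSH enables EL2, and M enables EL3 when equal to P. The function names and EL labels are hypothetical.

```python
def counted_els(P, U, NSK, NSU, NSH, M):
    """Return the set of Exception levels in which events are counted."""
    els = set()
    if P == 0:
        els.add("S-EL1")
    if U == 0:
        els.add("S-EL0")
    if NSK == P:
        els.add("NS-EL1")
    if NSU == U:
        els.add("NS-EL0")
    if NSH == 1:
        els.add("EL2")
    if M == P:
        els.add("EL3")
    return els


def is_reserved(P, U, NSK, NSU, NSH, M):
    """True if the encoding matches one of the three reserved use cases."""
    els = counted_els(P, U, NSK, NSU, NSH, M)
    el1 = els & {"S-EL1", "NS-EL1"}
    el0 = els & {"S-EL0", "NS-EL0"}
    higher = els & {"EL2", "EL3"}
    if not els:
        return True                      # not counting anywhere
    if (el1, el0) in (({"S-EL1"}, {"NS-EL0"}), ({"NS-EL1"}, {"S-EL0"})):
        return True                      # EL1 in one state, EL0 only in the other
    return bool(el0) and not el1 and bool(higher)   # EL0 plus EL2/EL3, skipping EL1
```

Under these assumptions the checker flags exactly the rows of Table 40, for example `is_reserved(1, 1, 0, 0, 0, 0)` for the "None" row, and accepts an all-zeros filter.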

6.10.5 PMEVTYPERn_EL0.evtCount

Rule: Software must program PMEVTYPERn_EL0.evtCount with an event defined by the processor or a common event defined by the architecture.

If evtCount is written with a reserved or not implemented event, the behavior depends on the event type:

Common architectural and microarchitectural events

— No events are counted.

— The value returned by a direct read of evtCount is the value that was written.

This allows some level of compatibility of events across implementations by requiring predictable safe behavior if an implementation does not implement the event, for events defined by this architecture.

IMPLEMENTATION DEFINED events

— It is UNPREDICTABLE what event, if any, is counted.

— The value returned by a direct read of evtCount is an UNKNOWN value with the same effect.

This allows implementations to simplify event decoding logic and include undocumented events.

UNPREDICTABLE means the event must not expose privileged information.

However, ARM recommends that the behavior across a family of implementations is defined such that if a given implementation does not include an event from a set of common IMPLEMENTATION DEFINED events, then no event is counted and the value read back on evtCount is the value written.

6.11 Events

6.11.1 Required events

PMUv3 requires that the events listed in Table 41 are present in all implementations.

Event number  Event type          Event mnemonic     Description
0x000         Architectural       SW_INCR            Instruction architecturally executed, condition code check pass, software increment
0x003         Microarchitectural  L1D_CACHE_REFILL   Level 1 data cache refill
0x004         Microarchitectural  L1D_CACHE          Level 1 data cache access
0x010         Microarchitectural  BR_MIS_PRED        Mispredicted or not predicted branch speculatively executed
0x011         Microarchitectural  CPU_CYCLES         Cycle
0x012         Microarchitectural  BR_PRED            Predictable branch speculatively executed
And at least one of:
0x008         Architectural       INST_RETIRED       Instruction architecturally executed
0x01B         Microarchitectural  INST_SPEC          Operation speculatively executed

Table 41: Required events in PMUv3

Note: ARM recommends that both these events are implemented.

With the caveats:

• Events 0x003 and 0x004 are only required if the processor implements an L1 data or unified cache.

• Events 0x010 and 0x012 are only required if the processor implements program-flow prediction.

However, ARM recommends the events are implemented as shown in Table 43.

• ARM strongly recommends event 0x008 is implemented.

See also “Speculatively executed” events below.
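The requirements of Table 41 and its caveats can be expressed as a simple conformance check. This is an illustrative sketch: the function name, parameter names, and data layout are hypothetical, and only the mandatory rules are checked, not the ARM recommendations.

```python
REQUIRED_ALWAYS = {0x000: "SW_INCR", 0x011: "CPU_CYCLES"}
REQUIRED_IF_L1_CACHE = {0x003: "L1D_CACHE_REFILL", 0x004: "L1D_CACHE"}
REQUIRED_IF_PREDICTION = {0x010: "BR_MIS_PRED", 0x012: "BR_PRED"}
AT_LEAST_ONE_OF = {0x008: "INST_RETIRED", 0x01B: "INST_SPEC"}


def missing_required_events(implemented, has_l1_cache=True, has_prediction=True):
    """Return mnemonics of required events absent from `implemented`.

    implemented : set of implemented event numbers
    has_l1_cache, has_prediction : whether the caveats for events
        0x003/0x004 and 0x010/0x012 apply to this processor
    """
    required = dict(REQUIRED_ALWAYS)
    if has_l1_cache:
        required.update(REQUIRED_IF_L1_CACHE)
    if has_prediction:
        required.update(REQUIRED_IF_PREDICTION)

    missing = sorted(name for number, name in required.items()
                     if number not in implemented)
    if not (AT_LEAST_ONE_OF.keys() & implemented):
        missing.append("INST_RETIRED or INST_SPEC")
    return missing
```

A processor with no L1 data cache and no program-flow prediction satisfies the table with only events 0x000, 0x011, and 0x008 implemented.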

6.11.2 Common architectural and microarchitectural events

Table 42 lists new common architectural and microarchitectural events defined by ARMv8. Events 0x000 to 0x01D are defined by [v7A].

Event number  Event type          Event mnemonic       Description
0x01E         Architectural       CHAIN                Chain.
                                                       For odd-numbered counters, counts once for each overflow of the preceding even-numbered counter on the same processor.
                                                       Note: This means the PMEVTYPERn_EL0.MT bit is ignored.
                                                       For even-numbered counters, does not count.
0x01F         Microarchitectural  L1D_CACHE_ALLOCATE   Level 1 data cache allocation without refill
0x020         Microarchitectural  L2D_CACHE_ALLOCATE