

3.5 Sources of Unpredictability Due to Coherence

3.5.3 Source 3: Inter-core Interference Due to Write Hits

This source is due to write hits in the private cache to non-modified lines. Since the predictable bus arbiter only controls accesses to the shared bus, a request that results in a hit in the private cache can proceed without waiting for the corresponding core slot. This yields two possible scenarios of interference as follows.

3.5.3.1 Source 3A: inter-core interference due to write hits to non-modified lines during another core’s slot

Figure 3.4a exemplifies this scenario. c0 has a version of A in its private cache that is not modified. During c2’s slot, c2 issues a write request to A, while c0 simultaneously has a write operation to A that results in a hit in its private cache. This creates a race with two possibilities. If c0’s write hit on A occurs first, c2 has to wait until c0 writes back A. On the other hand, if c2’s request appears on the bus first, c0 has to invalidate its own local copy of A. Hence, c0’s request to A will be a miss and has to wait for c2 to write back A before it gets another access to it. Assume that c0’s write hit occurs first and c2 waits. After c0 writes back A and during c2’s next slot, c0 has another write hit to A. Again, c2 has to wait for c0 to write back A. Consequently, this situation is repeatable and can starve c2.

Proposed solution. We avoid this interference by enforcing Invariant 3.4, which stalls a write request by a core that is a hit to a non-modified line until the arbiter grants an access slot to that core. This avoids the unpredictable consequences described above. It is worth noting that Invariant 3.4 aligns with Invariant 3.1 as follows. Invariant 3.1 mandates that a core can initiate coherence messages on the bus only when the arbiter grants it access. Although a write hit to a non-modified line does not need data from the shared memory, it still needs to send coherence messages on the bus. This is necessary to invalidate local copies of the same line that other cores have in their private caches. Accordingly, a write hit to a non-modified line has to wait for a granted access by the arbiter.

With Invariant 3.4 maintained in Figure 3.4a, the following behavior is guaranteed. Since the current slot belongs to c2, and c0’s request is a write hit to A, which is not modified, c0 must wait for its own slot to perform that request. Meanwhile, c2 issues its write request to A. Since no core has a modified version of A, c2 obtains A from the shared memory and performs the write operation. c0 invalidates its own local copy of A.

[Figure 3.4: Unpredictability source 3: inter-core interference due to write hits. (a) Source 3A: initially, c0 read A; c2 is under analysis. (b) Source 3B: initially, c0 modified A, c2 modified B, and c1 requested B; c2 is under analysis. Both panels are timelines over TDM slots; the slots in panel (b) are numbered 1 to 6, and the behavior under the proposed invariants is labeled Inv. 4 in (a) and Inv. 5 in (b).]

Invariant 3.4. A write request from ci that is a hit to a non-modified line in ci’s private cache has to wait until the arbiter grants ci an access slot.
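
The following Python fragment is a minimal sketch of the stall rule of Invariant 3.4 and of the worst-case race of Figure 3.4a. It is an illustration under stated assumptions, not the arbiter or cache-controller logic itself: the function names, the integer core identifiers, and the rule that an unstalled write hit by c0 always wins the race are hypothetical choices made only to replay the scenario of Source 3A.

```python
from typing import Optional


def write_hit_may_proceed(core: int, slot_owner: int, line_modified: bool) -> bool:
    """Decide whether a write hit in `core`'s private cache may proceed now."""
    if line_modified:
        # Assumption of this sketch: a hit to an already-modified line sends
        # no coherence message, so it is left unrestricted here.
        return True
    # Invariant 3.4: a write hit to a non-modified line waits for the core's slot.
    return core == slot_owner


def periods_until_c2_writes(enforce_invariant: bool, horizon: int) -> Optional[int]:
    """Replay the worst case of Figure 3.4a.

    Returns the TDM period in which c2's write to A completes, or None if it
    does not complete within `horizon` periods.
    """
    c0, c2 = 0, 2
    c0_has_modified_a = False  # initially, c0 holds a non-modified copy of A
    for period in range(1, horizon + 1):
        # c0's slot: write A back if an earlier write hit modified it.
        if c0_has_modified_a:
            c0_has_modified_a = False
        # c2's slot: c2 requests A while c0 attempts a write hit to A.
        if enforce_invariant:
            c0_wins = write_hit_may_proceed(c0, slot_owner=c2, line_modified=False)
        else:
            c0_wins = True  # assumed worst case: the unstalled write hit occurs first
        if c0_wins:
            c0_has_modified_a = True  # c2 stalls until c0 writes A back
        else:
            return period  # c2 obtains A from memory; c0 invalidates its copy
    return None
```

In this model, `periods_until_c2_writes(True, 5)` returns 1, whereas `periods_until_c2_writes(False, 5)` returns `None` for any horizon, which mirrors the starvation of c2 described for Source 3A.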

3.5.3.2 Source 3B: inter-core interference due to write hits to non-modified lines that are requested by another core

Invariant 3.4 resolves the race between a request generated by a core in its designated slot and write hits from other cores. However, write hits to non-modified lines can lead to another unpredictable situation that Invariant 3.4 does not manage. We illustrate this situation in Figure 3.4b. Initially, c0 has a modified version of A, c2 has a modified version of B, and c1 has requested B. At (1), c2 issues a read request to A; thus, in c0’s next slot (2), c0 updates the shared memory with the modified value of A. Since c2’s request is a read, c0 does not invalidate its local version of A. At (4), c2 has two pending actions: fetching A from memory, and writing back B to memory in response to c1’s request. Assume that c2 chooses to write back B. Therefore, its request to A waits for the next slot. At (5), c0 has a write hit to A. Since this is c0’s slot, the write hit conforms with Invariant 3.4, and c0 modifies A. At (6), c2 has to reissue its request to A and wait for c0 to write back A to memory again. From c2’s perspective, this situation is similar to the situation at (1). Similarly, in subsequent periods, after c0 writes back A, it can have a write hit to A before c2 receives it from memory. Clearly, this situation can repeat indefinitely and creates unbounded memory latency for c2.

Proposed solution. We avoid the unbounded latency by enforcing Invariant 3.5, which stalls a write request that is a hit to a non-modified line until all waiting requests to that line from previous slots are completed. This avoids the unpredictable consequences described above. With Invariant 3.5 maintained, as on the right side of Figure 3.4b, the following behavior is guaranteed. During c0’s slot, c0 has a write hit to A. Since A is not modified by c0 and was previously requested by c2, the write hit cannot be processed. Accordingly, c2 obtains A from the shared memory in its next slot and performs its operation. c0’s request to A is issued afterwards in the corresponding slot.

Invariant 3.5. A write request from ci that is a hit to a non-modified line, say A, in ci’s private cache has to wait until all waiting cores that previously requested A get an access to A.
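
As a rough sketch of how Invariants 3.4 and 3.5 could be checked together, the Python fragment below keeps, for each line, the cores whose requests are still waiting. The class `PendingRequests`, its method names, and the string line identifiers are hypothetical and introduced only for illustration; the sketch also assumes that a write hit to an already-modified line is unrestricted. The short replay at the end follows the right side of Figure 3.4b.

```python
from collections import defaultdict, deque


class PendingRequests:
    """Hypothetical bookkeeping: per line, the cores still waiting for it."""

    def __init__(self) -> None:
        self._waiting = defaultdict(deque)  # line -> waiting cores, oldest first

    def record(self, line: str, core: int) -> None:
        """Note that `core` has requested `line` and is waiting for it."""
        if core not in self._waiting[line]:
            self._waiting[line].append(core)

    def complete(self, line: str, core: int) -> None:
        """Note that `core` has obtained access to `line`."""
        if core in self._waiting[line]:
            self._waiting[line].remove(core)

    def write_hit_may_proceed(
        self, core: int, slot_owner: int, line: str, line_modified: bool
    ) -> bool:
        """Decide whether a write hit by `core` to `line` may proceed now."""
        if line_modified:
            return True  # assumption of this sketch: no restriction on modified lines
        if core != slot_owner:
            return False  # Invariant 3.4: wait for the core's own slot
        # Invariant 3.5: wait until no other core is still waiting for the line.
        return all(waiter == core for waiter in self._waiting[line])


# Replay of the right side of Figure 3.4b: c2 requested A before c0's write hit.
pending = PendingRequests()
pending.record("A", 2)
before_c2_served = pending.write_hit_may_proceed(
    core=0, slot_owner=0, line="A", line_modified=False
)
pending.complete("A", 2)  # c2 obtains A from the shared memory in its slot
after_c2_served = pending.write_hit_may_proceed(
    core=0, slot_owner=0, line="A", line_modified=False
)
```

Here `before_c2_served` is `False` and `after_c2_served` is `True`: c0’s write hit is stalled even in c0’s own slot while c2 is waiting for A, and it is released once c2 has been served.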