In this section, we review some of the data structures and introduce some notations used in this chapter.
6.2.1 Compressed row storage scheme (CRS)
Storage schemes used for unstructured sparse matrices usually involve some form of indirect indexing of the non-zero elements via auxiliary data structures. For example, the compressed row storage (CRS) scheme [4] uses two auxiliary arrays: colind, of length τ (the number of non-zero elements), and rowptr, of length m+1, where m is the number of rows of S. This is the most common storage scheme for sparse matrices. The three arrays required to store the sparse matrix S are described below.
1. value: for storing the non-zeros of S row-by-row,
2. colind: for storing the column index of each non-zero, and
3. rowptr: for storing the index of the first non-zero of each row in the value array.
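To make the three arrays concrete, the following minimal sketch builds them from a small dense matrix. The function name `dense_to_crs` and the 3 x 4 example matrix are assumptions introduced here for illustration; they do not appear in the chapter.

```python
def dense_to_crs(S):
    """Build the CRS arrays (value, colind, rowptr) of a dense matrix S.

    S is a list of rows; zero entries are skipped. Indices are 0-based.
    """
    value, colind, rowptr = [], [], [0]
    for row in S:
        for j, s in enumerate(row):
            if s != 0:
                value.append(s)    # non-zeros stored row-by-row
                colind.append(j)   # column index of each non-zero
        rowptr.append(len(value))  # position where the next row starts
    return value, colind, rowptr


# Hypothetical 3 x 4 example with 5 non-zeros.
S = [
    [5, 0, 0, 2],
    [0, 0, 3, 0],
    [0, 4, 0, 1],
]
value, colind, rowptr = dense_to_crs(S)
```

In this sketch rowptr has m+1 = 4 entries and its last entry equals τ, so the non-zeros of row i occupy positions rowptr[i] through rowptr[i+1]−1 of value and colind.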
6.2.2 SpMxV with CRS scheme
Sample code for computing y = Sx under the CRS scheme is given in Algorithm 6. In this algorithm, accesses to the vector y and to all three CRS arrays are regular. Accesses to the vector x, however, may be irregular because the column indices within a row need not be consecutive. The resulting cache misses on x can make SpMxV slow in practice.
6.2.3 Compressed column storage scheme (CCS)
This scheme is the same as CRS except that the non-zeros are stored column-by-column. As with CRS, three arrays are used in the compressed column storage (CCS) scheme to store the sparse matrix S; they are described below.
1. value: for storing the non-zeros of S column-by-column,
2. rowind: for storing the row index of each non-zero, and
3. colptr: for storing the index of the first non-zero of each column in the value array.
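The column-oriented layout can be sketched in the same way as the row-oriented one. The function name `dense_to_ccs` and the example matrix below are hypothetical and serve only to show how the three arrays are filled.

```python
def dense_to_ccs(S):
    """Build the CCS arrays (value, rowind, colptr) of a dense matrix S.

    S is a list of rows; zero entries are skipped. Indices are 0-based.
    """
    value, rowind, colptr = [], [], [0]
    n = len(S[0]) if S else 0
    for j in range(n):
        for i, row in enumerate(S):
            if row[j] != 0:
                value.append(row[j])  # non-zeros stored column-by-column
                rowind.append(i)      # row index of each non-zero
        colptr.append(len(value))     # position where the next column starts
    return value, rowind, colptr


# Hypothetical 3 x 4 example with 5 non-zeros.
S = [
    [5, 0, 0, 2],
    [0, 0, 3, 0],
    [0, 4, 0, 1],
]
value, rowind, colptr = dense_to_ccs(S)
```

Here colptr has n+1 = 5 entries, and the non-zeros of column j occupy positions colptr[j] through colptr[j+1]−1.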
6.2.4 Notations
We consider a sparse matrix $S$ with arbitrary sparsity structure having $m$ rows ($r_0, \ldots, r_{m-1}$), $n$ columns ($c_0, \ldots, c_{n-1}$) and $\tau$ non-zero elements. Here, $s_{i,j}$ refers to the entry of $S$ in the $i$-th row and the $j$-th column. We denote by $\rho_r$ and $\rho_c$ the average number of non-zeros in a row and in a column, respectively. Let $1/p$ be the probability that an element of $S$ is non-zero. Throughout this chapter, we assume that $m$ and $n$ are positive integers of machine word size, or smaller. We assume that $\tau > m + n$ and $\min(m, n) > p$. Time and space complexity estimates are given for the RAM model with memory holding a finite number of $w$-bit words, for a fixed $w$ [64]. Cache complexity is measured by considering the ideal cache model described in Chapter 2.
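The quantities above are linked by simple counting identities, spelled out below. The first two follow directly from the definitions of the averages; the third assumes that $1/p$ is read as the overall density of $S$, which is an interpretation adopted here rather than a statement made in the text.

```latex
\rho_r = \frac{\tau}{m}, \qquad
\rho_c = \frac{\tau}{n}, \qquad
\frac{1}{p} \approx \frac{\tau}{mn}
\;\Longrightarrow\;
\rho_r \approx \frac{n}{p}, \quad \rho_c \approx \frac{m}{p}.
```

As a hypothetical example, $m = n = 1000$ and $\tau = 10000$ give $\rho_r = \rho_c = 10$ and $p \approx 100$, which is consistent with the assumptions $\tau > m + n$ and $\min(m, n) > p$.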
Algorithm 6: SpMxV(value, colind, rowptr, x)
Input: value, colind, rowptr are three arrays that represent S in CRS, and a dense vector x
Output: vector y, where y = Sx
for all i = 0, 1, . . . , m−1 do
    y[i] = 0;
for i = 0, 1, . . . , m−1 do
    for k = rowptr[i] to rowptr[i+1]−1 do
        j = colind[k];
        y[i] += value[k] * x[j];
return y;
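Algorithm 6 can be transcribed almost line for line into runnable code. The sketch below is one such transcription, assuming 0-based indices and plain Python lists; the function name `spmxv_crs` and the sample data are hypothetical.

```python
def spmxv_crs(value, colind, rowptr, x):
    """Compute y = S x for a matrix S stored in CRS (0-based indices)."""
    m = len(rowptr) - 1  # number of rows of S
    y = [0] * m
    for i in range(m):
        # Non-zeros of row i occupy positions rowptr[i] .. rowptr[i+1]-1.
        for k in range(rowptr[i], rowptr[i + 1]):
            j = colind[k]
            y[i] += value[k] * x[j]  # x[j] is the potentially irregular access
    return y


# Hypothetical example: S = [[5, 0, 0, 2], [0, 0, 3, 0], [0, 4, 0, 1]].
value = [5, 2, 3, 4, 1]
colind = [0, 3, 2, 1, 3]
rowptr = [0, 2, 3, 5]
x = [1, 2, 3, 4]
y = spmxv_crs(value, colind, rowptr, x)
```

In this example value and colind are read sequentially, whereas x is read at positions 0, 3, 2, 1, 3, which illustrates the irregular access pattern on x discussed in Section 6.2.2.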
6.2.5 Binary reflected Gray code
A $q$-bit binary reflected Gray code [43] is a Gray code denoted by $G^q$ and defined by $G^1 = [0, 1]$ and
$$G^q = [\,0G^{q-1}_0, \ldots, 0G^{q-1}_{2^{q-1}-1},\; 1G^{q-1}_{2^{q-1}-1}, \ldots, 1G^{q-1}_0\,], \quad \text{for } q > 1,$$
where $G^q_i$ is the $i$-th binary string of $G^q$ and $0 \le i < 2^q$. We call $i$ the rank of $G^q_i$ in $G^q$. For example, $G^2 = [00, 01, 11, 10]$ and $G^3 = [000, 001, 011, 010, 110, 111, 101, 100]$. So, the rank of 011 in $G^3$ is 2. For details please see [43].
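The recursive definition can be checked with a few lines of code. The sketch below builds the list by the reflect-and-prefix rule and computes a rank with the standard Gray-to-binary prefix-XOR conversion; that conversion is a well-known property of the code rather than something stated in this section, and the function names `brgc` and `rank` are assumptions made here.

```python
def brgc(q):
    """Return the q-bit binary reflected Gray code as a list of bit strings."""
    if q < 1:
        raise ValueError("q must be at least 1")
    codes = ["0", "1"]  # the 1-bit code
    for _ in range(q - 1):
        # Prefix 0 to the previous list, then prefix 1 to its reversal.
        codes = ["0" + c for c in codes] + ["1" + c for c in reversed(codes)]
    return codes


def rank(code):
    """Rank of a Gray code string, computed by prefix-XOR of its bits."""
    r, bit = 0, 0
    for ch in code:
        bit ^= int(ch)      # parity of the 1s seen so far
        r = (r << 1) | bit  # append the corresponding binary digit
    return r
```

With these definitions, `brgc(3)` reproduces the list given for the 3-bit code above and `rank("011")` evaluates to 2.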
6.2.6 Sorting of binary reflected Gray codes
In this chapter, we develop a new row and column permuting algorithm for sparse matrices based on binary reflected Gray codes. We call it BRGC ordering. For our proposed reordering algorithm, we treat each non-zero of $S$ as 1. We also treat each column of $S$ as a binary reflected Gray code in $G^m$. As in Section 5.2, we consider the bits from row 0 and row $m-1$ to be the most and least significant bits, respectively.
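As a small illustration of this convention, the sketch below turns each column of a matrix stored in CCS into an m-bit string with row 0 as the most significant bit. The function name `column_code` and the sample pattern are hypothetical; only the sparsity pattern matters, since every non-zero is treated as 1.

```python
def column_code(rows, m):
    """Bit string of length m with a 1 at each row index listed in rows."""
    bits = ["0"] * m
    for i in rows:
        bits[i] = "1"     # every non-zero is treated as 1
    return "".join(bits)  # position 0 (row 0) is the most significant bit


# Hypothetical CCS pattern of a 3 x 4 matrix (numerical values are irrelevant).
rowind = [0, 2, 1, 0, 2]
colptr = [0, 1, 2, 3, 5]
m = 3
codes = [
    column_code(rowind[colptr[j]:colptr[j + 1]], m)
    for j in range(len(colptr) - 1)
]
```

The four columns of this example map to the codes 100, 001, 010 and 101.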
In this section, we explain how binary reflected Gray codes can be sorted in descending order of their ranks using the sorting algorithm proposed in Chapter 5. From the definition of the binary reflected Gray code in Section 6.2.5, we obtain Corollary 3, which is the basis for sorting binary reflected Gray codes.
Corollary 3. Let $G^q_i$ and $G^q_j$ be two different binary reflected Gray codes in $G^q$. Let their first disagree bit (see Section 5.2) be $h$ for $0 \le h < q$. Assume that the $h$-th bit of $G^q_j$ is 1. If the number of 1s in $G^q_i$ or $G^q_j$ before the $h$-th bit is even (odd), we can conclude $j > i$ ($i > j$).
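Corollary 3 amounts to a comparison rule that never computes a rank explicitly. The sketch below implements that rule as a comparator and uses it with Python's built-in sort to order a few codes by descending rank. This is only an illustration: it does not reproduce the Chapter 5 sorting algorithm, and the names `gray_cmp` and `rank` are assumptions introduced here.

```python
from functools import cmp_to_key


def gray_cmp(a, b):
    """Compare two equal-length Gray code strings by rank using Corollary 3.

    Returns -1 if rank(a) < rank(b), 1 if rank(a) > rank(b), 0 if a == b.
    """
    ones_before = 0
    for bit_a, bit_b in zip(a, b):
        if bit_a != bit_b:
            # First disagreeing bit: with an even number of preceding 1s the
            # code holding 1 here has the larger rank; an odd number flips it.
            b_has_one = bit_b == "1"
            b_larger = b_has_one if ones_before % 2 == 0 else not b_has_one
            return -1 if b_larger else 1
        if bit_a == "1":
            ones_before += 1
    return 0


def rank(code):
    """Reference rank via the standard Gray-to-binary prefix-XOR conversion."""
    r, bit = 0, 0
    for ch in code:
        bit ^= int(ch)
        r = (r << 1) | bit
    return r


codes = ["100", "001", "010", "101"]
# Descending order of rank, using the built-in sort with the comparator above.
ordered = sorted(codes, key=cmp_to_key(gray_cmp), reverse=True)
```

The reference function `rank` is included only so that the comparator can be checked against explicit ranks; the comparator itself inspects bits up to the first disagreement and nothing beyond it.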
Proposition 19 describes how we can modify our proposed sorting algorithm in Chapter 5 to sort binary reflected Gray codes according to their ranks.
Proposition 19. Our proposed sorting algorithm in Chapter 5 can sort binary reflected Gray codes in descending order according to their ranks with one modification. While creating $A_{k+1}$ from $A_k$ (see Section 5.4.1), we need to form array $L$ and apply a stable sort algorithm on $L$ to obtain $L'$ in ascending order only when $k$ is even.
Proof ⊲ It follows from Corollary 3.
Note that we do not use a well-established sorting algorithm, such as quick sort, for this purpose, although an off-the-shelf implementation such as the sort routine available in the C++ STL could be used. The reasons for not using such a technique are given below.
1. Our reordering algorithm, which is described later in this chapter, is not just a sorting of columns according to their ranks as binary reflected Gray codes.
2. As we have already seen in Chapter 5, our proposed sorting algorithm is suitable for sparse objects.