
5. Research design

5.5. Variables in research questions II, III, and IV and their validity

5.5.4. Control variables

The same set of control variables is used in all regression analyses.

A basic factor that influences an individual’s behaviour in many respects is age. When presenting the research questions, I already argued that we can expect higher protest participation among younger individuals. This was my reason for adding hypothesis 4, in which the protest participation of the young unemployed and the young employed is compared. For the same reasons it is also reasonable to include age as a control variable. As a control variable, Age simply equals the respondent’s age in years.

Age is asked in both WVS6 (v242) and ABIII (q1001). In hypothesis 4, youth is defined as people from 18 to 35 years of age. An age of 15 is regularly used as the lower limit for youth (United Nations 2004: 5; Urdal 2006: 615; Bjorvatn & Høigilt 2016: 54). However, 18 is an unavoidable lower limit here because there are no respondents younger than that. The upper limit of youth is often set lower than 35 years, but I follow the definition used by Bjorvatn & Høigilt, who argue that 35 is a more appropriate limit in the context of the Arab world, where youth is often associated with not yet having established oneself with a family or a secure job (Bjorvatn & Høigilt 2016: 54).
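The youth definition above can be expressed as a simple indicator. The following is a minimal sketch only: the function and constant names are hypothetical, and treating both bounds as inclusive is an assumption based on the phrase "from 18 to 35 years of age".

```python
# Hypothetical names; the bounds follow the definition of youth given in the text.
YOUTH_MIN_AGE = 18  # lowest age present in the samples
YOUTH_MAX_AGE = 35  # upper limit following Bjorvatn & Høigilt (2016)


def is_young(age_in_years):
    """Return 1 if the respondent counts as young (bounds assumed inclusive), else 0."""
    return 1 if YOUTH_MIN_AGE <= age_in_years <= YOUTH_MAX_AGE else 0


# Illustrative ages, not taken from the survey data.
ages = [18, 35, 36, 72]
youth_flags = [is_young(age) for age in ages]
```

The Age control variable itself needs no recoding, since it is simply the age in years.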

Education is another factor that clearly merits inclusion. As with age, I have already discussed why we can assume that people with higher levels of education are more likely to protest. For the binary logistic regression analyses, education is coded as a four-step ordinal variable. Compared to the original data, some categories are combined. In ABIII, respondents who answered the question about education level (q1003) by stating that they are illiterate or have no formal education are coded as 1; those with some education below secondary level, in other words elementary school, basic or preparatory school, or a pre-high school diploma, are coded as 2; those with a secondary education are coded as 3; and those with a higher education than that, in other words a mid-level diploma, a bachelor’s degree, or a master’s degree or above, are coded as 4. In WVS6, respondents who answered (v248) that they have no formal education are again coded as 1; those whose highest educational level attained is incomplete or complete primary school are coded as 2; respondents with incomplete or complete secondary education, of either the technical/vocational or the university-preparatory type, are coded as 3; and respondents with some university-level education, with or without a degree, are coded as 4. When I use linear regression in hypotheses 6 and 8, no ordinal variable can be included, and thus I have coded education as three binary variables. Respondents with no formal education serve as the reference category, and each of the three other educational levels has its own binary variable.
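The recoding described above can be sketched as follows. This is an illustration under stated assumptions, not the code used for the analyses: the category labels in the mapping are hypothetical stand-ins for the ABIII answer options (the data set uses its own codes), and the dummy-variable names are invented for this example.

```python
# Hypothetical labels standing in for the ABIII (q1003) answer categories.
ABIII_EDUCATION_LEVEL = {
    "illiterate / no formal education": 1,
    "elementary": 2,
    "basic or preparatory": 2,
    "pre-high school diploma": 2,
    "secondary": 3,
    "mid-level diploma": 4,
    "bachelor's degree": 4,
    "master's degree or above": 4,
}


def education_dummies(level):
    """Turn the four-step level into three binary variables.

    Level 1 (no formal education) is the reference category,
    so it is represented by all three dummies being 0.
    """
    return {
        "edu_below_secondary": int(level == 2),
        "edu_secondary": int(level == 3),
        "edu_higher": int(level == 4),
    }


level = ABIII_EDUCATION_LEVEL["mid-level diploma"]
dummies = education_dummies(level)
```

The four-step level would enter the logistic regressions directly, while the three binary variables would replace it in the linear regressions.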

In the Arab world men are active in politics more often than women. This lets us assume that men also protest more often. Hoffman & Jamal also find that, at the individual level, women in the Arab countries take part in protests less often than men (Hoffman & Jamal 2012: 184). The gender of the respondent is included in both ABIII (q1002) and WVS6 (v240). I have coded gender as 1 for males and 0 for females.

As gender is simply what we want to measure and it is measured directly here, there should be no problem with the validity of this indicator. The same applies to the control variables presented above, age and education.

Allansson et al. have noted that large-scale demonstrations and the use of central squares in major cities were common traits of the Arab Spring uprisings (Allansson et al. 2012: 45). That Arab Spring protests often took place in urban centres is one reason to include the respondent’s urban residence as a control variable. There are also more general reasons to assume that protests take place more often in urban than in rural areas. As a rule, urban areas are more easily accessible, which means that it is easier for people to gather in cities than in the countryside. Protesters often hope to attract as much attention as possible to their cause, and thus it makes sense to gather in urban areas, where more people and arguably also more media are present. Further, we may suppose that people living in urban areas take part in protests more often because it is easier for them simply to go out into the streets than it is for their fellow citizens in rural areas to travel to cities to protest. Thus I have added the respondent’s urban residence as a control variable, where 1 stands for respondents living in urban areas and 0 for respondents living in rural areas. Urban residence is also included as a control variable because in many of these countries, for example Morocco and Egypt, unemployment rates are higher in urban areas (Egypt State Information Service 2014; High Commission for Planning of Morocco 2016). If the unemployed protest more often than the employed, this might simply be because the unemployed more often live in urban areas, where taking part in protests is easier. Then again, if it should turn out that the unemployed are less active protesters than the employed, this might in some cases also be due to differences between rural and urban unemployment rates. At least in Tunisia, unemployment is claimed to be higher in rural areas (African Development Bank 2007: 11). To address this potential confounding of the results, it is important to include urban residence as a control variable.

For most countries, the ABIII data set includes information on whether the respondent lives in an urban or a rural area (q13), and this information is used as it is. For respondents from Kuwait, however, this information is not included. According to the WDI data set, 98.5% of Kuwaitis live in urban areas (World Bank 2015a). Thus I have coded all Kuwaiti respondents as urban. In Palestine the question is slightly modified by adding “refugee camp” as a third option. Since I have not been able to determine whether the refugee camps of Palestine should be considered urban or rural, I have excluded Palestinian respondents living in refugee camps. This excludes about 14% of the Palestinian respondents. WVS6 does not ask directly whether a respondent lives in a rural or an urban area, but there is a question about the size of the town (v253). I have recoded this question so that respondents living in towns with more than 10 000 inhabitants are urban and respondents living in towns with fewer than 10 000 inhabitants are rural. Another option would have been to use 50 000 inhabitants as the limit between rural and urban residence, but choosing 10 000 is justifiable because it brings the country-specific shares of the urban population closer to the figures in the WDI data set. In WVS6 the question about the size of the town was not asked in Egypt and Palestine, and thus respondents from these countries are excluded from all analyses done with WVS6.
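The recoding rules for urban residence can be summarised in a short sketch. It rests on several assumptions made purely for illustration: the function names are hypothetical, the ABIII answers are represented as plain strings rather than the data set's own codes, the WVS6 town-size category is represented by the lower bound of its population bracket, and a bracket starting at exactly 10 000 inhabitants is treated as urban.

```python
URBAN_THRESHOLD = 10_000  # inhabitants; the limit chosen in the text


def urban_from_wvs6(bracket_lower_bound):
    """Recode a WVS6 town-size bracket, given by its lower bound, to 1 (urban) or 0 (rural)."""
    return 1 if bracket_lower_bound >= URBAN_THRESHOLD else 0


def urban_from_abiii(country, residence):
    """Recode ABIII residence to 1 (urban), 0 (rural) or None (excluded)."""
    if country == "Kuwait":
        return 1  # item not available; all Kuwaiti respondents are coded as urban
    if residence == "urban":
        return 1
    if residence == "rural":
        return 0
    return None  # refugee camp (Palestine) or missing answer: respondent excluded


# Illustrative respondents, not taken from the survey data.
examples = [
    urban_from_abiii("Kuwait", None),
    urban_from_abiii("Palestine", "refugee camp"),
    urban_from_abiii("Morocco", "rural"),
    urban_from_wvs6(5_000),
    urban_from_wvs6(20_000),
]
```

Returning `None` for excluded respondents keeps them distinguishable from rural respondents, so that they can be dropped before the regression analyses.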

It is acceptable to use this question to control for how easily a respondent can reach the locations where protests often occur. However, the question could be better, for example by asking how many hours away from a major city the respondent lives. In a perfect world we would have geographic coordinates for the home of every respondent, and accurate and complete georeferenced data about every protest that has occurred. Then we could calculate how close to protests each respondent lives and evaluate how easily he or she could have reached the place where the protests took place. As this information is not currently available, we have to accept the rural/urban distinction as a substitute. Although surely inaccurate on some occasions, it should still capture in general what we want to measure and be fairly valid.

As we saw earlier, there were substantial differences between Arab countries in how the Arab Spring unfolded (Allansson et al. 2012). Different countries give citizens different ways to express their political views, and in some countries there are restrictions on the freedom of assembly (Gelvin 2012: 5–6). These points let us assume that country is a factor affecting how often respondents have taken part in protests, both in connection with the Arab Spring and in general. Thus country is included in the regression analyses as a control variable whenever more than one country is included in the sample. Although country is included in the regression analyses, its results are not reported in the tables. There is a separate section in the results where I briefly present the results for the country variable. As country is a nominal-level variable, it is coded as a set of binary variables, where each country gets its own binary variable. One country is defined as the reference category, but it does not matter much which country we choose, as we are not interested in the effect of country per se; we just want to rule out its impact on the variables in which we actually are interested. The country of a respondent is available for every data set and every respondent in the data.
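Coding a nominal variable as a set of binary variables with one reference category can be sketched as below. The function name, the variable-name prefix and the choice of reference country are hypothetical; the text does not state which country serves as the reference.

```python
def country_dummies(country, countries, reference):
    """Return one binary variable per country except the reference category.

    A respondent from the reference country has 0 on every dummy.
    """
    return {
        f"country_{name}": int(country == name)
        for name in countries
        if name != reference
    }


# Illustrative subset of countries; the reference choice here is arbitrary.
sample_countries = ["Egypt", "Morocco", "Tunisia"]
tunisian = country_dummies("Tunisia", sample_countries, reference="Egypt")
egyptian = country_dummies("Egypt", sample_countries, reference="Egypt")
```

Changing the reference category shifts the individual country coefficients but leaves the estimated effects of the other variables unchanged, which is why the choice matters little here.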

5.5.5. Validity

An issue with validity that I have not mentioned above but that arises with almost all the variables is something which can be called the time lag. In ABIII most respondents were interviewed in 2013, in other words two to almost three years after the start of the Arab Spring. Some respondents were also interviewed in late 2012 and some during the first half of 2014. This means that I mostly try to find out whether the respondent’s employment status at the time the survey was conducted has an impact on whether he or she took part in protests one to three years earlier. Of course, time can change not only employment status but also education level, place of residence, interest in politics, voting behaviour and satisfaction with life. Although the passing of time most surely has an impact on the age of a respondent, I am not worried about that because the age of the respondents varies from 18 to close to 100, so one to three years is a minor change; in general, young respondents remain young and the old remain old. I argue that changes in education level over one to three years are also scant. People move rather rarely and often not far from where they first lived, so I think that the vast majority of people living in urban areas now also lived in urban areas one to three years ago. What complicates matters is that the impact of the time lag is most difficult to estimate for employment status, interest in politics, voting behaviour and satisfaction with life, the variables in which we are most interested. Still, I assume that, while some surely have, most respondents have not completely changed their position with regard to these questions. However, this is hard to substantiate.

In WVS6 respondents are asked whether they protested during the last year. This strengthens my research design because changes during one year are presumably smaller than changes during three years. In Tunisia and Egypt the respondents of ABII were asked whether they took part in the Arab Spring protests, which at the time of the surveys had started six to ten months earlier. This is why these two cases in ABII are important for the thesis.

In general, the gap between the time the survey was conducted and the time respondents took part in protests is my biggest concern regarding the validity of the survey data. Briefly, we would like to know the situation and mindset of the respondent at the time of the protests, but strictly speaking none of our questions measures that. However, as we also have WVS6, where people are asked about protesting during the last year, and two countries in ABII, where people are asked about protesting less than a year earlier, I think that if all these data sets generate similar results, the results can be seen as mutually validating and can be considered reliable. When studying hypotheses 6 and 7, where the explained variable is satisfaction with life, interest in politics or political activity, the time lag does not cause problems because these are asked at the same moment as the other questions, for example the one about employment status.