
Word Frequency Analysis of Stormfront

Chapter 4: Results

4.3 Word Frequency Analysis of Stormfront

In the American white supremacist forum “Stormfront”, the word “white” occurs 25769 times, which is more often than even the word “quote” (23285 times).

The plural “whites” is used 6980 times. It is also notable that “black” occurs 8564 times and “jewish” 7137 times. Among the most frequent words we also find “race” (7016), “obama” (5652), “american” (4600), “hitler” (3776), and “negro” (3400). These words indicate that race and nationalism are discussed extensively in the Stormfront forum. Even the word “racist” is used 3267 times. Here is an example from line 47164:

“('Stormfront', '908090', 'Facebook banned me, I posted a racist comment on a news site', 'I posted a comment on a news site, like fox news or something, and facebook banned me for it??', 'Hellrazor777', '08-20-2012, 01:19 AM')”

The author admits to having written a racist comment in the comment field of a news article connected to Facebook, which suggests that he or she does not regret it. Judging from this example, the author does not seem to regard the word “racist” as pejorative.

Some everyday words are also frequent in Stormfront, although they may be used for propaganda purposes: “family” (3268), “children” (3263), “money” (3426). “youtube”, the name of a video website, is mentioned 7509 times.

[Figure: bar chart of GTF (axis from 0 to 1200) for the words allah, hajj, day, files, prophet, akbar, allahu, peace, ten, time, times, dvd, file, pilgrim, days, hacked, indian, rapidshare, sms, blessings, video, phone, mobile, people, evolution, plants, stones, journey, http, husband]

Some of the words in Stormfront are unintelligible if English is the only European language one understands. Among the 100 most frequent words, one can find words from French (“que” = “what”), Italian (“sono” = “I am”), Spanish (“por” = “for”), and Dutch (“het” = “the”, neuter form). Most of these are stop words. Interestingly, when words are sorted by (normalized) GTF-IDF, the non-English stop words are ranked higher than with (N)GTF, because they are seldom used and therefore receive a higher IDF. Otherwise there is no noticeable difference in which words are at the top when sorting by (N)GTF or (N)GTF-IDF, apart from small differences in the ordering. The appearance of non-English words in the list of most frequent words suggests that Stormfront is today an international forum with mostly English forum threads, but also a considerable number of threads in other European languages.
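The change in ranking can be illustrated with a minimal Java sketch. It only re-sorts five rows whose NGTF and NGTF-IDF values are copied from Tables 3 and 4; the class and method names (`RankingSketch`, `byNgtf`, `byNgtfIdf`) are hypothetical and not part of the thesis program.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class RankingSketch {
    // One row of the word table (hypothetical helper type).
    static final class Row {
        final String word;
        final double ngtf;
        final double ngtfIdf;

        Row(String word, double ngtf, double ngtfIdf) {
            this.word = word;
            this.ngtf = ngtf;
            this.ngtfIdf = ngtfIdf;
        }
    }

    // Values copied from Tables 3 and 4 for five of the top words.
    static final List<Row> ROWS = Arrays.asList(
            new Row("white", 1.000, 1.099),
            new Row("quote", 0.904, 0.993),
            new Row("people", 0.790, 1.272),
            new Row("que", 0.538, 1.753),
            new Row("che", 0.487, 1.569));

    // Returns the words sorted in descending order of the given key.
    static List<String> rankBy(Comparator<Row> ascendingOrder) {
        List<Row> sorted = new ArrayList<>(ROWS);
        sorted.sort(ascendingOrder.reversed());
        List<String> words = new ArrayList<>();
        for (Row row : sorted) {
            words.add(row.word);
        }
        return words;
    }

    static List<String> byNgtf() {
        return rankBy(Comparator.comparingDouble((Row r) -> r.ngtf));
    }

    static List<String> byNgtfIdf() {
        return rankBy(Comparator.comparingDouble((Row r) -> r.ngtfIdf));
    }

    public static void main(String[] args) {
        System.out.println(byNgtf());    // [white, quote, people, que, che]
        System.out.println(byNgtfIdf()); // [que, che, people, white, quote]
    }
}
```

With these five rows, the two non-English stop words “que” and “che” move from the bottom to the top when the sort key changes from NGTF to NGTF-IDF.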

Figure 9: 30 most frequent words in Stormfront

The list of the 30 words with the highest GTF includes the word “obama”. Barack Obama is the first Black president of the USA, so it is plausible that white supremacists dislike him. The words “world” (7538) and “war” (6152) are both mentioned often. They are probably often mentioned together in the expression “world war”, as in “World War 2” or “Second World War”. The word “anti” is used 5560 times and must be the prefix of various anti-ideologies such as anti-Semitism or anti-racism, or of other expressions of opposition such as “anti-White climate” or “anti-German”.

“anti” appears in the results as a separate word because hyphens are removed by the analysis program before the words are counted. Hyphens are considered punctuation characters by the regular expression “\\p{Punct}” [27], which we use to recognize the punctuation to be removed in the pre-processing step of our word counting Java program. Punctuation characters must be removed from the text so that they are not included as parts of words, for example at the end of sentences. An example is “(…) race.”, from which we would otherwise get a word with the full stop included, not just the word “race”. The program would then consider “race.” a different word from “race”, which would influence the count of “race”. Because hyphens are also removed, “anti-White” becomes “anti” and “White”. This is a side effect we did not foresee, but it is not a big problem for our results.
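The pre-processing step described above can be sketched as follows. This is a minimal illustration, not the thesis program: it assumes that every character matched by `\p{Punct}` is replaced by a space (which is consistent with “anti-White” becoming two words) and that words are lower-cased before counting, as the lower-case entries in the tables suggest. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class PunctuationSketch {
    // Counts words after punctuation removal.
    // Assumption: punctuation is replaced by a space, so "anti-White"
    // is split into "anti" and "white" and "race." is counted as "race".
    static Map<String, Integer> countWords(String text) {
        String cleaned = text.replaceAll("\\p{Punct}", " ").toLowerCase(Locale.ROOT);
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : cleaned.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // happens only for input without any word characters
            }
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {the=1, anti=1, white=1, climate=1, is=1, about=1, race=2, they=1, say=1}
        System.out.println(countWords("The anti-White climate is about race. Race, they say."));
    }
}
```

Without the replacement step, “race.” and “Race,” would be counted as two further distinct words and the count of “race” would be too low.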

As one can see in Tables 1 and 2, which are extracted from the table of words in Stormfront, one is more likely to find a variant of the word “nazi” than of “zion”, “zionist” or “zionism”; it is about 1.85 times more likely (4880 versus 2631 occurrences). Because of the different languages there are many variants of the words “nazi” and “zionist”. “zionist” starts with S in some languages (like Norwegian). Zion is a hill in Jerusalem, and Zionists are people who want an independent Jewish state in the territory of Israel, which today is a reality. Nazi was a nickname for the National Socialist party that ruled alone in Germany from 1933 to 1945, and for all the people who supported it or still support it. We will mostly not consider several variants of the same word in this thesis. This is just a demonstration of how many different ways a word can be written.

[Figure 9, bar chart: GTF (axis from 0 to 30000) for the words que, che, people, white, les, quote, del, jews, youtube, una, don, los, posted, jewish, black, originally, world, time, des, war, whites, race, con, obama, israel, government, country, anti, della]

Table 1: Variants of zion/zionist/zionism (2631 occurrences in total)

Word            GTF
zionists        459
zionistische    38
zionisten       16
zionist         1164
zionisme        11
zionism         253
zion            155
sionisti        49
sionistes       37
sioniste        51
sionistas       53
sionista        162
sionismo        111
sionisme        16
sion            45
sinonimo        11

Table 2: Variants of nazi (4880 occurrences in total)

Word                                GTF
naziwhowantstokillsixmillionjews    18
nazivaju                            16
naziv                               10
nazisti                             53
nazista                             89
nazismo                             42
nazisme                             29
nazism                              80
nazis                               1122
nazioni                             132
nazione                             191
nazionalsocialisti                  15
nazionalsocialista                  37
nazionalsocialismo                  106
nazionalità                         30
nazionalisti                        29
nazionalista                        30
nazionalismo                        34
nazionali                           23
nazionale                           171
nazional                            29
nazifascismo                        10
nazies                              38
nazie                               48
nazi                                2498
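The totals in the two table captions can be rechecked by summing the rows. The sketch below does this for Table 1 and then divides the Table 2 total by the result. The GTF values are copied from Table 1, the value 4880 is taken from the caption of Table 2, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class VariantTotalsSketch {
    // Total stated in the caption of Table 2.
    static final int NAZI_TOTAL = 4880;

    // GTF values copied row by row from Table 1.
    static Map<String, Integer> zionVariants() {
        Map<String, Integer> gtf = new LinkedHashMap<>();
        gtf.put("zionists", 459);
        gtf.put("zionistische", 38);
        gtf.put("zionisten", 16);
        gtf.put("zionist", 1164);
        gtf.put("zionisme", 11);
        gtf.put("zionism", 253);
        gtf.put("zion", 155);
        gtf.put("sionisti", 49);
        gtf.put("sionistes", 37);
        gtf.put("sioniste", 51);
        gtf.put("sionistas", 53);
        gtf.put("sionista", 162);
        gtf.put("sionismo", 111);
        gtf.put("sionisme", 16);
        gtf.put("sion", 45);
        gtf.put("sinonimo", 11);
        return gtf;
    }

    // Sums the GTF values of all variants in the map.
    static int total(Map<String, Integer> gtf) {
        int sum = 0;
        for (int count : gtf.values()) {
            sum += count;
        }
        return sum;
    }

    // Ratio of nazi variants to zion variants.
    static double ratio() {
        return (double) NAZI_TOTAL / total(zionVariants());
    }

    public static void main(String[] args) {
        System.out.println(total(zionVariants()));      // 2631
        System.out.printf("%.2f%n", ratio());
    }
}
```

The rows of Table 1 sum to 2631, and 4880 / 2631 is approximately 1.85.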

Table 3: Top 30 words from Stormfront, sorted in descending order by GTF or NGTF

Word          GTF      NGTF     NGTF-IDF
white         25769    1.000    1.099
quote         23285    0.904    0.993
people        20364    0.790    1.272
posted        17659    0.685    0.753
originally    17065    0.662    0.728
que           13863    0.538    1.753
che           12559    0.487    1.569
don           9815     0.381    0.792
jews          9044     0.351    0.872
black         8564     0.332    0.730
time          8111     0.315    0.725
world         7538     0.293    0.727
youtube       7509     0.291    0.842
les           7482     0.290    1.032
jewish        7137     0.277    0.750
race          7016     0.272    0.677
whites        6980     0.271    0.695
del           6798     0.264    0.897
una           6426     0.249    0.831
war           6152     0.239    0.715
obama         5652     0.219    0.594
anti          5560     0.216    0.569
country       5325     0.207    0.573
los           5086     0.197    0.760
news          5033     0.195    0.515
government    5020     0.195    0.574
con           5000     0.194    0.666
israel        4949     0.192    0.575
police        4949     0.192    0.555

Table 4: Top 30 words from Stormfront, sorted in descending order by NGTF-IDF

Word          GTF      IDF      NGTF     NGTF-IDF
que           13863    3.258    0.538    1.753
che           12559    3.219    0.487    1.569
people        20364    1.609    0.790    1.272
white         25769    1.099    1.000    1.099
les           7482     3.555    0.290    1.032
quote         23285    1.099    0.904    0.993
del           6798     3.401    0.264    0.897
jews          9044     2.485    0.351    0.872
youtube       7509     2.890    0.291    0.842
una           6426     3.332    0.249    0.831
don           9815     2.079    0.381    0.792
los           5086     3.850    0.197    0.760
posted        17659    1.099    0.685    0.753
jewish        7137     2.708    0.277    0.750
black         8564     2.197    0.332    0.730
originally    17065    1.099    0.662    0.728
world         7538     2.485    0.293    0.727
time          8111     2.303    0.315    0.725
des           4914     3.784    0.191    0.722
war           6152     2.996    0.239    0.715
whites        6980     2.565    0.271    0.695
race          7016     2.485    0.272    0.677
con           5000     3.434    0.194    0.666
obama         5652     2.708    0.219    0.594
israel        4949     2.996    0.192    0.575
government    5020     2.944    0.194    0.574
country       5325     2.773    0.207    0.573
anti          5560     2.639    0.216    0.569
della         3583     4.094    0.139    0.569
police        4949     2.890    0.192    0.555
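The columns of Tables 3 and 4 appear to be related as shown below. This is inferred from the tabulated values and is not quoted from a definition in this section: NGTF looks like GTF divided by the largest GTF in the forum (that of “white”), and NGTF-IDF looks like the product of NGTF and IDF.

```latex
\[
\mathrm{NGTF}(w) = \frac{\mathrm{GTF}(w)}{\max_{v} \mathrm{GTF}(v)},
\qquad
\mathrm{NGTF\text{-}IDF}(w) = \mathrm{NGTF}(w) \cdot \mathrm{IDF}(w)
\]
\[
\mathrm{NGTF}(\text{quote}) = \frac{23285}{25769} \approx 0.904,
\qquad
\mathrm{NGTF\text{-}IDF}(\text{que}) \approx 0.538 \times 3.258 \approx 1.753
\]
```

Both worked values agree with the corresponding table entries to three decimals.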