Chapter 4: Results
4.3 Word Frequency Analysis of Stormfront
In the American white supremacist forum “Stormfront”, the word “white” occurs 25769 times, more often than even the word “quote” (23285 times). The plural “whites” occurs 6980 times. It is also notable that “black” occurs 8564 times and “jewish” 7137 times. Among the most frequent words we also find “race” (7016), “obama” (5652), “american” (4600), “hitler” (3776), and “negro” (3400). These words indicate that race and nationalism are prominent topics in the Stormfront forum. Even the word “racist” is used 3267 times. Here is an example from line 47164:
“('Stormfront', '908090', 'Facebook banned me, I posted a racist comment on a news site', 'I posted a comment on a news site, like fox news or something, and facebook banned me for it??', 'Hellrazor777', '08-20-2012, 01:19 AM')”
The author openly states that he or she posted a racist comment in the comment field of a news article connected to Facebook, which suggests that he or she does not regret it. In this example, the author does not appear to regard the word “racist” as pejorative.
Some everyday words are also frequent in Stormfront, although they may be used for propaganda purposes: “family” (3268), “children” (3263), and “money” (3426). The name of the video website “youtube” is mentioned 7509 times.
Some of the words in Stormfront are unintelligible if English is the only European language one understands. Among the 100 most frequent words, one finds words from French (“que” = “what”), Italian (“sono” = “I am”), Spanish (“por” = “for”), and Dutch (“het” = “the”, neuter form). Most of these are stop words. Interestingly, when words are sorted by (normalized) GTF-IDF, the non-English stop words rank higher than with (N)GTF, because they are seldom used across the corpus and therefore receive a high IDF. Otherwise, the same words appear at the top whether one sorts by (N)GTF or (N)GTF-IDF, apart from small differences in ordering. The appearance of some non-English words in the list of most frequent words suggests that Stormfront is today an international forum with mostly
English forum threads, but also a considerable number of threads in other European languages.
Figure 9: 30 most frequent words in Stormfront
The list of the 30 words with the highest GTF includes the word “obama”. Barack Obama is the first Black president of the USA, so it is reasonable to assume that white supremacists dislike him. The words “world” (7538) and “war” (6152) are both mentioned often.
It is very probable that they are often mentioned together in the expression “world war”, as in “World War 2” or “Second World War”. The word “anti” is used 5560 times and must be the prefix of various terms expressing opposition, such as “anti-Semitism”, “anti-racist”, “anti-White climate”, or “anti-German”. “anti” appears in the results because the analysis program removes hyphens before the words are counted. Hyphens are treated as punctuation characters by the regular expression “\\p{Punct}” [27], which the pre-processing step of our Java word-counting program uses to identify the punctuation to remove. Punctuation must be removed so that it is not counted as part of a word, for example at the end of a sentence. In “(…) race.”, the program would otherwise extract “race.” with the full stop included and treat it as a different word from “race”, which would distort the count of “race”. Because hyphens are also removed, “anti-White” becomes “anti” and “White”. We did not foresee this side effect, but it is not a big problem for our results.
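The pre-processing step described above can be illustrated with a minimal sketch. It is not the thesis program itself: the class and method names are hypothetical, and it assumes that each punctuation character is replaced by a space, that tokens are lower-cased, and that the text is split on whitespace.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class PunctuationStripDemo {

    // Assumption: every character matched by \p{Punct} is replaced by a space,
    // so that "anti-White" yields two tokens instead of one fused token.
    static String stripPunctuation(String text) {
        return text.replaceAll("\\p{Punct}", " ");
    }

    // Assumption: tokens are lower-cased and separated by whitespace.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        String cleaned = stripPunctuation(text).toLowerCase(Locale.ROOT).trim();
        if (cleaned.isEmpty()) {
            return counts;
        }
        for (String token : cleaned.split("\\s+")) {
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String sample = "An anti-White climate. They talk about race, and race.";
        // "race," and "race." are both counted as "race";
        // "anti-White" contributes one count each to "anti" and "white".
        System.out.println(countWords(sample));
    }
}
```

In this sketch the hyphen is handled like any other punctuation character, which reproduces the side effect noted above: the prefix “anti” is counted as a word of its own.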
As the tables below show, which are extracted from the table of words in Stormfront, one is more likely to find a variant of the word “nazi” than of “zion”, “zionist”, or “zionism”; it is about 1.85 times more likely (4880 versus 2631 occurrences). Because of the different languages, there are many variants of the words “nazi” and “zionist”. “zionist” starts with an S in some languages (such as Norwegian). Zion is a mountain close to Jerusalem in Israel, and Zionists are people who want an independent Jewish state in the territory of Israel, which today is a reality. “Nazi” was a nickname for the National Socialist party that ruled Germany alone from 1933 to 1945, and for all the people who supported it or still support it. For the most part, we do not consider several variants of the same word in this thesis. This is just a demonstration of how many different ways a word can be written.
[Bar chart for Figure 9; axis: GTF, 0 to 30000; words shown: que, che, people, white, les, quote, del, jews, youtube, una, don, los, posted, jewish, black, originally, world, time, des, war, whites, race, con, obama, israel, government, country, anti, della]
Table 1: Variants of zion/zionist/zionism (2631 occurrences in total)
Word            GTF
zionists        459
zionistische    38
zionisten       16
zionist         1164
zionisme        11
zionism         253
zion            155
sionisti        49
sionistes       37
sioniste        51
sionistas       53
sionista        162
sionismo        111
sionisme        16
sion            45
sinonimo        11
Table 2: Variants of nazi (4880 occurrences in total)
Word                                GTF
naziwhowantstokillsixmillionjews    18
nazivaju                            16
naziv                               10
nazisti                             53
nazista                             89
nazismo                             42
nazisme                             29
nazism                              80
nazis                               1122
nazioni                             132
nazione                             191
nazionalsocialisti                  15
nazionalsocialista                  37
nazionalsocialismo                  106
nazionalità                         30
nazionalisti                        29
nazionalista                        30
nazionalismo                        34
nazionali                           23
nazionale                           171
nazional                            29
nazifascismo                        10
nazies                              38
nazie                               48
nazi                                2498
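The text does not state how the variants in Tables 1 and 2 were selected from the full word table. One plausible way, shown here only as a sketch, is to sum the GTF of every word that starts with a given prefix. The class name, the method name, and the prefix-matching rule are assumptions, and the map holds only a small excerpt of the two tables, so the sums are not the table totals.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class VariantTotals {

    // Sums the GTF of all words that start with the given prefix.
    // Assumption: variants are identified by a simple prefix match.
    static int sumByPrefix(Map<String, Integer> gtf, String prefix) {
        int total = 0;
        for (Map.Entry<String, Integer> entry : gtf.entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                total += entry.getValue();
            }
        }
        return total;
    }

    // A small excerpt of Tables 1 and 2, not the complete word table.
    static Map<String, Integer> excerpt() {
        Map<String, Integer> gtf = new LinkedHashMap<>();
        gtf.put("zionist", 1164);
        gtf.put("zionists", 459);
        gtf.put("zionism", 253);
        gtf.put("nazista", 89);
        gtf.put("nazisti", 53);
        gtf.put("nazism", 80);
        return gtf;
    }

    public static void main(String[] args) {
        Map<String, Integer> gtf = excerpt();
        int zion = sumByPrefix(gtf, "zion");
        int nazi = sumByPrefix(gtf, "nazi");
        System.out.println("zion* = " + zion + ", nazi* = " + nazi);
        System.out.println("ratio nazi*/zion* = " + (double) nazi / zion);
    }
}
```

Applied to the complete word table with the prefixes “zion”, “sion”, and “nazi”, the same function would give the kind of totals that appear in the table captions.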
Table 3: Top 30 words from Stormfront, sorted in descending order by GTF or NGTF
Word          GTF      NGTF     NGTF-IDF
white         25769    1.000    1.099
quote         23285    0.904    0.993
people        20364    0.790    1.272
posted        17659    0.685    0.753
originally    17065    0.662    0.728
que           13863    0.538    1.753
che           12559    0.487    1.569
don           9815     0.381    0.792
jews          9044     0.351    0.872
black         8564     0.332    0.730
time          8111     0.315    0.725
world         7538     0.293    0.727
youtube       7509     0.291    0.842
les           7482     0.290    1.032
jewish        7137     0.277    0.750
race          7016     0.272    0.677
whites        6980     0.271    0.695
del           6798     0.264    0.897
una           6426     0.249    0.831
war           6152     0.239    0.715
obama         5652     0.219    0.594
anti          5560     0.216    0.569
country       5325     0.207    0.573
los           5086     0.197    0.760
news          5033     0.195    0.515
government    5020     0.195    0.574
con           5000     0.194    0.666
israel        4949     0.192    0.575
police        4949     0.192    0.555
Table 4: Top 30 words from Stormfront, sorted in descending order by NGTF-IDF
Word          GTF      IDF      NGTF     NGTF-IDF
que           13863    3.258    0.538    1.753
che           12559    3.219    0.487    1.569
people        20364    1.609    0.790    1.272
white         25769    1.099    1.000    1.099
les           7482     3.555    0.290    1.032
quote         23285    1.099    0.904    0.993
del           6798     3.401    0.264    0.897
jews          9044     2.485    0.351    0.872
youtube       7509     2.890    0.291    0.842
una           6426     3.332    0.249    0.831
don           9815     2.079    0.381    0.792
los           5086     3.850    0.197    0.760
posted        17659    1.099    0.685    0.753
jewish        7137     2.708    0.277    0.750
black         8564     2.197    0.332    0.730
originally    17065    1.099    0.662    0.728
world         7538     2.485    0.293    0.727
time          8111     2.303    0.315    0.725
des           4914     3.784    0.191    0.722
war           6152     2.996    0.239    0.715
whites        6980     2.565    0.271    0.695
race          7016     2.485    0.272    0.677
con           5000     3.434    0.194    0.666
obama         5652     2.708    0.219    0.594
israel        4949     2.996    0.192    0.575
government    5020     2.944    0.194    0.574
country       5325     2.773    0.207    0.573
anti          5560     2.639    0.216    0.569
della         3583     4.094    0.139    0.569
police        4949     2.890    0.192    0.555
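The NGTF and NGTF-IDF columns are consistent with NGTF being the GTF divided by the largest GTF in the forum (that of “white”, 25769) and NGTF-IDF being NGTF multiplied by IDF. This relationship is inferred from the numbers in the tables and is not a definition stated in this section. The following sketch, with hypothetical class and method names, recomputes three rows under that assumption, taking the IDF values as given from Table 4.

```java
import java.util.Locale;

public class NgtfIdfDemo {

    // Assumption: NGTF normalizes GTF by the largest GTF in the forum.
    static double ngtf(long gtf, long maxGtf) {
        return (double) gtf / maxGtf;
    }

    // Assumption: NGTF-IDF is the product of NGTF and IDF.
    static double ngtfIdf(long gtf, long maxGtf, double idf) {
        return ngtf(gtf, maxGtf) * idf;
    }

    public static void main(String[] args) {
        long maxGtf = 25769; // GTF of "white", the most frequent word
        String[] words = {"white", "que", "people"};
        long[] gtf = {25769, 13863, 20364};
        double[] idf = {1.099, 3.258, 1.609}; // IDF values copied from Table 4

        for (int i = 0; i < words.length; i++) {
            System.out.println(String.format(Locale.ROOT, "%-8s NGTF=%.3f NGTF-IDF=%.3f",
                    words[i], ngtf(gtf[i], maxGtf), ngtfIdf(gtf[i], maxGtf, idf[i])));
        }
    }
}
```

Under this reading, a word such as “que” overtakes “white” in Table 4 because its IDF is about three times higher, even though its GTF is only about half as large.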