A Review of Chronic Diseases Map and Data Trends Analysis
Mustafa Yousef Mustafa ZGHOUL
Abstract — In this paper, an attempt is made to review and analysis the status of the chronic disease the gulf region. In addition, illustrate the methods of visualization of the chronic diseases information, which are the major disease burden in the gulf region. This paper, is reviewed and discussed the methods of analysis the chronic diseases data with reference to univariate time series model. The Forecasting model based simple linear regression is proposed and implemented. The Visualization of digital data is considering as one of the most important techniques for presenting the distribution of statistical data within a small space. An interactive disease map is proposed and implemented to determine the chronic diseases numbers and locations.
Keywords: Chronic diseases, visualization, statistical data analysis, linear regression, time series, forecasting, mapping
I. INTRODUCTION
Chronic diseases (CD) pose a constant danger and a large force because of their increased risk to human life and the economy of nations. Statistics issued by the World Health Organization (WHO) indicate that the number of people living in chronic disease increasing in all parts of the world as illustrated in Figure 1.
Fig. 1: The total NCD deaths by region 2012
(AFR: Africa; AMR: America; SEAR: South East Asia; EUR: European;
EMR: Eastern Mediterranean; WPR: Western Pacific; Source WHO)
Masters Student in Computer Science, Faculty of Computing and IT, Sohar University, Sohar, Sultanate of Oman, [email protected]
In addition, WHO reported that the chronic diseases are the major disease burden in the gulf region, "the chronic diseases cause more than 60% of all deaths in the Gulf Cooperation Council (GCC) countries [1] as shown in Figure2.
Fig. 2: The Diabetes Growth Rate and Affected Population (in ‘000) during 2000 to 2030, Source WHO.
Moreover, depending on the ministry of health statistics in the Sultanate of Oman during the period (1990 - 2005), approximately 75% of diseases burden is attributable to chronic diseases" [2]. The total number of death because of chronic diseases are a double number of death caused by infectious diseases, like pulmonary tuberculosis, viral hepatitis (A), malaria, and AIDS (HIV). In addition, the chronic diseases affect women and men at younger ages. There are a number of risk factors causes the chronic diseases including physical inactivity, unhealthy diet and use of tobacco. The chronic disease is a factual dangerous and it is growing rapidly over the time, Therefore, many governments spent billions of dollars for controlling and preventing the expansion of CD [3]. WHO reports indicate that the occurrence of CD will be raised dramatically in 2023 as shown in Figure 3.
For this reason, one of the solutions to reduce the rate of chronic diseases is increasing the health awareness in the community. Therefor, design and implement a web application for the chronic diseases surveillance is much needed, which will help in raising the knowledge of culture-related chronic diseases and methods of reducing its effects.
Fig. 3: Expected rate of CD in 2023 (Source WHO)
The chronic disease (or non-communicable disease) is a disease that remains or continues for a long period of time, starts from about three months or more. Generally, these diseases cannot be prevented using a treatment or take a medicine. Ministry of health in Sultanate of Oman is working on educated members of the community against the chronic diseases like diabetes, blood pressure, etc., in order to increase the health awareness. The infectious disease (or communicable disease) is a disease that generated by microorganisms, like bacteria, parasites, fungi and viruses. In addition, can be infected other people quickly through sneezing and coughing, or through physical contact. Examples of infectious diseases are malaria, pulmonary tuberculosis, viral hepatitis (A) and AIDS (HIV) [4]. Therefore, the government’s wok hard to reduce the expansion of the chronic disease and infectious diseases by offering intensive programs and studies of injuries prevention methods [5]. The main objective of this paper is to promote awareness of the expansion of chronic diseases in GCC. Therefore, design and implement a web application for analyzing, forecasting and visualizing chronic diseases is much needed. This system will collect the data of chronic disease and illustrate the analysis statistics rates in all GCC. In addition, it will provide a very clear visualization figures based on interactive disease map.
II. VISUALIZATION METHODS AND TECHNIQUES
Many techniques are used to visualize and analysis the data. The interactive disease map is one of a powerful method for data visualization. It is used to illustrate information clearly and efficiently via plots, statistical and information graphics.
The visualization of statistical data aims to present a large amount of information in the short time. Numerical data can be graphed using lines, bars or dots to visual communicate a quantitative message. Recent studies proved that the interactive disease map helps the users to analyze and understand the meaning of complex data easily. Generally, tables are used where the users look up to a specific measurement, while charts and maps are used to present patterns or relationships of multivariate data. The diseases map is a method for
diseases in real time and understanding the reasons of these diseases [6]. A time series is a group of observations or statistics which being recorded or collected at regular intervals and sorted by the time [7]. Therefore, the proposed system will provide an accurate analysis for predicting unseen data over time. Moreover, the trend analysis is a type of technical analysis that tries to predict the future values of data (information in sequence over time) based on past data [8],[9].
The analyze of the statistical data based on trends analysis method is called linear regression analysis and which will help the decision makers in the ministry of health to determine the expansion and needs of chronic diseases [10],[11].
Cartographic visualization is used symbolism techniques, which is referred to locations inside the map and represent their multiple data values [12]. In addition, it allows the users to select statistical and geographical subsets. Local statistics of univariate distributions can be calculated and visualized in a dynamic figure for exploration like scatterplots, dot-plots, population cartograms, choropleths, parallel coordinates plots, and polygon maps as shown in Figure 4.
Fig. 4: Cartographic visualization source [12]
According to Statistics Netherlands (2012), "data visualization is the art of presenting data in a visual manner.
So the data becomes apparent. Data visualization is a helpful tool for all phases of the statistical process. It has two goals in statistics: data exploration and communication" [13].
Diagrams can detect the pattern in large amounts of data. This method views the major findings in the data to the users. There are many types of diagrams, like radar plot, bar chart, and line chart. It used to represent the time series or to make a comparison between two variables. On the other hand, proportional symbol map is a good way of viewing statistical data. It aims to view the characteristics of a subject on the map.
"In a proportional symbol map, a symbol is plotted on the center of a region, and the surface area of the symbol is scaled with the value of the variable" [13]. For example, consider the proportional symbol map as shown in Figure 5.
Fig. 5: Proportional symbol map source [13]
According to Heinrich Hartmann (2016), "statistical techniques are the art of extracting information from data.
Moreover, one of essential data analysis methods is visualization. The human brain can process geometric information much more rapidly than numbers." [14]. There are many methods for visualization, like rug plots, histograms (for one-dimensional data), scatter plots (for two-dimensional data), line plots. [14]. as an example, consider the proportional symbol map as shown in Figure 6.
Fig. 6: Visualization methods source [14]
The histogram is a most popular visualization method. In addition, it is commonly used to show the distribution of numerical data. [14] According to Ben Fry (2008), "within a small space, the visual can communicate (or announce) more information than the table." [15]. Mapping is one of the modern techniques, which is used to visualize the information.
The process of mapping starts with collect the data values and store it inside data set (database). Then design (or draw) the map to represent the data values on it. After that determine some locations (or points) on the map, usually these locations should be the center of each governorate. Afterward use functions to load the set of data values from the database,
which will show on the map itself. As an example, consider the visualization of varying data by symbol size on the map as shown in Figure 7. [15]
Fig. 7: Visualization of varying data by symbol size on map source [15]
According to Adriana REVEIU, Marian DARDALA (2011), cartographic visualization provides the facilities to represent statistical data. It is one of the most important tools in geographical information systems (GIS). In addition, it aims to show the distribution of statistical data inside the regional map.
Moreover, it uses different symbols to show more information at the same time inside the map. These symbols view quantitative details for the users. [16]
III. ANALYSIS OF CHRONIC DISEASES The analysis of chronic diseases often depends on longitudinal cohort (or group) studies, and there are many methodological issues related to chronic disease studies, the first one is based on changing definitions of risk factors and outcomes over time. Moreover, the second issue is missing data.
To perform analysis in the existence of missing data, there are many procedures to do that: the first one, make analysis to individuals which data is complete. The second one, insert the existing values to individuals which data is incomplete and then analyzing the dataset. Moreover, the second analytical technique is most appropriate. There are many analytic techniques for chronic diseases modeling: [17]
1. Logistic regression analysis: this can examine the effects of risk factors on the development of the disease.
2. Survival analysis (time to event data): this deal with time until the event of interest occurs.
3. Neural networks.
4. Tree-based classification methods.
5. Longitudinal data analysis (mixed models, generalized linear models and generalized estimating equations).
The formula for representing the regression analysis model is:
Yi = a + b * Xi
Where the regression parameters, a: is the intercept (on the y axis) . And b: is the slope of the regression line
The Least Squares method used to estimate the slope and intercept regression parameters, as the following:
The formula for calculating slope regression parameter (b) is:
The formula for calculating intercepts regression parameter (a) is:
IV. LITERATURESURVEY
This section presents the literature survey of the chronic diseases data analysis. The researchers were implemented different computing techniques for forecasting and visualizing the unseen data. It will cover the expansion rate of chronic diseases in the global. In addition, the forecasting models and visualization techniques are included. (WHO) global report [3]
statistics shows that the number of deaths in 2005 is about (58) million, around (35) million approximately of deaths resulted by chronic diseases. It represents about 60% of global deaths, which caused by chronic diseases, "only 20% of chronic disease deaths occur in high-income countries, while 80%
occur in low and middle-income countries, where most of the world’s population lives". The chronic diseases will rise about 70% of the total deaths in the world at 2030. Khatib O. [18]
stated, "chronic diseases are the major disease burden in the Eastern Mediterranean Region. There are many risk factors associated with chronic diseases, most of them are related to the lifestyle and can be controlled. Such as low vegetable and fruit intake, physical inactivity, high fast food consumption and high cholesterol are dominant causes of cardiovascular disease and some types of cancer. Also obesity and overweight can raise the risk of chronic diseases, like heart disease and diabetes". There are many approaches can help to deal with this problem such as developing national strategies, policies, and
plans for prevention and health care. Also, implement and enhance the community participation in prevention and health care. Most of the solutions for preventing chronic diseases are expensive [18]. Gregory Hartl & Menno van Hilten (2012) [1], explains that the non-communicable diseases (NCDs) are caused more than 60% of all deaths in the Gulf Cooperation Council (GCC) countries. The risk factors are an unhealthy diet, the use of tobacco and the physical inactivity. The Gulf Cooperation Council (GCC) countries (Saudi Arabia, Bahrain, Sultanate of Oman, Kuwait, Qatar, and United Arab Emirates) are adopting a regional strategy to address, prevent and control of non-communicable diseases (NCDs), like diabetes, cancer, and chronic respiratory disease. It aims to reduce exposure to people from different risk factors and improving the services of preventing and treating the health problems [1]. Hill A.G., et al. (200), mentioned that the awareness about the chronic diseases has been grown in Oman. Morbidity for diagnosis related to chronic diseases, especially cancer, cardiovascular disease, and the endocrine disease was growing and becoming a significant share of Oman's burden of disease [19]. Jawad A.
Al-Lawati et al. (2008), believes that the chronic diseases are posed the main challenge for Omani population [2]. Depending on the ministry of health statistics in Oman during the period (1990 to 2005), approximately 75% of diseases burden is attributable to chronic diseases. "The distribution of chronic diseases and related risk factors among the general population is similar to that of industrialized nations: 12% of the population has diabetes, 30% is overweight, 20% is obese, 41% has high cholesterol, and 21% has the metabolic syndrome". They conclude that the chronic diseases are the major exhaustion on human and financial resources for Sultanate of Oman, and this will affect the advances in the health care system that has been achieved. Similarly, some related works focus on using visualization techniques like interactive map. Jason Dykes (1998), implemented a cartographic visualization for locating symbols on a plane to show the statistical distributions of one or more variables" [12].
V. RESULTS AND DISCUSSIONS
Depends on statistical data which collected from ministry of health in sultanate of Oman, it shown that the prevalence of diabetes in Oman is increasing over years. as illustrated in Figure 8.
Figure 8: Diabetes in Oman (source MoH Oman)
After applying the equation of simple linear regression, the equation will be (Y = 4319.3 + 127.9 * X). and the value of R- squared variable equals to (0.46), this value will describe how data fitted to the regression line. As shown in Figure 9.
Figure 9: Forecasting results for diabetes in Oman after 5 years The results of this forecasting model shown that the prevalence of diabetes in sultanate of oman will increase. And this will help the decision makers to take attention about spread of chronic diseases.
This research paper will implement the simple linear regression method for statistical data analysis and forecasting because the collected data is one variable changed over time. In addition, this paper will select the cartographic visualization method to show the distribution of the data among different geographic locations. In addition, it uses symbols to represent the data; this symbol is scaled with the value of the variable.
CONCLUSION & FUTURE RESEARCH DIRECTION
This paper provided an overview of the chronic diseases in GCC. First, it presents the risk factors that causes the chronic diseases. Then it explains the different methods for statistical data analysis and forecasting models. Moreover, presents
different methods for data visualization techniques is presented and reviewed. The Cartographic technique is appropriate for visualization of Chronic Diseases data because it illustrates the distribution of the data among different geographic locations.
The literature survey proved that a simple linear regression is an appropriate method for data analysis and forecasting data with one variable only.
REFERENCES
[1] Gregory Hartl, Menno van Hilten (January 2012). “WHO recognizes progress of Gulf States for adopting regional strategy to address non- communicable diseases", [Online], available at:
http://www.who.int/mediacentre/news/statements/2012/ncds_20120106/en/
(accessed 15 September 2016)
[2] Al-Lawati JA, Mabry R, Mohammed AJ (July 2008), “Addressing the Threat of Chronic Diseases in Oman", CDC, VOLUME 5: NO. 3.
[3] World Health Organization (2005), WHO global report: "Preventing chronic diseases: A vital investment". Geneva: ISBN 92 4 156300 1.
[4] “What are infectious diseases?", [Online], available at:
http://www.yourgenome.org/facts/what-are-infectious-diseases (accessed 29 July 2016)
[5] “Directorate General for Disease Surveillance and Control in Ministry of Health - Sultanate of Oman”, [Online], available at:
https://www.moh.gov.om/en/web/directorate-general-of-disease-surveillance- control/ (accessed 02 June 2016)
[6] “Mapping Disease", [Online], available at: http://www.the- scientist.com/?articles.view/articleNo/35349/title/Mapping-Disease/ (accessed 23 July 2016)
[7] "Time Series", [Online], available at:
http://www.statslab.cam.ac.uk/~rrw1/timeseries/t.pdf (accessed 03 October 2016)
[8] "Trend Analysis", [Online], available at:
http://pubs.usgs.gov/twri/twri4a3/pdf/chapter12.pdf (accessed 15 June 2016) [9] “Time trends analysis", [Online], available at:
https://ec.europa.eu/jrc/sites/default/files/epaac-wp9-session2-crocetti.pdf (accessed 02 July 2016)
[10] Jabar H. Yousif, Classification of Mental Disorders Figures based on Soft Computing Methods. International Journal of Computer Applications 117(2):5-11, May 2015.
[11] Jabar H. Yousif and Mabruk A. Fekihal. Neural Approach for Determining Mental Health Problems. Journal of Computing, Vol.4, Issue 1, pp6-11 ISSN 2151-9617 ,NY, USA, January 2012.
[12] Jason Dykes (1998), "Cartographic visualization: exploratory spatial data analysis with local indicators of spatial association using Tcl/Tk and cdv", UK, The Statistician, 47, Part 3, pp. 485-497.
[13] Edwin de Jonge (2012), "Data Visualization", Netherlands, Statistics Netherlands, ISSN: 1876-0333.
[14] Heinrich Hartmann (2016), "Statistics for Engineers", USA, ACM Queue 10.1145/2890780.
[15] Ben Fry (2008), "Visualizing Data", USA, O’Reilly, ISBN-10: 0-596- 51455-7.
[16] Adriana REVEIU, Marian DARDALA (2011), "Techniques for Statistical Data Visualization in GIS", Romania, Informatica Economica, vol.
15, no. 3/2011
[17] Ralph D'Agostino, Lisa M. Sullivan (January 2002), "Chronic Disease Data And Analysis: Current State Of the Field", Boston (US). Digital Commons. Vol. 1: No 2, 228-239, Article 32.
[18] Khatib O. (2004), "Non-communicable diseases: risk factors and regional strategies for prevention and care", East Mediterr Health J 2004; 10(6):778- 88.
[19] Hill AG, Muyeed AZ, Al-Lawati JA. (2000), "The mortality and health transition in Oman: patterns and processes". Muscat (OM): World Health Organization, Regional Office for the Eastern Mediterranean, UNICEF Oman.
[20] “Definition of Chronic disease", [Online], available at:
http://www.medicinenet.com/script/main/art.asp?articlekey=33490 (accessed 07 June 2016)
[21] "Histogram", [Online], available at:
http://searchsoftwarequality.techtarget.com/definition/histogram (accessed 03 October 2016)
[22] Astrid Schneider, Gerhard Hommel, and Maria Blettner (2010), "Linear Regression Analysis", Germany; Medicine. 107(44): 776–82.
[23] Christoph Klose, Marion Pircher, Stephan Sharma (May 2004),
"Univariate Time Series Forecasting", UK.
[24] Choong-Yeun Liong and Sin-Fan Foo (2013), "Comparison of Linear Discriminant Analysis and Logistic Regression for Data Classification", Malaysia, LLC 978-0-7354-1150-0.
[25] Suzilah Ismaila, Rohaiza Zakariaa and Tuan Zalizam Tuan Mudab (2014), "Univariate Time Series Forecasting Algorithm Validation", Malaysia, LLC 978-0-7354-1274-3.
[26] Adela SASU (2013), "A Quantitative Comparison of Models for Univariate Time Series Forecasting", Romania, University of Brasov, 117- 124.
[27] Kelly H. Zou, Kemal Tuncali, Stuart G. Silverman (2003), "Correlation and Simple Linear Regression", USA, Radiology; 227:617–628.
[28] Mohammad S. Alam (2013), "Analytical Review of Data Visualization Methods in Application to Big Data", Russia, Journal of Electrical and Computer Engineering, Article ID 969458, 7 pages.
[29] Peter Filzmoser, Karel Hron, Clemens Reimann (2009), "Univariate statistical analysis of environmental (compositional) data: Problems and possibilities", Austria, ScienceDirect, STOTEN-11466.
S Author year Place Method of
study Major Findings Merits Limitations
1 Hill A.G., Muyeed A.Z., Al-Lawati J.A.
[19]
2000 Oman Numerical and
Analytical
Diabetes has become a major chronic (or non- communicable) disease problem in Oman. The prevalence of diabetes varies among different governorates in Oman; the percentage for diabetes in 2000 is 13%.
The awareness about the chronic disease has been grown in Oman.
The percentage for diabetes in Oman was the highest reported in the arab countries.
2 Khatib O.[18] 2004 Oman Analytical The incidence of chronic diseases is rising in the middle east region. In 2000, 47% of the region’s load of disease is due to Non- communicable diseases and it is expected that this will rise up to 60% by the year 2020.
There are many strategies: developing a national strategies, policies and plans for prevention and care, also implement and enhance community participation in prevention and care.
The risk factors associated with chronic diseases are: physical inactivity, unhealthy diet and smoking.
3 World Health Organization [8]
2005 Geneva Analytical and Numerical
The number of deaths in 2005 is about (58) million, around (35) million approximately of deaths resulted by chronic diseases. It represents about 60% of global deaths which caused by chronic diseases.
If the current trends continue, chronic diseases by 2030 will rise about 70% of the total deaths globally.
Many governments around the world spent billions of dollars for medication to control and prevent the spread of chronic diseases
Chronic diseases are a threat and it is growing over the time and it will affect all countries.
4 Jawad A. Al- Lawati, Ruth Mabry
2008 Oman Analytical The chronic diseases pose the main challenge for
Health planners and decision makers in
Chronic diseases are the major exhaustion on
and Ali Jaffer Mohammed [20]
Omani population. During the period (1990 to 2005), approximately 75% of diseases burden is attributable to chronic diseases.
Oman have greater commitment to the provision of services for people with chronic diseases.
human and financial resources for Oman, and this will affect the advances in the health care system that has been achieved.
5 Gregory Hartl, Menno van Hilten
[1]
2012 Geneva Analytical The chronic diseases cause more than 60% of all deaths in the Gulf Cooperation Council (GCC) countries.
All the GCC countries adopting regional strategy to address prevent and control of chronic diseases.
The regional strategy in all the Gulf Cooperation Council (GCC) countries aims to reduce exposure of people to risk factors and improving the services to prevent and treat these health problems.
There are many risk factors related to chronic diseases, like:
unhealthy diet, the use of tobacco and the physical inactivity.
6 Ralph D'Agostino, Lisa M. Sullivan
[17]
2002 Boston (US)
Logistic Regression Analysis, and Survival Analysis
The model for chronic disease consists of four stages or phases: disease free, pre-clinical (latent period), clinical manifestation, and follow- up. Good statistical approaches involve hypothesizing models for these stages, collecting appropriate data, and then fitting and testing the appropriate models. The risk factors are: age, gender, smoking status, blood pressure and cholesterol.
The logistic regression analysis can examine the effects of risk factors on the development of disease. And the survival analysis (time to event data) deal with time until the event of interest occurs.
There are many methodological issues related to chronic disease analysis, the first one is based on changing definitions of risk factors and outcomes over time.
And the second issue is missing data.
7 Kelly H. Zou, Kemal Tuncali, Stuart G. Silverman
[27]
2003 USA Simple Linear Regression
Simple linear regression measure the linear relationship between a predictor variable and an
-The values of dependent variables can be estimated from the observed..
- missing values is a common problem.
outcome variable.
- It summarise the real observations of model.
8 Christoph Klose, Marion Pircher, Stephan Sharma
[23]
2004 UK Autoregressive
moving average (ARMA) linear
model
Forecasting of time-series can be done with different linear models, including Autoregressive (AR), Moving average (MA), and
ARMA model. For
building ARMA model, there are three phases:
model identification, estimation (or fitting) and validation.
- The ARMA model process is stationary.
- Also it can minimize the number of parameters.
- ARMA model is suitable only for the time series that is stationary (for example its mean and variance should be constant over time).
- It is required that there are at least 40 observations in the input data.
- It is also supposed that the values of the estimated parameters are constant during the series.
9 Peter Filzmoser, Karel Hron, Clemens Reimann
[29]
2009 Austria multivariate data analysis
The process of statistical data analysis starts with looking at the data with appropriate graphical tools.
For instance, histogram is giving us an idea about the data distribution.
It can estimate and view the relationship between different variables.
- This method is complex and involves high level of mathematical
calculations.
- The results are not sometimes easy to interpret.
- It needs a large amount of data.
10 Astrid Schneider, Gerhard Hommel, and Maria Blettner
2010 Germany Linear Regression Analysis
There are three types of regression analysis: Linear regression, Logistic
- Describe the relationships between dependent and
The most common problem in medical data analysis is missing
[22] regression, and Cox regression. And the Linear Regression is the most common approach or tool used for modelling univariate time-series. And this tool is used for statistical and predictive analysis.
independent variables.
-The values of dependent variables can be estimated from the observed.
- The risk factors that affect the outcome can be identified.
- It summarise the real observations of model.
values.
11 Choong-Yeun Liong and Sin-Fan Foo
[24]
2013 Malaysia logistic regression (LR) analysis and Linear discriminant
analysis (LDA)
There are two methods used for classifying data: logistic regression (LR) and Linear discriminant analysis (LDA), these methods of classifying data depends on set of predictor variables.
The Logistic regression (LR) is more robust than the Linear discriminant analysis (LDA).
- Logistic regression (LR) method performs better distribution of the data.
- Linear discriminant analysis (LDA) needs short computing time with large sample size.
- Logistic regression (LR) needs long computing time with large sample size.
- The Linear
discriminant analysis (LDA) will give low performance when the prior probability is equal for all groups.
12 Adela SASU [26] 2013 Romania ARIMA
Linear Regression (LR(
Multilayer perceptrons network
(MLP)
The purpose from the forecasting process is to observe or model the existing data series to predict accurately the future unknown data values, for example: after five years.
The Linear regression (LR) and Multilayer perceptrons network (MLP) give more accurate predictions.
- ARMA model is suitable only for the time series that is stationary.
- ARIMA gives less accurate predictions results.
13 Suzilah Ismaila, Rohaiza Zakariaa
2014 Malaysia Quantitative (Projective)
The process of forecasting is not simple, because it
- This technique gives valuable judgment.
- This technique require large amount of data in
and Tuan Zalizam Tuan Mudab [25]
forecasting technique
requires expert and implicit knowledge in producing precise forecast values. For statistical data, the quantitative forecasting technique is suitable.
- Also it giving more weight to recent data.
order to formulate the mathematical model.
14 Jason Dykes [12] 1998 UK Cartographic visualization
Cartographic visualization is a map based method.
And it is used by locating symbols on a plane to show the statistical distributions of one or more variables.
Also local statistics and indicators can be calculated and visualized in a dynamic fashion for exploration.
- Cartographic visualization use the locations inside map to
show data values.
- Interactive way.
- Very hard for automatically created dynamic maps.
15 Ben Fry [15] 2008 USA Mapping The visual can
communicate or announce more information than the table within a small space.
- make comparison between different areas.
- present the geographic location and the distribution of the data.
- The size of symbol may hide the location on map.
- It consumes more time to execute, depends on size of data.
16 Adriana REVEIU, Marian DARDALA
[16]
2011 Romania Cartographic visualization
Cartographic visualization provides the facilities to represent statistical data inside the map.
- Show the distribution of statistical data inside the regional map.
- use different symbols to show more information at the same time inside the map.
- It consumes more time to execute, when the volume of data increase.
17 Statistics Netherlands [13]
2012 Netherla nds
Proportional symbol map
Proportional symbol map is a good way of viewing statistical data. It aims to view the characteristics of a subject on the map.
- present the geographic location and the distribution of the data.
- make comparison between different areas.
- summarise large amounts of data.
- It is difficult to calculate the actual value (if not shown).
- It consumes more time to execute.
- The size of symbol may hide the location on map.
18 Mohammad S.
Alam [28]
2013 Russia - Tree-map
- Circle Packing
- Sunburst
- Circular Network Diagram
One of the most requirements that the analyst tools should meet is to show more than one view per representation display. Circular Network Diagram is the most appropriate method for visualizing the statistical data.
- Tree-map:
hierarchical grouping clearly shows data relations.
- Circle Packing:
space-efficient.
- Sunburst: easily perceptible by most humans.
- Circular Network Diagram: allows us to make relative data representation. And inside the circle, the resolution varies linearly, increasing with radial position.
- Tree-map: not suitable for examining historical trends and time patterns.
- Circle packing: same as for Tree-map method.
- Sunburst: same as for Tree-map method.
- Circular Network Diagram: objects with the smallest parameter weight can be suppressed by larger ones.
19 Heinrich Hartmann [14]
2016 USA Histograms
Line plots
One of the most essential data analysis methods is visualization. The Histogram is a
- Histogram view number of values within interval.
- Histogram used only for numerical.
- Line plot is difficult to
Rug plots
Scatter plot
method. It is commonly used to show the distribution of numerical data.
- Line plot show changes in data over
time.
- Scatter plot show the relationship between
two variables.
read accurately, if there is a wide range of data.
- Scatter plot cannot show the relation of more than two variables.