Exercise 4: Spatial statistics

RYM-C2004 Fall 2021

Contents

Description:

Step 1: Import GIS data from CSV file

Step 2: Reprojection

Step 3: Joining the tables

Step 4: Heatmap

What to report?

Questions and problems?

Description:

In this exercise we will work with vector data scraped from the eat.fi website. The data contains restaurants in the Helsinki area, represented as points. We will learn a few new vector spatial analysis methods and perform simple spatial statistics. During this exercise, we will learn how to:

• import GIS data from a spreadsheet

• work with CRS and projections

• perform a join

• create a heatmap

Step 1: Import GIS data from CSV file

Our vector data is in CSV format, so we first need to turn it into a GIS layer.

Let’s open the data and see what we have there. You can open the CSV file in Excel. It has a number of fields with information on each restaurant retrieved from eat.fi. The fields called latitude and longitude contain the coordinates of each restaurant. Using these fields, we can place each restaurant on the map as a point and display it in GIS.

Let’s import the data into QGIS. You can open the import window by pressing Ctrl+Shift+T or from Layer > Add Layer > Add Delimited Text Layer…

In this new window let’s locate our Restaurants_points.csv file in the File name. Let’s set Layer name to Restaurants (or anything you like). File format is of course CSV. Point coordinates should be loaded automatically. Make sure it matches the following:

X field: longitude

Y field: latitude

(Z field and M field are left blank)

The dataset uses the WGS 84 (EPSG:4326) coordinate system. Make sure this is selected under Geometry CRS.

Make sure everything looks OK in the Sample Data. If all looks good click Add. The spatial layer is now added to QGIS.

While in the same window, let’s also add the restaurants_info.csv file. This time there is no geometry, so select “No geometry (attribute table only)”.

To check if we have set the coordinate system correctly, we need to see if points appear in the right location. The data is from HMA (Helsinki metropolitan area). Let’s add a base map and see if it appears in the correct location. If it appears at some random location on the globe, it means the CRS is defined incorrectly.
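
The same sanity check can be run on the raw table before anything is drawn. The sketch below reads a CSV with latitude and longitude columns and flags rows that fall outside a rough bounding box around the Helsinki metropolitan area. The sample rows and the box limits are made-up assumptions for illustration, not values taken from Restaurants_points.csv.

```python
import csv
import io

# Hypothetical sample standing in for Restaurants_points.csv (made-up rows).
SAMPLE_CSV = """id,name,latitude,longitude
1,Cafe A,60.1699,24.9384
2,Bistro B,60.2055,24.6559
3,Swapped C,24.9410,60.1675
"""

# Rough bounding box around the Helsinki metropolitan area in WGS 84 degrees.
# These limits are an assumption chosen for this sketch.
LAT_RANGE = (59.9, 60.5)
LON_RANGE = (24.4, 25.4)


def rows_outside_box(csv_text):
    """Return the ids of rows whose coordinates fall outside the box."""
    suspicious = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lat = float(row["latitude"])
        lon = float(row["longitude"])
        in_box = (LAT_RANGE[0] <= lat <= LAT_RANGE[1]
                  and LON_RANGE[0] <= lon <= LON_RANGE[1])
        if not in_box:
            suspicious.append(row["id"])
    return suspicious


print(rows_outside_box(SAMPLE_CSV))
```

Row 3 in the sample has latitude and longitude swapped, which is the same mistake as picking the wrong X and Y fields in the import window, so it is the one that gets flagged.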

Step 2: Reprojection

The data is retrieved from a web map, and the global coordinate system WGS84 is very common with these maps. However, our analysis is local, and it is best to proceed with a CRS that suits our region: TM35FIN (EPSG:3067).

To change our CRS, we need to reproject the data. This used to involve a lot of calculations, but now we have ready-made GIS tools for that. In the Processing Toolbox, search for “Reproject layer” or find it under “Vector general”.

Our input layer is the restaurants data. You can select the right target CRS by clicking on the globe icon and searching for “3067”. Click OK and then Run. I saved the new layer as “restaurants_prj”. Go ahead and delete the old restaurants layer.
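
To give an idea of the calculations the Reproject layer tool takes care of, here is a minimal sketch of the forward transverse Mercator projection with the TM35FIN parameters (GRS80 ellipsoid, central meridian 27°E, scale factor 0.9996, false easting 500 000 m), using the Krüger series. It is an illustration only, not what QGIS runs internally: the function name is made up, and the sketch treats WGS 84 and ETRS89 coordinates as identical, ignoring the small datum difference between them.

```python
import math

# TM35FIN (EPSG:3067) parameters: GRS80 ellipsoid, central meridian 27°E,
# scale factor 0.9996, false easting 500 000 m, false northing 0 m.
A = 6378137.0
F = 1 / 298.257222101
LON0_DEG = 27.0
K0 = 0.9996
E0 = 500000.0


def wgs84_to_tm35fin(lat_deg, lon_deg):
    """Project geographic degrees to (easting, northing) in metres."""
    n = F / (2 - F)
    a1 = A / (1 + n) * (1 + n**2 / 4 + n**4 / 64)
    e = math.sqrt(2 * F - F**2)
    # Coefficients of the Krüger series, truncated after the n^4 terms.
    h = [
        n / 2 - 2 * n**2 / 3 + 5 * n**3 / 16 + 41 * n**4 / 180,
        13 * n**2 / 48 - 3 * n**3 / 5 + 557 * n**4 / 1440,
        61 * n**3 / 240 - 103 * n**4 / 140,
        49561 * n**4 / 161280,
    ]
    phi = math.radians(lat_deg)
    dlon = math.radians(lon_deg - LON0_DEG)
    # Conformal latitude, then projection on the sphere, then the series.
    q = math.asinh(math.tan(phi)) - e * math.atanh(e * math.sin(phi))
    beta = math.atan(math.sinh(q))
    eta0 = math.atanh(math.cos(beta) * math.sin(dlon))
    xi0 = math.asin(math.sin(beta) * math.cosh(eta0))
    xi = xi0 + sum(hk * math.sin(2 * k * xi0) * math.cosh(2 * k * eta0)
                   for k, hk in enumerate(h, start=1))
    eta = eta0 + sum(hk * math.cos(2 * k * xi0) * math.sinh(2 * k * eta0)
                     for k, hk in enumerate(h, start=1))
    return E0 + K0 * a1 * eta, K0 * a1 * xi


easting, northing = wgs84_to_tm35fin(60.17, 24.94)  # a point in central Helsinki
print(round(easting), round(northing))
```

Points on the central meridian get an easting of exactly 500 000 m, and the northing is measured from the equator, which is why coordinates in Helsinki have northings above six million metres.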

✨ If your base map looks stretched and weird, it is because your data CRS and project CRS do not match. You can change the project CRS to TM35FIN (EPSG:3067) from the top menu: Project > Properties > CRS

Step 3: Joining the tables

The restaurant locations and the additional information about them are now in two separate files. We need to link the two datasets so that we can study them together. For this, we need to perform a “Join” using a common field in the two datasets. The common field can be any kind of ID that shows which rows in the two datasets correspond. In our case, there is a field named “id” which is common to the two datasets (see for yourself in the attribute table).

You can perform a join by right-clicking on your point layer > Properties > Joins

We will join our points with the restaurants_info table, matching rows on the “id” field in both datasets. Click OK and then OK again.

Now if you go to the attribute table of your points layer, you can see the newly added fields from the info table.
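
What the join does to the attribute table can be sketched in a few lines of plain Python. The rows below are made up; only the “id” key, the field names ratingAvgQuality and priceAvg, and the “Restaurants_info_” prefix follow the names used in this exercise. The function name is hypothetical.

```python
# Hypothetical rows standing in for the point layer and the info table.
points = [
    {"id": "1", "latitude": 60.1699, "longitude": 24.9384},
    {"id": "2", "latitude": 60.2055, "longitude": 24.6559},
    {"id": "3", "latitude": 60.2934, "longitude": 25.0378},
]
info = [
    {"id": "1", "ratingAvgQuality": 4.2, "priceAvg": 3.0},
    {"id": "2", "ratingAvgQuality": 3.1, "priceAvg": 2.0},
]


def join_tables(target, join_rows, key="id", prefix="Restaurants_info_"):
    """Attach the fields of join_rows to every target row with the same key."""
    lookup = {row[key]: row for row in join_rows}
    field_names = [name for name in join_rows[0] if name != key]
    joined = []
    for row in target:
        match = lookup.get(row[key], {})
        out = dict(row)
        for name in field_names:
            # Rows without a match keep the new fields, but with empty values.
            out[prefix + name] = match.get(name)
        joined.append(out)
    return joined


for row in join_tables(points, info):
    print(row)
```

Every point is kept. A point whose id is missing from the info table (id 3 in the sample) simply gets empty values in the joined fields, so it is worth scanning the attribute table for such rows after joining.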

✨ You can always come back to this window to edit/add/remove Joins.

Step 4: Heatmap

With the help of a heatmap we can identify the restaurant hotspots in the Helsinki area. This can be purely based on the number of restaurants, or we can include other factors such as the quality or rating of the restaurants. Let’s try a few of these possibilities here.

QGIS makes it easy to create good-looking heatmaps. In the Layers panel, right-click on your restaurant points layer and go to Properties. Go to Symbology and, in the dropdown list at the very top of the window, choose Heatmap as the visualization method.

Choose a color ramp that is most intuitive, for example one ranging from white to red. Once you select your color ramp, click on it and set Color 1 to transparent. This way, the areas with a low density of restaurants will be transparent, so you can still see your base map (you can try the map without this and see what it looks like). Click OK and now we are back in the Symbology window.

Now we need to select the radius for the heatmap. This determines the distance within which the densities are calculated. Next to Radius there is a list of units, which by default should be set to Millimeters. Here we have two sets of options:

For a dynamic heatmap: Choose millimeters, points, pixels, or inches. With this kind of radius definition, the heatmap will update each time you change the extent of your map, zoom in, or zoom out.

Let’s make a dynamic heatmap with a 10 mm radius. Click Apply (or OK) and explore your map.

For a static heatmap: Choose map units as the type of radius. This way, the radius is defined in actual map units and your visualization is not affected by the extent of your view. Our layer uses the CRS TM35FIN, which is a metric system, meaning that distances are defined in meters. So, the value we enter as radius will be in meters. Let’s use a 100 m radius and click Apply (or OK).
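
The difference between the two kinds of radius comes down to map scale: a radius given in millimeters covers a different distance on the ground at every zoom level, while a radius in map units always covers the same distance. A small sketch of the conversion (the function name and the example scales are assumptions for illustration):

```python
def radius_on_ground_m(radius_mm, scale_denominator):
    """Ground distance in metres covered by a radius drawn in millimetres
    on a map at scale 1:scale_denominator."""
    return radius_mm * scale_denominator / 1000.0


for scale in (10_000, 50_000, 250_000):
    ground = radius_on_ground_m(10, scale)
    print(f"1:{scale}: a 10 mm radius covers {ground:.0f} m on the ground")
```

At 1:10 000 the 10 mm radius covers the same 100 m as the static setting, while at 1:250 000 it covers 2 500 m, which is why the dynamic heatmap keeps changing as you zoom.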

❓ How does the map look? Where are the hot spots? Describe any patterns you see.

❓ Try to make the same heat map with bigger and smaller radius distances. How does the result change?

✨ Choice of radius in heatmaps and density maps in general is very important as it can significantly affect what you see. Knowing your data, the scale of your study, and trial and error often give you good clues on a good radius value.
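
To see why the radius matters so much, here is a minimal sketch of how a kernel density value can be computed at one location. It assumes a quartic kernel, where a point counts fully at distance zero and fades to nothing at the radius; the kernel QGIS actually uses is not stated in this handout, so treat the shape as an assumption. The points are made up and given in meters.

```python
import math


def quartic_kernel(distance, radius):
    """Contribution of one point: 1 at distance 0, falling to 0 at the radius."""
    if distance >= radius:
        return 0.0
    return (1 - (distance / radius) ** 2) ** 2


def density_at(x, y, points, radius, weights=None):
    """Sum the (optionally weighted) kernel contributions of all points."""
    if weights is None:
        weights = [1.0] * len(points)
    total = 0.0
    for (px, py), weight in zip(points, weights):
        total += weight * quartic_kernel(math.hypot(px - x, py - y), radius)
    return total


# Made-up point coordinates in metres, as in a projected CRS.
sample_points = [(0, 0), (50, 0), (0, 80), (300, 300)]

for radius in (100, 500):
    print(radius, round(density_at(0, 0, sample_points, radius), 3))
```

With the 100 m radius the point at (300, 300) contributes nothing to the value at the origin, while with 500 m it does, and the nearer points count more as well. A larger radius therefore smooths local clusters into broader hotspots.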

✨ You can adjust the rendering quality to find the right balance between map quality and rendering speed. If your map is slow to load, try lowering the resolution (move the slider toward faster).

Let’s modify our heatmap a little. Now we also want to account for the overall rating of the restaurants when determining the hotspots, so we want to see which places have a higher concentration of highly rated restaurants. For this we will use the field “Restaurants_info_ratingAvgQuality”.

Go back to the Symbology window where we created our heatmap. Find the field name “Restaurants_info_ratingAvgQuality” in the list Weight points by. Click Apply. Do you see any changes in the heatmap? Let’s increase the effect of the rating in the heatmap calculation to accentuate the patterns. In the field Weight points by, add ^ 10 at the end of the field name:

Restaurants_info_ratingAvgQuality ^ 10

This way the number of restaurants is weighted by the restaurant ratings raised to the power of 10 (rating^10), and therefore the rating has a much stronger effect on the outcome. Click Apply or OK.
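
A quick calculation shows how strongly the exponent changes the balance between restaurants. The two ratings below are made-up values, and the rating scale is an assumption; the function name is hypothetical.

```python
def weight_ratio(rating_a, rating_b, power=1):
    """How many times more restaurant A counts than restaurant B
    when points are weighted by rating ** power."""
    return (rating_a / rating_b) ** power


high, low = 5.0, 3.0  # hypothetical average ratings

print(round(weight_ratio(high, low), 2))            # weighting by the plain field
print(round(weight_ratio(high, low, power=10), 2))  # weighting by rating ^ 10
```

With the plain field, the better restaurant counts less than twice as much as the other one. With the exponent it counts more than 150 times as much, so the heatmap is dominated by the highest-rated restaurants.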

❓ How does the map look now? How did the patterns change? Where do we have a higher concentration of high-quality restaurants?

Let’s repeat the same process but this time for the price average “Restaurants_info_priceAvg”.

So we will write:

Restaurants_info_priceAvg ^ 10

Click OK or Apply and explore your map.

❓ How does the map look now? How did the patterns change? Where do we have a higher concentration of expensive restaurants?

✨ If you want to try other possibilities or generate a raster output of your heat map, try the tool Kernel density estimation. It is not required for this exercise, but it is a useful tool to learn, and its settings will look familiar from the steps above. The advanced parameters in this tool give you more control over your heat map. (The tool may take a while to run on a dataset of this size.)

What to report?

• Explain in your own words what we did in this exercise

• Include screenshots of the process and some of the steps you took in this exercise

• Include at least two final maps (for example: hot spot based on restaurant quality and price)

• Answer these questions: (if needed, include maps to justify your response)

o What is the difference between the dynamic and static heat map?

o Where do we have the highest concentration of restaurants in Helsinki?

o How is the distribution of expensive restaurants? Where are the hotspots?

DEADLINE: 15.10

Questions and problems?

If you have any difficulty doing this exercise or have any questions, ask the instructor during exercise sessions or in the course forum, or write to him at [email protected]
