Technologies and Developments for Earth Observations Data Analysis and Visualisation

(1)


Uttam Kumar

Centre for Ecological Sciences, Indian Institute of Science, Bangalore - 560012. Email: [email protected]

Volumes of Data Mining Information Pattern?

(2)

10th January, 2013

Agenda

• Some new image classification techniques for handling coarse resolution data for LCLU applications.
  – The mixed pixel problem
  – Hybrid Bayesian Classifier
• Development of Free and Open Source Software: GRDSS for Geospatial Applications
• Web Based Application for Geovisualisation

(3)

• Some new image classification techniques for handling coarse resolution data for LCLU applications.
  – The mixed pixel problem

(4)

Time scale | Satellite / Source | Sensor | Spectral bands | Spatial resolution in metres (m) | Temporal resolution
1972 – 1999 | Landsat-1, 5 and 7 | MSS, TM, ETM+ | PAN, VIS, NIR, MIR, TIR | 15 m – 120 m (moderate spatial resolution) | 16-18 days (free)
1988 – 2010 | IRS-1C/1D, P6 | PAN, LISS-III | PAN, VIS-2, NIR-1 (low spectral resolution) | 5.8 m – 23.5 m (high to moderate spatial resolution) | 24 days (medium cost, moderate temporal resolution)
1999 – Till date | IKONOS | OSA | PAN, VIS-3, NIR-1 | 1 m (PAN), 4 m (others) (high spatial resolution) | 1-3 days (costly)
: | : | : | : | : | :
1999 – Till date | Terra, Aqua | MODIS | VIS, NIR, MIR, TIR; 36 bands (high spectral resolution) | 250 m – 1 km (low spatial resolution) | 1-2 days (free & high temporal resolution)
2002 | SRTM (Shuttle Radar Topography Mission) | --- | DEM-1 | 90 m | 1 time (free)
2002 | Radar - Hydro 1K Asia | --- | Precipitation, Slope, Aspect-1 | 1 km | 1 time (free)

(5)

(6)

Develop techniques for deriving information from coarse spatial resolution data (such as MODIS).

(7)

1.) What are the techniques to obtain class proportions from mixed pixels?

2.) What are the ways of identifying/extracting endmembers from the bands?

3.) How can mixed pixels be addressed when the object reflectances mix non-linearly?

4.) How can we address the intra-class spectral variation or endmember variability?

5.) How can we predict the spatial distribution of class abundances at sub-pixel resolution within a particular pixel, given the abundances obtained from linear/non-linear mixture models?

(8)

Linear unmixing

$$\mathbf{y}(x, y) = \sum_{n=1}^{N} \mathbf{e}_n \, \alpha_n(x, y) + \boldsymbol{\eta}$$

$\alpha_n(x, y)$ is a scalar value representing the fractional coverage of endmember vector $\mathbf{e}_n$ at pixel $\mathbf{y}(x, y)$.

In matrix form:

$$\mathbf{y} = \mathbf{E}\boldsymbol{\alpha} + \boldsymbol{\eta}$$

Constraints:

1.) Abundance nonnegativity constraint: $\alpha_n \geq 0, \ \forall n: 1 \leq n \leq N$

2.) Abundance sum-to-one constraint: $\sum_{n=1}^{N} \alpha_n = 1$

This can be solved in two ways:

1. Ordinary least squares

2. Orthogonal subspace projection

(9)

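Ordinary Least Squares

• The conventional approach to extract the abundance values is to minimise $\|\mathbf{y} - \mathbf{E}\boldsymbol{\alpha}\|$.

The Unconstrained Least Squares (ULS) estimate of the abundance is

$$\hat{\boldsymbol{\alpha}}_{ULS} = (\mathbf{E}^T\mathbf{E})^{-1}\mathbf{E}^T\mathbf{y}$$

Minimising $\|\mathbf{y} - \mathbf{E}\boldsymbol{\alpha}\|$ subject to the abundance sum-to-one constraint gives the Constrained Least Squares (CLS) estimate of the abundance as

$$\hat{\boldsymbol{\alpha}}_{CLS} = (\mathbf{E}^T\mathbf{E})^{-1}\left(\mathbf{E}^T\mathbf{y} - \frac{\lambda}{2}\mathbf{1}\right), \qquad \lambda = \frac{2\left(\mathbf{1}^T(\mathbf{E}^T\mathbf{E})^{-1}\mathbf{E}^T\mathbf{y} - 1\right)}{\mathbf{1}^T(\mathbf{E}^T\mathbf{E})^{-1}\mathbf{1}}$$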
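The two estimates on this slide can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the endmember matrix `E`, the abundance vector `alpha_true` and the function names are made up for the example and are not taken from the study.

```python
import numpy as np

def uls_abundance(E, y):
    """Unconstrained least squares: (E^T E)^{-1} E^T y."""
    return np.linalg.solve(E.T @ E, E.T @ y)

def cls_abundance(E, y):
    """Least squares with the sum-to-one constraint (Lagrange multiplier)."""
    G_inv = np.linalg.inv(E.T @ E)
    ones = np.ones(E.shape[1])
    a_uls = G_inv @ E.T @ y
    lam = 2.0 * (ones @ a_uls - 1.0) / (ones @ G_inv @ ones)
    return a_uls - 0.5 * lam * (G_inv @ ones)

# Hypothetical example: 4 bands (rows), 3 endmembers (columns).
E = np.array([[0.10, 0.50, 0.90],
              [0.20, 0.60, 0.40],
              [0.70, 0.30, 0.20],
              [0.80, 0.10, 0.60]])
alpha_true = np.array([0.2, 0.5, 0.3])
y = E @ alpha_true                      # noise-free mixed pixel

alpha_uls = uls_abundance(E, y)
alpha_cls = cls_abundance(E, y)
print(alpha_uls, alpha_cls, alpha_cls.sum())
```

Note that this closed form enforces only the sum-to-one constraint; nonnegativity would need an additional step such as an iterative solver.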

(10)

Orthogonal Subspace Projection

• The technique involves (i) finding an operator which eliminates undesired spectral signatures, and then (ii) choosing a vector operator which maximises the SNR of the residual spectral signature.

• General linear unmixing equation:

$$\mathbf{r} = \mathbf{M}\boldsymbol{\alpha} + \mathbf{n}$$

r = column vector of digital numbers
M = matrix representing target spectral signatures
α = abundance fraction
n = model error

Separating the desired signature $\mathbf{d}$ from the undesired signatures $\mathbf{U}$ gives the (d, U) model:

$$\mathbf{r} = \mathbf{d}\alpha_p + \mathbf{U}\boldsymbol{\gamma} + \mathbf{n}$$

We apply an operator "P" on the (d, U) model to annihilate U, which results in a new signal detection model.

$$P = I - \mathbf{U}\mathbf{U}^{\#}$$

where $\mathbf{U}^{\#} = (\mathbf{U}^T\mathbf{U})^{-1}\mathbf{U}^T$ is the pseudo-inverse of $\mathbf{U}$.

(11)

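On applying $P$ on $\mathbf{r} = \mathbf{d}\alpha_p + \mathbf{U}\boldsymbol{\gamma} + \mathbf{n}$ we get

$$P\mathbf{r} = P\mathbf{d}\alpha_p + P\mathbf{U}\boldsymbol{\gamma} + P\mathbf{n}$$

$P$ operating on $\mathbf{U}\boldsymbol{\gamma}$ reduces the contribution of $\mathbf{U}$ to about zero, so we get

$$P\mathbf{r} = P\mathbf{d}\alpha_p + P\mathbf{n}$$

On using a linear filter specified by a weight vector $\mathbf{x}^T$ on the OSP model, the filter output is given by

$$\mathbf{x}^T P\mathbf{r} = \mathbf{x}^T P\mathbf{d}\alpha_p + \mathbf{x}^T P\mathbf{n}$$

Now, we need to maximise the signal to noise ratio (SNR) of the filter output

$$\mathrm{SNR}(\mathbf{x}) = \frac{\mathbf{x}^T P\mathbf{d}\,\alpha_p^2\,\mathbf{d}^T P^T\mathbf{x}}{\mathbf{x}^T P\,E\{\mathbf{n}\mathbf{n}^T\}\,P^T\mathbf{x}} = \frac{\alpha_p^2}{\sigma^2}\,\frac{\mathbf{x}^T P\mathbf{d}\mathbf{d}^T P^T\mathbf{x}}{\mathbf{x}^T P P^T\mathbf{x}}$$

Maximisation of this is a generalized eigenvalue-eigenvector problem

$$P\mathbf{d}\mathbf{d}^T P^T\mathbf{x} = \tilde{\lambda}\,P P^T\mathbf{x}, \qquad \text{where } \tilde{\lambda} = \lambda\left(\sigma^2 / \alpha_p^2\right)$$

The eigenvector which has the maximum λ is the solution of the problem and it turns out to be $\mathbf{d}$.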

(12)

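One of the eigenvalues is $\mathbf{d}^T P\mathbf{d}$ and it turns out that the value of $\mathbf{x}^T$ (filter) which maximises the SNR is

$$\mathbf{x}^T = k\,\mathbf{d}^T$$

Applying $\mathbf{d}^T P$ on $P\mathbf{r} = P\mathbf{d}\alpha_p + P\mathbf{n}$ (obtained by applying "P" on $\mathbf{r} = \mathbf{d}\alpha_p + \mathbf{U}\boldsymbol{\gamma} + \mathbf{n}$) gives

$$\mathbf{d}^T P P\mathbf{r} = \mathbf{d}^T P P\mathbf{d}\,\alpha_p + \mathbf{d}^T P P\mathbf{n}$$

Since $P$ is a projection matrix ($PP = P$), the estimate is

$$\alpha_p = \frac{\mathbf{d}^T P\mathbf{r}}{\mathbf{d}^T P\mathbf{d}}$$

$\alpha_p$ is the abundance estimate of the pth target material.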
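As a rough illustration of the OSP estimate $\alpha_p = \mathbf{d}^T P\mathbf{r} / \mathbf{d}^T P\mathbf{d}$, the sketch below builds the projector from the undesired signatures and recovers a known abundance from a noise-free pixel. The signatures `d`, `U` and the mixing coefficients are made-up values, and `osp_abundance` is a hypothetical helper name.

```python
import numpy as np

def osp_abundance(r, d, U):
    """Abundance of target d in pixel r after annihilating U."""
    # P = I - U U^#, with U^# the pseudo-inverse of U.
    P = np.eye(U.shape[0]) - U @ np.linalg.pinv(U)
    return float(d @ P @ r) / float(d @ P @ d)

# Hypothetical 4-band signatures: one target (d), two undesired (columns of U).
d = np.array([0.9, 0.4, 0.2, 0.6])
U = np.array([[0.10, 0.50],
              [0.20, 0.60],
              [0.70, 0.30],
              [0.80, 0.10]])
gamma = np.array([0.3, 0.2])

r = 0.5 * d + U @ gamma          # pixel with target abundance 0.5, no noise
alpha_p = osp_abundance(r, d, U)
print(alpha_p)
```

Because $P\mathbf{U} = 0$, the undesired signatures drop out and only the target term survives in the numerator.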

(13)

FCC of the study area from (a) IKONOS (PAN and MS fused), (b) IKONOS MS, (c) Landsat ETM+ and (d) MODIS.

(14)

Remote sensing data sets used for validating CLS and OSP algorithms

Data | Spectral bands | Spatial resolution | Dimension | 2 classes | 3 classes | 4 classes
IKONOS PAN and MS fused | 4 | 1 m | 8000 x 8000 | vegetation, non-vegetation | urban, vegetation, water | urban, vegetation, water, open area
IKONOS | 4 | 4 m | 2000 x 2000 | vegetation, non-vegetation | urban, vegetation, water | ---
Landsat | 6 | 30 m resampled to 25 m | 320 x 320 | vegetation, non-vegetation | urban, vegetation, water | urban, vegetation, water, open area
MODIS | 7 | 250 m | 32 x 32 | vegetation, non-vegetation | urban, vegetation, water | urban, vegetation, water, open area

(15)

Unmixed outputs from CLS and OSP for 2 classes (vegetation, non-vegetation), and 3 classes (urban, vegetation and water) from IKONOS MS data.

(16)

Results

• Correlation and RMSE for IKONOS, Landsat ETM+ and MODIS images for 2, 3 and 4 classes.

(17)

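Endmember Selection

• The rationale behind the new method is that, given the spectral reflectance of the mixed pixel, if the proportions of all the endmembers ($\alpha_n$; $n$ = 1 to $N$) in that pixel are known, then the spectral reflectance of each endmember that constitutes the mixed pixel can be approximated by inverting the LMM.

$$\mathbf{y} = \mathbf{E}\boldsymbol{\alpha} + \boldsymbol{\eta}$$

The endmember estimate for each band turns out to be

$$\mathbf{E} = [\boldsymbol{\alpha}^T\boldsymbol{\alpha}]^{-1}(\boldsymbol{\alpha}^T\mathbf{y})$$

Here $y^m_{i,j}$ is the value of pixel $(i, j)$ in band $m$ and $\alpha^n_{i,j}$ is the proportion of endmember $n$ in that pixel.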

(18)

For band $m$, the LMM is written for every pixel from $(1, 1)$ to $(r, c)$:

$$\begin{aligned}
y^m_{1,1} &= e^m_1\alpha^1_{1,1} + e^m_2\alpha^2_{1,1} + \dots + e^m_n\alpha^n_{1,1} + \dots + e^m_N\alpha^N_{1,1} \\
y^m_{1,2} &= e^m_1\alpha^1_{1,2} + e^m_2\alpha^2_{1,2} + \dots + e^m_n\alpha^n_{1,2} + \dots + e^m_N\alpha^N_{1,2} \\
&\ \vdots \\
y^m_{i,j} &= e^m_1\alpha^1_{i,j} + e^m_2\alpha^2_{i,j} + \dots + e^m_n\alpha^n_{i,j} + \dots + e^m_N\alpha^N_{i,j} \\
&\ \vdots \\
y^m_{r,c} &= e^m_1\alpha^1_{r,c} + e^m_2\alpha^2_{r,c} + \dots + e^m_n\alpha^n_{r,c} + \dots + e^m_N\alpha^N_{r,c}
\end{aligned}$$

that is,

$$y^m_{i,j} = \sum_{n=1}^{N} e^m_n\,\alpha^n_{i,j}$$

In matrix form:

$$\begin{bmatrix} y^m_{1,1} \\ y^m_{1,2} \\ \vdots \\ y^m_{r,c} \end{bmatrix} = \begin{bmatrix} \alpha^1_{1,1} & \alpha^2_{1,1} & \dots & \alpha^N_{1,1} \\ \alpha^1_{1,2} & \alpha^2_{1,2} & \dots & \alpha^N_{1,2} \\ \vdots & \vdots & & \vdots \\ \alpha^1_{r,c} & \alpha^2_{r,c} & \dots & \alpha^N_{r,c} \end{bmatrix} \begin{bmatrix} e^m_1 \\ e^m_2 \\ \vdots \\ e^m_N \end{bmatrix}$$

$$\mathbf{E} = [\boldsymbol{\alpha}^T\boldsymbol{\alpha}]^{-1}(\boldsymbol{\alpha}^T\mathbf{y})$$

This is done for each band separately.
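The per-band inversion can be sketched with a standard least-squares solver. The sketch assumes the abundances are stacked as a (pixels x classes) matrix `A` and the image as a (pixels x bands) matrix `Y`; these layouts, the random test data and the name `estimate_endmembers` are illustrative assumptions, not the author's implementation.

```python
import numpy as np

def estimate_endmembers(A, Y):
    """A: (pixels, N) known abundances; Y: (pixels, bands) pixel values.

    Returns an (N, bands) array whose column m is the least-squares
    solution [A^T A]^{-1} A^T y^m for band m (A of full column rank).
    """
    E_hat, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return E_hat

rng = np.random.default_rng(0)
A = rng.dirichlet(np.ones(3), size=50)        # 50 pixels, 3 classes, rows sum to 1
E_true = rng.uniform(0.0, 1.0, size=(3, 7))   # 3 endmembers in 7 bands (made up)
Y = A @ E_true                                # noise-free mixed pixels

E_hat = estimate_endmembers(A, Y)
print(np.abs(E_hat - E_true).max())
```

Passing all bands as columns of `Y` solves each band independently, which matches the "each band separately" note on the slide.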

(19)

To compare the performance of PBEE, three endmember identification methods were used:

1. a fully automatic endmember extraction technique – N-FINDR,

2. a supervised interactive technique – a combination of N-Dimensional Visualisation and Scatter Plot, and

3. a semi-automatic technique – K-Means clustering.

(20)

Scatter plots for various bands

(21)

Abundance maps for the 6 classes. Row 1: original abundances obtained from the LISS-III classified map resampled to MODIS image size; row 2: PBEE; row 3: N-FINDR; row 4: N-Dimensional Visualisation; row 5: K-Means clustering.

(22)

Endmember behaviour for the 6 classes (a) to (f) in 7 bands for various techniques (legend: PBEE, N-FINDR).

(23)

• From CC and RMSE, it is concluded that inversion of the LMM can provide a better estimate than other automatic, supervised interactive and semi-automatic methods.

• Shortcoming
  – Abundances should be available per class from some high resolution classified image of the same time frame as the low spatial resolution image, with detailed ground information.

(24)

Non-linear Mixture Model

Kumar, U., S. Kumar Raja, Mukhopadhyay, C., and Ramachandra T. V., (2011), A Multi-layer Perceptron based Non-linear Mixture Model to estimate class abundance from mixed pixels, Proceedings of the 2011 IEEE Students’ Technology Symposium, Indian Institute of Technology, Kharagpur, India, 14-16 January, 2011, Abstract page Number – 31, Track 4 – Image and Multi-dimensional Signal Processing.

(25)

► NLMM accounts for interactions among the ground cover materials (multiple reflections among the materials on the surface).

► Also accounts for topographic features (slope) of the ground surface.

(26)

Non-linear Mixture Model

$$\mathbf{y} = f(\mathbf{E}, \boldsymbol{\alpha}) + \boldsymbol{\eta}$$

where $f$ is an unknown non-linear function that defines the interaction between $\mathbf{E}$ and $\boldsymbol{\alpha}$.

(27)

Architecture of the MLP model.

Structural diagram of the MLP.

The activation rule used here for the hidden and output layer nodes is defined by the logistic function

$$f(x) = \frac{1}{1 + e^{-x}}$$
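A minimal forward pass through a one-hidden-layer MLP with logistic activations might look like the sketch below. The layer sizes (7 input bands, 5 hidden nodes, 6 output classes), the random weights and the function names are assumptions for illustration; the slide does not state the actual architecture or training procedure.

```python
import numpy as np

def logistic(x):
    """Logistic activation f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def mlp_forward(y, W_hidden, b_hidden, W_out, b_out):
    """Map a pixel spectrum y to one output per class."""
    hidden = logistic(W_hidden @ y + b_hidden)
    return logistic(W_out @ hidden + b_out)

rng = np.random.default_rng(1)
n_bands, n_hidden, n_classes = 7, 5, 6      # hypothetical sizes
W_hidden = rng.normal(size=(n_hidden, n_bands))
b_hidden = np.zeros(n_hidden)
W_out = rng.normal(size=(n_classes, n_hidden))
b_out = np.zeros(n_classes)

pixel = rng.uniform(0.0, 1.0, size=n_bands)  # made-up reflectance values
out = mlp_forward(pixel, W_hidden, b_hidden, W_out, b_out)
print(out)
```

With untrained random weights the outputs are meaningless as abundances; the sketch only shows how the logistic function bounds every node output between 0 and 1.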

(28)

Simulated Data Set

A 200-band hyperspectral image generated from spectral libraries of four different minerals - (a) band 1 (b) band 100 (c) band 200.

$$\mathbf{y}(x, y) = \sum_{n=1}^{4} s_n(x, y)\,\mathbf{sig}_n$$

where $\mathbf{sig}_n$ is the signature corresponding to the nth mineral, $s_n(x, y) = \alpha_n(x, y)\log(1 + \alpha_n(x, y))$ is the contribution of endmember $e_n$ and $\alpha_n(x, y)$ is the fractional abundance of $e_n$ in the pixel at $(x, y)$.

(29)

Abundance details of four minerals obtained from the LMM and NLMM.

BDFs of simulated test data obtained from LMM and NLMM.

(30)

Abundances of six categories from NLMM: (a) LISS-3 classified map resampled to 100 x 100 pixels, (b) agriculture, (c) builtup / settlement, (d) forest, (e) plantation / orchard, (f) waste / barren land, (g) water bodies.

(31)

BDFs of MODIS test data from NLMM. (a) agriculture, (b) builtup / settlement, (c) forest, (d) plantation / orchard, (e) waste / barren land, (f) water bodies.

Correlation and RMSE between actual and predicted proportions (correlation p < 2.2e-16).

Classes | Correlation (r), LMM | Correlation (r), NLMM | RMSE, LMM | RMSE, NLMM
Agriculture | 0.6730 | 0.9110 | 0.0518 | 0.0271
Builtup / Settlement | 0.6390 | 0.9345 | 1.0519 | 0.0083
Forest | 0.7310 | 0.9411 | 0.0257 | 0.0062
Plantation / Orchard | 0.6990 | 0.9447 | 0.0280 | 0.0061
Waste / Barren land | 0.6599 | 0.9342 | 0.0431 | 0.0073
Water bodies | 0.7799 | 0.9855 | 0.0061 | 0.0016
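For reference, the two scores reported here can be computed from per-pixel proportions as in the following sketch. The arrays are made-up numbers, not the study data, and `correlation_and_rmse` is a hypothetical helper name.

```python
import numpy as np

def correlation_and_rmse(actual, predicted):
    """Pearson correlation and root mean square error of two maps."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    r = np.corrcoef(actual, predicted)[0, 1]
    rmse = np.sqrt(np.mean((actual - predicted) ** 2))
    return r, rmse

# Hypothetical class proportions for four pixels.
actual = np.array([0.10, 0.40, 0.75, 0.90])
predicted = np.array([0.20, 0.30, 0.85, 0.80])

r, rmse = correlation_and_rmse(actual, predicted)
print(r, rmse)
```

Flattening with `ravel()` lets the same function accept 2-D abundance maps as well as 1-D vectors.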

(32)

Error distribution of MODIS abundance obtained from NLMM (X and Y axes are the two dimensions in feature space and Z axis is the absolute difference between real and estimated class proportion) for the six classes.

(33)

• Computer simulated data: overall RMSE was 0.0089±0.00215 with the LMM and 0.0030±0.0001 with the NLMM when compared to actual class proportions.

• Unmixed MODIS images: overall RMSE was 0.0191±0.022 with the NLMM as compared to 0.2005±0.41 with the LMM, indicating that individual class abundances obtained from the NLMM are very close to what is present on the ground and observed in the high resolution classified image.

(34)

On which side of the pixel is the class situated?

Unmixed abundance map of builtup

(35)

Pixel Swapping Algorithm

Kumar, U., Mukhopadhyay, C., Kumar Raja S., and Ramachandra T. V., (2008), Soft classification based Sub-pixel allocation model, International Conference on Operations Research for a growing nation in conjunction with the 41st Annual Convention of Operational Research Society of India, Tirupati, AP, India, 15-17 December, 2008.

(36)

The pixel swapping algorithm can increase the resolution of the OSP output from 136 x 140 to 1360 x 1400.

The swapping algorithm

1. Requires some spatial correlation between pixels.

2. Maximises the autocorrelation between the pixels of the image.

3. Takes the abundance output and transforms it into a hard LC class map defined at the sub-pixel scale.

Limitation - it only allows the mapping of hard binary LC (target, non-target) classes.

Atkinson, P. M., 2005, Sub-pixel target mapping from soft-classified, remotely sensed imagery. Photogrammetric Engineering & Remote Sensing, 71(7), pp. 839–846.
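The idea of pixel swapping can be sketched as follows: allocate the target sub-pixels randomly within each coarse pixel, score every sub-pixel by a distance-weighted sum of its neighbours, then swap the least attractive target sub-pixel with the most attractive non-target one inside each coarse pixel. This is a simplified sketch, not the implementation used in the study; the function name, the defaults (`radius=3`, `alpha=3.0`, `n_iter=20`) and the toy abundance map are assumptions.

```python
import numpy as np

def pixel_swap(abundance, scale, n_iter=20, radius=3, alpha=3.0, seed=0):
    """Turn a coarse abundance map into a binary sub-pixel map."""
    rng = np.random.default_rng(seed)
    rows, cols = abundance.shape
    z = np.zeros((rows * scale, cols * scale), dtype=int)
    # Random initial allocation that honours each pixel's proportion.
    for i in range(rows):
        for j in range(cols):
            n_on = int(round(abundance[i, j] * scale * scale))
            block = np.zeros(scale * scale, dtype=int)
            block[:n_on] = 1
            rng.shuffle(block)
            z[i*scale:(i+1)*scale, j*scale:(j+1)*scale] = block.reshape(scale, scale)
    # Exponential distance-decay weights for neighbours within `radius`.
    offsets = [(di, dj, np.exp(-np.hypot(di, dj) / alpha))
               for di in range(-radius, radius + 1)
               for dj in range(-radius, radius + 1)
               if (di, dj) != (0, 0) and np.hypot(di, dj) <= radius]
    for _ in range(n_iter):
        padded = np.pad(z, radius).astype(float)
        attract = np.zeros(z.shape)
        for di, dj, w in offsets:
            attract += w * padded[radius + di: radius + di + z.shape[0],
                                  radius + dj: radius + dj + z.shape[1]]
        swapped = False
        for i in range(rows):
            for j in range(cols):
                sl = (slice(i*scale, (i+1)*scale), slice(j*scale, (j+1)*scale))
                zb, ab = z[sl], attract[sl]          # views into z / attract
                if zb.min() == zb.max():
                    continue                         # pure pixel, nothing to swap
                worst_one = tuple(np.argwhere(zb == 1)[np.argmin(ab[zb == 1])])
                best_zero = tuple(np.argwhere(zb == 0)[np.argmax(ab[zb == 0])])
                if ab[best_zero] > ab[worst_one]:
                    zb[worst_one], zb[best_zero] = 0, 1
                    swapped = True
        if not swapped:
            break
    return z

# Toy 2 x 2 abundance map refined by a factor of 4.
abundance = np.array([[0.0, 0.5],
                      [0.5, 1.0]])
z = pixel_swap(abundance, scale=4)
print(z)
```

Swaps happen only inside a coarse pixel, so the class proportion of every pixel is preserved while the spatial arrangement becomes more clustered.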

(37)

Sub-pixel mapping of a linear feature and a circle:

(A) Test image-line, (B) Abundance, (C) Random allocation, (D) After convergence, (E) Test image-circle, (F) Abundance, (G) Random allocation, (H) Converged map

The number of nearest neighbours was set to 3, and the non-linear parameter α of the exponential model was also set to 3.

(38)

PS on MODIS image

(A) Builtup pixels shown in black and non-built shown in white, (B) Sub-pixel map of builtup, (C) Converged map of the builtup after applying PS algorithm.

Panels: LISS-III (25 m), MODIS abundance (250 m), PS MODIS (25 m), LISS-III Classified (25 m)

(39)

• Sensitivity - 0.6 (proportion of actual positives which are correctly identified)

• Specificity - 0.69 (proportion of negatives which are correctly identified)

• PPV - 0.6 (precision of positives that were correctly identified)

• NPV - 0.69 (precision of negatives correctly identified)

• With the ground truth, the accuracy was 76.6%
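These measures all follow from the four cells of a binary confusion matrix. The sketch below uses made-up counts (not the counts behind the figures above) and a hypothetical helper name `binary_metrics`.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy measures from true/false positive/negative counts."""
    return {
        "sensitivity": tp / (tp + fn),   # actual positives correctly identified
        "specificity": tn / (tn + fp),   # actual negatives correctly identified
        "ppv": tp / (tp + fp),           # predicted positives that are correct
        "npv": tn / (tn + fn),           # predicted negatives that are correct
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts of builtup (positive) and non-builtup sub-pixels.
m = binary_metrics(tp=30, fp=20, tn=40, fn=10)
print(m)
```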

(40)

Kumar, U., Kumar Raja S., Mukhopadhyay, C., and Ramachandra T. V., (2011), Hybrid Bayesian Classifier for Improved Classification Accuracy. IEEE Geoscience and Remote Sensing Letters, vol. 8, no. 3, pp. 473-476.

(41)

• In HBC, the class prior probabilities are determined by unmixing supplementary low spatial, high spectral resolution multi-spectral (MS) data. These priors are then assigned to every pixel of the high spatial, low spectral resolution MS data in Bayesian classification.
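The role of the per-pixel priors can be illustrated with a small sketch. It assumes Gaussian class-conditional likelihoods, which is a common choice for a Bayesian classifier but is not stated on the slide; the class means, covariances, priors and function names are all made up for the example.

```python
import numpy as np

def gaussian_log_likelihood(x, mean, cov):
    """Log density of a multivariate normal at x."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (logdet + maha + len(x) * np.log(2.0 * np.pi))

def classify_pixel(x, means, covs, priors):
    """Pick the class with the largest log posterior.

    priors: class abundances of the coarse pixel that covers this fine
    pixel (obtained by unmixing), used as prior probabilities.
    """
    log_post = [gaussian_log_likelihood(x, m, c) + np.log(p + 1e-12)
                for m, c, p in zip(means, covs, priors)]
    return int(np.argmax(log_post))

# Two hypothetical classes in a 2-band feature space.
means = [np.array([0.2, 0.3]), np.array([0.3, 0.3])]
covs = [np.eye(2) * 0.01, np.eye(2) * 0.01]
x = np.array([0.25, 0.30])          # spectrally ambiguous pixel

label_a = classify_pixel(x, means, covs, priors=[0.9, 0.1])
label_b = classify_pixel(x, means, covs, priors=[0.1, 0.9])
print(label_a, label_b)
```

The same ambiguous spectrum is assigned to different classes depending on the local abundances, which is where the unmixed coarse data can help.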

(42)

Results

Accuracy Assessment for IKONOS data (PA = producer's accuracy, UA = user's accuracy)

Class | Bayesian classifier PA | Bayesian classifier UA | HBC PA | HBC UA
Concrete roofs | 69.99 | 84.01 | 76.49 ↑ | 93.89 ↑
Asbestos roofs | 84.77 | 87.77 | 91.89 ↑ | 94.46 ↑
Vegetation | 94.21 | 87.55 | 87.24 ↓ | 89.13 ↑
Blue plastic roof | 84.33 | 81.17 | 97.00 ↑ | 85.60 ↑
Open area | 51.49 | 69.49 | 95.00 ↑ | 74.22 ↑
Average | 76.96 | 81.99 | 89.52 ↑ | 87.46 ↑

Accuracy Assessment for LISS-III data

Class | Bayesian classifier PA | Bayesian classifier UA | HBC PA | HBC UA
Agriculture | 87.54 | 87.47 | 90.15 ↑ | 95.56 ↑
Builtup | 85.11 | 81.68 | 89.39 ↑ | 98.33 ↑
Forest | 85.71 | 88.73 | 92.61 ↑ | 96.36 ↑
Plantation | 84.44 | 91.73 | 95.95 ↑ | 91.03 ↓
Waste land | 88.03 | 90.37 | 98.67 ↑ | 89.66 ↓
Water bodies | 90.91 | 88.89 | 88.18 ↓ | 97.00 ↑
Average | 86.96 | 88.15 | 92.49 ↑ | 94.66 ↑

(43)

• Increase in overall accuracy, as compared to the conventional Bayesian classifier, by
  – 6% with IRS LISS-III MS and MODIS
  – 9% with IKONOS MS and Landsat MS

(44)

Free and Open Source Tools for Geoinformatics

(45)

(46)

GRASS GIS

• GRASS (Geographic Resources Analysis Support System) is a free GIS software used for
  – geospatial data management and analysis,
  – image processing,
  – graphics/maps production,
  – spatial modelling, and
  – visualization.

• One of the world's biggest open source projects.

• Official project of the Open Source Geospatial Foundation (OSGeo).

(47)

First GRASS Mirror Site (Tier 1) in India at IISc

http://wgbis.ces.iisc.ernet.in/grass

(48)

(49)

GRASS Wiki:

(50)

(51)

(52)

(53)

GRDSS design and conceptual diagram.

(54)

GRDSS data flow

(55)

Functionalities of GRDSS…

…A Quick Look

(56)

(57)

(58)

(59)

(60)

(61)

(62)

(63)

(64)

(65)

Applications …

• Web services
• User Interface
• Platforms: Linux, Handheld
• Raster map operations
• Vector map operations
• Image Processing
• LiDAR

(66)

FOSS Kiosk

(67)

(68)

(69)

(70)

GIS Layers

• Elevation
• LULC
• Place names
• Roads
• Energy
• Communication facilities
• Anganwadi centres
• Educational Facilities
• Medical Facilities
• General Facilities
• Watershed boundaries
• Water Flow structures
• Sacred groves
• Canals, rivers, ponds
• Streams
• Admin boundary

Visualisation Front end

• Ka-Map from maptools.org (works on Apache, UMN Mapserver, PHP)

(71)

(72)

(73)

Current ongoing project:

LCLUC studies of major metropolitan cities of India: A glimpse

(74)

(75)

Bangalore City

(76)

“We are waiting for the city to come to us…”

(77)

LCLUC in Bangalore

(78)

2010

(79)

1973, 1992, 1999, 2006, 2010

(80)

Urban growth map

Types of urban outlying growth highlighted in box: (A) isolated growth, (B) linear branching (road/corridor), (C) clustered growth.

Diffusive growth

(81)

Analysis of Land Surface Temperature

Decreasing Lakes and Parks

Urbanising Bangalore

(82)

(83)

Dividing Bangalore into directional zones: N, NE, E, SE, S, SW, W, NW

[Figure: one plot per directional zone (N, NE, E, SE, S, SW, W, NW); x-axis: Year (1970 to 2010), y-axis: Area (ha)]

(84)

Directional Analysis of Land Surface Temperature

Direction | Mean LST ± SD
N | 21.30 ± 2.39
NE | 22.15 ± 2.22
E | 21.01 ± 2.47
SE | 21.34 ± 2.30
S | 21.71 ± 2.07
SW | 22.19 ± 1.92
W | 22.97 ± 1.72
NW | 22.07 ± 2.25

(85)

Use of spatial metrics to

• quantify the structure of the landscape
• quantify the spatial pattern and composition of features
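Two of the simplest spatial metrics, the number of patches and the size of the largest patch, can be sketched with a flood fill over a binary built-up map. The 4-neighbour connectivity rule, the toy grid and the name `patch_sizes` are assumptions for illustration; dedicated landscape-metric software may use a different neighbourhood rule.

```python
from collections import deque

def patch_sizes(grid):
    """Sizes (in cells) of the 4-connected patches of 1-cells."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] != 1 or seen[i][j]:
                continue
            # Breadth-first flood fill from an unvisited built-up cell.
            queue, size = deque([(i, j)]), 0
            seen[i][j] = True
            while queue:
                a, b = queue.popleft()
                size += 1
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    na, nb = a + da, b + db
                    if (0 <= na < rows and 0 <= nb < cols
                            and grid[na][nb] == 1 and not seen[na][nb]):
                        seen[na][nb] = True
                        queue.append((na, nb))
            sizes.append(size)
    return sizes

# Hypothetical built-up map (1 = built up, 0 = other).
built = [[1, 1, 0, 0],
         [0, 1, 0, 1],
         [0, 0, 0, 1],
         [1, 0, 0, 0]]
sizes = patch_sizes(built)
number_of_patches = len(sizes)
largest_patch = max(sizes)
print(number_of_patches, largest_patch)
```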

(86)

Results of Spatial Metrics

[Figure panels, each plotted by direction (N, NE, E, SE, S, SW, W, NW) for 1973, 1992, 2000, 2006 and 2010: Largest Patch; Built up (total land area); Number of Patches; Ratio of Open space; Clumpiness; Aggregation index; Compactness index of the largest patch]

• Largest patch in N and E in 2010 and medium urban development in W, SW and S. Urban growth is more prominent in the west, southwest and south directions.

• Separate clusters of huge urban patches have come up in the north (Bengaluru International Airport) and east (International Tech Park Limited).

• More compact and moving towards a single big patch in 2010. Open space decreased and urban density increased.

(87)

Urban dynamics through Cellular Automata (CA) based growth models

(88)

Growth model: CA

• CA is based on pixels, states, neighbourhood and transition rules.

Pixel (Initial State) → Transition + External factors → Pixel (Final State)
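A single CA update can be sketched as below: a non-urban pixel becomes urban when enough of its eight neighbours are already urban and an external suitability factor is high enough. The rule, both thresholds, the suitability layer and the function name `ca_step` are hypothetical; the slide does not give the transition rules actually used.

```python
def ca_step(grid, suitability, min_urban_neighbours=3, min_suitability=0.5):
    """Apply one transition step to a binary urban grid (1 = urban)."""
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] == 1:
                continue                      # urban pixels stay urban
            # Count urban pixels in the 8-cell neighbourhood.
            neighbours = sum(
                grid[i + di][j + dj]
                for di in (-1, 0, 1) for dj in (-1, 0, 1)
                if (di, dj) != (0, 0)
                and 0 <= i + di < rows and 0 <= j + dj < cols)
            if (neighbours >= min_urban_neighbours
                    and suitability[i][j] >= min_suitability):
                new_grid[i][j] = 1
    return new_grid

# Toy initial state and a uniform external factor.
grid = [[1, 1, 0],
        [1, 0, 0],
        [0, 0, 0]]
suitability = [[1.0] * 3 for _ in range(3)]
final_state = ca_step(grid, suitability)
print(final_state)
```

The new state is written to a copy so that every pixel is updated from the same initial state, as is usual for synchronous CA.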

(89)

• 584% increase in urban areas during 37 years (1973 to 2010).

• ↑ ~2-4 ºC in local LST.

• 74% ↓ in vegetation cover and 66% ↓ in water bodies.

[Figure panels: Percent impervious surface, NDVI, Temperature]

(90)

Spatial Thinking …

(91)

Thank you