Traffic Driven Analysis of Cellular Data Networks


(1)


Samir R. Das

Computer Science Department Stony Brook University

Joint work with Utpal Paul, Luis Ortiz (Stony Brook U), Milind Buddhikot, 

(2)

Mobile Data Usage

Relatively little research on nature of mobile data traffic.

[Figure: Forecast of Global Mobile Data Traffic. Source: Cisco VNI Mobile. Labeled values: 0.6 EB / month and 10.8 EB / month. Annotation: "Higher than the traffic volume in the entire Global Internet". 1 exabyte = 1 million terabytes.]

(3)

[Diagram: Traffic Analysis, Modeling and Forecasting, Traffic Management]

(4)

Measurement Infrastructure

[Diagram labels: Radio Access Network, Mobility and Session Manager, Internet, Packet Flows, Flow Monitoring Tool, Flow Records, SQL Database]

(5)

Sample Results from Traffic Analysis

• Data collected from a nationwide 2G/3G network circa 2007
– About 10K base stations (BSes), 1M subscribers.
• Significant traffic imbalance per subscriber and per BS
– 1% of subscribers create more than 60% of load.
– 10% of BSes experience more than 50% of load.
• Mobility is generally low
– More than 50% of subscribers stick to just one BS daily.
– Median radius of gyration is ~1 mile.
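The two statistics quoted on this slide can be computed as in the minimal sketch below. It assumes each subscriber's observations are available as planar (x, y) positions in miles (for example, the position of the serving BS per session) and that per-entity loads are plain numbers. The function names and the toy inputs are illustrative assumptions, not taken from the study.

```python
import math

def radius_of_gyration(points):
    """Root-mean-square distance of the visited points from their centroid.

    `points` is a list of (x, y) positions in miles, one per observation
    (hypothetical input format).
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)

def top_share(loads, fraction):
    """Fraction of the total load generated by the heaviest `fraction` of entities."""
    ranked = sorted(loads, reverse=True)
    k = max(1, math.ceil(fraction * len(ranked)))
    return sum(ranked[:k]) / sum(ranked)

# Toy subscriber: seen three times at one BS and once at a BS 2 miles away.
visits = [(0.0, 0.0)] * 3 + [(2.0, 0.0)]
print(radius_of_gyration(visits))

# Toy load vector: one heavy user among ten subscribers.
toy_loads = [91, 1, 1, 1, 1, 1, 1, 1, 1, 1]
print(top_share(toy_loads, 0.10))
```

The same `top_share` helper applies to per-BS loads, which is how a statement such as "10% of BSes experience more than 50% of load" could be checked against a trace.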

(6)

Sample Results from Traffic Analysis

• Mobility is predictable
– Subscribers are almost always found in their top 2-3 most visited locations.
– They return to the same location at the same time of the day with high probability.
• More mobile subscribers tend to generate more traffic.
• Radio resource usage efficiency is very poor

(7)

Functional Influence Among BSes

• Model BS load as time series. Explore causal relationships between pairs of time series.
• Granger Causality – Determines whether one time series is useful in forecasting another when using an autoregressive model.
– Has been used in economics and neuroscience.
• Statistically significant causality exists among neighboring BSes (roughly among half of the neighbors).
• Causality graph and causal path – Make a graph out of causality. Long paths exist in this graph (median = 15
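A Granger test between two BS load series can be sketched as follows. This is a minimal illustration, not the authors' implementation: it fits one autoregressive model of y on its own past, a second that also includes the past of x, and compares residual sums of squares with an F statistic. The lag order, the synthetic series, and the function names are assumptions made for the example. Judging significance would additionally require comparing the statistic against an F distribution.

```python
import numpy as np

def lagged(series, lags):
    """Column k holds the series delayed by k+1 steps, aligned to t = lags..T-1."""
    T = len(series)
    return np.column_stack([series[lags - k - 1:T - k - 1] for k in range(lags)])

def granger_f_stat(x, y, lags=2):
    """F statistic for the hypothesis 'x Granger-causes y' using `lags` AR terms."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    target = y[lags:]
    ones = np.ones((len(target), 1))
    restricted = np.hstack([ones, lagged(y, lags)])   # y's own past only
    full = np.hstack([restricted, lagged(x, lags)])   # plus the past of x

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    rss_r, rss_f = rss(restricted), rss(full)
    dof = len(target) - full.shape[1]
    return ((rss_r - rss_f) / lags) / (rss_f / dof)

# Synthetic pair: series b follows series a with a one-step delay.
rng = np.random.default_rng(0)
T = 500
a = rng.normal(size=T)
b = np.zeros(T)
for t in range(1, T):
    b[t] = 0.8 * a[t - 1] + 0.1 * rng.normal()

print("a -> b:", granger_f_stat(a, b))
print("b -> a:", granger_f_stat(b, a))
```

Running the test for every ordered pair of neighboring BSes and keeping the significant pairs as directed edges would give one way to build a causality graph of the kind the slide describes.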

(8)

Modeling Study

• Model BS traffic loads exploiting any interactions/dependencies
– Exploit tools from machine learning.
– Many possible directions – purely static/spatial, dynamic/temporal.
• Goals:
– Intellectual – broad understanding of any underlying structure would help future network architectures.
– Utilitarian – models can help estimation/forecasting.

(9)

Spatial Modeling Approach: Probabilistic Graphical Modeling

• Assume the loads on n base stations are multivariate Gaussian: $X = (X_1, \dots, X_n) \sim \mathcal{N}(\mu, \Sigma)$, with mean vector $\mu$ and covariance matrix $\Sigma$.
• Learn the parameters, specifically the "inverse covariance matrix" $\Sigma^{-1}$, given a set of training data (p observations).
• $\Sigma^{-1}$ is easier to estimate than $\Sigma$ and exposes interesting properties.

(10)

Inverse Covariance Matrix: Properties

• If entry $(\Sigma^{-1})_{ij}$ of the inverse covariance matrix is $0$, then load variables $X_i$ and $X_j$ are conditionally independent, given the rest of the variables.
• Most problems produce a 'sparse' model.
• Related to probabilistic graphical models (e.g., Gaussian Markov Random Field).

[Figure: Undirected Graphical Model on nodes 1 to 5. $(\Sigma^{-1})_{ij} \neq 0$ -> edge; $(\Sigma^{-1})_{ij} = 0$ -> no edge.]

Graph properties translate to probabilistic (in)dependencies
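The correspondence between zeros in the inverse covariance matrix and missing edges can be illustrated with a small sketch. The 5-node chain and its numeric entries below are hypothetical and chosen only for illustration. The point is that the inverse covariance matrix is sparse while the covariance matrix itself is dense, so the graph structure is visible only in the former.

```python
import numpy as np

# Hypothetical chain 1-2-3-4-5: each load depends directly only on its neighbors.
n = 5
precision = 2.0 * np.eye(n)
for i in range(n - 1):
    precision[i, i + 1] = precision[i + 1, i] = -0.9

covariance = np.linalg.inv(precision)

def edges_from_precision(theta, tol=1e-9):
    """Edges (i, j), 1-based, wherever the off-diagonal entry of theta is nonzero."""
    m = theta.shape[0]
    return [(i + 1, j + 1) for i in range(m) for j in range(i + 1, m)
            if abs(theta[i, j]) > tol]

print(edges_from_precision(precision))     # only neighboring pairs appear
print(np.round(covariance, 3))             # every pair is marginally correlated
```

Nodes 1 and 5 are correlated marginally, yet there is no edge between them: once the intermediate loads are known, node 1 carries no further information about node 5.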

(11)

Inference Problem

Estimate the load for BS i given the load of a subset of BSes S as the conditional mean: $\hat{x}_i = E[X_i \mid X_S = x_S]$.

• Broad questions:
– How large should S be? Effort vs. accuracy tradeoff.
– How to choose S?

[Figure: graph on nodes 1 to 5. Measure only a subset and estimate the rest.]
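For a multivariate Gaussian with mean $\mu$ and covariance $\Sigma$, the conditional mean has the standard closed form $E[X_i \mid X_S = x_S] = \mu_i + \Sigma_{iS}\,\Sigma_{SS}^{-1}(x_S - \mu_S)$. The sketch below evaluates it directly. The three-BS mean vector and covariance matrix are made-up numbers used only to exercise the function.

```python
import numpy as np

def conditional_mean(mu, sigma, i, S, x_S):
    """E[X_i | X_S = x_S] for a Gaussian with mean vector mu and covariance sigma."""
    S = list(S)
    sigma_iS = sigma[i, S]                 # covariances between BS i and the measured set
    sigma_SS = sigma[np.ix_(S, S)]         # covariance among the measured BSes
    deviation = np.asarray(x_S, float) - mu[S]
    return mu[i] + sigma_iS @ np.linalg.solve(sigma_SS, deviation)

# Hypothetical three-BS model.
mu = np.array([10.0, 20.0, 30.0])
sigma = np.array([[4.0, 2.0, 0.0],
                  [2.0, 9.0, 3.0],
                  [0.0, 3.0, 16.0]])

# BS 1 (index 1) is measured at 23, i.e. 3 above its mean; estimate BS 0.
print(conditional_mean(mu, sigma, 0, [1], [23.0]))
# Measure BSes 1 and 2 and estimate BS 0.
print(conditional_mean(mu, sigma, 0, [1, 2], [23.0, 30.0]))
```

Using `np.linalg.solve` instead of forming an explicit inverse keeps the computation stable when the measured set S is large.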

(12)

First Solve the Learning Problem

• Learn the inverse covariance matrix from training data.
• How? Exploit relationship with linear regression modeling.
– Express load of BS i as a linear function of all other BS loads and then regress: $Y_i = \sum_{j \neq i} \beta_{ji} X_j$
– Regression coefficients $\beta_{ji}$ can be shown to be directly related to inv. cov. matrix elements.
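The link between regression and the inverse covariance matrix can be checked numerically. For a zero-mean Gaussian with inverse covariance matrix $\Theta = \Sigma^{-1}$, regressing variable i on all the others gives a coefficient on variable j equal to $-\Theta_{ij} / \Theta_{ii}$, which is a standard result. The sketch below compares least-squares coefficients from sampled data against that formula. The 4-BS matrix `theta`, the sample size, and the variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical inverse covariance matrix for 4 BSes (symmetric, positive definite).
theta = np.array([[2.0, -0.8, 0.0, 0.0],
                  [-0.8, 2.0, -0.5, 0.0],
                  [0.0, -0.5, 2.0, -0.7],
                  [0.0, 0.0, -0.7, 2.0]])
sigma = np.linalg.inv(theta)
X = rng.multivariate_normal(np.zeros(4), sigma, size=20000)

i = 1
others = [j for j in range(4) if j != i]

# Ordinary least squares of BS i's load on all other BS loads.
beta, *_ = np.linalg.lstsq(X[:, others], X[:, i], rcond=None)

# Coefficients implied by row i of the inverse covariance matrix.
predicted = -theta[i, others] / theta[i, i]

print(np.round(beta, 3))
print(predicted)
```

The zero entry of `theta` in row i shows up as a regression coefficient close to zero, which is why sparse regressions translate into a sparse graph.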
(13)

Sparse Models

• Sparse model -> many regression coeffs are zero.
• Reduces danger of overfitting (lowering variance). Also, computationally efficient.
• Introduce a regularization term in regression. We used "Lasso".

[Equation labels: Empirical error; Regularization term modeling penalty]
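In its usual form, the Lasso minimizes an empirical squared error plus an L1 penalty on the coefficients: $\frac{1}{2p}\lVert y - Xb \rVert_2^2 + \lambda \lVert b \rVert_1$. The slide's own equation did not survive extraction, so the scaling and symbols here are assumptions. The sketch below solves that objective by plain coordinate descent for a single BS. The synthetic data and function names are illustrative, and this is not necessarily the solver the authors used.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t, returning exactly 0 when |z| <= t."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/(2p)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    p, d = X.shape
    b = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0) / p
    for _ in range(n_iter):
        for j in range(d):
            partial = y - X @ b + X[:, j] * b[j]   # residual with feature j left out
            rho = X[:, j] @ partial / p
            b[j] = soft_threshold(rho, lam) / col_sq[j]
    return b

# Synthetic example: the target load depends on only 2 of 6 other BSes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 1.5 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=200)

b = lasso_cd(X, y, lam=0.1)
print(np.round(b, 2))
```

The soft-threshold step is what produces exact zeros, so irrelevant BSes drop out of the model entirely instead of receiving small nonzero weights.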

(14)

Regularization

• Cross-validate using additional training samples (not used for model creation).
• Use various values of $\lambda$ to create different models.
• Choose the one with max likelihood.
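Selecting the regularization strength by held-out likelihood can be sketched as below. To keep the example short, the candidate models come from a simple shrinkage estimator that scales down off-diagonal covariances. That estimator is a stand-in and is NOT the Lasso procedure from the slides; only the selection step is the point here. All names, the 3-BS ground truth, and the sample sizes are assumptions.

```python
import numpy as np

def avg_log_likelihood(theta, X):
    """Average zero-mean Gaussian log-likelihood of the rows of X under precision theta."""
    n = theta.shape[0]
    sign, logdet = np.linalg.slogdet(theta)
    if sign <= 0:
        return -np.inf                      # not a valid precision matrix
    quad = np.einsum("ki,ij,kj->k", X, theta, X).mean()
    return 0.5 * (logdet - quad - n * np.log(2 * np.pi))

def shrunk_precision(X, lam):
    """Stand-in estimator (not the Lasso): shrink off-diagonal covariances by (1 - lam)."""
    S = X.T @ X / len(X)
    S_shrunk = (1 - lam) * S + lam * np.diag(np.diag(S))
    return np.linalg.inv(S_shrunk)

def pick_best(candidates, X_val):
    """Key of the candidate model with the highest validation likelihood."""
    return max(candidates, key=lambda k: avg_log_likelihood(candidates[k], X_val))

rng = np.random.default_rng(2)
theta_true = np.array([[2.0, -0.8, 0.0],
                       [-0.8, 2.0, -0.8],
                       [0.0, -0.8, 2.0]])
sigma = np.linalg.inv(theta_true)
X_train = rng.multivariate_normal(np.zeros(3), sigma, size=20)
X_val = rng.multivariate_normal(np.zeros(3), sigma, size=5000)   # not used for fitting

candidates = {lam: shrunk_precision(X_train, lam) for lam in (0.0, 0.2, 0.5, 1.0)}
best = pick_best(candidates, X_val)
print("selected lambda:", best)
```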

(15)

Data Processing

• Hourly load of 400 BSes covering a 75 x 84 mile area. Includes a busy downtown and surrounding suburbs.
• No temporal dimension in model. Create different models for different parts of the day (every 4 hours).
• Account for diurnal variation of load. Use residuals

(16)

Average Edge Length in the Model Graph

Apparent spatial/regional significance. 

(17)

Choosing the Measured Set S

• Greedy strategy – each iteration picks the BS that minimizes the error estimate.
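The greedy strategy can be sketched as follows. The slide does not say which error estimate is used, so this example assumes one: the total conditional variance of the unmeasured BSes given the measured set, computed from the Gaussian covariance. The 5-node chain model and the function names are also hypothetical.

```python
import numpy as np

def total_conditional_variance(sigma, S):
    """Assumed error estimate: sum of Var[X_i | X_S] over all unmeasured BSes i."""
    n = sigma.shape[0]
    S = list(S)
    rest = [i for i in range(n) if i not in S]
    if not S:
        return float(np.trace(sigma))
    if not rest:
        return 0.0
    cond = sigma[np.ix_(rest, rest)] - sigma[np.ix_(rest, S)] @ np.linalg.solve(
        sigma[np.ix_(S, S)], sigma[np.ix_(S, rest)])
    return float(np.trace(cond))

def greedy_select(sigma, k):
    """Grow S one BS at a time, adding the BS that minimizes the error estimate."""
    S = []
    for _ in range(k):
        candidates = [j for j in range(sigma.shape[0]) if j not in S]
        best = min(candidates, key=lambda j: total_conditional_variance(sigma, S + [j]))
        S.append(best)
    return S

# Hypothetical 5-BS chain model (0-based indices 0..4).
theta = 2.0 * np.eye(5)
for i in range(4):
    theta[i, i + 1] = theta[i + 1, i] = -0.9
sigma = np.linalg.inv(theta)

S = greedy_select(sigma, 2)
print("measured set:", S)
print("error with 1 BS:", total_conditional_variance(sigma, S[:1]))
print("error with 2 BSes:", total_conditional_variance(sigma, S))
```

In this chain the first pick is the middle BS, because measuring it splits the remaining nodes into two short, independent segments.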

(18)

Impact of Estimation Accuracy on Applications

• We understand the measurement complexity (size of S) vs. error tradeoff.
• But how much accuracy do we need? Need to turn to applications.
• Studied two applications
– Energy Management

(19)

Opportunistic Traffic Scheduling

• Similar to Smart Electric Grid – move non-urgent traffic from peak to off-peak periods.
– What is non-urgent? p2p, large downloads, sync, push, etc.
– Who decides? User agent on mobile. May have multiple levels of priority or have deadlines to aid scheduling.
– Carriers can incentivize such scheduling.
• Similar to QoS scheduling – but at a higher layer and at a longer time scale.
• Two components in System Architecture
– Server (Scheduler) in core network.

(20)

Time Line

[Figure: timeline with ticks at 2PM, 2:30PM, 3PM, 3:30PM. Labels: "Creates low-priority flow", "Deadline=2hr", "Server (scheduler) in the"]

(21)

Solving the Scheduling Problem

• Several approaches possible based on how flows are prioritized.
• But for any approach, server needs to be able to estimate current/future loads at all BSes.
– Also, needs to model/estimate subscriber mobility (separate problem).
• Poor estimation leads to poor scheduler
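One possible scheduling approach is sketched below for a single BS. It is an illustrative assumption, not the scheduler evaluated in the talk: flows are handled earliest-deadline-first, and each is placed in the least-loaded epoch before its deadline, using the server's estimated high-priority load per epoch. Subscriber mobility is ignored. The tuple layout, the function name, and all numbers are hypothetical.

```python
def schedule_low_priority(flows, est_load, capacity):
    """Assign each deferrable flow to the least-loaded epoch before its deadline.

    flows:     list of (flow_id, arrival_epoch, deadline_epoch, demand)
    est_load:  estimated high-priority load per epoch for one BS
    capacity:  per-epoch capacity of the BS
    Returns (assignment: flow_id -> epoch or None, resulting per-epoch load).
    """
    load = list(est_load)                  # work on a copy of the estimates
    assignment = {}
    # Earliest-deadline-first ordering: one of several possible prioritizations.
    for flow_id, arrival, deadline, demand in sorted(flows, key=lambda f: f[2]):
        window = range(arrival, min(deadline, len(load) - 1) + 1)
        feasible = [t for t in window if load[t] + demand <= capacity]
        if feasible:
            t = min(feasible, key=lambda e: load[e])
            load[t] += demand
            assignment[flow_id] = t
        else:
            assignment[flow_id] = None     # cannot be served before its deadline
    return assignment, load

# Hourly epochs with estimated loads, in arbitrary units, and capacity 100.
est_load = [90, 95, 60, 40]
flows = [("a", 0, 2, 30), ("b", 0, 3, 50), ("c", 1, 1, 10)]

assignment, final_load = schedule_low_priority(flows, est_load, capacity=100)
print(assignment)
print(final_load)
```

Because every placement decision reads `est_load`, errors in the load estimates translate directly into flows being deferred to the wrong epoch or missing their deadlines, which is the dependence the slide points out.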

(22)

Evaluation Approach

• Trace-driven simulator based on a capacity model of BSes.
• Opportunistic scheduling is meant to admit more traffic but with the same network capacity.
• We use the same traffic trace always, but reduce network capacity to demonstrate impact.
• Impact?
– Do low priority flows still finish within a reasonable time?

(23)

Results

• Low priority flows = random subset of long-lived flows (over 25 mins), about 8% of all flows. Randomly chosen deadlines 1 - 4 hours.
• Rest high priority.
• Scheduling epoch hourly.
• Only a subset of 400 BSes are measured, rest estimated.

(24)

Conclusions

• Discovering structures in mobile traffic is a rich area of study.
• Applications in network and resource

(25)

[Diagram: Traffic Analysis, Modeling and Forecasting, Traffic Management]

Questions?
