ESS event: Big Data in Official Statistics

Academic year: 2021

LEARNING AND DEVELOPMENT: CAPACITY BUILDING AND TRAINING FOR ESS HUMAN RESOURCES

FACILITATOR: JOSÉ CERVERA-FERRI

Session 2

Related Scheveningen challenges

[SCH5] Short-term Human Resources needs: recruitment, professional training, secondment/re-deployment

[SCH5] Long-term needs: academic curricula for Data Scientists

[SCH6] Collaboration with academia for

Session 2: Topics for discussion

• Skills for Big Data

• Opportunities for building skills

• Proposal for a key input to the roadmap to be established by the ESS Task Force

Session 2: Organization

Short-term

Long-term

Skills for Big Data

Session 2A

Opportunities for acquiring skills

Session 2A

Session 2B

Proposal for a roadmap to acquire skills for Big Data in the ESS

SKILLS FOR BIG DATA

OPPORTUNITIES FOR ACQUIRING SKILLS

Session 2A

Preliminary considerations (1):

Can NSIs rely on existing skills?

• “Non-traditional set of skills to develop”

• Trained statisticians and IT staff in statistics are already close to the “data science” skills required for Big Data (data cleaning, cubes, analytical software, data mining, etc.). Staff are well trained in methodology and statistical domains (UNECE Sprint paper, SWOT analysis – strength).

• The Official Statistics Community (OSC) has less knowledge of Big Data than many important players like Google. It has limited skills and limited IT resources when it comes to the new, non-traditional technologies used to gather, process and analyse Big Data (UNECE Sprint paper, SWOT analysis – weakness).

Session 2A

Preliminary considerations (1):

Can NSIs rely on existing skills? (cont.)

• Young staff coming in from universities may be very innovative, may already have a personal relationship with Big Data (Facebook, Google, Twitter trends) and may be less constrained by traditional IT and analysis (UNECE Sprint paper, SWOT analysis – opportunity).

• Failure to permit innovative methods might render OSC organizations less attractive workplaces for top talent (UNECE Sprint paper, SWOT analysis – threats).

• Cultural change:

– “a culture that values high quality and accurate information and regards the best way to achieve this through use of methods where the design can be controlled. Big Data doesn't allow this luxury”

– Innovative thinking, risk-taking (is it the realm of Civil Servants?)

Session 2A

Preliminary considerations (2):

Learning methods

• Learning by doing in OS

• Training individuals, or teams?

• The business analyst and project manager

• The mathematician who builds algorithms

• The data architect

• The statistician (data collection, editing, processing)

• The communicator (visualization)

• Data analyst

• Data scientist

• Data engineer

• Data integrator

• System manager

Session 2A

Preliminary considerations (3):

Competition

• Competition with industry: better salaries in the private sector for Data Scientists?

Session 2A

Skills for Big Data

Data Scientist vs. Statistician

Data Scientist as the “connective tissue” between processing technologies and data-driven decision making

Necessary skills: math/statistics, IT, visualization, subject matter specialization

Math/stat: data mining techniques

IT: Hadoop, MongoDB, NoSQL, …

Session 2A:

IT Skills for Big Data

• R-SAS-SPSS

• Business Intelligence, Visual Analytics, Excel

• MapReduce

• Pig, Java

• SQL

• ETL (Extract, transform, load)

• Linux…
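
To make the MapReduce item above more concrete, here is a minimal single-machine sketch of the map, shuffle and reduce steps, using a word count as the example. It is an illustration only, not how Hadoop is actually invoked: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample records are assumptions introduced here and do not come from the slides.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit one (word, 1) pair per word in a text record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by key.

    In a real cluster the framework does this between map and reduce;
    here it is a plain in-memory dictionary (an assumption of this sketch).
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(records):
    """Run the three steps in sequence over an iterable of text records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample records, for illustration only.
    sample = ["big data in official statistics", "Big Data skills"]
    print(word_count(sample))
```

The point of the pattern is that the map and reduce functions only ever see one record or one key at a time, which is what lets a framework distribute them across many machines.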

Session 2A

Statistical Skills for Big Data

• Computational statistics

• Analytical methods: correlations & causality, modelling, network analysis, information reduction

• Dissemination: data visualization

Session 2A

Opportunities in the ESS

ESS Learning and Development Framework

ESTP 2014 course

– Big Data: Effective Processing and Analysis of Very Large and Unstructured Data for Official Statistics

• Contents: classification of various massive data sets, ETL (extract, transform, load), specific challenges, privacy and statistical disclosure issues, computing base, overview of statistical methods. Focus on concrete examples.

• Course requirements:

– Database fundamentals and data manipulation languages

– Data collection and integration tools

– Data mining techniques for large data sets

– Object-oriented design and programming

– Probability and random variables

• Is there anyone with such a complete background in Official Statistics???

European Masters in Official Statistics (EMOS): ESS certification of programmes offered by Universities

– EMOS workshop 2014 (Helsinki, June 2014)

OPPORTUNITIES FOR ACQUIRING SKILLS (CONT.)

KEY INPUT TO THE ROADMAP TO BE ESTABLISHED BY THE ESS TASK FORCE

Session 2B

Opportunities outside the ESS

Grasping the opportunities outside:

Diversity of academic programmes on Big Data, Business Analytics, Data Science… (certification?)

Training offer from private companies (certification?)

Session 2B

[SCH6] Collaboration with Academia

• Academic collaborators: use of existing expertise in statistical analysis of large sets of data: astronomy, remote sensing, genetics, image processing…

• Source of training: need for mapping academic programmes on Big Data

• How can academics be integrated with NSI staff?

• How can training be financed? National or ESS level?

Session 2B

Horizon 2020

• Marie Skłodowska-Curie actions: support for innovative training networks, mobility of researchers, inter-sectoral cooperation

• ICT 15-2014: Big data and Open Data Innovation and take-up:

– Objective: To contribute to capacity-building by designing and coordinating a network of European skills centres for big data analytics technologies and business development. The network is expected to identify knowledge/skills gaps in the European industrial landscape and produce effective learning curricula and documentation to train large numbers of European data analysts and business developers, capable of (co)operating across national borders on the basis of a common vision and methodology

– Expected impact: Availability of deployable educational material for data scientists and data workers and thousands of European data professionals trained in state-of-the-art data analytics technologies and capable of (co)operating in cross-border, cross-lingual and cross-sector European data supply chains.

• Call on “Training and educating Data Scientists”

• More detailed linkages in Horizon 2020?

Session 2B

Input to the Roadmap: The actions

• Ideas for actions (which term?):

– Identify existing skills in the ESS

– Recruit Data Scientists with the missing skills

– Establish a network of providers of Big Data skills within the ESS

– Map the offer of Data Science training programmes in the private sector and their applicability to OS

– Establish a repository of assessed training materials

– Establish agreements with private sector and academia as providers of training, …

• Who?

– NSIs, Eurostat, International organizations, private sector, Academia?

– Working Groups? Gexp (EMOS), HLG, ESTP, ???

• Which source of financing?

Session 2B

Input to the Roadmap: The actors

• Ideas for actors:

– NSIs

– Eurostat

– International organizations

– Universities

Session 2B

Input to the Roadmap for Big Data training

• Brainstorming of ideas for building skills

• Assessment: sort by impact and ease of implementation

• Discussion of term, actors and level (national/EU/global)

• Proposal of responsibilities and time frame for the “Input
