Big Data Analytics Center (BDAC)

School of Natural Sciences

Postgraduate Diploma in Data & Business Analytics

Master of Science Degree in Data Analytics

Course Brochure

2014

Table of Contents

Overview
Program Objectives
Program Coordinators
Course Design
Course Details
Career Prospects
Minimum Eligibility Criteria for Applicants
Admission Process
Fees and Scholarship
Frequently Asked Questions

Overview

The Big Data Analytics Center (BDAC) is an interdisciplinary center set up in 2014 under the aegis of the School of Natural Sciences (SoNS), Shiv Nadar University (SNU). ‘Big Data’ refers to collections of data sets whose scale, diversity and complexity require new architectures, techniques, algorithms and analyses to manage them and to extract value and hidden knowledge from them. The use of information has become central to the survival and development of the human race. Today we experience a true deluge of data that records and shapes our lives, ranging from large global issues such as climate change to the smallest local problems such as controlling a thermostat. The critical screening and processing of Big Data has become a worldwide effort, requiring academic attention from diverse disciplines. The challenge is to develop theoretical and innovative scientific and technological solutions that cater to the needs of industry, society and the environment.

Given the wide gap between the demand for and the supply of scientists, technologists and key experts in the domain of Data Analytics today, BDAC has initiated graduate programs (a postgraduate diploma and a Master’s degree) to prepare interested young minds for the academic analysis of Big Data and its applications in society today, from business concerns to social practices and cultural change.

The integrated model of BDAC provides a unique opportunity to young aspirants from different academic disciplines and executives from the corporate world to explore the possibilities of next generation solutions in the emerging discipline of Big Data Analytics.

[Figure: Integrated Model of Big Data Analytics Center @ SNU]

In view of ever more rapid technological developments in the digitalized world, with new solutions becoming obsolete every few years, the emphasis in our programs is on honing the ability to recognize, define and solve fundamental problems analytically, so that graduates can take a leading role in shaping the future society.

Program Objectives

The Big Data Analytics programs combine theory and practice, with the following objectives:

• Develop in-depth knowledge and understanding of the big data analytics domain.

• Analyze and solve problems, conceptually and practically, from diverse industries such as manufacturing, retail, software, banking/finance and pharmaceuticals.

• Undertake consulting projects with a significant data analysis component for a better understanding of theoretical concepts from statistics, economics and related disciplines.

• Undertake industrial research projects to develop future solutions in the domain of data analytics and make an impact on technological advancement.

• Use advanced analytical tools, decision-making tools and operations research techniques to analyze complex problems, and prepare to develop new techniques of this kind in the future.

• Allow a flexible option, especially for job holders, to complete a P.G. Diploma in Business & Data Analytics (1-year program) and continue to a Master of Science Degree in Data Analytics (2 years in total).

Program Coordinators

Mr. Narender Dureja, HCL, Noida, and Dr. Santosh Singh, Department of Mathematics & Head, BDAC, School of Natural Sciences, SNU.

Course Design

Postgraduate Diploma in Data and Business Analytics (One Year)

The program consists of ten courses and a project. The courses are spread over two semesters: eight are core courses and two are electives. A wide range of electives will be offered to best suit the choice and interest of each student. Training in and use of SPSS Statistics and SAS Enterprise Miner will be part of the course content and structure. The customer-based industrial project will broaden and deepen the understanding of theoretical concepts by applying a practical approach to a complex real-life problem. In certain courses, case-based teaching will be adopted.

Master of Science Degree in Data Analytics (Two Years)

The program consists of ten courses in the first and second semesters. Of these ten courses, eight are core courses and two are electives. The electives will cover a wide range of topics in big data analytics to suit the research interest of the student. Training in and analytical use of SPSS Statistics and SAS Enterprise Miner will enhance and update the student’s knowledge in the relevant domains. The summer internship lasts two months and is based on the available choices and the interest of the student. The third and fourth semesters will involve the Master’s thesis. Collaborative industrial research projects will provide a suitable base for the Master’s thesis and support the objective of quality research work.

Course Details

Core Courses

1. Data Collection and Management – Principles, Tools and Platforms / (Database Management Systems): Database concepts, basic components of DBMS, sources of data, logging, cleaning data, data representation, data models (hierarchical, network, XML) and stores, NoSQL databases, design for performance / quality parameters, documents and information retrieval, related tools (Postgres, OLTP, OLAP, Hadoop, MapReduce).

2. Data Visualization / (Visualization and Reporting): Purpose of visualization, multidimensional visualization, tree visualization, graph visualization and time series data visualization techniques, visual perception, cognitive issues, evaluation as well as other theory and design principles behind information visualization, understanding analytics output and their usage, basic interaction techniques such as selection and distortion, evaluation, examples of information visualization applications and systems, user tasks and analysis.

3. Mathematics for Data Analytics: Basic probability theory, distributions and their properties, simple and multiple regression analysis, hypothesis testing and sampling, estimation theory, least squares methods, SVD, transformations, stochastic models, compression techniques, Markov models, Markov decision processes and their application in sequential decision making, Poisson and cumulative Poisson processes and their generalization, applications in different business domains, ARMA and ARIMA, Monte Carlo simulations, application of data analytics in different domains.

4. Business Statistics: Descriptive statistics (univariate and bivariate), residual analysis, confidence and prediction intervals, regression, associations, sequencing, introduction to forecasting, design of experiments and basic statistical analysis of data from experiments (both field and laboratory) to investigate business issues, tools for conducting basic statistics (for example, SPSS and SAS), conducting analytics on laboratory and/or field data using tools (for example, SAS, JMP, KNIME).

5. Systems / Business Analysis: Introduction to information system components, types of information systems, roles of the business analyst, evolution and definition, industry needs and applications, processes and methodologies, tools and technologies, roles and responsibilities, impact of digital marketing and unstructured data. Systems planning: objectives, preliminary investigation, other fact-finding techniques, recording facts. Analyzing requirements: data flow diagrams, data dictionary, process description, evaluating alternatives. Data analytics life cycle: discovery, data preparation, model planning, model building, communicating results and operationalization. Implementation: quality assurance, documentation, management approval, installation / implementation, acceptance.

6. Data Mining: Clustering, association rules, factor analysis, scale development, survival analysis, data reduction using PCA, scoring new data and model implementation, improving predictive models, association and market basket analysis, advanced regression models: concepts and applications, conjoint and discrete choice analysis, design and analysis of experiments.

7. Operations Research: Introduction to optimization, gradient descent method, convex optimization, linear programming and its generalizations (goal programming and multi-criteria decision analysis), integer programming, dynamic programming, assignment problem, transportation problem and their applications.

8. Big Data Technologies: Big data definition, enterprise / structured data, social / unstructured data, unstructured data needs for analytics, big data programming (Hadoop / HDFS, MapReduce, event stream processing, complex event processing), evolution, purpose and use, application data stores (NoSQL databases, in-memory databases), data computing appliance (DCA) and OLAP, massively parallel processing, in-memory computing / analytics, data science, enterprise / external search. HDFS: overview and concepts, data flow (read and write), interfaces to HDFS (HTTP, CLI and Java API), high availability and NameNode federation. MapReduce: developing and deploying programs, optimization techniques, MapReduce anatomy, data flow framework programming, MapReduce best practices and debugging.

Electives

9. Understanding Enterprise Processes and Analytics: Overview of domain, understanding of business pain points, understanding different types of analytics applications: financial services – claims, renewal, sales force, collections, fraud, compliance, risk, pricing, customer loyalty, pricing and promotion effectiveness etc.; healthcare – evidence based medicine, comparative effectiveness research, clinical analytics, fraud/waste/abuse management etc.; telecom – network optimization, subscriber profiling, churn management, collection management etc.; manufacturing – demand forecasting and SKU rationalization, plant analytics, route and distribution optimization, vendor performance etc. Overview of analytics view chain – data source, ETL, data integration, data migration, MDM, modeling, reporting and visualization etc., process of scoping analytics project / use case, steps in hypothesis creation, establish critical success factors, identify reports and deliverables, data privacy and security.

10. Machine Learning and Knowledge Discovery: Supervised learning, decision trees, linear discriminant functions (SVM), neural networks, deep belief networks, density estimation methods, Bayes’ decision theory, expectation-maximization, ensemble methods, feature engineering, association rule mining, clustering techniques. Practical: evaluation of ML techniques (cross-validation, ROC, precision, recall, F-value), introduction to the use of ML and KD tools such as Weka, Octave, SciLab or equivalent libraries/tools.

11. Time Series and Forecasting: A survey of the theory and application of time series methods in different domains with special emphasis on econometrics. Univariate stationary and non-stationary models, vector auto-regressions, frequency domain methods, models for estimation and inference in persistent time series, and structural breaks, different methods of estimation and inference for modern dynamic stochastic general equilibrium (DSGE) models: simulated method of moments, maximum likelihood and Bayesian approach. The empirical applications will be drawn primarily from macroeconomics and different domains.

12. Evolutionary Programming: Introduction to evolutionary and heuristic techniques; principles and historical perspectives; application potential in optimization, dimensionality reduction, data mining and analytics; genetic algorithms, evolutionary strategies, evolutionary programming; introduction to representations: binary strings, real-valued vectors; various selection strategies; introduction to search operators: crossover and mutation; ant colony optimization: pheromone-mediated search, exploration and exploitation strategies; particle swarm optimization: basic PSO strategies and variants, different neighborhood topologies; biogeography-based optimization: immigration and emigration strategies; Monte Carlo methods: simulated annealing and advanced annealing strategies; differential evolution, group search optimization, glowworm optimization, firefly and other novel heuristic algorithms; applications of evolutionary and heuristic techniques in large-scale optimization, combinatorial and function optimization; multi-objective optimization: Pareto front and non-dominated solutions, NSGA and related solution strategies; applications to large-scale clustering, classification, rule mining and data-driven modeling; variable selection, informative data reduction and parameter optimization in predictive data analytics with evolutionary and heuristic techniques; evolutionary computing in discovering the structure and modularity of large-scale networks.

13. Multi-core Programming: Fundamental aspects of shared-memory and accelerator-based parallel programming, such as shared memory parallel architecture concepts, programming models, performance models, parallel algorithmic paradigms, parallelization techniques and strategies, scheduling algorithms, optimization, composition of parallel programs, and concepts of modern parallel programming languages and systems. Practical exercises help to apply the theoretical concepts of the course to solve concrete problems in a real-world multicore system.

14. Game Theory: Rigorous investigation of the evolutionary and epistemic foundations of solution concepts, such as rationalizability and Nash equilibrium. Classical topics on repeated games, bargaining and super-modular games, heterogeneous priors, psychological games, and games without expected utility maximization. Applications and case studies from different domains.

15. Text Analytics: Introduction to text mining, text representation and turning text into features, exploratory analysis: frequency and co-occurrence, clustering, categorization, bag of features, predictive analysis for categorization, predictive analysis for sentiment analysis, analyzing text extracted from the web, such as social media and tweets, developing prototypes for identifying the entities mentioned in text, the relations between them, and the opinions expressed about these entities.

Career Prospects

The P.G. Diploma and M.S. Degree programs will educate aspirants who want to make an impact in the corporate and academic worlds in the domain of data analytics as data scientists and researchers, big data leads/administrators/managers, business analysts and data visualization specialists. The courses are also suitable for those already working in analytics who wish to enhance their theoretical and conceptual knowledge, as well as for those with an analytical aptitude who would like to start a career in big data analytics in different business sectors. Collaboration with multinational companies on mutual research interests and customer-related projects will ease the path to campus recruitment.

Minimum Eligibility Criteria for Applicants

Applicants must hold either a four-year undergraduate degree in engineering/mathematics/statistics/physics/economics/commerce, or a Master’s degree in mathematics/physics/statistics/economics/commerce, or a three-year undergraduate degree in mathematics/statistics/physics/economics/commerce plus two or more years of relevant industrial experience. Applicants for executive education are expected to have at least five years of work experience. The minimum eligibility criteria can be waived for exceptionally qualified profiles.

Admission Process

Each candidate will be evaluated holistically to assess his/her potential for becoming a good data scientist, data analyst or data manager. A written test will be conducted on the announced date in the NCR. The written test will have subjective as well as multiple-choice questions. Candidates shortlisted in the written test will be called for a technical interview. Candidates will be evaluated on the basis of their in-depth scientific knowledge, analytical skills and computational knowledge.

Fees and Scholarship

The total fee (Admission fee + Tuition fee + contribution to the Student Activity Fund) for the two-year M.S. program is INR 772,000, and that for the one-year PG Diploma is INR 396,000. The break-up is shown in the table below. The amount takes care of academic expenses such as basic program material, royalty for copyrighted material, library and network charges. The students will have an option to live on campus. This fee does not cover accommodation, mess charges, laundry charges and other living expenses.

Fee Structure

Fee component                         | PG Diploma / M.S. First Year (INR) | M.S. Second Year (INR) | Two-Year Total for M.S. (INR)
Admission Fee                         | 20,000                             | --                     | 20,000
Tuition Fee                           | 375,000                            | 375,000                | 750,000
Contribution to Student Activity Fund | 1,000                              | 1,000                  | 2,000
Total Fees                            | 396,000                            | 376,000                | 772,000
Refundable Security Deposit           | 25,000                             | --                     | 25,000
Total Payable                         | 421,000                            | 376,000                | 797,000

Note:

1. For those students who opt for campus accommodation, hostel fee will be charged according to the SNU rules.

2. Qualified M.S. and PG Diploma students may be awarded tuition fee waivers (partial or full) based on merit.

3. Additionally, qualified students enrolled in the M.S. program in Data Analytics will be eligible for teaching assistantship of INR 12,000 per month.
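
The totals in the fee structure can be re-derived from the individual components. The following is a minimal Python sketch and is not part of the original brochure: the names `FEES`, `SECURITY_DEPOSIT`, `total_fees` and `total_payable` are illustrative assumptions, while the INR amounts are copied from the fee structure above. Tuition waivers, hostel fees and the teaching assistantship are deliberately left out because their amounts depend on individual cases.

```python
# Fee components in INR, copied from the fee structure in the brochure.
# The dictionary layout and all names here are illustrative assumptions.
FEES = {
    "first_year": {"admission": 20_000, "tuition": 375_000, "activity_fund": 1_000},
    "second_year": {"admission": 0, "tuition": 375_000, "activity_fund": 1_000},
}

# Refundable security deposit, charged once in the first year.
SECURITY_DEPOSIT = {"first_year": 25_000, "second_year": 0}


def total_fees(year: str) -> int:
    """Sum of admission fee, tuition fee and activity fund contribution."""
    return sum(FEES[year].values())


def total_payable(year: str) -> int:
    """Total fees plus the refundable security deposit for that year."""
    return total_fees(year) + SECURITY_DEPOSIT[year]


if __name__ == "__main__":
    for year in FEES:
        print(year, total_fees(year), total_payable(year))
    # The one-year PG Diploma corresponds to the first-year column;
    # the two-year M.S. is the sum of both columns.
    print("M.S. total fees:", sum(total_fees(y) for y in FEES))
    print("M.S. total payable:", sum(total_payable(y) for y in FEES))
```

The first-year figures apply to the one-year PG Diploma as well, since the brochure lists both in the same column.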

Frequently Asked Questions

1. Why should I choose to take a program in data analytics?

Recent years have seen an exponential growth of digital data. In the next decade, the total volume of digital data is expected to cross 35 zettabytes (1 zettabyte = 10^21 bytes). There will be a huge requirement for qualified data scientists, data analysts and data managers to manage the data and provide real-time solutions to technological, social and business challenges. Many surveys have shown that more than 4.4 million jobs in the area of Big Data Analytics will be created in the coming years. There is a huge opportunity for young graduates and budding professionals to prepare themselves to ‘make a difference’ in the domain of Big Data by handling the technological and social challenges from different perspectives.

SNU, an emerging premier university in India, has designed two different programs to bridge the gap between demand and supply in the domain of business and data analytics. These programs educate and train students in the latest software tools and give them an in-depth technological understanding of the critical processing of data, enabling them to take up various positions in the domain of business and data analytics.

2. How do I decide between the M.S. in Data Analytics and the PG Diploma in Data and Business Analytics?

Both programs have their own unique value and have been designed to cover the wide range of interested candidates. For students and working professionals who want to understand the technological concepts and fundamentals behind Big Data and be trained in the latest software tools and techniques, the one-year PG Diploma in Business and Data Analytics is the most suitable program. For students who would like to explore and develop new algorithms and techniques, the two-year Master of Science in Data Analytics is best suited. The M.S. program also opens up a path to pursuing a Ph.D.

3. Are the programs distance education, online or full-time programs?

Both the PG Diploma and the M.S. are full-time programs. Apart from classroom teaching, the curriculum includes a significant portion of practical work, group projects and assignments.

4. Will the students get to work on live projects with companies?

The project is an integral part of the curriculum and reinforces the classroom learning. Students will work in groups on industry-assigned projects.

5. What kind of companies will come to campus for placement and what will be the profile?

As we are going to start the first session in August 2014, the Career Development Center at SNU is dedicated to engaging and negotiating with different companies in the relevant domain for both summer internships and final placement of the students.

6. Are both the programs approved?

The Shiv Nadar University has been set up through an Act of the State of Uttar Pradesh and is also recognized by the UGC. Being a program of the Shiv Nadar University it is not necessary for the PG Diploma and M.S. program offered by the Big Data Analytics Center in the School of Natural Sciences to be approved by any other organization/ body. We follow all UGC guidelines.

7. Who will teach the courses?

The Big Data Analytics Center (BDAC) at SNU is a research center. In this center, we have a good pool of renowned faculty and researchers from within SNU, and also IIT professors and industrial experts as adjunct/guest/visiting faculty who will teach the core and elective courses.

8. Whom should I contact for further queries?

Dr. Santosh Singh, Head, Big Data Analytics Center, School of Natural Sciences, SNU, Shiv Nadar University P.O., Gautam Buddha Nagar, UP 201 314, India.

Email: [email protected]
