Learning Outcomes and Competency Standards
6. Develop Learning Tools: Development of NOS provided in-depth information of all tasks performed by an individual in that occupation and guided the development
5.3 Data Scientist (Junior) – Data Science & Analytics
Occupational Standard
(for use in the development of Business Technology Management related job descriptions, performance evaluations, career development plans, educational learning outcomes etc.)
Description of Position Data Scientists are responsible for modeling complex problems, discovering insights and identifying opportunities through the use of statistical, algorithmic, mining and visualization techniques. In addition to advanced analytic skills, this role is also proficient at integrating and preparing large, varied datasets, architecting specialized database and computing environments, and communicating results.
In most organizations, Data Scientists work closely with clients, data stewards, project/program managers, and other IT teams to turn data into critical information and knowledge that can be used to make sound organizational decisions. Other responsibilities include providing data that is congruent and reliable. They need to be creative thinkers and propose innovative ways to look at problems by using data mining (the process of discovering new patterns from large datasets) approaches on the set of information available. They will need to validate their findings using an experimental and iterative approach. Also, Data Scientists will need to be able to present back their findings to the business or organization by exposing their assumptions and validation work in a way that can be easily understood by their business
counterparts. These professionals will need a combination of business focus, strong analytical and problem-solving skills, and programming knowledge to be able to quickly cycle hypotheses through the discovery phase of the project. Excellent writing and communication skills are required to report back the findings in a clear, structured manner.
Position Development Advancement to manager level positions is possible through progressively responsible leadership positions and management experience. The career path will be determined by the size, type, geographic scope, culture, and organizational structure of the firm offering employment.
Required Qualifications
Education Post-secondary education is preferred, usually a Bachelor’s degree in a business, computing or engineering field. Follow-up technical education may also be required depending on the technologies in use at the various organizations. Moreover, many organizations require senior Data Scientists to complete post-secondary school in any of the following areas: mathematics, statistics, economics, computer science, commerce, or
skills, knowledge, work-related experience, and/or industry courses and programs. Some organizations will send individuals to specific enterprise solutions training courses and programs to learn additional tools and techniques.
Related Work Experience Individuals may have experience in any of the methodologies and techniques used as a junior data scientist. Often this experience may be augmented by specific industry experience using industry or use case specific tools (e.g., R, SAS, Python). Data
Scientists (junior) may also require several years of experience in data analysis, modelling, business requirement specification, qualification and assurance, systems analysis, data
administration, software engineering, as well as project management and supervisory experience. Typically, data scientists require experience manipulating large datasets and using databases, as well as experience with a general-purpose programming language (such as Java) or with Hadoop MapReduce or other big data frameworks. Data scientists also typically have experience using statistical packages and have familiarity with basic principles of distributed computing and/or distributed databases.
Tasks Designs experiments, tests hypotheses, and builds models.
Conducts data analysis and designs algorithms
Applies basic statistical and predictive modeling techniques to build, maintain, and improve on multiple real-time decision systems
Leads discovery processes with key stakeholders to identify business requirements and expected outcomes.
Works with and alongside more senior data scientists and statisticians to build robust models
Models and frames business scenarios that are meaningful and which impact on critical business processes and/or decisions.
Identifies what data is available and relevant, including internal and external data sources, leveraging new data collection processes such as smart meters and geo-location information or social media.
Collaborates with subject matter experts to select the relevant sources of information for new, tough problems
Makes strategic recommendations on data collection, integration and retention requirements incorporating business requirements and knowledge of best practices.
Validates analysis using scenario modeling
Defines the validity of the information, how long the information is meaningful, and what other information it is related to.
Works with internal data stewards to ensure that the information used is in compliance with regulatory and security policies.
Qualifies where information can be stored or what information, external to the organization, may be used in support of the use case.
Develops usage and access control policies and systems in collaboration with the data steward.
Partners with the data stewards in continuous improvement processes impacting data quality in the context of the specific use case.
Recommends on-going improvements to methods and algorithms that lead to findings, including new information
Presents and depicts the rationale of their findings in easy to understand terms for relevant stakeholders
Educates their organization both from IT and the business perspectives on new approaches, such as testing hypotheses and statistical validation of results.
Helps the organization understand the principles and the math behind the process to drive organizational buy-in.
Provides business metrics for the overall project to show improvements (contribution to the improvement should be monitored initially and over multiple iterations).
Demonstrates clarity, accuracy, precision, relevance, depth, breadth, logic, significance, and fairness
Leads the design and deployment of enhancements and fixes to systems as needed.
Tools and Technology Statistical analysis software
Data analytics or intelligence programs
Office productivity tools
Software development tools and DevOps tools, including language-specific IDEs, Git, etc.
Required Competencies Knowledge
Data Scientists should have knowledge of:
Large complex data analytics or intelligence programs
Data, statistics, and big data concepts that relate to data analysis
Various architectures including distributed architectures
Software development methodologies relating to analysis
Architectural understanding of the data and big data ecosystem
Best practices in data delivery and measurement for the individual organizations that they work for or with
Policies and principles for the management of information
Relevant information standards and their appropriate use
Basic technologies and workflow for the purposes of analysis, design, development and implementation of information systems and applications.
Organizational or industry specific terminology and commonly used abbreviations and acronyms
Commonly used formats, structures and methods for recording and communicating data, as well as knowledge for how this data is incorporated for system and application use.
Architectural relationships between key information technology components and best practices in enterprise architecture frameworks/perspectives.
Appropriate informatics standards and enterprise models to enable system interoperability (e.g., terminology, data structure, system to system communication, privacy, security, safety).
Key information technology concepts and components (e.g., networks, storage devices, operating systems, information retrieval, data warehousing, applications, firewalls, etc.).
The ability to identify relevant sources of data needed to assess the quality of information & draw appropriate conclusions
Statistical & analytical tools, techniques and concepts
The ability to present data and information in a way that is effective for users and consumers of the data
Knowledge of the indicators and metrics important for the specific business that they are measuring
Skills Data Scientists should have skills in the following categories:
Technical
Demonstrable knowledge and experience of large, complex data analytics or intelligence programs
Statistical, pattern recognition skills
Understanding of data concepts
Understanding of data technology and tools
Experimental design, set-up, and modelling
Experience with applicable analytics platforms, tools and technologies
Architectural understanding of the data and big data ecosystems
Contextual
Full understanding of the organization and of its requirements and opportunities in data/big data analytics
Experience in targeting tradecraft as well as experience in cargo screening, person screening, operational targeting
Experience managing a team and working with senior level Government clients on consulting projects
Strategic thinking
Personal Attributes A Data Scientist should have the following personal attributes:
Communication skills
Presentation and public speaking skills
Rapport building and networking
Innovation and creativity
Leadership skills including ability to influence others, to lead business and technology programs, projects, workshops and initiatives, to inspire confidence and garner respect from business and technology stakeholders
Planning, supervision, coaching and delegation skills
Decision making skills
Negotiating skills
Research skills
Abilities A Data Scientist should have the following abilities:
Ability to explain complex concepts to a layperson
Ability to collaborate with people who bring diverse skills and cross-functional expertise.
Ability to communicate the benefits of analytical approaches simply and clearly
Ability to communicate with top executives, business management, IT management, solution architects, technical architects, subject matter experts, partners and customers.
Ability to adapt vocabulary and style for each situation
Ability to present appropriately to a variety of audiences, including large audiences, top executives, business and technical leaders
Ability to seek and find solutions to a wide range of business and technology problems
Ability to seek standardized solutions for problems where available
Ability to find solutions across a wide range of technologies and business domains. Often solutions have budget, time or operational constraints.
Work Values Individuals who are effective as Data Scientists are:
Able to communicate at all levels of organization
Able to present complex ideas with simple visuals
Able to find solutions across a wide range of technologies and business domains
Able to facilitate collaboration
Enjoy problem-solving
Highly analytical
Able to work independently
Work Styles Data Scientists would have the following work styles:
Collaborative
Cooperative
Stress tolerant
Initiative
Independent
Integrity
Essential Skills Profile A Data Scientist would have the following essential skills profile:
Reading text
Document use
Writing skills
Numeracy
Oral Communication
Thinking Skills
Problem Solving
Decision Making
Job Task Planning and Organizing
Significant Use of Memory
Finding Information
Working with Others
Continuous Learning
Additional Information
Physical Aspects Data Scientists work extensively in an office environment (sitting for long periods, repetitive computer and telephone use). However, Data Scientists may also be required to travel to satisfy the position function. Typically there is no heavy lifting, bending, or stooping required; however, this is determined by the needs of the organization.
Attitudes Data Scientists should have very advanced interpersonal skills – be persuasive, empathetic, able to handle pressure, creative, have a sense of urgency, and attention to detail. Data Scientists must exhibit leadership, people management skills, advanced negotiation skills, advanced conflict resolution skills, and organizational and planning abilities. Adaptability and flexibility are important, as Data Scientists work with diverse multicultural workforces.
Future Trends Affecting Essential Skills The ability to speak more than one language, and an awareness of and sensitivity to the diversity of international cultures, are considered a growing need in the face of increasing globalization. Furthermore, familiarity with opportunities and benefits associated with “green IT” (e.g. server energy efficiency, reducing overall power consumption from IT related activities, etc.) will be of increasing importance as facilities begin to manage their overall environmental footprint while seeking short and long term cost saving opportunities. A strong understanding of cloud computing will also serve all individuals in this position very well.
Occupational Standard
(for use in the development of Business Technology Management related job descriptions, performance evaluations, career development plans, educational learning outcomes etc.)
Description of Position Enterprise data architects apply architecture principles and practices to IT and business problems in order to guide organizations through the business, information, process, and technology changes necessary to execute their strategies. Enterprise data architecture involves enterprise analysis, design, planning, and implementation, using a holistic approach at all times, for the successful development and execution of strategy. These practices utilize the various aspects of an enterprise to identify, motivate, and achieve these changes. An Enterprise Data Architect is a person responsible for performing this complex analysis of business or technology structure and processes with the goal of drawing conclusions from the information collected so that a solution can be developed.
They also create schematic documents used to solve problems and communicate those documents widely throughout their organizations.
Position Development Advancement to management level positions is possible through progressively responsible leadership positions and management experience. The career path will be determined by the size, type, geographic scope, culture, and organizational structure of the firm offering employment.
Required Qualifications
Education Post-secondary education is preferred, usually a Bachelor’s degree in a business, computing or engineering field. Follow-up technical education may also be required depending on the technologies in use at the various organizations.
Training Enterprise Data Architects require on-the-job training; however, typically organizations require that the individual will already have the required skills, knowledge, work-related experience, and/or industry courses and programs. Some organizations will send individuals to specific enterprise solutions training courses and programs to learn additional tools and techniques.
Related Work Experience Individuals may have experience in any of the methodologies and techniques used as an Enterprise Data Architect. Often this experience may be augmented by specific industry experience using industry or use case specific tools (e.g. Cloud data tools).
Tasks Communicate the benefits of various architectural approaches or designs to both business and engineering audiences
Present solutions to a variety of audiences, including large audiences, top executives, business and technical leaders
Seek and find solutions to a wide range of business and technology problems
Seek standardized solutions for problems where available
Find solutions across a wide range of technologies and business domains
Tools and Technology Office productivity tools
Architecture diagram tools
Software development tools and DevOps tools, including language-specific IDEs, Git, etc.
Required Competencies
Knowledge Enterprise Data Architects should have knowledge of:
The organization, structure, and relationship between the various systems existing within an organization as well as the organization’s overall structure and function
Architectural relationships between key information technology components and best practices in Enterprise Data Architecture frameworks/perspectives for the specific businesses that they are working in
Familiarity with technology frameworks that are relevant for their various industries
Hardware, software, application and systems engineering best practices and goals
Relevant organizational concepts, processes, technologies and workflow for purposes of analysis, design, development and implementation of a data science & analytics driven information system
Basic organizational terminology as well as commonly used abbreviations and acronyms
Commonly used formats, structures and methods for recording and communicating data within a specific organization, as well as an understanding
Appropriate informatics standards and enterprise models which enable system interoperability (e.g., terminology, data structure, system to system communication, privacy, security, safety)
Project and program management planning and organizational skills
Financial modeling as it pertains to IT investment
IT governance and operations
Policies and principles for the management of analytics data and information
Data, information and workflow models that can be used to model information technology solutions
Key information technology concepts and components (e.g., networks, storage devices, operating systems, information retrieval, data warehousing, applications, firewalls, etc.)
The ability to identify relevant sources of data and information to assess quality of information and draw appropriate conclusions
Appropriate analytical and evaluation techniques and concepts
Knowledge of best practices for visualizing and presenting data and information in a way that is effective for users
Knowledge of indicators and metrics for organizational delivery & systems management
Skills An Enterprise Data Architect should have skills in the following categories:
Technical
The ability to understand the big picture within an organization and the relationship between domains and components within it
Systems thinking - the ability to see how parts interact with the whole (big picture thinking)
Comprehensive knowledge of hardware, software, application, and systems engineering
Project and program management planning and organizational skills
Knowledge of financial modeling as it pertains to IT investment
Ability to adopt a successful customer service orientation that applies to various stakeholders
Time management and prioritization skills
Systems & engineering thinking
Emotional intelligence
Contextual
Understanding of the business for which the Enterprise Data Architecture is being developed (see above regarding various health care organizations)
Knowledge of IT governance and operations
Personal Attributes An Enterprise Data Architect should have the following personal attributes:
Communication skills
Presentation and public speaking skills
Rapport building and networking
Innovation and creativity
Leadership skills including ability to influence others, to lead business and technology programs, projects, workshops and initiatives, to inspire confidence and garner respect from business and technology stakeholders
Planning, supervision, coaching and delegation skills
Decision making skills
Negotiating skills
Research skills
Abilities An Enterprise Data Architect should have the following abilities:
Ability to communicate the benefits of architectural approaches simply and clearly
Ability to communicate with top executives, business management, IT management, solution architects, technical architects, subject matter experts, partners and customers.
Ability to adapt vocabulary and style for each situation
Ability to present appropriately to a variety of audiences, including large audiences, top executives, business and technical leaders
Ability to seek standardized solutions for problems where available
Ability to find solutions across a wide range of technologies and business domains. Often solutions have budget, time or operational constraints.
Work Values Individuals who are effective as Enterprise Data Architects are:
Able to communicate at all levels of organization
Able to present complex ideas with simple visuals
Able to find solutions across a wide range of technologies and business domains
Able to facilitate collaboration
Enjoy problem-solving
Highly analytical
Able to work independently
Work Styles An Enterprise Data Architect would have the following work styles:
Collaborative
Cooperative
Stress tolerant
Initiative
Independent
Integrity
Essential Skills Profile An Enterprise Data Architect would have the following essential skills profile:
Reading text
Document use
Writing skills
Numeracy
Oral Communication
Thinking Skills
Problem Solving
Decision Making
Job Task Planning and Organizing
Significant Use of Memory
Finding Information
Working with Others
Continuous Learning
Additional Information
Physical Aspects Enterprise Data Architects work extensively in an office environment (sitting for long periods, repetitive computer and telephone use). However, Enterprise Data Architects may also be required to travel to satisfy the position function. Typically there is no heavy lifting, bending, or stooping required; however, this is determined by the needs of the organization.
Attitudes Enterprise Data Architects should have very advanced interpersonal skills – be persuasive, empathetic, able to handle pressure, creative, have a sense of urgency, and attention to detail. Enterprise Data Architects must exhibit leadership, people management skills, advanced negotiation skills, advanced conflict resolution skills, and organizational and planning abilities. Adaptability and flexibility are important, as Enterprise Data Architects work with diverse multicultural workforces.
Future Trends Affecting Essential Skills The ability to speak more than one language, and an awareness of and sensitivity to the diversity of international cultures, are considered a growing need in the face of increasing globalization. Furthermore, familiarity with opportunities and benefits associated with “green IT” (e.g. server energy efficiency, reducing overall power consumption from IT related activities, etc.) will be of increasing importance as facilities begin to manage their overall environmental footprint while seeking short and long term cost saving opportunities. A strong understanding of cloud computing will also serve all individuals in this