
Database vs. data warehouse

A database is a collection of related information stored in a structured form, as tables, so that data can be inserted, deleted and manipulated easily. A database consists of tables that contain attributes. A data warehouse, by contrast, is a database system optimized for reporting and analysis. It generally combines data from many different databases across the entire enterprise. Once data has been entered into the data warehouse, it can only be loaded, refreshed and accessed for queries.
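The contrast can be illustrated with a minimal Python sketch using the standard sqlite3 module. The two source lists, the table name sales_fact and the function names are hypothetical and chosen only for illustration; a real warehouse would be far larger and loaded by dedicated tools. The sketch loads rows from two separate operational sources into one reporting table and then runs an analytical query across them.

```python
import sqlite3

# Hypothetical operational data held by two separate departmental systems.
store_sales = [("2023-01", "North", 1200.0), ("2023-02", "North", 900.0)]
online_sales = [("2023-01", "South", 700.0), ("2023-02", "South", 1500.0)]


def build_warehouse(*sources):
    """Load rows from several sources into one in-memory reporting table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales_fact (month TEXT, region TEXT, amount REAL)")
    for rows in sources:
        conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def monthly_totals(conn):
    """Analytical query: total sales per month across all loaded sources."""
    cursor = conn.execute(
        "SELECT month, SUM(amount) FROM sales_fact GROUP BY month ORDER BY month"
    )
    return cursor.fetchall()


warehouse = build_warehouse(store_sales, online_sales)
print(monthly_totals(warehouse))  # [('2023-01', 1900.0), ('2023-02', 2400.0)]
```

After loading, the sketch only reads from the table, which mirrors the load, refresh and query pattern described above.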

DATA MINING

Data mining is the process of extracting hidden predictive information from large databases. It is a powerful technology that helps companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. For a commercial business, the discovery of previously unknown statistical patterns or trends can provide valuable insight into the function and environment of the organization. Data-mining techniques can generally be grouped into two categories: predictive methods and descriptive methods.

1. Descriptive method: This is a method of finding human-interpretable patterns that describe the data. For example, data mining can be used to group together similar documents returned by a search engine according to their context.

2. Predictive method: This method uses some variables to predict unknown or future values of other variables. For example, it can be used to predict whether a newly arrived customer will spend more than Rs. 1000 at a department store.
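As a rough illustration of the predictive method, the Python sketch below predicts whether a new customer will spend more than Rs. 1000 by looking at the most similar past customers (a simple nearest-neighbour vote). The customer records, the two attributes (age and income) and the function name are assumptions made up for this example; real data mining tools use much larger data sets and more refined techniques.

```python
# Hypothetical past customers:
# (age, monthly income in Rs. thousands, spent more than Rs. 1000?)
history = [
    (25, 20, False),
    (30, 35, False),
    (45, 80, True),
    (50, 95, True),
    (35, 60, True),
    (22, 15, False),
]


def predict_high_spender(age, income, k=3):
    """Predict by majority vote among the k most similar past customers."""
    # Sort past customers by squared distance to the new customer.
    by_distance = sorted(
        history,
        key=lambda row: (row[0] - age) ** 2 + (row[1] - income) ** 2,
    )
    votes = [spent for _, _, spent in by_distance[:k]]
    return sum(votes) > k / 2


print(predict_high_spender(40, 70))  # True
print(predict_high_spender(24, 18))  # False
```

Here the known variables (age and income) are used to predict an unknown one (whether spending exceeds Rs. 1000), which is the pattern the predictive method describes.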

In its simplest form, data mining automates the detection of relevant patterns in a database, using defined approaches and algorithms to look into current and historical data that can then be analyzed to predict future trends. Because data mining tools predict future trends and behaviors by reading through databases for hidden patterns, they allow organizations to make proactive, knowledge-driven decisions and answer questions that were previously too time-consuming to resolve.

Data mining is not particularly new: statisticians have used similar manual approaches to review data and provide business projections for many years. Changes in data mining techniques, however, have enabled organizations to collect, analyze, and access data in new ways. The first change occurred in the area of basic data collection. Before companies made the transition from ledgers and other paper-based records to computer-based systems, managers had to wait for staff to put the pieces together to know how well the business was performing or how current performance periods compared with previous periods. As companies started collecting and saving basic data in computers, they were able to answer detailed questions more quickly and with greater ease.

LESSON ROUND-UP

– A database is a collection of related files that are usually integrated, linked or cross-referenced to one another. The advantage of a database is that data and records contained in different files can be easily organized and retrieved using specialized database management software called a database management system (DBMS) or database manager.

– The term database system refers to an organization of components that define and regulate the collection, storage, management and use of the data within a database environment. The database system is composed of five components: hardware, software, people, procedures and data.

– A database management system is a set of software programs that allows users to create, edit and update data in database files, and store and retrieve data from those database files. Data in a database can be added, deleted, changed, sorted or searched all using a DBMS.

– DBMSs are commonly used to manage membership and subscription mailing lists, accounting and bookkeeping information, data obtained from scientific research, customer information, inventory information, personal records and library information.

– The advantages of a DBMS include improved data availability, minimized data redundancy, data accuracy, program and file consistency, user-friendliness and improved security. A DBMS also has certain disadvantages, including high cost and security issues.

– A data model can be defined as an integrated collection of concepts for describing and manipulating data, relationships between data, and constraints on the data in an organization. The purpose of a data model is to represent data and to make the data understandable.

– A data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently. Data structures provide a means to manage large amounts of data efficiently, such as large databases and internet indexing services. Usually, efficient data structures are a key to designing efficient algorithms.

– Database administrators (DBAs) are primarily responsible for specific databases in the subsystem. In some companies, DBAs are given the special group authorization SYSADM, which lets them do almost everything in the database subsystem and gives them jurisdiction over all the databases in it. The DBA can be responsible for granting authorizations to the database objects, although sometimes a special security administration group does this.

– Data Definition Language (DDL) is used to define the various types of data in the database and their relationships with each other, while Data Manipulation Language (DML) enables users to access or manipulate data (retrieve, insert, update, delete).

– A database file is defined as a collection of related records. A database file is sometimes called a table. A file may be composed of a complete list of individuals on a mailing list, including their addresses and telephone numbers.

– A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data that is required for the decision-making process, whereas data mining involves the use of various data analysis tools to discover new facts, valid patterns and relationships in large data sets. Data mining also includes analysis and prediction for the data. Data mining helps in extracting meaningful new patterns that cannot be found just by querying or processing data or metadata in the data warehouse.
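The distinction between DDL and DML, and the add, change, delete, sort and search operations of a DBMS summarized above, can be sketched with Python's standard sqlite3 module. The table name mailing_list, its columns and the sample records are hypothetical; SQLite is used here only because it ships with Python, not because the lesson prescribes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of the data (a table and its attributes).
conn.execute(
    "CREATE TABLE mailing_list ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, phone TEXT)"
)

# DML: insert records into the table.
conn.executemany(
    "INSERT INTO mailing_list (name, phone) VALUES (?, ?)",
    [("Ravi", "555-0101"), ("Anita", "555-0102"), ("Meera", "555-0103")],
)

# DML: change one record and delete another.
conn.execute("UPDATE mailing_list SET phone = ? WHERE name = ?", ("555-0199", "Ravi"))
conn.execute("DELETE FROM mailing_list WHERE name = ?", ("Meera",))
conn.commit()

# DML: retrieve the remaining records, sorted by name.
rows = conn.execute("SELECT name, phone FROM mailing_list ORDER BY name").fetchall()
print(rows)  # [('Anita', '555-0102'), ('Ravi', '555-0199')]
```

The CREATE TABLE statement is the only DDL here; every other statement is DML acting on the records of that table.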

SELF-TEST QUESTIONS

(These are meant for recapitulation only. Answers to these questions are not to be submitted for evaluation.)

1. What is a database or a database management system (DBMS)? What is the difference between a file and a database? Can files qualify as a database?

2. State the difference between a DBMS and an RDBMS.

3. Define data models. What are the different data models available?

4. What do you mean by Data Definition Languages (DDL) and Data Manipulation Languages (DML)?

5. What is data warehousing? Explain its characteristics.

6. What is data mining? Compare data mining and data warehousing.

7. What is a database? What are the characteristics of a database system?

8. What are the functions of a database administrator?

9. What are the scenarios in which a database system is not required? Explain any one.

Example No. 2: Draw a flowchart to find the largest of three numbers A, B and C.

Answer: The required flowchart is shown below.

[Figure: flowchart for finding the largest of three numbers A, B and C, with YES/NO decision branches]
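The decision logic asked for in Example No. 2 can also be sketched in Python. This is one possible arrangement of the comparisons, written as nested two-way decisions of the kind a flowchart uses; the function name is hypothetical and the original flowchart may order its tests differently.

```python
def largest_of_three(a, b, c):
    """Return the largest of a, b and c using simple two-way decisions."""
    if a > b:
        # b is ruled out; the largest is either a or c.
        if a > c:
            return a
        return c
    # a is not greater than b; the largest is either b or c.
    if b > c:
        return b
    return c


print(largest_of_three(3, 9, 5))  # 9
```

Each if statement corresponds to one decision box, with the two outcomes playing the role of the YES and NO branches.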

LESSON OUTLINE

– Systems: An Overview