Database and Data Mining Security

Threats/Protections to the System

1. External procedures
–security clearance of personnel
–password protection
–controlling application programs
–audit

2. Physical environment
–secure areas for DB/hardware
–radiation shielding

3. Data storage
–encryption
–duplicate copies

4. Processor software
–user authentication
–access control
–threat monitoring
–audit trail

5. Processor hardware
–memory protection
–state of privilege
–reliability

6. Communication line
–data encryption, implemented with cost consideration

• The DB holds essential data that reflects the organization’s core competencies

• Protecting data is at the heart of a secure system

• Users rely on the DBMS to manage protection

• DB organization and contents are valuable corporate assets that must be carefully protected

• Two major security issues in the DB context
–Integrity
–Secrecy

• Two major problems
–Inference
–Multilevel

6.1 Introduction to DB

• DB = organized collection of data and a set of rules that organize the data by specifying certain relationships among the data

• Data Mining = the process of extracting hidden patterns from a DB by using statistical and mathematical inference

• DB administrator = the person who defines the rules to organize and control the usage of DB

• DBMS = software providing front end or interface for users to interact with DB

• DBMS functions:

–create
–manage
–protect
–provide access

Advantages of DB

• Shared access

• Minimal redundancy

• Data consistency

• Data integrity

• Controlled access

6.2 Security Requirements of DB

• A DB system may be attacked at many levels

• Attackers are usually end users rather than programmers

• Violations are the reading, modifying, or destroying of information by unauthorized people

• Basic problems
–access control
–exclusion of spurious data
–authentication of users
–reliability

Requirements for DB Security

1. Physical DB integrity:
–DB must be immune to physical problems such as power failure, hardware malfunction, ...
–DB can be reconstructed if it is destroyed

2. Logical DB integrity: the structure of the DB must be preserved, e.g., modification of one field does not affect other fields

3. Element integrity: data contained in each element must be correct or accurate. Can be provided by
–field check: test for appropriate values
–access control: control of concurrent access
–change log: lists every change made to the DB, so that we can trace all previous actions or, when an error occurs, undo, roll back, ...

4. Auditability: it must be possible to track who or what has accessed the elements in the DB
–pass-through problem: an access that transfers no data to the user, e.g., when using select, is difficult to audit
–the log may overstate or understate what a user actually knows

5. Access control : user is allowed to access only authorized data. Different users can be restricted to different modes of access

6. User authentication : every user is positively identified, both for auditing and access permission

7. Availability: data must be available to the right person at the right time
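
The idea in requirement 5, that different users can be restricted to different modes of access, can be sketched as a lookup table. This is only a minimal illustration: the user names, the table name, and the mode names are hypothetical and do not come from the slides.

```python
# Hypothetical permission table: (user, table) -> set of allowed access modes.
PERMISSIONS = {
    ("clerk", "employee"): {"read"},
    ("hr_manager", "employee"): {"read", "update"},
}


def is_allowed(user: str, table: str, mode: str) -> bool:
    """Return True only if this user holds this access mode on this table."""
    # Unknown users or tables get an empty set, so access is denied by default.
    return mode in PERMISSIONS.get((user, table), set())
```

Denying by default keeps a missing entry from silently granting access.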

6.3 Reliability & Integrity

• Reliable software = software that runs for a very long time without failing

• DB reliability and integrity can be viewed from 3 dimensions:
–DB integrity: the DB as a whole is protected against damage, as from HW/SW failure
–Element integrity: the value of a data element is changed only by authorized users
–Element accuracy: only correct values are written into the elements of a DB

Protection Features for Reliability

DB can be monitored and controlled by many methods as follows:

• Field checks: checks for the validity of values in DB fields, usually applied at data entry

• Change logs: whenever the DB is changed, a log file must keep both the old and new values so that the DBA can examine, verify, or correct them if an error occurs

• Access control procedures: keep an eye on all users who access the DB so that we know the DB status and the access of every user before a system crash or conflict

• User authentication: check and allow only authorized users to access the DB

• Integrity checks: information should be checked for integrity, accuracy, and completeness

• Audits: performed by an internal or external party to make sure that the system performs as designed

• Monitors: can check the structural integrity of the DB, e.g., whether a value being entered is consistent with other parts of the DB
–Range comparisons: verify whether a new value is within the acceptable range
–State constraints: check whether DB values violate constraints that must hold for the entire DB
–Transition constraints: check the conditions necessary before changes can be applied to a DB, e.g., before a new employee can be added, there must be a vacant position
–Boundary checks: check whether sensitive values fall between the lower and upper bounds without revealing the actual values, e.g., checking a sensitive salary against its boundary values
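
The monitor checks above can be sketched in a few lines of Python. This is an illustration under assumptions, not a DBMS feature: the in-memory `db` dictionary, the function names, and the salary bounds are all hypothetical.

```python
def range_check(value, low, high):
    """Range comparison: accept a new value only if it lies in [low, high]."""
    return low <= value <= high


def state_ok(db):
    """State constraint (assumed rule): filled positions never exceed the total."""
    return len(db["employees"]) <= db["positions"]


def add_employee(db, name, salary):
    """Transition constraint: a vacant position must exist before adding."""
    if len(db["employees"]) >= db["positions"]:
        raise ValueError("no vacant position")
    if not range_check(salary, 0, 200_000):  # illustrative bounds only
        raise ValueError("salary out of range")
    db["employees"][name] = salary
```

For example, with `db = {"positions": 1, "employees": {}}` the first call to `add_employee` succeeds and a second call raises `ValueError`, because no vacancy remains.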

• Two-phase update: a secure method for updating
–intent phase: prepare the data to be used for updating, e.g., gather data, create dummy records, open files, lock records, compute final answers (if this phase fails, it can be repeated)
–committing phase: make the permanent change by writing a commit flag; if this phase fails, recovery must be performed, e.g., undo (roll back) or redo (roll forward)
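
A minimal in-memory sketch of the two phases follows. It assumes a dictionary stands in for the DB and that recovery is done by redo (roll forward); the key names `_commit_flag` and `_shadow` and the `fail_during_commit` switch are hypothetical devices for the illustration.

```python
def two_phase_update(db, changes, fail_during_commit=False):
    # Intent phase: compute all new values into shadow storage. The DB itself
    # is untouched, so this phase can simply be repeated after a failure.
    shadow = {key: compute(db.get(key)) for key, compute in changes.items()}

    # Committing phase: write the commit flag, then make the change permanent.
    db["_shadow"] = shadow
    db["_commit_flag"] = True
    if fail_during_commit:
        return  # simulated crash after the flag is set; recover() finishes
    _apply(db)


def _apply(db):
    """Copy the shadow values into place and clear the commit flag."""
    for key, value in db["_shadow"].items():
        db[key] = value
    del db["_shadow"]
    db["_commit_flag"] = False


def recover(db):
    """Redo (roll forward): if the commit flag is set, finish the copy."""
    if db.get("_commit_flag"):
        _apply(db)
```

Because the final values are computed before the flag is written, redoing the copy is safe no matter how many times recovery runs.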

• Redundancy/internal consistency
–error detection/correction
–shadow fields: create a second copy of a field or record, to be used if the first copy fails or an error occurs during updating

• Data recovery
–roll back/roll forward
–compensating transaction
–backup/restore

• Concurrency/consistency control
–serializability
–data locking

6.4 Sensitive Data

• Sensitive data = data that should not be made public because disclosure would cause damage to an individual

• Security concerns not only the data elements but also their context and meaning (Table 6-6)

• We should also take into account different degrees of sensitivity

• Access control problem: how to limit access so that sensitive data are not released to unauthorized people

• Factors that can make data sensitive
–inherently sensitive, e.g., the location of a defensive missile
–data from a sensitive source, e.g., information from an informer whose identity may be compromised if the information were disclosed
–declared sensitive, e.g., classified military data
–part of a sensitive attribute/record, e.g., the salary attribute of a personnel DB, or a record of a secret space mission program
–sensitive in relation to previously disclosed information, e.g., the longitude coordinate of a secret gold mine, which, together with the latitude, can pinpoint the location

Access Decision

• access decisions must be based on the access policy

• factors that affect the decision
–availability of data: whether the access causes permanent blocking or very long data locking, resulting in denial of service
–acceptability of access: whether the access can release sensitive information that the user did not ask for but that comes out together with non-sensitive data
–assurance of authenticity: whether the access is made by authorized people; unauthorized people can reveal sensitive data by combining several less sensitive queries

Types of Disclosures

• exact data: the most serious disclosure

• bounds: a useful way to present sensitive data

• negative results: when data are separated into 2 groups, not appearing in one group reveals membership in the other

• existence: revealing the existence of data, regardless of its actual value, is sometimes sensitive

• probable value: a combination of non-sensitive queries may result in disclosure of sensitive data (with some probability)

Security vs Precision

• Security goal: protect data as securely as possible

• Precision goal: reveal as much data as possible

• the situation is complicated by a desire to share non-sensitive data and protect sensitive data

• ideal combination of security and precision: maintain perfect security with maximum precision

6.5 Inference

- inference is a way to infer, derive, or deduce sensitive information from non-sensitive data

- sensitive information is usually deduced from the most extreme values of the available information

- protection against inference can be provided by creating a rule-based semantic layer between the logical DB design and the physical implementation, which serves as the criteria for examining queries

Methods of Inference

Direct Attack: the attacker forms a query with conditions chosen so that very little output, or a single data item, is returned

Ex. student data containing a sensitive field DRUG with values 0, 1, 2, 3

(obvious query)
List NAME where SEX=M and DRUG=1

(less obvious query)
List NAME where (SEX=M and DRUG=1) or (SEX=M and SEX=F) or (DORM=AYRES)
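
The two queries can be tried on a tiny made-up table to see why the padded form hides nothing. The three student records are hypothetical, and the sketch assumes that no student lives in AYRES, so the extra clauses select no records.

```python
# Hypothetical student records; none of them lives in AYRES.
students = [
    {"name": "A", "sex": "M", "drug": 1, "dorm": "HOLMES"},
    {"name": "B", "sex": "F", "drug": 0, "dorm": "GREY"},
    {"name": "C", "sex": "M", "drug": 0, "dorm": "GREY"},
]


def list_names(records, predicate):
    """List NAME where <predicate> holds."""
    return [r["name"] for r in records if predicate(r)]


def obvious(r):
    return r["sex"] == "M" and r["drug"] == 1


def less_obvious(r):
    return (
        (r["sex"] == "M" and r["drug"] == 1)
        or (r["sex"] == "M" and r["sex"] == "F")  # can never be true
        or r["dorm"] == "AYRES"                   # matches nobody here
    )
```

Both predicates return the same single name, so the sensitive DRUG value of that student is disclosed either way.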

Indirect Attack

• an indirect attack is usually carried out outside the DB, using anonymous statistics to infer information about individuals

• normally, statistical information released from a DB must eliminate anything that identifies individuals, e.g., name, address, telephone number, ...

• only neutral statistics are presented, e.g., count, sum, mean, ..., without extreme values

• However, an indirect attack may use these for inference, e.g.,
–sum, count, mean, median, tracker attack
–linear system vulnerability

By combining Table 6.8 and Table 6.9, we can infer who the individuals are (for the entries shown in yellow)

• Tracker Attack
–Fools the DB manager into locating the desired data by using additional queries
–The tracker adds additional records to be retrieved for two different queries
–The two sets of records cancel each other out, leaving only the data required (given the results for n and n-1 records, we can easily compute the single element)
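
The "n and n-1" remark can be made concrete with a small sketch. It assumes a DB that refuses any SUM query matching fewer than two records; the `guarded_sum` function, the threshold, and the salary figures are hypothetical.

```python
# Hypothetical salary records.
staff = [
    {"name": "Ann", "salary": 100},
    {"name": "Bob", "salary": 200},
    {"name": "Cid", "salary": 300},
    {"name": "Dana", "salary": 400},
]


def guarded_sum(records, predicate, field, min_size=2):
    """SUM query that is refused (None) when fewer than min_size records match."""
    matched = [r[field] for r in records if predicate(r)]
    if len(matched) < min_size:
        return None
    return sum(matched)


# The direct query on one person is refused...
direct = guarded_sum(staff, lambda r: r["name"] == "Dana", "salary")

# ...but two permitted queries over n and n-1 records differ by exactly her value.
total_n = guarded_sum(staff, lambda r: True, "salary")
total_n_minus_1 = guarded_sum(staff, lambda r: r["name"] != "Dana", "salary")
inferred = total_n - total_n_minus_1
```

Each query on its own looks harmless, which is why a size check alone does not stop this attack.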

Linear system vulnerability:

- a single value c can be solved for from the answers to 5 queries
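
As an illustration, suppose five SUM queries over hidden values c1..c5 return q1 = c1+c2+c3+c4+c5, q2 = c1+c2+c4, q3 = c3+c4, q4 = c4+c5, and q5 = c2+c5. This particular system is an assumption chosen for the sketch (the slide's own equations are not reproduced in the text), and it happens that q5 is not needed to isolate c5.

```python
def run_queries(c):
    """Return the answers to the five assumed SUM queries over hidden values c."""
    c1, c2, c3, c4, c5 = c
    return (
        c1 + c2 + c3 + c4 + c5,  # q1
        c1 + c2 + c4,            # q2
        c3 + c4,                 # q3
        c4 + c5,                 # q4
        c2 + c5,                 # q5
    )


def solve_c5(q1, q2, q3, q4, q5):
    """Recover c5 from the query answers alone."""
    # q1 - q2 = c3 + c5 and q3 - q4 = c3 - c5, so their difference is 2 * c5.
    return ((q1 - q2) - (q3 - q4)) / 2
```

No single query reveals an individual value, yet simple algebra on the answers does.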

Control for statistical inference attacks

• Query controls are effective primarily against direct attack

• Precision checks are used to determine whether a given query discloses sensitive data
–Suppression: a query for sensitive data is rejected and terminated without any response or indication
–Concealing: presents information that is not exact but close to the actual data, by slightly modifying data values with random numbers

• Limited response suppression

• Combined Results

• random sample: results are computed from data randomly selected from the whole data set

• random data perturbation: slightly modify statistics before presenting them to the requester, e.g., add some noise (error) to the output data that has no significant effect on the statistics

Ex. The average salary may be multiplied by a small number, and a small random number may be added to or subtracted from each record
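
Random data perturbation can be sketched as follows. The sketch assumes zero-mean uniform noise added to each record before the mean is taken; the function name and the `noise` parameter are hypothetical, and a real system would choose the noise distribution with more care.

```python
import random


def perturbed_mean(values, noise=1.0, rng=random):
    """Mean of the values after adding uniform noise in [-noise, noise] to each."""
    noisy = [v + rng.uniform(-noise, noise) for v in values]
    return sum(noisy) / len(noisy)
```

Because every record moves by at most `noise`, the reported mean is always within `noise` of the true mean, while individual values are never returned exactly.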

• query analysis
–Used to analyze whether a query can be used to infer sensitive data
–The technique considers the query history and the query context
–More complicated
–Difficult to do
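
One very simple form of query analysis keeps a per-user history of result sets and refuses a query whose result differs from an earlier one by exactly one record. The class below is a sketch of that single rule only; the `QueryAuditor` name and the rule itself are assumptions for illustration, and a practical analyzer would need to reason about many more combinations.

```python
class QueryAuditor:
    """Track each user's answered queries by the set of record ids returned."""

    def __init__(self):
        self.history = {}  # user -> list of frozensets of record ids

    def allow(self, user, result_ids):
        result = frozenset(result_ids)
        for previous in self.history.get(user, []):
            # A one-record difference would let the user isolate that record.
            if len(result ^ previous) == 1:
                return False
        self.history.setdefault(user, []).append(result)
        return True
```

Refused queries are not added to the history, since the user never saw their answers.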

• Inference conclusion
–No perfect solution
–Three approaches
1. Suppress obviously sensitive information
2. Track what the user knows
3. Disguise the data

Aggregation

–This attack builds a sensitive result from less sensitive data
–Several pieces of less sensitive data can be combined and traced to sensitive data
–Rather complicated and difficult to do
–Presently, advances in data mining can be applied to perform this type of attack
