Programming issues

Just as the early programming languages were designed to generate spreadsheet printouts, programming techniques themselves have evolved to process tables (also known as spreadsheets or matrices) of data more efficiently in the computer itself.

Spreadsheets have evolved to use powerful programming languages like VBA; specifically, they are functional, visual, and multiparadigm languages.

Many people find it easier to perform calculations in spreadsheets than by writing the equivalent sequential program. This is due to two traits of spreadsheets.

• They use spatial relationships to define program relationships. Like all animals, humans have highly developed intuitions about space and about dependencies between items. Sequential programming usually requires typing line after line of text, which must be read slowly and carefully to be understood and changed.

• They are forgiving, allowing partial results and functions to work. One or more parts of a program can work correctly, even if other parts are unfinished or broken. This makes writing and debugging programs much easier and faster.[citation needed] Sequential programming usually needs every program line and character to be correct for a program to run. One error usually stops the whole program and prevents any result.

A 'spreadsheet program' is designed to perform general computation tasks using spatial relationships rather than time as the primary organizing principle.[citation needed]

It is often convenient to think of a spreadsheet as a mathematical graph, where the nodes are spreadsheet cells, and the edges are references to other cells specified in formulas.

This is often called the dependency graph of the spreadsheet. References between cells can take advantage of spatial concepts such as relative position and absolute position, as well as named locations, to make the spreadsheet formulas easier to understand and manage.
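As a rough illustration of the dependency graph described above, the following sketch extracts the edges from formula text. The sheet contents, the `references` and `dependency_graph` helpers, and the simplified A1-style reference pattern are all assumptions made for this example; real spreadsheet formula grammars are considerably richer.

```python
import re

# Hypothetical sheet: cell address -> formula text or constant.
sheet = {
    "A1": "10",
    "A2": "32",
    "B1": "=A1+A2",
    "B2": "=$A$1*2",
    "C1": "=B1+B2",
}

# Simplified A1-style reference; an optional $ marks an absolute column or row.
CELL_REF = re.compile(r"\$?([A-Z]+)\$?([0-9]+)")

def references(formula: str) -> set[str]:
    """Return the cells a formula refers to, with $ markers stripped."""
    if not formula.startswith("="):
        return set()  # a constant depends on nothing
    return {column + row for column, row in CELL_REF.findall(formula)}

def dependency_graph(cells: dict[str, str]) -> dict[str, set[str]]:
    """Map each cell (node) to the set of cells it references (edges)."""
    return {address: references(text) for address, text in cells.items()}

graph = dependency_graph(sheet)
```

Here `graph["C1"]` contains `B1` and `B2`, while constants such as `A1` have no outgoing edges.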

Spreadsheets usually attempt to automatically update cells when the cells on which they depend have been changed. The earliest spreadsheets used simple tactics like evaluating cells in a particular order, but modern spreadsheets compute a minimal recomputation order from the dependency graph. Later spreadsheets also include a limited ability to propagate values in reverse, altering source values so that a particular answer is reached in a certain cell. Since spreadsheet cell formulas are not generally invertible, though, this technique is of somewhat limited value.
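One way to derive a recomputation order from a dependency graph is a topological sort restricted to the cells that a change actually affects. The sketch below assumes a small hypothetical graph and uses Python's standard `graphlib` module; it is not a description of how any particular spreadsheet product schedules recalculation.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency graph: cell -> cells it reads from.
depends_on = {
    "A1": set(),
    "A2": set(),
    "B1": {"A1", "A2"},
    "B2": {"A1"},
    "C1": {"B1", "B2"},
    "D1": {"A2"},
}

def affected_cells(changed: str, graph: dict[str, set[str]]) -> set[str]:
    """Return every cell that directly or indirectly reads `changed`."""
    dirty: set[str] = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for cell, inputs in graph.items():
            if current in inputs and cell not in dirty:
                dirty.add(cell)
                frontier.append(cell)
    return dirty

def recalculation_order(changed: str, graph: dict[str, set[str]]) -> list[str]:
    """Order the affected cells so each is evaluated after its inputs."""
    dirty = affected_cells(changed, graph)
    # Keep only edges between dirty cells; other inputs are already up to date.
    subgraph = {cell: graph[cell] & dirty for cell in dirty}
    return list(TopologicalSorter(subgraph).static_order())

order = recalculation_order("A1", depends_on)
```

Changing `A1` schedules `B1`, `B2` and then `C1`; `D1` is left alone because it does not depend on `A1`.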

Many of the concepts common to sequential programming models have analogues in the spreadsheet world. For example, the sequential model of the indexed loop is usually represented as a table of cells, with similar formulas (normally differing only in which cells they reference).
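The correspondence between an indexed loop and a table of similar formulas can be sketched as follows. The compound-growth example, the column letters and the function names are assumptions chosen for illustration only.

```python
# Sequential model: an indexed loop computing compound growth.
def balances_by_loop(start: float, rate: float, years: int) -> list[float]:
    balances = [start]
    for year in range(1, years + 1):
        balances.append(balances[year - 1] * (1 + rate))
    return balances

# Spreadsheet model: one formula per row, each referring to the row above.
# The relative reference (B1, B2, ...) changes per row; $D$1 (the rate) does not.
def balance_formulas(first_row: int, years: int) -> dict[str, str]:
    """Build the column-B formulas a user would get by filling down."""
    return {
        f"B{row}": f"=B{row - 1}*(1+$D$1)"
        for row in range(first_row + 1, first_row + years + 1)
    }

formulas = balance_formulas(first_row=1, years=3)
```

Each generated formula plays the role of one loop iteration: the loop index becomes the row number, and the reference to the previous row replaces `balances[year - 1]`.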

[edit] Shortcomings


While spreadsheets are a great step forward in quantitative modeling, they have deficiencies. At the level of overall user benefits, spreadsheets have several main shortcomings, especially concerning the unfriendliness of alpha-numeric cell addresses.

• Spreadsheets have significant reliability problems. Research studies estimate that roughly 94% of spreadsheets deployed in the field contain errors, and 5.2% of cells in unaudited spreadsheets contain errors.^[20]

Despite the high error risks often associated with spreadsheet authorship and use, specific steps can be taken to significantly enhance control and reliability by structurally reducing the likelihood of error occurrence at their source.^[21]

• The practical expressiveness of spreadsheets can be limited unless their modern features are used. Several factors contribute to this limitation. Implementing a complex model on a cell-at-a-time basis requires tedious attention to detail.

Authors have difficulty remembering the meanings of hundreds or thousands of cell addresses that appear in formulas.[citation needed]

These drawbacks are mitigated by the use of named variables for cell designations, and employing variables in formulas rather than cell locations and cell-by-cell manipulations. Graphs can be used to show instantly how results are changed by changes in parameter values. In fact, the spreadsheet can be made invisible except for a transparent user interface that requests pertinent input from the user, displays results requested by the user, creates reports, and has built-in error traps to prompt correct input.^[22]

• Similarly, formulas expressed in terms of cell addresses are hard to keep straight and hard to audit. Research shows that spreadsheet auditors who check numerical results and cell formulas find no more errors than auditors who only check numerical results.^[20] That is another reason to use named variables and formulas employing named variables.

• Collaboration in authoring spreadsheet formulas can be difficult when such collaboration occurs at the level of cells and cell addresses.

However, like programming languages, spreadsheets are capable of using aggregate cells with similar meaning and indexed variables with names that indicate meaning. Some spreadsheets have good collaboration features, and authoring at the level of individual cells and cell formulas is inadvisable because it creates obstacles to collaboration where many people cooperate on data entry and many people use the same spreadsheet.

• Productivity of spreadsheet modelers is reduced by the antiquated cell-level focus of spreadsheets that is seldom used today. That old and poor approach means that even conceptually simple changes in spreadsheets (such as changing starting or ending time or time grain, adding new members or a level of hierarchy to a dimension, or changing one conceptual formula that is represented as hundreds of cell formulas) often require large numbers of manual cell-level operations (such as inserting or deleting cells/rows/columns, editing and copying formulas, re-laying out worksheets). Each of these manual corrections increases the risk of introducing further mistakes. For these reasons, the use of named variables and formulas that use variable names is the norm today.
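The readability gain from named variables, mentioned in several of the points above, can be shown with a small sketch that rewrites a cell-address formula using a table of names. The names, the addresses and the `with_names` helper are hypothetical; spreadsheet products provide their own defined-name features, which this does not attempt to reproduce.

```python
import re

# Hypothetical defined names: cell address -> meaningful name.
names = {"B1": "unit_price", "B2": "quantity", "B3": "tax_rate"}

# Simplified A1-style reference without $ markers.
CELL_REF = re.compile(r"\b[A-Z]+[0-9]+\b")

def with_names(formula: str, defined_names: dict[str, str]) -> str:
    """Replace each cell address that has a defined name with that name."""
    return CELL_REF.sub(
        lambda match: defined_names.get(match.group(0), match.group(0)),
        formula,
    )

readable = with_names("=B1*B2*(1+B3)", names)
```

The rewritten formula reads `=unit_price*quantity*(1+tax_rate)`, which states its intent without requiring the reader to look up three cell addresses.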

Other problems associated with spreadsheets include:^[23][24]

• Some sources advocate the use of specialized software instead of spreadsheets for some applications (budgeting, statistics).^[25][26][27]

• Many spreadsheet software products, such as Microsoft Excel^[28] (versions prior to 2007) and OpenOffice.org Calc^[29] (versions prior to 2008), have a capacity limit of 65,536 rows by 256 columns. This can present a problem for people using very large datasets, and may result in lost data.

• Lack of auditing and revision control. This makes it difficult to determine who changed what and when. This can cause problems with regulatory compliance.

Lack of revision control greatly increases the risk of errors due to the inability to track, isolate and test changes made to a document.

• Lack of security. Generally, if one has permission to open a spreadsheet, one has permission to modify any part of it. This, combined with the lack of auditing above, can make it easy for someone to commit fraud.

• Because they are loosely structured, it is easy for someone to introduce an error, either accidentally or intentionally, by entering information in the wrong place or expressing dependencies among cells (such as in a formula) incorrectly.^[30][31]

• The result of a formula (for example "=A1*B1") applies only to a single cell (that is, the cell the formula is actually located in, in this case perhaps C1), even though it can "extract" data from many other cells, and even the current date and time. This means that to perform a similar calculation on an array of cells, an almost identical formula (but residing in its own "output" cell) must be repeated for each row of the "input" array. This differs from a "formula" in a conventional computer program, which would typically have one calculation that is then applied to all of the input in turn. With current spreadsheets, this forced repetition of near-identical formulas can have detrimental consequences from a quality assurance standpoint and is often the cause of spreadsheet errors. Some spreadsheets have array formulas to address this issue.

• Managing the sheer volume of spreadsheets that sometimes exists within an organization can become overwhelming without proper security and audit trails, and given the unintentional introduction of errors and the other items listed above.
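The repetition of near-identical formulas described in the list above suggests one simple audit: rewrite each formula so that row references become offsets from the formula's own row, then flag any row whose pattern differs from the majority. The sketch below is a minimal illustration under that assumption; the column contents and helper names are hypothetical, absolute references containing `$` are deliberately left untouched, and function names that end in digits would confuse the simplified pattern.

```python
import re
from collections import Counter

# Simplified relative A1-style reference (no $ markers).
CELL_REF = re.compile(r"([A-Z]+)([0-9]+)")

def relative_form(formula: str, row: int) -> str:
    """Rewrite each reference's row as an offset from the formula's own row."""
    def to_offset(match: re.Match) -> str:
        column, ref_row = match.group(1), int(match.group(2))
        return f"{column}[{ref_row - row:+d}]"
    return CELL_REF.sub(to_offset, formula)

def odd_formulas(column_formulas: dict[int, str]) -> list[int]:
    """Return rows whose formula differs from the most common pattern."""
    patterns = {row: relative_form(text, row) for row, text in column_formulas.items()}
    most_common, _ = Counter(patterns.values()).most_common(1)[0]
    return sorted(row for row, pattern in patterns.items() if pattern != most_common)

# Hypothetical column C, filled down by hand; row 3 points at the wrong row.
column_c = {1: "=A1*B1", 2: "=A2*B2", 3: "=A3*B2", 4: "=A4*B4"}
suspect_rows = odd_formulas(column_c)
```

Rows 1, 2 and 4 all reduce to the same relative pattern, so only row 3 is reported. A conventional program avoids the problem by stating the calculation once and applying it to every row.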

While there are built-in and third-party tools for desktop spreadsheet applications that address some of these shortcomings, awareness and use of these is generally low. A good example of this is that 55% of capital market professionals "don't know" how their spreadsheets are audited; only 6% invest in a third-party solution.^[32]

A database consists of an organized collection of data for one or more uses, typically in digital form. One way of classifying databases involves the type of their contents, for example: bibliographic, document-text, statistical. Digital databases are managed using database management systems, which store database contents, allowing data creation and maintenance, and search and other access.


• 1 Architecture

• 2 Database management systems

o 2.1 Components of DBMS

▪ 2.1.1 RDBMS components

▪ 2.1.2 ODBMS components

• 3 Types

o 3.1 Operational database

o 3.2 Data warehouse

o 3.3 Analytical database

o 3.4 Distributed database

o 3.5 End-user database

o 3.6 External database

o 3.7 Hypermedia databases

• 4 Models

o 4.1 Post-relational database models

o 4.2 Object database models

• 5 Storage structures

• 6 Indexing

• 7 Transactions

• 8 Replication

• 9 Security

o 9.1 Confidentiality

• 10 Locking

o 10.1 Granularity

o 10.2 Lock types

o 10.3 Isolation

o 10.4 Deadlocks

• 11 See also

• 12 References

• 13 Further reading

• 14 External links

Architecture

Database architecture consists of three levels: external, conceptual and internal. Clearly separating the three levels was a major feature of the relational database model that dominates 21st-century databases.^[1]

The external level defines how users understand the organization of the data. A single database can have any number of views at the external level. The internal level defines how the data is physically stored and processed by the computing system. Internal architecture is concerned with cost, performance, scalability and other operational matters. The conceptual level is a level of indirection between internal and external. It provides a common view of the database that is uncomplicated by details of how the data is stored or managed, and that can unify the various external views into a coherent whole.^[1]
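The three levels can be loosely illustrated with Python's standard `sqlite3` module: a table definition stands in for the conceptual level, a view for one external level, and an index for an internal-level decision. The `employee` table, the `directory` view and the sample rows are assumptions invented for this sketch, and SQLite itself does not formally enforce the three-level separation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the logical schema shared by all users.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee (name, dept, salary) VALUES (?, ?, ?)",
    [
        ("Ada", "Engineering", 120.0),
        ("Grace", "Engineering", 130.0),
        ("Edgar", "Research", 110.0),
    ],
)

# External level: a view for one group of users that hides salaries.
conn.execute("CREATE VIEW directory AS SELECT name, dept FROM employee")

# Internal level: a storage and performance decision; queries need not change.
conn.execute("CREATE INDEX employee_by_dept ON employee (dept)")

directory_rows = conn.execute(
    "SELECT name, dept FROM directory WHERE dept = ? ORDER BY name",
    ("Engineering",),
).fetchall()
```

Users of the `directory` view never see the `salary` column, and adding or dropping the index changes how the data is accessed without changing what any query returns.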
