
KNOWLEDGE MANAGEMENT – LOOKING AHEAD

5.3. KNOWLEDGE MANAGEMENT TOOLS AND TECHNIQUES

5.3.1. Knowledge capture and creation tools

(a) Content Creation Tools

It is predicted that content management systems (CMS) will become a “commodity” in the future. Many content management system projects fail owing to a lack of good implementation standards and a poor understanding of usability issues. Technology-only approaches will continue to generate unsuccessful projects; CMS should be handled in a strategic way. These failures provide a valuable source of learning. The move towards open standards would greatly assist the evolution of CMS, which is likely to proceed with the use of XML-based protocols for communicating with and between content management systems. Additional standards are needed for storing, structuring, and managing content.

Eventually, content, documents, records, and knowledge management will converge, which will be of greatest benefit to organizations. As yet, there is no merged platform to accommodate such a convergence.

Authoring tools are the most commonly used content creation tools and range from the general (e.g. word processing) to the more specialized (e.g. web page design software).

Annotation technologies enable short comments to be attached to specific sections of a text document, often by a number of different authors (e.g. by making use of the track changes feature in Word). This allows a “running commentary” to be built up and preserved.

Annotations may be public (visible to all who access and read the document) or private (visible to the author only).
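The public/private distinction can be sketched in a few lines of Python. This is only an illustration: the Annotation class, its field names, and the sample comments are assumptions made for the example, not part of any particular annotation product.

```python
from dataclasses import dataclass

@dataclass
class Annotation:
    author: str
    section: str          # identifier of the section being commented on (hypothetical)
    comment: str
    public: bool = True   # False means visible to the author only

def visible_annotations(annotations, reader):
    """Return the annotations that `reader` is allowed to see."""
    return [a for a in annotations if a.public or a.author == reader]

# Invented sample data for illustration
notes = [
    Annotation("ana", "2.1", "Needs a citation."),
    Annotation("ben", "2.1", "Check this figure later.", public=False),
    Annotation("ana", "3.4", "Reword for clarity.", public=False),
]

for reader in ("ana", "ben", "cho"):
    seen = visible_annotations(notes, reader)
    print(reader, [a.comment for a in seen])
```

A reader who wrote no annotations sees only the public comment, while each author also sees his or her own private notes.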

(b) Data Mining and Knowledge Discovery

The data mining and knowledge discovery processes automatically extract predictive information from large databases based on statistical analysis (typically, cluster analysis).

Using a combination of machine learning, statistical analysis, modeling techniques, and database technology, data mining detects hidden patterns and subtle relationships in data and infers rules that allow the prediction of future results. Raw data is analyzed in order to offer a model that attempts to explain the observed patterns. This model can then be used to predict future occurrences and to forecast expected outcomes.

A large number of inputs are required, usually over a significant period of time, and the types of model produced range from “easy” to “almost impossible” to understand. Decision trees are an example of an easy-to-understand model. Regression analyses are moderately easy to understand, and neural networks remain black boxes. The major drawback of black box models is that it becomes very difficult to hypothesize about causal relationships.

Variables may be correlated, but this relationship may not have any meaning or usefulness. For example, a major bank found a correlation between the state an applicant lived in and a higher percentage of defaults on loans given out. This finding should not be the basis for a policy that would automatically reject any applicants from that state. Reality checks are always needed with statistics before any conclusions can be drawn.
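One simple form of reality check is to look at how many observations stand behind an apparent pattern. The sketch below uses invented loan records (the states "A" and "B", the outcomes, and the threshold of five loans are all assumptions made for illustration) and reports each default rate together with its sample size.

```python
from collections import defaultdict

# Invented records: (state, defaulted)
loans = [
    ("A", True), ("A", False), ("A", True),
    ("B", False), ("B", False), ("B", True), ("B", False),
    ("B", False), ("B", False), ("B", False), ("B", False),
]

def default_rates(records):
    """Map each state to (default rate, number of loans)."""
    counts = defaultdict(lambda: [0, 0])   # state -> [defaults, total]
    for state, defaulted in records:
        counts[state][0] += int(defaulted)
        counts[state][1] += 1
    return {s: (d / n, n) for s, (d, n) in counts.items()}

rates = default_rates(loans)
for state, (rate, n) in sorted(rates.items()):
    # The threshold of 5 is arbitrary and only serves the illustration.
    flag = "  <- small sample, verify before acting" if n < 5 else ""
    print(f"state {state}: default rate {rate:.0%} over {n} loans{flag}")
```

A high rate computed from a handful of loans is exactly the kind of finding that a person should examine before it influences policy.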

Typical applications of data mining and knowledge discovery systems include market segmentation, customer profiling, fraud detection, evaluation of retail promotions, credit risk analysis, and market basket analysis. However, there are usually a few gems to be mined with data mining applications. These are often unexpected correlations that upon further study yield some useful (and often actionable) insights into what is occurring.
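Market basket analysis, one of the applications listed above, is commonly described in terms of the support and confidence of item combinations. The following minimal sketch computes both measures on a few invented baskets; the items and the function names are assumptions made for this example.

```python
# Invented shopping baskets for illustration
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    return sum(1 for b in baskets if itemset <= b) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Of the baskets containing `antecedent`, the fraction that also contain `consequent`."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

print("support(bread, milk):", support({"bread", "milk"}, baskets))
print("confidence(bread -> milk):", round(confidence({"bread"}, {"milk"}, baskets), 2))
```

Real data mining suites search over many candidate itemsets; this sketch only evaluates one rule that the analyst has already chosen.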

Data mining tools that are currently in use include:

• Statistical analysis tools (e.g. SAS).

• Data mining suites (e.g. Enterprise Miner).

• Consulting/outsourcing tools such as EDS, IBM, and Epsilon. (Note that these are models, not just software).

• Data visualization software that coherently presents a large amount of information in a small space. These tools make use of human information processing capabilities (your eyes) to detect patterns, for example, in a virtual reality or simulation environment where you can “walk around the data points”.

It is also possible to apply this technique and use these tools to mine content other than data, namely through text mining, thematic analysis, and web mining, to look at what content is accessed, how often, and for how long (e.g. number of hits), which is very helpful in content management.

Similarly, skill mining or expertise profiling can be used to detect patterns in the online curricula vitae of organizational members. Expertise location systems can be automatically created based on the content that has been mined. Commercial software systems can also be used to mine e-mail data in order to determine who is answering what types of queries or themes. Organizational experts and expertise can be detected by looking at the patterns of questions and answers contained within the e-mails. The same caveat applies to all of these data mining applications: a human being is always needed in the loop in order to carry out “reality checks” (i.e. to verify and validate that the patterns do indeed exist and that they have been interpreted in a useful and valuable manner).
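The idea of finding out who answers which themes can be sketched as a simple count. The example assumes that each e-mail answer has already been reduced to a (responder, theme) pair; that preprocessing step, the names, and the themes are all hypothetical, and commercial systems are far more elaborate.

```python
from collections import Counter, defaultdict

# Invented (responder, theme) pairs extracted from answered e-mails
answers = [
    ("ana", "databases"), ("ben", "networking"), ("ana", "databases"),
    ("cho", "databases"), ("ben", "networking"), ("ben", "databases"),
]

def expertise_profile(answers):
    """Count, per theme, how many answers each person has given."""
    profile = defaultdict(Counter)
    for responder, theme in answers:
        profile[theme][responder] += 1
    return profile

def likely_expert(profile, theme):
    """Return (person, answer count) for the most frequent responder, or None."""
    ranked = profile.get(theme, Counter()).most_common(1)
    return ranked[0] if ranked else None

profile = expertise_profile(answers)
print(likely_expert(profile, "databases"))
print(likely_expert(profile, "security"))
```

The output is only a candidate list: answering often is not the same as answering well, which is why the human reality check remains necessary.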

(c) Blogs

A blog is a slang name for a web log. For the uninitiated, a web log is a popular and fairly personal content form on the Internet. A person’s web log is much like an open diary.

It chronicles what a person wants to share with the world on an almost daily basis. A blog is a frequently updated, publicly accessible journal. Although the “blogosphere” started off as a medium for mostly personal musings, it has evolved into a tool that offers some of the most insightful information on the web. Furthermore, blogs are becoming much more common, as businesses, politicians, policy makers, and even libraries and library associations have begun to blog as a way of communicating with their patrons and constituents.

Several librarians publish blogs that offer a wealth of information about social software and its uses. SNTReport.com focuses on the social software industry and how social software tools are being used to help people collaborate. Blogs not only offer a new way to communicate with customers, but they have internal uses as well. For example, large organizations can use a well-formed blog to exchange ideas and information about web development projects, training initiatives, or research issues. These questions and answers can be cross-indexed and archived, which helps build a knowledge network among the participating members. Most importantly, the price of setting up a well-formed, secure blog and leveraging it into a knowledge and content management tool is a pittance when compared to the cost of other, proprietary solutions.

At present, the majority of blogs are published exclusively in text. The next generation of blogs, however, will implement audio and video elements, bringing a sophisticated multimedia blend to the medium.

Blogs are collections of articles or stories arranged in reverse chronology and are generally updated more frequently than regular web pages. Just like any other information on the net, there is no guarantee of authority, accuracy, or lack of bias. In fact, personal blogs are frequently biased and can be good sources of opinion and information from the “man on the street”. Because blogs can be updated on the fly, they frequently have access to unfiltered information faster from war zones and sites of natural disasters than the mainstream media outlets. Blogs are also good sources of unfiltered information on either faulty or very useful products.

In the beginning, blogs appeared in search results alongside regular web pages, since blogs are technologically no different from other web pages (that is, they are HTML, XML, JavaScript, etc.; it is their format, not their coding, that is different). Spiders and bots (or web crawlers, knowledge robots) automatically search for information online and collect posts (i.e. messages that are submitted to a computerized messaging system) the same way as they collect other online information. Search engines that place greater value on sites that are recently and frequently updated and are highly linked tend to rank blog posts very highly. Because the barrier to publication is so low in blogs, arguably much lower than that for standard web pages, these high rankings were introducing a lot of noise into online searches. The odds are that if you have searched on a controversial topic in the past year you have run across several archived blog posts. Recently, most major search engines have altered their algorithms to push blogs down in the search results. Engines that only return two results from any one site use this feature to limit the impact of blogs on the search results.
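The "two results from any one site" rule can be illustrated with a short filter over an already ranked list. This is a sketch of the general idea only, not the algorithm of any actual search engine; the function name and the example.com and example.org addresses are placeholders.

```python
from urllib.parse import urlparse

def limit_per_site(ranked_urls, max_per_site=2):
    """Keep the ranking order but drop results beyond the per-site limit."""
    kept, seen = [], {}
    for url in ranked_urls:
        site = urlparse(url).netloc          # host name identifies the site
        if seen.get(site, 0) < max_per_site:
            kept.append(url)
            seen[site] = seen.get(site, 0) + 1
    return kept

# Invented ranking in which one blog dominates the top results
ranked = [
    "https://blog.example.com/a",
    "https://blog.example.com/b",
    "https://news.example.org/story",
    "https://blog.example.com/c",
    "https://news.example.org/update",
]
print(limit_per_site(ranked))
```

The third post from the same blog is dropped, so a frequently updated site cannot crowd out every other source.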

Blog searching breaks down into at least two categories: (1) information from within blogs/across blogs or (2) addresses of feeds from blogs so that you may subscribe in your aggregator (i.e. a piece of software or a remotely hosted service that periodically reads a set of news sources, such as blogs, identifies what is new, and displays the new items on a single page). Feeds and blogs are two different concepts, but they are closely linked because most blogs have feeds and many feeds are generated by blogs. Just as in other web search tools, there are search engines and directories. At this time, blog search engines are where general search engines were before the Google Age: there are many competing smaller products, but no product dominates the scene.
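The core job of an aggregator, reading a feed and identifying what is new, can be sketched as follows. The two feed snapshots are invented, heavily simplified RSS-style documents held in memory (a real aggregator would fetch them over the network), and the Aggregator class is a hypothetical name used only for this example.

```python
import xml.etree.ElementTree as ET

# Two invented snapshots of the same feed, taken at different times
FEED_V1 = """<rss version="2.0"><channel><title>Example blog</title>
<item><title>First post</title><guid>post-1</guid></item>
</channel></rss>"""

FEED_V2 = """<rss version="2.0"><channel><title>Example blog</title>
<item><title>Second post</title><guid>post-2</guid></item>
<item><title>First post</title><guid>post-1</guid></item>
</channel></rss>"""

class Aggregator:
    def __init__(self):
        self.seen = set()   # identifiers of items already shown to the reader

    def poll(self, feed_xml):
        """Return titles of items not seen on previous polls."""
        new_titles = []
        for item in ET.fromstring(feed_xml).iter("item"):
            guid = item.findtext("guid")
            if guid not in self.seen:
                self.seen.add(guid)
                new_titles.append(item.findtext("title"))
        return new_titles

agg = Aggregator()
first = agg.poll(FEED_V1)
second = agg.poll(FEED_V2)
print(first, second)
```

On the second poll only the post that was not present before is reported, which is the behavior a subscriber expects from a news reader.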


(d) Content management tools

Content management refers to the management of valuable content throughout the useful lifespan of the content. Content lifespan will typically begin with content creation, handle multiple changes and updates, merging, summarization, and other repackaging, and will typically end with archiving. Metadata (information about the content) is used to better manage content throughout its useful lifespan. Metadata includes such information as source/author, keywords to describe content, date created, date changed, quality, best purposes, annotations by those who have made use of it, and an expiry or “best before” date where applicable. It is also useful to include attributes such as storage medium, location, and whether or not it exists in a number of alternative forms (e.g. different languages). XML is increasingly being used to tag knowledge content, and taxonomies serve to better organize and classify content for easier future retrieval and use.
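A metadata record of the kind described above might look like the following sketch. The ContentMetadata class and its field names are assumptions chosen to mirror the attributes listed in the text; they do not follow any specific metadata standard.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class ContentMetadata:
    author: str
    keywords: list
    created: date
    changed: date
    language: str = "en"
    expires: Optional[date] = None          # the "best before" date, if any
    annotations: list = field(default_factory=list)

    def is_expired(self, today):
        """True once the content has passed its expiry date."""
        return self.expires is not None and today > self.expires

# Invented record for illustration
report = ContentMetadata(
    author="ana",
    keywords=["pricing", "retail"],
    created=date(2023, 1, 10),
    changed=date(2023, 3, 2),
    expires=date(2024, 1, 10),
)
print(report.is_expired(date(2023, 6, 1)), report.is_expired(date(2024, 6, 1)))
```

An expiry check like this is one way a content management tool could flag items that are due for review or archiving.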

XML (eXtensible Markup Language) gives you the ability to structure and add relevance to chunks of information (that is why many CM solutions use XML), and in theory to exchange data more easily between applications (e.g. with your suppliers, customers, and partners). However, you may all use the same words (tags), but if each of you defines and applies them differently, then you all remain in the land of Babel. Commonly agreed schemas are essential. Keep tabs on developments in the schemas and metadata standards in your field.
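The need for an agreed vocabulary can be illustrated with a small check of two XML fragments against a shared list of element names. The tags, the AGREED_FIELDS set, and the sample documents are invented for this sketch, and a simple tag comparison is no substitute for validation against a real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names that all partners have agreed to use
AGREED_FIELDS = {"title", "author", "created"}

def missing_fields(xml_text):
    """Return the agreed element names that the document does not contain."""
    root = ET.fromstring(xml_text)
    present = {child.tag for child in root}
    return sorted(AGREED_FIELDS - present)

doc_a = "<item><title>Price list</title><author>ana</author><created>2023-01-10</created></item>"
doc_b = "<item><title>Price list</title><writer>ben</writer><date>2023-01-10</date></item>"

print(missing_fields(doc_a))
print(missing_fields(doc_b))
```

The second document carries the same information under different tags, so an application expecting the agreed names cannot find the author or the creation date.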

Taxonomies are hierarchical information trees for classifying information, analogous to the library subject catalog. They can help overcome differences of language usage in different parts of an organization and even clarify the use of different languages. Traditionally, taxonomy development is manually intensive in that it is created and maintained by people.
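A taxonomy can be represented as a nested tree, with each term located by the path from the root. The small tree below reuses topic names from this section purely as an illustration; the structure and the find_path function are assumptions, not an established classification scheme.

```python
# Illustrative taxonomy: each key is a term, each value holds its narrower terms
TAXONOMY = {
    "Knowledge management": {
        "Capture and creation": {"Authoring tools": {}, "Data mining": {}, "Blogs": {}},
        "Content management": {"Metadata": {}, "Taxonomies": {}},
    }
}

def find_path(tree, term, trail=()):
    """Return the list of terms from the root down to `term`, or None if absent."""
    for name, children in tree.items():
        here = trail + (name,)
        if name == term:
            return list(here)
        found = find_path(children, term, here)
        if found:
            return found
    return None

print(" > ".join(find_path(TAXONOMY, "Taxonomies")))
```

Storing the full path with each item, much like a shelf mark in a library catalog, lets a search for a broad term also retrieve everything filed under its narrower terms.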

The growing problem of information overload means that taxonomies are receiving significant attention. But how do you cope with the evolution of terms whose meanings seem to change from one year to the next? Automatic (or semiautomatic) classification of information objects uses software such as natural language analyzers, text summarizers, and other technology to understand some of the meaning (the concepts) behind blocks of text, and to tag and index it appropriately to aid subsequent retrieval. Automated classifiers find patterns in textual content, produce categories, and classify the content using these categories.
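A very reduced form of automatic classification is to score a text against a list of terms per category. The sketch below is far simpler than the natural language tools described above: the categories and term lists are hand-made assumptions, whereas real classifiers derive their categories from patterns in the content itself.

```python
import re

# Hand-made term lists, invented for illustration
CATEGORY_TERMS = {
    "data mining": {"cluster", "regression", "patterns", "prediction"},
    "blogs": {"blog", "post", "feed", "aggregator"},
}

def classify(text, category_terms=CATEGORY_TERMS):
    """Return the category sharing the most terms with the text, or None."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {cat: len(words & terms) for cat, terms in category_terms.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(classify("Subscribe to the blog feed in your aggregator"))
print(classify("Regression finds patterns in raw data"))
print(classify("Quarterly budget report"))
```

Fixed term lists also show the weakness raised in the text: when the meaning of a term shifts from one year to the next, the lists must be revised or the classification drifts.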

Personal capital is a term coined to explain a divergence from the traditional notion of capital, which is an asset “owned” by an organization. In fact, the future of KM will blur the boundaries between the individual, the group or community, and the organization. KM will become a pervasive part of how we conduct our everyday business lives. Personalized KM (PKM) will gain increasing importance given the ever-increasing momentum of information overload with which we must deal. In other words, some of the key principles, best practices, and business processes of KM that have to date been focused at the organizational level will filter down to be used by individuals managing their own personal capital.


PKM and traditional knowledge management differ depending on whether an organizational or personal perspective is adopted. Tools for personal information management are impressive and, if you think about e-mail and portals, are already widely used. Newer tools, such as blogs, news aggregators, and instant messaging, represent a new toolset for PKM.

Personal portals, which were once known as “enterprise” portals, are now focused on the needs of the individual: all of a person’s information and application needs harmoniously brought together into a preferred arrangement on the desktop. This is mass customization in front of your eyes. Again, the aims are laudable, but reality and theory are often miles apart. PKM brings many of the key principles of KM to bear on the personal productivity and specific work requirements of a given knowledge worker. Definitions of PKM revolve around a set of core issues: managing and supporting personal knowledge and information so that it is accessible, meaningful, and valuable to the individual; maintaining networks, contacts, and communities; making life easier and more enjoyable; and exploiting personal capital (Higgison, 2004). On an information management level, PKM involves filtering and making sense of information and organizing paper and digital archives, e-mails, and bookmark collections.

In document DBA 1735 Knowledge Management (Page 168-173)