
Best Practices for Architecting Taxonomy and Metadata in an Open Source Environment


Academic year: 2021


(1)

Best Practices for

Architecting Taxonomy and Metadata

in an Open Source Environment

Don Miller

Vice President of Sales Concept Searching

[email protected]

Twitter @conceptsearch

Zach Wahl

President and Chief Executive Officer Enterprise Knowledge

[email protected]

(2)

Expert Speakers

Zach Wahl, President and Chief Executive Officer at Enterprise Knowledge, has over 15 years’ experience leading programs in knowledge and information management, working with more than 200 public and private organizations to successfully design and implement information management systems. He has developed his own taxonomy design methodology, has authored courses on knowledge management, and is a frequent speaker and trainer.

Don Miller, Vice President of Sales at Concept Searching, has over 20 years’ experience in knowledge management. He is a frequent speaker on records management and information architecture challenges and solutions, and has been a guest speaker at Taxonomy Boot Camp and numerous SharePoint events about information organization and records management.

(3)

Agenda

• Enterprise Knowledge

• Introduction to Business Taxonomy for Open Source

• Open Source Challenges and Considerations

• Design Best Practices

• Taxonomy in Action

• Concept Searching

• Unique Approach

• Considerations

• Use Case

• Demonstration

• Next Steps

(4)

• Company founded in 2002

• Product launched in 2003

• Focus on management of structured and unstructured information

• Technology Platform

• Delivered as a web service

• Automatic concept identification, content tagging, auto-classification, taxonomy management

• Only statistical vendor that can extract conceptual metadata

• 2009, 2010, 2011, 2012, 2013 ‘100 Companies that Matter in KM’ (KMWorld) and Trend Setting product of 2009, 2010, 2011, 2012, 2013

• Authority to Operate enterprise wide US Air Force and enterprise wide NETCON US Army

• Locations: US, UK, and South Africa

• Client base: Fortune 500/1000 organizations

• Managed Partner under Microsoft global ISV Program - ‘go to partner’ for Microsoft for auto-classification and taxonomy management

• Smart Content Framework for Information Governance comprising

• Six Building Blocks for success

The Global Leader in

Managed Metadata Solutions

(5)

Enterprise Knowledge

Dedicated to Making Your Information Work for You

• Principals bring over 15 years of taxonomy design consulting with support for over 200 organizations globally.

• www.enterprise-knowledge.com

• Twitter: @EKConsulting

• Blog: http://www.enterprise-knowledge.com/category/blog/

• Core services include:

• Knowledge Management and Taxonomy

• Enterprise Search

• Application Development

(6)

Taxonomy Definitions

tax·on·o·my (tăk-sŏn′ə-mē)

n. pl. tax·on·o·mies

1. The classification of organisms in an ordered system that indicates natural relationships.

2. The science, laws, or principles of classification; systematics.

3. Division into ordered groups or categories: "Scholars have been laboring to develop a taxonomy of young killers" (Aric Press).

Zach’s Definition – Controlled vocabularies used to describe or characterize explicit concepts of information, for purposes of capture, management, and presentation.

(7)

Taxonomy and Metadata

• Provide structure to unstructured information

• Join or relate multiple disparate sources of information

• Provide multiple avenues to find and discover information

• Enable findability

(8)

Metadata “Card”

Fields: Title, Author, Doc Type, Topic, Department

Example values: Brochures & Manuals, Memos, News, Policies & Procedures, Presentations, Reports, Employee Services, Compensation, Retirement, Insurance, Education & Training, Manufacturing, Safety, Quality, Free Text Entry

(9)

Taxonomy and Metadata

Content~Information~Data~Files

Metadata Fields

Metadata Values/Tags

Taxonomies (Flat or Hierarchical)~ Controlled Vocabularies
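The layering above (content, metadata fields, tag values, controlled vocabularies) can be illustrated with a minimal Python sketch. It is not taken from the presentation: the `ContentItem` class, the `invalid_tags` helper, and the assignment of example values to the `doc_type` and `topic` fields are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabularies, keyed by metadata field name.
# Which value belongs to which field is an assumption for this sketch.
VOCABULARIES = {
    "doc_type": {"Memos", "News", "Reports", "Presentations"},
    "topic": {"Compensation", "Retirement", "Insurance"},
}


@dataclass
class ContentItem:
    title: str  # free-text field, deliberately not validated
    tags: dict[str, set[str]] = field(default_factory=dict)


def invalid_tags(item: ContentItem) -> dict[str, set[str]]:
    """Return, per metadata field, the tags missing from that field's vocabulary."""
    problems = {}
    for field_name, values in item.tags.items():
        allowed = VOCABULARIES.get(field_name, set())
        bad = values - allowed
        if bad:
            problems[field_name] = bad
    return problems


memo = ContentItem(
    "401(k) changes",
    {"doc_type": {"Memos"}, "topic": {"Retirement", "Pensions"}},
)
print(invalid_tags(memo))  # {'topic': {'Pensions'}}
```

Keeping the vocabularies separate from the content items means the same list of allowed values can back a tagging form, a validation check, and a search facet.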

(10)

Traditional v. Business Taxonomies

                     Traditional Taxonomy         Business Taxonomy

Purpose              Categorization               Findability

Designed By          Scientists/Librarians        The Business

Managed By           Scientists/Librarians        The Business

Used By              Scientists/Librarians        Everyone

Complexity           Deep, Wide, Detailed         Flat, Simple, Deconstructed

Key Characteristics  Mutually Exclusive,          Usable, Intuitive, Natural
                     Collectively Exhaustive

(11)

The Business Taxonomy

• Usable – Easy to adopt and utilize for any skill level

• Relatively flat (2-3 levels)

• “Easy” to navigate

• Intuitive – Does not require training and reflects the way the user thinks

(12)

The Business Taxonomy

• Tends to be less rigid and constrained

• Influenced by “traditional” usability design

• Driven by the content and needs you have today

• Leverages multiple categorization approaches (via multiple metadata fields and multiple taxonomies)

• Accepts imperfect categorization

(13)

Open Source Challenges and Considerations

• Open Source is “free” and “easy”

• But taxonomy isn’t…

• There are multiple ways to use taxonomy

• Menus, Search, Tag Clouds, Page Tags

• Taxonomy design is not enough; you need to plan for taxonomy implementation and exposure

• Open Source tools like Drupal favor “flat” taxonomies

• Faceting is easy to enable but requires diligent tagging and oversight

(14)

Taxonomy Design for Open Source – Best Practices

• Define taxonomy purpose, audience, and use cases upfront. Design before you build.

• Practice usability design best practices (limit depth and breadth, use plain language, etc.). Flat lists work best in Open Source content management tools.

• Leverage a primary category/topic taxonomy with supporting metadata fields. For instance, in Drupal, use multiple Lists with Views to enable faceting.

• Design for your end users and publishers.

• Employ analytics and support iterative design.

• Plan for the long-term – ensure governance plans are in place before content migration and rollout.
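The idea of a primary topic taxonomy with supporting metadata fields driving faceted navigation can be sketched in a few lines of Python. This is an illustration only, not Drupal or Views code: the sample items, the field names, and the `facet_counts` and `refine` helpers are hypothetical.

```python
from collections import Counter

# Hypothetical tagged content: one primary topic plus supporting metadata fields.
ITEMS = [
    {"title": "Benefits overview", "topic": "Insurance",
     "doc_type": "Presentations", "department": "HR"},
    {"title": "Pension FAQ", "topic": "Retirement",
     "doc_type": "Memos", "department": "HR"},
    {"title": "Plant safety memo", "topic": "Safety",
     "doc_type": "Memos", "department": "Manufacturing"},
]

FACET_FIELDS = ("topic", "doc_type", "department")


def facet_counts(items, fields=FACET_FIELDS):
    """Count how many items carry each value, per facet field."""
    return {f: Counter(i[f] for i in items if i.get(f)) for f in fields}


def refine(items, **selected):
    """Keep only the items matching every selected facet value."""
    return [i for i in items if all(i.get(f) == v for f, v in selected.items())]


memos = refine(ITEMS, doc_type="Memos")
counts = facet_counts(memos)
print([i["title"] for i in memos])
print(dict(counts["department"]))
```

Because each facet is just a flat list of values on a single field, an untagged or mistagged item silently drops out of the counts, which is why faceting depends on diligent tagging and oversight.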

(15)
(16)
(17)
(18)
(19)

• Concept Searching’s unique statistical concept identification underpins all technologies

• Multi-word term suggestion is more valuable than single-term suggestion

Concept Searching has a unique approach to ensure success

• conceptClassifier generates conceptual metadata by extracting multi-word terms, so that ‘triple heart bypass’ is identified as one concept rather than as three separate keywords

• Metadata can be used by any search engine index or any application/process that uses metadata.

Concept Searching provides Automatic Concept Term Extraction

Single keywords are ambiguous: Triple (Baseball, Three), Heart (Organ, Center), Bypass (Highway, Avoid)

Unique Approach
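To make the contrast between single keywords and multi-word terms concrete, here is a naive Python sketch that counts repeated two- and three-word phrases. It is not Concept Searching's statistical method, which the slides do not describe; the stopword list, the thresholds, and the `multiword_terms` function are assumptions chosen for illustration.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "and", "to", "for", "after", "was", "is", "in"}


def multiword_terms(text, max_len=3, min_count=2):
    """Return phrases of 2..max_len words that repeat and contain no stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for n in range(2, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if not any(w in STOPWORDS for w in gram):
                counts[" ".join(gram)] += 1
    return [term for term, c in counts.most_common() if c >= min_count]


TEXT = (
    "The patient was scheduled for a triple heart bypass. "
    "Recovery after a triple heart bypass is slow."
)
terms = multiword_terms(TEXT)
print(terms)
```

On this sample the phrase ‘triple heart bypass’ is returned as a candidate term, while the individual words ‘triple’, ‘heart’, and ‘bypass’ are never emitted on their own. Sentence boundaries are ignored for simplicity.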

(20)

Metadata-driven application and enforcement of policies - conceptClassifier has been deployed since 2010 to automatically generate metadata and use that metadata to apply and enforce policies. Many clients are using the platform to support their information governance strategy.

Proven, mature functionality out of the box - The platform has been deployed at numerous sites and in applications across the enterprise, including MOSS, SharePoint 2010 and 2013, Solr, Stellent, Documentum, SQL, Oracle, file shares, and Exchange via SharePoint.

Smart Content Framework™

The whole is greater than the sum of its parts

(21)

Open Source Considerations

“Given enough eyeballs, all bugs are shallow.”

Linus's Law, formulated by Eric S. Raymond and named for Linus Torvalds, creator of Linux

• Security

• Quality

• Customizability

• Freedom (avoid vendor lock-in)

• Interoperability

• Auditability

• Support

• Cost

• Try Before You Buy

Any difference if you are purchasing ‘proprietary’ software? Not much!

(22)

Open Source or Proprietary – OK By Us

• Concept Searching Technology Platform

• conceptSearch

• conceptClassifier

• conceptTaxonomyManager

• conceptSQL

• conceptTaxonomyWorkflow

• conceptClassifier Technology Platform

• Compound Term Processing Engine

• Licensed for concept extraction only

• conceptClassifier

• conceptTaxonomyManager

• conceptTaxonomyWorkflow

(23)

Situation

• Company is the premier global provider of fee based market intelligence, advisory services, and events for the information technology, telecommunications and consumer technology markets

• Seeking a solution to enhance site visitors’ search experience

• Potential loss of revenues

Challenge

• Complex taxonomy requirements

• Clients were unable to identify the relevant information they were seeking

Solution

• conceptTaxonomyManager and conceptClassifier

• Solr

• Integrated in-house

Benefits

• Improved search results

• Increased accuracy and relevant retrieval of information for

“Automation is great, but still needs a human eye to gain that last bit of ground. Anyway, it's a great story and I'm still very happy with Concept Searching and the flexibility it gives us.”

Director, Enterprise Solutions Planning

Use Case

Smart Content Framework™ Building Blocks - Metadata, Insight

(24)
(25)

What’s the End Result?

• Technology from Concept Searching complements Enterprise Knowledge’s strategic and tactical planning experience and expertise in architecting solutions that improve business processes.

• Utilizing Concept Searching’s Smart Content Framework™ and intelligent metadata-enabled solutions, this partnership addresses key challenges in enterprise search, records management, data privacy, migration, and content management in secure and complex environments.

For a comprehensive demo of the combined solution and discussion of expected ROI, please contact Don Miller at Concept Searching or

(26)

Thank You

Don Miller

Vice President of Sales Concept Searching

[email protected]

Twitter @conceptsearch

Zach Wahl

President and Chief Executive Officer Enterprise Knowledge

[email protected]

Twitter @EKConsulting
