
Thesaurus and Classification Scheme : a study of the compatibility of the principles for construction of thesaurus and classification scheme


Academic year: 2020


SEMINAR ON THESAURUS (1975). Paper AF

THESAURUS AND CLASSIFICATION SCHEME : A STUDY OF THE COMPATIBILITY OF THE PRINCIPLES FOR CONSTRUCTION OF THESAURUS AND CLASSIFICATION SCHEME

M A GOPINATH, Documentation Research and Training Centre, Bangalore

K N PRASAD, Central Machine Tool Institute, Bangalore

The performance of an Information Retrieval System can be improved by the use of controlled vocabularies, such as classification schemes, subject authority lists and thesauri. Recent trends in the design of different types of controlled vocabularies have developed principles and rules exclusively for each one of these devices. While it is admitted that there are certain essential differences among these types of controlled vocabularies, there are also certain features which emphasize basic similarity, interconvertibility and compatibility among them. This paper discusses the compatibility of the guiding principles provided by the ISO for the construction of a thesaurus and the principles available in the General Theory of Library Classification. It is found that there are several initial steps which are common to the construction of both a thesaurus and a classification scheme. The thesaurus usually confines itself to two planes of work, the idea plane and the verbal plane, whereas the classification system spreads into three planes of work: the idea plane, the verbal plane and the notational plane. The verbal plane of the thesaurus actually takes some of the roles of the notational plane of a classification scheme. It is found that the thesaurus, the classification scheme and the subject heading list can coexist and be used harmoniously in an integrated library and information system.

Abbreviations used

CCC Classified Catalogue Code with additional rules for Dictionary Catalogue Code. Ed 5.

0 INTRODUCTION

Classification schemes, subject heading lists, and thesauri are three general types of tools designed to control the vocabulary of a given information storage and retrieval system. However, in recent years the trends in information organisation have laid emphasis on the value of a thesaurus as a tool to improve the efficiency of retrieval in an information system. According to Kent, a thesaurus is "a compilation of terms of a given information retrieval system's vocabulary, arranged in some meaningful form and which provides information relating to each term that will enable the user of the information file to predict the relevance of responses to questions when this particular vocabulary control mechanism is used" (4). Thus a thesaurus acts as a device which helps users to formulate their questions precisely. It also acts as an aid to the indexer in assigning the preferred descriptor to the subjects of documents.

The function of classification schemes is more or less similar to that of a thesaurus. Classification schemes organize concepts in a logical, systematic manner and indicate the interrelationships between concepts. They reflect the equivalence, hierarchical and non-hierarchical relationships that are incident among the concepts. However, generally the functions of a classification system are confined to a library atmosphere. A classification system acts as a tool for intellectual organisation of documents (on the shelves of a library) and their surrogates in a catalogue.

The essential unity of the functions of a thesaurus, subject heading list, and classification scheme is succinctly indicated by John H Schneider (8) in the following figure.

[Figure: vocabulary control devices, from left to right: Uncontrolled vocabularies; Alphabetical subject heading authority lists; Three-level thesauri; Word trees; Multi-level hierarchical classification]

As we move from left to right in the figure we find that control devices have an increasing degree of organisation and an increasing delineation of relationships; and this suggests that the response of an information retrieval system can be progressively improved by using systems which are well structured.

Thus, it may be seen that there can be several common principles that can be used for the design of a thesaurus and that of a scheme for classification. It may also be noted that many good thesauri, such as the Thesaurofacet (2), have built into them the features of a faceted classification. This paper attempts to study some of the common features of these guiding principles.

1 GUIDELINES FOR THE CONSTRUCTION OF A THESAURUS

A fairly good number of documents have been published to aid the construction of a thesaurus. The more important of them include UNESCO's Guidelines for the establishment and development of monolingual thesauri (10), and ISO 2788 Guidelines for the establishment and development of monolingual thesauri (3). In addition to these, several treatises have been written on the thesaurus. They include Soergel's Indexing languages and thesauri: Construction and maintenance (9) and Aitchison and Gilchrist's Thesaurus construction: A practical manual (1). This paper considers mainly the guidelines provided in the ISO document (3), since it covers more comprehensively the rules that are taken care of in the construction of a thesaurus.


2 GUIDELINES FOR THE DESIGN OF A CLASSIFICATION SCHEME

The principles for the design of a scheme for classification have been explicitly stated in the form of postulates, canons, and working principles. These could provide helpful guidelines for design and development in library classification at all levels, act as a standard to conform to and for evaluation of any existing practice or new development, and serve as a system of control at each stage of work. Most of these guidelines occur in Ranganathan's Prolegomena to library classification (7). Some rules for rendering of terms are given in his Classified Catalogue Code (6). The principles for classification and subject headings enumerated in these documents are taken as the principles to be compared with those of the thesaurus.

3 COMPARATIVE STUDY

The comparative study of these two, i.e., the thesaurus and the general theory of library classification, is done in the following manner. First, the rule or rules pertaining to a particular feature of the thesaurus are presented. This is followed by the postulates, canons, principles or rules in the general theory of library classification and/or subject headings. These two are then followed by an annotation commenting on the relative likeness of the two prescriptions.

4 GENERAL FEATURES

41 A thesaurus, as a structured subset of natural language, describes the subject content of documents, objects or collections of data.

Library classification is, in essence, subject classification. Alphabetical arrangement scatters coordinate and subordinate subjects. The arrangement should instead be one that throws the subjects in the sequence of their mutual filiation.

The main core of library classification is the classification of subjects. We should call this core 'subject classification' (Prol-CT 1).


The difference between a thesaurus and a library classification is highlighted in their respective definitions. A classification scheme adds one more dimension to the structuring of the subjects of documents by providing notation. The notational language is intended for the communication of the ranks of the subjects. This ranking is done in the idea plane with the help of several guiding principles. The notational system avoids some of the pitfalls of natural languages, such as homonyms and synonyms. But, in turn, the notational language has to contend with the faster pace of the growth of knowledge and the inevitable rigidity of notational systems. This it has to do while retaining as far as possible brevity, simplicity, familiarity/universality, and mnemonic features. These four qualities are necessary in the interest of the overall economy of the use of the notational language. Thus the theory of library classification has to provide for a dynamic flexibility in the structure of the notational language. This is not a problem in thesaurus construction. But all other features which are incident in thesaurus construction and updating occur in the idea and the verbal planes of the classification system.

42 A particular thesaurus should accurately reflect the information content of the body of documents or other items in a collection to which the thesaurus applies.

The main core of library classification is the classification of subjects (Prol-CT 1).

The canon of coextensiveness (Prol-7H 1) implies the representation in a class number of the measure of incidence of each of the relevant characteristics of the subject embodied in the document classified.

Here, both the thesaurus and the classification system should be able to reflect the information content of a document or any other source of information precisely. Usually, classification systems are used for pre-coordinate representations, and the thesaurus for post-coordinate representations, since a thesaurus usually provides a more flexible presentation than a classification system. This is because classification systems are generally designed to prefer one sequence among the many possible sequences, whereas a thesaurus does not make such distinctions among combinations. However, subject indexing systems using a classification scheme as the basic structure can provide multiple access and multiple representation of the subject.


43 A thesaurus should contain terms and cross references appropriate to the subject matter, taking into consideration both the language and the information needs of the users.

Classificatory language has its use restricted to arrangement of subjects, isolates and commodities. A classificatory language is a controlled language. It is possible to secure that it does not develop homonyms or synonyms (Prol-MC 1).

A thesaurus essentially functions as a controlling device for the vocabulary used in a subject field or a field of activity. It generally accommodates the users by accepting their own terms into its vocabulary (usually known as the entry vocabulary). And, if the user happens to come with a non-preferred term, it refers him to the preferred term by providing cross-references. In such cases, it also indicates to the users the standard vocabulary of the language.

In the schedule of a classification scheme, only the standard descriptor, which is in current use, is provided. Further, these standard terms are generally arranged according to the principles of helpful sequence. In the application of these principles, we recognise the contextual role of the concept denoted by the standard term. Thus the place of a term in the sequence is not easily accessible to the user. Therefore, just as a thesaurus has an entry vocabulary, a classification system will have an alphabetical index, which will reflect the functions of an entry vocabulary, such as referring from a non-preferred term to a preferred term.
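The role of the entry vocabulary described above can be illustrated with a minimal sketch. The terms, the `USE` table and the function names are hypothetical and are not taken from any particular thesaurus or index; the sketch only assumes that each non-preferred term refers to exactly one preferred term.

```python
# Hypothetical entry vocabulary: each non-preferred (lead-in) term
# refers the user to one preferred term (descriptor).
USE = {
    "aeroplanes": "aircraft",
    "airplanes": "aircraft",
    "motor cars": "automobiles",
}
DESCRIPTORS = {"aircraft", "automobiles"}


def resolve(term):
    """Return the preferred term for a user's term, or None if it is unknown."""
    key = term.strip().lower()
    if key in DESCRIPTORS:
        return key
    return USE.get(key)


def used_for(descriptor):
    """Return the lead-in terms that refer to a descriptor, in alphabetical order."""
    return sorted(t for t, d in USE.items() if d == descriptor)


print(resolve("Airplanes"))
print(used_for("aircraft"))
```

The same table could equally serve as the alphabetical index to a classification schedule, with a class number taking the place of the preferred term.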

44 A thesaurus provides words or terms to express meanings that are implied by the term relationships given in the thesaurus (in contrast to a dictionary).

The denotation of a term in a scheme for classification should be determined and should be left to be determined in the light of or through the sub-classes or ranked isolates (lower links) enumerated in the various chains having the class or ranked isolate, as the case may be, denoted by the term in question as their common link (Prol-GC 1).


In other words, in a thesaurus, each preferred term collects together all its associated terms. This can be treated as an atomised presentation of relationships. The term in a classification schedule lies in a spectrum of relations. The relationship between any two terms is normally ascertained through the sequence in which they are arranged and through their respective notations.

45 A thesaurus may be arranged like an alphabetical index and the terms in a thesaurus may be used to construct an index. However, the thesaurus itself is not an index. An index to a collection must have addresses or locators for items in the collection associated with each term, but a thesaurus contains the terms only, without the addresses or locators of an index. A thesaurus classifies terms by arranging them in hierarchical classes. As a "term classification system" a thesaurus has some similarities with subject matter classification systems, as represented, for example, by the Universal Decimal Classification.

But, whereas hierarchical subject classification systems try to show the whole system of hierarchical relations, the thesaurus shows the relations necessary for indexing and retrieval according to the body of documents and the information needs of users.

A thesaurus is one kind of authority list, that is, the preferred terms in a particular thesaurus are the required indexing and retrieval terms for a given information and documentation system. There are other kinds of natural language-based authority lists, such as subject heading lists. In general, however, these do not have the hierarchical structure of the thesaurus.

[The canons for characteristics re]late to the boundary conditions within which the characteristics could be used for deriving isolates. The characteristics used should differentiate some of its entities to give rise to at least two classes or ranked isolates; should be relevant to the purpose of classification; should be definite and ascertainable; and should continue to be unchanged so long as there is no change in the purpose of classification (Prol-EC to EF).

The succession of characteristics in the associated scheme of characteristics should see that no two characteristics in the associated scheme of characteristics are concomitant; the succession of characteristics should be relevant to the purpose of classification; and the succession of the characteristics should be consistently adhered to so long as there is no change in the purpose of classification (Prol-EH to EC).

The sequence of the classes in an array of classes, and of the ranked isolates in an array of ranked isolates, should be helpful to the purpose of those for whom it is intended (Prol-EP 1).

The relationship between a thesaurus and a classification system as delineated in the above paragraph assumes that, if the terms in a thesaurus are arranged according to a systematic sequence other than the alphabetical sequence, then it more or less resembles a regular classification schedule. For example, the schedule part of the Thesaurofacet can be treated as a systematic thesaurus, and its alphabetical index part as the alphabetical thesaurus. Once again it may be taken that a basic difference between a classification schedule and a thesaurus is that in the latter each term is treated as an independent entry with all related terms being listed under it.

5 STRUCTURE

51 Cross references in a thesaurus make explicit the ways in which entries relate to each other in a network of concepts.

Canon of Sought Heading - The principle that the decision whether an entry

1 With a particular type of heading, or
2 With a particular choice for that heading, or
3 With a particular rendering of that choice, or
4 [...]

should be based on the answer to the question: is a reader or library staff likely to look for a book under the particular type or choice or rendering of heading or in the particular added entry (CCC-BE 0).

The institution of the majority of cross reference index entries had its origin in the Canon of Sought Heading. A reader might remember an author or a collaborator by only one of the names used by him as alternative names or variant forms of one and the same name in different documents. Whatever be the name sought by the reader, the catalogue should inform him of all the documents written by him under other names too. The claim of each name to be used as heading may therefore be admitted without undue disregard of the Law of Parsimony (CCC-BE 6).

The role of cross-references as lead-in terms to preferred terms or descriptors has already been mentioned for a thesaurus. An important factor is the networking of concepts in the choice of terms for lead-in entries. Here comes the identification of the user's approach. The criterion for the choice of a lead-in term (sought heading) is explicitly stated in the form of the Canon of Sought Heading in the cataloguing rules.

52 In a thesaurus, the descriptor can be characterized as an authorised and formalised term or symbol used to represent unambiguously the concepts of documents and queries.

The denotation of a term in a scheme for classification should be determined in the light of the different classes or ranked isolates of lower order (upper links) belonging to the same primary chain as the classes or the ranked isolates denoted by the term in question (Prol-GB 1).

The denotation of a term in a scheme for classification should be determined and should be left to be determined in the light of or through the sub-classes or ranked isolates (lower links) enumerated in the various chains having the class or ranked isolate, as the case may be, denoted by the term in question as their common link (Prol-GC 1).

The term used to denote a class or a ranked isolate in a scheme for classification should be the one current among those specialising in the subject field covered by the scheme (Prol-GD 1).

The principle that the name of any entity (be it of a person, a geographical entity, a corporate body, a series, a document, a subject, or a language) used as the heading of a catalogue entry should be made to denote one and only one entity by adding to it the necessary and sufficient number of individualising elements (CCC-BDO).

The terms used as headings or sub-headings are to be the standard ones given in the classification scheme in use (CCC-KE 1).

The terms used as headings are to be watched and, as they become obsolete, fresh entries are to be made with their current equivalents in the headings, and the old ones may be removed ultimately, though not immediately (CCC-K31 7).


In a natural language one and the same term is used in different subject contexts with different connotations.

Ex : Copy - Imitation, Prototype, Represent
Gravel - Offend, Puzzle, Soil

The classification and cataloguing rules given provide the methods for resolving such near-synonyms and homonyms, while the thesaurus rule is only a broader one.

53 Descriptors in a thesaurus may be terms denoting concepts or concept combinations, or terms denoting individual entities (these are also called proper names or identifiers, like project names, nomenclatures, trade-marks, acronyms, etc.).

Terminology is the system of terms used to denote, that is, to name, the classes or ranked isolates in a scheme for classification (Prol-GA 0).

The definition of the term 'descriptor' in thesaurus is explicit. It is actually equal to a name of a class or an isolate in a scheme for classification.

54 In a thesaurus, it is advisable to use proper names in the same way as other descriptors, i.e., to interrelate them. The same may apply when internationally agreed nomenclatures are integrated into the thesaurus.

If it happens that the whole Class Number or a part of it, made up of the Basic Subject Number and/or one or more of its Isolate Numbers, represents a proper name or can be translated into a single word in popular usage, it is to be used as the heading (CCC-KD 23).

Both the thesaurus and classification systems treat proper names as if they are regular descriptors. However, emphasis is laid on internationally agreed nomenclature, wherever it exists.

55 In thesauri not using preferred terms, all terms included in the thesaurus may, in principle, be descriptors. The concept-denoting terms not permitted in indexing must be regarded as unauthorized terms. They are called non-descriptors.


56 A descriptor in a thesaurus may consist of one or more words.

Multiple Subject-Heading - Subject heading having in its successive blocks the names of successive classes. These are normally of increasing extension (CCC-FR 56).

The sub-headings necessary to secure individualisation are to be derived, with the aid of the canon of context, from the last digit of one (or more) of the upper sought links in the chain (CCC-KD 21).

The minimum number of such links, necessary and sufficient for individualisation, are to contribute the sub-headings (CCC-KD 22).

Postulate-based Permuted Subject Indexing (=POPSI) is an indexing procedure helpful in (a) Formulating subject headings, which may be used as feature headings or for other indexing purposes; (b) Deriving subject index entries for a classified catalogue, an index to a book, etc.; (c) Determining the subject of a reader's query in a consistent and helpful way; (d) Formulating a strategy for searching information about a subject in a catalogue or other surrogate files; and (e) Deriving a base for the presentation of ideas in the text of a document (Lib Sc. 12 - H, 01).

The rules for multi-worded descriptors are governed by the principles of facet analysis in a classification system.

Ex : Heavy lathe = Lathe - Heavy
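The rendering in the example above can be imitated by a small sketch. It rests on a deliberately naive assumption that is not part of the cataloguing rules: the last word of the term is taken as the focal noun and everything before it as a single qualifier. The function name and the separator are hypothetical.

```python
def facet_heading(term, separator=" - "):
    """Render a multi-worded term with its focal noun first.

    Naive assumption: the last word is the focal noun and the
    preceding words together form one qualifier.
    """
    words = term.split()
    if len(words) < 2:
        return term.capitalize()
    head = words[-1].capitalize()
    qualifier = " ".join(words[:-1]).capitalize()
    return separator.join([head, qualifier])


print(facet_heading("Heavy lathe"))
```

A real facet analysis would assign each component to its category by subject knowledge rather than by word position.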

57 A descriptor in a thesaurus should reflect the terminology of the subject, but it should contain as few words as possible, and preferably only one.

There should be an organised attempt to (a) Delimit the vagueness of words and eliminate ambiguity; (b) Establish an agreed Standard Terminology free from homonyms and synonyms for each subject field; and (c) Lay down methodology to coin new terms, when new ideas come into being or an old term has to be replaced (Prol-GA 5).

The helpfulness of glossaries in different subject fields is obvious as a means of resolving the problems mentioned.

58 The words of compound descriptors in a thesaurus should be entered preferably in their natural word order (e.g. electrical engineering), i.e. not artificially inverted. It may be helpful to include the inverted forms as non-descriptors preferentially related to the descriptors.

When the inverted form is chosen for the entry, i.e. the descriptor, the inclusion of the non-inverted form as a synonym is necessary.

If the term used as heading or sub-heading consists of more than one word, and if the term used as heading or sub-heading is not the name of a person, a geographical entity, or a corporate body, or the title of a work, the words are to be written in their natural sequence (CCC-KE 3).

The two rules of the thesaurus emphasise the ease of access for the user, although it is helpful to invert certain multi-worded descriptors for arrangement and clustering. The users may not be familiar with the inverted approach. Hence the thesaurus emphasises the natural sequence of the words in the term.

5A To keep the number of descriptors within limits, it may sometimes be useful to represent the concepts or combination of concepts by a combination of descriptors.

This thesaurus rule is primarily based on the need for keeping the size of the thesaurus within a limit. However, this rule has to be followed judiciously, because it may lead to the construction of descriptors which may not be easily accessible to the users.

5B In most cases it is recommended to enter terms which are broader terms to other descriptors as precombined descriptors. If a concept is represented by the combination of simple descriptors, this should be expressed by the 'USE' reference. Under the simple descriptors as specified, a 'UF' reference (e.g. UFC) has to be made to the unused precombination term.

The combination of simple descriptors must be included in the systematic sections of the thesaurus and the unused precombined descriptor in the alphabetical sections as a non-descriptor.
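The reciprocal 'USE' and 'UF' references for a concept expressed by a combination of simple descriptors may be sketched as follows. The sample term, the function name and the data layout are hypothetical; the sketch only assumes that every unused precombined term is replaced by a fixed list of simple descriptors.

```python
from collections import defaultdict


def build_references(precombined):
    """Build reciprocal USE and UF references.

    `precombined` maps each unused precombined term (a non-descriptor)
    to the simple descriptors that are combined to represent it.
    """
    use = {}
    uf = defaultdict(list)
    for term, parts in precombined.items():
        use[term] = list(parts)      # non-descriptor USE descriptor + descriptor
        for part in parts:
            uf[part].append(term)    # descriptor UF (in combination) non-descriptor
    return use, dict(uf)


use, uf = build_references({"aircraft engines": ["aircraft", "engines"]})
print(use["aircraft engines"])
print(uf["engines"])
```

Keeping both directions in step is the point of the sketch: every 'USE' reference from the alphabetical section has its matching 'UF' reference under each simple descriptor.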


5C Spelling

In a thesaurus the most widely accepted spelling of the word should be adopted. In cases where, due to varying usage, more than one spelling of a word is accepted, both spellings should be included in the thesaurus and reference made from one to the other. Alternatively, a well established dictionary can be chosen to act as arbitrator whenever this problem arises.

Variant forms may be due to transliteration from one script or language to another, difference in usage in regard to archaic, modern and other forms of spelling, and preference of singular or plural forms and similar alternative morphological forms (CCC-LC 1). [One way is to] give a sufficient number of Cross Reference Index Entries, using the different variants as Headings. Another way is to use a uniformised form by referring from every variant form to the preferred uniformised form (CCC-LC 1).

The thesaurus and the classification rules take care of the variant forms of words, particularly the spelling form. The rule in the thesaurus is important, because the access to it is mainly through the words. It is also true for an alphabetical index of a classification scheme.

5D Translation

In a thesaurus many current technical terms have arisen by translation from other languages, but sometimes a modern foreign language, Latin or Greek term is incorporated into the specialised vocabulary for a particular subject. When both the foreign language term and its putative translation coexist with the same meaning, both should be included in the thesaurus and reference made from one to the other.

If the name of a subject concept is not in the favoured language and it is not a proper noun but is descriptive, its translation in the favoured language may be added within square brackets as a separate sentence (ACC-EA 5Z2).

The thesaurus and the classification rules take care of the favoured term and favoured language by a network of cross referencing.

5E Transliteration

The problem is further complicated when the foreign language in question is written in a different alphabet. This is particularly true in the case of identifiers (proper nouns). The transliteration standards recommended by the ISO should be used whenever applicable.

Wherever a choice exists, the transliteration which does not employ diacritical marks should be selected. Ex. Satellites, Sputniks.

If the name of the subject or concept is not in the favoured script of the library, the words taken from the title page are to be transliterated in that script in accordance with the accepted table of transliteration (CCC-EB 3).

The rules for transliteration in both the cases are considered adequately.

5F Noun form

Descriptors should be preferably in the form of a noun (or a noun phrase) or that form of the verb which is grammatically equivalent.

Ex. Democracy instead of Democratic.

Each heading or sub-heading is to be a single noun in nominative case except when a qualifying adjective is necessary as in 'Algeb[...]

It is always helpful to represent a concept in noun form, because it will uniformise the grammatical variances which are likely to add unnecessarily many non-preferred descriptors. However, care should be taken with qualifying terms or adjectival terms, which may generally lead to precombined descriptors.

5G Number

The use of the singular or the plural form of descriptors should be decided in accordance with the usage in the language of the thesaurus.

Sometimes the singular and plural forms of a word denote different concepts; in this case both should be entered. Ex. wood; woods.

In English, in general, the plural form should be used for descriptors, particularly when generic terms are involved (i.e., when the descriptor denotes classes of things). The singular form is used for specific material or property terms (attributes), process terms, proper names and disciplinary areas.

This rule is equally applicable for a term in a classification scheme, although it is not specifically mentioned as a separate rule.

5H Adjectives

There are, of course, a certain number of cases where only adjectives or other non-noun forms can be used. Ex. Social, International.

A small proportion of single-word terms in adjectival form may be useful as modifiers (continuous, horizontal). Since adjectives could be pre-coordinated with nouns and entered as compound descriptors, the choice to enter adjectives singly should be dictated by considerations of practicability and flexibility. Pre-coordination is recommended whenever a modifier appears very frequently in combination with another particular term.

This rule is applicable to a faceted classification schedule, for it may contain a set of qualifiers or modifiers known as speciators, which may end in adjectival forms. Ex. Rigid Bearing.

5J Abbreviations and Acronyms

In general, abbreviated forms of terms should be avoided because their use may not be dependent on context, or their recognition may be dependent on capitalization and periods, which become constraints if computer printers or other EDP equipment is used in conjunction with the thesaurus. Therefore, they should be used only when their meanings are well established within the group of users concerned or their meaning is internationally established, and when significant gains in practicality can be demonstrated.

Abbreviated and unabbreviated forms of a given term should be treated as synonyms and cross referenced accordingly.

Abbreviations with several meanings are to be treated as homonyms (homographs).

Sometimes the necessity of limiting the length of the descriptor entails the use of less well established abbreviations. In all these cases a scope note should be appended.

Well established acronyms are acceptable as descriptors. Ex. RADAR, LASER.

The rules for abbreviations and acronyms are very clearly provided for the thesaurus, and may be adopted in the alphabetical index to the classification schedule and also in the schedule itself.

5K Character set

The eventual use of electronic data processing equipment may entail

- the use of only the upper case format for the descriptors;
- avoidance of diacritical marks;
- limitation of the number of characters that a descriptor may have.

The rule on character set which takes care of problems in data processing equipment is applicable for classification schemes also.
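The three constraints listed under this rule can be applied mechanically, as in the following minimal sketch. The function name and the limit of 36 characters are hypothetical; the guidelines quoted here do not fix a particular length. Diacritical marks are dropped by decomposing each character and discarding the combining marks.

```python
import unicodedata


def normalise_descriptor(term, max_length=36):
    """Upper-case a descriptor, drop diacritical marks, and limit its length.

    max_length is an illustrative value only, not one prescribed by any rule.
    """
    decomposed = unicodedata.normalize("NFKD", term)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.upper()[:max_length]


print(normalise_descriptor("Señal eléctrica"))
```

Truncation is the crudest way of limiting length; in practice an abbreviated form with a scope note, as discussed under the rule on abbreviations, may serve the user better.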

5L Punctuation

Punctuation marks in descriptors should be minimised. Except for specialised nomenclature, only parentheses and the hyphen are needed in descriptors.

Punctuation marks are to be given as in ordinary prose (CCC-ED 8).

A comma is to separate two consecutive blocks in a heading (CCC-ED 81).

A comma is to separate a descriptive element in a block in a heading from what it describes (CCC-ED 82).

The conjunction between blocks in a heading may be replaced by a comma. If this is done, a semi-colon should precede a descriptive element instead of a comma (CCC-ED 84).

The rule for punctuation marks is explicit for classification schemes. But the thesaurus rule has wisely pointed out that the use of punctuation marks in rendering the descriptor should be kept to the minimum.

5M Special characters and numerals When it is considered necessary to use special characters other than hyphens and parentheses in descriptors, their meanings should be clearly defined. Special characters other than those mentioned above may be used in scope notes, definitions, and other forms of additional information,


All numbers, other than those forming part of the name of a monarch or a pope or any other person or of a corporate body and usually written in Roman numerals, and other than Call Numbers and Class Numbers, are to be in Indo-Arabic numerals (CCC-ED 55).

The rule on numerals applies identically to both a thesaurus and a classification scheme.

5N Homonym (Homograph)

The different meanings of homonyms (homographs) must be marked and distinguished by specifying symbols or terms (qualifiers) which should be placed between parentheses immediately after the homonym as part of the descriptor.

Homonyms are due to the number of idea-units to be denoted being greater than the number of words available in a language to denote them. A compromise will be to avoid homonyms within a single subject-field, though the same words may be used with other meanings in other subject fields (Prol-GA 34).

A subject has an individuality of its own. The integrity of a subject should be respected by naming it. In other words, its name should be unique. Therefore, the pull of the changes in the words happening in the natural language cannot be totally escaped by the words in the technical terminology. The resulting violation of uniqueness in the name of a subject scatters the documents on it. It also leads eventually to the designation of different subjects by one name, and this creates chaos (Prol-WA 1).

The subject represented by a class number in a system of class numbers and the isolate idea represented by an isolate number in a system of isolate numbers, should be unique (Prol-JC 1).

This rule on homographs emphasises the need for individualisation, thereby resolving the homonyms. Ex. Star - Luminary, Actor, Badge, Destiny, Ornament, Noble, Decoration, etc.
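The qualifier convention for homographs may be sketched in Python as follows. The function names are hypothetical and the upper-case rendering is an assumption borrowed from the character set rule; the sketch only shows how a parenthetical qualifier individualises each meaning of a term such as Star.

```python
def qualify(term: str, qualifier: str) -> str:
    """Return a descriptor with its qualifier in parentheses, e.g. STAR (ACTOR)."""
    return f"{term.strip().upper()} ({qualifier.strip().upper()})"


def qualify_homographs(term: str, meanings: list[str]) -> list[str]:
    """Give one qualified descriptor per meaning of a homograph."""
    # A term with a single meaning needs no qualifier.
    if len(meanings) <= 1:
        return [term.strip().upper()]
    return [qualify(term, meaning) for meaning in meanings]


descriptors = qualify_homographs("Star", ["Luminary", "Actor", "Badge"])
```

Each resulting descriptor is unique, which is the same requirement the Prolegomena places on the name of a subject.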

5P Scope notes and definitions

A scope note is a brief explanation of the intended use of a descriptor. It may accompany the descriptor in the main part of the thesaurus, but does not form part of the descriptor.


Scope notes may be used

- to restrict the usage of a descriptor ;
- to explain abbreviations and acronyms ;
- to exclude a possible meaning from a term, especially for terms which are in common use in different disciplines ;
- to date the addition and deletion of terms and to record changes in the handling of terms.

Scope notes should be indicated by special characters and clearly distinguished from qualifiers.

The conceptual content of a descriptor in a thesaurus is indicated mainly by the represented relations to other thesaurus words. Whenever there is doubt regarding the unique interpretation of a descriptor, a definition should be added, specifying the exact conceptual content which may accompany the descriptor in the main part of the thesaurus but does not form part of the descriptor.

The denotation of a term in a scheme for classification should be determined in the light of the different classes or ranked isolates of lower order (upper links) belonging to the same primary chain as the class or the ranked isolate denoted by the term in question (Prol-GB 1).

The denotation of a term in a scheme of classification should be determined and should be left to be determined in the light of or through the sub-classes or ranked isolates as the case may be, denoted by the term in question as their common link (Prol-CA 1).

Scope notes and definitions play an important role in determining the exact scope and meaning of a term. The need for these in a classification schedule may not be as essential as it is in a thesaurus.

5Q Translation

In many cases it will be helpful to show the equivalent terms in other languages to ensure that the descriptor is correctly used in the analysis of foreign language texts. Where the meaning is not entirely equivalent, attention can be drawn to this in the form of an explanatory note. Translations of simple descriptors should be treated as synonyms and quasi-synonyms.

The rule is primarily necessary for a multi-lingual thesaurus; it may sometimes be adopted for a classification schedule which enumerates terms in more than one schedule.

5R Source of information

Information on the source of a descriptor or a definition can be very important for further development of the thesaurus. The source information could therefore be collected together with the descriptors but need not be included in the printed main part of the thesaurus.

This rule is applicable for the design of classification schedules also, for specifying the exact context in which a term is used.

5S Descriptor interrelationships

1 Equivalence Relationship (Preferential Relation)

When terms are regarded as equivalent (similar or almost the same meaning), they can be combined into equivalence categories, so that equivalent terms are assigned to one and the same concept. In retrieval all documents associated with the equivalence category must be retrieved even if only one of the terms is used as the descriptor.

- Synonyms, i.e., terms which have the same or almost the same meaning in a particular discipline ;
- Quasi-synonyms, i.e., terms whose meanings may differ in the vocabulary used and the field concerned, but which are considered as synonyms for the purpose of the documentation system under consideration.

A natural language abounds in synonyms - that is, different words denote one and the same idea. It will be a help if one and the same word is used to denote that idea in all contexts. (Prol-GA 4).

Alphabetical arrangement of subjects by their names, as a means of mechanizing their arrangement, must be ruled out, as the names of subjects are not unique (Prol-HA 3).

The emphasis on equivalence relationship in thesaurus is quite obvious. The equivalence relationship is generally taken care of by the alphabetical index to the schedule in a classification scheme. The preferred term in a schedule usually represents a class of terms which may have different shades of meanings.


'USE' reference

In systems using preferred terms the 'USE' reference is employed to lead from a non-descriptor to one or more descriptors as follows

- to indicate a preferred synonym. Ex. FLEXING USE BENDING
- to refer from a specific term to a more general term which has been selected to represent the specific concept (quasi-synonym). Ex. PLANT WAXES USE WAXES
- to indicate a preference in spelling or to expand or explain abbreviations. Ex. PI MESONS USE PIONS
- to prescribe the use of two or more terms to express a concept (semantic factoring). Ex. FERROMAGNETIC FILMS USE FERROMAGNETIC MATERIALS + FILMS
- to express concepts that can be considered synonyms for the purposes of indexing and retrieval (quasi-synonyms). Ex. HEREDITY USE GENETICS
- to bring together different points or degrees of a conceptual continuum. Ex. FLUIDITY USE VISCOSITY
- to cross reference inverted entries to the preferred natural word order. Ex. TABLES, MATHEMATICAL USE MATHEMATICAL TABLES
- to reflect current terminology. Ex. ELECTRICAL CONDENSERS USE CAPACITORS
- to use preferred terms from jargon. Ex. WHIRLY BIRD USE HELICOPTERS
- to use translations of descriptors. Ex. ROENTGENSTRAHLEN USE X-RAYS
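The working of the 'USE' reference at the time of indexing can be sketched as a simple lookup table in Python. The table name `USE` and the function `resolve` are hypothetical; the entries are a few of the examples listed above, and treating an unlisted term as its own descriptor is an assumption of this sketch.

```python
# Non-descriptor -> tuple of preferred descriptors (examples from the list above).
USE = {
    "FLEXING": ("BENDING",),
    "PLANT WAXES": ("WAXES",),
    "PI MESONS": ("PIONS",),
    # Semantic factoring: one concept expressed by two descriptors.
    "FERROMAGNETIC FILMS": ("FERROMAGNETIC MATERIALS", "FILMS"),
    "ELECTRICAL CONDENSERS": ("CAPACITORS",),
}


def resolve(term: str) -> tuple[str, ...]:
    """Return the descriptor(s) to index under; a descriptor resolves to itself."""
    key = term.strip().upper()
    return USE.get(key, (key,))
```

For example, `resolve("flexing")` leads to BENDING, while `resolve("Ferromagnetic films")` leads to the two descriptors FERROMAGNETIC MATERIALS and FILMS.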

'USED FOR' reference

Conversely, the 'USED FOR' reference is employed with the preferred term for reciprocal reference. It accompanies the term to which the USE reference refers.

Where the USE reference prescribes a combination of descriptors, the reciprocal reference (UFC) accompanies each of the terms to which the USE reference refers.

Ferromagnetic materials UFC Ferromagnetic films
Films UFC Ferromagnetic films

There is to be a SEE ALSO subject entry corresponding to each of the sought links of the chain, upper or lower to the link contributing to the main Heading of specific subject Entry or the subject analytical entry as the case may be (CCC-KZD 499).

Referred-To Heading - The word or the word-group with which a Cross Reference Index Entry as a See Also subject entry in a dictionary catalogue ends. The Referred-to heading in a SEE ALSO subject entry in a Dictionary catalogue is usually the name of a specific subject of a document (CCC-FN 451).

See also Subject entry - General Added Word Entry in a Dictionary catalogue referring from the name of one subject to that of another (CCC-F7,D 43).

These rules for the USE reference can aid the construction of an alphabetical index to a classification scheme as much as they do for a thesaurus. This is reflected in the rules given for Cross Reference Index Entry in the Classified Catalogue Code.
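The reciprocal references of this subsection can be derived mechanically from the USE references, which is one reason the same rules serve an alphabetical index equally well. The Python sketch below is illustrative only: the function name is hypothetical, the tag UFC is taken from the example given earlier, and the tag UF for a plain 'USED FOR' reference is an assumed abbreviation.

```python
from collections import defaultdict


def reciprocal_references(use: dict[str, tuple[str, ...]]) -> dict[str, list[tuple[str, str]]]:
    """Invert USE references into (tag, non_descriptor) pairs per descriptor."""
    reciprocal = defaultdict(list)
    for non_descriptor, descriptors in use.items():
        # One target gives a plain UF; several targets give UFC on each of them.
        tag = "UF" if len(descriptors) == 1 else "UFC"
        for descriptor in descriptors:
            reciprocal[descriptor].append((tag, non_descriptor))
    return dict(reciprocal)


use = {
    "FLEXING": ("BENDING",),
    "FERROMAGNETIC FILMS": ("FERROMAGNETIC MATERIALS", "FILMS"),
}
refs = reciprocal_references(use)
```

Here BENDING receives the entry UF FLEXING, while FERROMAGNETIC MATERIALS and FILMS each receive UFC FERROMAGNETIC FILMS.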

2 Hierarchical relation

The hierarchical relation expresses relations of super- and subordination of concepts. It can be subdivided into:

Generic relation

In this relation, the generic (superordinated) term denotes a class of concepts of which the concept denoted by the specific (subordinated) term is always a member. The specific concept differs from the generic one in at least one characteristic.

Part-Whole Relation (Partitive relation)

In this relation, the superordinated (entity) term denotes objects or concepts of which the objects or concepts denoted by the subordinated (part) term are a part. Part terms are obtained by mental disintegration of the whole represented by the entity term into its component parts.


Hierarchical relations can be represented alternatively in any of the following ways

- The generic and part-whole relations are differentiated and shown separately.
- The generic and part-whole relations are not differentiated and are grouped together in the hierarchical reference.

In disciplines in which the part-whole relation is of no significance in hierarchical retrieval, it is recommended that only the generic relation be represented by hierarchical reference. In this case, the part-whole relation is treated as an associative relation.

The representation of the part-whole relation by both hierarchical and associative reference in a thesaurus should be avoided.

In most thesauri the hierarchical relation is represented by the references BROADER TERM (BT), representing the relation of a concept being superordinated, and NARROWER TERM (NT), indicating the reciprocal relation. In cases where both types of hierarchical relations are to be distinguished, different symbols must be chosen for generic and part-whole relations. In this case it is recommended to use the following references: BROADER TERM GENERIC (BTG) and NARROWER TERM GENERIC (NTG) for the generic relation, and BROADER TERM PARTITIVE (BTP) and NARROWER TERM PARTITIVE (NTP) for the part-whole relation.
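The reciprocity of the broader and narrower term references can be sketched in Python as below. The function name, the data layout and the sample terms are hypothetical, chosen only to show that recording one reference implies its reciprocal, with the generic and partitive pairs kept distinct.

```python
# Each broader-term tag and its reciprocal narrower-term tag.
RECIPROCAL = {"BT": "NT", "BTG": "NTG", "BTP": "NTP"}


def add_broader(thesaurus: dict, narrower: str, broader: str, tag: str = "BT") -> None:
    """Record `narrower <tag> broader` and the reciprocal entry under `broader`."""
    if tag not in RECIPROCAL:
        raise ValueError(f"unknown hierarchical reference: {tag}")
    thesaurus.setdefault(narrower, []).append((tag, broader))
    thesaurus.setdefault(broader, []).append((RECIPROCAL[tag], narrower))


thesaurus: dict[str, list[tuple[str, str]]] = {}
add_broader(thesaurus, "PLANT WAXES", "WAXES", "BTG")   # generic relation (sample terms)
add_broader(thesaurus, "ENGINES", "MOTOR CARS", "BTP")  # part-whole relation (sample terms)
```

A thesaurus that does not differentiate the two relations would simply call `add_broader` with the default tag BT in both cases.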

In a class number or an isolate number, there should be a digit to represent each of the characteristics used in constructing the class number or isolate number, as the case may be (Prol-SE 1).

The classes in an array of classes, and the ranked isolates in an array of ranked isolates should be totally exhaustive of their respective common immediate universes (Prol-EM 1).

The classes in an array of classes and the ranked isolates in an array of ranked isolates should be mutually exclusive (Prol-EN 1).

The sequence of the classes in an array of classes, and of the ranked isolates in an array of ranked isolates should be helpful to the purpose of those for whom it is intended (Prol-EP 1).


isolates occur in different arrays, their sequence should be parallel in such arrays, wherever insistence on such a parallelism does not run counter to other more important requirements (Prol-EO 1).

While moving down a chain from its first link to its last, the extension of the classes or of ranked isolates, as the case may be, should decrease and the intension should increase at each step (Prol-ES 1).

A chain of classes or of ranked isolates should comprise one class or one ranked isolate, as the case may be, of each and every order that lies between the orders of the first link and the last link of the chain (Prol-ET 1).
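Three of the canons quoted above (exhaustiveness, exclusiveness and decreasing extension) lend themselves to a simple check if each class is modelled as the set of entities it covers. That modelling, the function names and the sample data are all assumptions of this Python sketch; the Prolegomena states the canons only in prose.

```python
def is_exhaustive(array: dict[str, set], universe: set) -> bool:
    """True if the classes of the array together cover the immediate universe."""
    covered = set().union(*array.values())
    return universe <= covered


def is_mutually_exclusive(array: dict[str, set]) -> bool:
    """True if no entity falls into more than one class of the array."""
    seen: set = set()
    for members in array.values():
        if seen & members:
            return False
        seen |= members
    return True


def extension_decreases(chain: list[set]) -> bool:
    """True if each link of the chain is a proper subset of the link above it."""
    return all(lower < upper for upper, lower in zip(chain, chain[1:]))


# Hypothetical sample data.
universe = {"iron", "copper", "zinc"}
array = {"FERROUS METALS": {"iron"}, "NON-FERROUS METALS": {"copper", "zinc"}}
chain = [universe, {"copper", "zinc"}, {"zinc"}]
```

With these sample data all three checks succeed; adding "iron" to the second class would violate exclusiveness.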

Classification, as a process, results in mutually related groups of various orders within the universe concerned. A group may either be unitary or multiple. Mutual relations among the groups are established by recognizing their respective 'Logical forms'. The logical form of a group is the manifestation of its structure. The structure of a group refers to its parts and their interrelationships. Interrelationships among the parts of a group manifest in their respective ranks. The purpose of classification determines the orders of the groups to be recognised, and the degree of ranking to be incorporated. At bottom, therefore, classification, as a process, consists essentially of recognizing purpose oriented logical forms of the entities of a universe to recognise groups and their mutual relationships. In this context, classification, as a process, is defined as consisting of grouping alone, or of grouping cum ranking.

Groups recognised through structural analysis are comparable among themselves. Ranks of the different groups are determined by a process of comparison of their respective structures. Ranking of groups results in hierarchy.

The degree of hierarchy to be incorporated is again purpose dependent. In an intensive hierarchy, each group gets distinguished from all other groups with reference to its coordinates, superordinates, subordinates and collaterals. Viewed from this angle, classification, as a process, consists of distinguishing each entity in the universe concerned from all other entities by recognizing its relationships with its coordinates, superordinates, subordinates and collaterals. For convenience of reference, these relationships can be called "COSSCO-relationships".


Organizing classification

The arrangement based on a standard pattern of grouping results in an 'organisation' - that is, a hierarchical grouping. For convenience of reference, the classification resulting in organization may be denoted by the term 'Organizing classification'. In this, COSSCO-relationships are explicit, and related groups are juxtaposed (Lib Sc. 11-D).

Any of the fundamental categories "Personality" and "Matter" may manifest itself more than once in one and the same round within a subject; and similarly with 'Space' and 'Time' in the last round. The manifestation of a fundamental category within a round will be said to be its level 1 facet in that round. Its second manifestation within that round will be said to be its level 2 facet in that round, and so on (Prol-RJ 1).

If, in a subject, facet 'B' is an organ of facet 'A', then 'A' should precede 'B' (Whole-Organ principle) (Prol-RN 1).

In studying the attributes of the universe of subjects and of its components, it is found helpful to consider a subject as a system (Lib Sc. 9-WG).

The hierarchical relations are to be clearly demarcated in a classification system as it adopts this as its preferred approach. Consistent recognition of a hierarchical model in all subjects covered by a classification system is much emphasised in classification theory. Therefore a formal hierarchical structure has to be established. For this purpose, classification theory is continuously enriching itself with concepts from Logic, General Systems theory and Semantics.

3 Associative Relation (Affinitive Relation)

The associative relation is usually employed to cover the other relations between concepts that are related but are neither consistently hierarchical nor equivalent (e.g. similarity, antonymity).

It should be noted, however, that a variety of relations exist between concepts. Associative relations should therefore be established only if it is assumed that these relations will be actually required in retrieval. Associated concepts can be referred to by the RELATED TERM (RT) reference.


Associative relations may be used to indicate

- antonymity, i.e., a concept is the opposite of another concept. Ex. HARDNESS RT SOFTNESS
- coordination, i.e., concepts are derived from a superordinated concept by the same step of division. Ex. GENERIC RELATION RT PART-WHOLE RELATION
- genetic relation, i.e., something is the predecessor of another thing. Ex. FATHER RT SON
- concurrent use of two concepts. Ex. EDUCATION RT TEACHING
- cause and effect. Ex. TEACHING RT LEARNING
- instrument relation. Ex. WRITING RT PENCILS
- material relation, i.e., something is the material of which another thing is made. Ex. PAPER RT BOOKS
- similarity of different kinds (physical similarity, similarity of material, similarity of process etc.). Ex. TEACHING RT TRAINING

The arrangement based on the derived patterns of grouping results in 'associative grouping'. For convenience of reference, the classification resulting in associative grouping may be denoted by the term 'Associative Classification'. In this, COSSCO-relationships are not explicit, and the related groups are linked up by cross-references (Lib Sc. 11-D43).

The associative relationship is very important in thesaurus structure, whereas in a faceted classification scheme it is taken care of by the non-hierarchical relations, such as facet and speciator relations (5).
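The RELATED TERM reference can be sketched in Python as a set of cross-references per descriptor. Recording the reference in both directions is an assumption of this sketch, in line with common thesaurus practice, and is not something the rule quoted here states; the function name and data layout are likewise hypothetical.

```python
def add_related(related: dict[str, set[str]], a: str, b: str) -> None:
    """Record an RT reference between two descriptors, in both directions."""
    if a == b:
        raise ValueError("a descriptor cannot be related to itself")
    related.setdefault(a, set()).add(b)
    related.setdefault(b, set()).add(a)


related: dict[str, set[str]] = {}
add_related(related, "HARDNESS", "SOFTNESS")   # antonymity
add_related(related, "TEACHING", "LEARNING")   # cause and effect
add_related(related, "TEACHING", "TRAINING")   # similarity
```

Unlike the hierarchical references, nothing here ranks one term above the other; the related groups are only linked up by cross-reference.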

5T Proliferation of unrelated specific names would tend to convert the thesaurus into a simple list of identifiers which would be self-defeating. It is therefore recommended that the names of unrelated specific entities be avoided as much as possible.

This is a helpful warning that should be taken care of in thesaurus construction as well as in an alphabetical index to a classification scheme.

5U The authenticity of the descriptors should be verified by consulting dictionaries, other indexing or standardized vocabularies, current usage in the literature and especially the opinion of subject specialists.


This rule emphasises the relation of the sources for descriptors. The source terms for a classification schedule, as well, are to be selected with great care. Dictionaries, encyclopedias, and glossaries provide the well crystallised (standardised) terminology of the subject field. The currently used terms can be got only through scanning the main parts of the abstracting and indexing periodicals in the subject field.

5V Obsolete terminology should not be included, unless only as forbidden terms.

The term used to denote a class or a ranked isolate in a scheme for classification should be the one current among those specialising in the subject-field covered by the scheme (Prol-GD 1).

Obsolete terms should be avoided in all information retrieval languages, be it a thesaurus or a classification scheme. Even a dictionary avoids an obsolete meaning. Glossaries generally mention the superseding of the obsolete terms.

5W Thesaurus assimilates the neologisms and special jargon that proliferate in expanding fields of basic and applied research.

This is obviously necessary as the thesaurus is a tool in the information retrieval system, which always keeps abreast of current information/data. A thesaurus should not generally wait for a term to get into circulation for a long time. In other words, the thesaurus should be able to accept new terms as and when they are used in the technical articles or reports published in the subject field.

6 CONCLUSION

It may be evident from the studies made in this paper that there are many common features that are useful in the construction of classification schemes and thesauri. In fact, the steps for the design of a classification scheme and that of a thesaurus can go hand in hand up to certain stages. The essential difference lies in that the thesaurus mainly has two planes of work, namely the idea plane and the verbal plane, whereas a classification scheme spreads itself into three planes of work, namely the idea plane, the verbal plane, and the notational plane. In a thesaurus the words are made to function in two roles: 1) denotation of concepts ; and 2) denotation of ranking of


is used to denote the ranking of concepts. But in the process, the classification system introduces a new language of its own called classificatory language, which is made of symbols or digits. In faceted classification the notation is taken to further sophistication, such as the use of indicator digits to indicate the nature of different facets, and the rules for the ordinal values for these indicator digits. Thus, although classification started as a device to secure consistent arrangement, the artificial nature of the class number itself made the users resist it. In a sense it made ease of access to documents a bit more elaborate. In fact, many of the designers of faceted classification say that the complicated nature of faceted notation should be simplified at the cost of coextensive and mnemonic features (8). There are more modern protagonists of thesauri, who opine that a classification system should only answer the needs of book classification, and the classification of reports, technical articles, etc., should be left to subject headings, which derive terms from controlled vocabularies like the thesaurus. It may be seen that the thesaurus, the classification scheme and the subject authority lists represent different levels and depths of organisation, and all of them can coexist in an information storage and retrieval system, complementing each other's efficiencies or deficiencies.

7 BIBLIOGRAPHICAL REFERENCES

1 Sec 1 AITCHISON (J) and GILCHRIST

2 Sec 0 Thesaurofacet ; a faceted classification for engineering and related subjects. 1969.

3 Sec 1 ISO : 2788. Guidelines for the establishment and development of monolingual thesauri. 1974.

4 Sec 0 KENT (Allen). Information analysis and retrieval. 1971. p 230.

5 Sec 5S NEELAMEGHAN (A). Non-hierarchical associative relationships ; their types and computer generation of RT links. (Annual Seminar (DRTC). 1975. Paper AA).

6 Sec 2 RANGANATHAN (S R). Classified catalogue code with additional rules for dictionary catalogue code. Ed 5. Assist by A Neelameghan. 1964.

7 Sec 2 --. Prolegomena to library classification. Ed 3. Assist by M A Gopinath. 1967.

8 Sec 0, Sec 6 SCHNEIDER (John H). Modern classification characteristics, uses and profiles. (Drexel Libr Q. 10, 4; 1974; 37-55).

9 Sec 1 SOERGEL (D). Indexing languages and thesauri : Construction and maintenance. 1974.

10 Sec 1 UNESCO. Guidelines for the establishment and development of monolingual thesauri. 1973.
