All Rights Reserved © 2012 IJARCSEE
Improved methodology for accessing industry based
microbial database using Fuzzy Relational Algebra
and proposing a microbial fuzzy database model.
Corresponding Author – Mr. Sayantan Sinha (1, 2)
1. School of Biotechnology, KIIT University, Bhubaneswar-751024, India.
2. Institute of Cybernetics Systems and Information Technology, Kolkata-700108, India. Ph. - 0-8895892080 / 0091-033-2564-2574.
Abstract
Numerous biological databases, such as the Microbial Genome Database for Comparative Analysis (MBGD), the Microbial Identification and Typing Database (HPA), and the National Microbiological Database (NMD), have been developed in recent times and have had an impact on different research fields of the biological sciences. Because of adaptability, evolution, and emergence, biological systems are highly uncertain and hence require complexity analysis. Reports suggest that little work has been done on modeling uncertainty and developing methodologies for such databases at the industrial level. Although there have been several proposals for extending relational database systems to represent and retrieve data, the present concept emphasizes the use of fuzzy relational algebra to access the database as applicable in industry. Aim - To fill this gap an applicable Industry based
Keywords:
Biological Database, Complexity Analysis, Fuzzy Relational Algebra, Uncertainty Modeling.
Introduction:
Many real-world applications, e.g. genome, geophysical, and biological systems, must deal with imprecise or vague data. Such systems need an information management system that supports the management of this imprecise data [17]. Fuzzy sets constitute the oldest and most reported soft computing paradigm. They are well suited to modeling the different forms of uncertainty and ambiguity often encountered in real life [2], [3], [4], [5]. Significant work has been done on incorporating uncertainty management in relational databases (RDBs) using fuzzy set theory [18], but little has been done on modeling fuzziness in a microbe-based industrial database model.
A database is a computer-based system, the purpose of which is to store information regarding a set of entities and to provide users with the capability of organizing, manipulating, and retrieving the stored information as requested. Several models for representing information in a database have been proposed. One of these, the relational model, has become predominant [14]. Virtually all fuzzy databases described in the literature are conceived in terms of the relational model. Buckles and Petry [1982-83] developed a model for fuzzy relational databases [1] that contains, as a special case, the classical crisp model of a relational database. The model of a classical relational database consists of a set of multidimensional relations conceptualized as tables. The columns of these tables correspond to fields or attributes and are usually called domains. Each domain is defined on an appropriate domain base (or universal) set. The rows are elements of the relations; they correspond to records or entries, and are called tuples [14].
Significance of using fuzzy logic to design the database.
in linguistic terms. This type of information can be quite useful when the database is to be used as a decision aid in areas such as investment and entrepreneurship, where "soft", subjective, and imprecise data are not only common but quite valuable. In addition, it is also desirable to relieve the user of the constraint of having to formulate queries to the database in precise terms.
Vague questions like "Which microorganisms can be used to produce biological detergents?" or "Which industries are forecasted to experience significant growth by a substantial number of applications?" often capture the relevant concerns of database users more accurately and easily than precise queries.
Proposing the microbial based Fuzzy relational algebra database (MBFRDB) model:
Suppose a database contains the grade of importance of microorganisms in different industries. Two relations are contained within the database:
1. 'APPLICATION', which has domains 'MICROORGANISM' and 'INDUSTRY'.
2. 'FEATURES', which has domains 'MICROORGANISM', 'QUALITY', 'DETAILS', and 'OPTION'.
The design is represented in Tables I and II.
Relation: Applications

MICROORGANISMS     INDUSTRY
Bacillus           Biological Detergents
Lactobacillus      Probiotics
Lactococcus        Dough Preparation
Rhizomucor         Biological Detergents
Bacillus sp.       Dough Preparation
Aspergillus        Biological Detergents
Saccharomyces      Probiotics
Rhizomucor         Dough Preparation
Rhizomucor         Leather Baiting
Bacillus sp.       Leather Baiting
Lactococcus        Biological Detergents
Aspergillus        Leather Baiting
Bifidobacterium    Probiotics
Cerevisae          Probiotics
Table [I] – Relational database comprising microorganisms and applied industries
Relation: Features

OPTION   MICROORGANISM      QUALITY               DETAILS
BD       Bacillus sp.       Highly Favorable      **
LB       Bacillus           Slightly Favorable    **
DP       Bacillus           Favorable             **
DP       Lactococcus        Highly Favorable      **
BD       Lactococcus        Slightly Favorable    **
LB       Lactococcus        Favorable             **
DP       Rhizomucor         Favorable             **
LB       Rhizomucor         Slightly Negative     **
BD       Aspergillus        Slightly Negative     **
LB       Aspergillus        Slightly Favorable    **
P.bio    Lactobacillus      Slightly Negative     **
P.bio    Saccharomyces      Slightly Favorable    **
P.bio    Cerevisae          Highly Favorable      **
P.bio    Bifidobacterium    Favorable             **
BD       Rhizomucor         Favorable             **

** to be accessed from the database.
Table [II] – Relational database comprising options (industry codes), microorganisms, and quality grades
Accessing, Data Retrieval & operating the MBFRDB database using fuzzy relational algebra:
Access to the database is accomplished through a relational algebra. The algebra consists of the procedural application of operations containing four basic elements: an operation name, the names of relations and the names of domains to be operated on, and an optional conditional expression. For instance, if our database contains a ternary relation 'STUDENT' with domains 'NAME', 'ADDRESS', and 'MAJOR', we can obtain the names and addresses of all students whose major is 'Biotechnology' by constructing a new relation with domains 'NAME' and 'ADDRESS' as a projection of the original relation. The algebraic operation performing this task would be
Project (STUDENT: NAME, ADDRESS) where MAJOR = “Biotechnology”    (1)
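As a rough illustration of operation (1), the following Python sketch treats a relation as a list of rows and applies a selection followed by a projection. The helper names `select` and `project` and the three student rows are hypothetical and serve only to make the example runnable.

```python
def select(relation, predicate):
    """Keep only the rows that satisfy the conditional expression."""
    return [row for row in relation if predicate(row)]


def project(relation, domains):
    """Keep only the named domains and drop duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[d] for d in domains)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(domains, key)))
    return result


# Hypothetical contents of the ternary relation STUDENT.
STUDENT = [
    {"NAME": "A. Rao", "ADDRESS": "Bhubaneswar", "MAJOR": "Biotechnology"},
    {"NAME": "B. Das", "ADDRESS": "Kolkata", "MAJOR": "Physics"},
    {"NAME": "C. Sen", "ADDRESS": "Kolkata", "MAJOR": "Biotechnology"},
]

# Project (STUDENT: NAME, ADDRESS) where MAJOR = "Biotechnology"
result = project(
    select(STUDENT, lambda row: row["MAJOR"] == "Biotechnology"),
    ["NAME", "ADDRESS"],
)
print(result)
```

The selection is applied before the projection because the condition refers to 'MAJOR', a domain that the projection discards.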
The algebra also contains other relational operations such as complement, union, intersection, and join, which perform the corresponding tasks on the relations and domains specified in order to produce the desired information. The fuzzy relational algebra used to access this fuzzy database consists of the same four components as the conventional relational algebra.
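One way a fuzzy selection could differ from a crisp one is sketched below in Python. The sketch assumes a reflexive and symmetric similarity (compatibility) relation on the 'QUALITY' domain and a user-chosen threshold. The similarity degrees, the default degree, the threshold values, and the function names are illustrative assumptions and are not taken from the paper; the four rows are the 'BD' entries of Table II.

```python
# Assumed similarity degrees between linguistic QUALITY grades (illustrative only).
SIMILARITY = {
    frozenset({"Highly Favorable", "Favorable"}): 0.8,
    frozenset({"Favorable", "Slightly Favorable"}): 0.6,
    frozenset({"Highly Favorable", "Slightly Favorable"}): 0.6,
}
DEFAULT_DEGREE = 0.2  # assumed degree for any pair not listed above


def similarity(a, b):
    """Reflexive, symmetric degree of similarity between two grades."""
    if a == b:
        return 1.0
    return SIMILARITY.get(frozenset({a, b}), DEFAULT_DEGREE)


def fuzzy_select(relation, domain, target, threshold):
    """Keep rows whose value in `domain` is similar enough to `target`."""
    return [row for row in relation if similarity(row[domain], target) >= threshold]


# The BD rows of Table II.
FEATURES_BD = [
    {"MICROORGANISM": "Bacillus sp.", "QUALITY": "Highly Favorable"},
    {"MICROORGANISM": "Lactococcus", "QUALITY": "Slightly Favorable"},
    {"MICROORGANISM": "Aspergillus", "QUALITY": "Slightly Negative"},
    {"MICROORGANISM": "Rhizomucor", "QUALITY": "Favorable"},
]

strict = fuzzy_select(FEATURES_BD, "QUALITY", "Highly Favorable", 0.7)
relaxed = fuzzy_select(FEATURES_BD, "QUALITY", "Highly Favorable", 0.6)
print([row["MICROORGANISM"] for row in strict])
print([row["MICROORGANISM"] for row in relaxed])
```

Lowering the threshold widens the answer set, which is how a vague request such as "roughly highly favorable" can be served without the user naming every acceptable grade.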
To understand the operation of the MBFRDB database, let us build a user-database model.
about the efficiency of microorganisms from the database.
Algorithm:-
Step 1: The user wants to access all the microorganisms that are used for the production of biological detergents. This is accomplished by the operation
(Project (Select APPLICATIONS where INDUSTRY = 'Biological Detergents') over MICROORGANISMS) giving R1    (2)
Here R1 is a temporary relation on domain 'MICROORGANISMS', listing only those microorganisms that are used in biological detergent production. It is equal to

Relation: R1

MICROORGANISMS
Bacillus sp.
Rhizomucor
Aspergillus
Lactococcus
Step 2: The user wants to know the quality of the microorganisms retrieved in R1.
Here the temporary relation R2 must be constructed on domains 'MICROORGANISMS' and 'QUALITY'; it lists the quality of each microorganism in R1 for the option 'BD'.
The algebraic expression accomplishing this can be represented as
(Project (Select (Join R1 and FEATURES over MICROORGANISMS) where OPTION = 'BD') over MICROORGANISMS, QUALITY) giving R2    (3)
The relation R2 that is produced is given by

Relation: R2

MICROORGANISMS    QUALITY
Bacillus sp.      Highly Favorable
Rhizomucor        Favorable
Aspergillus       Favorable
Lactococcus       Slightly Favorable
Step 3: Having obtained an overview of the quality of the different microorganisms that produce biological detergents, the user now wants to access the details of a chosen microorganism. The algebraic expression accomplishing this is
(Project (Select FEATURES where MICROORGANISMS = 'Bacillus' and OPTION = 'BD') over DETAILS)    (4)
This will provide the user with the details of the chosen microorganism,
e.g. the enzyme that the microorganism produces, the protocol for culturing it, the fermentation cost, the area of sampling, etc.
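A minimal Python sketch of operation (4) is given below. Because Table II marks the 'DETAILS' column only with "**", the details strings here are hypothetical placeholders, and the `Feature` record and `details_for` helper are illustrative names. The microorganism is written "Bacillus sp." to match the BD row of Table II.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    option: str
    microorganism: str
    quality: str
    details: str  # placeholder for the "**" column of Table II


# A few rows of FEATURES; the details values are hypothetical placeholders.
FEATURES = [
    Feature("BD", "Bacillus sp.", "Highly Favorable", "<details: Bacillus sp., BD>"),
    Feature("LB", "Bacillus sp.", "Slightly Favorable", "<details: Bacillus sp., LB>"),
    Feature("BD", "Rhizomucor", "Favorable", "<details: Rhizomucor, BD>"),
]


def details_for(features, microorganism, option):
    """Select on MICROORGANISMS and OPTION, then project over DETAILS."""
    return [
        f.details
        for f in features
        if f.microorganism == microorganism and f.option == option
    ]


details = details_for(FEATURES, "Bacillus sp.", "BD")
print(details)
```

In a running system the placeholder string would be replaced by a structured record holding items such as the enzyme produced, the culturing protocol, and the fermentation cost.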
Conclusion and Further Research:
The illustrated MBFRDB model introduces fuzziness only by means of fuzzy equivalence relations or, more generally, fuzzy compatibility relations on individual domain universal sets [15]. The methodology of the model thus creates a link between mathematics, biology, and soft computing that makes it possible to analyze the complexity and manage the uncertainties of the industrial schema. The database can be used by entrepreneurs and researchers to find the different microorganisms that are able to produce the product they are looking for. Unlike other databases, this database provides the entire set of microorganisms related to the specific query criterion and also grades the quality and application level of each microorganism on the basis of the specified product criteria. The user can also retrieve the details of the microorganism needed. This benefits the user, who can choose the desired microorganism from the large set of data presented according to his/her need. For example, suppose a user wants to find a particular microbe that can be used as a probiotic. The user accesses the database and retrieves the whole set of microorganisms that can be applied in probiotics, but then needs to know which microbe is best suited and which is less suited, as well as the full details of the microbes. To answer these queries, the user can easily retrieve the comparison data for the entire set of microbes that can be applied in probiotics and, in addition, access the details of each microbe. After considering all the factors, the user chooses the microbe that is economical and suits his/her purpose. To enhance and develop the database and its operations, the design of the MBFRDB needs to be standardized and converted into a running application that can handle imprecise data or imprecise rules.
Acknowledgment:
I am grateful to Prof. Dr. Dwijesh Dutta Majumder (Emeritus Prof., Indian Statistical Institute, and Director, ICSIT) for his constant guidance and for teaching me and inspiring me to work on fuzzy logic.
I am thankful to Mr. Satyabrata Sinha (Veteran Mathematician & Teacher) for teaching me Mathematics and to Mrs. Sharmistha Sinha (Historian) for encouraging me in my work.
I am thankful to Dr. Ritesh Pattnaik (faculty, KIIT School of Biotechnology) for guiding me and sharing ideas.
References:
[1] Buckles B and Petry F (1982) "A fuzzy representation for relational databases," Fuzzy Sets and Systems 7, pp. 213-226.
[2] Majumder DD (1979) "Cybernetics and general systems - A unitary science," Kybernetes, vol. 8, pp. 7–15.
[3] Majumder DD and Roy PK (2000) "Cancer self-remission and tumor instability - A cybernetic analysis: Toward a fresh paradigm for cancer treatment," Kybernetes, vol. 29, pp. 896–927.
[4] Roy P, Majumder DD and Biswas J (1999) "Spontaneous cancer regression: Implications for fluctuation," Ind. J. Phys., vol. 73-B, pp. 777–883.
[5] Chang SSL and Zadeh L (1972) "On fuzzy mappings and control," IEEE Trans. Syst. Man, Cybern., vol. SMC-2, pp. 30–34.
[6] Benčič A, Hudec M (2002) MOŠ/MIS – Urban and municipal statistics project and information system of the Slovak Republic. In Proceedings of the SYM-OP-IS. Vuletić Print, Tara, Serbia, XXI-32--XXI-35.
[7] Branco A, Evsukoff A, Ebecken N (2005) Generating Fuzzy Queries from Weighted Fuzzy Classifier Rules. In Proceedings of the ICDM workshop on Computational Intelligence in Data Mining. IOS Press, Houston, USA, pp. 21-28.
[8] Cox E (2005) Fuzzy modeling and genetic algorithms for data mining and exploration. Morgan Kaufmann, San Francisco, USA.
[9] Galindo J, Urrutia A, Piattini M (2006) Fuzzy Databases: Modeling, Design and Implementation, Idea Group Publishing, Hershey, USA.
[10] Hudec M (2007) Fuzzy improvement of the SQL. In Proceedings of the Balkan Conference on Operational Research (Balcor). Beograd, Serbia, pp. 255-265.
[11] Hudec M (2008) Fuzzy SQL for statistical databases. In MSIS, Meeting on the Management of Statistical Information Systems. Luxembourg.
[12] Chamberlin D, Boyce R (1974) SEQUEL: A Structured English Query Language. In Proceedings of the ACM SIGMOD Workshop on Data Description, Access and Control. ACM Press, Ann Arbor, USA, pp. 249-264.
[13] Kacprzyk J, Pasi G, Vojtáš P, Zadrozny S (2000) Fuzzy querying: Issues and perspectives. Kybernetika, Vol. 36, No. 6, pp. 605-616.
[14] Klir G, Yuan B (1995) Fuzzy sets and fuzzy logic, theory and applications. Prentice Hall, New Jersey, USA.
[16] Zadeh L (1965) Fuzzy sets. Information and Control, No. 8, pp. 338-353.
[17] Chaudhry NA, Moyne JR, Rundensteiner EA (1993) A design methodology for databases with uncertain data. In Techcon '93 Conference Proceedings, pp. 31-33.
[18] Mitra S, Pal SK (2005) Fuzzy sets in pattern recognition and machine intelligence. Fuzzy Sets and Systems 156, pp. 381–386.