6.2 Answering research questions
6.2.5 RQ5: The role of knowledge in the architecture of a semantic search engine
The aim of this research question is to identify what role knowledge plays within the architecture of a semantic search engine. As seen in RQ1, RQ3 and RQ4, knowledge has a relevant role in the proposed architectures, mostly by means of a knowledge base. As shown later in Table 6.8, most of the studies listed there were already used to answer RQ4.
For this question, one field was defined in order to extract the data from the studies:
a Knowledge role: this field captures any description of the role that knowledge plays within the proposed architecture.
Five studies were found to answer this question. A brief summary of each study is shown in Table 6.8, with the knowledge role identified for each one.
Study | Knowledge role
[26] | The knowledge base of this proposed system is domain specific. This knowledge base interacts with the domain ontology in order to get the relevant and correct answers for the user query.
[31] | The knowledge base is essential to provide a set of semantic annotators for the search platform and constitutes an easy and faster way to annotate documents. This knowledge is the support of the interface in the search platform allowing access to verify the relevance of the input and the connection between information important to explore complex queries, and help users in their queries.
[37] | This semantic search engine implements a pseudo-P2P architecture where heterogeneous documents are shared by peers respecting access policies. Intelligent documents (through a document ontology) are the preferred way in this engine to share knowledge.
[39] | The system coordinates the search for knowledge in heterogeneous sources, such as the Web, semi-structured data, relational databases and others. The Knowledge Management Layer comprises a ranking agent, a query formulation agent, an ontology agent and an imaginary domain model.
[41] | Knowledge sharing is one of the most important features of this system, which should be implemented as a platform for collaborative work so that its stakeholders, such as companies, organizations and research institutes, can develop the document base collectively. The innovation process ontology plays a key role in allowing knowledge sharing.
Table 6.8: Studies selected for RQ5
After checking the summaries of the selected studies for this research question, it can be seen that:
a Knowledge sharing is a key feature: the semantic search engines proposed in these studies enable it by means of the domain ontology they implement, such as a document ontology [37] or an innovation process ontology [41].
b Knowledge bases help to obtain better search results, and ontologies play an important role in this. As already noted in section 2.10, knowledge and ontologies usually work together in order to retrieve better and more relevant results [26]. In [31], Mendonça et al. use a knowledge base to support the document annotation process, so that the annotated documents can be queried by users; the same knowledge base also helps users to build their queries, which in turn leads to better search results.
c Knowledge is gathered from heterogeneous sources, which enriches the results a user can get. To accomplish this, Kerschberg et al. propose a set of agents through which those diverse sources can be queried [39]. This approach takes into account the set of general and domain-specific ontologies the system uses, as described in that study.
Chapter 7
Conclusions and future work
In this research, the most important aspects of the architecture of a semantic search engine were presented and discussed.
Most of the studies, as described previously, propose a concrete implementation of their architectures, so those systems were built and are in operation. In addition, those studies have a well-defined main subject and well-detailed architectures, and they state the context in which the search engine is to be developed. However, their conclusions and recommendations for future work are underdeveloped, which constitutes a weak aspect.
It’s also worth mentioning the diversity of countries of origin, which shows that interest in semantic search engines is not restricted to a handful of countries; even developing countries are investing time and resources in these engines. Besides, some of the selected studies, as well as some of the rejected ones, are the result of collaborative work among people from different institutions and countries, which suggests that a broader interest exists in proposing semantic search engines to solve problems.
Regarding the research questions, they were proposed to capture the most relevant points of interest in the architecture of a semantic search engine. RQ1 sought to identify which modules are used the most and which architecture patterns are present. Among the most used modules listed in subsection 6.2.1, the following can be highlighted: extractor components, which are responsible for extracting data and discovering new data; reasoning components, such as ontologies or inference engines; and user interfaces, which shows that usability is a must in a semantic search engine. Additionally, these engines follow the N-tier architecture pattern, in which the layers are loosely coupled, fostering portability and reuse.
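The loose coupling between layers mentioned above can be illustrated with a minimal sketch. It assumes three hypothetical layers (an extractor, a reasoning component and a search entry point used by the user interface); the class names, the synonym table and the sample documents are illustrative assumptions and are not taken from any of the reviewed studies.

```python
from typing import Protocol


class Extractor(Protocol):
    """Data layer: extracts documents from some source."""
    def extract(self) -> list[str]: ...


class Reasoner(Protocol):
    """Reasoning layer: expands a query term into related terms."""
    def expand(self, term: str) -> set[str]: ...


class InMemoryExtractor:
    # Hypothetical data layer backed by a fixed list of documents.
    def __init__(self, documents: list[str]) -> None:
        self._documents = documents

    def extract(self) -> list[str]:
        return list(self._documents)


class SynonymReasoner:
    # Hypothetical reasoning layer backed by a tiny synonym table.
    def __init__(self, synonyms: dict[str, set[str]]) -> None:
        self._synonyms = synonyms

    def expand(self, term: str) -> set[str]:
        return {term} | self._synonyms.get(term, set())


def search(query: str, extractor: Extractor, reasoner: Reasoner) -> list[str]:
    """Entry point for the user interface; it only knows the two interfaces."""
    terms = reasoner.expand(query.lower())
    return [doc for doc in extractor.extract()
            if any(term in doc.lower() for term in terms)]


docs = ["A painting by Repin", "Sensor network report", "Oil canvas from 1880"]
reasoner = SynonymReasoner({"painting": {"canvas"}})
results = search("painting", InMemoryExtractor(docs), reasoner)
print(results)
```

Because `search` depends only on the two small interfaces, either layer could be replaced (for instance, by an extractor that reads from a database) without touching the others, which is the portability and reuse benefit the pattern is credited with.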
RQ2 sought to establish what kind of validation and verification methods the architectures of semantic search engines were using. As already mentioned in subsection 6.2.2, no studies were found while executing the primary search phase. However, this situation constitutes an opportunity for future work: proposing validation and verification mechanisms already in use in other software engineering application fields. As mentioned by Abowd et al. in [57], there are many benefits of using architectural evaluation methods, such as a better understanding and documentation of the system, clarification and prioritization of requirements, and early detection of problems in the architecture, which boosts architecture quality. There are also several evaluation methods for architecture quality attributes, such as the ones identified by Mattsson et al. [58].
Regarding RQ3, the main reason behind it was to identify what kind of requirements the architectures of semantic search engines were trying to fulfill. As already stated in subsection 6.2.3, several common requirements are present across search engines. For instance, high precision of results is one of the most recurring requirements among the selected studies, which shows a very high expectation from end users. In the same vein, usability is mentioned as a requirement, confirming once again the very high impact that it has on a semantic search engine, especially in cases such as the work of Mouromtsev et al. [51], where Russian art collections are involved.
For RQ4, most of the studies propose the use of domain ontologies as a fundamental piece of a semantic search engine. Even when the search engine relies on a general-purpose ontology such as WordNet, domain ontologies are needed to address, in a specialized way, the needs of a specific domain and its user requirements [39]. Along the same lines, ontologies are mostly used to classify and represent relationships among the concepts that users need to find and retrieve. This also fosters reusability, as new concepts are identified and added to the ontology, which makes its maintenance crucial as it evolves over time [56].
The last research question, RQ5, concerns the role that knowledge has in the architecture of a semantic search engine. One of the key roles is knowledge sharing, which is achievable through the ontologies that search engines rely on. Knowledge sharing is also mentioned by Gruber [8], and it seems to be a recurring aspect of ontologies and knowledge. In doing so, search results benefit through improved precision and recall. Finally, gathering knowledge from heterogeneous sources requires implementing and maintaining a set of components that retrieve the knowledge stored in those sources, which improves search results as well.
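As a rough illustration of how a domain ontology classifies concepts and how adding a concept affects retrieval, the following sketch assumes a hypothetical ontology stored as a plain dictionary of concept to sub-concepts. The concept names and helper functions are invented for the example and do not come from the reviewed studies.

```python
# Hypothetical domain ontology: each concept maps to its direct sub-concepts.
ONTOLOGY = {
    "disease": ["neurological disease"],
    "neurological disease": ["epilepsy", "migraine"],
    "epilepsy": [],
    "migraine": [],
}


def narrower_concepts(concept: str, ontology: dict[str, list[str]]) -> list[str]:
    """Return the concept plus all of its descendants, in breadth-first order."""
    seen = [concept]
    queue = [concept]
    while queue:
        current = queue.pop(0)
        for child in ontology.get(current, []):
            if child not in seen:
                seen.append(child)
                queue.append(child)
    return seen


def add_concept(ontology: dict[str, list[str]], parent: str, child: str) -> None:
    """Maintenance step: register a newly identified concept under a parent."""
    ontology.setdefault(parent, []).append(child)
    ontology.setdefault(child, [])


# A query for a broad concept can be expanded to its narrower concepts.
expanded = narrower_concepts("neurological disease", ONTOLOGY)

# After the ontology evolves, the same expansion picks up the new concept.
add_concept(ONTOLOGY, "neurological disease", "dementia")
expanded_after = narrower_concepts("disease", ONTOLOGY)
print(expanded, expanded_after)
```

The expansion logic is untouched when the ontology grows, which is one way to read the reusability point above; it also shows why an unmaintained ontology would silently miss newly relevant concepts.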
Lastly, it can be seen that most of the studies aim to solve a previously unresolved problem, to offer a new alternative for users, or to improve search results. To determine whether the new solution is better than the existing one, some studies present, besides the architecture, comparison results or performance benchmarks, mostly to quantify the precision and recall that the semantic search offers. Some examples of these results are shown in the work of Amanqui et al. [27], Thangaraj and Sujatha [46] and Dong et al. [28].
However, as this kind of experiment is outside the purpose of this systematic review, it can be considered a good starting point for future work aimed at understanding how implementers analyze their engines’ results against previously existing solutions.
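For reference, the two measures mentioned above are commonly defined as precision = (relevant retrieved) / (all retrieved) and recall = (relevant retrieved) / (all relevant). The sketch below computes both for two hypothetical result sets; the document identifiers and the "keyword" versus "semantic" labels are made-up data for illustration, not results from any of the cited studies.

```python
def precision_recall(retrieved: set[str], relevant: set[str]) -> tuple[float, float]:
    """Return (precision, recall) for one query; 0.0 when a denominator is empty."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ground truth and result sets for a single query.
relevant = {"d1", "d2", "d3", "d4"}
keyword_results = {"d1", "d5", "d6", "d7"}
semantic_results = {"d1", "d2", "d3", "d5"}

keyword_scores = precision_recall(keyword_results, relevant)
semantic_scores = precision_recall(semantic_results, relevant)
print(keyword_scores, semantic_scores)
```

A comparison of this kind, repeated over a set of queries, is the general shape of the benchmarks that a follow-up study would need to collect and compare.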
Appendix A
List of queries used by search resource
Digital library | Research question | Query
ACM | RQ1 | +(module layer component) +(architecture "system architecture") +("semantic search") +(engine system platform tool) +(implementation application approach)
ACM | RQ2 | +(evaluation assessment appraisal) +(method procedure technique approach) +(validation verification) +(architecture "system architecture") +("semantic search") +(engine system platform tool)
ACM | RQ3 | +(requirement need requisite) +(architecture "system architecture") +("semantic search") +(engine system platform tool)
ACM | RQ4 | +(role) +(ontology) +(architecture "system architecture") +("semantic search") +(engine system platform tool)
ACM | RQ5 | +(role) +(knowledge) +(architecture "system architecture") +("semantic search") +(engine system platform tool)
[…] | RQ1 | (module OR layer OR component) AND (architecture OR "system architecture") AND ("semantic search" AND (engine OR system OR platform OR tool)) AND (implementation OR application OR approach)
[…] | RQ2 | (evaluation OR assessment OR appraisal) AND (method OR procedure OR technique OR approach) AND (validation OR verification) AND (architecture OR "system architecture") AND ("semantic search" AND (engine OR system OR platform OR tool))
[…] | RQ3 | (requirement OR need OR requisite) AND (architecture OR "system architecture") AND ("semantic search" AND (engine OR system OR platform OR tool))
[…] | RQ4 | role AND ontology AND (architecture OR "system architecture") AND ("semantic search" AND (engine OR system OR platform OR tool))
[…] | RQ5 | role AND knowledge AND (architecture OR "system architecture") AND ("semantic search" AND (engine OR system OR platform OR tool))
Scopus | RQ1 | ( TITLE-ABS-KEY ( module OR layer OR component ) AND TITLE-ABS-KEY ( architecture OR "system architecture" ) AND TITLE-ABS-KEY ( "semantic search" AND ( engine OR system OR platform OR tool ) ) AND TITLE-ABS-KEY ( implementation OR application OR approach ) )
Scopus | RQ2 | […] OR procedure OR technique OR approach) AND TITLE-ABS-KEY(validation OR verification) AND TITLE-ABS-KEY(architecture OR "system architecture") AND TITLE-ABS-KEY("semantic search" AND (engine OR system OR platform OR tool))
Scopus | RQ3 | TITLE-ABS-KEY(requirement OR need OR requisite) AND TITLE-ABS-KEY(architecture OR "system architecture") AND TITLE-ABS-KEY("semantic search" AND (engine OR system OR platform OR tool))
Scopus | RQ4 | TITLE-ABS-KEY(role) AND TITLE-ABS-KEY(ontology) AND TITLE-ABS-KEY(architecture OR "system architecture") AND TITLE-ABS-KEY("semantic search" AND (engine OR system OR platform OR tool))
ScienceDirect | RQ1 | tak((module OR layer OR component) AND (architecture OR "system architecture") AND ("semantic search" AND (engine OR system OR platform OR tool)) AND (implementation OR application OR approach))
ScienceDirect | RQ2 | tak((evaluation OR assessment OR appraisal) AND (method OR procedure OR technique OR approach) AND (validation OR verification) AND (architecture OR "system architecture") AND ("semantic search" AND (engine OR system OR platform OR tool)))
ScienceDirect | RQ3 | tak((requirement OR need OR requisite) AND (architecture OR "system architecture") AND ("semantic search" AND (engine OR system OR platform OR tool)))
ScienceDirect | RQ4 | tak(role AND ontology AND (architecture OR "system architecture") AND ("semantic search" AND (engine OR system OR platform OR tool)))
ScienceDirect | RQ5 | tak(role AND knowledge AND (architecture OR "system architecture") AND ("semantic search" AND (engine OR system OR platform OR tool)))
Table A.1: List of queries used for each digital library
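The queries in Table A.1 share the same term groups and differ only in the syntax each digital library expects. As a minimal sketch of that relationship, the code below rebuilds the RQ3 queries from one list of term groups. The function names are hypothetical helpers written for this illustration; only the term groups and the resulting query strings come from the table.

```python
# Term groups for RQ3, copied from Table A.1; multi-word terms keep their quotes.
RQ3_GROUPS = [
    ["requirement", "need", "requisite"],
    ["architecture", '"system architecture"'],
]
SUBJECT = '"semantic search"'
SYSTEM_TERMS = ["engine", "system", "platform", "tool"]


def acm_query(groups: list[list[str]]) -> str:
    """ACM form: every group becomes a required '+( ... )' list of terms."""
    all_groups = groups + [[SUBJECT], SYSTEM_TERMS]
    return " ".join("+(" + " ".join(group) + ")" for group in all_groups)


def _subject_clause() -> str:
    # "semantic search" AND (engine OR system OR platform OR tool)
    return SUBJECT + " AND (" + " OR ".join(SYSTEM_TERMS) + ")"


def boolean_query(groups: list[list[str]]) -> str:
    """Plain boolean form: OR inside a group, AND between groups."""
    clauses = ["(" + " OR ".join(group) + ")" for group in groups]
    clauses.append("(" + _subject_clause() + ")")
    return " AND ".join(clauses)


def scopus_query(groups: list[list[str]]) -> str:
    """Scopus form: each group is wrapped in TITLE-ABS-KEY( ... )."""
    clauses = ["TITLE-ABS-KEY(" + " OR ".join(group) + ")" for group in groups]
    clauses.append("TITLE-ABS-KEY(" + _subject_clause() + ")")
    return " AND ".join(clauses)


def sciencedirect_query(groups: list[list[str]]) -> str:
    """ScienceDirect form: the plain boolean query wrapped in tak( ... )."""
    return "tak(" + boolean_query(groups) + ")"


for build in (acm_query, boolean_query, scopus_query, sciencedirect_query):
    print(build(RQ3_GROUPS))
```

The sketch covers the multi-term groups only; in the table, single-term groups such as role and ontology appear without parentheses in the plain boolean form, so reproducing RQ4 and RQ5 exactly would need one extra case.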
Appendix B
List of articles requested
Table B.1 lists the articles whose authors were contacted because their work was not available in the digital sources, as well as the articles that were purchased because they looked promising at the start of the secondary phase. Only articles that were accepted are listed here.
Study | Research Question | Requested / Purchased | Date | Search resource
[24] | RQ1, RQ3, RQ4 | Purchased | 26th Dec 2016 | ACM
[47] | RQ3 | Requested | 12th Dec 2016 | Scopus
[48] | RQ3 | Requested | 10th Dec 2016 | Scopus
[37] | RQ5 | Requested | 13th Dec 2016 | Scopus
Table B.1: List of articles requested or purchased for the research
Bibliography
[1] Tim Berners-Lee, James Hendler, Ora Lassila, et al. The semantic web. Scientific American, 284(5):28–37, 2001.
[2] Shosh Leshem and Vernon Trafford. Overlooking the conceptual framework. Innovations in Education and Teaching International, 44(1):93–105, 2007.
[3] Joseph A Maxwell. Qualitative research design: An interactive approach. Sage, 2012.
[4] Philippe Kruchten. The rational unified process: an introduction. Addison-Wesley Professional, 2004.
[5] Len Bass, Paul Clements, and Rick Kazman. Software Architecture in Practice. Addison-Wesley Professional, 2003.
[6] R. Guha, Rob McCool, and Eric Miller. Semantic search. In Proceedings of the 12th International Conference on World Wide Web, WWW ’03, pages 700–709, New York, NY, USA, 2003. ACM. ISBN 1-58113-680-3.
[7] Amit Sheth, Cartic Ramakrishnan, and Christopher Thomas. Semantics for the semantic web: The implicit, the formal and the powerful. International Journal on Semantic Web and Information Systems (IJSWIS), 1(1):1–18, 2005.
[8] Thomas R. Gruber. A translation approach to portable ontology specifications. Knowledge acquisition, 5(2):199–220, 1993.
[9] Nicola Guarino. Formal ontology in information systems: Proceedings of the first international conference (FOIS’98), June 6-8, Trento, Italy, volume 46. IOS press, 1998.
[10] W3C. Resource description framework (rdf) model and syntax specification. https://www.w3.org/TR/PR-rdf-syntax/, January 1999. [Online; accessed 26-March-2016].
[11] W3C. Inference - w3c. https://www.w3.org/standards/semanticweb/inference, 2015. [Online; accessed 27-March-2016].
[12] Nancy Ide and Jean Véronis. Introduction to the special issue on word sense disambiguation: the state of the art. Computational linguistics, 24(1):2–40, 1998.
[13] Eneko Agirre, Oier Lopez de Lacalle, and Aitor Soroa. Random walks for knowledge-based word sense disambiguation. Computational Linguistics, 40(1):57–84, 2014.
[14] Tim Berners-Lee. Linked data - design issues. https://www.w3.org/DesignIssues/LinkedData.html, 2006. [Online; accessed 02-April-2016].
[15] Ronald J. Brachman, Hector J. Levesque, and Raymond Reiter. Knowledge Representation. MIT press, 1992.
[16] John F. Sowa. Knowledge Representation: Logical, Philosophical, and Computational Foundations. Brooks/Cole, 1999.
[17] Randall Davis, Howard Shrobe, and Peter Szolovits. What is a knowledge representation? AI magazine, 14(1):17, 1993.
[18] Ricardo Baeza-Yates and Berthier Ribeiro-Neto. Modern information retrieval, volume 463. ACM Press New York, 1999.
[19] Amit Singhal. Modern information retrieval: A brief overview. IEEE Data Eng. Bull., 24(4):35–43, 2001.
[20] Barbara Kitchenham. Procedures for performing systematic reviews. Keele, UK, Keele University, 33(2004):1–26, 2004.
[21] Mark Petticrew and Helen Roberts. Systematic reviews in the social sciences: A practical guide. John Wiley & Sons, 2008.
[22] Mohammad Zarour, Alain Abran, Jean-Marc Desharnais, and Abdulrahman Alarifi. An investigation into the best practices for the successful design and implementation of lightweight software process assessment methods: A systematic literature review. Journal of Systems and Software, 101:180–192, 2015. ISSN 0164-1212.
[23] Toan Luu, Fabius Klemm, Ivana Podnar, Martin Rajman, and Karl Aberer. Alvis peers: A scalable full-text peer-to-peer retrieval engine. In Proceedings of the International Workshop on Information Retrieval in Peer-to-peer Networks, P2PIR ’06, pages 41–48, New York, NY, USA, 2006. ACM. ISBN 1-59593-527-4.
[24] Frederico A. Durão, Taciana A. Vanderlei, Eduardo S. Almeida, and Silvio R. de L. Meira. Applying a semantic layer in a source code search tool. In Proceedings of the 2008 ACM Symposium on Applied Computing, SAC ’08, pages 1151–1157, New York, NY, USA, 2008. ACM. ISBN 978-1-59593-753-7.
[25] Asad Masood Khattak, Jibran Mustafa, Nabeel Ahmed, Khalid Latif, and Sharifullah Khan. Intelligent search in digital documents. In Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01, WI-IAT ’08, pages 558–561, Washington, DC, USA, 2008. IEEE Computer Society. ISBN 978-0-7695-3496-1.
[26] A. N. Jamgade and S. J. Karale. Ontology based information retrieval system for academic library. In Innovations in Information, Embedded and Communication Systems (ICIIECS), 2015 International Conference on, pages 1–6, March 2015.
[27] F. K. Amanqui, K. J. Serique, S. D. Cardoso, J. L. D. Santos, A. Albuquerque, and D. A. Moreira. Improving biodiversity data retrieval through semantic search and ontologies. In Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM International Joint Conferences on, volume 1, pages 274–281, Aug 2014.
[28] H. Dong, F. K. Hussain, and E. Chang. A service search engine for the industrial digital ecosystems. IEEE Transactions on Industrial Electronics, 58(6):2183–2196, June 2011. ISSN 0278-0046.
[29] A. M. Khattak, N. Ahmad, J. Mustafa, Z. Pervez, K. Latif, and S. Y. Lee. Context-aware search in dynamic repositories of digital documents. In 2013 IEEE 16th International Conference on Computational Science and Engineering, pages 338–345, Dec 2013.
[30] D. Çelik, E. Elverici, A. Elçi, and N. Inan. Educational activity finder for children with pervasive developmental disorder through a semantic search system. In 2012 IEEE 36th Annual Computer Software and Applications Conference, pages 482–487, July 2012.
[31] R. Mendonça, A. F. Rosa, J. L. Oliveira, and A. Teixeira. Ontology-based health information search: Application to the neurological disease domain. In 2013 8th Iberian Conference on Information Systems and Technologies (CISTI), pages 1–6, June 2013.
[32] N. Guelfi, C. Pruski, and C. Reynaud. Experimental assessment of the target adaptive ontology-based web search framework. In 2010 10th Annual International Conference on New Technologies of Distributed Systems (NOTERE), pages 297–302, May 2010.
[33] S. Movva, R. Ramachandran, S. Graves, and H. Conover. Customizable search engine with semantic and resource aggregation capability. In 2008 10th IEEE Conference on E-Commerce Technology and the Fifth IEEE Conference on Enterprise Computing, E-Commerce and E-Services, pages 376–381, July 2008.
[34] A. Vasilateanu, N. Goga, and A. Moldoveanu. Semantic report search engine - questor. In System Theory, Control and Computing (ICSTCC), 2014 18th International Conference, pages 134–139, Oct 2014.
[35] M. El Kholy and A. Elfatatry. Intelligent broker a knowledge based approach for semantic web services discovery. In Evaluation of Novel Approaches to Software Engineering (ENASE), 2015 International Conference on, pages 39–44, April 2015.
[36] A. Hogan, A. Harth, J. Umbrich, S. Kinsella, A. Polleres, and S. Decker. Searching and browsing linked data with swse: The semantic web search engine. Journal of Web Semantics, 9(4):365–401, 2011. ISSN 15708268.
[37] F. Corradini, F. De Angelis, F. Paoloni, A. Polzonetti, and B. Re. A case study of a semantic search engine for g2g collaboration based on intelligent documents. Proceedings of 4th International Conference on e-Government, ICEG 2008, pages 499–506, 2008.
[38] K. Schutte, F. Bomhof, G. Burghouts, J. Van Diggelen, P. Hiemstra, J. Van ’T Hof, W. Kraaij, H. Pasman, A. Smith, C. Versloot, and J. De Wit. Goose: Semantic search on internet connected sensors. Proceedings of SPIE - The International Society for Optical Engineering, 8758, 2013. ISSN 0277786X.
[39] L. Kerschberg, H. Jeong, Y.U. Song, and W. Kim. A case-based framework for collaborative semantic search in knowledge sifter. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 4626 LNAI:16–30, 2007.
[40] D. Çelik, A. Elçi, and E. Elverici. Finding suitable course material through a semantic search agent for learning management systems of distance education. Proceedings - International Computer Software and Applications Conference, pages 386–391, 2011.
[41] M. Jurczyk-Bunkowska and I. Pawełoszek. The concept of semantic system for supporting planning of innovation processes [koncepcja semantycznego system wspomagania planowania procesów innowacji]. Polish Journal of Management Studies, 11(1):79–89, 2015.
[42] M. Ponnada and N. Sharda. Model of a semantic web search engine for multimedia content retrieval. Proceedings - 6th IEEE/ACIS International Conference on Computer and Information Science, ICIS 2007; 1st IEEE/ACIS International Workshop on e-Activity, IWEA 2007, pages 818–823, 2007.
[43] T.T. Dao, T.N. Hoang, X.H. Ta, and M.C. Ho Ba Tho. Knowledge-based personalized search engine for the web-based human musculoskeletal system resources (hmsr) in biomechanics. Journal of Biomedical Informatics, 46(1):160–173, 2013.
[44] T. Bürger and G. Güntner. Smart content factory - semantic knowledge based indexing of audiovisual archives. IET Seminar Digest, 2005(11099):367–371, 2005.
[45] E. Borini, R. Damiano, V. Lombardo, and A. Pizzo. Dramasearch. character-mediated search in cultural heritage. Proceedings - 2009 2nd Conference on Human System Interactions, HSI ’09, pages 554–561, 2009.
[46] M. Thangaraj and G. Sujatha. An architectural design for effective information retrieval in semantic web. Expert Systems with Applications, 41(18):8225–8233, 2014.
[47] S. Paiva, M.R. Cabrer, and A.G. Solla. Precision: A semantic search guided-based system. Systems Theory: Perspectives, Applications and Developments, pages 209–228, 2014.
[48] G. Besbes, H. Baazaoui-Zghal, and H.B. Ghezela. Fuzzy ontology-based system for personalized information retrieval. KDIR 2014 - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval, pages 286–293, 2014.
[49] U. Bügel, M. Schmieder, B. Schnebel, T. Schlachter, and R. Ebel. Leveraging