The following paragraph is adapted from a tutorial by Appelt and Israel [3], which introduces IE systems and the debate over how to design them.
There are two basic approaches to the design of IE systems: the Knowledge Engineering Approach and the Automatic Training Approach. In the knowledge engineering approach, the rules for parsing and extracting information are defined by a knowledge engineer or domain expert. The expert formulates these rules from predefined, usually domain-specific text and from his or her own knowledge of the domain. Although this approach depends heavily on the knowledge and expertise of the engineer, most of the best-performing IE systems are based on it. The automatic training approach, on the other hand, requires human labor rather than expert knowledge. The system is fed training corpora, sets of domain-relevant texts that carry annotations, and a learning algorithm is then run on them to produce the information needed for further analysis.
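The contrast between the two approaches can be illustrated with a toy sketch. It is not taken from the tutorial [3]: the speaker-name task, the rule, the function names, and the frequency threshold are all hypothetical choices made for illustration. The first extractor applies a rule written by hand; the second learns from an annotated corpus which preceding word signals an entity.

```python
import re
from collections import Counter

# Knowledge engineering approach: a domain expert writes the rule by hand.
# Hypothetical rule: a speaker is a capitalized name that follows a title.
SPEAKER_RULE = re.compile(r"\b(?:Dr|Prof)\. ([A-Z][a-z]+(?: [A-Z][a-z]+)*)")

def extract_with_rule(text):
    """Return every name matched by the hand-written rule."""
    return SPEAKER_RULE.findall(text)

# Automatic training approach: humans only annotate a corpus; the cue
# words are induced from the annotations instead of being written down.
def train(corpus, min_count=2):
    """corpus: list of (tokens, set of indices of annotated entity tokens).

    Returns the words seen at least min_count times directly before an
    annotated entity (min_count is an illustrative threshold).
    """
    cues = Counter()
    for tokens, entity_positions in corpus:
        for i in entity_positions:
            if i > 0 and (i - 1) not in entity_positions:
                cues[tokens[i - 1]] += 1
    return {word for word, n in cues.items() if n >= min_count}

def extract_with_model(tokens, cues):
    """Return every token that directly follows a learned cue word."""
    return [tokens[i] for i in range(1, len(tokens)) if tokens[i - 1] in cues]

if __name__ == "__main__":
    print(extract_with_rule("The keynote by Dr. Jane Smith follows Prof. Lee."))
    annotated = [
        ("talk by Smith".split(), {2}),
        ("paper by Jones".split(), {2}),
        ("visit to Oslo".split(), {2}),
    ]
    cues = train(annotated)
    print(extract_with_model("demo by Garcia today".split(), cues))
```

In the first extractor the quality of the result rests entirely on how well the expert wrote the rule; in the second it rests on the amount and quality of annotated text, which is the trade-off described above.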
OpenNLP18 is an online list of open source NLP projects, including IE and IR projects, available to both scientists and developers. For example, MinorThird [6], a collection of open source Java classes, provides a number of learning methods for extracting text and labeling text documents. The most popular IE project is the General Architecture for Text Engineering (GATE)19, which offers not only classes and components but also a software development architecture and component development tools. The Natural Language Processing Research Group at the University of Sheffield, UK20, and the Information Sciences Institute21 also maintain collections of links to IE projects.
6 Conclusion
In this study we looked for a system that lets people share their reflections and ideas about a shared experience or event with others. Chat rooms? Instant messaging? Or discussion boards? We reviewed several types of mass interaction systems and assessed how closely each matches the system we want. With the rapid development of the Internet, computer-mediated communication is evolving ever more quickly. Among the applications evaluated, the web discussion forum turned out to be the most versatile way of sharing reflections during an event. Potential conference organizers and attendees were interviewed; they gave their views on a conference support tool and on shared reflection, and described their own experiences. Based on the evaluation results, the functional and non-functional requirements of the system were structured and a design prototype was outlined. Finally, the matchmaking module was discussed, and Natural Language Processing and its subfields were suggested for its implementation.
References
[1] K. C. Adams. The web as database: New extraction technologies and content management.
Online, March 2001.
[2] Jeana Frost, Andrew Fiore, and Judith S. Donath. Scientists, designers seek same for good conversation: Workshop on online dating.
18 http://opennlp.sourceforge.net
19 http://gate.ac.uk
20 http://nlp.shef.ac.uk/research/areas/ie.html
21 http://www.isi.edu/info-agents/RISE/projects.html
[3] D. E. Appelt and D. J. Israel. Introduction to information extraction technology. In 16th
International Joint Conference on Artificial Intelligence, 1999.
[4] Susan B. Barnes. Computer-Mediated Communication: Human-to-Human Communication
across the Internet. Allyn and Bacon, 2003.
[5] Hee-Kyung Cho, Murray Turoff, and Starr Roxanne Hiltz. The impacts of Delphi communication structure on small and medium sized asynchronous groups: Preliminary results.
[6] W. W. Cohen. Minorthird: Methods for identifying names and ontological relations in text using heuristics for inducing regularities from data, 2004.
[7] Pavel Curtis and David A. Nichols. MUDs grow up: Social virtual reality in the real world. In COMPCON, pages 193–200, 1994.
[8] Line Eikvil. Information extraction from world wide web - a survey. Technical Report 945, Norwegian Computing Center, 1999.
[9] Andrew Fiore. Online personals: An overview.
[10] Leonard N. Foner. A multi-agent referral system for matchmaking, 1996.
[11] Thomas F. Gordon et al. Zeno: Groupware for discourses on the internet.
[12] Thomas F. Gordon and Nikos I. Karacapilidis. The zeno argumentation framework. In
International Conference on Artificial Intelligence and Law, pages 10–18, 1997.
[13] Object Management Group. UML specification.
[14] H. Cunningham. Information extraction, automatic. In Encyclopedia of Language and
Linguistics, 2nd Edition. Elsevier, 2005.
[15] The Palace Inc. Virtual world chat software, 1997.
[16] Giulio Jacucci, Antti Oulasvirta, Antti Salovaara, and Risto Sarvas. Supporting the shared experience of spectators through mobile group media. In GROUP ’05: Proceedings of the
2005 international ACM SIGGROUP conference on Supporting group work, pages 207–216,
New York, NY, USA, 2005. ACM Press.
[17] Daniel Jurafsky and James H. Martin. Speech and Language Processing: An Introduction to
Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice
Hall PTR, Upper Saddle River, NJ, USA, 2000.
[18] David Kurlander, Tim Skelly, and David Salesin. Comic chat. In SIGGRAPH ’96: Proceedings of the 23rd annual conference on Computer graphics and interactive techniques, pages 225–236, New York, NY, USA, 1996. ACM Press.
[19] E. Marsh and D. Perzanowski. MUC-7 evaluation of IE technology: Overview of results. In
[20] Joseph F. McCarthy, Danah Boyd, Elizabeth F. Churchill, William G. Griswold, Elizabeth Lawley, and Melora Zaner. Digital backchannels in shared physical spaces: attention, intention and contention. In CSCW ’04: Proceedings of the 2004 ACM conference on
Computer supported cooperative work, pages 550–553, New York, NY, USA, 2004. ACM
Press.
[21] T. Nasukawa and T. Nagano. Text analysis and knowledge mining system. IBM Syst. J., 40(4):967–984, 2001.
[22] Stephen D. Richardson. Determining similarity and inferring relations in a lexical knowledge base. PhD thesis, New York, NY, USA, 1997.
[23] Mark Roseman and Saul Greenberg. Teamrooms: Groupware for shared electronic spaces, 1996.
[24] M. Schwartz and D. Wood. Discovering shared interests among people using graph analysis of global electronic mail traffic, 1992.
[25] Upendra Shardanand and Patti Maes. Social information filtering: Algorithms for automating “word of mouth”. In Proceedings of ACM CHI’95 Conference on Human Factors in Computing Systems, volume 1, pages 210–217, 1995.
[26] Tomek Strzalkowski, Gees C. Stein, G. Bowden Wise, Jose Perez Carballo, Pasi Tapanainen, Timo Jarvinen, Atro Voutilainen, and Jussi Karlgren. Natural language information retrieval: TREC-7 report. In Text REtrieval Conference, pages 164–173, 1998.
[27] Yasuyuki Sumi and Kenji Mase. Collecting, visualizing, and exchanging personal interests and experiences in communities. In WI’01: First Asia-Pacific Conference on Web Intelligence: Research and Development, pages 163–174, London, UK, 2001. Springer-Verlag.
[28] William J. Tolone, Simon M. Kaplan, and Geraldine Fitzpatrick. Specifying dynamic support for collaborative work within worlds. In COCS ’95: Proceedings of conference on Organizational computing systems, pages 55–65, New York, NY, USA, 1995. ACM Press.
[29] Takashi Tomokiyo and Matthew Hurst. A language model approach to keyphrase extraction. In Proceedings of ACL Workshop on Multiword Expressions, 2003.
[30] Peter D. Turney. Learning algorithms for keyphrase extraction. Information Retrieval, 2(4):303–336, 2000.
[31] Murray Turoff and Harold A. Linstone. The Delphi Method: Techniques and Applications. Addison-Wesley, 1975.
[32] Fernanda B. Viégas and Judith S. Donath. Chat circles. In CHI ’99: Proceedings of the
SIGCHI conference on Human factors in computing systems, pages 9–16, New York, NY,
USA, 1999. ACM Press.
[33] Steve Whittaker and Candace Sidner. Email overload: exploring personal information management of email. In CHI ’96: Proceedings of the SIGCHI conference on Human
[34] Steve Whittaker, Loren G. Terveen, William C. Hill, and Lynn Cherny. The dynamics of mass interaction. In Computer Supported Cooperative Work, pages 257–264, 1998.
[35] Wikipedia. Avatar (virtual reality) — Wikipedia, the free encyclopedia, 2006.
[36] Wikipedia. Blog — Wikipedia, the free encyclopedia, 2006.