I200 Information Representation

School of Informatics
Indiana University
Spring 2001

Instructor: Uta Priss
Email: upriss@indiana.edu
Office: LI 029
Office phone: 812-855-2793
Office hours: Tuesday 4.00 - 5.00 and by appointment

This syllabus is electronically available at http://php.indiana.edu/~upriss/i200/200-Sp01-syllabus.html

Course Syllabus


Introduction

The basic structure of information representation in social and scientific applications is the topic of this course. Organization, storage and retrieval of information are important challenges for the modern information society. This course introduces representational structures and approaches from many disciplines: philosophical theories of classification and categorization; psycho-linguistic models of mental language processing; information access and representation on the World Wide Web; object-oriented design and relational databases; AI knowledge representation and discovery. The multi-disciplinary approach of this course demonstrates how concepts of information representation are shared amongst different disciplines and how they can be combined. The goal of the course is to provide a broad but basic introduction to current information representation techniques and paradigms. Although some software tools will be used in the exercises, the students are not expected to write computer programs for this course. Even concepts such as object-oriented design will be explained independently of an actual programming language.

Course Objective

To introduce the student to a broad range of information representation models drawn from the fields of information science, computer science, semiotics, philosophy, cognitive psychology, and artificial intelligence.

Prerequisites

Knowledge of a programming language as can be obtained from INFO I110, INFO I210, or similar courses. Recommended prerequisite or concurrent: INFO I201. Basic knowledge of how to use the WWW.

Class Organization

Each class session consists of lectures, class discussions, and in-class exercises, which the students will work on in small teams. Besides the in-class exercises, assignments are given for each week that the students are expected to complete (with the help of the assigned readings) before the weekly meetings. The assignments will be discussed in class. The students will work on a semester-long team project that they will present during the last class session.

Readings

A readings package will be available from the IU bookstore. Some of the readings are not included in the readings package but are instead linked on-line from this syllabus. To read the on-line ACM readings, students must use a computer that belongs to the indiana.edu domain because the ACM digital library may not be available otherwise.

Grading

The final course grade will be computed for each student on the basis of grades assigned for the following:

Class contribution and listserv discussion 1/3
Team project 1/3
Midterm exam and multiple choice tests 1/3

Each student is expected to complete all course work by the end of the term. A grade of incomplete (I) will be assigned only if exceptional circumstances warrant. In all other cases there will be a grade penalty for items that are handed in late.

Class contribution

Class contribution includes the quality and quantity of contributions to the work of the class. The students are expected to complete the assignments and readings of each week before the class meeting (except for the first week). If a student misses a class, he/she will hand in that week's assignment, which will not be graded but will ensure that the student does not fall behind in the class. Participation in the discussion of assignments and readings will be a large proportion of the class contribution grade. Every student is required to demonstrate respect for the ideas, opinions, and feelings of all other members of the class.

A majordomo mailing list will be used to communicate about course matters. The students should send comments and questions concerning the reading materials and the assignments to this list. Students are expected to post an average of at least one message per week. The mailing list is upriss_i200@indiana.edu.

Team Project

The students will work on a semester-long team project. Each team will design an information system for the materials covered in this class. The system can be web-based or use a more traditional paper-based format. Each system must provide three different means of information representation or access, such as a classification system, graphical representation, database, metadata, ontology or others, as discussed in this class. The information to be represented consists of the topics and readings discussed in this class. Each information system must be well documented and will be handed in during the last class meeting. Each team will present its system during the last class meeting.

Midterm exam and multiple choice tests

The exam will be a take-home exam. Three multiple choice tests will be handed out during the semester. They will be announced in advance on the class mailing list.

A note on plagiarism

The students must clearly indicate if they use materials from other sources, such as textbooks or Internet webpages. Full citation information must be given for such sources. Academic and personal misconduct by students in this class are defined and dealt with according to the procedures in the Code of Student Ethics.

Class Schedule


Part I: Foundations


Week 1. (Jan 9, 11) Introduction: Data, Information, and Knowledge

Assignment:
Analyze the methods of information representation and information access in a phone-book. Pay special attention to the yellow pages. How are they organized? What possibilities of information retrieval does a phone-book on CD-ROM offer compared to a printed phone-book? What are equivalent services on the WWW? What are the user expectations of these types of information systems?

Readings:

Wurman, Richard Saul (1989).
The Understanding Business. In: Information Anxiety - What to do when information doesn't tell you what you need to know. New York: Doubleday. p. 51-82.
Buckland, Michael (1991).
Information as thing. Journal of the American Society for Information Science, 42, p. 351-360.

Week 2. (Jan 16, 18) Classification, categorization and concepts

Assignment:
In a grocery store analyze
* how the merchandise is organized/categorized;
* why this particular organizational structure was adopted; and
* whether this organizational/categorization scheme actually helps or hinders the customer in finding specific items.

Readings:

Jacob, Elin K. (1991).
Classification and categorization: drawing the line. In: Barbara H. Kwasnik and Raya Fidel (Eds.). Advances in classification research. Vol. 2, Washington D.C.: American Society for Information Science, p. 67-83.


Week 3. (Jan 23, 25) Cognitive Organization of Information: Scripts, Schemas, and Mental Models

Assignment:
Design a script for "shopping in a grocery store". How universal can your script be (considering shoppers with different cultural backgrounds or grocery stores in different countries)?

Readings:

Robillard, Pierre (1999).
The role of knowledge in software development. Communications of the ACM, January 1999, p. 87-92.
Optional Reading: Rumelhart, David E. (1984).
Schemata and the cognitive system. In: Wyer and Srull (Eds.). Handbook of social cognition. Vol. 1, Hillsdale NJ: Lawrence Erlbaum, p. 161-188.
Optional Reading: Schank, Roger, and Kass, Alex. (1988).
Knowledge representation in people and machines. In: Umberto Eco, Marco Santambrogio and Patrizia Violi (Eds.), Meaning and mental representation. Bloomington: Indiana University Press, p. 181-200.


Week 4. (Jan 30, Feb 1) Faceted Knowledge Organization

Assignment:
Three facets for grocery items are "storage temperature", "packaging", "type of meal". Find some values (classes) for each of these facets. Choose five grocery items and assign them to classes in the facets.
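
As a rough illustration of how facets work (not a model answer), the following Python sketch treats each facet as an independent set of classes and assigns every item exactly one class per facet. The facet names come from the assignment; the classes, the items, and the helper names classify and find are hypothetical and chosen only for illustration.

```python
# Hypothetical classes for each facet; the assignment names only the facets.
FACETS = {
    "storage temperature": {"frozen", "refrigerated", "room temperature"},
    "packaging": {"can", "box", "bottle", "bag"},
    "type of meal": {"breakfast", "lunch", "dinner", "snack"},
}


def classify(assignment):
    """Check that an item gets exactly one known class for every facet."""
    if set(assignment) != set(FACETS):
        raise ValueError("assign exactly one class for every facet")
    for facet, value in assignment.items():
        if value not in FACETS[facet]:
            raise ValueError(f"{value!r} is not a class of facet {facet!r}")
    return assignment


def find(items, criteria):
    """Return the sorted names of items that match every facet in criteria."""
    return sorted(
        name
        for name, facets in items.items()
        if all(facets[facet] == value for facet, value in criteria.items())
    )


# Hypothetical grocery items, each described along all three facets.
ITEMS = {
    "frozen pizza": classify({
        "storage temperature": "frozen",
        "packaging": "box",
        "type of meal": "dinner",
    }),
    "orange juice": classify({
        "storage temperature": "refrigerated",
        "packaging": "bottle",
        "type of meal": "breakfast",
    }),
    "corn flakes": classify({
        "storage temperature": "room temperature",
        "packaging": "box",
        "type of meal": "breakfast",
    }),
}

if __name__ == "__main__":
    # Facets can be combined freely, which a single hierarchy cannot do.
    print(find(ITEMS, {"packaging": "box"}))
    print(find(ITEMS, {"packaging": "box", "type of meal": "breakfast"}))
```

The point of the sketch is that any combination of facets can serve as an access path to the same items.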

Readings:

Hunter, Eric J. (1988)
Faceted Classification. Classification made simple, Gower, p. 7-33.
Web Developer's Virtual Library
The Web Librarian. (On-line at http://www.wdvl.com/Location/WLn/)

Week 5. (Feb 6, 8) Spatial and Temporal Information

Assignment:
Describe, in a semi-formal way, the use of tools such as "knife", "fork", and "cup" during the process of "eating a meal". The description will be composed of actions, such as "picking up fork", "holding fork", "laying down fork", etc. What are the temporal relationships, i.e. which actions precede or follow which other actions, and which actions are simultaneous? You can use a graphical representation similar to Figure 8 in Allen's paper. How can you represent repeating actions?
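
For readers who want to see the temporal relationships spelled out, the following Python sketch names the relation between two intervals using the thirteen relations from Allen's paper (before, meets, overlaps, starts, during, finishes, equals, and their inverses). The function name allen_relation, the representation of an interval as a (start, end) pair, and the timings in the example are assumptions made for illustration only.

```python
def allen_relation(a, b):
    """Name the Allen relation that holds between interval a and interval b.

    Each interval is a (start, end) pair with start < end.
    """
    (a1, a2), (b1, b2) = a, b
    if a1 >= a2 or b1 >= b2:
        raise ValueError("an interval must start before it ends")
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    # Only a partial overlap remains: both endpoints of one interval
    # come earlier than the matching endpoints of the other.
    return "overlaps" if a1 < b1 else "overlapped-by"


# Hypothetical timings (in seconds) for a few actions during a meal.
ACTIONS = {
    "picking up fork": (0, 1),
    "holding fork": (1, 10),
    "holding knife": (2, 9),
    "laying down fork": (10, 11),
}

if __name__ == "__main__":
    names = list(ACTIONS)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            relation = allen_relation(ACTIONS[first], ACTIONS[second])
            print(f"{first} {relation} {second}")
```

A repeating action could be modelled as a list of such intervals, one per repetition, although that is only one of several possible answers.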

Readings:

Allen, James F. (1983).
Maintaining knowledge about temporal intervals. Communications of the ACM, 26 (11), 832-843.
GIS Introduction
(Online document)
Optional Reading: Cyber Geography Research
(Online document)

Part II: Information on the WWW and in Relational Databases; Object-Oriented Design


Week 6. (Feb 13, 15) Information on the WWW

Assignment:
For the text at the following page find
* three subject terms under which it should be listed in a directory such as Yahoo
* three index terms that would be assigned to the text by an automatic indexing system (i.e. the most frequent terms, excluding stopwords)
* three terms that would be useful for retrieving the document through a full text search but that would not be listed as subject terms or index terms.
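
The automatic indexing mentioned in the second item can be sketched in a few lines of Python: count the words of a text, ignore stopwords, and keep the most frequent terms. This is a deliberately minimal illustration, not how any particular search engine works. The stopword list, the function name index_terms, and the sample text are all hypothetical.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; real systems use much longer ones.
STOPWORDS = {
    "a", "an", "and", "are", "by", "for", "in", "is",
    "it", "of", "on", "that", "the", "to",
}


def index_terms(text, n=3):
    """Return the n most frequent words of text, ignoring stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    return [term for term, _ in counts.most_common(n)]


if __name__ == "__main__":
    sample = (
        "The web is large. Search engines index the web, "
        "and users search the web by keyword."
    )
    print(index_terms(sample, n=2))
```

Note that such a system treats "engine" and "engines" as different terms unless the words are also stemmed, which is one reason why frequency-based index terms and human-assigned subject terms often differ.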

Readings:

Gudivada, V.; Raghavan, V.; Grosky, W.; Kasanagottu, R. (1997)
Information Retrieval on the World Wide Web. IEEE Internet Computing, October 1997.
Schwartz, Candy (1998)
Web Search Engines. Journal of the American Society for Information Science, 49 (11), p. 973-982.
Feb 13, Midterm will be handed out

Week 7. (Feb 20, 22) The Object-Oriented Design Paradigm

Assignment:
Design a simple system for traffic simulation that contains cars, bikes, pedestrians, streets, pedestrian crossings, traffic lights and crossings with four-way stop signs. For each class give several attributes and methods. Pay special attention to attributes that are needed so that the vehicles and pedestrians can react to crossings and to other vehicles and pedestrians.
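
Although no programming is required for this assignment, a short Python sketch may help to show what is meant by classes, attributes, and methods. It covers only a small part of the system (one traffic light and two kinds of road users), and every class, attribute, and method name in it is a hypothetical choice rather than an expected solution.

```python
class TrafficLight:
    """A light at a crossing that cycles green -> yellow -> red -> green."""

    STATES = ("green", "yellow", "red")

    def __init__(self, state="red"):
        if state not in self.STATES:
            raise ValueError(f"unknown state: {state!r}")
        self.state = state  # attribute: the colour currently shown

    def advance(self):
        """Method: switch to the next state in the cycle."""
        position = self.STATES.index(self.state)
        self.state = self.STATES[(position + 1) % len(self.STATES)]


class RoadUser:
    """Attributes and methods shared by everything that moves in traffic."""

    def __init__(self, position=0.0, speed=0.0):
        self.position = position  # metres along the street
        self.speed = speed        # metres per second

    def may_proceed(self, light):
        """A road user in this sketch only proceeds on green."""
        return light.state == "green"

    def move(self, seconds, light):
        """Advance along the street unless the light says to wait."""
        if self.may_proceed(light):
            self.position += self.speed * seconds


class Car(RoadUser):
    """A car inherits everything from RoadUser and adds a lane attribute."""

    def __init__(self, position=0.0, speed=0.0, lane=1):
        super().__init__(position, speed)
        self.lane = lane


class Pedestrian(RoadUser):
    """A pedestrian is a slow road user with a fixed walking speed."""

    def __init__(self, position=0.0):
        super().__init__(position, speed=1.5)


if __name__ == "__main__":
    light = TrafficLight("red")
    car = Car(speed=10.0)
    car.move(2, light)   # the light is red, so the car waits
    light.advance()      # red -> green
    car.move(2, light)   # now the car moves
    print(car.position)
```

The attribute that lets a road user react to a crossing is here the state of the light; reactions to other vehicles and pedestrians would need further attributes, such as the positions of nearby road users.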

Readings:

Parsons, Jeffrey; Wand, Yair (1997).
Choosing classes in conceptual modeling. Communications of the ACM, June 1997, p. 63-69.
Object orientation
(Online document).

Week 8. (Feb 27, Mar 1) Relational Databases

Assignment:
Draw entity-relationship diagrams for the traffic assignment.

Readings:

Sanders, G. Lawrence (1995).
Data Modeling. Danvers, Boston: Boyd & Fraser, p. 17-38.
Feb 27, Midterm is due

Week 9. (Mar 6, 8) Information Architecture

Assignment:
For each of the following questions, determine broad categories instead of specific details:
What software is used on the WWW to facilitate information access, gather statistics, ensure security, etc?
What are the main challenges of web site development (considering, for example, temporal aspects, commerce, accessibility, navigation, implementation, ...)?
What people skills are required of a webmaster?

Readings:

Web Developer's Virtual Library
What is a Webmaster? (On-line at http://stars.com/Internet/Web/Jobs/webmaster.html)
Web Developer's Virtual Library
Faceted Hyper-Trees. (On-line at http://www.wdvl.com/Location/Navigation/Classify.html)
Richmond, Alan
Conceptual Foundations. (On-line at http://stars.com/Authoring/Design/Conceptual.html)

Part III: Information Representation in AI and Cognitive Science


Week 10. (Mar 20, 22) Knowledge Representation and Reasoning

Assignment:
Analyze the following semantic network. What implications are made by the network: what statements can be made about the relationship between a) "Marge" and "dog", b) "Lisa" and "power plant" and c) "Marge" and "Springfield"? Which information is missing?

Readings:

Firebaugh, M. W. (1988)
Knowledge Representation in AI. Artificial Intelligence. Chapter 9. Boston: Boyd & Fraser. p. 274-299.


Week 11. (Mar 27, 29) Lexical Databases and Thesauri

Assignment:
Compare the term (concept) "clothes" in WordNet and Roget's Thesaurus.

Readings:

Miller, George A. (1995).
WordNet: a lexical database for English. In: Communications of the ACM 38 (11), November 1995, p. 39-41.
WordNet on-line.
(Click on "Use WordNet Online").

Week 12. (Apr 3, 5) Ontologies

Assignment:
Analyze the concept "clothes" in CYC.

Readings:

Lenat, Douglas B. (1995).
CYC: a large-scale investment in knowledge infrastructure. In: Communications of the ACM 38 (11), November 1995, p. 33-38.
The Upper CYC Ontology.
(Online document).
new optional reading: Ontology FAQ
(Online document).
new optional reading: Agent-Oriented Technology in Support of E-Business
In: Communications of the ACM, Vol 44, 4, 2001.


Week 13. (Apr 10, 12) Conceptual Graphs and Formal Concept Analysis

Assignment:
Draw conceptual graphs for the following three sentences:
Mary buys an apple at the large grocery store for $1.00.
The plane flies from Chicago to Indianapolis.
John believes that the plane that arrived from Chicago will leave on time.

Readings:

Sowa, John (1999).
Conceptual Graphs and Conceptual Graphs Examples (on-line documents).
Wolff, Karl Erich (1994).
A first Course in Formal Concept Analysis. Proceedings SoftStat'93. Gustav Fischer Verlag. p. 1-5.


Week 14. (Apr 17, 19) Knowledge Discovery and Data Mining

Assignment:
To be determined.

Readings:

Fayyad, U.; Uthurusamy, R. (1996)
Data Mining and Knowledge Discovery in Databases. Communications of the ACM, Vol. 39, No. 11, p. 24-26.
Munakata, Toshinori (1999).
Knowledge Discovery. Communications of the ACM, Vol. 42, No. 11, p. 27-29.
Optional Readings:
Communications of the ACM, Vol. 39, No. 11, p. 27-64.
Communications of the ACM, Vol. 42, No. 11, p. 30-67.
new optional reading: The Semantic Web
In: Scientific American, May 2001.

Week 15. (Apr 24, 26) Presentation of Team Projects