General questions about coursework

If you have general questions about your coursework, you can ask them at a practical. The practicals on Monday (15:00 - 17:00, JKCC 2) and Tuesday (10:00 - 12:00, JKCC 12) are fairly empty. (The first hour on Tuesday is supervised by Dr. Liu, the second hour by Dr. Priss.) There are also practicals on Wednesday and Thursday.

Printout for the coursework

When you print your source code and documentation, please avoid using too much paper. Use a small font and print two pages per sheet side (and/or print on both sides). As long as the text is readable without a magnifying glass, it will be OK. Remember to highlight the security part(s) with a pen.

Each team should hand in only one copy of the code and documentation.

Marking Scheme for the coursework

You can reach a total of 45 points for the coursework. This is 50% of your final mark. The points will be distributed as follows:

Basic Search                             12 points
Advanced Search                          12 points
HTML: search screens, results pages       8 points
parsing, cookie, security                 6 points
design and adherence to specification     3 points
documentation, credits page               4 points

Problems with Titan

  • Internal Server Error
    If you edit your CGI files using a Windows editor, the first line of your file must end in "-w" ("#!/usr/local/bin/perl -w"); otherwise you will get a server error. This is because Unix does not like the Windows newline character (a carriage return) directly after "/perl". Unfortunately, some CGI scripts may not run with the "-w" option on Titan (see below).
  • No html pages or CGI scripts are viewable through browser
    For some reason, C&IT seems to reset permissions occasionally. Go to your home directory (the one above public_html) and type "chmod 711 ." (the dot stands for the current directory).
  • Slow response or no response on Titan
    Instead of the usual URL, you can also view your files through an alternative URL, which is processed through a different webserver that currently seems to work better. (Thanks to B. Laird for alerting me to this problem.)
    If you are editing your files in a Unix editor, you should leave the "-w" off the first line (i.e. the first line should be "#!/usr/local/bin/perl"). Then your scripts will most likely work on Titan.
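
The Windows newline problem described above can also be avoided by removing the carriage returns from a script before it is run on Titan. The following is only a minimal sketch of that idea: the helper name strip_cr and the file name in the usage comment are hypothetical, and it assumes that the line endings are the only cause of the server error.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Turn Windows line endings (CR LF) into Unix line endings (LF).
# strip_cr is a hypothetical helper name chosen for this sketch.
sub strip_cr {
    my ($text) = @_;
    $text =~ s/\r\n/\n/g;
    return $text;
}

# Usage (hypothetical file name): perl fixnewlines.pl search.cgi
# The file named on the command line is rewritten in place.
if (@ARGV) {
    my $file = $ARGV[0];

    open(my $in, '<', $file) or die "Cannot read $file: $!";
    binmode($in);                       # keep the bytes exactly as stored
    my $text = do { local $/; <$in> };  # read the whole file at once
    close($in);
    $text = '' unless defined $text;    # an empty file yields undef

    open(my $out, '>', $file) or die "Cannot write $file: $!";
    binmode($out);
    print {$out} strip_cr($text);
    close($out);
}
```

Once the carriage returns are gone, the first line can be written without "-w", as recommended for files edited in a Unix editor.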

Problems with Cookies

  • Cookies are turned off by default in IE in the JKCC. To turn them on, go to Tools/Internet Options/Privacy/Advanced Settings.

Specific questions about the coursework

  • Parsing
    You can either parse the three text files once and store them in new files or you can parse them each time a user searches. But make sure that the parsing is done by Perl and does not involve any manual editing.
  • cat versus catalogue, but versus butter
    A search for "cat" should not return "catalogue"; a search for "but" should not return "butter". Use word boundaries (\b) to achieve this.
  • Underscores in the text
    You should take the underscores out of the original text (in your Perl code, not by manual editing).
  • Dealing with "don't"
    There are some odd occurrences in the text file, for example, "don't" is printed as "do n't". Ignore this. I won't use such words to test your search.
  • Boolean Operators in basic search
    In the basic search, the operators from the form can be translated directly into Perl AND/OR/NOT.
    Operator precedence: AND binds more strongly than OR. That means that "a AND b OR c" is the same as "(a AND b) OR c". For NOT: "a NOT b" is the same as "a AND NOT b".
  • Documentation/Help page
    The text of your documentation page (or help page) should be between 1 and 2 pages long when printed on A4 paper. In addition, you should include 1 or 2 diagrams, as specified in the coursework description.
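
The points above about word boundaries, underscores and Boolean operators in the basic search can be sketched in a few lines of Perl. This is only an illustration under stated assumptions, not a model solution: the subroutine names (clean_text, has_word, x_and_y_or_z, x_not_y) and the sample sentence are made up, underscores are assumed to be deleted rather than replaced by spaces, and matching is assumed to ignore case.

```perl
use strict;
use warnings;

# Delete underscores (assumption: they are dropped, not replaced by spaces).
sub clean_text {
    my ($text) = @_;
    $text =~ tr/_//d;
    return $text;
}

# Return 1 if $word occurs in $text as a whole word, otherwise 0.
# \Q...\E quotes any regex metacharacters in the search term.
# The /i flag (ignore case) is an assumption of this sketch.
sub has_word {
    my ($text, $word) = @_;
    return $text =~ /\b\Q$word\E\b/i ? 1 : 0;
}

# "x AND y OR z" groups as "(x AND y) OR z": Perl's && binds tighter than ||.
sub x_and_y_or_z {
    my ($text, $x, $y, $z) = @_;
    return (has_word($text, $x) && has_word($text, $y) || has_word($text, $z)) ? 1 : 0;
}

# "x NOT y" is read as "x AND NOT y".
sub x_not_y {
    my ($text, $x, $y) = @_;
    return (has_word($text, $x) && !has_word($text, $y)) ? 1 : 0;
}

my $raw  = "The _cat_ sat near the catalogue, eating butter.";
my $text = clean_text($raw);

print has_word($raw,  "cat"), "\n";   # 0: "_" is a word character, so \b does not match
print has_word($text, "cat"), "\n";   # 1
print has_word($text, "but"), "\n";   # 0: "butter" is not a whole-word match
print x_and_y_or_z($text, "cat", "dog", "butter"), "\n";   # 1
print x_not_y($text, "cat", "butter"), "\n";               # 0
```

The first line of output shows one reason for removing the underscores: Perl treats "_" as a word character, so \b does not see a boundary between "_" and "cat".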

Expert Search

  • Boolean Operators in expert search

    Short answer: I apologize for the confusion about the expert search. If you think that your search does not correspond to any of the solutions I describe below (for example, if you don't allow a + for the first term), please don't panic. These are minor details which do not have a major impact on the points for the coursework. I would recommend that you don't start making changes but instead describe in your documentation how your expert search works.

    Long answer: Translating the Boolean operators directly into Perl AND/OR/NOT causes problems. If
    +cat +dog bird fish
    were translated into
    (cat AND dog) OR bird OR fish
    then documents containing only bird or fish, but neither dog nor cat, would be retrieved. This would contradict the fact that cat and dog are supposed to be required.

    But it turns out that other possible solutions also have problems. Another solution is to group all + terms and all terms without + and connect them with AND:
    (cat AND dog) AND (bird OR fish)
    But this produces odd behavior in the case of only one optional term because
    +cat +dog bird
    would be translated as
    cat AND dog AND bird
    where the optional term is then also required.

    The solution which seems most appropriate from a logical viewpoint would be to drop the optional terms if required terms exist:
    cat AND dog
    But this logically correct solution has severe usability issues because optional terms seem to have no purpose at all.

    All of these issues are caused by the fact that there is no simple way to translate natural language terms (such as "required") into Boolean operators.

    In conclusion: Your expert search should produce precise results in cases where all terms are optional and in cases where all terms have either + or - in front of them, because these cases are unambiguous. In all other cases, several different solutions are possible.

  • How the expert search will be evaluated
    You need not consider more than three terms in one query. You can expect the plus or minus sign to be directly attached to the terms. For example, a query might be "cat +dog", but not "cat + dog" or "cat+dog".
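
The expert search discussion above can be sketched in Perl as follows. This is a sketch under stated assumptions rather than the expected solution: the subroutine names (parse_query, has_word, matches) are hypothetical, a term with "-" is assumed to mean "must not occur", and matching is assumed to ignore case. For the two unambiguous cases it follows the text: all-optional terms are joined with OR, and signed terms are all enforced. For mixed queries it picks just one of the possible solutions described above (optional terms are ignored when required terms exist); other choices are equally acceptable if documented.

```perl
use strict;
use warnings;

# Split a query such as "+cat -dog bird" into required, excluded and
# optional terms. The sign must be attached directly to the term.
sub parse_query {
    my ($query) = @_;
    my %terms = (required => [], excluded => [], optional => []);
    for my $token (split ' ', $query) {
        if    ($token =~ /^\+(\w+)$/) { push @{ $terms{required} }, $1 }
        elsif ($token =~ /^-(\w+)$/)  { push @{ $terms{excluded} }, $1 }
        elsif ($token =~ /^(\w+)$/)   { push @{ $terms{optional} }, $1 }
        # Anything else (for example a lone "+") is ignored in this sketch.
    }
    return \%terms;
}

# Return 1 if $word occurs in $text as a whole word (case ignored), else 0.
sub has_word {
    my ($text, $word) = @_;
    return $text =~ /\b\Q$word\E\b/i ? 1 : 0;
}

# Return 1 if $text satisfies $query, otherwise 0.
sub matches {
    my ($text, $query) = @_;
    my $t   = parse_query($query);
    my @req = @{ $t->{required} };
    my @exc = @{ $t->{excluded} };
    my @opt = @{ $t->{optional} };

    return 0 unless @req || @exc || @opt;   # empty query matches nothing

    # Excluded terms must never occur (assumed meaning of "-").
    for my $word (@exc) {
        return 0 if has_word($text, $word);
    }

    # If there are required terms, all of them must occur. Optional terms
    # are then ignored: one possible choice for mixed queries.
    if (@req) {
        for my $word (@req) {
            return 0 unless has_word($text, $word);
        }
        return 1;
    }

    # Only excluded terms were given, and none of them occurred.
    return 1 unless @opt;

    # All remaining terms are optional: any one of them is enough (OR).
    for my $word (@opt) {
        return 1 if has_word($text, $word);
    }
    return 0;
}

my $doc = "A cat chased a dog.";
print matches($doc, "cat bird"), "\n";    # 1: all terms optional, "cat" occurs
print matches($doc, "+cat +dog"), "\n";   # 1: both required terms occur
print matches($doc, "+cat -dog"), "\n";   # 0: the excluded term "dog" occurs
```

Because the signs are only recognised when attached to a term, a token such as "cat+dog" is simply skipped here, which is consistent with the statement that such queries will not be used for evaluation.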