General questions about coursework
If you have general questions about your coursework, you can go to
a practical and ask your questions. The practicals on Monday
(15:00 - 17:00, JKCC 2) and Tuesday (10:00 - 12:00, JKCC 12) are
fairly empty. (The first hour on Tuesday is supervised by Dr. Liu,
the second hour by Dr. Priss.) There are also practicals on Wednesday
Printout for the coursework
When you print your source code and documentation, please avoid
using too much paper. Use small fonts and print two pages per sheet
(and/or use both sides). As long as the text is readable without a
magnifying glass, it will be OK. Remember to highlight the security
part(s) with a pen.
Each team should hand in only one copy of the code and documentation.
Marking Scheme for the coursework
You can reach a total of 45 points for the coursework. This is
50% of your final mark. The points will be distributed as follows:
| Basic Search                          | 12 points |
| Advanced Search                       | 12 points |
| HTML: search screens, results pages   |  8 points |
| Parsing, cookie, security             |  6 points |
| Design and adherence to specification |  3 points |
| Documentation, credits page           |  4 points |
Problems with Titan:
Internal Server Error
If you edit your CGI files
using a Windows editor, the first line of your file must end in "-w"
("#!/usr/local/bin/perl -w"); otherwise you will get a server error.
This is because Windows editors end each line with a carriage return
followed by a newline, and Unix treats a carriage return directly after
"/perl" as part of the program name.
Unfortunately, some CGI scripts may not run with the "-w" option
on Titan (see below).
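The effect of the Windows line ending can be illustrated with a minimal Perl sketch. It only inspects two sample first lines and assumes they were saved with Windows line endings ("\r\n"). The sub name interpreter_path is made up for this illustration, and the regular expression only approximates how Unix reads the first line.

```perl
use strict;
use warnings;

# Hypothetical helper: return what follows "#!" up to the first space or
# newline. This is roughly the part that Unix takes as the program name.
sub interpreter_path {
    my ($first_line) = @_;
    my ($path) = $first_line =~ /^#!([^ \n]*)/;
    return $path;
}

# Sample first lines as a Windows editor would save them (assumption).
my $without_w = "#!/usr/local/bin/perl\r\n";
my $with_w    = "#!/usr/local/bin/perl -w\r\n";

for my $line ($without_w, $with_w) {
    my $path   = interpreter_path($line);
    my $status = $path =~ /\r/ ? "contains a carriage return" : "is clean";
    print "program name $status\n";
}
```

Without "-w" the carriage return sticks to the program name; with "-w" it ends up after the option instead.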
No html pages or CGI scripts are viewable through browser
For some reason C&IT seems to occasionally reset permissions. Go
to your home directory (the one above public_html) and type "chmod 711 ."
(the dot stands for the current directory).
Slow response or no response on Titan
Instead of the URL www.titan.napier.ac.uk/~username, you can also view
your files through www.dcs.napier.ac.uk/~username. This is processed
through a different web server, which currently seems to work better.
(Thanks to B. Laird for alerting me to this problem.)
If you are editing your files in a Unix editor, you should leave the
"-w" off the first line (i.e. the first line should be
"#!/usr/local/bin/perl"). Then your scripts will most likely work on Titan.
Problems with Cookies
Cookies are turned off by default in IE in the JKCC. To turn them
on, go to Tools/Internet Options/Privacy/Advanced Settings.
Specific questions about the coursework
You can either parse the three text files once and store them in new files
or you can parse them each time a user searches. But make sure that
the parsing is done by Perl and does not involve any manual editing.
cat versus catalogue, but versus butter
A search for "cat" should not return "catalogue", a search for "but"
should not return "butter". Use word boundaries (\b) to achieve this.
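A minimal sketch of a whole-word match with \b follows. The sub name matches_whole_word is made up for illustration. Ignoring case (/i) and quoting the user's term with \Q...\E are assumptions, not requirements from the specification.

```perl
use strict;
use warnings;

# Hypothetical helper: true (1) if $term occurs in $text as a whole word.
# \b matches at a word boundary, so "cat" does not match inside "catalogue".
# \Q...\E stops characters in the user's term from acting as regex operators.
sub matches_whole_word {
    my ($text, $term) = @_;
    return $text =~ /\b\Q$term\E\b/i ? 1 : 0;
}

print matches_whole_word("See the catalogue", "cat"), "\n";   # prints 0
print matches_whole_word("The cat sat down",  "cat"), "\n";   # prints 1
print matches_whole_word("Bread and butter",  "but"), "\n";   # prints 0
```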
Underscores in the text
You should take the underscores out of the original text file.
Dealing with "don't"
There are some odd occurrences in the text file, for example,
"don't" is printed as "do n't". Ignore this. I won't use such
words to test your search.
Boolean Operators in basic search
In the basic search, the operators from the form can be translated
directly into Perl's logical operators (written in lowercase in Perl:
and, or, not).
Operator precedence: AND binds more tightly than OR. That means that
"a AND b OR c" is the same as "(a AND b) OR c". For NOT:
"a NOT b" is the same as "a AND NOT b".
Length of the documentation
The text of your documentation page (or help page) should be between
1 and 2 pages long when printed on A4 paper.
In addition, you should have 1 or 2 diagrams as specified
in the coursework description.
Boolean Operators in expert search
I apologize for the confusion about the expert search.
If you think that your search does not correspond to any of the solutions
I describe below (for example, if you don't allow a + for the first term),
please don't panic. These are minor details which do not have a major
impact on the points for the coursework.
I would recommend that you don't start making changes but instead
mention in your documentation how your expert search works.
Translating the Boolean operators directly
into Perl AND/OR/NOT causes problems. If
+cat +dog bird fish
was translated into
(cat AND dog) OR bird OR fish
then documents containing only bird or fish,
but neither cat nor dog, would be retrieved.
This would contradict the fact that cat and dog are
supposed to be required.
But it turns out that other possible solutions also have problems.
Another solution is to group all + terms and all terms
without + and connect them with AND:
(cat AND dog) AND (bird OR fish)
But this produces odd behavior in the case of only one optional term because
+cat +dog bird
would be translated as
cat AND dog AND bird
where the optional term is then also required.
The solution which seems most appropriate from a logical viewpoint
would be to leave the optional terms
off, if required terms exist:
cat AND dog
But this logically correct solution has severe usability issues because
optional terms seem to have no purpose at all.
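The three candidate translations discussed above can be compared side by side in a short Perl sketch for the query "+cat +dog bird fish". All sub names and sample documents are made up for illustration. The sketch assumes the query has already been split into a list of + terms and a list of optional terms.

```perl
use strict;
use warnings;

# Hypothetical helper: true (1) if $word occurs in $doc as a whole word.
sub has {
    my ($doc, $word) = @_;
    return $doc =~ /\b\Q$word\E\b/i ? 1 : 0;
}

# True if every term occurs (also true for an empty list).
sub all_of {
    my ($doc, @terms) = @_;
    for my $term (@terms) {
        return 0 unless has($doc, $term);
    }
    return 1;
}

# True if at least one term occurs (false for an empty list).
sub any_of {
    my ($doc, @terms) = @_;
    for my $term (@terms) {
        return 1 if has($doc, $term);
    }
    return 0;
}

# Candidate 1: (cat AND dog) OR bird OR fish
sub direct_translation {
    my ($doc, $required, $optional) = @_;
    return all_of($doc, @$required) || any_of($doc, @$optional);
}

# Candidate 2: (cat AND dog) AND (bird OR fish)
sub grouped_translation {
    my ($doc, $required, $optional) = @_;
    return all_of($doc, @$required) && any_of($doc, @$optional);
}

# Candidate 3: cat AND dog (optional terms are ignored)
sub required_only {
    my ($doc, $required) = @_;
    return all_of($doc, @$required);
}

my @required = ("cat", "dog");
my @optional = ("bird", "fish");

for my $doc ("a bird in a tree", "the cat chased the dog", "cat, dog and fish") {
    printf "%-24s direct=%d grouped=%d required_only=%d\n", $doc,
        direct_translation($doc, \@required, \@optional),
        grouped_translation($doc, \@required, \@optional),
        required_only($doc, \@required);
}
```

The first document is retrieved only by candidate 1, although it contains neither required term. The second is rejected by candidate 2, although it contains both required terms. Only the third document is retrieved by all three candidates.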
All of these issues are caused by the fact that there is no simple
way to translate natural language terms (such as "required")
into Boolean operators.
Your expert search should produce precise results in cases where all
terms are optional and in cases where all terms either have + or -
in front of them, because these cases are unambiguous. In all other
cases, there exist several different possible solutions.
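The two unambiguous cases could be handled along the following lines. This is only a sketch under assumptions: the sub names are made up, the query is assumed to be already split into + terms, - terms and optional terms, and returning an undefined value for every other case is just a placeholder for whichever solution you choose and document.

```perl
use strict;
use warnings;

# Hypothetical helper: true (1) if $word occurs in $doc as a whole word.
sub has {
    my ($doc, $word) = @_;
    return $doc =~ /\b\Q$word\E\b/i ? 1 : 0;
}

# Returns 1 or 0 for the two unambiguous cases and an undefined value
# for everything else (mixed or empty queries).
sub match_unambiguous {
    my ($doc, $plus, $minus, $optional) = @_;

    # Case 1: all terms are optional, so one matching term is enough (OR).
    if (@$optional && !@$plus && !@$minus) {
        for my $term (@$optional) {
            return 1 if has($doc, $term);
        }
        return 0;
    }

    # Case 2: every term has + or - in front of it:
    # all + terms must occur (AND), no - term may occur (AND NOT).
    if (!@$optional && (@$plus || @$minus)) {
        for my $term (@$plus) {
            return 0 unless has($doc, $term);
        }
        for my $term (@$minus) {
            return 0 if has($doc, $term);
        }
        return 1;
    }

    return;   # ambiguous: several solutions are possible
}

# "+cat -dog" on a document that mentions only the cat
print match_unambiguous("the cat sat down", ["cat"], ["dog"], []), "\n";   # prints 1
```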
How the expert search will be evaluated
You need not consider more than three terms in one query.
You can expect the plus
or minus sign to be directly attached to the terms. For example,
a query might be "cat +dog", but not
"cat + dog" or "cat+dog".
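A query of this form could be split into its terms roughly as follows. This is a minimal sketch: the sub name parse_expert_query and the group names are made up, and a minus sign is read here as marking a term to exclude. Rejecting queries with a detached sign or with more than three terms is my assumption (the evaluation simply will not use such queries), as is lowercasing the terms.

```perl
use strict;
use warnings;

# Hypothetical parser: returns a hash reference with the keys required,
# excluded and optional, or an undefined value if the query has a form
# the sketch does not handle.
sub parse_expert_query {
    my ($query) = @_;

    # Split on whitespace; leading and trailing blanks are ignored.
    my @tokens = split ' ', $query;
    return if @tokens < 1 || @tokens > 3;

    my %terms = (required => [], excluded => [], optional => []);
    for my $token (@tokens) {
        # An optional + or - must be attached to the front of the word.
        my ($sign, $word) = $token =~ /^([+-]?)(\w+)$/
            or return;
        my $group = $sign eq '+' ? 'required'
                  : $sign eq '-' ? 'excluded'
                  :                'optional';
        push @{ $terms{$group} }, lc $word;
    }
    return \%terms;
}

my $parsed = parse_expert_query("cat +dog");
print "required: @{ $parsed->{required} }\n";   # prints "required: dog"
print "optional: @{ $parsed->{optional} }\n";   # prints "optional: cat"
```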