Questions about the coursework

The search engine part

  • How is the security of the application evaluated? I will test the application to see whether it is secure by typing some malicious code into your forms. The tests relate mostly to what was discussed in the week 4 lecture. In order for you to receive 10 points for security, your script must pass all tests and there must be some discussion of security in your documentation.
  • What do you mean by "5 lines"? "Lines" can mean either lines in the HTML source file or lines as displayed by the browser. Either interpretation is fine as long as it displays some context for each match. If you use the line breaks in the original file, make sure the Unix line breaks are not lost when you save the file on a PC.
  • What if the display is messed up because of open HTML tags? Dealing with open HTML tags can be difficult. Getting this to work 100% may be impossible. Aim for a solution that works in most cases.
  • How to highlight the results: you could, for example, use a bold font. Have a look at how Google or a similar search engine does this.
  • Downloading the page that is to be searched: You can download the page and store it on your server.
  • Changing the page that is to be searched: If you want to change the page in any way (e.g. break it down, insert something, reformat it, or upload it into MySQL), you can do that. As long as the search retrieves the same pieces of text as the original page, I don't care what you do to your copy of the page.
  • Funny characters on the page that is to be searched: There are some funny characters on the page (e.g. &#60; and &#62;, which display as < and >). You don't need to worry about them with respect to the search. I do not intend to use any difficult characters when I test your search for marking purposes. (But you may need to worry about strange characters with respect to security; see the lecture notes.)
  • What is the difference between AND and a phrase search? In a phrase search the words must occur adjacent to each other, exactly in the same order as entered in the text field by the user. For Boolean AND, the words need not be adjacent and can be in any order.
  • Boolean AND on the search form: if two words are entered then the search should default to evaluating the terms using Boolean AND. Thus users should not type "AND" or any other character to denote AND. The difference between this search and a phrase search is that users need to enter double quotes ("") around phrases.
  • Boolean AND and the 5 lines before and after: you can restrict the Boolean AND to the 10 lines you are printing (i.e. all words must occur within those 10 lines). Alternatively, you can evaluate AND with respect to the whole page (i.e. all words must occur somewhere on the page, and you print 10 lines around each occurrence of any of the words). The second option is still different from OR, because with OR not all words need to occur on the page.
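
The search behaviour described above (quoted phrases, default Boolean AND over the whole page, 5 lines of context, bold highlighting) could be sketched roughly as follows. This is only an illustration in Python; the coursework itself is written in Perl or PHP, and the same logic carries over. The function names, the CONTEXT constant and the assumption that the page has already been split into plain-text lines are all hypothetical and not part of the coursework specification. A phrase that spans a line break is not found by this simple version.

```python
import html
import re

CONTEXT = 5  # lines shown before and after each matching line


def parse_query(query):
    """Return ("phrase", [text]) for a quoted query, else ("and", [words])."""
    query = query.strip()
    if len(query) > 2 and query.startswith('"') and query.endswith('"'):
        return "phrase", [query[1:-1]]
    return "and", query.split()


def search(lines, query):
    """Return one (start, end) line range, end exclusive, per matching line."""
    mode, terms = parse_query(query)
    terms = [term.lower() for term in terms]
    if not terms:
        return []
    lowered = [line.lower() for line in lines]
    # Whole-page AND: every word must occur somewhere on the page.
    if mode == "and" and not all(
        any(term in line for line in lowered) for term in terms
    ):
        return []
    # A line is a hit if it contains the phrase, or any of the AND words.
    hits = [i for i, line in enumerate(lowered)
            if any(term in line for term in terms)]
    return [(max(0, i - CONTEXT), min(len(lines), i + CONTEXT + 1))
            for i in hits]


def highlight(line, terms):
    """Escape a plain-text line for HTML and wrap each term in <b> tags."""
    if not terms:
        return html.escape(line)
    pattern = re.compile(
        "(" + "|".join(re.escape(term) for term in terms) + ")",
        re.IGNORECASE,
    )
    parts = pattern.split(line)
    # With one capturing group, the odd-indexed parts are the matched terms.
    return "".join(
        "<b>" + html.escape(part) + "</b>" if i % 2 else html.escape(part)
        for i, part in enumerate(parts)
    )
```

Escaping each piece of text before it is printed means that anything typed into the form is displayed as text rather than interpreted as markup, which is one of the points to cover in your security discussion.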

    Problems with Cookies

  • Cookies may be turned off by default in IE in the JKCC. To turn them on, go to Tools/Internet Options/Privacy/Advanced Settings.
  • From home: some students report that cookies do not work properly when they use the scripts from home. This could be due to the university firewall.
  • It is a good idea to go to the directory where cookies are stored and check whether cookies are actually written. Note: this only works for non-session cookies. Session cookies are held in memory by the browser and are never written to the hard drive at all.
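
The difference between session and non-session cookies comes down to whether an expiry date is sent with the cookie. The following is a minimal sketch in Python using the standard http.cookies module (your own scripts will be in Perl or PHP); the cookie names visit_id and theme and the one-week lifetime are made up for illustration.

```python
from http.cookies import SimpleCookie

ONE_WEEK = 7 * 24 * 60 * 60  # lifetime in seconds (illustrative value)


def build_cookies():
    """Build one session cookie and one non-session (persistent) cookie."""
    cookies = SimpleCookie()
    # No expiry attribute: a session cookie, kept only in browser memory.
    cookies["visit_id"] = "abc123"
    # An expiry attribute makes this a non-session cookie, which the
    # browser is expected to keep after it is closed.
    cookies["theme"] = "dark"
    cookies["theme"]["expires"] = ONE_WEEK  # seconds from now
    cookies["theme"]["path"] = "/"
    return cookies


def is_persistent(morsel):
    """True if the cookie carries an expiry date or a maximum age."""
    return bool(morsel["expires"] or morsel["max-age"])


if __name__ == "__main__":
    # Prints the Set-Cookie header lines a CGI script would send.
    print(build_cookies().output())
```

If a cookie you expected to find in the cookie directory is missing, checking whether your script actually sends an expiry date is a reasonable first step.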

    The printed documentation

  • Schematic diagram/flowchart: you only need one diagram. This diagram can be about your whole application (either including the HTML pages or only the scripts) or it can focus on a particular aspect of the tool.

    Hints for general problems with the webserver

  • Perl: Internal Server Error
    If you edit your CGI files using a Windows editor, the first line of your file must end in "-w" ("#!/usr/local/bin/perl -w"), otherwise you will get a server error. This is because Windows editors add a carriage return at the end of each line, and Unix treats a carriage return directly after "/perl" as part of the interpreter's name. Unfortunately, more complex CGI scripts with undefined variables can run slowly with the "-w" option. On Unix, it may be better not to use "-w" once any errors have been removed from the script.
  • No HTML pages, Perl or PHP scripts are viewable through the browser
    For some reason C&IT seems to occasionally reset permissions. Go to your home directory (the one above public_html) and type "chmod 711 .".
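
If you would rather remove the Windows line endings than rely on "-w", a small helper along the following lines can detect and convert them. This is a rough sketch in Python, not something provided by the course; the function names are hypothetical, and it assumes the script is small enough to be read into memory in one go.

```python
def has_crlf_shebang(data):
    """True if the first line is a #! line ending in a carriage return."""
    first_line = data.split(b"\n", 1)[0]
    return first_line.startswith(b"#!") and first_line.endswith(b"\r")


def to_unix_newlines(data):
    """Replace Windows line endings (CR LF) with Unix line endings (LF)."""
    return data.replace(b"\r\n", b"\n")


def fix_script(path):
    """Rewrite the file with Unix line endings; return True if it changed."""
    # Binary mode keeps Python from translating the line endings itself.
    with open(path, "rb") as handle:
        data = handle.read()
    fixed = to_unix_newlines(data)
    if fixed == data:
        return False
    with open(path, "wb") as handle:
        handle.write(fixed)
    return True
```

Calling fix_script on a CGI file before uploading it leaves files that already use Unix line endings untouched.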

    Questions about the exam

    The format of the exam

  • See the past paper webpage
  • You will not be asked to recite facts, but instead you will be tested with respect to your understanding of the topics.
  • If you are asked to write code in the exam, you can write it either in Perl or PHP.


  • What books to use for revision? There should be a few books about Perl and PHP in the library (e.g. the Learning Perl book). But I am not sure how you would use them for revision, because they are more reference books than textbooks. I would recommend going over some of the exercises again instead of reading lots of books.
  • Will we have to know the content of the various extra readings? You should know the concepts which were discussed in the extra readings. But you will not be asked to recall exactly what author X said in paper Y.

    What you can take with you to the exam

  • The exam is an open-book exam. You may use any book and any handwritten notes.
  • You should take the following with you:
  • You may NOT use electronic devices (e.g. laptops, calculators, mobile phones). If you require electronic devices because of a disability or for other reasons, you need to obtain advance permission.