1) Logfiles and Statistics

1.1) Information about users that is available on the WWW

1.2) Collecting information:

logfiles
cookies
hit counters and similar page elements that report each request to a site tracker, which collects web traffic data
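
As an illustration of how a cookie can be used to recognise a returning visitor, here is a minimal sketch using Python's standard http.cookies module. The cookie name visitor_id, the 30-day lifetime, and the helper names are assumptions made for this example, not something prescribed by these notes.

```python
import uuid
from http.cookies import SimpleCookie

def issue_visitor_cookie():
    """Create a cookie carrying a random visitor identifier (hypothetical name)."""
    cookie = SimpleCookie()
    cookie["visitor_id"] = uuid.uuid4().hex
    cookie["visitor_id"]["path"] = "/"
    cookie["visitor_id"]["max-age"] = 60 * 60 * 24 * 30  # assumed: 30 days
    return cookie

def read_visitor_id(cookie_header):
    """Extract the visitor identifier from a Cookie request header, if present."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("visitor_id")
    return morsel.value if morsel is not None else None

# The server sends this header with its first response ...
issued = issue_visitor_cookie()
set_cookie_header = issued.output(header="Set-Cookie:")

# ... and the browser returns the value on later requests.
returned = read_visitor_id("visitor_id=" + issued["visitor_id"].value)
```

A server that records the returned identifier alongside each request can group requests by visitor rather than by IP address.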

1.3) Generating statistics:

site statistics (locally generated from server logs)
site trackers (usually remotely generated by tracking web traffic)
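
To make the idea of locally generated site statistics concrete, the following sketch counts hits per page and distinct client hosts from a few log lines. It assumes the server writes the Common Log Format; the sample lines, the regular expression, and the function name summarize are illustrative only.

```python
import re
from collections import Counter

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

def summarize(lines):
    """Return (hits per path, set of distinct hosts) for the given log lines."""
    hits = Counter()
    hosts = set()
    for line in lines:
        match = CLF_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed format
        hits[match.group("path")] += 1
        hosts.add(match.group("host"))
    return hits, hosts

# Made-up sample entries (addresses from the documentation range 192.0.2.0/24).
sample_log = [
    '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.2 - - [10/Oct/2000:13:56:01 -0700] "GET /about.html HTTP/1.0" 200 1024',
    '192.0.2.1 - - [10/Oct/2000:13:57:12 -0700] "GET /index.html HTTP/1.0" 304 -',
]

hits, hosts = summarize(sample_log)
for path, count in hits.most_common():
    print(path, count)
print("distinct hosts:", len(hosts))
```

Note that distinct hosts only approximate distinct visitors, since proxies and shared machines hide individual users.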

1.4) Global web statistics:

search engine statistics, e.g. Lycos
Web Server Surveys
WWW statistics

2) Search Engines

2.1) Examples:

  • IU Search
  • Excite advanced search
  • Yahoo advanced search
  • Google advanced search

2.2) Types of searches (cf. Rosenfeld & Morville, Chapter 6)

  • known-item search
  • existence search
  • exploratory search
  • comprehensive search (research)

2.3) Integrating searching and browsing (cf. Rosenfeld & Morville, Chapter 6)

  • e.g. NorthernLight, Google
  • multiple search modes
  • indexing: manually derived categories versus automatically derived categories

2.4) Search results (cf. Rosenfeld & Morville, Chapter 6)

  • context, summary, links, ranking
  • provide help when no documents or too many documents are retrieved

2.5) Search engine techniques

  • Natural language processing: parsing, stemming, indexing
  • Information retrieval: Salton's vector space model
  • Artificial intelligence: automated summarization, categorization
  • latent semantic indexing, singular value decomposition
  • Google: using link structures, anchor texts
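
The vector space model listed above can be sketched in a few lines: documents and the query become term-weight vectors, and documents are ranked by the cosine of the angle between their vector and the query vector. This is a simplified illustration under stated assumptions: whitespace tokenization without stemming, and a smoothed TF-IDF weighting chosen for this example. The function names and sample documents are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (no stemming in this sketch)."""
    return text.lower().split()

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency (an assumed, commonly used variant).
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: c * idf[t] for t, c in tf.items()})
    return vectors, idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def search(query, docs):
    """Return (score, document index) pairs, best match first."""
    vectors, idf = tfidf_vectors(docs)
    q_tf = Counter(tokenize(query))
    q_vec = {t: c * idf[t] for t, c in q_tf.items() if t in idf}
    scored = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    return sorted(scored, reverse=True)

docs = [
    "web server log statistics",
    "search engine ranking and link structure",
    "secure socket layer encryption",
]
results = search("search engine", docs)
```

Here the second document is the only one sharing terms with the query, so it ranks first and the others score zero. A real engine would add stemming and an inverted index rather than comparing the query against every document.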

3) Development Resources

-> Amanda's presentation; W3C resources; Netscape developer's resources; Web Developer's Virtual Library; ....

4) Promoting your site

The art of being highly ranked by search engines.

5) Encryption

SSL (Secure Sockets Layer; https)
  • server and browser must both be configured to use SSL
  • servers are authenticated via a certificate issued by a certificate authority, such as Thawte or VeriSign
  • transmitted data is encrypted
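
The three points above can be seen from the client side in a short sketch using Python's standard ssl module. Current Python versions negotiate TLS, the successor to SSL, but the roles are the same: the client is configured to verify the server, the server's certificate is checked against trusted certificate authorities, and the channel is encrypted. The helper names and the default port and timeout are assumptions for this example.

```python
import socket
import ssl

def make_client_context():
    """Build a client-side context that verifies the server's certificate."""
    # create_default_context() loads the system's trusted CA certificates and
    # turns on certificate verification and hostname checking.
    return ssl.create_default_context()

def fetch_peer_certificate(host, port=443, timeout=5.0):
    """Connect to host, complete the handshake, and return (protocol, certificate)."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The handshake fails if the certificate cannot be traced to a trusted CA
        # or does not match the host name.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), tls.getpeercert()

context = make_client_context()
print("certificate required:", context.verify_mode == ssl.CERT_REQUIRED)
print("hostname checked:", context.check_hostname)
```

Calling fetch_peer_certificate with the name of an https server requires network access; the returned certificate dictionary includes the issuer, that is, the certificate authority that signed it.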