1) Logfiles and Statistics
1.2) Collecting information:
logfiles,
cookies,
hit counters, etc.,
which connect to a site tracker that collects web traffic data
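As a rough illustration of how a cookie lets a hit counter tell visitors apart, here is a minimal Python sketch. The cookie name visitor_id, the in-memory visits store, and the track function are hypothetical names chosen for this example; a real site tracker would persist its data and run inside a web server.

```python
import uuid
from collections import Counter
from http.cookies import SimpleCookie

# Hypothetical in-memory store: visitor id -> number of page views.
visits = Counter()

def track(cookie_header):
    """Count one page view.

    cookie_header is the raw value of the request's Cookie header (or None).
    Returns (visitor_id, set_cookie), where set_cookie is the value for a
    Set-Cookie response header, or None if the visitor already had an id.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    if "visitor_id" in cookie:
        # Returning visitor: reuse the id the browser sent back.
        visitor_id = cookie["visitor_id"].value
        set_cookie = None
    else:
        # New visitor: issue a random id and ask the browser to store it.
        visitor_id = uuid.uuid4().hex
        fresh = SimpleCookie()
        fresh["visitor_id"] = visitor_id
        fresh["visitor_id"]["path"] = "/"
        set_cookie = fresh["visitor_id"].OutputString()
    visits[visitor_id] += 1
    return visitor_id, set_cookie
```

The first request arrives without a cookie and receives one; later requests send it back, so page views can be attributed to the same visitor.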
1.3) Generating statistics:
site statistics (locally generated from server logs)
site trackers (usually remotely generated by tracking web traffic)
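Locally generated site statistics can be sketched in a few lines of Python. The example assumes the server writes its access log in Common Log Format; the sample lines, the summarize function, and the choice to count only 2xx responses are illustrative assumptions, not a description of any particular statistics package.

```python
import re
from collections import Counter

# Assumed Common Log Format: host ident user [time] "request" status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

def summarize(lines):
    """Return (hits per path, set of hosts) for successfully served requests."""
    hits = Counter()
    hosts = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        if match["status"].startswith("2"):
            hits[match["path"]] += 1
            hosts.add(match["host"])
    return hits, hosts

# Made-up sample entries (addresses are from a documentation-only range).
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.2 - - [10/Oct/2000:13:56:01 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [10/Oct/2000:13:57:12 -0700] "GET /missing.html HTTP/1.0" 404 209',
    '192.0.2.1 - - [10/Oct/2000:13:58:40 -0700] "GET /about.html HTTP/1.0" 200 1045',
]

hits, hosts = summarize(SAMPLE_LOG)
for path, count in hits.most_common():
    print(path, count)
print("distinct hosts:", len(hosts))
```

Note that distinct hosts only approximate distinct visitors, since proxies and shared machines hide individual users; this is one reason trackers also use cookies.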
1.4) Global web statistics:
search engine statistics
e.g. Lycos
Web Server Surveys
WWW statistics
2) Search Engines
2.1) Examples:
IU Search
Excite advanced search
Yahoo advanced search
Google advanced search
2.2) Types of searches (cf. Rosenfeld & Morville, Chapter 6)
known-item search
existence search
exploratory search
comprehensive search (research)
2.3) Integrating searching and browsing
(cf. Rosenfeld & Morville, Chapter 6)
e.g. NorthernLight, Google
multiple search modes
indexing: manually derived versus automatically derived categories
2.4) Search results (cf. Rosenfeld & Morville, Chapter 6)
context, summary, links, ranking
provide help when no documents or too many documents are retrieved
2.5) Search engine techniques
Natural language processing: parsing, stemming, indexing
Information retrieval: Salton's vector space model
Artificial intelligence: automated summarization, categorization
latent semantic indexing, singular value decomposition
Google: using link structures, anchor texts
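The vector space model listed above can be illustrated with a small Python sketch: documents and the query become term-weight vectors, and documents are ranked by the cosine of the angle between their vector and the query vector. The weighting used here (term frequency times log(N/df) + 1) is one common TF-IDF variant picked for illustration, tokenization is a plain whitespace split with no stemming, and the three sample documents are made up.

```python
import math
from collections import Counter

def tokenize(text):
    # Simplest possible tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def build_index(docs):
    """Return (list of TF-IDF vectors as dicts, idf table) for the documents."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def search(query, docs):
    """Return indices of matching documents, best match first."""
    vectors, idf = build_index(docs)
    # Query terms that occur in no document get weight 0.
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scored = [(cosine(q, vec), i) for i, vec in enumerate(vectors)]
    return [i for score, i in sorted(scored, reverse=True) if score > 0.0]

DOCS = [
    "web server log statistics",
    "search engine ranking and link structure",
    "secure web server configuration",
]

print(search("web server statistics", DOCS))
print(search("search engine", DOCS))
```

Ranking by cosine rather than by raw term overlap keeps long documents from winning simply because they contain more words.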
3) Development Resources
-> Amanda's presentation;
W3C resources;
Netscape developer's resources;
Web Developer's Virtual Library; ....
4) Promoting your site
The art of being highly ranked by search engines.
5) Encryption
SSL (Secure Sockets Layer; https)
server and browser must both be configured to use SSL
servers are authenticated via a certificate issued by a certificate authority, such as Thawte or VeriSign
transmitted data is encrypted
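The three points above (both sides configured, server authenticated through a certificate authority, data encrypted) can be seen from the client side in a short Python sketch using the standard ssl module. Current Python versions negotiate TLS, the successor to SSL, through this module. The function names make_client_context and fetch_peer_certificate are hypothetical helpers written for this example.

```python
import socket
import ssl

def make_client_context():
    """Create a client-side context that authenticates the server.

    The default context loads the system's trusted certificate authorities,
    requires a valid server certificate, and checks that the certificate
    matches the host name.
    """
    return ssl.create_default_context()

def fetch_peer_certificate(host, port=443, timeout=10.0):
    """Connect to host, complete the handshake, and return
    (protocol version string, server certificate as a dict)."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        # wrap_socket performs the handshake; it raises an ssl.SSLError
        # subclass if the certificate cannot be verified against a trusted
        # authority or does not match the host name.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            # Everything sent over tls_sock from here on is encrypted.
            return tls_sock.version(), tls_sock.getpeercert()
```

Calling fetch_peer_certificate with the host name of any https site returns the negotiated protocol version and the certificate fields, including the issuer, which names the certificate authority that signed it.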