Search engines and Web directories

Finding Specific Information

Find the "information agents" text, which we used last week, on the web. Search for it with each of the following search engines and write down how many copies of the text each engine finds.

Google, Northern Light, Hotbot, AltaVista

Using categories

1) Find the categories in Yahoo and the Open Directory Project to which "information agents" belongs. Note that the categories are not called "information agents" and that Yahoo uses Google for its "Web Page Matches". For this exercise, look only at the categories, not at the "Web Page Matches".

2) Find an introduction to object-oriented programming that explains it without using a specific programming language.
Try it first in a search engine (Google) and then in a Web directory (Yahoo). Which is the appropriate Yahoo category? Try the same search in the Open Directory Project.

3) Find a tutorial for "DOM" (meaning the "Document Object Model") in the Open Directory Project and in Yahoo. Does Yahoo have a category in which the DOM tutorial can be found? Compare Yahoo's "Web Page Matches" (i.e. its use of Google) with searching on Google directly.

Document clustering

Type "DOM" into Vivisimo. Can you get an idea of what DOM is about by looking at the categories (which are generated by "document clustering")? Compare with Northern Light.
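To get a feel for what "document clustering" means, here is a minimal sketch in Python. It is not how Vivisimo works internally; it is a toy that groups result titles under the word they share with the most other titles. The titles, the stop-word list and the function name cluster_titles are all made up for illustration.

```python
import re
from collections import Counter, defaultdict

# Small, hypothetical stop-word list; a real system would use a longer one.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def cluster_titles(titles, query):
    """Group titles under the word that they share with the most other titles."""

    def words(title):
        tokens = re.findall(r"[a-z]+", title.lower())
        # Ignore stop words and the query itself (every result contains it).
        return {t for t in tokens if t not in STOPWORDS and t != query.lower()}

    # Count in how many titles each word occurs (document frequency).
    doc_freq = Counter(w for title in titles for w in words(title))

    clusters = defaultdict(list)
    for title in titles:
        candidates = words(title)
        if candidates:
            # Most widely shared word wins; ties are broken alphabetically.
            label = min(candidates, key=lambda w: (-doc_freq[w], w))
        else:
            label = "other"
        clusters[label].append(title)
    return dict(clusters)


if __name__ == "__main__":
    # Made-up result titles for the query "DOM".
    results = [
        "DOM tutorial for XML",
        "XML DOM parser",
        "Dom Perignon champagne",
        "Champagne houses: Dom Ruinart",
    ]
    for label, members in cluster_titles(results, "DOM").items():
        print(label, "->", members)
```

With these made-up titles the sketch separates the XML pages from the champagne pages, which is the kind of hint about the different meanings of "DOM" that the generated categories can give.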

Further search options

Try searches for your name (if you have some web pages other than on ) or for the name of someone you know (if you don't have pages yourself). Note that pages on are often excluded from search engines because of a setting on that does not permit spiders or robots. Try all the different search engines and directories mentioned before. Which ones work best? Try different combinations of the name, e.g. "firstname lastname" or "lastname firstname", or putting the name in double quotes. Do these make a difference? Try the different advanced search options on Yahoo and Google. Do they make a difference?
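The "setting" that keeps spiders and robots away is often a robots.txt file in the root directory of the web server (this is an assumption; it could also be a robots meta tag in the pages themselves). The sketch below uses Python's standard urllib.robotparser module to show how a well-behaved robot would read such a file. The file contents, the example.org URLs and the helper name is_crawlable are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that tells every robot to stay away from the
# whole server.
BLOCK_EVERYTHING = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt that only closes off one directory.
BLOCK_PRIVATE_ONLY = """\
User-agent: *
Disallow: /private/
"""


def is_crawlable(robots_txt, url, agent="*"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    page = "http://www.example.org/~student/index.html"
    print(is_crawlable(BLOCK_EVERYTHING, page))
    print(is_crawlable(BLOCK_PRIVATE_ONLY, page))
    print(is_crawlable(BLOCK_PRIVATE_ONLY, "http://www.example.org/private/notes.html"))
```

A search engine that respects a file like the first one never fetches any page from that server, so those pages cannot appear in its index however the query is phrased.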

More Search Tips

Here is a site that has more search tips and examples.