SEARCHING
The Internet consists of millions of private, public, academic, business and government
networks having local to global scope. It has billions of web pages on varied topics
that you may access. Thus, if you are asked to surf the net for specific information
relating to a topic, it may take years before you find all the pages related to that
topic. How can you make this process faster? You can use the service of a search
engine on the Internet. Some of the popular search engines on the Internet are
Yahoo!, Google, Bing and Ask.com. These are tools that help you locate the content
that you are looking for.
A search engine can be defined as a tool to search diverse and disorganized sources of
information available on the Internet. It follows from this definition that a search
engine has to use automated programs that continuously visit web pages, examine the
content they have, and organize the information about those pages in some format.
Programs that continuously gather information from web pages are called spiders,
robots, crawlers, wanderers or worms. A search engine finds, classifies and stores
information about the contents of various websites on the Internet.
Types of Search Engine
Some of the basic categories of search engines are:
a. Primary search engines: Such search engines use web crawlers or spiders to
traverse the web and scan websites for keywords and phrases, in order to generate
a database of web pages having some indexing or classification. Google and
Alta Vista are examples of primary search engines.
b. Web directories: Web directories organize information into categories
and subcategories or directories. You can search a web directory for
all those entries that contain a particular set of keywords. Directories
differ from search engines in the way they organize information. Yahoo
is an example of a web directory.
c. Meta search engines: Such search engines pass your queries to many search
engines and web directories and present summarized results to the users. Some
examples of meta search engines are Dogpile, Infind, Metacrawler,
Metafind and Metasearch.
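The idea behind a meta search engine can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the two "engines" below are hypothetical
stand-in functions that return fixed ranked lists, and the merging rule (adding 1/rank
for every list in which a page appears) is one simple choice, not the method of any
particular meta search engine.

```python
from collections import defaultdict

# Hypothetical stand-ins for real search engines: each returns a ranked list of URLs.
def engine_a(query):
    return ["a.example/java", "b.example/tutorial", "c.example/misc"]

def engine_b(query):
    return ["b.example/tutorial", "a.example/java", "d.example/other"]

def meta_search(query, engines):
    """Send the query to every engine and merge the ranked lists into one."""
    scores = defaultdict(float)
    for engine in engines:
        for rank, url in enumerate(engine(query), start=1):
            scores[url] += 1.0 / rank  # higher-ranked results earn more credit
    # Sort by score (highest first), then by URL so that ties have a fixed order
    return sorted(scores, key=lambda url: (-scores[url], url))

merged = meta_search("java tutorial", [engine_a, engine_b])
print(merged)
```

Pages returned by both engines collect credit from both lists, so they rise above pages
that only one engine found.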
You can refer to further readings for more details on various types of search engines.
But how does a search engine carry out a search?
A search engine performs the following three actions:
1. Spidering or Web crawling
2. Indexing
3. Searching
Spidering or Web crawling: A spider or web crawler is a computer program that
browses the web pages of the WWW in a systematic, automated manner. Search engines
use spiders to get up-to-date data on websites. Spiders create a copy of the pages
they visit for later processing to create the index. These programs are also
useful for validating HTML code against a particular standard such as XHTML, and for
checking or validating hyperlinks.
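The crawling step can be sketched as follows. This is a minimal illustration, not how
any real spider is written: instead of fetching pages over the network, it walks a
small hypothetical in-memory "web" (the WEB dictionary and its example URLs are
assumptions made for this sketch), keeps a copy of each page and notes links that
lead nowhere.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to (page text, list of outgoing links).
WEB = {
    "site.example/home": ("Welcome to the Java tutorial site",
                          ["site.example/java", "site.example/about"]),
    "site.example/java": ("Java tutorial for beginners",
                          ["site.example/home", "site.example/missing"]),
    "site.example/about": ("About this site", []),
}

def crawl(start_url, web):
    """Visit pages breadth-first, keeping a copy of each page and noting broken links."""
    copies = {}                  # URL -> stored page text, for later indexing
    broken = []                  # links that point to pages that do not exist
    queue = deque([start_url])   # pages waiting to be visited
    seen = {start_url}           # every URL already queued, so none is visited twice
    while queue:
        url = queue.popleft()
        if url not in web:
            broken.append(url)
            continue
        text, links = web[url]
        copies[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return copies, broken

copies, broken = crawl("site.example/home", WEB)
print(sorted(copies))
print(broken)
```

The set of already seen URLs is what keeps the spider from going round in circles when
two pages link to each other.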
Indexing: Once the spiders have completed the task of finding information about
web pages, the search engine must store the information in such a way that you are
able to use it. The search engine may also provide some information about the
relevance of a page, possibly in the form of a ranking. Thus, a search engine may
store the keywords of a web page, the number of times each word appeared on the page,
the URL of the page, and a weighting factor that gives more weight to a word found
near the top of the document. Each commercial search engine uses a different formula
for assigning weight to the keywords in its index. This is one of the reasons that a
search for the same word on different search engines will produce different results.
Since the data to be stored for indexing is large, the search engine may encode it.
The index is created with a single purpose: to allow you to find information on the
Internet quickly. In general, an index uses hashing (you will study this
concept in the Data Structure course).
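A small sketch may make the index concrete. It stores, for every keyword, the URL of
each page containing it, the number of occurrences, and a weight that counts words
near the top of the page more heavily. The page texts, the parameter names top_words
and top_bonus, and the weighting rule itself are assumptions made for illustration;
real search engines use their own formulas. A Python dictionary is used because it is
itself a hash table.

```python
import re
from collections import defaultdict

# Hypothetical copies of pages collected by a crawler.
PAGES = {
    "site.example/java": "Java tutorial for beginners learning Java",
    "site.example/xml": "XML tutorial with examples",
}

def build_index(pages, top_words=3, top_bonus=2.0):
    """Map each keyword to {url: (count, weight)}; words near the top weigh more."""
    index = defaultdict(dict)   # a dict is a hash table, so keyword lookup is fast
    for url, text in pages.items():
        words = re.findall(r"[a-z0-9]+", text.lower())
        for position, word in enumerate(words):
            count, weight = index[word].get(url, (0, 0.0))
            # Assumed rule: the first `top_words` words of a page count `top_bonus` times
            bonus = top_bonus if position < top_words else 1.0
            index[word][url] = (count + 1, weight + bonus)
    return index

index = build_index(PAGES)
print(index["java"])
print(index["tutorial"])
```

Looking up a keyword is then a single dictionary access, regardless of how many pages
were indexed.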
Searching: When a user enters a query into a search engine, the engine examines its
index and provides a listing of best-matching web pages according to its ranking
criteria. Each entry in this list usually has a short summary containing the title
of the document and a small part of the text. Most search engines support Boolean
search. A simple example of a search is given below:
To find websites which contain "java tutorial", you may type Java tutorial in the
search box of the browser. The search will look for the keywords "Java" AND "Tutorial";
the search expression will retrieve all those records where both the terms occur
(Figure 2.5).
You may use other Boolean operators like OR and NOT in the searches. For example, if you are
interested in web pages that either have the keyword "Java" or are tutorials, then you may use the
search expression: "Tutorials" OR "Java" (please refer to Figure 2.6).
If you are looking for tutorials that are not related to Java, then you may use the search expression:
"Tutorials" AND (NOT "Java") (refer to Figure 2.7).
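The three Boolean operators correspond directly to set operations on the index. The
sketch below assumes a tiny hypothetical index (the INDEX and ALL_PAGES values are
invented for illustration) in which each keyword maps to the set of pages that
contain it.

```python
# Hypothetical inverted index: keyword -> set of pages containing it.
INDEX = {
    "java": {"page1", "page2"},
    "tutorials": {"page2", "page3"},
}
ALL_PAGES = {"page1", "page2", "page3", "page4"}

def pages_with(term):
    """Return the set of pages containing the term (empty if the term is unknown)."""
    return INDEX.get(term.lower(), set())

# "Java" AND "Tutorials": pages containing both terms (set intersection)
both = pages_with("Java") & pages_with("Tutorials")
# "Tutorials" OR "Java": pages containing at least one term (set union)
either = pages_with("Tutorials") | pages_with("Java")
# "Tutorials" AND (NOT "Java"): NOT is the complement with respect to all pages
not_java = pages_with("Tutorials") & (ALL_PAGES - pages_with("Java"))

print(sorted(both))
print(sorted(either))
print(sorted(not_java))
```

Note that AND narrows the result while OR widens it, which is why adding more AND
terms is a common way to reduce a long list of results.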
Some other search expressions are:
FOLLOWED BY: One of the terms must be directly followed by the other.
NEAR: One of the terms must be within a specified number of words of the other.
Quotation marks: The words between the quotation marks are treated as a phrase, and that phrase must be found within the document or file.
Field search: Restricting the search to a field such as the page title, using 'title:', 'intitle:' or 'allintitle:' etc.
Limiting search: Limiting by time, date, file type, language or number of occurrences.
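The position-based expressions above (FOLLOWED BY, NEAR and quotation marks) all
depend on where words occur in a document, not just on whether they occur. The
following sketch shows one way they could be checked on a single document. The
function names and the sample sentence are assumptions made for illustration and do
not reflect the syntax of any particular search engine.

```python
import re

def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def positions(words, term):
    """Return every position at which the term occurs in the word list."""
    return [i for i, w in enumerate(words) if w == term.lower()]

def followed_by(text, first, second):
    """True if `first` is immediately followed by `second`."""
    words = tokenize(text)
    return any(i + 1 < len(words) and words[i + 1] == second.lower()
               for i in positions(words, first))

def near(text, a, b, distance):
    """True if `a` and `b` occur within `distance` words of each other."""
    words = tokenize(text)
    return any(abs(i - j) <= distance
               for i in positions(words, a) for j in positions(words, b))

def phrase(text, quoted):
    """True if the quoted words occur consecutively and in the same order."""
    words, target = tokenize(text), tokenize(quoted)
    return any(words[i:i + len(target)] == target
               for i in range(len(words) - len(target) + 1))

doc = "A short tutorial on the Java language"
print(followed_by(doc, "java", "language"))
print(near(doc, "tutorial", "java", 3))
print(phrase(doc, "java tutorial"))
```

The sample document contains both "java" and "tutorial", so a plain AND search would
match it, while the phrase search for "java tutorial" would not.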
Please note that different search engines provide these search forms using different
syntax. You should refer to further readings for more details on this topic.
Let us now show you how you may be able to use search engines. For the purpose of
this unit, we have selected Google and Yahoo; however, you must do a similar
exercise using Bing, Ask.com and other search engines.
Google was launched in January 1996 as a research project by Larry Page and Sergey
Brin, who were PhD students at Stanford University.
Yahoo was launched in January 1994 by Jerry Yang and David Filo, Electrical
Engineering graduate students at Stanford University.
Table 1.1 shows in detail some of the basic features of Google and Yahoo.
There are many other search features provided by search engines. You must use
various search engines and explore the facilities they provide.
Activity 5: Compare the features of at least four search engines.
Some Searching Tips
Any search engine allows you to perform either a broad search of everything on the
Web or a narrow search limited specifically to images, video, news, or another specific
search type. If you are interested in further narrowing your search results, try adding
more words to your query or search again using different terms. Following are some
of the tips that may help you to formulate better search strings:
Choose words carefully: Use specific words to describe exactly what you are looking
for.
Watch for words with more than one meaning. If a word has more than one meaning,
also include the context in which you are searching for the information.
Try to use phrases instead of single words for a pointed search.
You may perform multiple searches simultaneously.
Now let us discuss some of the products provided by search engines. Figure
2.8 lists the products provided by Google and Yahoo:
Following is a list of some of these products.
Please note that many search engines provide similar products. You must try
several search engines from time to time as they keep updating their features.
Activity 6: Find all the products available with any four different search engines.
Check Your Progress 2
1. What is a Search Engine? How does it work?
……………………………………………………………………………………
……………………………………………………………………………………
2. What is searching? How can you efficiently search for the following:
……………………………………………………………………………………
……………………………………………………………………………………
a. Tutorials of XML
……………………………………………………………………………
……………………………………………………………………………
b. Universities in India and USA
……………………………………………………………………………
……………………………………………………………………………
c. Universities that are not in the USA or Britain
……………………………………………………………………………
……………………………………………………………………………
d. The cleaning process of gold; the results should not include processes related to
cleaning of jewelry
……………………………………………………………………………
……………………………………………………………………………
Activity 7: Find out 10 search engines from the web. Also list their importance in the
context of the types of searches they are good at. You may select some of the search
engines from the following list:
Bing, Yahoo, Google, MSNSearch, Lycos, Webcrawler, Hotbot, Alta Vista,
AOLSearch, Alltheweb.com, Allacademic.