
What is a web crawler?


A web crawler is a program or script that visits Web sites methodically and automatically. This process is called Web crawling or spidering.

The major search engines, such as Google and Yahoo!, all run such a program, which is also known as a web spider or web robot. Crawlers apparently got their name because they crawl through a site one page at a time, following the links to other pages on the site until every page has been read.
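The page-at-a-time behaviour described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular search engine works: the `crawl` function, the `fetch` parameter, and the `example.test` URLs are all made up for the example, and a small in-memory dictionary stands in for the network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Visit every page reachable from start_url on the same host.

    fetch is a caller-supplied (hypothetical) function that takes a URL
    and returns the page's HTML, or None if the page cannot be read.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited = []

    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)

        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            absolute = urljoin(url, href).split("#")[0]
            # Stay on the same site and never queue a page twice.
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


# A tiny in-memory "site" stands in for the network in this example.
SITE = {
    "http://example.test/": '<a href="/a.html">A</a> <a href="/b.html">B</a>',
    "http://example.test/a.html": '<a href="/b.html">B</a> <a href="http://other.test/">out</a>',
    "http://example.test/b.html": '<a href="/">home</a>',
}

pages = crawl("http://example.test/", SITE.get)
```

The `seen` set is what keeps the crawler from going in circles when pages link back to each other. A real crawler would replace `SITE.get` with a function that downloads the page over HTTP.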

 

How it Works

A search engine's web crawler continually visits web sites. On each site it "reads" the visible text, the hyperlinks, and the content of the various tags used in the site, such as keywords, rich meta tags, body text, and reciprocal links. Based on the information the crawler gathers and on predefined algorithms, the search engine then determines what the site is about and indexes the information. The site is then included accordingly in that search engine's database and its page-ranking process.
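As a rough illustration of the "reading" and indexing steps, the sketch below pulls the visible text, hyperlinks, and meta tags out of one page and adds its words to a simple word-to-pages index. The names `PageReader` and `index_page` and the sample HTML are invented for this example, and real search engines use far more elaborate algorithms than this.

```python
from collections import defaultdict
from html.parser import HTMLParser
import re


class PageReader(HTMLParser):
    """Gathers visible text, hyperlinks, and meta tag content from a page."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []
        self.meta = {}
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            self.meta[attrs["name"].lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text a visitor would see, not script or style content.
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def index_page(url, html, index):
    """Add the words of one page to an inverted index (word -> set of URLs)."""
    reader = PageReader()
    reader.feed(html)
    words = " ".join(reader.text_parts + list(reader.meta.values()))
    for word in re.findall(r"[a-z0-9]+", words.lower()):
        index[word].add(url)
    return reader


HTML = """<html><head>
<meta name="keywords" content="crawler, spider">
<style>p { color: red; }</style>
</head><body>
<p>Web crawlers read pages.</p>
<a href="/about.html">About</a>
</body></html>"""

index = defaultdict(set)
reader = index_page("http://example.test/", HTML, index)
```

After this runs, looking up a word such as `index["spider"]` returns the set of pages that mention it, which is the basic idea behind fast searches over crawled pages.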

Pros:

Web crawlers are mainly used to create a copy of every visited page for later processing by a search engine, which indexes the downloaded pages to provide fast searches.

Crawlers can also be used for automating maintenance tasks on a Web site, such as checking links or validating HTML code.
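A link check of this kind might look like the following sketch. It assumes the pages have already been downloaded into a dictionary, and the `find_broken_links` function, the `exists` parameter, and the sample URLs are hypothetical names chosen for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def find_broken_links(pages, exists):
    """Return (page_url, link_url) pairs whose target does not exist.

    pages maps each page URL to its HTML. exists is a caller-supplied
    (hypothetical) function that reports whether a URL can be fetched.
    """
    broken = []
    for page_url, html in pages.items():
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(page_url, href)  # resolve relative links
            if not exists(target):
                broken.append((page_url, target))
    return broken


# In-memory pages stand in for a real site in this example.
PAGES = {
    "http://example.test/": '<a href="/a.html">A</a> <a href="/missing.html">gone</a>',
    "http://example.test/a.html": '<a href="/">home</a>',
}

broken = find_broken_links(PAGES, lambda url: url in PAGES)
```

On a live site, `exists` would typically send an HTTP request and treat an error response as a broken link.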

Linguists may use a web crawler to perform a textual analysis; that is, they may comb the Internet to determine what words are commonly used today. Market researchers may use a web crawler to determine and assess trends in a given market.
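For the textual-analysis case, the counting step could be as simple as the sketch below. It assumes the page text has already been gathered by a crawler. The `word_frequencies` function and the two sample sentences are made up for illustration.

```python
from collections import Counter
import re


def word_frequencies(texts):
    """Count how often each word appears across a collection of page texts."""
    counts = Counter()
    for text in texts:
        # Lowercase first so "The" and "the" count as the same word.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


# Sample texts standing in for crawled pages.
TEXTS = [
    "The crawler reads the page.",
    "The page links to another page.",
]

freq = word_frequencies(TEXTS)
```

Calling `freq.most_common(10)` would then list the ten most frequent words in the collected text.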

Cons:

Crawlers can also be used to gather specific types of information from Web pages, such as harvesting e-mail addresses (usually for spam).

Web crawlers also have numerous illegal uses, such as hacking a server to obtain more information than it freely gives.

 

Identification

Web crawlers typically identify themselves to a Web server by using the User-agent field of an HTTP request. Web site administrators typically examine their Web servers' logs and use the User-agent field to determine which crawlers have visited the server and how often.
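Both sides of this can be sketched briefly. The first part shows a crawler attaching a User-agent header to a request (the request is only built here, not sent). The second part counts visits per user agent in a few sample log lines. The bot name "ExampleBot/1.0", the URLs, and the log lines are invented, and the code assumes the widely used "combined" log layout in which the user agent is the last quoted field on each line.

```python
import re
from collections import Counter
from urllib.request import Request

# Crawler side: attach a descriptive User-Agent header to the request.
# "ExampleBot/1.0" and its URL are made-up values for illustration.
BOT_AGENT = "ExampleBot/1.0 (+http://example.test/bot.html)"
request = Request("http://example.test/", headers={"User-Agent": BOT_AGENT})

# Server side: sample access-log lines (assumed "combined" layout).
LOG_LINES = [
    '203.0.113.5 - - [11/Jul/2014:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "' + BOT_AGENT + '"',
    '203.0.113.5 - - [11/Jul/2014:10:00:02 +0000] "GET /a.html HTTP/1.1" 200 734 "-" "' + BOT_AGENT + '"',
    '198.51.100.7 - - [11/Jul/2014:10:01:15 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

# The user agent is taken to be the last quoted field on the line.
USER_AGENT_FIELD = re.compile(r'"([^"]*)"\s*$')


def count_user_agents(lines):
    """Count how many requests each user agent made."""
    counts = Counter()
    for line in lines:
        match = USER_AGENT_FIELD.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


visits = count_user_agents(LOG_LINES)
```

Here `visits` maps each user agent string to its number of requests, which is the kind of summary an administrator would use to see how often a given crawler comes by.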