
Muhammed Mahbub Hossain
Creating professional solutions for tough professionals

Introduction to Web Scraping

(page 1 of 5)

What is Web Scraping?

It is a programming technique for collecting data from any web page. It works like a hidden browser whose input and output are controlled by a program. The program receives the HTML returned by a web page and then extracts the required data from that HTML. Web scraping is usually used to collect data from a website that does not provide an RSS feed or an open API. It lets software use any data that is available on the web as HTML.
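The idea of a program receiving a page's HTML and pulling data out of it can be sketched in Python using only the standard library. This is a minimal illustration, not a complete scraper: the names `TitleAndLinkParser`, `fetch_html` and `scrape` are made up for this example, and the choice to collect the title and the links is an assumption about what "required data" might mean.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleAndLinkParser(HTMLParser):
    """Collects the page title and every link target from an HTML document."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text that appears inside <title>...</title> is kept.
        if self._in_title:
            self.title += data


def fetch_html(url):
    """Download a page the way a browser would and return its HTML as text."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def scrape(html):
    """Extract the title and links from HTML that has already been fetched."""
    parser = TitleAndLinkParser()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(), "links": parser.links}


if __name__ == "__main__":
    # A literal string keeps the example runnable offline; in practice
    # the input would come from fetch_html(some_url).
    sample = (
        "<html><head><title>Price list</title></head>"
        "<body><a href='/item/1'>Item 1</a></body></html>"
    )
    print(scrape(sample))
```

Keeping the download step (`fetch_html`) separate from the extraction step (`scrape`) means the extraction logic can be tried on saved HTML without touching the network.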

 

Web scraping also works with password-protected web pages, provided the program has the password needed to access them. Web scraping can do almost everything that a human can do on a website through a browser such as Internet Explorer (IE) or Mozilla.
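One common way to reach a password-protected page is to submit the login form first and then reuse the session cookie that the site sends back. The sketch below shows that pattern with the Python standard library. It rests on several assumptions: the site uses a plain form-based login, the form fields are called `username` and `password`, and the function names are invented for this example. A real site may use different field names, hidden tokens, or another login scheme entirely.

```python
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, Request, build_opener


def make_session():
    """Return an opener that remembers cookies between requests, like a browser."""
    return build_opener(HTTPCookieProcessor(CookieJar()))


def build_login_request(login_url, username, password):
    """Build the POST request that a login form would submit.

    The field names "username" and "password" are assumptions; a real
    site uses whatever names appear in its own login form.
    """
    form = urlencode({"username": username, "password": password}).encode("utf-8")
    return Request(login_url, data=form, method="POST")


def fetch_protected_page(login_url, page_url, username, password):
    """Log in, then fetch a page that requires the logged-in session."""
    session = make_session()
    # The login response usually sets a session cookie, which the opener stores.
    session.open(build_login_request(login_url, username, password), timeout=10).close()
    # The stored cookie is sent automatically with this second request.
    with session.open(page_url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")
```

The cookie jar is what makes the second request behave like a logged-in browser: without it, the site would treat each request as coming from a new, anonymous visitor.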

 


Why Is It Important?


Web scraping is essential when someone needs to go through a huge number of websites and collect data from them. It can be used to automatically spider through thousands of pages and collect the required data in a fraction of the time it would take someone to grab the data manually.
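Spidering through many pages usually means following the links on each page to discover the next ones. Here is a small breadth-first crawler as one possible sketch of that idea. The names `LinkParser`, `extract_links` and `crawl`, the `max_pages` limit, and the decision to stay on the starting host are all assumptions made for this example. The download function is passed in as `fetch`, so the crawl logic can be tried without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html, base_url):
    """Return absolute link URLs found in html, without #fragment parts."""
    parser = LinkParser()
    parser.feed(html)
    parser.close()
    return [urldefrag(urljoin(base_url, href)).url for href in parser.links]


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl of pages on the start URL's host.

    `fetch` is any callable that maps a URL to its HTML or raises OSError.
    Returns a dict of URL -> HTML for the pages that were downloaded.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except OSError:
            continue  # skip pages that fail to download
        pages[url] = html
        for link in extract_links(html, url):
            # Stay on the same site and never queue a URL twice.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

The `seen` set prevents the crawler from looping forever when pages link back to each other, and `max_pages` puts an upper bound on how much of a site is downloaded.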

 

Web scraping helps not only people who need a lot of data to run their business or do well in their profession, but also every internet user, including you and me. For example, an internet user may keep an address book in a webmail account and later need to reuse those contacts elsewhere. Without scraping, the user has to go through the address book and copy and paste the contacts one by one. The web scraping product Contacts Importer makes this very easy: it asks the user for the webmail username and password and returns all the contacts in that address book in 2 or 3 seconds. In this way web scraping helps everyone reuse their contacts and content on the web.

Social community websites such as Facebook and MySpace have become very popular by providing useful social services, and their contributions to modern life are undeniable. These websites spread very quickly thanks to an excellent invite tool that imports contacts and lets users invite their friends easily. This tool mainly uses web scraping to import contacts. New web scraping products appear regularly to make our everyday activities more efficient.

Is it Legal?


The legality of web scraping is actually fairly questionable. In a way, it can be seen as stealing the information owned by a website. The whole issue is complicated because it is unclear where copy/paste ends and scraping begins. Moreover, web scraping cannot access any web content that the scraper is not allowed to access. It is okay for people to copy and save information from web pages, but it might not be legal to have software do this automatically. Scraping a page and then offering a service that uses the information while hiding and not crediting the original source is unlikely to be legal.

 

However, scraping does not seem likely to stop, because the main purpose of this technology is to turn time-consuming manual work into a quick, automated process. Moreover, it has been more than five years since web scraping became common on the web.

Comments On : Ins and Outs of Web Scraping
Thanks For Sharing Such a useful information Thanks
By: Ajay Sharma at : Nov 28, 2009
Thanks
By: Muhammed Mahbub Hossain at : Mar 25, 2009
This is the greatest article I ever found on web about web scraping. It covers vast amount of web scraping technique that are using now a days with almost every modern language.
By: MD.Elme Focruzzaman Razi at : Mar 25, 2009
