Content scraping is a valuable tool, but unfortunately some people do not use it in the manner in which it was intended. Because of this, it is necessary to discuss the ethics of scraping.
Essentially, “ethics” means moral principles or rules. It would be ethical, for example, to give credit if you redistribute the content that you scrape. This doesn’t have to be anything fancy; a link to the original article will suffice. You should also use only a brief excerpt of the work rather than reproducing it in full.
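As a rough illustration of the "brief excerpt plus a link" practice, the following Python sketch trims scraped text to a short excerpt and appends a credit line. The function name `attributed_excerpt`, the 50-word default limit, and the example URL are assumptions made for this example; how long an excerpt may reasonably be depends on the source and on your jurisdiction.

```python
def attributed_excerpt(text: str, source_url: str, max_words: int = 50) -> str:
    """Return a short excerpt of `text` followed by a credit line.

    `max_words` is an illustrative limit, not a legal threshold.
    """
    words = text.split()
    excerpt = " ".join(words[:max_words])
    # Mark the excerpt as truncated when the original was longer.
    if len(words) > max_words:
        excerpt += "..."
    return f'"{excerpt}"\nSource: {source_url}'


if __name__ == "__main__":
    # Hypothetical scraped text and URL, used only for demonstration.
    scraped = "Content scraping can save hours of manual research when done responsibly."
    print(attributed_excerpt(scraped, "https://example.com/article", max_words=6))
```

Keeping the credit line attached to the excerpt in a single function makes it harder to redistribute the text without its source.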
Ethics also applies to how you use the content that you scrape. Suppose you use the information you gather from scraping in a report. Whether this report is internal or available to the world, you will want to credit your source, either in the footnotes or in a bibliography-style list of credits at the end of the report. Whether you’re legally obligated to do so depends on which country you’re in, but ethically, crediting your source is the right thing to do.
A trickier area of ethics concerns information that was never meant to be public. Suppose you scrape your competitor’s site for information that they are already making available to the world. Then suppose they unintentionally post something, such as sales figures or other data that they didn’t intend to make available. They remove it, but not before you’ve scraped their site. Now you’re in possession of data that they clearly didn’t intend to publish, but did, and then removed.
What would you do in this situation? What would you want your competitor to do if this happened to you? The choices are not always easy, and the answers often have to be worked out on a case-by-case basis.