SOPA and PIPA threaten web freedom
January 18 became a milestone in the history of the free Internet. The US Congress had scheduled its debate on the SOPA bill for yesterday, but the bill stirred fierce debate well beyond Capitol Hill. As we all remember, SOPA, or the Stop Online Piracy Act, was introduced in the House of Representatives on October 26, 2011. The bill proposes a dramatic expansion of the US authorities' powers in the area of intellectual property protection. In the most general terms, if adopted, the bill could allow the government to block any website, and to fine or even imprison its owners if they are caught distributing illegal content. The bill would also apply to domains located outside the USA.
There is another legislative route to blocking websites non grata, and that's PIPA, or the PROTECT IP (Protect Intellectual Property) Act. This bill seems less drastic, but it is still regarded as a significant barrier to the development of the global net.
Undoubtedly, these two proposals, SOPA and PIPA, evoked a wide response among web activists, especially those who fight for Internet freedom and the ability to share knowledge everywhere. Among the bills' opponents are such prominent websites as reddit.com and even Wikipedia. To protest SOPA, they decided to go dark on January 18, the day of the scheduled debate: reddit for twelve hours and English Wikipedia for a full twenty-four. For example, this is how English Wiki's main page looked the whole day yesterday:
Google will help out
At the forefront of the effort for a free web, or, in other words, among the opponents of these bills, was Google, Inc. It not only started collecting signatures against the adoption of SOPA and PIPA, but also gave recommendations to webmasters who wanted to join the blackout without losing their positions in Google Search. Remarkably, these tips can be used by anyone who wants to take servers offline temporarily for maintenance, or is simply forced to close part of a website. Following this advice helps you avoid hurting your website's rankings in Google search results.
How to retain your ranking
So, if you have decided to temporarily replace some of your pages with an error message, or with other content you don't want the search engine to index, read the following instructions carefully.
First of all, Google advises letting Googlebot know that the content you place on the pages is not the real content. To do that, return a 503 (Service Unavailable) HTTP status code for all affected pages. This also keeps the placeholder from being treated as duplicate content, which could otherwise significantly affect your Google ranking. We should warn you, however, that if there is a sudden increase in 503 responses, Googlebot will slow down its crawling until the problem is resolved, but Google promises this will not affect your long-term results.
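As a rough illustration of the 503 advice, here is a minimal sketch of a WSGI application in Python that answers every request with 503. The names blackout_app, BLACKOUT_PAGE and RETRY_AFTER_SECONDS are made up for this example, and the Retry-After header is an optional part of HTTP rather than something the tips above require.

```python
# Placeholder page shown to visitors during the blackout (hypothetical content).
BLACKOUT_PAGE = b"<html><body><h1>Temporarily unavailable</h1></body></html>"

# Assumption: the site plans to be back in twelve hours.
RETRY_AFTER_SECONDS = 12 * 60 * 60


def blackout_app(environ, start_response):
    """Answer every request with 503 so crawlers treat the page as temporary."""
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(BLACKOUT_PAGE))),
        # Optional hint telling clients when it makes sense to come back.
        ("Retry-After", str(RETRY_AFTER_SECONDS)),
    ]
    start_response("503 Service Unavailable", headers)
    return [BLACKOUT_PAGE]
```

To try it locally, the standard library's wsgiref.simple_server.make_server can serve blackout_app on a test port. On a production site the same effect is usually achieved in the web server configuration instead.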
Another issue Google draws attention to is your robots.txt file. Remember that robots.txt is the first file the bot looks for on your website; in it you specify the restrictions the bot should follow. For example, on a WordPress website you might want to keep Googlebot away from such pages as /wp-login.php or /wp-register.php. How can you do it? By specifying the corresponding Disallow rules in the robots.txt file.
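The WordPress example could look roughly like the sketch below. It keeps a sample robots.txt in a Python string and checks it with the standard library's urllib.robotparser. The rules shown are only an assumption about what such a file might contain, and the helper name allowed is invented for the illustration.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt contents (an assumption, not a recommended default).
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-login.php
Disallow: /wp-register.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="Googlebot"):
    """Return True if the sample rules let the given crawler fetch the path."""
    return parser.can_fetch(agent, path)
```

With these rules, allowed("/wp-login.php") is False, while ordinary pages such as allowed("/") stay open to the crawler.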
Well, back to our maintenance (or the next blackout). Note that if robots.txt returns a 503, crawling stops across the whole website until the bot sees an acceptable status for this file, in particular 200 or 404. So if you're switching off only part of your website, make sure the status code of robots.txt is not 503. And by no means try to block crawling by disallowing it in the robots.txt file: this can make recovery take much longer.
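For a partial blackout, the rule about robots.txt can be sketched as a small routing function. The /blog/ prefix and the names BLACKED_OUT_PREFIXES and status_for are hypothetical; the point is only that robots.txt keeps answering 200 while the switched-off section answers 503.

```python
# Hypothetical layout: only URLs under /blog/ take part in the blackout.
BLACKED_OUT_PREFIXES = ("/blog/",)


def status_for(path):
    """Pick the HTTP status line for a request path during a partial blackout."""
    # robots.txt must stay reachable, otherwise crawling halts site-wide.
    if path == "/robots.txt":
        return "200 OK"
    # Pages in the switched-off section are reported as temporarily unavailable.
    if path.startswith(BLACKED_OUT_PREFIXES):
        return "503 Service Unavailable"
    # Everything else is served normally.
    return "200 OK"
```

The robots.txt check comes first on purpose, so that a careless prefix such as "/" cannot accidentally put the file behind a 503.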
Google also warns that Googlebot will notice every change in how the website responds, so you shouldn't worry when you see some crawl errors in the logs. However, double-check that no errors remain once everything is back to normal. As general advice, we ask you not to change too many things unless it's absolutely necessary. Don't change the DNS settings or the contents of the robots.txt file, and do not even think about changing the crawl rate in Google Webmaster Tools.
And again, keep as many things untouched as possible.
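To double-check that no errors remain once the pages are back, a short script along these lines may help. It is only a sketch using Python's standard urllib; the function names fetch_status and still_failing are invented here, and the list of URLs to check is something you would supply yourself.

```python
import urllib.error
import urllib.request


def fetch_status(url):
    """Return the final HTTP status code for a URL (redirects are followed)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises for 4xx/5xx responses; the code is still available.
        return error.code


def still_failing(urls, fetch=fetch_status):
    """Map each URL that does not answer 200 to the status code it returned."""
    failures = {}
    for url in urls:
        code = fetch(url)
        if code != 200:
            failures[url] = code
    return failures
```

An empty result from still_failing means every checked page answers 200 again. Network failures such as timeouts are deliberately not handled in this sketch and would surface as exceptions.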
Hopefully, these tips will help you keep your current ranking in Google search results if you suddenly decide to carry out server maintenance, or to join the PIPA protesters on January 24.