Better alternatives to “Discourage search engines” in WordPress

This post picks up where a recent post left off: arguing that WordPress’s “Discourage search engines” feature is best left unused because of the dire SEO consequences if you forget to disable it once a site is live. Here, we explore specific alternatives to “Discourage search engines” that don’t carry its risks.

We explore four approaches, in order of preference:

  1. A full alternative that permanently replaces “Discourage search engines”;
  2. A pragmatic alternative that also hides your development site from unwanted direct traffic;
  3. A “lazy” alternative that is good enough for most projects; and
  4. A few ways to make “Discourage search engines” safer if you absolutely must use it.

Enjoy!

1. Manually set robots.txt rules in your development environment

What makes “Discourage search engines” scary is that it follows you from your development environment to your live site. You can fix that by manually creating a development environment with search engines discouraged. Anything inside the development environment won’t get indexed—and anything outside it will, like your live site once you make the transfer.

This is very easy to set up. However, the steps differ slightly depending on whether you’re working in a subdomain, which looks like subdomain.domain.com, or a subdirectory, which looks like domain.com/subdirectory. If you don’t know the difference and just know how to make subfolders in FTP, you’ll want the “subdirectory” instructions below. (If FTP itself is unfamiliar, it’s the software you’ll need to browse your hosting account and upload files; Google “FTP help.”)

1.A. How to disallow search engines in subdirectories

Just paste the following code into a text editor (like Notepad or Sublime Text):

User-agent: *
Disallow: /nameofsubdirectory/

Give the subdirectory the correct name, save the file as robots.txt, and upload it to the web root folder of your hosting account, as shown at right. The web root is the folder where you’d upload a file called image.png to be able to view it at mysite.com/image.png. (Be aware that some hosts’ web roots will be called “public_html” or something similar.)

The picture at right shows a site whose development environment is in a subdirectory called dev, and whose robots.txt file is properly disallowing search engines in /dev/ only (while allowing them on the rest of the site). Anything inside the dev folder will always be hidden from search, but when it’s transferred to another directory, search engines will index it normally.
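If you’d like to double-check rules like these before uploading them, here is a minimal sketch using Python’s standard-library robots.txt parser. The host mysite.com, the subdirectory name dev, and the helper name is_crawlable are placeholders chosen for illustration, and real crawlers may interpret edge cases differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: "dev" stands in for your own subdirectory name.
ROBOTS_TXT = """\
User-agent: *
Disallow: /dev/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# A page inside the development subdirectory should be blocked...
dev_blocked = not is_crawlable(ROBOTS_TXT, "http://mysite.com/dev/sample-page/")
# ...while the same page outside it should remain open to crawlers.
live_allowed = is_crawlable(ROBOTS_TXT, "http://mysite.com/sample-page/")

print("dev blocked:", dev_blocked)
print("live allowed:", live_allowed)
```

The same helper can be pointed at a rule set containing “Disallow: /” to confirm that every path on that host is blocked.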

Once you’ve got this set up, keep your WordPress site in /dev/ until it’s ready to go live. If you’re a web developer, put all your development installs inside /dev/, and transfer them out as they’re ready to go live. Do this once, and you’ll never have to worry about setting search engine visibility again.

1.B. How to disallow search engines in subdomains

Paste the following code into a text editor:

User-agent: *
Disallow: /

Save this as robots.txt, and upload it to the root folder of your subdomain. So if you’re building your site at “testsite.mysite.com” you’ll want to upload this directly to the “/testsite/” folder. (As a note, some hosts handle subdomains differently, so you might find the root for “testsite.mysite.com” somewhere other than a /testsite/ folder inside your main hosting account.)

The picture at right shows a site whose development environment is in a subdomain called dev, and whose robots.txt file is properly disallowing search engines. Anything inside the dev subdomain will always be hidden from search, but when it’s transferred to the live directory, search engines will index it normally.

If you work on multiple sites, do this once with a dev subdomain, develop exclusively inside that subdomain, and you’ll never have to worry about search engine visibility again.

2. Put your dev site behind a login screen

If you’re trying to keep search engines off your site, you almost certainly don’t want strangers to happen onto it either. This approach solves both problems.

All you have to do is install a WordPress plugin that puts your site behind a login screen. There are a ton of plugins that do this; Login Security Solution is one I’ve tried. It should be as simple as toggling a couple of plugin options and sharing the password with whoever needs to view the test site.

Robots can’t crawl sites that are behind logins, so the site contents will be totally hidden from search engines. And there’s no chance you’ll forget to change this setting when you go live, since the login screen will be staring every user in the face until you disable it.

The only caveat about this method is that it might not be the best for a project being developed directly at the live URL (e.g., directly at mysite.com). Search engines may index the contents of the login page itself, causing the login text to show up in search results associated with your domain even after the login has been disabled.

3. Allow your dev site to build page rank, then redirect it to the live site when ready

With this approach, you basically let the test site pick up page rank while you’re working on it, and then direct that page rank to the live site when you launch.

The theory here is that it’s better to pick up a tiny amount of search traffic at the “wrong” site rather than risk kneecapping the live site. This method makes the most sense for:

  • Small and medium-sized projects, where a test site isn’t likely to draw much attention
  • New sites, which don’t have a live site to compete with in search
  • Short projects, which don’t leave the development site a long time to pick up page rank

In fact, for new sites being developed at the web root (directly at mysite.com), this is probably best practice. The traffic you’ll get during development is negligible, and there’s no reason not to get a head start on the “long game” of SEO, which, among other things, rewards you for how long a domain’s been live and crawlable.

If you’re not developing at the web root, you’ll need to direct the development site to the live site after launch, so you don’t run into duplicate content penalties and other problems arising from Google being able to see two versions of the same site.

But there’s no need to worry about that! Because you’re about to learn…

3.A. How to redirect a test site to the live site

When it’s time to do your redirect, you need to find the .htaccess file at the root of the hosting account that contains the live site. The root is where files like “index.php” and “home.php” are located; depending on your host, it might be called “/” or “public_html” in your directory structure.

When you find your .htaccess, you’ll want to replace whatever’s in there with the following code (keep a backup copy of the original first, since any existing rewrite rules in that file will be lost):

Options +FollowSymLinks -MultiViews
# Turn mod_rewrite on
RewriteEngine On
RewriteBase /

RewriteCond %{HTTP_HOST} ^(www\.)?mydomain\.com$ [NC]
RewriteRule ^subdirectory/?$ http://newurl.com [L,R=301,NC]

Replace mydomain with the root domain of your test environment, subdirectory with the name of the subdirectory you’ve been doing development in, and http://newurl.com with the address (including http://) of the live site. (Also replace \.com$ with, say, \.org$ if your test site actually has a .org extension.) Upload the file, and you should notice your development site forwarding directly to the live site.
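One detail worth checking is which addresses the rule’s pattern actually matches. The sketch below approximates that with Python’s re module; this is an assumption-laden stand-in, since Apache uses its own regex engine, though simple patterns like this one behave the same way. It relies on the fact that rules in an .htaccess file see the request path without its leading slash. The helper name matches, the sample paths, and the second pattern are hypothetical illustrations, not part of the original code.

```python
import re

# The pattern from the RewriteRule above, with "subdirectory" left as the
# placeholder name. The NC flag corresponds to case-insensitive matching.
RULE_PATTERN = r"^subdirectory/?$"

# Hypothetical variant that also matches anything below the subdirectory.
DEEP_PATTERN = r"^subdirectory/(.*)$"


def matches(pattern: str, request_path: str) -> bool:
    """Approximate a per-directory rule: match the path minus its leading slash."""
    return re.search(pattern, request_path.lstrip("/"), re.IGNORECASE) is not None


paths = ["/subdirectory", "/subdirectory/", "/Subdirectory/", "/subdirectory/about/"]

original_hits = [p for p in paths if matches(RULE_PATTERN, p)]
variant_hits = [p for p in paths if matches(DEEP_PATTERN, p)]

print("original pattern matches:", original_hits)
print("variant pattern matches:", variant_hits)
```

Under these assumptions, the original pattern matches only the subdirectory’s own address, so a deeper URL such as /subdirectory/about/ would not be redirected by it. If you want inner pages forwarded too, a rule along the lines of the second pattern, with the captured part appended to the target as $1, is one common approach; test it on your own server before relying on it.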

3.B. If you forget to redirect…

If you forget to do your redirects for weeks at a time, you’re likely to see the test site competing with the live site in search. This is bad, but once you realize the problem, you can redirect all page rank to the live site almost immediately; the test site should disappear from search results within a day or two. Compared to burning down a live site’s SEO, it’s definitely the lesser of two evils.

4. More responsible use of “Discourage search engines”

4.A. Don’t dismiss the Yoast nag message.

The Yoast SEO plugin shows a warning when you discourage search engines. Unfortunately, clicking “I know, don’t bug me” removes this warning forever.

So you can make it a practice never to get rid of this nag message—presumably, you or your client will notice it on the live site before it’s too late.

4.B. Use “Check Search Engine Visibility on Migration”

A developer named Rhys Wynne has recently developed a plugin called Check Search Engine Visibility on Migration that alerts you if “Discourage search engines” is still on after you’ve migrated a WordPress install. This is a great idea if you make it a religion to install it before checking “Discourage search engines.” Note, though, that it won’t work on sites being developed directly at the live URL.
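As a further safety net, you can inspect a live page’s HTML yourself. The sketch below is not how either plugin works; it simply assumes that a discouraged site emits a robots meta tag containing “noindex”, which is the case in recent WordPress versions but may differ by version or theme. The class and function names and both sample HTML strings are made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the content of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. bare attributes), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            self.directives.append(attr_map.get("content", "").lower())


def is_noindexed(html: str) -> bool:
    """Return True if any robots meta tag in `html` contains 'noindex'."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return any("noindex" in directive for directive in finder.directives)


# Made-up page sources standing in for a discouraged and a visible site.
discouraged_page = (
    "<html><head><meta name='robots' content='noindex, nofollow' /></head>"
    "<body></body></html>"
)
visible_page = (
    "<html><head><meta name='robots' content='max-image-preview:large' /></head>"
    "<body></body></html>"
)

print("discouraged page noindexed:", is_noindexed(discouraged_page))
print("visible page noindexed:", is_noindexed(visible_page))
```

To use it on a real site, paste the page source of your live home page into the function after each migration; a True result suggests search engines are still being told to stay away.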

In conclusion…

Well! That was a lot about “Discourage search engines.” It really is a dangerous piece of WordPress, though, so if you’re a developer or site owner and happen to be fallible, I hope you’ve found that one of these alternatives will work for you.

Thanks, and as always, please let us know your thoughts and suggestions in the comments below.


8 Responses

Comments

  • Ben says:

    Thanks Fred!

    I created a development site with a subdomain using a separate WP install. But after reading your article — just out of concern that checking “discourage search engines” on the dev site was affecting the main site — I disabled that option and made a `robots.txt`. Do you know if this option can affect other WP installs on the same server?

    • Fred Meyer says:

      Thanks for writing, Ben! Discouraging search engines at subdomain.site.com shouldn’t affect search engine behavior at site.com itself–there’s a new robots.txt file for each subdomain, and they’re listened to individually. (As a note, “Discourage search engines” also works by changing robots.txt, so you’re changing the same file either way.)

      You can confirm that everything’s the way you want by visiting both subdomain.site.com/robots.txt and site.com/robots.txt, and making sure they’re both set the way you want. This would presumably look like “Disallow: /” for the subdomain, and like no special rule for the domain itself.

      Does that help?

  • Mario M says:

    Awesome follow. A great way to minimize risk and pick up some good advice on developing in tandem with live sites.

    Keep ’em coming.

  • Eruption Joojo says:

    Hello,

    I’m currently having a WordPress.com Blog (Free Plan) &
    desire to sell my Digital Stuff online without setting-up a website and
    relying wholly on the Blog, at first (have been short of funds). Thus,
    going through with this thought I have made changes to my Blog, under
    this after making the payment, the customer is redirected to my blog
    page having a link to download the digital content. So, I just want the
    payment gateway to be able to redirect the traffic to the Download page
    on my blog and not have it searchable via search engines, etc. &
    neither be it listed under the WordPress.com Posts list/Pages, etc.
    because if the download page is searchable by the Search Engines &
    listed under my Blog’s post, I wouldn’t earn anything because then the
    customer would directly download the content without making the payment.

    Regards,

    Joojo.
