Thoughts about stub pages - 200 & noindex ok, or 404?

slingshot

With large database- or template-driven websites, it is easy to end up with a lot of pages that have no content on them.

What is the current thinking on these pages with no content? The options:

Return a 200 header code with noindex meta tag
Return a 404 page & header code
Something else?

Thanks

CleverPhD

I would agree with all the comments on how to technically deal with the random pages, but it is a losing battle until you get your website database/templates under control. I once had a similar issue and had to work for months to get a solution in place, as the website would create all kinds of problems like this.

We had to implement a system so that the creation of these pages would be minimized. I think the key is to make sure that any random page request gets a 404 from the start, so that the URL never gets indexed in the first place.
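As a rough illustration of the "404 from the start" idea, here is a minimal Python sketch. The CATALOG dict, the slugs and the product_response function are all hypothetical stand-ins, not taken from any site in this thread; the only point is that the lookup decides the status code before any template is rendered.

```python
from http import HTTPStatus

# Hypothetical catalog standing in for the real product database.
CATALOG = {"blue-widget": "Blue Widget", "red-widget": "Red Widget"}


def product_response(slug, catalog=CATALOG):
    """Return (status_code, body) for a /product/{slug} request."""
    name = catalog.get(slug)
    if name is None:
        # Unknown slug: answer 404 instead of rendering an empty template.
        return HTTPStatus.NOT_FOUND.value, "<h1>Product not found</h1>"
    return HTTPStatus.OK.value, f"<h1>{name}</h1>"
```

In a real site the same check would sit in whatever script handles the /product/ URLs, ahead of the page generation step.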

That said, for all the random URLs that are already indexed, I like the 200 option with the noindex meta tag. My reasons: with 404s you get all these meaningless error messages in GWT (Google Webmaster Tools), and the noindex also gets the page out of the index. I have seen Google retry 404s on one of our sites, which is crazy. Ever since Google started showing soft 404s for 301s that redirect many pages to a single URL, I only try to use 301s on more of a one-to-one basis.
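One way to picture the split described above (404 for unknown URLs, 200 plus noindex for junk URLs that are already indexed) is the sketch below. It assumes you can build two sets yourself: valid product paths, and stub paths you know are already in the index. Both sets and the stub_policy function are hypothetical.

```python
# Standard robots meta tag asking search engines not to index the page.
NOINDEX_META = '<meta name="robots" content="noindex">'


def stub_policy(path, valid_paths, indexed_stubs):
    """Return (status_code, robots_meta) for a requested path.

    valid_paths   -- paths backed by a real product (assumed input)
    indexed_stubs -- empty pages already known to be indexed (assumed input)
    """
    if path in valid_paths:
        return 200, ""            # real page, indexable as usual
    if path in indexed_stubs:
        return 200, NOINDEX_META  # already indexed: serve 200 and noindex
    return 404, ""                # anything else: 404 from the start
```

How the list of already-indexed stubs gets built is left open here; the sketch only shows the decision once that list exists.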

Good luck.

attachmedia

OK, I understand better. I have the same problem with a site in Drupal. I think it is better to use robots.txt to block the empty pages.

This is because the link juice these pages pass is minimal and they use extra server resources.

If you can't block them with robots.txt, the noindex, follow meta tag is OK. But if you see in Analytics that some landing pages are www.example.com/product/{random_text_here}, it is better to use a 404 with a 301 redirect to the sitemap for user experience.

slingshot

Thanks for the info.

For more information, let me try and explain the scenario a little better.

When using a template to generate all product pages on a site, these are often designed so that any URL of the form "www.example.com/product/{something}" maps to a script called "GenerateProductPage.java" (or .asp etc., depending on the language being used), likely based on a rule that anything in the /product/ directory maps there.

On the site, there are only going to be links to the actual products that are stored in the DB, so for a user there are no issues there.

But Google manages to find all manner of strange URLs, and since they are of the form "www.example.com/product/{random_text_here}", the script will also 'try' to generate a product page for them. Since there is no actual product in the database called 'random_text_here', the result is an empty product page with nothing there except the template navigation, footer links, menus, etc.
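To make the scenario concrete, here is a small Python sketch of a catch-all route of that kind. The route pattern, the PRODUCTS dict and the TEMPLATE string are invented for illustration and are not the actual GenerateProductPage script; they just show how an unknown name still yields a page made of nothing but template chrome.

```python
import re

# Catch-all rule: anything under /product/ goes to the same generator.
PRODUCT_ROUTE = re.compile(r"^/product/(?P<name>[^/]+)/?$")

# Hypothetical product table standing in for the database.
PRODUCTS = {"blue-widget": "A sturdy blue widget."}

# Hypothetical template: navigation, content slot, footer.
TEMPLATE = "<nav>menu</nav><main>{content}</main><footer>links</footer>"


def generate_product_page(path):
    """Render a product page for any path the catch-all route matches."""
    match = PRODUCT_ROUTE.match(path)
    if match is None:
        return None  # not a /product/ URL at all
    # A missing product silently becomes empty content: the stub page.
    description = PRODUCTS.get(match.group("name"), "")
    return TEMPLATE.format(content=description)
```

Calling generate_product_page("/product/random_text_here") returns the template with an empty main section, which is the kind of stub page being asked about.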

We are currently doing as you mentioned, applying "noindex, follow" to these pages for the same reasons you listed.

So the question was: is this OK to do? Is it bad to do (and if so, why)? Is there any harm in doing things the current way? Should we be 404'ing the pages (and what value does this have over the other methods)?

Thanks for your input Carlo as it shows your thoughts are along the same lines as ours.

Has anyone else got anything to add to the information provided?

Thanks

attachmedia

Hi. Hmm, I'm not really sure I understand why you have invalid pages. Possible reasons:

Products without stock
The site is built on another database

If you have a product name without content, a meta noindex, follow is better because it still passes link juice.

But like I said, I don't know why these products exist. If you have more info I could help more.

slingshot

Thanks for the response.

I guess what I was getting at with the question is the case where websites are built on flexible platforms that can easily create these pages automatically.

For example, suppose there were flexible URLs in place, whereby URLs such as www.example.com/product/{product_name} all mapped to one script that generated a product page.

Then www.example.com/product/{invalid_product_name} would also work and essentially show a blank product page.

The question is: what is the best way to handle these for Google, and is there any benefit or harm from either of the methods outlined in the original question?

Has anyone else any thoughts on best ways to handle these scenarios?

Thanks

attachmedia

If you know that a page doesn't have content, I recommend:

A page without content should respond with a 404.
If the page returns a 404, make a 301 redirect to the sitemap.
On the sitemap, use meta noindex, follow to pass the link juice.
Eventually you need to clean up these pages because they are bad for users and SEO.

Regards
