Thoughts about stub pages - 200 & noindex ok, or 404?
-
With large database/template driven websites it is often possible to get a lot of pages with no content on them.
What are the current thoughts on these pages with no content? The options:
- Return a 200 header code with a noindex meta tag
- Return a 404 page & header code
- Something else?
Thanks
-
-
I would agree with all the comments on how to technically deal with the random pages, but it is a losing battle until you get your website database/templates under control. I once had a similar issue and had to work for months to get a solution in place, as the website would create all kinds of issues like this.
We had to implement a system to minimize the creation of these pages. I think the key is to make sure that any random page request gets a 404 from the start, so that the URL never gets indexed in the first place.
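As a rough illustration of returning a 404 up front, here is a minimal Python sketch. The `PRODUCTS` dictionary and the `handle_product_request` function are hypothetical stand-ins for the real database lookup and page script; the only point is that an unknown name gets a 404 status instead of an empty template.

```python
from http import HTTPStatus

# Hypothetical stand-in for the product table in the site's database.
PRODUCTS = {"blue-widget": "Blue Widget"}


def handle_product_request(path, products=PRODUCTS):
    """Return (status, body) for a /product/{name} URL.

    Unknown product names get a 404 instead of an empty template page.
    """
    prefix = "/product/"
    if not path.startswith(prefix):
        return HTTPStatus.NOT_FOUND, "Not found"
    name = path[len(prefix):].strip("/")
    title = products.get(name)
    if title is None:
        # No such product: do not render the template at all.
        return HTTPStatus.NOT_FOUND, "Product not found"
    return HTTPStatus.OK, f"<h1>{title}</h1>"
```

With this in place, a request for /product/random_text_here never produces a page that could be indexed, while real products are served as before.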
That said, for all the random URLs that are already indexed, I like the 200 option with the noindex meta tag. My reasons: with 404s you get a pile of meaningless error messages in GWT (Google Webmaster Tools), whereas the noindex still gets the page out of the index. I have also seen Google retry 404s on one of our sites, which is crazy. And ever since Google started showing soft 404s for 301s that redirect many pages to a single URL, I only try to use 301s on more of a one-to-one basis.
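For what it's worth, the 200-plus-noindex option could look something like the Python sketch below. The function name and the page markup are made up for illustration; the robots meta tag itself is the standard one.

```python
from html import escape

# Standard robots meta tag asking search engines not to index the page.
NOINDEX_META = '<meta name="robots" content="noindex">'


def render_stub_page(requested_name):
    """Build a 200 response for a product URL that has no content.

    The page is served normally but carries a robots meta tag, so
    search engines are asked to leave it out of the index.
    """
    body = (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        f"{NOINDEX_META}\n"
        "<title>Product not available</title>\n"
        "</head><body>\n"
        # Escape the requested name so arbitrary URL text cannot inject markup.
        f"<p>No product named {escape(requested_name)} was found.</p>\n"
        "</body></html>\n"
    )
    return 200, body
```

The status stays 200, so no crawl errors are reported for these URLs, and the meta tag is what takes them out of the index.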
Good luck.
-
OK, I understand better. I have the same problem with a site on Drupal, and I think it is better to use robots.txt to block the empty pages.
This is because the link juice these pages pass on is minimal and they use extra server resources.
If you can't block them with robots.txt, the noindex, follow meta tag is OK. But if you see in Analytics that some landing pages are www.example.com/product/{random_text_here}, it is better to use a 404 with a 301 redirect to the site map for user experience.
-
Thanks for the info.
For more information, let me try and explain the scenario a little better.
When a template is used to generate all the product pages on a site, it is often set up so that any URL of the form "www.example.com/product/{something}" maps to a script called "GenerateProductPage.java" (or .asp etc., depending on the language being used), likely based on a rule that anything in the /product/ directory maps there.
On the site, there are only going to be links to the actual products that are stored in the DB, so for a user there are no issues there.
But Google manages to find all manner of strange URLs, and since they are of the form "www.example.com/product/{random_text_here}", these will also 'try' to generate a product page. Since there is no actual product in the database called 'random_text_here', the result is an empty product page with nothing there except the template navigation, footer links, menus etc.
We are currently doing as you mentioned and applying "noindex, follow" to the pages, for the same reasons you listed.
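To check that the empty pages really carry the directive, something like the following Python sketch could be run over a sample of responses. The class and function names are hypothetical, and the three labels are just a convenient way to tell the cases apart, not anything Google reports.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for bare attributes.
        attr_map = {key.lower(): (value or "") for key, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def classify_stub(status, html):
    """Label a response by how it is signposted to crawlers."""
    if status == 404:
        return "404: not a page"
    parser = RobotsMetaParser()
    parser.feed(html)
    if "noindex" in parser.directives:
        return "200 + noindex"
    return "200 and indexable"
```

Any empty product page that comes back as "200 and indexable" is one the template has missed.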
So the question was: is this OK to do? Is this bad to do (and if so, why)? Is there any harm in doing things the current way? Should we be 404'ing the pages (and what value does this have over the other methods)?
Thanks for your input Carlo as it shows your thoughts are along the same lines as ours.
Has anyone else got anything to add to the information provided?
Thanks
-
Hi, hmm, I'm not really sure I understand why you have invalid pages. Possible reasons:
- Products without stock
- The pages are built from another database
If you have a product name without content, a meta noindex, follow is better because the page still passes on link juice.
But like I say, I don't know why these products exist. If you have more info I could help more.
-
Thanks for the response.
I guess what I was getting at with the question is the case where websites are built on flexible platforms that can easily create these pages automatically.
For example, suppose flexible URLs are in place whereby URLs such as www.example.com/product/{product_name} all map to one script which generates a product page.
Then www.example.com/product/{invalid_product_name} would also work and essentially show a blank product page.
The question is: what is the best way to handle these for Google, and is there any benefit or harm from either of the methods outlined in the original question?
Has anyone else any thoughts on best ways to handle these scenarios?
Thanks
-
If you know that a page doesn't have content, I recommend:
- A page without content should respond with a 404.
- If the page returns a 404, make a 301 to the site map.
- On the site map, use meta noindex, follow to transfer the link juice.
- Eventually you need to clean up these pages because they are bad for users and SEO.
Regards