Thoughts about stub pages - 200 & noindex ok, or 404?
-
With large database- or template-driven websites, it is easy to end up with a lot of pages that have no content on them.
What are the current thoughts on handling these pages with no content? Options:
- Return a 200 header code with a noindex meta tag
- Return a 404 page and header code
- Something else?
Thanks
-
-
I would agree with all the comments on how to technically deal with the random pages, but it is a losing battle until you get your website database/templates under control. I once had a similar issue and had to work for months to get a solution in place, as the website kept creating all kinds of issues like this.
We had to implement a system to minimize the creation of these pages. I think the key is to make sure that any random page request gets a 404 from the start, so that the URL does not get indexed in the first place.
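A minimal sketch of that idea in Python. The `PRODUCTS` dictionary and the `product_response` function are hypothetical stand-ins for a real database lookup and page handler, not code from any actual site: the handler checks that the product exists before rendering and answers 404 otherwise.

```python
# Hypothetical stand-in for the product table; a real site would query its database.
PRODUCTS = {"blue-widget": "Blue Widget"}


def product_response(path):
    """Return (status_code, html) for a request to /product/{slug}."""
    # Take the last path segment as the product slug.
    slug = path.rstrip("/").rsplit("/", 1)[-1]
    name = PRODUCTS.get(slug)
    if name is None:
        # Unknown product: send a real 404 rather than an empty template with a 200.
        return 404, "<html><body><h1>Product not found</h1></body></html>"
    return 200, f"<html><body><h1>{name}</h1></body></html>"
```

With this in place, `product_response("/product/random_text_here")` gives a 404 status, while a slug that exists in the lookup still gets a normal 200 page.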
That said, for all the random URLs that are already indexed, I like the 200 option with the noindex meta tag. My reasons: with 404s you get all these meaningless error messages in GWT (Google Webmaster Tools), and the noindex also gets the page out of the index. I have seen Google retry 404s on one of our sites, crazy. Ever since Google started showing soft 404s for 301s that redirect many pages to a single URL, I only try to use 301s on more of a one-to-one basis.
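For illustration, a rough Python sketch of the 200-plus-noindex option. The function name `stub_page_response` and the page title are made up for the example; the only part that matters is the robots meta tag placed in the `<head>` while the status stays 200.

```python
NOINDEX_META = '<meta name="robots" content="noindex">'


def stub_page_response(template_body):
    """Return (status_code, html) for an empty stub page that should leave the index."""
    html = (
        "<html><head>"
        f"{NOINDEX_META}"
        "<title>Product unavailable</title>"
        "</head>"
        f"<body>{template_body}</body></html>"
    )
    # Status stays 200; the robots meta tag is what asks crawlers to drop the page.
    return 200, html
```

The crawler has to be able to fetch the page to see the tag, so this only works for URLs that are not blocked from crawling.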
Good luck.
-
OK, I understand better now. I have the same problem with a site on Drupal, and I think it is better to use robots.txt to block the empty pages.
This is because the link juice these pages pass on is minimal and they use extra server resources.
If you can't block them with robots.txt, the noindex, follow meta tag is OK. But if you see in Analytics that some landing pages are www.example.com/product/{random_text_here}, it is better for user experience to use a 404 with a 301 redirect to the site map.
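One thing to check before going the robots.txt route: a rule works on URL prefixes, so if the empty pages and the real products share the same /product/ path, a single Disallow covers both. A small sketch using Python's standard `urllib.robotparser`, with a hypothetical robots.txt and example URLs (the `/about` page and `blue-widget` slug are invented for the demo):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the whole product directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /product/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url):
    """Return True if the rules above allow any crawler to fetch the URL."""
    return parser.can_fetch("*", url)


# Both a made-up and a real-looking product URL fall under the same prefix rule.
empty_page_allowed = is_crawlable("http://www.example.com/product/random_text_here")
real_page_allowed = is_crawlable("http://www.example.com/product/blue-widget")
other_page_allowed = is_crawlable("http://www.example.com/about")
```

Here both product URLs come back as not crawlable, so robots.txt is only a fit when the empty pages live under a path that the real pages do not use.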
-
Thanks for the info.
For more information, let me try and explain the scenario a little better.
When using a template to generate all the product pages on a site, these are often designed so that any URL of the form "www.example.com/product/{something}" maps to a script called "GenerateProductPage.java" (or .asp etc., depending on the language being used), likely based on a rule that anything in the /product/ directory maps there.
On the site, there are only going to be links to the actual products that are stored in the DB, so for a user there are no issues there.
But Google manages to find all manner of strange URLs, and since they are of the form "www.example.com/product/{random_text_here}", the site will also 'try' to generate a product page for them. Since there is no actual product in the database called 'random_text_here', the result is an empty product page with nothing on it except the template navigation, footer links, menus etc.
We are currently doing as you mentioned, applying "noindex, follow" to these pages for the same reasons you listed.
So the question was: is this OK to do? Is this bad to do (and if so, why)? Is there any harm in doing things the current way? Should we be 404'ing the pages (and what value does this have over the other methods)? Etc.
Thanks for your input Carlo as it shows your thoughts are along the same lines as ours.
Has anyone else got anything to add to the information provided?
Thanks
-
Hi, mmm, I'm not really sure I understand why you have invalid pages. Possible reasons:
- Products without stock
- The site is built on top of another database
If you have a product name without content, a meta noindex, follow is better because it still passes link juice.
But like I said, I don't know why these products exist. If you have more info I could help more.
-
Thanks for the response.
I guess what I was getting at with the question is what to do when websites are built on flexible platforms that can easily create these pages automatically.
For example, say flexible URLs are in place whereby URLs such as www.example.com/product/{product_name} all map to one script which generates a product page.
Then www.example.com/product/{invalid_product_name} would also work and essentially show a blank product page.
The question is: what is the best way to handle these for Google, and is there any benefit or harm from either of the methods outlined in the original question?
Has anyone else any thoughts on best ways to handle these scenarios?
Thanks
-
If you know that a page doesn't have content, I recommend:
- A page without content should respond with a 404.
- If the page returns a 404, make a 301 to the site map.
- On the site map, use meta noindex, follow to transfer the link juice.
- Eventually you need to clean up these pages because they are bad for users and SEO.
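Note that a single HTTP response carries only one status code, so one URL cannot answer with both a 404 and a 301 at the same time. A rough Python sketch of the two separate options, assuming a hypothetical site map at `/sitemap` and made-up function names: either a 404 page that links to the site map, or a 301 that sends the visitor there.

```python
SITEMAP_URL = "/sitemap"  # hypothetical location of the site map page


def not_found_with_sitemap_link():
    """Option A: keep the 404 status and offer the site map as a link in the body."""
    body = (
        "<html><head><title>Page not found</title></head><body>"
        "<p>Sorry, this page does not exist.</p>"
        f'<a href="{SITEMAP_URL}">Browse the site map</a>'
        "</body></html>"
    )
    return 404, {"Content-Type": "text/html; charset=utf-8"}, body


def redirect_to_sitemap():
    """Option B: a 301 whose Location header points at the site map (no 404 is sent)."""
    return 301, {"Location": SITEMAP_URL}, ""
```

Option A keeps the not-found signal for crawlers while still helping users; option B redirects many URLs to one page, which is the pattern an earlier reply says Google may report as a soft 404.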
Regards