False Negative Warnings with Crawl Diagnostic Test
-
Ok... I will try to explain as clearly as possible. This issue concerns close to 5000 'Warnings' from our most recent SEOmoz Pro crawl diagnostic test. The top three warnings account for about 6000 instances among them:
1. Duplicate Page Title
2. Duplicate Page Content
3. 302 (Temporary Redirect)
We understand that duplicate titles and content are "no-no's" and have made it a top priority to avoid duplication on any level. Here is where the issue lies: we are using the Volusion eCommerce solution, and it includes a variety of value-add shopping features such as "Email A Friend" and "Email Me When Back In Stock" on each product page.
If one of these options is clicked, you are directed to the appropriate page. Each page has a different URL, with the sole variable being the individual product code. But because this is part of Volusion's built-in functionality, the META title is the same for every page; it is taken from the title of our store homepage. Example below:
Online Beauty Supply Store | Hair Care Products | Nail Care | Flat Irons
http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=AN1PRO7130
Online Beauty Supply Store | Hair Care Products | Nail Care | Flat Irons
http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=BI8BIOSI34
The same goes for the duplicate content warnings. If you click on one of these features, it directs you to a page with essentially the same content, just for a different product. So each of these pages has both duplicate content and a duplicate title.
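For anyone trying to gauge how widespread this kind of duplication is, a duplicate-title report can be pulled out of any crawl export by grouping URLs on their title text. A minimal Python sketch, using the two real URLs and shared homepage title quoted below in this post (the third row's URL and title are invented purely as an example of a unique page):

```python
from collections import defaultdict

def find_duplicate_titles(pages):
    """Group crawled (url, title) pairs by title text and return only
    the titles that are shared by more than one URL."""
    by_title = defaultdict(list)
    for url, title in pages:
        by_title[title.strip()].append(url)
    return {t: urls for t, urls in by_title.items() if len(urls) > 1}

# Illustrative crawl data: the first two rows use the real URLs and the
# shared homepage title from this thread; the third row is invented as
# an example of a page with a unique title.
pages = [
    ("http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=AN1PRO7130",
     "Online Beauty Supply Store | Hair Care Products | Nail Care | Flat Irons"),
    ("http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=BI8BIOSI34",
     "Online Beauty Supply Store | Hair Care Products | Nail Care | Flat Irons"),
    ("http://www.beautystoponline.com/some-product-p/example.htm",
     "Example Unique Product Title"),
]

dupes = find_duplicate_titles(pages)
for title, urls in dupes.items():
    print("%d URLs share the title: %s" % (len(urls), title))
```

Run against a full crawl export, this surfaces exactly the clusters the crawl diagnostic is warning about.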
The SEOmoz descriptions are:
Duplicate Page Title:
You should use unique titles for your different pages to ensure that they describe each page uniquely and don't compete with each other for keyword relevance.
Duplicate Page Content:
Content that is identical (or nearly identical) to content on other pages of your site forces your pages to unnecessarily compete with each other for rankings.
Because I know SEO is not an exact science, the question here is: does Google recognize that although these pages are duplicates, they are generated by a feature that actually makes us a more legitimate eCommerce site? Or, going by the SEOmoz description, if duplication is bad only because you don't want your pages competing with each other, should I not worry, since I couldn't care less whether these pages get traffic? Or does it affect my domain authority as a whole?
Then, as for a solution: I am still trying to work out with Volusion how we can change the META title of these pages. It's highly unlikely, but we'll see. As for the duplicate content, there is no way to change these pages; they are hard-coded.
So if it is bad (even though it shouldn't be), would it be worth disabling these features? I hope not. Wouldn't that defeat the purpose of Google trying to surface the most legitimate, value-add sites to searchers?
As for the 302 (Temporary Redirect) warning, this only appears on our shopping cart pages. As with the "Email A Friend" feature, there is a page for every product. For example:
http://www.beautystoponline.com/ShoppingCart.asp?ProductCode=AN1HOM8040
http://www.beautystoponline.com/ShoppingCart.asp?ProductCode=AN1HOM8050
The description SEOmoz provides is:
302 (Temporary Redirect):
Using a 302 redirect will cause search engine crawlers to treat the redirect as temporary and not pass any link juice (ranking power). We highly recommend that you replace 302 redirects with 301 redirects.
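For reference, the mechanical difference between the two redirect types is just the status code the server sends. A sketch of what that looks like in an Apache .htaccess file (this is illustrative only and may not map onto Volusion's hosted platform, where redirects are configured through their admin panel; the paths and domain are placeholders):

```apache
# 302: temporary; search engines keep the old URL indexed and pass no equity
Redirect 302 /old-page.asp http://www.example.com/new-page.asp

# 301: permanent; search engines index the new URL and pass link equity
Redirect 301 /old-page.asp http://www.example.com/new-page.asp
```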
So to the probable solution: I do have the ability to change to a 301 redirect, but do I want to do this for my shopping cart? Does Google realize the dead end is legitimate? Or does it matter whether link juice is passed through my shopping cart? And again, does it impact my site as a whole?
Any help with this would be greatly appreciated.
Thank you
-
To the OP,
We are also on Volusion and have found that adding the meta robots tag for "noindex, follow" in the meta override area for categories has worked for us. We haven't yet found a way to add it to the SearchResults page, however.
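For anyone following along, the tag being described goes in each page's <head>. A sketch of what the meta override would need to emit:

```html
<!-- Keep the page out of the index, but let crawlers follow its links
     and pass equity through them -->
<meta name="robots" content="noindex, follow">
```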
-
If you trust the target site, follow the link. If you don't trust the target site, nofollow all the links.
If you feel the footer links will actually be seen and used, keep them. If they are not likely to be seen or used, I would suggest removing them.
-
I just set up the footers that appear on every page to nofollow the sites I care about, because otherwise those sites get thousands of links all from the same domain, and that can't be good for them.
I then made a single followable link to each of the sites I care about. I am hoping this is a good strategy. Sorry to digress from the original, interesting topic.
-
Thanks mate, I have been searching for a couple days on how to fix that warning.
-
As a rule, don't use "nofollow" on internal links.
-
Hey Ryan,
So I just confirmed with Volusion that certain pages such as these can take the "noindex, follow" tag and certain pages cannot; it's just the way their system is set up. For the pages that can, I will definitely apply the "noindex, follow" tag, and for the pages that cannot, I will go ahead and apply a robots.txt disallow. Also, if you wouldn't mind confirming: it's the "noindex, follow" meta tag that I should apply, not the "noindex, nofollow" tag?
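A sketch of the robots.txt disallow for the pages that cannot take the meta tag. The Email_Me_When_Back_In_Stock.asp path comes from the URLs earlier in the thread; the Email_A_Friend path is a guess based on the feature name and should be checked against the live URL before use:

```
User-agent: *
Disallow: /Email_Me_When_Back_In_Stock.asp
Disallow: /Email_A_Friend.asp
```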
Thanks for all of your assistance and guidance through all of this troubleshooting!
-
Thanks a lot, man. I'm going to check out that sitemap site. Also, I'm going to look into applying those "noindex, follow" tags on the pages instead. Thanks again.
-
Anthony,
You can begin a crawl of your site at any time. Click Research Tools in the menu bar and scroll down to On-Page Optimization Tools > Crawl Test. This will allow you to confirm your robots.txt settings are correct.
For sitemaps, http://xml-sitemaps.com/ seems to be quite popular. I would suggest checking them out first. They offer a free test for up to 500 pages, and it is $20 USD to buy their product if you like it.
For Google WMT, the "restricted by robots.txt" errors can be disregarded if you are confident the pages should be blocked. I would recommend allowing Google to crawl your site whenever possible and using the noindex meta tag to prevent the pages from being indexed. This approach would eliminate those errors.
-
Hello Ryan,
Thanks again for the reply; your time is appreciated. We are currently working on creating a sitemap to 'categorize' the links in both our product and category indexes. This should take care of the two highest counts of on-page links across our site. The majority of the remaining warnings are for under 250 links, so we should be good. Or let's hope so, because there really isn't anything we can do about it at this point. Also, by chance, do you know of (or can you refer) a company or independent designer who designs sitemaps? We have the XML sitemap file generated for Google; we just need someone to make it look nice.
Oh yeah, regarding all those duplicate title and duplicate content errors: they should be taken care of with a robots.txt disallow. That said, on our last SEOmoz crawl the errors still came up on those same "email a friend" and "email when back in stock" pages. I did submit the robots.txt file during the past scan, so this may be the reason. So before I start to wonder any further, I am going to wait until the next crawl is complete. Maybe you might know: going forward, will SEOmoz still pick up those duplicate page and title errors in the crawl with the robots.txt disallow in place?
Also, our Webmaster Tools is showing 180 "restricted by robots.txt" crawl errors, all from "email a friend" and "email when back in stock" pages on which the robots.txt disallow was just placed. I understand that even with the disallow, Google can still crawl whatever it chooses. Is this anything we should be concerned about? Also, please note that we have thousands of these pages and Webmaster Tools is only showing 180 of them.
Thanks again for your help
-
Hello again.
Thanks for sharing the information on #1 and #2. I have heard of Volusion before but have no experience with them. Based on what you have shared, it seems they may not be a great solution from an SEO perspective.
For #4, you are correct. The "META titles over 70 characters" warning means long titles will be truncated in search results. The other main consequence is that a title's weight is divided among the words in the title: the longer the title, the less weight applied to each term. If you know and understand these factors, you can choose to ignore the warning.
For #3, you definitely do not want "thousands of links on a page". You need to figure out a way to significantly lower the number of links. Search engines will follow a percentage then stop. Yes, I would say this is bad for SEO.
Somehow you need to categorize the links. Many blog sites will group links by month for the current year, and by year for past years. You could group by categories. Do something to get your number of links under control. You don't have to be under 100, but for now I would say you should be under 250 links.
-
Hey Ryan,
So from the answer you provided, we've been on a long journey trying to resolve the aforementioned issues. We took a little break over the 4th but got back to it a few days ago. For the most part, I believe we have at least concluded what needs to be done, or whether anything can actually be done. It is a bit tricky because we are working with Volusion (a third-party "shopping cart" service), which definitely limits flexibility. There are a lot of pros with a service such as Volusion (especially with limited resources and knowledge), but the cons are beginning to appear as our knowledge of SEO, web design, etc. grows. Anyway, I wanted to provide a thorough response to your answers and also throw in a few other questions that have arisen since.
1. In regard to the 302 temporary redirects on all product-specific shopping cart pages: we cannot apply a 301 redirect because it would then not allow customers to actually access the shopping cart page after adding a product to their cart. We were told, "if you redirect from the shopping cart page, the customer will not be allowed to check out. Each page is needed, so taking them out will cause errors on the site." I was then told to speak with their marketing services department, as they will be able to help solve my SEO needs. I have emailed them and hopefully will hear back. Most likely there is no way to resolve this issue.
2. You said our duplicate page and duplicate content issues can be resolved with canonical links. As you noticed, there are canonical links on the product, category, and home pages of our site. I wanted to mention that there is an "SEO friendly" way to apply these canonical links with Volusion: you just select a button that says "enable canonical links" in the back end of the store.
After speaking with Volusion support on this matter, we basically concluded that they forgot to apply these links to the "Email a Friend" and "Email When Back In Stock" pages. I have sent the SEO department an email on this as well and expect one of the following three responses:
1. "We will look into this as a future feature request"
2. "There is nothing that can be done"
3. "We know about this but don't worry, it will not impact your search rankings"
Either way, if they tell me there is no short-term solution, I will look into applying a "noindex, follow" tag.
3. I did not mention this issue in my initial question, but we are also receiving a warning of "too many links on page". In regard to keeping our on-page links under 100: other than the homepage and product/category index pages, we have done a pretty good job of limiting the number of links per page. That said, we have run into somewhat of an issue with category pages that have 70+ products assigned. We have set the default to show 60 products per page, but it appears the crawlers are picking up all products (even those on the 'next' pages), which makes the links per page very high. For example, the link below is showing 244 on-page links.
http://www.beautystoponline.com/Ardell-False-Eyelashes-s/71537.htm
There is no way there are that many links on this single page. But there are probably almost 200 products assigned to this category, which explains the high number of links. We were told this is occurring "due to the fact that all Category pages are generated as 'search results' pages (based on the category filter), and because of this, there is very little you would be able to do, as the code that generates search/category pages is system code that cannot be modified."
We were also told that we could submit it as a feature request on their forum, and if it is an idea that's popular among other merchants, their developers may consider changing how the links are coded in the future. Setting all that aside: do you by chance have an opinion or suggestion for a solution (if any)?
A quick side note on this topic, back to my mentioning that our category and product index pages show thousands of on-page links: it is self-explanatory why this is happening, but would you say it is a bad thing for SEO purposes? I know it's good for site structure and passing link juice, meaning that all pages on our site are only one click away from the root domain. Right?!
4. Another issue I did not previously mention was "META titles over 70 characters". I just wanted to confirm that if a title is more than 70 characters, the only negative is that the title is truncated and the full name won't appear in the search results. Past that, there shouldn't be any negative effect on Google search rankings, right? We have a few of these issues, but for the most part, the time it would take to correct a few characters over 70 is not worth it if there is no impact on rankings.
Anyway, man, if you do reply to this second post, your time is greatly appreciated and I thank you.
-
Dang... thanks Ryan for such an in-depth response! Gimme a few to take it all in and I may follow up with a few more questions. And Donnie, I appreciate the attention as well!
-
Hey Ryan, thanks. Thumbs up! I like your answer more
I should have looked further into this question. For some reason, I read the beginning and assumed we were talking about shopping-cart-style checkout pages.
And you're right, adding a nofollow to those links is a weak way of addressing Anthony's issue. Thanks for keeping me in check, Ryan.
-
Hello Anthony.
Presently you have legitimate duplicate page title and content issues. These are not false warnings.
The challenge for the crawl tool, and for Google, is to determine which of these many pages is the "real" page you want indexed and which pages are copies. Search engines do a pretty good job of sorting through your pages, but sometimes they get it wrong. You may do a search for "flat irons" and, instead of the main product page appearing, the "email me when it is back in stock" page displays in the search results. This clearly isn't best for your site, your users, or Google.
Your duplicate page and duplicate title issues can be resolved with canonicalization. What you need to do is add the canonical link tag to the original page and all copies. This will solve your current issue, along with other issues you have not encountered yet.
It seems you are using it on the actual product pages, but not on the "duplicate" pages. Using your example, I see a good canonical tag on http://www.beautystoponline.com/Andis-Profoil-Shaver-Replacement-Foil-Inner-Cutter-p/an1pro7130.htm. I do not see any canonical tag however on http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=AN1PRO7130.
When I take a look at the "email me" page, it really does not have any content. I do not believe there is any value for your site nor users in having this page indexed. I would recommend using the "noindex, follow" tag for this page. I normally agree with Donnie but I believe he might have mistyped in this case.
In summary, if you decide to duplicate a page in order to provide the best user experience, make sure all pages have the canonical tag pointing to the primary page you wish to be included in Google's index. If a page has no content and you do not wish it to be indexed, add the "noindex, follow" tag. There simply has to be a way to do this with your current software. If it takes a custom code or plug-in, then you should do it.
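Concretely, using the example pages from earlier in the thread, each "email me" page would carry a tag like this in its <head>, pointing at the product page that should be indexed:

```html
<link rel="canonical" href="http://www.beautystoponline.com/Andis-Profoil-Shaver-Replacement-Foil-Inner-Cutter-p/an1pro7130.htm">
```

Every duplicate then consolidates its signals onto that one primary URL.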
About the 302 redirects: all of the links you shared in your Q&A point to your SEOmoz Pro account, so I cannot access them. In short, the only time I would recommend using a 302 is when you have a temporary redirect in place that will be removed very shortly (i.e., in less than 30 days).
There is simply no benefit whatsoever to using a 302 to your shopping cart instead of a 301. Understand this is high-level, generic advice, which is the best I can do with this level of detail.
-
delete me...
wrong answer...
Do not pass GO, do not collect any PageRank