False Negative Warnings with Crawl Diagnostic Test
-
Ok... I will try to explain this as clearly as possible. This issue concerns the close to 5000 'Warnings' from our most recent SEOmoz Pro crawl diagnostic test. The top three warnings have about 6000 instances among them:
1. Duplicate Page Title
2. Duplicate Page Content
3. 302 (Temporary Redirect)
We understand that duplicate titles and content are "no-no's" and have made it a top priority to avoid duplication on any level. Here is where the issue lies... we are using the Volusion eCommerce solution and they have a variety of value-add shopping features such as "Email A Friend" and "Email Me When Back In-Stock" on each product page.
If one of these options is clicked, you are then directed to the appropriate page. Each page has a different URL, with the individual product code as the sole variable. But because this is part of Volusion's ingrained functionality... the META title is the same for each page. It is taken from the title of our store homepage. Example below:
Online Beauty Supply Store | Hair Care Products | Nail Care | Flat Irons
http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=AN1PRO7130
Online Beauty Supply Store | Hair Care Products | Nail Care | Flat Irons
http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=BI8BIOSI34
The same goes for the duplicate content warnings. If you click on one of these features, it directs you to a page with pretty much the same content except for a different product. Basically each page has both duplicate content and a duplicate title.
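For anyone who wants to reproduce this kind of check outside the crawl tool, here is a minimal Python sketch. It assumes you already have a list of (URL, title) pairs, for example from a crawl export; the function name `find_duplicate_titles` is made up for illustration and this is not how the SEOmoz crawler itself works.

```python
from collections import defaultdict

def find_duplicate_titles(pages):
    """Group URLs by page title and return only titles shared by 2+ URLs.

    `pages` is an iterable of (url, title) pairs (hypothetical input format).
    """
    by_title = defaultdict(list)
    for url, title in pages:
        # Collapse whitespace and ignore case so near-identical titles match.
        key = " ".join(title.split()).lower()
        by_title[key].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}

HOME_TITLE = "Online Beauty Supply Store | Hair Care Products | Nail Care | Flat Irons"
pages = [
    ("http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=AN1PRO7130", HOME_TITLE),
    ("http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=BI8BIOSI34", HOME_TITLE),
]

duplicates = find_duplicate_titles(pages)
for title, urls in duplicates.items():
    print(f"{len(urls)} pages share the title: {title}")
```

Both example URLs end up in the same group because the only thing that differs between them is the product code in the query string.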
SEOmoz's descriptions are:
Duplicate Page Title:
You should use unique titles for your different pages to ensure that they describe each page uniquely and don't compete with each other for keyword relevance.
Duplicate Page Content:
Content that is identical (or nearly identical) to content on other pages of your site forces your pages to unnecessarily compete with each other for rankings.
Because I know SEO is not an exact science, the question here is: does Google recognize that although they are duplicates, they are actually generated from a feature that makes us even more of a legitimate eCommerce site? Or, going by the SEOmoz description, if duplication is bad only because you do not want your pages competing with each other... should I not worry, because I couldn't care less if these pages don't get traffic? Or does it affect my domain authority as a whole?
Then as for a solution. I am still trying to work out with Volusion how we can change the META title of the pages. It's highly unlikely but we'll see. As for the duplicate content, there is no way to change one of these pages. It's hard coded.
Solution... so if it is bad (even though it shouldn't be), would it be worth it to disable these features? I hope not. Wouldn't that defeat the purpose of Google trying to provide the most legitimate, value-add sites to searchers?
As for the 302 (Temporary Redirect) warning... this is only appearing on our shopping cart pages. As with the "Email A Friend" feature, there is a page for every product. For example:
http://www.beautystoponline.com/ShoppingCart.asp?ProductCode=AN1HOM8040
http://www.beautystoponline.com/ShoppingCart.asp?ProductCode=AN1HOM8050
The description SEOmoz provides is:
302 (Temporary Redirect):
Using a 302 redirect will cause search engine crawlers to treat the redirect as temporary and not pass any link juice (ranking power). We highly recommend that you replace 302 redirects with 301 redirects.
So the probable solution... I do have the ability to change to a 301 redirect, but do I want to do this for my shopping cart? Does Google realize the dead end is legitimate? Or... does it matter if link juice is passed through my shopping cart? And again, does it impact my site as a whole?
It would be greatly appreciated if anyone could help me out with this stuff.
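To see which status code a URL actually returns, a small script can request it without following the redirect. The following is a rough Python sketch using only the standard library; the helper names `describe_redirect` and `fetch_status` are invented for this example, and the labels are just a plain reading of the status codes, not the crawl tool's own logic.

```python
import http.client
from urllib.parse import urlsplit

def describe_redirect(status):
    """Give a plain-language label for an HTTP status code."""
    if status == 301:
        return "permanent redirect"
    if status == 302:
        return "temporary redirect"
    if 300 <= status < 400:
        return "other redirect"
    return "not a redirect"

def fetch_status(url, timeout=10):
    """Return (status, Location header) for `url` without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # HEAD asks for headers only; http.client never follows redirects itself.
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()

# Example (makes a live request, so it is left commented out):
# status, location = fetch_status("http://www.beautystoponline.com/ShoppingCart.asp?ProductCode=AN1HOM8040")
# print(status, describe_redirect(status), location)
```

Some servers answer HEAD differently from GET, so treat the result as a first check rather than proof.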
Thank you
-
To the OP,
We are also on Volusion and have found that adding the Meta Robots tag for noindex, follow in the meta override area for categories has worked for us. We haven't found a way to add it however to the SearchResults page at this time.
-
If you trust the target site, follow the link. If you don't trust the target site, nofollow all the links.
If you feel the footer links will actually be seen and used, keep them. If they are not likely to be seen or used, I would suggest removing them.
-
I just set up footers that are on every page to nofollow sites that I care about because otherwise they get thousands of pages linking all from the same domain - this can't be good for the actual site.
I then made a single link to the sites I care about which is followable. I am hoping this is a good strategy. Sorry to digress from the original interesting topic.
-
Thanks mate, I have been searching for a couple days on how to fix that warning.
-
As a rule, don't use "nofollow" on internal links.
-
Hey Ryan,
So I just confirmed with Volusion that certain pages such as these can have the "noindex, follow" tag and certain pages cannot. It's just the way their system is set up. So for the pages that can, I will for sure apply "noindex, follow", and for the pages that cannot, I will go ahead and apply a disallow in robots.txt. Also, if you wouldn't mind confirming... it's the "noindex, follow" meta tag that I should apply? Not the "noindex, nofollow" tag?
Thanks for all of your assistance and guidance through all of this troubleshooting!
-
Thanks a lot, man. I'm going to check out that site map site. Also, I'm going to look into applying those "noindex, follow" tags on the pages instead. Thanks again
-
Anthony,
You can begin a crawl of your site anytime. Click on Research Tools from the menu bar and scroll down to On-Page Optimization Tools > Crawl Test. This will allow you to confirm your robots.txt settings are set correctly.
For sitemaps, http://xml-sitemaps.com/ seems to be quite popular. I would suggest checking them out first. They offer a free test for up to 500 pages, and it is $20 USD to buy their product if you like it.
For Google WMT, the "restricted by robots.txt" errors can be disregarded if you are confident the pages should be blocked. I would recommend allowing Google to crawl your site whenever possible and using the noindex meta tag to prevent the pages from being indexed. This approach would eliminate those errors.
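If you want to sanity-check a robots.txt rule before the next crawl, Python's standard `urllib.robotparser` can evaluate it locally. This is only a sketch: the robots.txt content below is a hypothetical example built from the URL pattern quoted in this thread, not the site's actual file, and `is_crawlable` is a made-up helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; the path comes from the example URLs in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /Email_Me_When_Back_In_Stock.asp
"""

def is_crawlable(robots_txt, url, user_agent="*"):
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

email_page_ok = is_crawlable(
    ROBOTS_TXT,
    "http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=AN1PRO7130",
)
product_page_ok = is_crawlable(
    ROBOTS_TXT,
    "http://www.beautystoponline.com/Andis-Profoil-Shaver-Replacement-Foil-Inner-Cutter-p/an1pro7130.htm",
)
print("email page crawlable:", email_page_ok)
print("product page crawlable:", product_page_ok)
```

Keep in mind that a page blocked this way cannot have its noindex meta tag read by a crawler, which is one reason to pick one approach per page rather than combining them.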
-
Hello Ryan,
Thanks again for the reply... your time is appreciated. We are currently working on creating a site map to 'categorize' the links in both our product and category indexes. This should take care of the two highest numbers of on-page links across our site. The majority of these warnings are under 250 links so we should be good. Or, let's hope so, because there really isn't anything we can do about it at this point. Also, by chance do you know of, or can you refer, a company or independent who designs site maps? We have the XML site map file generated from Google; we just need someone to make it look nice.
Oh yeah, regarding all those duplicate title and duplicate content errors... they should be taken care of with a disallow in the robots.txt file. With that said, on our last SEOmoz crawl the errors still came up on those same "email a friend" and "email when back in-stock" pages. Now... I did submit the robots.txt file during the past scan so this may be the reason. So before I start to wonder any further, I am going to wait until the next crawl is complete. Maybe you might know... in the future, will SEOmoz still pick up those duplicate page and title errors in the crawl with the disallow in the robots.txt file?
Also, our webmaster tools is showing 180 "restricted by robots.txt" crawl errors... all from "email a friend" and "email when back in-stock" pages on which the robots.txt disallow was just placed. I understand that even with the disallow in robots.txt, Google can still crawl whatever it chooses. Is this anything that we should be concerned about? Also please note that we have 1000's of these pages and webmaster tools is only showing 180 of them.
Thanks again for your help
-
Hello again.
Thanks for sharing the information on #1 and 2. I have heard of Volusion before but have no experience with them. Based on what you have shared it seems they may not be a great solution from a SEO perspective.
For #4, you are correct. The "META titles over 70 characters" is a warning that long titles will be truncated. The other main consequence is a title's weight is divided amongst the words in the title. The longer the title, the less weight that is applied to each term in the title. If you know and understand these factors, you can choose to ignore the warning.
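A quick way to see which titles would trip that warning is to measure them yourself. This is a minimal Python sketch; the 70-character limit is the figure from the warning discussed here, and `title_report` is a hypothetical helper, not part of any Moz tool.

```python
TITLE_LIMIT = 70  # threshold named in the crawl warning discussed above

def title_report(title, limit=TITLE_LIMIT):
    """Report the length of a title and how far it runs past `limit`."""
    title = " ".join(title.split())  # collapse stray whitespace before counting
    over_by = max(0, len(title) - limit)
    return {"length": len(title), "over_by": over_by, "flagged": over_by > 0}

report = title_report(
    "Online Beauty Supply Store | Hair Care Products | Nail Care | Flat Irons"
)
print(report)
```

The title used here is the store homepage title quoted in the original question.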
For #3, you definitely do not want "thousands of links on a page". You need to figure out a way to significantly lower the number of links. Search engines will follow a percentage then stop. Yes, I would say this is bad for SEO.
Somehow you need to categorize the links. Many blog sites group links by month for the current year, and by year for past years. You could group by categories. Do something to get your number of links under control. You don't have to be under 100, but for now I would say you should be under 250 links.
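To get your own count of links on a page, something like the following Python sketch can help. It uses the standard library HTML parser and counts `<a>` tags that have an `href`; the 250 figure is only the rough guideline from this answer, the sample HTML is invented, and a real crawler may count links differently.

```python
from html.parser import HTMLParser

LINK_BUDGET = 250  # rough guideline from the answer above, not a hard rule

class LinkCounter(HTMLParser):
    """Collect the href of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

def count_links(html):
    counter = LinkCounter()
    counter.feed(html)
    return len(counter.hrefs)

# Invented sample: two real links and one named anchor without an href.
sample = (
    '<ul><li><a href="/a.htm">A</a></li>'
    '<li><a href="/b.htm">B</a></li>'
    '<li><a name="top">anchor</a></li></ul>'
)
link_count = count_links(sample)
print(link_count, "links;", "over budget" if link_count > LINK_BUDGET else "within budget")
```

Running this against the saved HTML of a category page would show whether products from the 'next' pages are really present in the markup.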
-
Hey Ryan,
So from the answer you provided... we've been on a long journey trying to resolve the aforementioned issues. We took a little break over the 4th but then got back to it a few days ago. For the most part, I believe we have at least concluded what needs to be done, or whether anything can actually be done, regarding a solution. It is a bit tricky because we are working with Volusion (a 3rd party "shopping cart" service), which definitely limits flexibility. There are a lot of 'pros' with a service such as Volusion (especially with limited resources and knowledge) but the 'cons' are beginning to appear as our knowledge of SEO, web design, etc. begins to grow. Anyway, I wanted to just provide a thorough response to your answers and also throw in a few other questions that have arisen since.
1. In regard to the 302 Temporary redirects on all product-specific shopping cart pages... we cannot apply a 301 redirect because it will then not allow any customers to actually access the shopping cart page after adding a product to their cart. We were told, "if you redirect from the shopping cart page, the customer will not be allowed to checkout. Each page is needed so taking them out will cause error to the site." I was then told to speak with their marketing services department on the issue as they will be able to help solve my SEO needs. I have emailed them and hopefully will hear back. Most likely there is no way to resolve this issue.
2. You said our duplicate page and duplicate content issues can be resolved with canonical links. As you noticed, there are canonical links on the product, category and homepage of our site. I wanted to mention that there is an "SEO friendly" way to apply these canonical links with Volusion. You just select a button that says "enable canonical links" in the back end of the store.
After speaking with Volusion support on this matter, we basically concluded that they forgot to apply these links to the "Email a Friend" and "Email When Back In Stock" pages. I have sent the SEO department an email on this as well and expect to get one of the following three responses.
1. "We will look into this as a future feature request"
2. "There is nothing that can be done"
3. "We know about this but don't worry, it will not impact your search rankings"
Either way, if they tell me there is not a short term solution... I will look into applying a "noindex, follow" tag.
3. I did not mention this issue in my initial question but we are also receiving a 'warning' of "too many links on page". In regard to keeping our on-page links under 100... other than the homepage and product/category index pages, we have done a pretty good job of limiting the number of links per page. With that said, we have run into somewhat of an issue with category pages that have 70+ products assigned. We have set the default to show 60 products per page but it appears the crawlers are picking up all products (even the ones on the 'next' pages) for that page, which is making the links per page very high. For example... the below link is showing 244 on-page links.
http://www.beautystoponline.com/Ardell-False-Eyelashes-s/71537.htm
There is no way there are that many links on this single page. But there are probably almost 200 products assigned to this category, which explains the high number of links. We were told this is occurring "due to the fact that all Category pages are generated as "search results" pages (based on the category filter), and because of this, there is very little you would be able to do, as the code that generates search/category pages is system code that cannot be modified."
We were also told that we could submit it as a feature request on their forum and if it is an idea that's popular amongst other merchants, their developers may take it into consideration and change how the links are coded in the future. Aside from all this... by chance do you have an opinion or suggestion for a solution? (if any)
A quick side note on this topic... back to my point that our category and product index pages are showing thousands of on-page links. It is self-explanatory why this is happening... but would you say it is a bad thing for SEO purposes? I know it's good for site structure and passing link juice, meaning that all pages on our site are only 1 click away from the root domain. Right?!?!?!?
4. Another issue I did not previously mention was 'META titles over 70 Characters'. I just wanted to confirm that if a title is more than 70 characters, the only negative is the truncated title, so the full name won't appear in the search results. Past that, there shouldn't be any negative effect on Google search rankings from this, right? We have a few of these issues but for the most part... the time it would take to trim a few characters over 70 is not worth it if there is no impact on search rankings.
Anyway man... if you do reply to this 2nd post, your time is greatly appreciated and I thank you
-
Dang... thanks Ryan for such an in-depth response! Gimme a few to take it all in and I may follow up with a few more questions. And Donnie, I appreciate the attention as well!
-
Hey Ryan, thanks. Thumbs up! I like your answer more
I should have looked further into this question. For some reason, I read the beginning and assumed we were talking about shopping cart style, check out pages.
And, you're right. Adding a nofollow to those links is a weak way of addressing Anthony's issue. Thanks for keeping me in check Ryan
-
Hello Anthony.
Presently you have legitimate duplicate page title and content issues. These are not false warnings.
The challenge for the crawl tool, and for Google, is to determine which of these many pages is the "real" page you want to be indexed, and which pages are copies. Search engines do a pretty good job of sorting through your pages but sometimes they will get it wrong. You may do a search for "flat irons" and instead of the main product page appearing, the "email me when it is back in stock" page will display in the search results. This clearly isn't best for your site, your users, or Google.
Your duplicate page and duplicate title issues can be resolved with canonicalization. What you need to do is add the canonical link tag (rel="canonical") to the original page and all copies. This will solve your current issue, along with other issues which you have not encountered yet.
It seems you are using it on the actual product pages, but not on the "duplicate" pages. Using your example, I see a good canonical tag on http://www.beautystoponline.com/Andis-Profoil-Shaver-Replacement-Foil-Inner-Cutter-p/an1pro7130.htm. I do not see any canonical tag however on http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=AN1PRO7130.
When I take a look at the "email me" page, it really does not have any content. I do not believe there is any value for your site nor users in having this page indexed. I would recommend using the "noindex, follow" tag for this page. I normally agree with Donnie but I believe he might have mistyped in this case.
In summary, if you decide to duplicate a page in order to provide the best user experience, make sure all pages have the canonical tag pointing to the primary page you wish to be included in Google's index. If a page has no content and you do not wish it to be indexed, add the "noindex, follow" tag. There simply has to be a way to do this with your current software. If it takes a custom code or plug-in, then you should do it.
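To check which of these hints a given page actually carries, here is a small Python sketch that pulls the canonical URL and the robots meta directives out of a page's HTML. It relies only on the standard library; the class and function names are invented for illustration, and the two HTML snippets are simplified stand-ins rather than the real page source.

```python
from html.parser import HTMLParser

class IndexingHints(HTMLParser):
    """Record the rel="canonical" href and the robots meta directives, if any."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = None

    def handle_starttag(self, tag, attrs):
        # Valueless attributes arrive as None; treat them as empty strings.
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attrs.get("rel", "").lower().split():
            self.canonical = attrs.get("href")
        elif tag == "meta" and attrs.get("name", "").lower() == "robots":
            content = attrs.get("content", "")
            self.robots = [part.strip().lower() for part in content.split(",")]

def indexing_hints(html):
    parser = IndexingHints()
    parser.feed(html)
    return parser.canonical, parser.robots

PRODUCT_URL = "http://www.beautystoponline.com/Andis-Profoil-Shaver-Replacement-Foil-Inner-Cutter-p/an1pro7130.htm"

# Simplified stand-ins for a product page and an "email me" page.
product_html = f'<head><link rel="canonical" href="{PRODUCT_URL}"></head>'
email_html = '<head><meta name="robots" content="noindex, follow"></head>'

product_hints = indexing_hints(product_html)
email_hints = indexing_hints(email_html)
print(product_hints)
print(email_hints)
```

A page that returns neither value is one where the search engine is left to guess, which is the situation described above for the "email me" pages.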
About the 302 redirects, all of the links you shared in your Q&A are to your SEOmoz Pro account so I cannot access them. In short, the only time I would recommend using a 302 is when you have a temporary redirect in place which will be removed very shortly (i.e. in less than 30 days).
There is simply no benefit whatsoever to using a 302 to your shopping cart instead of a 301. Understand this is high level, generic advice which is the best I can do with this level of detail.
-
delete me...
wrong answer...
Do not pass GO, do not collect any PageRank