False Negative Warnings with Crawl Diagnostic Test
-
Ok... I will try to explain as clearly as possible. This issue concerns close to 5000 'Warnings' from our most recent SEOmoz PRO crawl diagnostic test. The top three warnings have about 6000 instances among them:
1. Duplicate Page Title
2. Duplicate Page Content
3. 302 (Temporary Redirect)
We understand that duplicate titles and content are "no-no's" and have made it a top priority to avoid duplication on any level. Here is where the issue lies... we are using the Volusion eCommerce solution, and they have a variety of value-add shopping features such as "Email A Friend" and "Email Me When Back In-Stock" on each product page.
If one of these options is clicked, you are then directed to the appropriate page. Each page has a different URL, with the individual product code as the sole variable. But because this is part of Volusion's ingrained functionality... the META title is the same for each page. It is taken from the title of our store homepage. Example below:
Online Beauty Supply Store | Hair Care Products | Nail Care | Flat Irons
http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=AN1PRO7130
Online Beauty Supply Store | Hair Care Products | Nail Care | Flat Irons
http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=BI8BIOSI34
The same goes for the duplicate content warnings. If you click on one of these features, it directs you to a page with pretty much the same content except for a different product. Basically each page has both duplicate content and a duplicate title.
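For anyone who wants to see how a crawler ends up flagging these pages, here is a minimal sketch in Python. It assumes you have a list of (URL, title) pairs from a crawl export; the function name `find_duplicate_titles` and the normalization step are illustrative assumptions, not how the SEOmoz crawler actually works.

```python
from collections import defaultdict


def find_duplicate_titles(pages):
    """Group URLs by page title and return only titles shared by 2+ URLs.

    `pages` is an iterable of (url, title) pairs, e.g. from a crawl export.
    """
    urls_by_title = defaultdict(list)
    for url, title in pages:
        # Collapse whitespace and ignore case so trivial differences
        # do not hide a duplicate (an assumption of this sketch).
        key = " ".join(title.split()).lower()
        urls_by_title[key].append(url)
    return {title: urls for title, urls in urls_by_title.items() if len(urls) > 1}


# The two example pages from the post share the store homepage title.
HOME_TITLE = "Online Beauty Supply Store | Hair Care Products | Nail Care | Flat Irons"
crawled = [
    ("http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=AN1PRO7130", HOME_TITLE),
    ("http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=BI8BIOSI34", HOME_TITLE),
]

duplicates = find_duplicate_titles(crawled)
for title, urls in duplicates.items():
    print(f"{len(urls)} pages share the title: {title}")
```

Because the only thing that differs between these URLs is the ProductCode parameter, every such page lands in the same group.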
SEOmoz's descriptions are:
Duplicate Page Title:
You should use unique titles for your different pages to ensure that they describe each page uniquely and don't compete with each other for keyword relevance.
Duplicate Page Content:
Content that is identical (or nearly identical) to content on other pages of your site forces your pages to unnecessarily compete with each other for rankings.
Because I know SEO is not an exact science, the question here is: does Google recognize that although these pages are duplicates, they are generated by a feature that makes us even more of a legitimate eCommerce site? Or, going by the SEOmoz description, if duplication is bad only because you do not want your pages competing with each other... should I not worry, because I couldn't care less if these pages don't get traffic? Or does it affect my domain authority as a whole?
Then as for a solution. I am still trying to work out with Volusion how we can change the META title of the pages. It's highly unlikely but we'll see. As for the duplicate content, there is no way to change one of these pages. It's hard coded.
Solution... so if it is bad (even though it shouldn't be), would it be worth it to disable these features? I hope not. Wouldn't that defeat the purpose of Google trying to provide the most legitimate, value-add sites to searchers?
As for the 302 (Temporary Redirect) warning... this appears only on our shopping cart pages. As with the "Email A Friend" feature, there is a page for every product. For example:
http://www.beautystoponline.com/ShoppingCart.asp?ProductCode=AN1HOM8040
http://www.beautystoponline.com/ShoppingCart.asp?ProductCode=AN1HOM8050
The description SEOmoz provides is:
302 (Temporary Redirect):
Using a 302 redirect will cause search engine crawlers to treat the redirect as temporary and not pass any link juice (ranking power). We highly recommend that you replace 302 redirects with 301 redirects.
So the probable solution... I do have the ability to change to a 301 redirect, but do I want to do this for my shopping cart? Does Google realize the dead end is legitimate? Or... does it matter if link juice is passed through my shopping cart? And again, does it impact my site as a whole?
It would be greatly appreciated if anyone could help me out with this stuff.
Thank you
-
To the OP,
We are also on Volusion and have found that adding the Meta Robots tag for "noindex, follow" in the meta override area for categories has worked for us. We haven't, however, found a way to add it to the SearchResults page at this time.
-
If you trust the target site, follow the link. If you don't trust the target site, nofollow all the links.
If you feel the footer links will actually be seen and used, keep them. If they are not likely to be seen or used, I would suggest removing them.
-
I just set up footers that are on every page to nofollow sites that I care about because otherwise they get thousands of pages linking all from the same domain - this can't be good for the actual site.
I then made a single link to the sites I care about which is followable. I am hoping this is a good strategy. Sorry to digress from the original interesting topic.
-
Thanks mate, I have been searching for a couple days on how to fix that warning.
-
As a rule, don't use "nofollow" on internal links.
-
Hey Ryan,
So I just confirmed with Volusion that certain pages such as these can have the "noindex, follow" tag and certain pages cannot. It's just the way their system is set up. So for the pages that can, I will for sure apply "noindex, follow", and for the pages that cannot, I will go ahead and apply a robots.txt disallow. Also, if you wouldn't mind confirming... it's the "noindex, follow" meta tag that I should apply? Not the "noindex, nofollow" tag?
Thanks for all of your assistance and guidance through all of this troubleshooting!
-
Thanks a lot, man. I'm going to check out that sitemap site. Also, I'm going to look into applying those "noindex, follow" tags on the pages instead. Thanks again.
-
Anthony,
You can begin a crawl of your site anytime. Click on Research Tools from the menu bar and scroll down to On-Page Optimization Tools > Crawl Test. This will allow you to confirm your robots.txt settings are set correctly.
For sitemaps, http://xml-sitemaps.com/ seems to be quite popular. I would suggest checking them out first. They offer a free test for up to 500 pages, and it is $20 USD to buy their product if you like it.
For Google WMT, the "restricted by robots.txt" errors can be disregarded if you are confident the pages should be blocked. I would recommend allowing Google to crawl your site whenever possible and using the noindex meta tag to prevent the pages from being indexed. This approach would eliminate those errors.
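As a quick way to confirm which URLs a robots.txt file blocks, here is a small Python sketch using the standard library's `urllib.robotparser`. The rule shown is an assumption about what your disallow might look like, based on the URL you shared; substitute your real file. Keep in mind that a crawler that is blocked from a page never fetches it, so it cannot see a noindex tag placed on that page.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule for illustration only; replace with your actual robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /Email_Me_When_Back_In_Stock.asp
"""


def blocked_urls(robots_txt, urls, user_agent="*"):
    """Return the subset of `urls` that the given robots.txt disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


urls = [
    "http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=AN1PRO7130",
    "http://www.beautystoponline.com/Andis-Profoil-Shaver-Replacement-Foil-Inner-Cutter-p/an1pro7130.htm",
]

for url in blocked_urls(ROBOTS_TXT, urls):
    print("blocked from crawling:", url)
```

The disallow is a prefix match on the path, so a single rule covers every ProductCode variation of the "email me" page while leaving the product page crawlable.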
-
Hello Ryan,
Thanks again for the reply... your time is appreciated. We are currently working on creating a site map to 'categorize' the links in both our product and category indexes. This should take care of the two highest counts of on-page links across our site. The majority of these warnings are for pages with under 250 links, so we should be good. Or, let's hope so, because there really isn't anything we can do about it at this point. Also, by chance do you know of, or can you refer, a company or independent who designs site maps? We have the XML sitemap file generated from Google; we just need someone to make it look nice.
Oh yeah, regarding all those duplicate title and duplicate content errors... they should be taken care of with a robots.txt disallow. With that said, on our last SEOmoz crawl the errors still came up on those same "email a friend" and "email when back in-stock" pages. Now... I did submit the robots.txt file during the past scan, so this may be the reason. So before I start to wonder any further, I am going to wait until the next crawl is complete. Maybe you might know... going forward, will SEOmoz still pick up those duplicate page and title errors in the crawl with the robots.txt disallow in place?
Also, our Webmaster Tools is showing 180 "restricted by robots.txt" crawl errors... all from "email a friend" and "email when back in-stock" pages on which the robots.txt disallow was just placed. I understand that even with the robots.txt disallow, Google can still crawl whatever it chooses. Is this anything that we should be concerned about? Also, please note that we have 1000's of these pages and Webmaster Tools is only showing 180 of them.
Thanks again for your help
-
Hello again.
Thanks for sharing the information on #1 and #2. I have heard of Volusion before but have no experience with them. Based on what you have shared, it seems they may not be a great solution from an SEO perspective.
For #4, you are correct. The "META titles over 70 characters" notice is a warning that long titles will be truncated. The other main consequence is that a title's weight is divided amongst the words in the title. The longer the title, the less weight is applied to each term in the title. If you know and understand these factors, you can choose to ignore the warning.
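If you want to find the long titles yourself, a check like the following Python sketch would do it. The 70-character limit comes from the warning itself; the function name `check_title` and the "..." preview are illustrative assumptions, and the real cutoff in search results can vary, so treat the preview as an approximation.

```python
TITLE_LIMIT = 70  # threshold named in the crawl warning


def check_title(title, limit=TITLE_LIMIT):
    """Return (is_over_limit, preview) for a page title.

    The preview approximates truncation by cutting at `limit` characters.
    """
    title = " ".join(title.split())  # collapse stray whitespace
    over = len(title) > limit
    preview = title[:limit].rstrip() + "..." if over else title
    return over, preview


# The store homepage title quoted earlier in the thread.
HOME_TITLE = "Online Beauty Supply Store | Hair Care Products | Nail Care | Flat Irons"

over, preview = check_title(HOME_TITLE)
print("over limit:", over)
print("preview:", preview)
```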
For #3, you definitely do not want "thousands of links on a page". You need to figure out a way to significantly lower the number of links. Search engines will follow a percentage then stop. Yes, I would say this is bad for SEO.
Somehow you need to categorize the links. Many blog sites group links by month for the current year, and by year for past years. You could group by categories. Do something to get your number of links under control. You don't have to be under 100, but for now I would say you should be under 250 links.
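To see how many links a page really contains, you can count the anchor tags in its HTML source. Below is a minimal Python sketch using the standard library's `html.parser`; the sample markup is made up for illustration, and a real crawler may count links differently (for example, by de-duplicating URLs).

```python
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Collect the href of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def count_links(html):
    parser = LinkCounter()
    parser.feed(html)
    parser.close()
    return len(parser.links)


# Hypothetical markup: two real links and one named anchor without an href.
sample = """
<ul>
  <li><a href="/category-a.htm">Category A</a></li>
  <li><a href="/category-b.htm">Category B</a></li>
  <li><a name="top">Not a link</a></li>
</ul>
"""
print(count_links(sample))
```

Running this against the saved source of a category page would show whether products from the 'next' pages are present in the markup.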
-
Hey Ryan,
So from the answer you provided... we've been on a long journey trying to resolve the aforementioned issues. We took a little break over the 4th but then got back to it a few days ago. For the most part, I believe we have at least concluded what needs to be done, or whether anything can actually be done regarding a solution. It is a bit tricky because we are working with Volusion (a 3rd party "shopping cart" service), which definitely limits flexibility. There are a lot of 'pros' with a service such as Volusion (especially with limited resources and knowledge), but the 'cons' are beginning to appear as our knowledge of SEO, web design, etc. begins to grow. Anyway, I wanted to provide a thorough response to your answers and also throw in a few other questions that have arisen since.
1. In regard to the 302 temporary redirects on all product-specific shopping cart pages... we cannot apply a 301 redirect because it would then not allow any customers to actually access the shopping cart page after adding a product to their cart. We were told, "if you redirect from the shopping cart page, the customer will not be allowed to checkout. Each page is needed so taking them out will cause error to the site." I was then told to speak with their marketing services department on the issue, as they will be able to help solve my SEO needs. I have emailed them and hopefully will hear back. Most likely there is no way to resolve this issue.
2. You said our duplicate page and duplicate content issues can be resolved with canonical links. As you noticed, there are canonical links on the product, category and homepage of our site. I wanted to mention that there is an "SEO friendly" way to apply these canonical links with Volusion. You just select a button that says "enable canonical links" in the back end of the store.
After speaking with Volusion support on this matter, we basically concluded that they forgot to apply these links to the "Email a Friend" and "Email When Back In Stock" pages. I have sent the SEO department an email on this as well and expect to get one of the following three responses.
1. "We will look into this as a future feature request"
2. "There is nothing that can be done"
3. "We know about this but don't worry, it will not impact your search rankings"
Either way, if they tell me there is not a short term solution... I will look into applying a "no index, follow" tag.
3. I did not mention this issue in my initial question, but we are also receiving a 'warning' of "too many links on page". In regard to keeping our on-page links under 100... other than the homepage and product/category index pages, we have done a pretty good job of limiting the number of links per page. With that said, we have run into somewhat of an issue with category pages that have 70+ products assigned. We have set the default to show 60 products per page, but it appears the crawlers are picking up all products (even the ones on the 'next' pages) for that page, which is making the links per page very high. For example... the below link is showing 244 on-page links.
http://www.beautystoponline.com/Ardell-False-Eyelashes-s/71537.htm
There is no way there are that many links on this single page. But there are probably almost 200 products assigned to this category, which explains the high number of links. We were told this is occurring "due to the fact that all Category pages are generated as "search results" pages (based on the category filter), and because of this, there is very little you would be able to do, as the code that generates search/category pages is system code that cannot be modified."
We were also told that we could submit it as a feature request on their forum, and if it is an idea that's popular amongst other merchants, their developers may take it into consideration and change how the links are coded in the future. Aside from all this... by chance do you have an opinion or suggestion for a solution (if any)?
A quick side note on this topic... as I mentioned, our category and product index pages are showing thousands of on-page links. It is self-explanatory why this is happening... but would you say it is a bad thing for SEO purposes? I know it's good for site structure and passing link juice, meaning that all pages on our site are only 1 click away from the root domain. Right?
4. Another issue I did not previously mention was 'META titles over 70 Characters'. I just wanted to confirm that if a title is more than 70 characters, the only negative is that the title is truncated and the full name won't appear in the search results. Past that, there shouldn't be any negative effect on Google search rankings from this, right? We have a few of these issues, but for the most part... the time it would take to correct a few characters over 70 is not worth it if there is no impact on search rankings.
Anyway man... if you do reply to this 2nd post, your time is greatly appreciated and I thank you.
-
Dang... thanks Ryan for such an in-depth response! Gimme a few to take it all in and I may follow up with a few more questions. And Donnie, I appreciate the attention as well!
-
Hey Ryan, thanks. Thumbs up! I like your answer more
I should have looked further into this question. For some reason, I read the beginning and assumed we were talking about shopping cart style, check out pages.
And, you're right. Adding a nofollow to those links is a weak way of addressing Anthony's issue. Thanks for keeping me in check Ryan
-
Hello Anthony.
Presently you have legitimate duplicate page title and content issues. These are not false warnings.
The challenge for the crawl tool, and for Google, is to determine which of these many pages is the "real" page you want to be indexed, and which pages are copies. Search engines do a pretty good job of sorting through your pages, but sometimes they will get it wrong. You may do a search for "flat irons" and instead of the main product page appearing, the "email me when it is back in stock" page will display in the search results. This clearly isn't best for your site, your users, or Google.
Your duplicate page and duplicate title issues can be resolved with canonicalization. What you need to do is add the canonical link tag (rel="canonical") to the original page and all copies. This will solve your current issue, along with other issues which you have not encountered yet.
It seems you are using it on the actual product pages, but not on the "duplicate" pages. Using your example, I see a good canonical tag on http://www.beautystoponline.com/Andis-Profoil-Shaver-Replacement-Foil-Inner-Cutter-p/an1pro7130.htm. I do not see any canonical tag however on http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=AN1PRO7130.
When I take a look at the "email me" page, it really does not have any content. I do not believe there is any value for your site nor users in having this page indexed. I would recommend using the "noindex, follow" tag for this page. I normally agree with Donnie but I believe he might have mistyped in this case.
In summary, if you decide to duplicate a page in order to provide the best user experience, make sure all pages have the canonical tag pointing to the primary page you wish to be included in Google's index. If a page has no content and you do not wish it to be indexed, add the "noindex, follow" tag. There simply has to be a way to do this with your current software. If it takes a custom code or plug-in, then you should do it.
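To verify which pages carry these tags, you could scan each page's HTML for a `<link rel="canonical">` element and a `<meta name="robots">` element. The Python sketch below does that with the standard library; the two HTML strings are simplified stand-ins I wrote for illustration, not the actual source of your pages.

```python
from html.parser import HTMLParser


class HeadTagFinder(HTMLParser):
    """Record the canonical URL and the robots meta directive, if present."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.robots = (attrs.get("content") or "").lower()


def read_head_tags(html):
    """Return (canonical_url, robots_directive); either may be None."""
    finder = HeadTagFinder()
    finder.feed(html)
    finder.close()
    return finder.canonical, finder.robots


# Simplified, hypothetical markup for a product page and an "email me" page.
product_html = (
    '<html><head><link rel="canonical" href="http://www.beautystoponline.com/'
    'Andis-Profoil-Shaver-Replacement-Foil-Inner-Cutter-p/an1pro7130.htm">'
    "</head><body></body></html>"
)
email_html = (
    '<html><head><meta name="robots" content="noindex, follow">'
    "</head><body></body></html>"
)

print(read_head_tags(product_html))
print(read_head_tags(email_html))
```

A page that returns `(None, None)` has neither tag, which is the situation described above for the "email me" pages.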
About the 302 redirects, all of the links you shared in your Q&A are to your SEOmoz PRO account, so I cannot access them. In short, the only time I would recommend using a 302 is when you have a temporary redirect in place which will be removed very shortly (i.e. less than 30 days).
There is simply no benefit whatsoever to using a 302 to your shopping cart instead of a 301. Understand this is high level, generic advice which is the best I can do with this level of detail.
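If you want to confirm what a given URL actually returns, you can request it without following the redirect and look at the status code. This is a rough Python sketch using the standard library's `http.client`; the helper names `fetch_status` and `classify_redirect` are my own, and the network call is left as a commented usage line since I cannot test it against your site.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Request `url` without following redirects.

    Returns (status_code, Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def classify_redirect(status_code):
    """Label a status code the way the crawl report distinguishes them."""
    if status_code == 301:
        return "permanent redirect"
    if status_code == 302:
        return "temporary redirect"
    if 300 <= status_code < 400:
        return "other redirect"
    return "not a redirect"


# Usage (performs a live request, so it is left commented out):
# status, location = fetch_status(
#     "http://www.beautystoponline.com/ShoppingCart.asp?ProductCode=AN1HOM8040")
# print(status, classify_redirect(status), location)
print(classify_redirect(302))
```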