False Negative Warnings with Crawl Diagnostic Test
-
Ok... I will try to explain as clearly as possible. This issue concerns close to 5000 'Warnings' from our most recent SEOmoz Pro crawl diagnostic test. The top three warnings have about 6000 instances among them:
1. Duplicate Page Title
2. Duplicate Page Content
3. 302 (Temporary Redirect)
We understand that duplicate titles and content are "no-no's" and have made it a top priority to avoid duplication on any level. Here is where the issue lies... we are using the Volusion eCommerce solution and they have a variety of value-add shopping features such as "Email A Friend" and "Email Me When Back In-Stock" on each product page.
If one of these options is clicked, you are then directed to the appropriate page. Each page has a different URL, and the only thing that varies is the individual product code. But because this is part of Volusion's ingrained functionality... the META title is the same for each page. It is taken from the title of our store homepage. Example below:
Online Beauty Supply Store | Hair Care Products | Nail Care | Flat Irons
http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=AN1PRO7130
Online Beauty Supply Store | Hair Care Products | Nail Care | Flat Irons
http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=BI8BIOSI34
The same goes for the duplicate content warnings. If you click on one of these features, it directs you to a page with pretty much the same content except for a different product. Basically each page has both duplicate content and a duplicate title.
The SEOmoz descriptions are:
Duplicate Page Title:
You should use unique titles for your different pages to ensure that they describe each page uniquely and don't compete with each other for keyword relevance.
Duplicate Page Content:
Content that is identical (or nearly identical) to content on other pages of your site forces your pages to unnecessarily compete with each other for rankings.
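To illustrate how a crawler ends up flagging these pages, here is a minimal Python sketch that groups pages by title and reports any title shared by more than one URL. The URLs and the shared title come from this thread; the product page title is a made-up placeholder, and the function name is hypothetical. This is not how the SEOmoz crawler actually works, just a sketch of the idea.

```python
from collections import defaultdict

BASE = "http://www.beautystoponline.com"
SHARED_TITLE = "Online Beauty Supply Store | Hair Care Products | Nail Care | Flat Irons"

# (url, title) pairs. The last title is a placeholder, not the real page title.
pages = [
    (BASE + "/Email_Me_When_Back_In_Stock.asp?ProductCode=AN1PRO7130", SHARED_TITLE),
    (BASE + "/Email_Me_When_Back_In_Stock.asp?ProductCode=BI8BIOSI34", SHARED_TITLE),
    (BASE + "/Andis-Profoil-Shaver-Replacement-Foil-Inner-Cutter-p/an1pro7130.htm",
     "Placeholder product title"),
]


def find_duplicate_titles(pages):
    """Return {normalized title: [urls]} for titles used by more than one URL."""
    by_title = defaultdict(list)
    for url, title in pages:
        # Collapse whitespace and ignore case so trivial differences still match.
        key = " ".join(title.split()).lower()
        by_title[key].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}


duplicates = find_duplicate_titles(pages)
for title, urls in duplicates.items():
    print(f"{len(urls)} pages share the title: {title}")
    for url in urls:
        print("  ", url)
```

Because every "Email Me When Back In Stock" page inherits the homepage title, each product adds one more URL to the same group, which is why the warning count grows with the catalog.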
Because I know SEO is not an exact science, the question here is: does Google recognize that although these are duplicates, they are generated by a feature that makes us even more of a legitimate eCommerce site? Or, going by the SEOmoz description, if duplication is bad only because you do not want your pages competing with each other... should I not worry, because I couldn't care less if these pages don't get traffic? Or does it affect my domain authority as a whole?
Then as for a solution. I am still trying to work out with Volusion how we can change the META title of the pages. It's highly unlikely but we'll see. As for the duplicate content, there is no way to change one of these pages. It's hard coded.
Solution... so if it is bad (even though it shouldn't be), would it be worth it to disable these features? I hope not. Wouldn't that defeat the purpose of Google trying to provide the most legitimate, value-add sites to searchers?
As for the 302 (Temporary Redirect) warning... this is only appearing on our shopping cart pages. As with the "Email A Friend" feature, there is a page for every product. For example:
http://www.beautystoponline.com/ShoppingCart.asp?ProductCode=AN1HOM8040
http://www.beautystoponline.com/ShoppingCart.asp?ProductCode=AN1HOM8050
The description SEOmoz provides is:
302 (Temporary Redirect):
Using a 302 redirect will cause search engine crawlers to treat the redirect as temporary and not pass any link juice (ranking power). We highly recommend that you replace 302 redirects with 301 redirects.
So the probable solution... I do have the ability to change to a 301 redirect, but do I want to do this for my shopping cart? Does Google realize the dead end is legitimate? Or... does it matter if link juice is passed through my shopping cart? And again, does it impact my site as a whole?
It would be greatly appreciated if anyone could help me out with this stuff.
Thank you
-
To the OP,
We are also on Volusion and have found that adding the Meta Robots tag for noindex, follow in the meta override area for categories has worked for us. We haven't found a way to add it however to the SearchResults page at this time.
-
If you trust the target site, follow the link. If you don't trust the target site, nofollow all the links.
If you feel the footer links will actually be seen and used, keep them. If they are not likely to be seen or used, I would suggest removing them.
-
I just set up footers that are on every page to nofollow sites that I care about because otherwise they get thousands of pages linking all from the same domain - this can't be good for the actual site.
I then made a single link to the sites I care about which is followable. I am hoping this is a good strategy. Sorry to digress from the original interesting topic.
-
Thanks mate, I have been searching for a couple days on how to fix that warning.
-
As a rule, don't use "nofollow" on internal links.
-
Hey Ryan,
So I just confirmed with Volusion that certain pages such as these can have the "noindex, follow" tag and certain pages cannot. It's just the way their system is set up. So for the pages that can, I will for sure apply "noindex, follow", and for the pages that cannot, I will go ahead and apply a Disallow rule in robots.txt. Also, if you wouldn't mind confirming... it's the "noindex, follow" meta tag that I should apply? Not the "noindex, nofollow" tag?
Thanks for all of your assistance and guidance through all of this troubleshooting!
-
thanks a lot man. I'm going to check out that site map site. Also, I'm going to look into applying those "no-index,follow" tags on the pages instead. Thanks again
-
Anthony,
You can begin a crawl of your site anytime. Click on Research Tools from the menu bar and scroll down to On-Page Optimization Tools > Crawl Test. This will allow you to confirm your robots.txt settings are set correctly.
For sitemaps, http://xml-sitemaps.com/ seems to be quite popular. I would suggest checking them out first. They offer a free test for up to 500 pages, and it is $20 USD to buy their product if you like it.
For Google WMT, the "restricted by robots.txt" errors can be disregarded if you are confident the pages should be blocked. I would recommend allowing Google to crawl your site whenever possible and using the noindex meta tag to prevent the pages from being indexed. This approach would eliminate those errors.
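A quick way to sanity-check robots.txt rules before the next crawl is Python's standard urllib.robotparser module. The sketch below is only an illustration: the Disallow rule is an assumption about what the file might contain (the actual robots.txt is not shown in this thread), the helper name is hypothetical, and the URLs are the ones mentioned in this discussion.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; the real file is not shown in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /Email_Me_When_Back_In_Stock.asp
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_blocked(url, user_agent="*"):
    """Return True if the parsed rules disallow this URL for the given agent."""
    return not parser.can_fetch(user_agent, url)


urls = [
    "http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=AN1PRO7130",
    "http://www.beautystoponline.com/Andis-Profoil-Shaver-Replacement-Foil-Inner-Cutter-p/an1pro7130.htm",
]
for url in urls:
    print("blocked" if is_blocked(url) else "allowed", url)
```

Keep in mind that this module does simple prefix matching, so it may not interpret every rule exactly the way a given search engine does. Also, a blocked page cannot be fetched at all, which means a crawler never gets to see a noindex tag on it.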
-
Hello Ryan,
Thanks again for the reply... your time is appreciated. We are currently working on creating a site map to 'categorize' the links in both our product and category indexes. This should take care of the two highest numbers of on-page links across our site. The majority of these warnings are under 250 links so we should be good. Or, let's hope cause there really isn't anything we can do about it at this point. Also, by chance do you know of or can you refer a company / independent who designs site maps? We have the xml site map file generated from Google, we just need someone to make it look nice.
Oh yeah, regarding all those duplicate title and duplicate content errors... they should be taken care of with a Disallow rule in robots.txt. With that said, on our last SEOmoz crawl the errors still came up on those same "email a friend" and "email when back in-stock" pages. Now... I did submit the robots.txt file during the last scan, so this may be the reason. So before I start to wonder any further, I am going to wait until the next crawl is complete. Maybe you might know... in the future, will SEOmoz still pick up those duplicate page and title errors in the crawl with the Disallow rule in robots.txt?
Also, our webmaster tools is showing 180 "restricted by robots.txt" crawl errors... all from "email a friend" and "email when back in-stock" pages for which the Disallow rule was just placed. I understand that even with the Disallow rule in robots.txt, Google can still crawl whatever it chooses. Is this anything that we should be concerned about? Also please note that we have 1000's of these pages and webmaster tools is only showing 180 of them.
Thanks again for your help
-
Hello again.
Thanks for sharing the information on #1 and 2. I have heard of Volusion before but have no experience with them. Based on what you have shared it seems they may not be a great solution from a SEO perspective.
For #4, you are correct. The "META titles over 70 characters" is a warning that long titles will be truncated. The other main consequence is a title's weight is divided amongst the words in the title. The longer the title, the less weight that is applied to each term in the title. If you know and understand these factors, you can choose to ignore the warning.
For #3, you definitely do not want "thousands of links on a page". You need to figure out a way to significantly lower the number of links. Search engines will follow a percentage then stop. Yes, I would say this is bad for SEO.
Somehow you need to categorize the links. Many blog sites group links by month for the current year, and by year for past years. You could group by categories. Do something to get your number of links under control. You don't have to be under 100, but for now I would say you should be under 250 links.
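If you want a rough count of the links a crawler would find on a page, a small script can tally the anchor tags in the page source. The sketch below uses Python's built-in html.parser; the sample markup and its placeholder URLs are invented for illustration, the class and function names are hypothetical, and the 250 limit is just the rough figure suggested above, not an official threshold.

```python
from html.parser import HTMLParser

LINK_LIMIT = 250  # rough guideline from this thread, not an official limit


class LinkCounter(HTMLParser):
    """Collect the href of every <a> tag found in the page source."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip anchors that have no href
                self.hrefs.append(href)


def count_links(html):
    counter = LinkCounter()
    counter.feed(html)
    return len(counter.hrefs)


# Invented sample page with 300 links to placeholder product URLs.
SAMPLE_PAGE = "<html><body>" + "".join(
    f'<a href="/placeholder-product-{i}.htm">Product {i}</a>' for i in range(300)
) + "</body></html>"

total = count_links(SAMPLE_PAGE)
status = "over" if total > LINK_LIMIT else "within"
print(f"{total} links found, {status} the {LINK_LIMIT} link guideline")
```

Running this against the saved source of a category page should show whether the products from the "next" pages really are present in the markup, which would explain a count far higher than what is visible on screen.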
-
Hey Ryan,
So from the answer you provided... we've been on a long journey trying to resolve the aforementioned issues. We took a little break over the 4th but then got back to it a few days ago. For the most part, I believe we have at least concluded what needs to be done, or whether anything can actually be done regarding a solution. It is a bit tricky because we are working with Volusion (a 3rd party "shopping cart" service) which definitely limits flexibility. There are a lot of 'pros' with a service such as Volusion (especially with limited resources and knowledge) but the 'cons' are beginning to appear as our knowledge of SEO, web design, etc. begins to grow. Anyway, I wanted to just provide a thorough response to your answers and also throw in a few other questions that have arisen since.
1. In regard to the 302 Temporary redirects on all product specific shopping cart pages... we cannot apply a 301 redirect because it will then not allow any customers to actually access the shopping cart page after adding a product to their cart. We were told, "if you redirect from the shopping cart page, the customer will not be allowed to checkout. Each page is needed so taking them out will cause error to the site." I was then told to speak with their marketing services department on the issue as they will be able to help solve my SEO needs. I have emailed them and hopefully will hear back. Most likely there is no way to resolve this issue.
2. You said our duplicate page and duplicate content issues can be resolved with canonical links. As you noticed, there are canonical links on the product, category and homepage of our site. I wanted to mention that there is an "SEO friendly" way to apply these canonical links with Volusion. You just select a button that says, "enable canonical links" in the back end of the store.
After speaking with Volusion support on this matter, we basically concluded that they forgot to apply these links to the "Email a Friend" and "Email When Back In Stock" pages. I have sent the SEO department an email on this as well and expect to get one of the following three responses.
1. "We will look into this as a future feature request"
2. "There is nothing that can be done"
3. "We know about this but don't worry, it will not impact your search rankings"
Either way, if they tell me there is not a short term solution... I will look into applying a "no index, follow" tag.
3. I did not mention this issue in my initial question but we are also receiving a 'warning' of "too many links on page". In regard to keeping our on-page links under 100... other than the homepage and product/category index pages, we have done a pretty good job of limiting the number of links per page. With that said, we have run into somewhat of an issue with category pages that have 70+ products assigned. We have set the default to show 60 products per page but it appears the crawlers are picking up all products (even the ones on the 'next' pages) for that page, which is making the links per page very high. For example... the below link is showing 244 on-page links.
http://www.beautystoponline.com/Ardell-False-Eyelashes-s/71537.htm
There is no way there are that many links on this single page. But there are probably almost 200 products assigned to this category, which explains the high number of links. We were told this is occurring, "due to the fact that all Category pages are generated as "search results" pages (based on the category filter), and because of this, there is very little you would be able to do, as the code that generates search/category pages is system code that cannot be modified."
We were also told that we could submit it as a feature request on their forum and if it is an idea that's popular amongst other merchants, their developers may take it into consideration and change how the links are coded in the future. All that aside... by chance do you have an opinion/suggestion of a solution? (if any)
A quick side note on this topic... back to my mention that our category and product index pages are showing thousands of on-page links. It is self-explanatory as to why this is happening... but would you say it is a bad thing for SEO purposes? I know it's good for site structure and passing link juice, meaning that all pages on our site are only 1 click away from the root domain. Right?!?!?!?
4. Another issue I did not previously mention was 'META titles over 70 Characters'. I just wanted to confirm that if a title is more than 70 characters, the only negative is that the title is truncated and the full name won't appear in the search results. Past that, there shouldn't be any negative effect on Google search rankings from this, right? We have a few of these issues but for the most part... the time it would take to correct a few characters over 70 is not worth it if there is no impact on search rankings.
Anyway man... if you do reply to this 2nd post, your time is greatly appreciated and I thank you.
-
Dang... thanks Ryan for such an in-depth response! Gimme a few to take it all in and I may follow up with a few more questions. And Donnie, I appreciate the attention as well!
-
Hey Ryan, thanks. Thumbs up! I like your answer more
I should have looked further into this question. For some reason, I read the beginning and assumed we were talking about shopping cart style, check out pages.
And, you're right. Adding a nofollow to those links is a weak way of addressing Anthony's issue. Thanks for keeping me in check Ryan
-
Hello Anthony.
Presently you have legitimate duplicate page title and content issues. These are not false warnings.
The challenge for the crawl tool, and for Google, is to determine which of these many pages is the "real" page you want to be indexed, and which pages are copies. Search engines do a pretty good job of sorting through your pages but sometimes they will get it wrong. You may do a search for "flat irons" and instead of the main product page appearing, the "email me when it is back in stock" page will display in the search results. This clearly isn't best for your site, your users or Google.
Your duplicate page and duplicate title issues can be resolved with canonicalization. What you need to do is add the canonical tag (a link element with rel="canonical", placed in the page's head) to the original page and all copies. This will solve your current issue, along with other issues which you have not encountered yet.
It seems you are using it on the actual product pages, but not on the "duplicate" pages. Using your example, I see a good canonical tag on http://www.beautystoponline.com/Andis-Profoil-Shaver-Replacement-Foil-Inner-Cutter-p/an1pro7130.htm. I do not see any canonical tag however on http://www.beautystoponline.com/Email_Me_When_Back_In_Stock.asp?ProductCode=AN1PRO7130.
When I take a look at the "email me" page, it really does not have any content. I do not believe there is any value for your site or your users in having this page indexed. I would recommend using the "noindex, follow" tag for this page. I normally agree with Donnie but I believe he might have mistyped in this case.
In summary, if you decide to duplicate a page in order to provide the best user experience, make sure all pages have the canonical tag pointing to the primary page you wish to be included in Google's index. If a page has no content and you do not wish it to be indexed, add the "noindex, follow" tag. There simply has to be a way to do this with your current software. If it takes a custom code or plug-in, then you should do it.
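To check which pages already carry these tags, you could run a small audit over the page source. The following is a minimal Python sketch using the standard html.parser module; the two HTML snippets are simplified stand-ins (not the site's real markup) and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class HeadTagAudit(HTMLParser):
    """Record the canonical URL and the meta robots directives of a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = None

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonical = attr.get("href")
        elif tag == "meta" and attr.get("name", "").lower() == "robots":
            content = attr.get("content", "")
            self.robots = [part.strip().lower() for part in content.split(",")]


def audit(html):
    """Return (canonical_url, robots_directives); each is None when absent."""
    parser = HeadTagAudit()
    parser.feed(html)
    return parser.canonical, parser.robots


# Simplified stand-ins for the two kinds of page discussed above.
PRODUCT_PAGE = """<html><head>
<link rel="canonical" href="http://www.beautystoponline.com/Andis-Profoil-Shaver-Replacement-Foil-Inner-Cutter-p/an1pro7130.htm">
</head><body></body></html>"""

EMAIL_ME_PAGE = """<html><head>
<meta name="robots" content="noindex, follow">
</head><body></body></html>"""

print(audit(PRODUCT_PAGE))
print(audit(EMAIL_ME_PAGE))
```

A page that returns None for both values has neither tag, which is the situation described for the "email me" page at the moment.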
About the 302 redirects, all of the links you shared in your Q&A are to your SEOmoz Pro account so I cannot access them. In short, the only time I would recommend using a 302 is when you have a temporary redirect in place which will be removed very shortly (i.e. less than 30 days).
There is simply no benefit whatsoever to using a 302 to your shopping cart instead of a 301. Understand this is high level, generic advice which is the best I can do with this level of detail.
-
delete me...
wrong answer...
Do not pass GO, do not collect any PageRank