SeoMoz crawler giving false positives?
-
The SeoMoz crawler indicated a few times that my site has a duplicate home page error (http://mysite.com and www.mysite.com).
I eliminated the couple of remaining internal links that pointed to http://mysite.com on a couple of pages (all other internal links point to http://www.mysite.com).
I ran the crawl again and it said no errors this time. I naturally thought the duplicate page error problem was fixed.
However, this morning I got the regularly scheduled crawl report from SeoMoz, which again said I have those duplicate page errors. No changes were made to any of my site's pages between the crawls.
That makes me wonder whether the crawler gives false positives at times, or whether it was wrong a couple of days ago when it said I don't have any errors (no duplicate page error).
Now, I don't know what to think.
-
Hey,
Our crawler actually requests the page http://mysite.com first but then finds all your links to www.mysite.com
You will want to contact the person responsible for hosting or developing your site in order to make these changes.
Have a great day!
Kenny
-
Thanks for the explanation. Could you answer a couple questions?
1 - If all internal site links go to www.mysite.com (none link to http://mysite.com), how does a duplicate page even happen? I don't understand how this happened to begin with if I don't have any such internal link to http://mysite.com.
2 - Can you recommend a service that can fix the .htaccess file for me to create the 301 redirect? I'm not sure I want the hosting service doing it and making a mistake.
Thanks!
-
Hey,
That third campaign is actually a subdomain campaign set up to crawl the non-www host. No duplicate content errors were reported because there are no links to follow: all the links contain the www subdomain.
Root domain campaigns are distinguished by an asterisk before the domain name.
-
Thanks - I initially thought that was it.
But if you see my 3rd campaign of the crawl, it runs it for the root domain and it shows no duplicates.
-
Hey,
I just looked into the issue that you are experiencing with our crawler. The reason for the discrepancy is that you actually have two separate campaigns running for the same site. One is set to crawl the root domain and the other the subdomain.
The root domain campaign still presents these errors, and has week over week, but the subdomain campaign is set up for the www version of your site. That's why these errors are not present there: the crawler won't even attempt to crawl off of www.
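A rough illustration of that scoping difference, written as a Python sketch. This is a hypothetical model, not the actual crawler logic: `in_scope`, `duplicate_groups`, the `root_domain` flag, and the sample pages are all made-up names and data for illustration.

```python
from typing import Dict, List
from urllib.parse import urlsplit


def in_scope(url: str, campaign_host: str, root_domain: bool) -> bool:
    """Return True if a campaign would crawl this URL (assumed rule)."""
    host = (urlsplit(url).hostname or "").lower()
    if root_domain:
        # Root domain campaign: the bare domain and every subdomain are in scope.
        return host == campaign_host or host.endswith("." + campaign_host)
    # Subdomain campaign: only the exact host is in scope.
    return host == campaign_host


def duplicate_groups(pages: Dict[str, str], campaign_host: str,
                     root_domain: bool) -> List[List[str]]:
    """Group in-scope URLs that serve identical content."""
    by_content: Dict[str, List[str]] = {}
    for url, body in pages.items():
        if in_scope(url, campaign_host, root_domain):
            by_content.setdefault(body, []).append(url)
    return [sorted(urls) for urls in by_content.values() if len(urls) > 1]


# Both hosts serve the same home page because no 301 redirect is in place.
pages = {
    "http://mysite.com/": "<html>home</html>",
    "http://www.mysite.com/": "<html>home</html>",
    "http://www.mysite.com/about": "<html>about</html>",
}

root_report = duplicate_groups(pages, "mysite.com", root_domain=True)
www_report = duplicate_groups(pages, "www.mysite.com", root_domain=False)
print(root_report)  # [['http://mysite.com/', 'http://www.mysite.com/']]
print(www_report)   # []
```

Under these assumptions the same unchanged site produces a duplicate in one report and none in the other, purely because the two campaigns look at different sets of hosts.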
It is advisable to perform a 301 redirect as the other commenters mention.
Hope that helps!
Kenny
-
My point is the inconsistency in the SeoMoz crawler reports.
I got two SeoMoz crawl reports today - one was the regularly scheduled one which said I have duplicate home pages (as noted) and the crawl I started a couple hours ago said there are no errors.
So...how do you tell which one is right? Both cannot be since there were no changes to my website pages between the crawls.
thx
-
Hi,
If needed, this is the .htaccess code to help fix this issue (make sure to back up .htaccess before making any changes):
# mod_rewrite in .htaccess needs FollowSymLinks enabled
Options +FollowSymLinks
RewriteEngine on
# Match requests for the bare (non-www) host, case-insensitively
RewriteCond %{HTTP_HOST} ^yourdomainhere\.com [NC]
# Send a permanent (301) redirect to the same path on the www host
RewriteRule ^(.*)$ http://www.yourdomainhere.com/$1 [L,R=301]
The above code redirects all traffic from the non-www to the www version of your site, fixing duplicate content issues in that regard.
Source: http://www.webconfs.com/how-to-redirect-a-webpage.php
PS Spaces between lines not needed (funky formatting here)
Hope this helps
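If it helps to see what a rule like this does to individual requests, here is a small Python sketch that imitates it. It is only a simplified model, not Apache itself: `apply_rule` and `HOST_CONDITION` are hypothetical names, the host pattern is written with the dot escaped, and query strings are ignored.

```python
import re
from typing import Optional, Tuple

# Models a host condition like: RewriteCond %{HTTP_HOST} ^yourdomainhere\.com [NC]
HOST_CONDITION = re.compile(r"^yourdomainhere\.com", re.IGNORECASE)


def apply_rule(host: str, path: str) -> Optional[Tuple[int, str]]:
    """Return (status, location) if the rule would redirect, else None."""
    if not HOST_CONDITION.search(host):
        return None  # condition fails, so the request is served normally
    # In .htaccess context the leading slash is stripped before ^(.*)$ is matched.
    captured = path.lstrip("/")
    return 301, "http://www.yourdomainhere.com/" + captured


print(apply_rule("yourdomainhere.com", "/about.html"))
# (301, 'http://www.yourdomainhere.com/about.html')
print(apply_rule("www.yourdomainhere.com", "/about.html"))
# None
```

The www host never matches the condition because the pattern is anchored at the start of the host name, so the redirect cannot loop.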
-
You need to redirect one of your home pages to the other. To the crawl robot, www.mysite.com is a different page from mysite.com. In addition to having the issue with SeoMoz, you are losing SERP value for your home page because you are dividing up the SEO value between the two. Do a 301 redirect from one to the other and voila... problem solved.
Please make sure you give me the thumbs up for the help!! Thanks
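Once the redirect is in place, one way to confirm it without waiting for the next crawl is to request the bare host and look at the raw response. A minimal Python sketch, assuming plain HTTP on port 80 and the placeholder hosts from this thread; `redirects_to_host` and `check` are hypothetical helper names, and only a 301 is treated as a pass.

```python
import http.client
from typing import Optional
from urllib.parse import urlsplit


def redirects_to_host(status: int, location: Optional[str],
                      canonical_host: str) -> bool:
    """True if the response is a 301 whose Location points at canonical_host."""
    if status != 301 or not location:
        return False
    return (urlsplit(location).hostname or "").lower() == canonical_host.lower()


def check(bare_host: str, canonical_host: str) -> bool:
    """Request the bare host's home page; http.client does not follow redirects."""
    conn = http.client.HTTPConnection(bare_host, timeout=10)
    try:
        conn.request("HEAD", "/")
        response = conn.getresponse()
        return redirects_to_host(response.status,
                                 response.getheader("Location"),
                                 canonical_host)
    finally:
        conn.close()


# Offline examples of the decision logic:
print(redirects_to_host(301, "http://www.mysite.com/", "www.mysite.com"))  # True
print(redirects_to_host(200, None, "www.mysite.com"))                      # False

# Live check (needs network access; replace with your real hosts):
# print(check("mysite.com", "www.mysite.com"))
```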