My crawl diagnostics are showing 2 duplicate content and duplicate title errors.
-
First of all, hi - my name is Jason and I've just joined. How are you all doing?
My 1st question then:
When I view where these errors are occurring, it says www.mydomain.co.uk and www.mydomain.co.uk/index.html.
Isn't this the same page? I have looked in my root folder and only index.html exists.
-
Thanks Daniel!!!!
Looks like I'll be spending some time in the ol' Q&A section.
-
Hi Jason
That's perfect (and working) for redirecting all non-www pages. You still need to decide on your original question regarding the index page!
Add the following to your htaccess file after the code you've already added:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ http://www.keystonemortgages.co.uk/ [R=301,L]
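For reference, here's a minimal sketch of how the complete htaccess could look once both sets of rules are in place (this assumes Apache with mod_rewrite enabled; the [NC] flag, which makes the host match case-insensitive, is a small addition of mine rather than something from the thread):
Options +FollowSymLinks
RewriteEngine On
# 1) Redirect the bare (non-www) host to the www version, preserving the path (301 = permanent)
RewriteCond %{HTTP_HOST} ^keystonemortgages\.co\.uk [NC]
RewriteRule (.*) http://www.keystonemortgages.co.uk/$1 [R=301,L]
# 2) Redirect direct requests for /index.html to the root URL
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ http://www.keystonemortgages.co.uk/ [R=301,L]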
-
This is what I have added; does it look OK?
Options +FollowSymLinks
RewriteEngine On
RewriteCond %{HTTP_HOST} ^keystonemortgages\.co\.uk
RewriteRule (.*) http://www.keystonemortgages.co.uk/$1 [R=301,L]
-
Thanks Alex, I'll sit back and wait to see! Fingers crossed it has a positive effect.
-
It's not essential, but it's still useful to have the canonical tag. It doesn't redirect, but it tells the crawlers the preferred address if you have more than one page/address containing the same content. It's particularly useful for duplicate content like print versions, or results pages based on query strings.
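For illustration, a canonical tag is just a single line inside the page's head section; something like this (the URL shown is my assumption based on your domain):
<link rel="canonical" href="http://www.keystonemortgages.co.uk/" />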
What you've actioned could improve the rankings of your homepage: before, the search engines will have seen the two addresses as separate pages, therefore competing against each other. When the redirect comes into effect, the page authority and other metrics of both will be combined into one.
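Incidentally, you don't have to wait for the next crawl to confirm the redirect itself is live. If you have curl available, a quick header check from a terminal will show it; the expected output below is a sketch, assuming the index.html rule is active:
curl -I http://www.keystonemortgages.co.uk/index.html
# Expected: HTTP/1.1 301 Moved Permanently
# Expected: Location: http://www.keystonemortgages.co.uk/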
-
Hi, quick update:
Does this mean I won't need a canonical tag in my header? Do they do similar things?
I've created the htaccess file and dropped it into my root folder at 1and1, so I guess now I sit back and wait for a re-crawl to see if it works?
One more thing: is this likely to affect rankings?
-
Hey, thanks Daniel, cheers for the welcome.
That sounds simple, but I ain't got a clue how to do this, so I'll start searching and let you know my results.
Thanks for the lead,
Jason
-
Hi Jason and welcome to seoMoz!
If you do not have a rewrite in place (in your htaccess file), then both www.mydomain.co.uk AND www.mydomain.co.uk/index.html will resolve in your browser. I suggest doing a 301 rewrite, and while you're at it, make sure you also rewrite all non-www variants - e.g. 301 redirect http://mydomain.co.uk to http://www.mydomain.co.uk.
You can then go set your preferred version in your Webmaster Tools.
This would be the code for Apache (check with your host/developer if you're unsure):
RewriteEngine On
# Match requests whose host is the bare (non-www) domain
RewriteCond %{HTTP_HOST} ^example\.co\.uk
# Permanently (301) redirect them to the www host, preserving the path
RewriteRule (.*) http://www.example.co.uk/$1 [R=301,L]