GWT and HTML improvements
-
Hi all
I am dealing with duplicate content issues in Webmaster Tools, but I still don't understand what's happening, as the number of issues keeps changing. Last week the duplicate meta descriptions were at 232, then they went down to 170, and now they are back up to 218.
Same story for duplicate meta titles: 110, then 70, now 114. These ups and downs have been going on for a while, and for the past two weeks I stopped changing things to see what would happen.
Also, the issues reported in GWT are different from the ones shown in Crawl Diagnostics on Moz.
Furthermore, most URLs were changed (more than a year ago) and 301 redirects were implemented, but Google doesn't seem to recognize them.
Could anyone help me with this?
Also can you suggest a tool to check redirects?
Cheers
Oscar
-
Thank you guys for your answers. I will look into it and try to solve the problems.
I think many pages are self-canonicalized, but I see that many URLs haven't been redirected to the new ones, so I will start by fixing the redirects.
The top pages report, though, shows just the new URLs.
Anyway, I will keep you updated on this, as I am not too sure how to tackle it.
Thanks a lot.
Cheers
-
Had a few minutes and wanted to help out...
Google doesn't always crawl and index the same number of pages week over week, so that could explain the differences you are seeing between reports. Also, if you are actively working on the site and making changes, you should see these numbers improve over time (depending on site size, of course; enterprise-scale sites can take longer to go through and fix up, so on a huge site the numbers might look like they are holding steady for a while).
To help with your 301 issue, I would definitely look at downloading Screaming Frog's SEO Spider. It's a great tool for identifying potential problems on a site, and it's very easy to download and use. It might take some getting used to, but the learning curve isn't steep. Once you've used it a few times to diagnose problems, or to watch things you are working on improve across successive crawls, it will also show you other things that might not be working so you can plan fixes for those too.
As well, make sure to review your .htaccess file and how you have written up your 301s. If you are using Apache, the 301-related article linked here is a great resource to help you along.
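To give you a sense of what to look for, here is a minimal sketch of 301 rules in .htaccess. This assumes Apache with mod_alias and mod_rewrite available, and the paths are placeholders rather than your actual URLs:

```apache
# Redirect a single moved page (mod_alias)
Redirect 301 /old-page.html http://www.example.com/new-page/

# Map an entire old section to a new one (mod_rewrite)
RewriteEngine On
RewriteRule ^old-section/(.*)$ /new-section/$1 [R=301,L]
```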
Make sure to manually check all of your 301 redirects using the data/URLs from the Screaming Frog crawl. Type the old URLs in and visually confirm that you get redirected to the new page/URL. If you do, the redirect is working correctly, and I'm sure it will only be a matter of time before Google updates its index and displays the right URL. You can also run the old URLs through a redirect-checking tool and see how they respond.
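If you have a long list of old URLs, a small script can save some clicking. Here is a rough sketch in Python using the third-party requests library; the URL pairs are placeholders you would swap for your own:

```python
import requests

# Old URL -> expected destination (placeholders; replace with your own pairs).
redirects = {
    "http://www.example.com/old-page.html": "http://www.example.com/new-page/",
}

for old_url, expected in redirects.items():
    # allow_redirects=False lets us inspect the first hop directly.
    resp = requests.get(old_url, allow_redirects=False, timeout=10)
    location = resp.headers.get("Location", "")
    if resp.status_code == 301 and location == expected:
        print(f"OK     {old_url} -> {location}")
    else:
        print(f"CHECK  {old_url} returned {resp.status_code}, Location: {location or 'none'}")
```

Anything that comes back as a 302, a redirect chain, or the wrong destination is worth fixing, since a clean single-hop 301 is what you want Google to find.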
Hope some of this helps get you started on testing and fixing! Keep me posted if you are having trouble or need someone to run a few tests from another location.
Cheers!
-
We had the same issue on one of our sites. Here is how I understand it after looking into it and talking to some other SEOs.
The duplicate title and meta description reports seem to lag any 301 redirects or canonicals that you might implement. We went through a massive site update and had 301s in place for over a year, with "duplicates" still showing up in GWT for both old and new URLs. Just to be clear, we had the old URLs 301ing to the new ones for over a year.
What we also found was that if you look in GWT under top landing pages, old URLs would be listed there too.
The solution was to put self-canonicalizing links on all pages that were not canonicalized to another URL. This cleaned things up over the next month or so. I had checked my 301 redirects, removed all links to old content on my site, etc.
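For anyone unfamiliar with the tag, a self-canonical is just a canonical link element in the page's head that points at the page's own preferred URL. A quick sketch (the URL is a placeholder):

```html
<!-- In the <head> of the page, pointing at that page's own preferred URL -->
<link rel="canonical" href="https://www.example.com/some-page/" />
```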
What I still find are a few more "duplicates" in GWT. This happens with two types of URLs:
-
We have to change a URL for some reason and put in the 301. It takes a while for Google to pick that up and apply it to the duplicate content report, even when we see the change reflected in the index pretty quickly. As I said, the duplicate report seems to lag the other reports.
-
We still have some very old URLs where it has taken Google a while to circle back, check them, see the 301 and the self-canonical, and sort things out.
I am honestly flabbergasted and surprised at how slow Google is about this. I have talked with a bunch of people just to make sure we are not doing anything wrong with our 301s, etc. So, while I understand what is happening, and I can see it improving, I still don't have a good "why" for it when, technically, I have everything straight (as far as I know). The self-canonical was the solution, but a 301 alone should really be enough. I know there are still old links to old content out there, which is the one thing I cannot update, but I'm not sure that explains it.
It is almost like Google has an old sitemap it keeps crawling, but again, I have that cleared out in Google as well.
If you double-check all your stuff and find anything new, I would love to know!
Cheers!