URLs appear in Google Webmaster Tools that I can't find on my own site?!?
-
Hi,
I have a Magento e-commerce site (clothing) and when I had a look through some of the sections in Google Webmaster Tools I found URLs that I can't find on my site.
For example, a product URL may be http://www.example.co.uk/product-url/, which is fine. That product may come in three sizes (Small, Medium, Large), and for some reason Googlebot is sometimes finding a URL like:
http://www.example.co.uk/product-url/1202/
When clicked, this is a live URL (status code: 200) showing one of the sizes (Medium). However, I have run a site crawl in Screaming Frog, plus other crawl tests, and can't find where Googlebot is picking up these URLs.
I think I need to:
1. Find out how Googlebot is finding these URLs.
2. Find out how to keep them out of the index (e.g. robots.txt, canonical, etc.).
Any help would be much appreciated, and I'm happy to share the URL with members if they think they can have a look and help with this problem. I can share specific URLs, which might make the issue clearer; just let me know.
Thanks,
Darrell
-
No problem, glad it resolved the problem.
There are a number of possibilities; Googlebot probably found them through one of the following:
- XML sitemap
- Faceted navigation
- Magento pinged Google when the page was created
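
To rule the first possibility in or out, something like the following could be run against the sitemap that was submitted. This is only a rough sketch: it assumes the unwanted URLs look like the example in the question (a product path followed by a purely numeric segment), and the function name and sample sitemap are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Standard namespace used by sitemap.xml files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Assumed pattern: /something/ followed by a numeric segment, e.g. /product-url/1202/
NUMERIC_SUFFIX = re.compile(r"/[^/]+/\d+/?$")


def find_numeric_suffix_urls(sitemap_xml: str) -> list:
    """Return sitemap URLs whose path ends in a numeric segment."""
    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]
    # Match against the path only, so the host name never counts as a segment.
    return [u for u in urls if NUMERIC_SUFFIX.search(urlparse(u).path)]


# Hypothetical sitemap content, for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.co.uk/product-url/</loc></url>
  <url><loc>http://www.example.co.uk/product-url/1202/</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in find_numeric_suffix_urls(SAMPLE):
        print(url)
```

If this returns nothing for the real sitemap, the sitemap is probably not the source and the other two possibilities are worth a closer look.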
-
Cheers John, sorted the issue! Appreciate your expertise.
-
Thanks John, your reply was really helpful. I've now done that for the 4,000 simple products, and those URLs are now returning 404 pages, which is great. I'm just going to see if I can find a mass-import 301 redirect extension for Magento, so I can redirect these URLs to the homepage rather than leave them as 404 pages.
How do you think Googlebot found those pages, given there are no links to them? Maybe through a link when the simple products were loaded into the cart?
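
For the bulk redirect, the list of old URLs could be turned into an import file along these lines. This is just a sketch: the column names (`request_path`, `target_path`, `redirect_type`) are hypothetical, since each import extension defines its own format, so they would need adjusting to whatever the chosen extension expects.

```python
import csv
import io
from urllib.parse import urlparse


def build_redirect_csv(old_urls, target="/"):
    """Build CSV text mapping each old URL path to the target with a 301."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Hypothetical header; rename to match the import extension's format.
    writer.writerow(["request_path", "target_path", "redirect_type"])
    seen = set()
    for url in old_urls:
        path = urlparse(url).path
        # Skip empty paths, the target itself, and duplicates.
        if not path or path == target or path in seen:
            continue
        seen.add(path)
        writer.writerow([path, target, "301"])
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_redirect_csv(["http://www.example.co.uk/product-url/1202/"]))
```

The old URLs themselves can be exported from the crawl errors report in Webmaster Tools and passed in as a list.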
-
What is the visibility set to on the simple products for different sizes? If it's set to "Catalog" it will still be crawlable but not appear in your website's internal search results.
Setting the visibility to "Not Visible Individually" should resolve this issue.
-
I had a similar issue (not Magento); it turned out the URLs were in the sitemap that was submitted to Webmaster Tools. Did you check there?
Check the URL in Open Site Explorer too; it might tell you if any URLs are linking to it.