URLs appear in Google Webmaster Tools that I can't find on my own site?!?
-
Hi,
I have a Magento e-commerce site (clothing) and when I had a look through some of the sections in Google Webmaster Tools I found URLs that I can't find on my site.
For example, a product URL may be http://www.example.co.uk/product-url/, which is fine. That product may come in three sizes (Small, Medium, Large), and for some reason Googlebot is sometimes finding a URL like:
http://www.example.co.uk/product-url/1202/
This URL is live when clicked (status code: 200) and shows one of the sizes (Medium). However, I have run a site crawl in Screaming Frog and other crawl tests and can't find where Googlebot is picking up these URLs.
I think I need to:
1. Find out how Googlebot is finding these URLs.
2. Find out how to keep them out of the index (e.g. robots.txt, canonical tags, etc.).
Any help would be much appreciated. I'm happy to share specific URLs with members if that would make the issue clearer; just let me know.
Thanks,
Darrell
-
No problem, glad that resolved it.
There are a number of possibilities; Googlebot probably found the URLs through one of the following:
- XML sitemap
- Faceted navigation
- Magento pinged Google when the page was created
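The XML sitemap possibility is the easiest one to rule out with a script. Here is a minimal sketch in Python, assuming the unwanted URLs always end in an extra all-digit path segment like /product-url/1202/ from the question. The function name and the sample sitemap are made up for illustration; in practice you would read the contents of your real sitemap file instead of SAMPLE.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace used by the sitemaps.org protocol for <urlset>, <url> and <loc>.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def find_numeric_suffix_urls(sitemap_xml: str) -> list[str]:
    """Return sitemap URLs whose last path segment is all digits.

    Assumption: the unwanted simple-product URLs look like
    /product-url/1202/ (a product path plus a numeric segment).
    """
    root = ET.fromstring(sitemap_xml)
    flagged = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        if not loc.text:
            continue
        url = loc.text.strip()
        # Split the path into non-empty segments, e.g. ["product-url", "1202"].
        segments = [s for s in urlsplit(url).path.split("/") if s]
        if len(segments) >= 2 and segments[-1].isdigit():
            flagged.append(url)
    return flagged


# Hypothetical sitemap content, for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.co.uk/product-url/</loc></url>
  <url><loc>http://www.example.co.uk/product-url/1202/</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in find_numeric_suffix_urls(SAMPLE):
        print(url)
```

If this prints nothing for your real sitemap, the sitemap is probably not the source and the other possibilities are worth checking.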
-
Cheers John, sorted the issue! Appreciate your expertise.
-
Thanks John, your reply was really helpful. I've now done that for the 4,000 simple products, and those URLs are now returning 404 pages, which is great. I'm just going to see if I can find a mass-import 301 redirect extension for Magento so I can redirect these URLs to the homepage rather than leave them as 404 pages.
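For the mass import, the redirect list itself can be generated with a few lines of Python. This is only a sketch: the CSV column names (request_path, target_path, status_code) are hypothetical and would need to match whatever format the chosen extension expects, and the function names are made up for illustration.

```python
import csv
import io
from urllib.parse import urlsplit


def build_redirect_rows(old_urls, target_path="/"):
    """Map each old URL's path to the target path with a 301 status.

    target_path defaults to "/" (the homepage), as described in the post.
    """
    return [(urlsplit(url).path, target_path, 301) for url in old_urls]


def rows_to_csv(rows):
    """Serialise redirect rows to CSV text.

    The header names are an assumption, not a known Magento import format.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["request_path", "target_path", "status_code"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Example URL taken from the question; a real list would hold
    # all of the simple product URLs.
    old_urls = ["http://www.example.co.uk/product-url/1202/"]
    print(rows_to_csv(build_redirect_rows(old_urls)))
```

Passing a different target_path lets the same list point somewhere other than the homepage if that turns out to be preferable.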
How do you think Googlebot found those pages, given that there are no links to them? Maybe through a link generated when the simple products were added to the cart?
-
What is the visibility set to on the simple products for different sizes? If it's set to "Catalog" it will still be crawlable but not appear in your website's internal search results.
Setting the visibility to "Not Visible Individually" should resolve this issue.
-
I had a similar issue (not Magento); it turned out the URLs were in the sitemap that had been submitted to Webmaster Tools. Did you check there?
Also check the URL in Open Site Explorer; it might tell you whether any pages are linking to it.