URLs appear in Google Webmaster Tools that I can't find on my own site?!?
-
Hi,
I have a Magento e-commerce site (clothing) and when I had a look through some of the sections in Google Webmaster Tools I found URLs that I can't find on my site.
For example, a product URL may be http://www.example.co.uk/product-url/, which is fine. That product may come in three sizes (Small, Medium, Large), and for some reason Googlebot is sometimes finding a URL like:
http://www.example.co.uk/product-url/1202/
When clicked, it is a live URL (status code: 200) showing one of the sizes (Medium). However, I have run a site crawl in Screaming Frog, along with other crawl tests, and can't find where Googlebot is picking up these URLs.
I think I need to:
1. Find out how Googlebot is finding these URLs.
2. Find out how to keep them out of the index (e.g. robots.txt, canonical, etc.).
Any help would be much appreciated, and I'm happy to share the URL with members if they think they can have a look and help with this problem. I can also share specific URLs, which might make the issue clearer; just let me know.
Thanks,
Darrell
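A rough sketch of how the pattern described above could be picked out of a list of URLs exported from Webmaster Tools. It assumes the unexpected URLs are always a product URL plus one extra all-digit path segment, as in the /product-url/1202/ example; `split_variant` and the sample list are hypothetical names made up for illustration.

```python
import re
from urllib.parse import urlparse

# Assumption: a variant URL is a product path followed by one numeric segment,
# e.g. /product-url/1202/ (pattern taken from the example in the question).
VARIANT_PATH = re.compile(r"^(?P<parent>/.+/)(?P<id>\d+)/?$")


def split_variant(url):
    """Return (parent_url, numeric_id) if url looks like a size variant, else None."""
    parts = urlparse(url)
    match = VARIANT_PATH.match(parts.path)
    if match is None:
        return None
    parent = f"{parts.scheme}://{parts.netloc}{match.group('parent')}"
    return parent, match.group("id")


# Hypothetical sample of crawled URLs.
urls = [
    "http://www.example.co.uk/product-url/",
    "http://www.example.co.uk/product-url/1202/",
]

# Keep only the URLs that match the variant pattern.
variants = {u: split_variant(u) for u in urls if split_variant(u)}
```

Grouping the matches by parent URL may make it easier to see which products are exposing their sizes as separate pages.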
-
No problem, glad it resolved the problem.
There are a number of possibilities; Googlebot probably found them through one of the following:
- XML sitemap
- Faceted navigation
- Magento pinged Google when the page was created
-
Cheers John, sorted the issue! Appreciate your expertise.
-
Thanks John, your reply was really helpful. I've now done that for the 4000 simple products, and those URLs are now returning 404 pages, which is great. Next, I'm going to see if I can find a mass-import 301 redirect extension for Magento, so I can 301 redirect these URLs to the homepage rather than leave them as 404 pages.
How do you think Googlebot found those pages, as there are no links to them? Maybe through a link when the simple products were loaded to the cart?
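If no suitable extension turns up, one alternative is to generate the redirect rules from the list of old URLs. The sketch below assumes the site runs on Apache with mod_alias enabled, which the thread does not confirm; `redirect_rules` is a hypothetical helper, and the homepage target simply mirrors the plan described above.

```python
from urllib.parse import urlparse


def redirect_rules(old_urls, target="http://www.example.co.uk/"):
    """Build one Apache mod_alias 'Redirect 301' line per old URL.

    Assumption: the output is pasted into .htaccess or the vhost config.
    """
    rules = []
    for url in old_urls:
        path = urlparse(url).path
        # Skip the root path so the homepage is never redirected to itself.
        if path and path != "/":
            rules.append(f"Redirect 301 {path} {target}")
    return rules


# Hypothetical input: the simple-product URLs that now return 404.
rules = redirect_rules([
    "http://www.example.co.uk/product-url/1202/",
    "http://www.example.co.uk/",
])
print("\n".join(rules))
```

The `target` parameter is left configurable so a different destination can be used per batch of URLs.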
-
What is the visibility set to on the simple products for different sizes? If it's set to "Catalog" it will still be crawlable but not appear in your website's internal search results.
Setting the visibility to "Not Visible Individually" should resolve this issue.
-
I had a similar issue (not Magento); it turned out the URLs were in the sitemap that was submitted to WMT. Did you check there?
Check the URL in Open Site Explorer too; it might tell you if any URLs are linking to it.
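The sitemap check suggested above can be scripted. This is a minimal sketch that assumes a standard sitemaps.org `urlset` file; the `sample` sitemap and the `urls_in_sitemap` helper are made up for illustration, and a real sitemap would be downloaded from the site first.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_in_sitemap(sitemap_xml):
    """Return the set of <loc> URLs listed in a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    }


# Hypothetical sitemap content for illustration only.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.co.uk/product-url/</loc></url>
  <url><loc>http://www.example.co.uk/product-url/1202/</loc></url>
</urlset>"""

found = urls_in_sitemap(sample)
suspect = "http://www.example.co.uk/product-url/1202/"
print(suspect in found)
```

If the suspect URL shows up in the set, the sitemap is a likely source of the unexpected URLs.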