Is there any benefit to optimising Google's crawl?
-
Hello,
I have an ecommerce site, and all of its pages are crawled and indexed by Google.
However, some pages are reachable at multiple URLs, such as www.sitename.com/product-name.html and www.sitename.com/category/product-name.html.
All of these pages carry a canonical tag pointing to the simplest URL, so Google indexes only one version. The duplicate URLs are not indexed, but Google still crawls them.
My question is: is there any benefit to stopping Google from crawling these duplicate URLs?
My point is that Google crawls around 1,500 pages a day on my site, but there are only 800 real pages, and they are all indexed. There is no particular problem, so is it worth changing anything?
Thanks
-
Hi!
Have you noindexed the pages too? That may help make sure they aren't being crawled, if that's what concerns you. At the very least it may give Google another signal not to crawl those pages.
Obviously it's not a catch-all, as there's only so much you can do to tell Google not to crawl a page. If the alternative page is linked to internally (which it sounds like it is), Google will sometimes crawl it anyway, even though it has a canonical on it, because the internal links signal that the page is important to your site.
It may be worth testing a few pages to see if it has an impact.
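One way to measure that impact is to count Googlebot requests in your server logs before and after the change. Here is a minimal Python sketch, assuming combined-format access logs and assuming the duplicate URLs follow the /category/product-name.html pattern from the question. The sample log lines, the regular expressions, and the function name are all made up for illustration, and matching on the user-agent string alone does not prove a request really came from Google.

```python
import re
from collections import Counter

# Assumed shape of the duplicate URLs from the question: /category/product-name.html
DUPLICATE_PATH = re.compile(r"^/[^/]+/[^/]+\.html$")

# Pulls the request path out of a combined-log-format line.
REQUEST = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[^"]*"')


def count_googlebot_hits(log_lines):
    """Count Googlebot requests, split into duplicate URLs and everything else."""
    counts = Counter()
    for line in log_lines:
        # Naive user-agent check; it does not verify the requester's identity.
        if "Googlebot" not in line:
            continue
        match = REQUEST.search(line)
        if not match:
            continue
        path = match.group(1).split("?", 1)[0]  # ignore any query string
        kind = "duplicate" if DUPLICATE_PATH.match(path) else "other"
        counts[kind] += 1
    return counts


# Made-up sample lines, only to show the input format.
sample = [
    '66.249.66.1 - - [10/Oct/2016:13:55:36 +0000] "GET /product-name.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [10/Oct/2016:13:55:40 +0000] "GET /category/product-name.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [10/Oct/2016:13:56:02 +0000] "GET /category/product-name.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]

counts = count_googlebot_hits(sample)
print(counts["duplicate"], counts["other"])
```

Running this over a day's log before and after a change gives a rough idea of how much of the daily crawl goes to duplicate URLs.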
-
Hi there!
In my experience, the best results I ever achieved for a client came when we consolidated all duplicate URLs into a single URL. Canonicals are amazing, no doubt. But I've seen canonical tags ignored in cases where the canonical setup isn't 100% 'correct.'
If there is a way that you can have your website navigation & internal/XML sitemap reinforce your preferred URL, that would certainly reduce the number of URLs Google would crawl. Then, if you permanently (301) redirect all the now non-navigable URLs to the single preferred URL, you should see a significant boost in traffic (from consolidating all of the authority into a single page, now reinforced throughout your entire website).
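As a rough illustration of the redirect mapping, here is a minimal Python sketch that works out the 301 target for a duplicate URL. It assumes the duplicates always look like /category/product-name.html and the preferred URL is /product-name.html, as in the question; the function name `preferred_url` is hypothetical, and the real rule would live in your server or platform configuration.

```python
from urllib.parse import urlsplit, urlunsplit


def preferred_url(url):
    """Return the 301 target for a duplicate URL, or None if no redirect is needed.

    Assumes duplicates have exactly two path segments ending in .html,
    e.g. /category/product-name.html -> /product-name.html.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) == 2 and segments[-1].endswith(".html"):
        new_path = "/" + segments[-1]
        return urlunsplit((parts.scheme, parts.netloc, new_path, parts.query, parts.fragment))
    return None


print(preferred_url("https://www.sitename.com/category/product-name.html"))
print(preferred_url("https://www.sitename.com/product-name.html"))
```

The first call returns the simple product URL and the second returns None, since that URL is already the preferred one. Note that the URL needs a scheme (https://) for the parsing to work as intended.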
If that's not possible, and you have to keep multiple URLs on your site because of budget or platform constraints, then yes, let Google crawl them. Otherwise the algorithm won't be able to see the canonical tag on them.
So in short: If you have a means to reduce the number of duplicates and redirect them - awesome. If you don't have a means to reduce duplicates, opening them up to Google is good, too.
For more information on making sure your canonical structure is set up properly, check out this Moz blog post: https://moz.com/blog/rel-confused-answers-to-your-rel-canonical-questions
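If you want a quick spot check of your own pages, here is a minimal Python sketch that pulls the rel="canonical" links out of a page's HTML using only the standard library. The class and function names are hypothetical, and it only inspects HTML you pass in (fetching the page is left out). A page with no canonical, or with more than one, is worth a closer look.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel can hold several space-separated values; compare case-insensitively.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html):
    """Return a list of canonical URLs declared in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


page = (
    "<html><head>"
    '<link rel="canonical" href="https://www.sitename.com/product-name.html">'
    "</head><body>Product</body></html>"
)
print(find_canonicals(page))
```

For the duplicate URLs in the question, you would expect exactly one result per page, and it should be the simplest product URL.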