Huge Google index on E-commerce site
-
Hi Guys,
Refering back to my original post I would first like to thank you guys for all the advice.
We implemented canonical url's all over the site and noindexed some url's with robots.txt and the site already went from 100.000+ url's indexed to 87.000 urls indexed in GWT.
My question: Is there way to speed this up?
I do know about the way to remove url's from index (with noindex of robots.txt condition) but this is a very intensive way to do so.I was hoping you guys maybe have a solution for this..
-
Hi,
A few weeks later now and index is now on 63.000 url's so that's a good thing.
Another weird thing is the following.
There's a (old) url still in the index. When i visit it redirects me to the new url, which is good. Cache date is 2 weeks ago but Google still shows the old url.
How is this possible? The 301 redirect is already in place since April 2013.
-
Hi allen Jarosz!
Thanks for your reply
I've actually done all the things you said in the last few weeks. Site is totally indexed but the main problem is that are over 85.000 url's indexed but the site only exists of 13.000 urls.
So the main question is wether i can speed things up in one way or another to get those 70.000 url's deindexed.Are any options besides noindex, robots.txt and removing some url's ? Because now it's just waiting.
It looks like we are going the right way when you check the image.
-
SSiebn,
I have had some success in speeding things up, but only to a point.
Google webmaster tools is a GREAT tool that fortunately for us Google allows us to use, and its free!
I'm sure you probably already use the service, but I have found a few ways to use the tools to improve their scan rate. First block the spiders from crawling any pages you don't want indexed, for instance your backend files, this allows more time to be spent on the pages you want indexed. Second ensure you pages link to each other in the site, this allows pages to be linked by flowing through to each other, (no dead ends). Third use "Fetch as Google" from WMT, you are allowed up to 10 fetches. These fetches can be configured to follow linking pages, once crawled, you may submit the results to the Google index, with up to 500 fetches. It may be beneficial to submit for "Fetch as Google" your main categories. Lastly check your "Crawl Rate" to ensure that you have chosen "<label for="recommendedType">Let Google optimize for my site (recommended)</label>"
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How do internal search results get indexed by Google?
Hi all, Most of the URLs that are created by using the internal search function of a website/web shop shouldn't be indexed since they create duplicate content or waste crawl budget. The standard way to go is to 'noindex, follow' these pages or sometimes to use robots.txt to disallow crawling of these pages. The first question I have is how these pages actually would get indexed in the first place if you wouldn't use one of the options above. Crawlers follow links to index a website's pages. If a random visitor comes to your site and uses the search function, this creates a URL. There are no links leading to this URL, it is not in a sitemap, it can't be found through navigating on the website,... so how can search engines index these URLs that were generated by using an internal search function? Second question: let's say somebody embeds a link on his website pointing to a URL from your website that was created by an internal search. Now let's assume you used robots.txt to make sure these URLs weren't indexed. This means Google won't even crawl those pages. Is it possible then that the link that was used on another website will show an empty page after a while, since Google doesn't even crawl this page? Thanks for your thoughts guys.
Intermediate & Advanced SEO | | Mat_C0 -
Can Google Crawl & Index my Schema in CSR JavaScript
We currently only have one option for implementing our Schema. It is populated in the JSON which is rendered by JavaScript on the CLIENT side. I've heard tons of mixed reviews about if this will work or not. So, does anyone know for sure if this will or will not work. Also, how can I build a test to see if it does or does not work?
Intermediate & Advanced SEO | | MJTrevens0 -
No images in Google index
No images are indexed on this site (client of ours): http://www.rubbermagazijn.nl/. We've tried everything (descriptive alt texts, image sitemaps, fetch&render, check robots) but a site:www.rubbermagazijn.nl shows 0 image results and the sitemap report in Search Console shows 0 images indexed. We're not sure how to proceed from here. Is there anyone with an idea what the problem could be?
Intermediate & Advanced SEO | | Adriaan.Multiply0 -
Google Index Constantly Decreases Week over Week (for over 1 year now)
Hi, I recently started working with two products (one is community driven content), the other is editorial content, but I've seen a strange pattern in both of them. The Google Index constantly decreases week over week, for at least 1 year. Yes, the decrease increased 🙂 when the new Mobile version of Google came out, but it was still declining before that. Has it ever happened to you? How did you find out what was wrong? How did you solve it? What I want to do is take the sitemap and look for the urls in the index, to first determine which are the missing links. The problem though is that the sitemap is huge (6 M pages). Have you find out a solution on how to deal with such big index changes? Cheers, Andrei
Intermediate & Advanced SEO | | andreib0 -
Does anyone know how to appear with snippet that says something like: Jobs 1-10 of 80 in the beginning of the description on Google? e.g. like on: https://www.google.co.za/#q=pickers+and+packers
Does anyone know how to appear with snippet that says something like: Jobs 1-10 of 80 in the beginning of the description on Google? e.g. like on: https://www.google.co.za/#q=pickers+and+packers Any markup that could be used to be listed like this. Why is some sites listed like this and some not. Why is the adzuna.co.za page listed with Results 1-10 while some other with Jobs 1-10 ?
Intermediate & Advanced SEO | | classifiedtech0 -
E Commerce site - removing discontinued items
We have been hit with a Panda penalty and the site has slowly been losing rankings since January, I've now realised that we have 4000+ page indexed in Google, but only 2000 live products. We have never deleted any of the pages with discontinued items, most of which were created when keyword stuffing and thin content reigned supreme - which explains the Panda penalty. But which is the best and quickest way to delete them from Google? We have already implemented a 'noindex' across all these pages, but as they are no longer in the 'crawlable' site, how will Google find them to know this? Would a 404 work any better - I'm not concerned about any link juice etc to/from these pages, I just want rid. I'm not sure if we can move all these pages into a dedicated directory which would allow us to use Google's Removal Tool - using it with the individual urls would be a mammoth task. Any advice would be most greatly appreciated.
Intermediate & Advanced SEO | | ElaineAllkids0 -
Google recognising regional canadian site as primary instead .com
Hi, we updated corporate site salvagedata.com to new design,but for migration test we do it on our canadian salvagedata.ca site. In few day we migrated salvageta.com. In this time google indexed salvagedata.ca contents, and now looks like google recognising it as primary site, and show it higher in search results. for example: hard drive data recovery Can 301 redirect .ca-> .com to resolve problem?
Intermediate & Advanced SEO | | markgray0 -
My Job Site is having Indexing Issues
I have 2 job sites that I am managing and working on. One of the sites has a great deal of job vacancies and expired job pages that have been indexed. This one below: http:// job search.cctc .com/cctc Jobsearch/expandedjobsearch.do This job site does not have any job pages index: http://www.cross countryallied. com/ctAlliedWebSite/ travel-nurse-jobs/job-search.jsp Why and what can I do to get the dynamic pages index and ranking? Any help tips would be much appreciated. Thanks
Intermediate & Advanced SEO | | Melia0