Huge Google index on E-commerce site
-
Hi Guys,
Referring back to my original post, I would first like to thank you guys for all the advice.
We implemented canonical URLs all over the site and noindexed some URLs with robots.txt, and the site has already gone from 100,000+ URLs indexed to 87,000 URLs indexed in GWT.
My question: Is there a way to speed this up?
I do know about the way to remove URLs from the index (with a noindex or robots.txt condition), but this is a very labor-intensive way to do so. I was hoping you guys might have a solution for this.
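To illustrate why canonical URLs shrink the index on a shop like this, here is a minimal Python sketch of how many URL variants can collapse onto one canonical address. The parameter names in NON_CANONICAL_PARAMS and the shop.example URLs are assumptions made up for the example, not details of the actual site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical query parameters that change presentation or tracking only,
# not the content of the page. Adjust to the parameters your shop really uses.
NON_CANONICAL_PARAMS = {"sort", "order", "sessionid", "utm_source", "utm_medium"}


def canonical_url(url):
    """Return the URL with presentation-only parameters and fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in NON_CANONICAL_PARAMS
    ]
    kept.sort()  # stable parameter order so equivalent URLs compare equal
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def group_by_canonical(urls):
    """Map each canonical URL to the list of variants that collapse onto it."""
    groups = {}
    for url in urls:
        groups.setdefault(canonical_url(url), []).append(url)
    return groups


if __name__ == "__main__":
    variants = [
        "http://shop.example/shoes?sort=price",
        "http://shop.example/shoes?sessionid=abc123",
        "http://shop.example/shoes",
    ]
    for canonical, members in group_by_canonical(variants).items():
        print(canonical, "<-", len(members), "variants")
```

Running something like this over a crawl export would give a rough idea of how many indexed URLs are only variants of the real pages.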
-
Hi,
It's a few weeks later now and the index is down to 63,000 URLs, so that's a good thing.
Another weird thing is the following.
There's an (old) URL still in the index. When I visit it, it redirects me to the new URL, which is good. The cache date is 2 weeks ago, but Google still shows the old URL.
How is this possible? The 301 redirect has been in place since April 2013.
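One thing worth ruling out is that the old URL answers with something other than a 301 to crawlers (for example a 302, or a redirect done in JavaScript). A minimal sketch, assuming plain HTTP(S) access to the site: the redirect_info helper below is a hypothetical name, and it deliberately does not follow the redirect so the raw status code and Location header can be inspected.

```python
import http.client
from urllib.parse import urlsplit


def redirect_info(url, timeout=10.0):
    """Send a HEAD request without following redirects.

    Returns a (status_code, location_header) tuple; the location is None
    when the response carries no Location header.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Calling redirect_info on the old URL should give 301 and the new URL if the redirect is permanent. Some servers answer HEAD differently from GET, so treat the result as a first check rather than proof.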
-
Hi Allen Jarosz!
Thanks for your reply.
I've actually done all the things you said in the last few weeks. The site is totally indexed, but the main problem is that there are over 85,000 URLs indexed while the site only consists of 13,000 URLs.
So the main question is whether I can speed things up in one way or another to get those 70,000 URLs deindexed. Are there any options besides noindex, robots.txt and removing some URLs? Because now it's just waiting.
It looks like we are going the right way when you check the image.
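For a sense of scale, the figures quoted in this thread can be put through some quick arithmetic. This is only a sketch: the starting count is reported as "more than" one hundred thousand, so 100_000 below is an assumed lower bound, and the function names are made up for the example.

```python
def index_bloat(indexed, actual):
    """Return (excess URL count, excess as a share of everything indexed)."""
    excess = indexed - actual
    return excess, excess / indexed


def share_removed(start, now, actual):
    """Share of the original excess that has already dropped out of the index."""
    return (start - now) / (start - actual)


if __name__ == "__main__":
    excess, share = index_bloat(indexed=85_000, actual=13_000)
    print(excess, round(share, 2))  # 72000 excess URLs, about 85% of the index

    # From the assumed 100,000 starting point down to 63,000, with 13,000 real pages.
    print(round(share_removed(start=100_000, now=63_000, actual=13_000), 2))
```

With these numbers, a bit under half of the surplus URLs have already been dropped by the time the index reaches 63,000.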
-
SSiebn,
I have had some success in speeding things up, but only to a point.
Google Webmaster Tools is a GREAT tool that, fortunately for us, Google allows us to use, and it's free!
I'm sure you probably already use the service, but I have found a few ways to use the tools to improve the scan rate. First, block the spiders from crawling any pages you don't want indexed, for instance your backend files; this allows more time to be spent on the pages you do want indexed. Second, ensure your pages link to each other within the site, so that crawlers can flow from page to page (no dead ends). Third, use "Fetch as Google" from WMT; you are allowed up to 10 fetches. These fetches can be configured to follow linked pages; once crawled, you may submit the results to the Google index, with up to 500 fetches. It may be beneficial to submit your main categories to "Fetch as Google". Lastly, check your "Crawl Rate" to ensure that you have chosen "Let Google optimize for my site (recommended)".
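The first two tips can be checked offline before waiting on Google. The sketch below uses Python's standard urllib.robotparser to see which URLs a robots.txt would block, plus a tiny dead-end check on an internal link map. The robots.txt rules, the shop.example URLs, and the link map are all assumptions invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers out of backend areas.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /checkout/
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


def dead_ends(links):
    """Return pages with no outgoing internal links, sorted by name.

    `links` maps each page to the list of internal pages it links to.
    """
    return sorted(page for page, targets in links.items() if not targets)


if __name__ == "__main__":
    urls = [
        "https://shop.example/admin/login",
        "https://shop.example/shoes/red-sneaker",
    ]
    print(blocked_urls(ROBOTS_TXT, urls))

    site_links = {
        "/": ["/shoes", "/bags"],
        "/shoes": ["/", "/shoes/red-sneaker"],
        "/shoes/red-sneaker": [],  # no links back into the site
        "/bags": ["/"],
    }
    print(dead_ends(site_links))
```

Note that robots.txt only stops crawling. A page that is blocked there cannot have its noindex tag read, so it is worth deciding per URL group which of the two mechanisms you actually want.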
Related Questions
-
Google indexing only 1 page out of 2 similar pages made for different cities
We have created two category pages that show products which can be delivered in separate cities. Both pages are related to cake delivery in that city. But out of these two category pages, only one got indexed in Google and the other has not. It's been around 1 month, but still only the Bangalore category page is indexed. We have submitted a sitemap and Google is not reporting any crawl errors. We have also submitted the pages for indexing with the "Fetch as Google" option in Webmaster Tools.
1. www.winni.in/c/4/cakes (Indexed - Bangalore page - http://www.winni.in/sitemap/sitemap_blr_cakes.xml)
2. http://www.winni.in/hyderabad/cakes/c/4 (Not indexed - Hyderabad page - http://www.winni.in/sitemap/sitemap_hyd_cakes.xml)
I tried searching for "hyderabad site:www.winni.in" in Google, but http://www.winni.in/hyderabad/cakes/c/4 does not come up there either; only www.winni.in/c/4/cakes does. Can anyone please let me know what the possible issue could be?
Intermediate & Advanced SEO | abhihan
-
Does including your site in Google News (and Google Alerts) help with SEO?
Based on the following article, http://homebusiness.about.com/od/yourbusinesswebsite/a/google-alerts.htm, in order to check whether you are included you need to run site:domain.com and click the news search tab. If you are not there then... I ran the test on MOZ and got no results, which surprised me. The next step, according to https://support.google.com/news/publisher/answer/40787?hl=en#ts=3179198, is to submit your site for inclusion. Should I? Will it help?
P.S. This is a followup question to the following: http://moz.com/community/q/what-makes-a-site-appear-in-google-alerts-and-does-it-mean-anything
Intermediate & Advanced SEO | BeytzNet
-
How should I manage duplicate content caused by a guided navigation for my e-commerce site?
I am working with a company which uses Endeca to power the guided navigation for our e-commerce site. I am concerned that the duplicate content generated by having the same products served under numerous refinement levels is damaging the site's ability to rank well, and I was hoping the Moz community could help me understand how much of an impact this type of duplicate content could be having. I would also love to know if there are any best practices for managing this type of navigation. Should I nofollow all of the URLs which have more than 1 refinement used on a category, or should I allow the search engines to go deeper than that to preserve the long tail? Any help would be appreciated. Thank you.
Intermediate & Advanced SEO | FireMountainGems
-
Google isn't seeing the content but it is still indexing the webpage
When I fetch my website page using GWT, this is what I receive:
HTTP/1.1 301 Moved Permanently
X-Pantheon-Styx-Hostname: styx1560bba9.chios.panth.io
server: nginx
content-type: text/html
location: https://www.inscopix.com/
x-pantheon-endpoint: 4ac0249e-9a7a-4fd6-81fc-a7170812c4d6
Cache-Control: public, max-age=86400
Content-Length: 0
Accept-Ranges: bytes
Date: Fri, 14 Mar 2014 16:29:38 GMT
X-Varnish: 2640682369 2640432361
Age: 326
Via: 1.1 varnish
Connection: keep-alive
What I used to get is this:
HTTP/1.1 200 OK
Date: Thu, 11 Apr 2013 16:00:24 GMT
Server: Apache/2.2.23 (Amazon)
X-Powered-By: PHP/5.3.18
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Last-Modified: Thu, 11 Apr 2013 16:00:24 +0000
Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0
ETag: "1365696024"
Content-Language: en
Link: ; rel="canonical",; rel="shortlink"
X-Generator: Drupal 7 (http://drupal.org)
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:dc="http://purl.org/dc/terms/"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:og="http://ogp.me/ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:sioc="http://rdfs.org/sioc/ns#"
xmlns:sioct="http://rdfs.org/sioc/types#"
xmlns:skos="http://www.w3.org/2004/02/skos/core#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"> <title>Inscopix | In vivo rodent brain imaging</title>
Intermediate & Advanced SEO | jacobfy
-
Google Custom Searches with site CSS
Is anyone good with GCS? I want to add Google Custom Search to my site, but styled with my site's CSS.
I need the results from GCS but want to display them with my website's CSS. The website runs on osCommerce and PHP.
Intermediate & Advanced SEO | csfarnsworth
-
Website is not indexed in Google, please help with suggestions
Our client's website was removed from the Google index. Could anybody recommend how to speed up the process of reindexing?
Webmaster Tools: done
SM (Twitter, FB): done
sitemap.xml: done
backlinks: in process
PPC: done
Robots.txt: is fine
Guys, any recommendations are welcome; the client is very unhappy. Thank you
Intermediate & Advanced SEO | ThinkBDW
-
What sources can we use to compile as comprehensive a list as possible of pages indexed in Google?
As part of a Panda recovery initiative, we are trying to get as comprehensive a list as possible of the URLs currently indexed by Google. Using the site:domain.com operator, Google displays that approximately 21k pages are indexed. Scraping the results, however, ends after 240 links are listed. Are there any other sources we could use to make the list more comprehensive? To be clear, we are not looking for external crawlers like the SEOmoz crawl tool, but for sources that would confidently allow us to determine a list of URLs currently held in the Google index. Thank you /Thomas
Intermediate & Advanced SEO | sp80
-
What is the proper way to display e-commerce product guides? PDF / JPG?
Hi, on each product page in my e-commerce site, I have a link to show a certificate of authenticity for the product (similar to any guide on an e-commerce site). I also have the details as plain text on the page, but this is required. What is the correct way to show it: as a PDF or a JPG? Thanks
Intermediate & Advanced SEO | BeytzNet