Google indexing is slowing down?
-
I have up to 20 million unique pages, and so far I've only submitted about 30k of them in my sitemap.
We had a few load-related errors during Google's initial visits, and it thought some pages were duplicates, but we fixed all that. We haven't gotten a crawl-related error for 2 weeks now.
Google appears to be indexing fewer and fewer URLs every time it visits. Any ideas why? I am not sure how to get all our pages indexed if it's going to operate like this... I'd love some help, thanks!
-
Try getting links for important pages. As for Google crawling and indexing: I have websites with around 10k URLs, of which I know half are not relevant for search at all. Google handles this completely automatically: it cuts out most of the duplicate or non-relevant pages compared with what I list in my sitemap, and ends up showing 4k pages or so in actual search.
It's really no issue. You can't tell me you have 20 million pages of unique (user-)generated content.
-
Share your URL for better responses.
Have you submitted a complete XML sitemap through Google Search Console? And are all the pages free of a noindex meta tag?
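One way to spot-check the noindex question on a sample of pages is sketched below. It is a rough illustration rather than a full audit: `has_noindex` and `RobotsMetaParser` are made-up names, the HTML is assumed to be already downloaded as a string, and a noindex sent through an `X-Robots-Tag` HTTP header would not be caught by this.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def has_noindex(html):
    """Return True if the page's robots meta tags ask not to be indexed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # "none" is shorthand for "noindex, nofollow".
    return bool(parser.directives & {"noindex", "none"})


if __name__ == "__main__":
    sample = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(has_noindex(sample))
```

Running this over a random sample of the submitted URLs would show whether a template is emitting the tag by accident.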
20 million unique pages can only be created in some automated fashion based on a much smaller amount of original content, so Google may perceive these pages to be spammy and not worth indexing.
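On the sitemap side: the sitemaps.org protocol caps a single sitemap file at 50,000 URLs, so a "complete" sitemap for 20 million pages would have to be split into roughly 400 files tied together by a sitemap index. A minimal sketch of that split follows; the file naming scheme and the `https://www.example.com/sitemaps/` base URL are placeholders, not anything from the original poster's site.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # per-file limit in the sitemaps.org protocol
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>\n'


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield successive lists of at most `size` URLs."""
    batch = []
    for url in urls:
        batch.append(url)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def build_sitemap(urls):
    """Return one <urlset> document listing the given URLs."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return XML_HEADER + f'<urlset xmlns="{NS}">\n' + entries + "</urlset>\n"


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> document pointing at each sitemap file."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return XML_HEADER + f'<sitemapindex xmlns="{NS}">\n' + entries + "</sitemapindex>\n"


def plan_sitemaps(urls, base="https://www.example.com/sitemaps/"):
    """Split URLs into sitemap files; return (index_xml, {filename: sitemap_xml}).

    `base` is a hypothetical location where the files would be hosted.
    """
    files = {}
    for i, batch in enumerate(chunk(urls), start=1):
        files[f"sitemap-{i:04d}.xml"] = build_sitemap(batch)
    index_xml = build_sitemap_index(base + name for name in files)
    return index_xml, files
```

Only the index file then needs to be submitted. As the other answers point out, listing a URL this way makes it discoverable but does not by itself give Google a reason to index it.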
-
Ryan,
G doesn't crawl/index every URL it encounters. Its decision to do so is based on its perceived value of the page and of the page the link is on. You can increase that value (PR/PA) by sculpting your internal links, revising the siloing of your content, and/or acquiring backlinks to pages deeper within your site.
If you're a brand new site with 20M pages, what reason have you given Google to crawl all those pages? Just submitting them or listing them in a sitemap isn't enough of a reason. I mean, 20M pages! Google's like, damn dude, what's up with all that?
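To make the internal-linking point concrete, one rough proxy is click depth: how many links a crawler has to follow from the homepage to reach a page. The sketch below assumes you already have the internal link graph as a dictionary (for example from your own crawl); the function names and sample URLs are illustrative, and click depth is only a heuristic, not how Google actually scores pages.

```python
from collections import deque


def click_depths(links, start="/"):
    """Breadth-first search over internal links.

    `links` maps each page to the pages it links to.
    Returns {page: minimum number of clicks from `start`}.
    """
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def orphans(links, depths):
    """Pages that appear in the graph but cannot be reached from the start page."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    return sorted(pages - set(depths))


if __name__ == "__main__":
    # Hypothetical site: one category page, two items, one unlinked page.
    site = {
        "/": ["/category/"],
        "/category/": ["/item/1", "/item/2"],
        "/item/1": ["/item/2"],
        "/old-page": ["/"],
    }
    depths = click_depths(site)
    print(depths)
    print(orphans(site, depths))
```

Pages that end up many clicks deep, or that are reachable only through the sitemap, are the ones that restructuring internal links or earning deep backlinks is meant to help.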
Related Questions
-
Follow no-index
I have a question about the right way to keep pages out of the index: with a canonical, or with follow, noindex. First we have a blog page:
URL: /blog/
index, follow
Page 2 of the blog:
URL: /blog?=p2
index, follow
rel="prev" /blog/
rel="next" ?=p3
Nothing strange here, I guess. But we also have other pages with a chance of duplicate content:
/SEO-category/
/SEO-category/view-more/
Because I don't want the "view-more" items to be indexed, I want to set them to follow, noindex (follow so the linked pages can still be reached). But the "view-more" pages also have pagination. What is the best way?
Option 1:
/SEO-category/view-more/
follow, noindex
/SEO-category/view-more?=p2
follow, noindex
rel="prev" /view-more/
rel="next" ?=p3
Option 2:
/SEO-category/view-more/
Canonical: /SEO-category/
/SEO-category/view-more?=p2
rel="prev" /view-more/
rel="next" ?=p3
Option 3: Other suggestions? Thanks!
Technical SEO | | Happy-SEO0 -
Google+ and Google Knowledge Graph
I am trying to get things to match up between the company's brand web search and its Google+ page, which we have had for years now. The knowledge graph on Google shows the map, address and name (shown in attached image), but it is not linked to a G+ page: when I click "Are you the business owner?" it tries to make me create a new G+ business page. Does anyone have any ideas on this? Also, does the wiki name have to be exact for it to show? As for the phone number, would that be coming from the DNS record? It is nowhere in the rich snippet markup or the normal markup. Thanks in advance
Technical SEO | | David-McGawn0 -
Google indexing despite robots.txt block
Hi This subdomain has about 4'000 URLs indexed in Google, although it's blocked via robots.txt: https://www.google.com/search?safe=off&q=site%3Awww1.swisscom.ch&oq=site%3Awww1.swisscom.ch This has been the case for almost a year now, and it does not look like Google tends to respect the blocking in http://www1.swisscom.ch/robots.txt Any clues why this is or what I could do to resolve it? Thanks!
Technical SEO | | zeepartner0 -
Blocked URL parameters can still be crawled and indexed by google?
Hi guys, I have two questions, and one might be a dumb question, but here it goes. I just want to be sure that I understand: if I tell Webmaster Tools to ignore a URL parameter, will Google still index and rank my URL? Is it OK if I don't append the brand filter to the URL structure? Will I still rank for that brand? Thanks. PS: OK, 3 questions :)...
Technical SEO | | catalinmoraru0 -
Google Places Reviews
Has anyone seen delays in Google+ reviews showing up? We have multiple clients who have not received a new review in over two months. These are good accounts with good Zagat scores and 15+ good reviews from real customers. Our clients have asked their clients and confirmed that reviews have been left recently. However, no new reviews have shown up in the past 60+ days.
Technical SEO | | CaseyKluver0 -
Remove Site from Google
How can I get my website out of google? I want all pages completely gone. Thanks!
Technical SEO | | tylerfraser0 -
Can JavaScript affect Google's index/ranking?
We changed our website template about a month ago, and since then we have experienced a huge drop in rankings, especially for our homepage. We kept the same URL structure across the entire website, pretty much the same content, and the same on-page SEO. We expected some drop in rank, but not one this large: the homepage used to rank at the top of the second page, and it has now lost about 20-25 positions. What we changed is the homepage structure, which is now more user-friendly with much better organized information, and we added a slider presenting our main services. 80% of the homepage content sits inside the slideshow and 3 tabs, and all of these elements are JavaScript. The content is unique and SEO optimized, but when I disable JavaScript it becomes completely unavailable. Could this be the reason for the huge rank drop? I used the Fetch as Googlebot tool in Webmaster Tools, and it looks like Google reads what's inside the JavaScript slideshow perfectly, so I was not worried until I found this on SEOMoz: "Try to avoid ... using javascript ... since the search engines will ... not indexed them ... " One more weird thing: although we have no duplicate content and the entire website has been cached, for a few pages (including the homepage) the picture snippet is from the old website. All main URLs are the same; we removed some old ones that we don't need anymore, so we kept all the inbound links. The 301 redirects are properly set. But still, we have a huge rank drop. Also (not sure if this is important or not), the robots.txt file disallows some folders such as images, modules, templates... (Joomla components). We still have some HTML errors and warnings, but far fewer than we had with the old website. Any advice would be much appreciated, thank you!
Technical SEO | | echo10 -
Google refuses to index our domain. Any suggestions?
A very similar question was asked previously. (http://www.seomoz.org/q/why-google-did-not-index-our-domain) We've done everything in that post (and comments) and then some. The domain is http://www.miwaterstewardship.org/ and, so far, we have: put "User-agent: * Allow: /" in the robots.txt (we recently removed the "allow" line and included a Sitemap: directive instead); built a few hundred links from various pages, including multiple links from .gov domains; properly set up everything in Webmaster Tools; submitted sitemaps (multiple times); checked the "Fetch as Googlebot" display in Webmaster Tools (everything looks fine); and submitted a reconsideration request to Google asking why we're not being indexed. Webmaster Tools tells us that it's crawling the site normally and is indexing everything correctly. Yahoo! and Bing have both indexed the site with no problems and are returning results. Additionally, many of the pages on the site have PR0, which is unusual for a non-indexed site. Typically we've seen those sites have no PR at all. If anyone has any ideas about what we could do, I'm all ears. We've been working on this for about a month and cannot figure this thing out. Thanks in advance for your advice.
Technical SEO | | NetvantageMarketing0