3,511 Pages Indexed and 3,331 Pages Blocked by Robots
-
Morning,
So I checked our site's index status on WMT, and I'm being told that Google is indexing 3,511 pages and the robots are blocking 3,331. This seems slightly odd as we're only disallowing 24 pages on the robots.txt file. In light of this, I have the following queries:
- Do these figures mean that Google is indexing 3,511 pages and blocking 3,331 other pages? Or does it mean that it's blocking 3,331 pages of the 3,511 indexed?
- As there are only 24 URLs being disallowed in robots.txt, why are 3,331 pages being blocked? Will these be variations of the URLs we've submitted?
- Currently, we don't have a sitemap. I know, I know, it's pretty unforgivable but the old one didn't really work and the developers are working on the new one. Once submitted, will this help?
- I think I know the answer to this, but is there any way to ascertain which pages are being blocked?
Thanks in advance!
Lewis
-
Hi,
No more links than a standard e-commerce site should have...
I'm chasing the sitemap as we speak.
Cheers,
-
The blocked URLs are probably nofollow links throughout the site. Do you have a lot of links pointing outward from your pages?
Google is indexing 3511 pages, of which 3331 are blocked by Robots. I would check some of the internal/external links on those disallowed pages. I don't see how it could come up to 3331 blocked pages, but it couldn't hurt to start there.
Definitely get a sitemap submitted ASAP. It will help for sure.
-
Excuse the short reply.
Add the sitemap to your robots.txt, and submit it to Google WMT.
Why not just use a free sitemap generator while yours is in development?
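On working out which pages are blocked, and why 24 Disallow lines can account for thousands of URLs: each Disallow rule is a path prefix, so one rule covers every URL that starts with it, including parameter variations. Below is a rough sketch in Python using the standard library's `urllib.robotparser`. The robots.txt rules, the `example.com` URLs and the `blocked_urls` helper are all made up for illustration, not taken from the site in question. As far as I know the standard-library parser only does plain prefix matching and does not implement Google's `*` and `$` wildcard extensions, so treat the result as an approximation of what Googlebot does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the rules and sitemap URL are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /basket/
Disallow: /search

Sitemap: https://www.example.com/sitemap.xml
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network fetch is needed.
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


# Hypothetical URL list, e.g. exported from a crawl of your own site.
urls = [
    "https://www.example.com/",
    "https://www.example.com/products/widget",
    "https://www.example.com/basket/view",
    "https://www.example.com/search?q=widget&page=2",
    "https://www.example.com/search?q=gadget",
]

for url in blocked_urls(ROBOTS_TXT, urls):
    print("blocked:", url)
```

Here the single `Disallow: /search` line blocks both search URLs, which is how a short robots.txt can add up to a large blocked count on an e-commerce site with filtered or paginated URLs. Feeding it a full crawl export instead of the five sample URLs would give you the list of blocked pages.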