Recovering from Blocked Pages Debacle
-
Hi, per this thread: http://www.seomoz.org/q/800-000-pages-blocked-by-robots
We had a huge number of pages blocked by robots.txt because of some dynamic file that must have integrated with our CMS somehow. In just a few weeks, hundreds of thousands of pages were "blocked." This number is now going down, but instead of dropping by the hundreds of thousands, it is dropping by the hundreds, and very slowly. So we really need to speed up this process.
We have our sitemap, which we will re-submit, but I have a few questions related to it. Previously the sitemap had the <lastmod> tag set to the original date of the page, and all of these pages have been changed since then. Is there any harm in doing a mass change of the <lastmod> field? It would be an accurate reflection, but I don't want it to be caught by some spam catcher. The easy thing to do would be to just set that date to now, but then all the pages would have the same date.
Any other tips on how to get these pages "unblocked" faster?
Thanks!
Craig
-
Hey Dan,
I am actually not so concerned about the pages being indexed. I don't really think they were ever de-indexed. Unless I am wrong, I think they were de-ranked.
I know others have said that when they "disallowed" large portions of their sites, their pages dropped in the rankings, and did not necessarily disappear. This is more what I want to see recovery from.
Thanks!
Craig
-
Craig
Do you have Screaming Frog? The best way to make sure you're all set is to run a crawl with it. By default it respects robots.txt and will not crawl anything that is blocked. Set the user agent to Googlebot.
If it crawls all the pages you want it to just fine, then you are all set!
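If you want a quick spot check without a full crawl, here is a minimal Python sketch of the same idea, using the standard library's urllib.robotparser. The robots.txt rules and the example.com URLs below are made-up placeholders, not anything from Craig's site. Keep in mind that this parser does simple prefix matching and does not handle wildcard patterns the way Googlebot does, so treat the result as a rough check only.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


# Hypothetical rules and URLs, for illustration only
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

SAMPLE_URLS = [
    "http://www.example.com/",
    "http://www.example.com/private/page.html",
    "http://www.example.com/products/widget",
]

if __name__ == "__main__":
    for url in blocked_urls(SAMPLE_ROBOTS, SAMPLE_URLS):
        print("blocked:", url)
```

To test a live site, you would paste in the actual contents of its robots.txt and a sample of the URLs you expect to be crawlable.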
-Dan
-
Thanks for jumping in, Dan. The number of blocked pages, over a month later, is still way up there. It really has barely gone down. As of today it is at 904,000.
So, we still wait and hope that:
A. That many pages aren't actually blocked (whatever blocked actually means.)
B. The rate at which that number falls will begin to increase.
Thanks for your answer!
Craig
-
Hey There
I see this question is a bit old ... are you still having these issues? If so, when you say "going down," do you mean according to the numbers showing in Webmaster Tools?
I do know that quite often there can be a delay in the data in Webmaster Tools (especially the indexation report which you may be referring to).
I don't think there's any harm in updating the dates to reflect the most recent version of the page, so long as they are accurate.
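For what it's worth, here is a rough Python sketch of writing a sitemap where each <lastmod> comes from that page's own last-modified date rather than one blanket "now" value. The URLs and dates are invented placeholders, and build_sitemap is just a name chosen for this example; in practice the dates would come from wherever the CMS records page changes.

```python
from datetime import date
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified_date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, modified in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # Each page keeps its own date, written as YYYY-MM-DD
        ET.SubElement(url, "lastmod").text = modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical pages and modification dates, for illustration only
    pages = [
        ("http://www.example.com/widgets", date(2012, 5, 1)),
        ("http://www.example.com/gadgets", date(2012, 6, 15)),
    ]
    print(build_sitemap(pages))
```

Because the dates are pulled per page, the file stays accurate and the pages do not all end up with an identical timestamp.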
Let me know if that helps or if you're all set.
-Dan