Crawl Diagnostics Updates
-
I have several page types on my sites that I have blocked using the robots.txt file (e.g., emailafriend.asp, shoppingcart.asp, login.asp), but they are still showing up in Crawl Diagnostics as issues (e.g., duplicate page content, duplicate title tag). Is there a way to filter these issues out, or am I doing something wrong that causes them to show up?
- Ryan
-
Hi Ryan,
Try moving the sitemap line to the end of the file and leave a space before it. Something like this:
User-agent: *
Disallow: /cgi-bin/
Disallow: /ShoppingCart.asp
Disallow: /SearchResults.asp...
...
Disallow: /mailinglist_subscribe.asp
Disallow: /mailinglist_unsubscribe.asp
Disallow: /EmailaFriend.asp
-
I added the pages that it was suggesting to the robots.txt file:
http://www.naturalrugco.com/robots.txt
Most of the pages listed in the high-priority errors within Moz Analytics Crawl Diagnostics are the EmailaFriend.asp pages, which I've disallowed. Ex: http://www.naturalrugco.com/EmailaFriend.asp?ProductCode=AMB0012-parent
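One way to sanity-check rules like these locally is to feed them to Python's standard `urllib.robotparser` and ask whether a given URL is disallowed. This is only a rough sketch: the `is_blocked` helper is a hypothetical name, the rules are the ones quoted earlier in the thread (the elided "..." lines are left out), and it assumes the crawler matches paths the way this parser does (case-sensitive prefix match), which nothing in this thread confirms for rogerbot.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the robots.txt excerpt quoted in this thread.
# Lines that were elided with "..." in the excerpt are intentionally omitted.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /ShoppingCart.asp
Disallow: /mailinglist_subscribe.asp
Disallow: /mailinglist_unsubscribe.asp
Disallow: /EmailaFriend.asp
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `url` is disallowed for `user_agent` (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://www.naturalrugco.com/EmailaFriend.asp?ProductCode=AMB0012-parent"
    # The URL exactly as reported in Crawl Diagnostics.
    print(is_blocked(ROBOTS_TXT, "rogerbot", url))
    # Same URL with a lowercase file name, to show that case matters here.
    print(is_blocked(ROBOTS_TXT, "rogerbot", url.replace("EmailaFriend", "emailafriend")))
```

With this parser, the URL as reported counts as blocked, while the lowercase variant does not, so it is worth making sure the case used in the Disallow lines matches the case of the URLs that actually get crawled.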
-
Hi Ryan,
At the end of this page you will find several ways to block rogerbot from crawling pages: http://moz.com/help/pro/rogerbot-crawler
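As an illustration of one such approach (not a copy of what the linked page says), a robots.txt group addressed only to the Moz crawler could look like the text in the sketch below. The user-agent token `rogerbot`, the two Disallow paths, and the `can_crawl` helper are assumptions made for this example; the check uses Python's standard `urllib.robotparser`, which may not behave exactly like rogerbot itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt group that targets only the Moz crawler.
# The token "rogerbot" and the paths are assumptions for illustration.
ROGERBOT_RULES = """\
User-agent: rogerbot
Disallow: /EmailaFriend.asp
Disallow: /ShoppingCart.asp
"""


def can_crawl(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if `user_agent` may fetch `path` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # The crawler named in the group is kept out of the listed pages.
    print(can_crawl(ROGERBOT_RULES, "rogerbot", "/EmailaFriend.asp"))
    # Other crawlers are unaffected, because no group applies to them.
    print(can_crawl(ROGERBOT_RULES, "Googlebot", "/EmailaFriend.asp"))
```

A crawler-specific group like this only affects the named crawler, so pages stay reachable for other bots unless a separate group covers them.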
I hope it helps,
Istvan