Functionality of SEOmoz crawl page reports
-
I am trying to find a way to ask SEOmoz staff to answer this question, because I think it is a functionality question, so I checked the SEOmoz PRO resources. I have also had no responses to it in the Forum. So here it is again. Thanks much for your consideration!
Is it possible to configure the SEOmoz Rogerbot error-finding bot (which makes the crawl diagnostic reports) to obey the instructions in the individual page headers and in the http://client.com/robots.txt file?
For example, there is a page at http://truthbook.com/quotes/index.cfm?month=5&day=14&year=2007 that has this in the header:
<meta name="robots" content="noindex">
This page is a themed Quote of the Day page and is intentionally duplicated twice, at http://truthbook.com/quotes/index.cfm?month=5&day=14&year=2004
and also at
http://truthbook.com/quotes/index.cfm?month=5&day=14&year=2010, but they all have <meta name="robots" content="noindex"> in them. So Google should not see them as duplicates, right? Google does not in Webmaster Tools.
So it should not be counted 3 times, but it seems to be. How do we generate a report of the actual pages shown in the report as duplicates so we can check? We do not believe Google sees it as a duplicate page, but Roger appears to.
Similarly, take http://truthbook.com/contemplative_prayer/: here the http://truthbook.com/robots.txt file also tells Google to stay clear.
Yet we are showing thousands of duplicate page content errors, when Google Webmaster Tools has shown only a few hundred, configured as described.
Anyone?
Jim
-
Hi Jimmy,
Thanks for writing in with a great question.
In regard to the "noindex" meta tag, our crawler will obey that tag as soon as we find it in the code, but we will also crawl any source code up to the point where we hit the tag, so pages with the "noindex" tag will still show up in the crawl. We just don't crawl any information past that tag. One of the notices we include is "Blocked by meta robots", and for the truthbook.com campaign we show over 2000 pages under that notice.
For example, on the page http://truthbook.com/quotes/index.cfm?month=5&day=14&year=2010, there are six lines of code, including the title, that we would crawl before hitting the "noindex" directive. Google's crawler is much more sophisticated than ours, so they are better at handling the meta robots "noindex" tag.
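To see where the "noindex" directive sits in a page's source, something like the following Python sketch may help. It uses the standard-library html.parser module to report the line on which a robots meta tag containing "noindex" first appears. The sample HTML is invented for illustration and is not the actual truthbook.com markup, and the class and function names are hypothetical. It only locates the tag and does not reproduce how Rogerbot itself reads a page.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Remembers the line of the first <meta name="robots"> tag that says noindex."""

    def __init__(self):
        super().__init__()
        self.noindex_line = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.noindex_line is not None:
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # The content attribute may hold several comma-separated directives.
        directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
        if "noindex" in directives:
            # getpos() returns (line number, column) of the tag being handled.
            self.noindex_line, _ = self.getpos()


def find_noindex_line(html):
    """Return the 1-based line number of the noindex meta tag, or None."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.noindex_line


# Invented sample page, used only to exercise the function.
SAMPLE_HTML = """<!DOCTYPE html>
<html>
<head>
<title>Quote of the Day</title>
<meta name="robots" content="noindex">
</head>
<body><p>Sample quote.</p></body>
</html>
"""

if __name__ == "__main__":
    print("noindex found on line:", find_noindex_line(SAMPLE_HTML))
```

Everything above the reported line, such as the title, is what a crawler reading the source top to bottom would see before reaching the directive.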
As for http://truthbook.com/contemplative_prayer/, we do respect the "*" wildcard directive in the robots.txt file and we are not crawling that page. I checked your full CSV report and there is no record of us crawling any pages with /contemplative_prayer/ in the URL (http://screencast.com/t/hMFuQnc9v1S), so we are correctly respecting the disallow directives in the robots.txt file.
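A quick way to check a URL against robots.txt rules yourself is Python's standard-library urllib.robotparser. The sketch below is only an illustration: the robots.txt text is an assumed example, since the real truthbook.com file is not quoted in this thread, and the helper name is_allowed is hypothetical. The user-agent string "rogerbot" is used here as the crawler's token.

```python
from urllib.robotparser import RobotFileParser

# Assumed example rules; the real truthbook.com robots.txt is not quoted in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /contemplative_prayer/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


blocked_url = "http://truthbook.com/contemplative_prayer/"
open_url = "http://truthbook.com/quotes/index.cfm?month=5&day=14&year=2010"

if __name__ == "__main__":
    for url in (blocked_url, open_url):
        print(url, "->", "allowed" if is_allowed(ROBOTS_TXT, "rogerbot", url) else "disallowed")
```

Note that this parser matches Disallow paths by simple prefix, so it is best treated as a rough check and not as a statement of how any particular crawler interprets the file.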
Also, if you would ever like to reach out to the Help Team directly in the future, you can email us from the Help Hub here: http://www.seomoz.org/help, but we are happy to answer questions in the Q&A forum, as well.
I hope this helps. Please let me know if you have any other questions.
Chiaryn