Sitemap as Referrer in Crawl Error Report
-
I have just downloaded the SEOMoz crawl error report, and I have a number of pages listed which all show FALSE.
The only common denominator is the referrer - the sitemap.
I can't find anything wrong. Should I be worried that these pages are appearing in the error report?
-
Thanks Tom.
The sitemap is pointing to the correct pages, and when visiting the pages from the search engines no problems arise.
I don't understand why these pages are listed in the crawl error report when I can't see any obvious issue.
-
Hi Christina
If the referrer is the sitemap, it means that the SEOMoz crawler has been directed to that page because of the sitemap you have submitted.
If you're getting 404 errors or access errors for certain pages and they can only be reached via the sitemap, then it's a good idea to remove those URLs from the sitemap altogether. It doesn't make sense to have URLs listed in your sitemap if those URLs don't exist or have restricted access.
A cleaner sitemap will ultimately help in the long run. Hope this helps.
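The clean-up described above can be sketched in a few lines of Python. This is only a rough illustration, not how the SEOMoz crawler works: the sample sitemap, the example.com URLs, the `fake_status` lookup and the choice of 401/403/404 as "error" codes are all assumptions made for the example. In practice the status lookup would be replaced by a real HTTP request for each URL.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List

# Namespace used by the sitemaps.org XML format.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Assumption: treat "not found" and "access restricted" responses as errors.
ERROR_STATUSES = {401, 403, 404}


def sitemap_urls(xml_text: str) -> List[str]:
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(xml_text.strip())
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def urls_to_remove(urls: List[str], get_status: Callable[[str], int]) -> List[str]:
    """Return the URLs whose HTTP status suggests they should leave the sitemap."""
    return [url for url in urls if get_status(url) in ERROR_STATUSES]


# Hypothetical sitemap and status codes, for illustration only.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/old-page</loc></url>
  <url><loc>http://www.example.com/members-only</loc></url>
</urlset>
"""

FAKE_STATUSES = {
    "http://www.example.com/": 200,
    "http://www.example.com/old-page": 404,
    "http://www.example.com/members-only": 403,
}


def fake_status(url: str) -> int:
    """Stand-in for a real HTTP request; looks the status up in a dict."""
    return FAKE_STATUSES[url]


all_urls = sitemap_urls(SAMPLE_SITEMAP)
stale_urls = urls_to_remove(all_urls, fake_status)
print(stale_urls)
```

Passing the status lookup in as a function keeps the sketch runnable offline; swapping in a real request is the only change needed to run it against a live sitemap.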
Related Questions
-
Which pages should I index or have in my XML sitemap?
Hi there, my website is ConcertHotels.com - a site which helps users find hotels close to concert venues. I have a hotel listing page for every concert venue on my site - about 12,000 of them I think (and the same for nearby restaurants). e.g. https://www.concerthotels.com/venue-hotels/madison-square-garden-hotels/304484
Each of these pages lists the hotels near that concert venue. Users clicking on an individual hotel are brought through to a hotel (product) page, e.g. https://www.concerthotels.com/hotel/the-new-yorker-a-wyndham-hotel/136818
I made a decision years ago to noindex all of the /hotel/ pages since they don't have a huge amount of unique content and aren't the pages I'd like my users to land on. The primary pages on my site are the /venue-hotels/ listing pages. I have similar pages for nearby restaurants, so there are approximately 12,000 venue-restaurants pages, again, one listing page for each concert venue.
However, while all of these pages are potentially money-earners, in reality, the vast majority of subsequent hotel bookings have come from a fraction of the 12,000 venues. I would say 2000 venues are key money-earning pages, a further 6000 have generated income at a low level, and 4000 are yet to generate income.
I have a few related questions:
Although there is potential for any of these pages to generate revenue, should I be brutal and simply delete a venue if it hasn't generated revenue within a time period, and just accept that, while it "could" be useful, it hasn't proven to be and isn't worth the link equity? Or should I noindex these "poorly performing pages"?
Should all 12,000 pages be listed in my XML sitemap? Or simply the ones that are generating revenue, or perhaps just the ones that have generated significant revenue in the past and have proved to be most important to my business?
Thanks, Mike
Technical SEO | | mjk260 -
Sitemap
I have a question about the links in a sitemap. WordPress works with a sitemap that first links to the different kinds of pages: pagesitemap.xml categorysitemap.xml productsitemap.xml etc. These links on the first page are clickable. We have a website whose sitemap also links to the different pages, but the links are not clickable, just flat text. Is this an issue?
Technical SEO | | Happy-SEO0 -
2 sitemaps on my robots.txt?
Hi, I thought that I could link just one sitemap from my site's robots.txt but... I may be wrong. So, I need to confirm whether this kind of implementation is right or wrong: robots.txt for Magento Community and Enterprise ...
Sitemap: http://www.mysite.es/media/sitemap/es.xml
Sitemap: http://www.mysite.pt/media/sitemap/pt.xml
Thanks in advance,
Technical SEO | | Webicultors0 -
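As a quick way to see how a parser reads a robots.txt with two `Sitemap:` lines, here is a small sketch using Python's standard-library `urllib.robotparser` (its `site_maps()` method needs Python 3.8 or later). The two sitemap URLs are the ones from the question; the rest of the Magento robots.txt was elided in the post and is not reconstructed here. This only shows what one parser does with the file and is not a statement about how any particular search engine treats it.

```python
from urllib.robotparser import RobotFileParser

# Only the two Sitemap lines quoted in the question; the rest of the
# robots.txt was elided in the original post and is left out here.
ROBOTS_TXT = """\
Sitemap: http://www.mysite.es/media/sitemap/es.xml
Sitemap: http://www.mysite.pt/media/sitemap/pt.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns the listed sitemap URLs in order, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```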
Duplicate Content being reported
Hi, I have a new client whose first MA crawl report is showing lots of duplicate content. The main batch of these are all the HP URL with an 'attachment' part at the end, such as: www.domain.com/?attachment_id=4176 As far as I can tell, it's some sort of slide show just showing a different image in the main frame of each page, with no other content. Each one does have a unique meta title and H1, though. What's the best thing to do here?
Not a problem and leave as is
Use the parameter handling tool in GWT
Canonicalise, referencing the HP
or other solution?
Many thanks, Dan
Technical SEO | | Dan-Lawrence0 -
Crawl Test Report only shows home page and no inner site pages?
Hi, My site is [removed] When I first tried to set up a new campaign for the site, I received the error: Roger has detected a problem: We have detected that the root domain [removed] does not respond to web requests. Using this domain, we will be unable to crawl your site or present accurate SERP information. I then ran a Crawl Test per the FAQ. The SEOmoz crawl report only shows my home page URL and does not have any inner site pages. This is a Joomla site. What is the problem? Thanks! Dave
Technical SEO | | crave810 -
Blocking https from being crawled
I have an ecommerce site where https is being crawled for some pages. I am wondering whether the solution below will fix the issue. www.example.com will be my domain. In the nav there is a login page, www.example.com/login, which redirects to https://www.example.com/login. If I just disallowed /login in the robots file, wouldn't the crawler then stop following the redirect and stop indexing that stuff? The redirect part is what I am questioning.
Technical SEO | | Sean_Dawes0 -
Help with bing redirection error
Can somebody help me figure out this Bing redirect error? The link to "http://w******/flea-control" has resulted in HTTP redirection to "http://w******/feas/flea-control/". Search engines can only pass page rankings and other relevant data through a single redirection hop. Using unnecessary redirects can have a negative impact on page ranking. I am using WordPress. I am actually linking to the /feas/flea-control/ version. I have looked everywhere for help. I got this error using Bing's SEO software.
Technical SEO | | OxzenMedia0 -
Google Crawler Error / restricting crawling
Hi, On a Magento instance we manage there is an advanced search. As part of the ongoing enhancement of the instance, we altered the advanced search options so that there are fewer, more relevant ones. The issue is that Google has crawled and catalogued the advanced search with the now-removed options in the query string. Google keeps crawling these out-of-date advanced searches, and these stale searches now create a 500 error. Currently Google is attempting to crawl these pages twice a day. I have implemented the following to stop this:
1. Requested that the URL be removed via Webmaster Tools, selecting the directory option using URI: http://www.domian.com/catalogsearch/advanced/result/
2. Added Disallow to robots.txt: Disallow: /catalogsearch/advanced/result/* Disallow: /catalogsearch/advanced/result/
3. Added rel="nofollow" to the links in the site linking to the advanced search.
Below is a list of the links it is crawling or attempting to crawl, 12 links crawled twice a day, each resulting in a 500 status. Can anything else be done?
http://www.domain.com/catalogsearch/advanced/result/?bust_line=94&category=55&color_layered=128&csize[0]=0&fabric=92&inventry_status=97&length=0&price=5%2C10
http://www.domain.com/catalogsearch/advanced/result/?bust_line=115&category=55&color_layered=130&csize[0]=0&fabric=0&inventry_status=97&length=116&price=3%2C10
http://www.domain.com/catalogsearch/advanced/result/?bust_line=94&category=55&color_layered=126&csize[0]=0&fabric=92&inventry_status=97&length=0&price=5%2C10
http://www.domain.com/catalogsearch/advanced/result/?bust_line=0&category=55&color_layered=137&csize[0]=0&fabric=93&inventry_status=96&length=0&price=8%2C10
http://www.domain.com/catalogsearch/advanced/result/?bust_line=0&category=55&color_layered=142&csize[0]=0&fabric=93&inventry_status=96&length=0&price=4%2C10
http://www.domain.com/catalogsearch/advanced/result/?bust_line=0&category=55&color_layered=137&csize[0]=0&fabric=93&inventry_status=96&length=0&price=5%2C10
http://www.domain.com/catalogsearch/advanced/result/?bust_line=0&category=55&color_layered=142&csize[0]=0&fabric=93&inventry_status=96&length=0&price=5%2C10
http://www.domain.com/catalogsearch/advanced/result/?bust_line=0&category=55&color_layered=135&csize[0]=0&fabric=93&inventry_status=96&length=0&price=5%2C10
http://www.domain.com/catalogsearch/advanced/result/?bust_line=0&category=55&color_layered=128&csize[0]=0&fabric=93&inventry_status=96&length=0&price=5%2C10
http://www.domain.com/catalogsearch/advanced/result/?bust_line=0&category=55&color_layered=127&csize[0]=0&fabric=93&inventry_status=96&length=0&price=4%2C10
http://www.domain.com/catalogsearch/advanced/result/?bust_line=0&category=55&color_layered=127&csize[0]=0&fabric=93&inventry_status=96&length=0&price=3%2C10
http://www.domain.com/catalogsearch/advanced/result/?bust_line=0&category=55&color_layered=128&csize[0]=0&fabric=93&inventry_status=96&length=0&price=10%2C10
http://www.domain.com/catalogsearch/advanced/result/?bust_line=0&category=55&color_layered=122&csize[0]=0&fabric=93&inventry_status=96&length=0&price=8%2C10
Technical SEO | | Flipmedia1120
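One way to sanity-check the Disallow rules quoted in the question is to run them through Python's standard-library `urllib.robotparser` against one of the stale URLs. This is a sketch under assumptions: the `User-agent: *` line is added here because the post shows only the Disallow lines, and the `/catalog/category/view` path is a hypothetical page used for contrast. As far as I know, this parser does plain prefix matching and does not implement Google-style `*` wildcards, so a passing result here says nothing definitive about how Googlebot itself reads the file.

```python
from urllib.robotparser import RobotFileParser

# Disallow lines as quoted in the question. The "User-agent: *" line is an
# assumption; the post does not show which user-agent group they sit in.
ROBOTS_TXT = """\
User-agent: *
Disallow: /catalogsearch/advanced/result/*
Disallow: /catalogsearch/advanced/result/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# One of the stale advanced-search URLs from the list in the question.
stale_url = (
    "http://www.domain.com/catalogsearch/advanced/result/"
    "?bust_line=94&category=55&color_layered=128&csize[0]=0"
    "&fabric=92&inventry_status=97&length=0&price=5%2C10"
)
# Hypothetical page outside the blocked directory, for comparison.
other_url = "http://www.domain.com/catalog/category/view"

stale_allowed = parser.can_fetch("*", stale_url)
other_allowed = parser.can_fetch("*", other_url)
print(stale_allowed, other_allowed)
```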