Dynamic URL pages in Crawl Diagnostics
-
Crawl Diagnostics has reported errors for pages that do not exist on the site. These pages do not appear in the SERPs and seem to be dynamically generated URLs.
Most of the URLs that appear are formatted like http://mysite.com/keyword,%20_keyword_,%20key_word_/ and look like dynamic URLs for potential search phrases within the site.
The other common variety has the URL format http://mysite.com/tag/keyword/filename.xml?sort=filter, which is only generated by a filter utility on the site.
These pages account for about 90% of the 401 errors, duplicate page content/title, overly dynamic URL, missing meta description tag, etc. Many of the same pages appear in multiple error/warning/notice categories.
So, why are these pages being picked up in the crawl test, and how do I stop it so I can get a better analysis of my site via SEOmoz?
-
I am having a similar issue. I am getting hit with 404 errors for pages that no longer exist or have been fixed. How do I get these to stop showing up?
-
I am having a similar issue. I am getting hit with 403 errors for pages that no longer exist or have been fixed. How do I get these to stop showing up?
-
Based on what has happened from time to time on our sites, my guess is that it is caused by a widget or plugin on your CMS interacting with the bot in some way. Google is likely crawling these URLs (and getting 404s) as well, so it is probably not just Rogerbot picking them up. There is a lot on the GWMT forums about this, with a myriad of suggested fixes: mod_rewrite, returning HTTP 410 instead of 404, etc.
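To illustrate the "410 instead of 404" idea, here is a minimal Python sketch, not a drop-in fix for any particular CMS. The path sets and the function name status_for are made up for the example; on a real site the decision would live in your server or CMS configuration.

```python
from http import HTTPStatus

# Hypothetical example data: pages that are live, and pages that were
# deliberately removed and will not come back.
LIVE_PATHS = {"/", "/about/"}
REMOVED_PATHS = {"/old-page/", "/christmas-offers/"}


def status_for(path: str) -> HTTPStatus:
    """Pick a response status for a requested path."""
    if path in LIVE_PATHS:
        return HTTPStatus.OK
    if path in REMOVED_PATHS:
        # 410 Gone tells crawlers the removal is intentional and permanent.
        return HTTPStatus.GONE
    # Anything else is simply unknown.
    return HTTPStatus.NOT_FOUND


if __name__ == "__main__":
    for p in ("/", "/old-page/", "/never-existed/"):
        print(p, int(status_for(p)))
```

The point is only that a page you removed on purpose can answer 410, while a URL that never existed still answers 404.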
One fix used by many: if your site uses relative links, switch them to full absolute URLs. If you have a ton of pages this might be more of a pain. (Our clients typically have smaller sites, so it is not too much of a problem.)
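To see why relative links can produce URLs that do not exist, here is a small Python sketch using the standard library's urljoin, which resolves a link the same way a crawler would. The page URL and the href values are assumptions made up for the example.

```python
from urllib.parse import urljoin

# Hypothetical page on which the link appears.
page_url = "http://mysite.com/tag/keyword/"

# A relative href (no leading slash) is resolved against the current page,
# so the path gets appended to the directory the crawler is already in.
relative_href = "tag/keyword/"
resolved_relative = urljoin(page_url, relative_href)

# A root-relative or fully absolute href always resolves to the same URL.
root_href = "/tag/keyword/"
resolved_root = urljoin(page_url, root_href)

absolute_href = "http://mysite.com/tag/keyword/"
resolved_absolute = urljoin(page_url, absolute_href)

if __name__ == "__main__":
    print(resolved_relative)  # http://mysite.com/tag/keyword/tag/keyword/
    print(resolved_root)      # http://mysite.com/tag/keyword/
    print(resolved_absolute)  # http://mysite.com/tag/keyword/
```

If a widget or plugin emits links like the first one, every page it appears on can generate a new, nonexistent URL for the crawler to follow.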
If you are using WordPress (or another CMS that can use the Extra Options plugin), the forums state that the 404s can be stopped as follows:
In the Extra Options plugin, I checked all of the options below; the last two do the job. Read about noindex and nofollow where appropriate in that plugin; this could be the answer.
Make meta descriptions from excerpts
Make home meta description from tagline
Add noindex where appropriate
Add nofollow where appropriate
Another option is to ensure you have no
There are plenty of bright coders on Moz who can pitch in here and be more eloquent.
Hope this helps.