Moz Crawler Causing Server Timeouts... Crawling thousands of nonexistent pages with query parameters
-
The Moz crawler is crawling pages like these:
- http://www.xxxx.com/?product_count=100&product_order=desc&product_orderby=date
- http://www.xxxx.com/?product_count=100&product_order=desc&paged=1
- http://www.xxx.com/?product_count=100&product_order=desc&product_view=grid
Last month it crawled 80,000 pages on a site with fewer than 100 pages. Is there a way to select only certain pages to be crawled? The current crawl started Monday morning and is still running at midday Tuesday. Every Monday the crawl causes timeouts on our server from high bandwidth use. We are getting ready to delete this client from the account unless someone can give us a solution.
Thanks.
-
Pamela, the immediate solution is to use your robots.txt file to block the Moz crawler from crawling URLs with parameters.
User-agent: rogerbot
Disallow: /*?utm
Those pages are coming from the bot trying to follow links to all the different ways product pages can be sorted. You'll want to ensure Googlebot isn't having the same problem.
Hope that helps,
Paul
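Before deploying a rule, it can help to check which of the URLs from the question it would actually cover. The sketch below is a minimal illustration, not Moz's implementation: it assumes the common robots.txt wildcard convention (`*` matches any run of characters, a trailing `$` anchors the end of the URL), and the helper names `rule_to_regex` and `is_blocked` are hypothetical. The `/*?product_count=` and `/*?` patterns are assumptions added for comparison and do not come from the answer above. Whether rogerbot applies wildcards exactly this way should be confirmed against Moz's own documentation.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    Assumed semantics: '*' matches any run of characters and a
    trailing '$' anchors the pattern to the end of the URL.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(url: str, disallow_rules: list[str]) -> bool:
    """Return True if any Disallow pattern matches the URL's path and query."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # re.match anchors at the start, like a robots.txt prefix match.
    return any(rule_to_regex(rule).match(target) for rule in disallow_rules)


if __name__ == "__main__":
    # URLs taken from the question, plus one hypothetical clean page.
    urls = [
        "http://www.xxxx.com/?product_count=100&product_order=desc&product_orderby=date",
        "http://www.xxxx.com/?product_count=100&product_order=desc&paged=1",
        "http://www.xxxx.com/about/",  # hypothetical page without parameters
    ]
    for rule in ["/*?utm", "/*?product_count=", "/*?"]:
        for url in urls:
            print(rule, is_blocked(url, [rule]), url)
```

Under these assumptions, `/*?utm` only matches query strings that begin with `utm`, so it would not cover the `product_count` URLs listed in the question. A pattern such as `/*?product_count=`, or the broader `/*?`, would cover them while leaving parameter-free pages crawlable.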