Why is the number of crawled pages so low?
-
Hi, my website is www.theprinterdepo.com and I have been on SEOmoz Pro for 2 months.
When the campaign started, it crawled 10000 pages. Then I modified robots.txt to disallow some specific URL parameters from being crawled.
We have about 3500 products, so the number of crawled pages should be close to that number.
The last crawl shows only 1700. What should I do?
-
Hi levalencia1,
This could have been caused by many factors. Was the robots.txt the only change you made? Other possible causes include meta "noindex" tags, nofollow links, or broken navigation structures.
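As a rough way to check a page for the first two causes, here is a minimal sketch using only the Python standard library. It scans a page's HTML for a `<meta name="robots">` tag and for links marked `rel="nofollow"`. The class name, the function name, and the sample HTML are made up for illustration; they are not part of any SEOmoz tool.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects <meta name="robots"> directives and rel="nofollow" links."""

    def __init__(self):
        super().__init__()
        self.meta_directives = set()
        self.nofollow_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            # content looks like "noindex, follow"; split it into directives
            content = attrs.get("content") or ""
            self.meta_directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )
        elif tag == "a" and "nofollow" in (attrs.get("rel") or "").lower().split():
            self.nofollow_links.append(attrs.get("href"))


def audit_page(html):
    """Return which crawl-limiting signals appear in one page's HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return {
        "noindex": "noindex" in parser.meta_directives,
        "nofollow_meta": "nofollow" in parser.meta_directives,
        "nofollow_links": parser.nofollow_links,
    }


# Hypothetical page source, for illustration only
sample_html = """
<html><head><meta name="robots" content="noindex, follow"></head>
<body><a href="/printers" rel="nofollow">Printers</a><a href="/toner">Toner</a></body></html>
"""
report = audit_page(sample_html)
print(report)
```

Running this against the saved source of a few category and product pages should show whether any of them carry these signals.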
In rare instances, rogerbot has a hiccup.
Let us know if things return to normal on your next crawl. If you have any difficulties feel free to contact the help team (help@seomoz.org) and they should be able to get things straightened out.
Best of luck with your SEO!
-
levalencia1
I still don't know what you wanted to accomplish with robots.txt when you wrote: "I modified robots.txt to disallow some specific parameters in the pages to be crawled."
Go to Google Webmaster Tools (GWMT): http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449&from=35237&rd=1
This will let you determine what your robots.txt did and did not accomplish:
The Test robots.txt tool will show you if your robots.txt file is accidentally blocking Googlebot from a file or directory on your site, or if it's permitting Googlebot to crawl files that should not appear on the web. When you enter the text of a proposed robots.txt file, the tool reads it in the same way Googlebot does, and lists the effects of the file and any problems found.
Hope it helps you out,
-
Sorry, this one got lost. I will look at it in the morning and give you feedback. Have you run anything like Xenu on the site? Do you know what is not showing up that would be outside of the robots.txt?
-
Any ideas?
-
This is my robots.txt:
User-agent: *
Disallow: */product_compare/*
Disallow: *dir=*
Disallow: *order=*
-
levalencia1
What did you disallow?
Are there specific categories or products you know are missing?
Are there specific subdirectories that are missing?
What is it you wanted to block with robots.txt?
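One way to see what the three Disallow rules posted in this thread would block is to test them against a handful of URLs. The sketch below is a minimal matcher that assumes Google-style wildcard handling (`*` matches any run of characters, a trailing `$` anchors the end). Whether rogerbot interprets the patterns exactly this way is an assumption. The test URLs are hypothetical, and the sketch ignores Allow rules and rule precedence.

```python
import re
from urllib.parse import urlsplit

# The Disallow patterns from the robots.txt posted in this thread
DISALLOW_RULES = ["*/product_compare/*", "*dir=*", "*order=*"]


def rule_to_regex(rule):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters and a trailing '$' anchors the end;
    everything else is matched literally from the start of the path.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    body = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def blocking_rule(url, rules=DISALLOW_RULES):
    """Return the first Disallow pattern that matches the URL, or None."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    for rule in rules:
        if rule_to_regex(rule).match(target):
            return rule
    return None


# Hypothetical URLs, for illustration only
test_urls = [
    "http://www.theprinterdepo.com/laser-printers?dir=asc&order=price",
    "http://www.theprinterdepo.com/catalog/product_compare/index/",
    "http://www.theprinterdepo.com/some-product.html",
    "http://www.theprinterdepo.com/account?reorder=1",
]
for url in test_urls:
    print(url, "->", blocking_rule(url) or "allowed")
```

Note that `*dir=*` and `*order=*` match anywhere in the URL, so a parameter such as `reorder=` or `subdir=` would be blocked as well. Running a list of known product URLs through a check like this would show whether the rules catch more than the sort parameters.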