Why is the number of crawled pages so low?
-
Hi, my website is www.theprinterdepo.com and I have been on SEOmoz Pro for 2 months.
When the campaign started, it crawled 10,000 pages. I then modified robots.txt to disallow some specific URL parameters from being crawled.
We have about 3,500 products, so the number of crawled pages should be close to that number.
The last crawl shows only 1,700. What should I do?
-
Hi levalencia1,
Many factors could have caused this. Was the robots.txt change the only one you made? Other possible causes include meta "noindex" tags, nofollow links, or a broken navigation structure.
In rare instances, rogerbot simply has a hiccup.
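To rule out the on-page causes mentioned above, a page's HTML can be scanned for a robots meta tag and for links marked nofollow. The following is only a minimal sketch using Python's standard library; the class name, the `audit` helper, and the sample HTML are made up for illustration and are not taken from the site in question.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects robots meta directives and nofollow links from one HTML page."""

    def __init__(self):
        super().__init__()
        self.meta_directives = set()   # e.g. {"noindex", "follow"}
        self.nofollow_links = []       # href values of <a rel="nofollow"> links

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            # Directives are comma separated, e.g. "noindex, follow".
            self.meta_directives.update(
                token.strip().lower() for token in content.split(",") if token.strip()
            )
        elif tag == "a" and "nofollow" in (attrs.get("rel") or "").lower().split():
            self.nofollow_links.append(attrs.get("href"))


def audit(html):
    """Return (meta directives, nofollow hrefs) found in the given HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.meta_directives, parser.nofollow_links


# Hypothetical page used only to demonstrate the check.
SAMPLE = (
    '<html><head><meta name="robots" content="noindex, follow"></head>'
    '<body><a href="/a.html" rel="nofollow">A</a><a href="/b.html">B</a></body></html>'
)

directives, nofollow = audit(SAMPLE)
print("noindex" in directives, nofollow)
```

Running this against the HTML of a few product pages that are missing from the crawl would show whether a noindex directive or nofollow links are involved.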
Let us know if things return to normal on your next crawl. If you have any difficulties feel free to contact the help team (help@seomoz.org) and they should be able to get things straightened out.
Best of luck with your SEO!
-
levalencia1
I still don't know what you wanted to accomplish with robots.txt when you wrote: "I modified robots.txt to disallow some specific parameters in the pages to be crawled."
Go to Google Webmaster Tools (GWMT): http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449&from=35237&rd=1
This will let you determine what your robots.txt did and did not accomplish:
The Test robots.txt tool will show you if your robots.txt file is accidentally blocking Googlebot from a file or directory on your site, or if it's permitting Googlebot to crawl files that should not appear on the web. When you enter the text of a proposed robots.txt file, the tool reads it in the same way Googlebot does, and lists the effects of the file and any problems found.
Hope it helps you out,
-
Sorry, this one got lost. I will look at it in the morning and give you feedback. Have you run anything like Xenu on the site? Do you know of anything that is not showing up that the robots.txt rules would not account for?
-
Any ideas?
-
This is my robots.txt:
User-agent: *
Disallow: */product_compare/*
Disallow: *dir=*
Disallow: *order=*
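To see which URLs these three rules would block, the patterns can be tried against sample paths. This is a minimal sketch that assumes the crawler treats `*` as "any run of characters" and a trailing `$` as an end anchor, the way Googlebot documents it; whether rogerbot interprets wildcards identically is an assumption. The helper names and the test URLs are hypothetical.

```python
import re


def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters, a trailing '$' anchors the end of the
    URL, and every other character is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


# The three Disallow patterns from the robots.txt above.
DISALLOW = ["*/product_compare/*", "*dir=*", "*order=*"]
RULES = [rule_to_regex(p) for p in DISALLOW]


def is_blocked(path_and_query):
    """True if any Disallow rule matches from the start of the path."""
    return any(rule.match(path_and_query) for rule in RULES)


# Hypothetical URLs, not taken from the actual site.
for url in [
    "/catalog/product_compare/index/",
    "/printers.html?dir=asc&order=price",
    "/hp-laserjet-4250.html",
    "/printers.html?subdir=lasers",
]:
    print(url, is_blocked(url))
```

Because each pattern begins with `*`, it matches anywhere in the URL. In the last example, `*dir=*` also catches a `subdir=` parameter, so rules like these can block more than the sort and compare URLs they were written for.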
-
levalencia1
What did you disallow?
Are there specific categories or products you know are missing?
Are there specific subdirectories that are missing?
What did you want to block with robots.txt?