Why the number of crawled pages is so low¿?
-
Hi, my website is www.theprinterdepo.com and I have been in seomoz pro for 2 months.
When it started it crawled 10000 pages, then I modified robots.txt to disallow some specific parameters in the pages to be crawled.
We have about 3500 products, so thhe number of crawled pages should be close to that number
In the last crawl, it shows only 1700, What should I do?
-
Hi levelencia1,
This could have been caused by many factors. Was the robots.txt the only change you made? Other things that could have caused it could have been meta "noindex" tags, nofollow links, or broken navigation structures.
In rare instances, sometimes rogerbot has a hiccup.
Let us know if things return to normal on your next crawl. If you have any difficulties feel free to contact the help team (help@seomoz.org) and they should be able to get things straightened out.
Best of luck with your SEO!
-
levalencia1
Still don't know what you wanted to accomplish with Robots re: I modified robots.txt to disallow some specific parameters in the pages to be crawled.
Go to GWMT: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449&from=35237&rd=1
This will allow you to determine what your robots.txt accomplished or not:
The Test robots.txt tool will show you if your robots.txt file is accidentally blocking Googlebot from a file or directory on your site, or if it's permitting Googlebot to crawl files that should not appear on the web. When you enter the text of a proposed robots.txt file, the tool reads it in the same way Googlebot does, and lists the effects of the file and any problems found.
Hope it helps you out,
-
Sorry, This one got lost. I will look at it in the a.m. and give you the feedback. Have you run anything like Xenu on the site? Do you know what is not showing up that would be outside of the robots.txt?
-
Sorry, This one got lost. I will look at it in the a.m. and give you the feedback. Have you run anything like Xenu on the site? Do you know what is not showing up that would be outside of the robots.txt?
-
ANY IDEA?
-
this is my robots.txt
User-agent: * Disallow: */product_compare/* Disallow: *dir=* Disallow: *order=*
-
levalencia1
What did you disallow?
Are there specific categories or products you know are missing?
Is there a specific sub directory(s) that is missing?
What is it you wanted to block with robots?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Blog Page Titles - Page 1, Page 2 etc.
Hi All, I have a couple of crawl errors coming up in MOZ that I am trying to fix. They are duplicate page title issues with my blog area. For example we have a URL of www.ourwebsite.com/blog/page/1 and as we have quite a few blog posts they get put onto another page, example www.ourwebsite.com/blog/page/2 both of these urls have the same heading, title, meta description etc. I was just wondering if this was an actual SEO problem or not and if there is a way to fix it. I am using Wordpress for reference but I can't see anywhere to access the settings of these pages. Thanks
Technical SEO | | O2C0 -
Help! Pages not being indexed
Hi Mozzers, I need your help.
Technical SEO | | bshanahan
Our website (www.barnettcapitaladvisors.com) stopped being indexed in search engines following a round of major changes to URLs and content. There were a number of dead links for a few days before 301 redirects were properly put in place. And now, only 3 pages show up in bing when I do the search "site:barnettcapitaladvisors.com". A bunch of pages show up in Google for that search, but they're not any of the pages we want to show up. Our home page and most important services pages are nowhere in search results. What's going on here?
Our sitemap is at http://www.barnettcapitaladvisors.com/sites/default/files/users/AndrewCarrillo/sitemap/sitemap.xml
Robots.txt is at: http://www.barnettcapitaladvisors.com/robots.txt Thanks!0 -
Duplicate Page Title
Our pages has so many DUPLİCATE PAGE TİTLE
Technical SEO | | iskq
I want to change all of them, is it right way?0 -
Product Pages Outranking Category Pages
Hi, We are noticing an issue where some product pages are outranking our relevant category pages for certain keywords. For a made up example, a "heavy duty widgets" product page might rank for the keyword phrase Heavy Duty Widgets, instead of our Heavy Duty Widgets category page appearing in the SERPs. We've noticed this happening primarily in cases where the name of the product page contains an at least partial match for the desired keyword phrase we want the category page to rank for. However, we've also found isolated cases where the specified keyword points to a completely irrelevent pages instead of the relevant category page. Has anyone encountered a similar issue before, or have any ideas as to what may cause this to happen? Let me know if more clarification of the question is needed. Thanks!
Technical SEO | | ShawnHerrick0 -
Container Page/Content Page Duplicate Content
My client has a container page on their website, they are using SiteFinity, so it is called a "group page", in which individual pages appear and can be scrolled through. When link are followed, they first lead to the group page URL, in which the first content page is shown. However, when navigating through the content pages, the URL changes. When navigating BACK to the first content page, the URL is that for the content page, but it appears to indexers as a duplicate of the group page, that is, the URL that appeared when first linking to the group page. The client updates this on the regular, so I need to find a solution that will allow them to add more pages, the new one always becoming the top page, without requiring extra coding. For instance, I had considered integrating REL=NEXT and REL=PREV, but they aren't going to keep that up to date.
Technical SEO | | SpokeHQ1 -
Testimonial pages
Is it better to have one long testimonial page on your site, or break it down into several smaller pages with testimonials? First time I've posted on the forum. But I'm excited! Ron
Technical SEO | | yatesandcojewelers0 -
Moz Crawl Reporting Duplicate content on "template" styled pages
We have a lot of detail pages on our site that reference specific scholarships. Each page has a different Title and Description. They also have unique information all regarding the same data points. The pages are displayed in a similar structure to the user so the data is easy to read. My problem is a lot of these pages are being reported as duplicate content when they certainly are not. Most of them are reported as duplicates when they have the same sponsor. They may have the same contact information listed. These two are being reported as duplicate of each other. They share some data but they are definitely different scholarships. http://www.collegexpress.com/scholarships/adelaide-mcclelland-garden-club-scholarship/9254/ http://www.collegexpress.com/scholarships/mary-wannamaker-witt-and-lee-hampton-witt-memorial-scholarship/10785/ Would it help to add a Canonical for each page to themselves? Any other suggestions would be great. Thanks
Technical SEO | | GeorgeLaRochelle0 -
Removing Duplicate Pages
Hi everyone. I'm sure this falls under novice seo question. But how do i remove duplicate pages from my site. I have not created the pages per say. Their may be a an internal link on a page that links to the page causing the duplication. Do i remove the internal link here is a sample of a duplicate page http://www.ticketplatform.com/about/ticket-industry-news-details/11-03-07/Ticket_Platform_to_help_LilysProject_com_to_raise_money_for_ALYN_Hospital_in_Israel.aspx?ReturnURL=%2fabout%2fticket-industry-news.aspx http://www.ticketplatform.com/about/ticket-industry-news-details/11-03-07/Ticket_Platform_to_help_LilysProject_com_to_raise_money_for_ALYN_Hospital_in_Israel.aspx?ReturnURL=%2fhome.aspx&CntPageID=1 I know the url is way too long. working on it Thanks for your feedbacks.
Technical SEO | | ticketplatform0