My site is not being regularly crawled?
-
My site used to be crawled regularly, but not anymore. My pages still aren't showing up in the index months after going live, even though I've added them to the sitemap. I now have to submit them through Webmaster Tools to get them indexed, and even then they don't really rank.
Before you go spouting off the standard SEO resolutions...
- Yes, I checked for crawl errors in Google Webmaster Tools and no, there aren't any issues
- No, the pages are not noindex. These pages are index,follow
- No, the pages are not canonicalized to another URL
- No, the robots.txt does not block any of these pages
- No, there is nothing funky going on in my .htaccess. The pages load fine
- No, I don't have any URL parameters set
What else could be interfering?
Here is one of the URLs that wasn't crawled for over a month: http://www.howlatthemoon.com/locations/location-st-louis
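For anyone who wants to repeat the robots.txt, noindex, and canonical checks from the list above, here is a minimal sketch using only the Python standard library. The function and class names (`indexability_report`, `HeadSignals`) are made up for illustration, and the sample robots.txt and HTML in the usage section are hypothetical stand-ins, not the site's real files. It only inspects text you pass in; it does not say anything about how Google actually schedules crawls.

```python
from html.parser import HTMLParser
from urllib import robotparser


class HeadSignals(HTMLParser):
    """Collect the robots meta directive and the canonical URL from a page."""

    def __init__(self):
        super().__init__()
        self.robots = None
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and (a.get("name") or "").lower() == "robots":
            self.robots = (a.get("content") or "").lower()
        elif tag == "link" and "canonical" in (a.get("rel") or "").lower().split():
            self.canonical = a.get("href")


def indexability_report(url, html, robots_txt, user_agent="Googlebot"):
    """Return the three checks from the list above as booleans (hypothetical helper)."""
    rules = robotparser.RobotFileParser()
    rules.parse(robots_txt.splitlines())

    signals = HeadSignals()
    signals.feed(html)

    # Split "noindex, follow" style content into individual directives.
    directives = {d.strip() for d in (signals.robots or "").split(",")}
    canonical = signals.canonical
    return {
        "allowed_by_robots": rules.can_fetch(user_agent, url),
        "noindex": "noindex" in directives,
        "canonical_elsewhere": canonical is not None
        and canonical.rstrip("/") != url.rstrip("/"),
    }


if __name__ == "__main__":
    # Hypothetical inputs: paste in the real robots.txt and page source instead.
    page_url = "http://www.howlatthemoon.com/locations/location-st-louis"
    sample_robots = "User-agent: *\nDisallow: /wp-admin/\n"
    sample_html = (
        '<head><meta name="robots" content="index,follow">'
        '<link rel="canonical" href="' + page_url + '"></head>'
    )
    print(indexability_report(page_url, sample_html, sample_robots))
```

A page passes the checklist when `allowed_by_robots` is true and the other two values are false.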
-
Google may have reduced its crawl budget for your site if it found enough pages to be low quality, thin, duplicate, etc. Here are some examples that you can probably apply a noindex tag to in order to reduce crawl budget waste on them.
Google these:
site:howlatthemoon.com/ inurl:tag
1,360 indexed pages that don't need to be in the SERPs
site:howlatthemoon.com/ inurl:tag page
Even if you want the tag pages indexed, you don't need their paginated pages indexed too
site:howlatthemoon.com/ inurl:page inurl:category
A few dozen category pagination pages in the SERPs, many more on the site that have been booted from the index
Example:
http://www.howlatthemoon.com/dueling_piano_bar/category/nightlife/page/2/
I would install the Yoast WordPress SEO plugin, which should fix these in the following ways:
Use rel canonical tags
Use rel Next/Prev tags
Noindex tag pages
etc...
-
That page is indexed. Can you give me an example of a page that hasn't been indexed yet? Your onsite SEO seems fine, except that you should get rid of your meta keywords tag and your meta description is a bit long.
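The two on-page points above (a leftover meta keywords tag and an overly long meta description) are easy to spot-check with a script. This is a minimal sketch using Python's standard `html.parser`; the names `MetaTagAudit` and `audit_meta` are hypothetical, and the 160-character threshold is an assumption based on a common rule of thumb, not a limit published by any search engine.

```python
from html.parser import HTMLParser

# Assumption: a rule-of-thumb cutoff, adjust to taste.
MAX_DESCRIPTION_CHARS = 160


class MetaTagAudit(HTMLParser):
    """Record the content of the meta keywords and meta description tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        a = dict(attrs)
        name = (a.get("name") or "").lower()
        if name in ("keywords", "description"):
            self.meta[name] = a.get("content") or ""


def audit_meta(html, max_description=MAX_DESCRIPTION_CHARS):
    """Return a list of human-readable findings for one page's HTML."""
    parser = MetaTagAudit()
    parser.feed(html)

    findings = []
    if "keywords" in parser.meta:
        findings.append("meta keywords tag present")
    description = parser.meta.get("description", "")
    if len(description) > max_description:
        findings.append(
            f"meta description is {len(description)} chars (over {max_description})"
        )
    return findings
```

Calling `audit_meta(page_source)` on the saved HTML of a page returns an empty list when neither issue is present.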
-
Oh, sorry! Here is one of the pages that hasn't been crawled since it was posted in early November.
-
You forgot one... what is the URL of your site?