Help, a certain directory is not being indexed
-
Before I start, don't expect this to be easy. This one really has me puzzled, and I'm surprised I still haven't found a solution. Get ready.
We have a WordPress website that launched over 6 months ago, and we have never had an issue getting content such as pages, posts and categories indexed. However, about 2 months ago I installed a directory plugin (Business Directory Plugin), which lists businesses via unique URLs that are accessible from a subfolder. It's these business listings that I absolutely cannot get indexed.
The directory's index page, which links to the business pages, is indexed, but for some reason Google is not indexing the listing pages linked from it. I don't think the content is uncrawlable: when I run crawlers on my site, such as XML sitemap crawlers, they find all the pages, including the directory pages, so I am fairly sure the search engines can find the content.
I have created XML sitemaps and uploaded them to Webmaster Tools. It recognises that there are many pages in the sitemap, but Google continues to index only a small percentage (everything but my business listings).
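For anyone wanting to confirm that the listing URLs really are in the sitemap, a minimal Python sketch along these lines may help. The sample sitemap below is hypothetical (only the site URLs come from this thread), and the helper name `urls_under` is made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_under(sitemap_xml, prefix):
    """Return every <loc> URL in the sitemap whose path starts with prefix."""
    root = ET.fromstring(sitemap_xml)
    locs = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [url for url in locs if urlparse(url).path.startswith(prefix)]


# Hypothetical sitemap content, not the site's real file.
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.smashrepairbid.com.au/</loc></url>
  <url><loc>http://www.smashrepairbid.com.au/our-shops/</loc></url>
  <url><loc>http://www.smashrepairbid.com.au/our-shops/1263/bakker-towing/</loc></url>
</urlset>"""

directory_urls = urls_under(sample_sitemap, "/our-shops/")
print(len(directory_urls), "directory URLs found in the sitemap")
```

Comparing that count with the number of indexed pages reported for the directory would show how large the gap actually is.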
The directory has been there for about 8 weeks now, so I know there is an issue, as it should have been indexed by now.
See our main website at www.smashrepairbid.com.au and the business directory index page at www.smashrepairbid.com.au/our-shops/
To throw in a curve ball: while looking into this issue and setting up Webmaster Tools, we noticed a lot of 404 errors (nearly 4,000). We were very confused about where these were coming from, as they were only being generated by search engines. Humans could not reach the 404s, so we are guessing the search engines were firing some JavaScript code to generate them, or something else weird. We could see the 404s in the logs, so we know they were legitimate, but again we feel it was only search engines. This was validated when we added some rules to robots.txt and saw the errors in the logs stop. We put the rules in robots.txt to try to stop Google from indexing the 404 pages, as we could not find any way to fix the site or code (we have no idea what is causing them). If you do a site: search in Google you will see all the pages that are omitted from the results.
Since adding the rules to robots.txt, our impressions shown in Webmaster Tools have jumped right up (a fivefold increase), so we thought this was a good indication of improvement, but we are still not getting the results we want.
Does anyone have any clue what's going on, or why Google and other search engines are not indexing this content? Any help would be greatly appreciated, and if you need any other information to assist, just ask.
I really appreciate anyone who can spare their time to help me. I sure do need it.
Thanks.
-
OK issue resolved!
Lynn, thank you. It was the relative URL in the canonical tag that played havoc. Changing it to absolute is now causing the pages to be indexed.
Lesson learnt.
-
Hey Kane,
The /shops URL was an old URL that had a directory in it. We blocked it in robots.txt as it was generating tons of 404 errors. In Webmaster Tools we can see thousands of 404 errors within that directory, so we deleted it all and tried to block search engines from throwing the errors (as I described in the initial post).
A number of those listings do have very little information; however, there are a bunch that do have great content, which is why I am not sure that is the cause. I will keep an eye on this, though, and also check the logs and let you know what they say.
-
Thanks Lynn.
I have taken your recommendation and changed the canonical tag to be absolute. Thanks for your help; we will see how it goes.
-
As Lynn said, relative canonical tags could absolutely cause issues. That said, I'm seeing absolute URLs in the canonical tag now, so you may have fixed that in the past few days.
Also, I do see the Our Shops pages indexed when I search for site:smashrepairbid.com.au, but I don't see any other pages in the /our-shops/ directory aside from www.smashrepairbid.com.au/our-shops/?action=search
Your robots.txt is currently blocking /shops/. I don't think that would cause an issue, but it would be nice to remove that if it's not needed...
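As a quick sanity check that a `/shops/` rule does not also catch `/our-shops/`, here is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt content is assumed (only the /shops/ block is mentioned in this thread), and the old `/shops/` URL is hypothetical. Note that this parser does plain prefix matching and does not implement Google's wildcard extensions, so it is only an approximation of how Googlebot reads the file.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; the real file may contain other rules.
robots_txt = """\
User-agent: *
Disallow: /shops/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def is_allowed(url, agent="Googlebot"):
    """Return True if the parsed rules let this user agent fetch the URL."""
    return parser.can_fetch(agent, url)


listing_url = "http://www.smashrepairbid.com.au/our-shops/1263/bakker-towing/"
old_url = "http://www.smashrepairbid.com.au/shops/old-listing/"  # hypothetical

print("listing allowed:", is_allowed(listing_url))
print("old /shops/ URL allowed:", is_allowed(old_url))
```

Because the rule is matched against the start of the path, `/our-shops/...` is untouched by a `/shops/` disallow under these assumptions.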
There's almost zero content on the pages I glanced at, eg. http://www.smashrepairbid.com.au/our-shops/1263/bakker-towing/ and http://www.smashrepairbid.com.au/our-shops/1616/coastal-towing-service/. When you look at it from Google's perspective, there's very little value being added by these pages. No unique photos, no phone number, no website, etc. There's a million local business scrapers that have more content than this, so why should they bother indexing these pages?
Try pulling up your logs and seeing if these URLs have been requested by Google's spiders. Here's a good guide from Ian Lurie on how to do that in Excel: http://www.portent.com/blog/analytics/how-to-read-a-web-site-log-file.htm
If the spiders are crawling those shop URLs but aren't indexing them, I think the first thing to do is add way more content to the pages.
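If Excel is not handy, the same log check can be roughed out in Python. This is only a sketch: it assumes the server writes the common "combined" log format, the sample lines (IPs, timestamps, sizes) are invented for illustration, and matching on the user-agent string alone does not prove a request genuinely came from Google.

```python
import re
from collections import Counter

# Pattern for the "combined" access log format (an assumption about the server).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines, prefix="/our-shops/"):
    """Count (path, status) pairs requested under prefix by a Googlebot user agent."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        if "Googlebot" in match.group("agent") and match.group("path").startswith(prefix):
            hits[(match.group("path"), match.group("status"))] += 1
    return hits


# Invented sample lines, not taken from the site's real logs.
sample_log = [
    '66.249.66.1 - - [10/Oct/2013:13:55:36 +1000] "GET /our-shops/1263/bakker-towing/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2013:13:56:01 +1000] "GET /our-shops/ HTTP/1.1" 200 8042 "-" "Mozilla/5.0 (Windows NT 6.1)"',
    '66.249.66.1 - - [10/Oct/2013:13:57:12 +1000] "GET /shops/old-listing/ HTTP/1.1" 404 312 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

hits = googlebot_hits(sample_log)
for (path, status), count in hits.items():
    print(count, status, path)
```

If listing URLs show up here with a 200 status but still are not indexed, that points away from a crawling problem and toward how the pages are being evaluated.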
-
Hi Trent,
Having a quick look, I saw that you have relative URLs in your canonical tag, and this could be problematic. I think it would be worth making those URLs absolute to avoid any confusion on Google's part in determining which page or page version should be indexed.
Cannot say for sure if this is the problem, but worth looking into.
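To spot this on other pages, something like the following Python sketch could flag canonical tags whose href is not a full URL. The HTML snippets are hypothetical (not copied from the live site), and `CanonicalFinder` and `check_canonical` are illustrative names only.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonicals.append(attr["href"])


def check_canonical(html, page_url):
    """Return (href, is_absolute, resolved_url) for each canonical tag found."""
    finder = CanonicalFinder()
    finder.feed(html)
    results = []
    for href in finder.canonicals:
        parsed = urlparse(href)
        is_absolute = bool(parsed.scheme and parsed.netloc)
        results.append((href, is_absolute, urljoin(page_url, href)))
    return results


page_url = "http://www.smashrepairbid.com.au/our-shops/1263/bakker-towing/"
# Hypothetical markup showing the two variants discussed in this thread.
relative_page = '<head><link rel="canonical" href="/our-shops/1263/bakker-towing/"></head>'
absolute_page = '<head><link rel="canonical" href="' + page_url + '"></head>'

for html in (relative_page, absolute_page):
    for href, is_absolute, resolved in check_canonical(html, page_url):
        print("absolute" if is_absolute else "relative", "->", resolved)
```

A browser-style resolver turns both variants into the same URL here, but writing the full URL in the tag removes any dependence on how a crawler resolves it.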
Hope that helps!