Help, a certain directory is not being indexed
-
Before I start, don't expect this to be too easy. This really has me puzzled, and I'm surprised I have yet to find a solution for it. Get ready.
We have a WordPress website that launched over 6 months ago, and we have never had an issue getting content such as pages, posts and categories indexed. However, somewhat recently (about 2 months ago) I installed a directory plugin (Business Directory Plugin), which lists businesses via unique URLs that are accessible from a subfolder. It's these business listings that I absolutely cannot get indexed.
The index page of the directory, which links to the business pages, is indexed, but for some reason Google is not indexing the listing pages linked from it. I don't think the content is uncrawlable: when I run crawlers such as XML sitemap generators on my site, they find all the pages, including the directory pages, so I am fairly sure the search engines can find the content.
I have created XML sitemaps and submitted them to Webmaster Tools. It recognises that there are many pages in the XML sitemap, but Google continues to index only a small percentage (everything but my business listings).
The directory has been there for about 8 weeks now, so I know there is an issue, as it should have been indexed by now.
See our main website at www.smashrepairbid.com.au and the business directory index page at www.smashrepairbid.com.au/our-shops/
To throw in a curve ball: while looking into this issue and setting up Webmaster Tools, we noticed a lot of 404 error pages (nearly 4,000). We were very confused about where these were coming from, as they were only being generated by search engines. Humans could not reach the 404s, so we are guessing the search engines were firing some JavaScript code to generate them, or something else weird. We could see the 404s in the logs, so we know they were legitimate, but again we feel it was only search engines. This was validated when we added some rules to robots.txt and saw the errors in the logs stop. We put the rules in the robots.txt file to try to stop Google from indexing the 404 pages, as we could not find any way to fix the site or code (we have no idea what is causing them). If you do a site: search in Google, you will see all the pages that are omitted from the results.
Since adding the rules to robots.txt, our impressions shown in Webmaster Tools have jumped right up (a 5x increase), so we thought this was a good indication of improvement, but we are still not getting the results we want.
Does anyone have any clue what's going on, or why Google and other search engines are not indexing this content? Any help would be greatly appreciated, and if you need any other information to assist, just ask me.
I really appreciate anyone who can spare the time to help me. I sure do need it.
Thanks.
-
OK issue resolved!
Lynn, thank you. It was the relative URL in the canonical tag that played havoc. Changing it to an absolute URL is now causing the pages to be indexed.
Lesson learnt.
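For anyone who lands here with the same problem, here is a rough Python sketch of how you could check whether a page's canonical tag is relative, and what it would resolve to. The helper names (`CanonicalFinder`, `check_canonical`) and the relative `href` value in the demo are made up for illustration. I am not claiming this is the exact value the plugin produced, only showing how a relative canonical can end up pointing somewhere unexpected.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.href is None:
            self.href = attr.get("href")


def check_canonical(page_url, html):
    """Report whether the canonical href is absolute and what it resolves to."""
    finder = CanonicalFinder()
    finder.feed(html)
    href = finder.href
    if href is None:
        return {"href": None, "is_absolute": False, "resolved": None}
    parsed = urlparse(href)
    # An absolute URL carries both a scheme (http/https) and a host.
    is_absolute = bool(parsed.scheme and parsed.netloc)
    # urljoin resolves a relative href against the page URL, as a browser would.
    return {"href": href, "is_absolute": is_absolute, "resolved": urljoin(page_url, href)}


if __name__ == "__main__":
    page = "http://www.smashrepairbid.com.au/our-shops/1263/bakker-towing/"
    # Assumed markup: the relative href below is an invented example.
    relative_html = '<head><link rel="canonical" href="our-shops/1263/bakker-towing/" /></head>'
    print(check_canonical(page, relative_html))
```

If `is_absolute` comes back `False`, compare `resolved` with the URL you actually want indexed. If they differ, the canonical tag is pointing at a different page than you intended.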
-
Hey Kane,
The /shops URL was an old URL that had a directory in it. We blocked it in robots.txt as it was generating tons of 404 errors. In Webmaster Tools we can see thousands of 404 errors within that directory, so we deleted it all and tried to block the search engines from throwing the errors (like I described in my initial post).
A number of those listings do have very little information; however, there are a bunch that do have great content, which is why I am not sure that is the cause. I will keep an eye on this though, and I will also check the logs and let you know what they say.
-
Thanks Lynn.
I have taken your recommendation and changed the canonical tag to be absolute. Thanks for your help. We will see how it goes.
-
As Lynn said, relative canonical tags could absolutely cause issues. That said, I'm seeing absolute URLs in the canonical tag now, so you may have fixed that in the past few days.
Also, I do see the Our Shops pages indexed when I search for site:smashrepairbid.com.au, but I don't see any other pages in the /our-shops/ directory aside from www.smashrepairbid.com.au/our-shops/?action=search
Your robots.txt is currently blocking /shops/. I don't think that would cause an issue, but it would be nice to remove that rule if it's not needed...
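If you want to double check that the /shops/ rule is not catching the /our-shops/ listings, something like the Python sketch below would do it with the standard library's `urllib.robotparser`. The robots.txt content here is an assumption (a single `Disallow: /shops/` rule for all user agents), since I have not copied the real file, and the `/shops/some-old-listing/` URL is invented for the comparison.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; the real file may contain more rules.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /shops/
"""


def is_allowed(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    listing = "http://www.smashrepairbid.com.au/our-shops/1263/bakker-towing/"
    old_url = "http://www.smashrepairbid.com.au/shops/some-old-listing/"  # invented path
    print("listing allowed:", is_allowed(ASSUMED_ROBOTS_TXT, listing))
    print("old /shops/ URL allowed:", is_allowed(ASSUMED_ROBOTS_TXT, old_url))
```

Disallow rules match from the start of the URL path, so a rule for /shops/ should not apply to a path that begins with /our-shops/. It is still worth running the check against the real file in case there are other rules in it.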
There's almost zero content on the pages I glanced at, e.g. http://www.smashrepairbid.com.au/our-shops/1263/bakker-towing/ and http://www.smashrepairbid.com.au/our-shops/1616/coastal-towing-service/. When you look at it from Google's perspective, there's very little value being added by these pages. No unique photos, no phone number, no website, etc. There are a million local business scrapers that have more content than this, so why should they bother indexing these pages?
Try pulling up your logs and seeing if these URLs have been requested by Google's spiders. Here's a good guide from Ian Lurie on how to do that in Excel: http://www.portent.com/blog/analytics/how-to-read-a-web-site-log-file.htm
If the spiders are crawling those shop URLs but aren't indexing them, I think the first thing to do is add way more content to the pages.
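If Excel is not your thing, the same log check can be sketched in a few lines of Python. This assumes your server writes the common "combined" access log format (the Apache and Nginx default). The sample log lines, IP addresses and the `googlebot_hits` name are all made up for illustration. Keep in mind that the user-agent string can be spoofed, so treat this as a first pass and not as proof that the requests came from Google.

```python
import re
from collections import Counter

# Combined log format:
# host ident user [time] "METHOD path PROTOCOL" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Invented sample lines, only here to show the expected shape of the input.
SAMPLE_LINES = [
    '66.249.66.1 - - [10/Oct/2013:13:55:36 +1000] "GET /our-shops/1263/bakker-towing/ HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2013:13:56:01 +1000] "GET /our-shops/ HTTP/1.1" '
    '200 7300 "-" "Mozilla/5.0 (Windows NT 6.1)"',
    '66.249.66.1 - - [10/Oct/2013:13:57:12 +1000] "GET /shops/old-page HTTP/1.1" '
    '404 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]


def googlebot_hits(lines, prefix="/our-shops/"):
    """Count requests per (path, status) where the user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the combined format
        if "googlebot" not in match.group("agent").lower():
            continue
        if match.group("path").startswith(prefix):
            hits[(match.group("path"), match.group("status"))] += 1
    return hits


if __name__ == "__main__":
    for (path, status), count in googlebot_hits(SAMPLE_LINES).items():
        print(count, status, path)
```

To run it against a real log, pass an open file in place of `SAMPLE_LINES`. Listing URLs that show up with a 200 status but are still not in the index would point toward a content or canonical problem and away from a crawling problem.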
-
Hi Trent,
Having a quick look, I saw that you have relative URLs in your canonical tag, and this could be problematic. I think it would be worth making those URLs absolute to avoid any confusion on Google's part in determining which page or page version should be indexed.
Cannot say for sure if this is the problem, but worth looking into.
Hope that helps!