Webmaster Tools crawl errors caused by Joomla menu structure
-
Webmaster Tools is reporting crawl errors for pages that do not exist because of how my Joomla menu system works. For example, I have a menu item named "Service Area" that holds 3 sub-items but has no actual page of its own. This results in URLs like domain.com/service-area/service-page.html
Because the Service Area menu item is constructed in a way that makes it look like a link to the bot, I am getting a 404 error saying it can't find domain.com/service-area/ (the link points to "javascript:;"). Note, the error doesn't say domain.com/service-area/javascript:; it just says /service-area/
What is the best way to handle this? Can I do something in robots.txt to tell the bot that /service-area/ itself should be ignored, but any page after /service-area/ is good to go? Should I just mark them as fixed, since it's really not a 404 a human will encounter, or is it best to somehow explain this to the bot? I was advised on the Google forums to try the following, but I'm nervous about it:
Disallow: /service-area/*
Allow: /service-area/summerlin-pool-service.
Allow: /service-area/north-las-vegas
Allow: /service-area/centennial-hills-pool-service
I tried a 301 redirect of /service-area to the home page, but then it pulls that segment out of the URL and my landing pages become 404s.
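For background on why rules like these can work: when both an Allow and a Disallow rule match a URL, Google picks the most specific (longest) matching pattern, which is why the child-page Allow lines can override the broader Disallow. A rough Python sketch of that precedence logic, for illustration only (this is not Google's actual implementation):

```python
import re

def most_specific_verdict(path, rules):
    """Approximate Google's robots.txt precedence: of all rules whose
    pattern matches the path, the longest pattern wins, and a tie goes
    to Allow. '*' is a wildcard and '$' anchors the end of the path."""
    verdict = True      # no matching rule means the URL is crawlable
    best_len = -1
    for directive, pattern in rules:
        regex = re.escape(pattern).replace(r"\*", ".*").replace(r"\$", "$")
        if re.match(regex, path):
            allowed = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allowed):
                best_len = len(pattern)
                verdict = allowed
    return verdict

rules = [
    ("Disallow", "/service-area/*"),
    ("Allow", "/service-area/north-las-vegas"),
]

print(most_specific_verdict("/service-area/", rules))                 # False: parent blocked
print(most_specific_verdict("/service-area/north-las-vegas", rules))  # True: child allowed
```

Under this logic, /service-area/ itself matches only the Disallow rule and is blocked, while the longer Allow patterns win for the listed child pages.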
http://www.lvpoolcleaners.com/
Thanks for any advice!
Derrick
-
No problem Derrick, my pleasure.
Tom
-
Wow, Tom, thank you for the amazingly complete and well-articulated response. You, kind sir, are an interwebs rock star!
-
Hi Derrick,
If you wish to use robots.txt you could simply use:
Allow: /service-area/*
Disallow: /service-area/
This will allow access to any child of /service-area/ but not /service-area/ itself.
You could redirect this page to your homepage if you wished. To stop children of this page being redirected, use RedirectMatch instead of the Redirect directive, with a simple regular expression that only redirects when the URI ends with /service-area/, like this:
RedirectMatch 301 /service-area/?$ http://www.lvpoolcleaners.com/
The $ at the end tells Apache to redirect only if the URI ends with that pattern, and the ? after the trailing / allows the redirect to fire with or without the trailing slash.
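If you want to sanity-check the pattern before deploying it, the same regular expression can be exercised with Python's re module (Apache's RedirectMatch uses PCRE, but this particular pattern behaves identically in both):

```python
import re

# The RedirectMatch pattern: matches the path with or without a
# trailing slash, but only when it ends there.
pattern = re.compile(r"/service-area/?$")

print(bool(pattern.search("/service-area")))                   # True: redirects
print(bool(pattern.search("/service-area/")))                  # True: redirects
print(bool(pattern.search("/service-area/north-las-vegas")))   # False: child left alone
```

Like RedirectMatch, the check here is unanchored at the start of the path, so only the `$` anchor constrains where the match may occur.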
But perhaps the simplest solution to this problem would be making your /service-area/ link point to '#', if the Joomla menu will allow it. This appends an empty anchor to the URL; it will not refresh or redirect the page, and anchors in URLs are not counted as duplicate URLs.
For human usability this would be the nicest way to interact with the menu, as you don't want a visitor interrupted mid-way through their buying cycle by being sent back to the homepage when they didn't ask for it.
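As a side note on why the '#' approach is harmless from a crawling standpoint: the fragment is never sent to the server, so every variant of a URL with a different (or empty) anchor resolves to the same document. A quick illustration with Python's standard library (the example.com URLs are placeholders):

```python
from urllib.parse import urldefrag

# All three hrefs point at the same fetchable document; only the
# client-side fragment differs, so no duplicate URLs are created.
for href in ("http://example.com/page",
             "http://example.com/page#",
             "http://example.com/page#services"):
    base, fragment = urldefrag(href)
    print(base)   # http://example.com/page every time
```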
Hope that helps!
Related Questions
-
Googlebot crawl error Javascript method is not defined
Hi All, I have this problem that has been a pain in the ****: I get tons of crawl errors in my logs from "Googlebot" saying a specific Javascript method does not exist. I then go to the affected page, test in a web browser, and the page works without any Javascript errors. Can someone help with resolving this issue? Thanks in advance.
Technical SEO | FreddyKgapza
-
Missing xml tag error
Our XML sitemap is divided up into many smaller XML sitemaps so we have fewer products per sitemap, in order to easily identify errors. A couple of weeks ago, we changed our XML sitemap by reordering some of the products. However, this has left some old XML sitemaps without any data, and they no longer appear in our XML sitemap. But Google is still requesting these sitemaps since they once existed, and it reports errors since it can't locate them. Should we 404 those XML sitemaps, or is there a better way to handle this?
Technical SEO | ang
-
404 errors
Hi, I am getting these showing up in WMT crawl errors; any help would be very much appreciated:
| 1 | ?escaped_fragment=Meditation-find-peace-within/csso/55991bd90cf2efdf74ec3f60 | 404 | 12/5/15 |
| 2 | mobile/?escaped_fragment= | 404 | 10/26/15 |
| 3 | ?escaped_fragment=Tips-for-a-balanced-lifestyle/csso/1 | 404 | 12/1/15 |
| 4 | ?escaped_fragment=My-favorite-yoga-spot/csso/5598e2130cf2585ebcde3b9a | 404 | 12/1/15 |
| 5 | ?escaped_fragment=blog/c19s6 | 404 | 11/29/15 |
| 6 | ?escaped_fragment=blog/c19s6/Tag/yoga | 404 | 11/30/15 |
| 7 | ?escaped_fragment=Inhale-exhale-and-once-again/csso/2 | 404 | 11/27/15 |
| 8 | ?escaped_fragment=classes/covl | 404 | 10/29/15 |
| 9 | m/?escaped_fragment= | 404 | 10/26/15 |
| 10 | ?escaped_fragment=blog/c19s6/Page/1 | 404 | 11/30/15 |
Technical SEO | ReSEOlve
-
Folder Hierarchy Structure Theory
Hi, I was wondering if search engines, in particular Google, actually use folder hierarchy to determine how important a particular page on a website might be for ranking purposes, or whether only on-site page inter-linking is taken into consideration. I know that external and internal links help to support the authority or 'page rank' of a particular webpage on a website.
In a typical Wordpress installation, for example, it is easy to create a page and assign child pages to support it. These sub-pages would naturally link to their parent pages via menu and/or body links, so they would theoretically 'support' the authority of the parent folder/page. My question is: would search engines see the parent folder page as more authoritative than a child page, even without a lot of on-site interlinking of child and parent pages, just because it is higher up in the folder structure?
For example, I have a client who has a Wordpress website but is using a plugin to make all pages have a .htm ending. The site is fairly flat, hierarchically speaking, and does not use any /folders/, but the pages are inter-linked. In the following scenario, there are 4 testimonial pages: 1 main one and 3 supporting pages. The 3 supporting pages are linked to from the parent page and vice versa.
/testimonials.htm
/testimonials-quality.htm
/testimonials-price.htm
/testimonials-ease.htm
I was wondering if it is worth suggesting to my client that we remove that plugin so that we can more easily employ the natural folder hierarchy functions of Wordpress, such as in this scenario:
/testimonials/
/testimonials/quality/
/testimonials/price/
/testimonials/ease/
Would the loss of 'link juice' due to redirects, and the work that would be involved, be worth the possible ranking increases from structuring the website better, or are we fine relying on the existing page interlinking to show the search engines which parent pages are important?
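For what it's worth, the flat-to-folder migration described above is regular enough to express as a single rewrite rule. Here is a hypothetical sketch of that mapping in Python regex form; the real site would use an Apache or WordPress redirect rule, and the URL names are just the examples from the question:

```python
import re

# Hypothetical mapping: /testimonials-quality.htm -> /testimonials/quality/
# The parent page (/testimonials.htm) has no hyphen, so it would need
# its own separate rule and is left untouched here.
def to_folder_url(path):
    return re.sub(r"^/testimonials-([a-z]+)\.htm$", r"/testimonials/\1/", path)

for old in ("/testimonials-quality.htm",
            "/testimonials-price.htm",
            "/testimonials-ease.htm"):
    print(to_folder_url(old))   # /testimonials/quality/, /price/, /ease/
```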
Technical SEO | OrionGroup
-
Webmaster tools
Hello, my sites are showing odd "links to your site" data in WMT. It's not showing any links to the homepages, and reduced links for other pages. Anyone else seeing this? Penguin refresh, maybe?
Technical SEO | jwdl
-
Multilingual blogs and site structure
Hi everyone, I have a question about multilingual blogs and site structure. Right now, we have the typical subfolder localization structure, e.g.:
domain.com/page (English site)
domain.com/ja/page (Japanese site)
However, the blog is slightly more complicated. We'd like to have English posts available in other languages (as many of our users are bilingual). The current structure suggests we use a typical domain.com/blog or domain.com/ja/blog format, but we have issues if a Japanese (logged-in) user wants to view an English page: domain.com/blog/article would redirect them to domain.com/ja/blog/article, thus 404-ing the user if the post doesn't exist in the alternate language.
One suggestion (that I have seen on sites such as Etsy/Spotify) is to add an /en/ subfolder for the blog, e.g.:
domain.com/en/blog
domain.com/ja/blog
Would this be the correct way to avoid the issue? I know we could technically work around the 404 issue, but I don't want to create duplicate posts in /ja/ that are in English or vice versa. Would it affect the rest of the site if we use an /en/ subfolder just for the blog? Another option is to use:
domain.com/blog/en
domain.com/blog/ja
but I'm not sure if this alternative is better. Any help would be appreciated!
Technical SEO | Seiyav
-
RSS Feed Errors in Google
We recently (2 months ago) launched RSS feeds for the category pages on our site. Last week we started seeing errors in Webmaster Tools' Crawl Errors report for feeds of old pages that have been deleted from the site, removed from the sitemap, and absent from Google's index since long before we launched the RSS feeds. Example: www.mysite.com/super-old-page/feed/. I checked, and both the URL for the feed and the URL for the actual page return 404 statuses. www.mysite.com/super-old-page/ is also showing up in our Crawl Errors; it's been deleted for months, but Webmaster Tools is very slow to remove the page from the Crawl Error report. Where is Google finding these feeds that never existed?
Technical SEO | Hakkasan
-
Crawl Errors
Okay, I was just in my Google Webmaster Tools looking at some of the stats. Google says I have 1354 "not found" pages. Many of these URLs are bizarre, and I don't know what they are; others I do know. What should I do about this, especially the URLs I don't even recognize?
Technical SEO | azguy