How to deal with old, indexed hashbang URLs?
-
I inherited a site that used to be in Flash and used hashbang URLs (e.g. www.example.com/#!page-name-here). We're now off Flash and have a "normal" URL structure that looks something like this: www.example.com/page-name-here
Here's the problem: Google still has thousands of the old hashbang (#!) URLs in its index. These URLs still work because the web server never sees anything that comes after the hash. So, when someone requests www.example.com/#!page-name-here, the server basically renders the page for www.example.com/ while the browser keeps the full URL intact (www.example.com/#!page-name-here). Hopefully, that makes sense. So, in Google you'll see this URL indexed (www.example.com/#!page-name-here), but if you click it you are essentially taken to our homepage content (even though the URL isn't exactly the canonical homepage URL, which should be www.example.com/).
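To make that concrete, here's a minimal TypeScript sketch of how a hashbang URL splits into the part that goes to the server and the fragment that stays in the browser. The helper name `splitFragment` is made up for illustration and isn't from any library.

```typescript
// Hypothetical helper (not from any library): split a URL into the part the
// browser sends to the server and the fragment it keeps to itself.
function splitFragment(url: string): { sentToServer: string; fragment: string } {
  const hashIndex = url.indexOf("#");
  if (hashIndex === -1) {
    // No "#" at all: the whole URL is requested and there is no fragment.
    return { sentToServer: url, fragment: "" };
  }
  return {
    sentToServer: url.slice(0, hashIndex), // everything before the "#"
    fragment: url.slice(hashIndex),        // the "#" and everything after it
  };
}

const parts = splitFragment("www.example.com/#!page-name-here");
console.log(parts.sentToServer); // www.example.com/
console.log(parts.fragment);     // #!page-name-here
```

Every hashbang URL on the site produces the same `sentToServer` value, which is why the server returns the homepage for all of them.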
My big fear here is a duplicate content penalty for our homepage. Essentially, I'm afraid that Google is seeing thousands of versions of our homepage. Even though the hashbang URLs are different, the content (i.e. title, meta description, page content) is exactly the same for all of them. Obviously, this is a typical SEO no-no. And I've recently seen the homepage drop like a rock for a search of our brand name, which had ranked #1 for months. Now, admittedly we've made a bunch of changes during this whole site migration, but this #! URL problem just bothers me. I think it could be a major cause of our homepage tanking for brand queries.
So, why not just 301 redirect all of the #! URLs? Well, the server won't accept traditional 301s for the #! URLs because the # seems to screw everything up (server doesn't acknowledge what comes after the #).
I "think" our only option here is to try and add some redirects via JavaScript to stand in for the 301s. Yeah, I know that spiders have a love/hate (well, mostly hate) relationship with JavaScript, but I think that's our only resort... unless someone here has a better way?
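A rough TypeScript sketch of what that client-side redirect might look like. It assumes every old #!page-name maps directly to /page-name on the new site (true for the example URLs above, but something to verify for real pages), and `hashbangToPath` is a made-up helper name. Note that `location.replace` is a browser-side redirect, not a true HTTP 301, so there's no guarantee crawlers will treat it the same way.

```typescript
// Hypothetical helper: turn "#!page-name-here" into "/page-name-here".
// Returns null when the hash is not a hashbang, so normal anchors are left alone.
function hashbangToPath(hash: string): string | null {
  if (hash.slice(0, 2) !== "#!") {
    return null;
  }
  // Drop the "#!" prefix and any leading slashes (e.g. "#!/page-name").
  const page = hash.slice(2).replace(/^\/+/, "");
  return page === "" ? null : "/" + page;
}

// Browser-only part: run on the homepage, which is what the server returns for
// every hashbang URL. The guard keeps the snippet harmless outside a browser.
const g = globalThis as any;
if (g.document && g.location) {
  const target = hashbangToPath(g.location.hash);
  if (target !== null) {
    // replace() swaps the current history entry, so the old #! URL
    // does not linger in the visitor's back button.
    g.location.replace(target);
  }
}
```

With this in place, a visit to www.example.com/#!page-name-here would land on www.example.com/page-name-here, while an ordinary anchor such as #History would be left untouched.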
If you've dealt with hashbang URLs before, I'd LOVE to hear your advice on how to deal with this issue.
Best,
-G
-
Celts,
Did you ever resolve this? What you were discussing back in 2012 is called a "hashbang", and you can learn more about it here on Google. It is technically a way to get AJAX-loaded pages indexed on their own URL.
You asked this question a couple of years ago, and things have changed since then: HTML5 pushState is now preferred over hashbangs, and the recommendation is still to avoid loading a page's content with AJAX when possible.
-
Thanks for your answer. Yeah, I've seen the hash tag function as you've described it when being used for named anchors. However, in my case, Google IS indexing the URLs that contain the #! and it is also grabbing my homepage's title and using it in the SERPs on those results. So, given that that's happening, I'm concerned that the #! IS hurting me in this case.
In thinking more about this, I think what I'll do is put a canonical tag on the homepage and that should hopefully provide the extra guidance/insurance that I need to tell spiders that there is only ONE version of the homepage.
-
Google ignores the hash tag when indexing URLs. You can offer your home page with various versions of hash tags appended to the end of the URL and Google will not mind a bit. It will not cause any issue for SEO.
A few more notes:
- Hash tags are used in HTML as an on-page anchor. Wikipedia is a good example. Take a look at the following page: http://en.wikipedia.org/wiki/Guitar. If you hover over the HISTORY link in the Table of Contents at the top of the page, notice the URL for the HISTORY link is http://en.wikipedia.org/wiki/Guitar#History. When you click the link, you remain on the same page but move to the History part of the page.
If you search Google.com for "Guitar History" you will notice the Wiki page is listed first (see attachment). The URL offered by Google is the page URL without any hash tag. Google does offer the ability to "Jump to History", which includes the hash tag link. That is a benefit of using anchors on a page. Otherwise Google does not take the hash tag nor anything after it into account when indexing pages.
Rand offers a short video on this exact topic: http://www.seomoz.org/blog/whiteboard-friday-using-the-hash
I am not familiar with the exclamation point (bang) being used after the hash tag outside of Twitter. The standard Twitter URLs use it.
Summary - the hashbang is not the reason for your recent drop in rankings.
I am unclear what you mean by "Google still has thousands of the old hashbang (#!) URLs in its index." Can you share an example?