How to deal with old, indexed hashbang URLs?
-
I inherited a site that used to be in Flash and used hashbang URLs (i.e. www.example.com/#!page-name-here). We're now off of Flash and have a "normal" URL structure that looks something like this: www.example.com/page-name-here
Here's the problem: Google still has thousands of the old hashbang (#!) URLs in its index. These URLs still work because the web server doesn't actually read anything that comes after the hash. So, when the web server sees this URL www.example.com/#!page-name-here, it basically renders this page www.example.com/# while keeping the full URL structure intact (www.example.com/#!page-name-here). Hopefully, that makes sense. So, in Google you'll see this URL indexed (www.example.com/#!page-name-here), but if you click it you essentially are taken to our homepage content (even though the URL isn't exactly the canonical homepage URL...which s/b www.example.com/).
My big fear here is a duplicate content penalty for our homepage. Essentially, I'm afraid that Google is seeing thousands of versions of our homepage. Even though the hashbang URLs are different, the content (ie. title, meta descrip, page content) is exactly the same for all of them. Obviously, this is a typical SEO no-no. And, I've recently seen the homepage drop like a rock for a search of our brand name which has ranked #1 for months. Now, admittedly we've made a bunch of changes during this whole site migration, but this #! URL problem just bothers me. I think it could be a major cause of our homepage tanking for brand queries.
So, why not just 301 redirect all of the #! URLs? Well, the server won't accept traditional 301s for the #! URLs because the # seems to screw everything up (server doesn't acknowledge what comes after the #).
I "think" our only option here is to try and add some 301 redirects via Javascript. Yeah, I know that spiders have a love/hate (well, mostly hate) relationship w/ Javascript, but I think that's our only resort.....unless, someone here has a better way?
If you've dealt with hashbang URLs before, I'd LOVE to hear your advice on how to deal w/ this issue.
Best,
-G
-
Celts,
Did you ever resolve this? What you were discussing back in 2012 is called a "hashbang", and you can learn more about it here on Google. It is technically a way to get AJAX-loaded pages indexed on their own URL.
You asked this question a couple of years ago, and things have changed since then with push states and HTML 5 being preferred over hashbangs, and not loading a page's content with AJAX still the recommendation when possible.
-
Thanks for your answer. Yeah, I've seen the hash tag function as you've described it when being used for named anchors. However, in my case, Google IS indexing the URLs that contain the #! and it is also grabbing my homepage's title and using it in the SERPs on those results. So, given that that's happening, I'm concerned that the #! IS hurting me in this case.
In thinking more about this, I think what I'll do is put a canonical tag on the homepage and that should hopefully provide the extra guidance/insurance that I need to tell spiders that there is only ONE version of the homepage.
-
Google ignores the hash tag when indexing URLs. You can offer your home page with various versions of hash tags appended to the end of the URL and Google will not mind a bit. It will not case any issue for SEO.
A few more notes:
- Hash tags are used in HTML as an onpage anchor. Wikipedia is a good example. Take a look at the following page: http://en.wikipedia.org/wiki/Guitar. If you hover over the HISTORY link in the Table of Contents at the top of the page, notice the URL for the HISTORY link is http://en.wikipedia.org/wiki/Guitar#History. When you click the link, you remain on the same page but move to the History part of the page.
If you search Google.com for "Guitar History" you will notice the WIki page is listed first. (see attachment). The URL offered by Google is the page URL without any hash tag. Google does offer the ability to "Jump to History" which includes the hash tag link. That is a benefit to using anchor text on a page. Otherwise Google does not take the hash tag nor anything after it into account when indexing pages.
Rand offers a short video on this exact topic: http://www.seomoz.org/blog/whiteboard-friday-using-the-hash
I am not familiar with the exclamation point (bang) being used after the hash tag outside of twitter. The standard twitter URLs use it.
Summary - the hash bag is not the reason for your recent drop in rankings.
I am unclear what you mean by "Google still has thousands of the old hashbang (#!) URLs in its index." Can you share an example?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Any Tips for Reviving Old Websites?
Hi, I have a series of websites that have been offline for seven years. Do you guys have any tips that might help restore them to their former SERPs glory? Nothing about the sites themselves has changes since they went offline. Same domains, same content, and only a different server. What has changed is the SERPs landscape. I've noticed competitive terms that these sites used to rank on the first page for with far more results now. I have also noticed some terms result in what seems like a thesaurus similar language results from traditionally more authoritative websites instead of the exact phrase searched for. This concerns me because I could see a less relevant page outranking me just because it is on a .gov domain with similar vocabulary even though the result is not what people searching for the term are most likely searching for. The sites have also lost numerous backlinks but still have some really good ones.
Intermediate & Advanced SEO | | CopBlaster.com1 -
Google not indexing images
Hi there, We have a strange issue at a client website (www.rubbermagazijn.nl). Webpage are indexed by Google but images are not, and have never been since the site went live in '12 (We recently started SEO work on this client). Similar sites like www.damenrubber.nl are being indexed correctly. We have correct robots and sitemap setup and directions. Fetch as google (Search Console) shows all images displayed correctly (despite scripted mouseover on the page) Client doesn't use CDN Search console shows 2k images indexed (out of 18k+) but a site:rubbermagazijn.nl query shows a couple of images from PDF files and some of the thumbnails, but no productimages or category images from homepage. (product page example: http://www.rubbermagazijn.nl/collectie/slangen/olie-benzineslangen/7703_zwart_nbr-oliebestendig-6mm-l-1000mm.html) We've changed the filenames from non-descriptive names to descriptive names, without any result. Descriptive alt texts were added We're at a loss. Has anyone encountered a similar issue before, and do you have any advice? I'd be happy to provide more information if needed. CBqqw
Intermediate & Advanced SEO | | Adriaan.Multiply0 -
Dealing with 404s during site migration
Hi everyone - What is the best way to deal with 404s on an old site when you're migrating to a new website? Thanks, Luke
Intermediate & Advanced SEO | | McTaggart0 -
Dealing with thin comment
Hi again! I've got a site where around 30% of URLs have less than 250 words of copy. It's big though, so that is roughly 5,000 pages. It's an ecommerce site and not feasible to bulk up each one. I'm wondering if noindexing them is a good idea, and then measuring if this has an effect on organic search?
Intermediate & Advanced SEO | | Blink-SEO1 -
How to Index Faster?
Hello, I have a new website and updated fresh content regularly. My indexing status is very slow. When I search how to improve my indexing rate by Google, I found most of the members of Moz community replied there is no certain technique to improve your indexing. Apart from this you should keep posting fresh content more and more and wait for Google Indexing. Some of them asked for submitting sitemap and share posts on Twitter, Facebook and Google Plus. Well the above comments are from the year of 2012. I'm curious to know is there any new technique or methods are used to improve indexing rate? Need your suggestions! Thanks.
Intermediate & Advanced SEO | | TopLeagueTechnologies0 -
Index, Nofollow Issue
We are having on our site a couple of pages that we want the page to be indexed, however, we don't want the links on the page to be followed. For example url: http://www.printez.com/animal-personal-checks.html. We have added in our code: . Bing Webmaster Tools, is telling us the following: The pages uses a meta robots tag. Review the value of the tag to see if you are not unintentionally blocking the page from being indexed (NOINDEX). Question is, is the page using the right code as of now or do we need to do any changes in the code, if so, what should we use for them to index the page, but not to follow the links on the page? Please advise, Morris
Intermediate & Advanced SEO | | PrintEZ0 -
HTTPS Certificate Expired. Website with https urls now still in index issue.
Hi Guys This week the Security certificate of our website expired and basically we now have to wail till next Tuesday for it to be re-instated. So now obviously our website is now index with the https urls, and we had to drop the https from our site, so that people will not be faced with a security risk screen, which most browsers give you, to ask if you are sure that you want to visit the site, because it's seeing it as an untrusted one. So now we are basically sitting with the site urls, only being www... My question what should we do, in order to prevent google from penalizing us, since obviously if googlebot comes to crawl these urls, there will be nothing. I did however re-submitted it to Google to crawl it, but I guess it's going to take time, before Google picks up that now only want the www urls in the index. Can somebody please give me some advice on this. Thanks Dave
Intermediate & Advanced SEO | | daveza0 -
Overly-Dynamic URLs & Changing URL Structure w Web Redesign
I have a client that has multiple apartment complexes in different states and metro areas. They get good traffic and pretty good conversions but the site needs a lot of updating, including the architecture, to implement SEO standards. Right now they rank for " <brand_name>apartments" on every place but not " <city_name>apartments".</city_name></brand_name> There current architecture displays their URLs like: http://www.<client_apartments>.com/index.php?mainLevelCurrent=communities&communityID=28&secLevelCurrent=overview</client_apartments> http://www.<client_apartments>.com/index.php?mainLevelCurrent=communities&communityID=28&secLevelCurrent=floorplans&floorPlanID=121</client_apartments> I know it is said to never change the URL structure but what about this site? I see this URL structure being bad for SEO, bad for users, and basically forces us to keep the current architecture. They don't have many links built to their community pages so will creating a new URL structure and doing 301 redirects to the new URLs drastically drop rankings? Is this something that we should bite the bullet on now for future rankings, traffic, and a better architecture?
Intermediate & Advanced SEO | | JaredDetroit0