Google suddenly indexing and displaying URLs that haven't existed for years?
-
We recently noticed Google is showing approx. 23,000 indexed .jsp URLs for our site. These are ancient pages that haven't existed in years and have long been 301 redirected to valid URLs. I'm talking 6 years.
Checking the SERPs the other day (and our current SEOMoz pro campaign), I see that a few of these URLs are now replacing our correct ones in the SERPs for important, competitive phrases.
What the heck is going on here?
Is Google suddenly ignoring rewrite rules and redirects?
Here's an example of the rewrite rules that we've used for 6+ years:
RewriteRule ^(.*)/xref_interlux_antifoulingoutboards&keels.jsp$ $1/userportal/search_subCategory.do?categoryName=Bottom%20Paint&categoryId=35&refine=1&page=GRID [R=301]
Now, this 'bottom paint' URL has been incredibly stable in the SERPs for over half a decade. All of a sudden, a Google search for 'bottom paint' (no quotes) brings up the .jsp page at position 2-3.
This is just one example of something very bizarre happening. Has anyone else had something similar happen lately?
Thank You
-
Oleg
Thank you for the reply. I am going to submit to G as well. What's really interesting is that for some of those ancient pages that have somehow resurfaced, you can view the cache dates. Those pages seem to have cache dates from late Nov and Dec 2012. But for others, attempting to view the cached version yields a Google 404!
IMO, this suggests it's a bug.
As an aside, you are certainly correct about canonical and pagination issues on our site. We have implemented canonical thus far only on product pages (over 10k prod pages), and I've had getting next/prev for pagination of subcategories as a top priority for months now.
Thanks
-
Is Google suddenly ignoring rewrite rules and redirects?
Shouldn't be.. pretty odd. You can try blocking the crawler from accessing the old .jsp pages if they all follow a format (below code is if every page starts with /xref_)
User-agent: *
Disallow: /xref_*
Looks like you don't really need a RewriteRule line there.. just a redirect would do the trick
Redirect 301 /xref_interlux_antifoulingoutboards&keels.jsp /userportal/search_subCategory.do?categoryName=Bottom%20Paint&categoryId=35&refine=1&page=GRID
But I don't think that is the problem, since it's still sending a 301 response code when you visit the .jsp file.
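If you want to run that same check across more than one old URL, something like the following might help. It's only a minimal Python sketch using the standard library: it requests a URL without following redirects and reports the status code and Location header. The helper names `fetch_redirect` and `is_clean_301`, the User-Agent string, and the www.example.com hostname in the note below are placeholders of mine, not anything taken from your setup.

```python
import http.client
from urllib.parse import urlsplit


def fetch_redirect(url, timeout=10):
    """Return (status, Location header) for url WITHOUT following redirects."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # Placeholder User-Agent (an assumption); this is not Googlebot's string.
        conn.request("HEAD", path, headers={"User-Agent": "redirect-check/0.1"})
        resp = conn.getresponse()
        # getheader returns None when the server sends no Location header.
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def is_clean_301(status, location, expected_target):
    """True when the response is a 301 whose Location ends with expected_target.

    The endswith comparison is deliberately loose so that both absolute and
    relative Location values are accepted.
    """
    return (status == 301
            and location is not None
            and location.endswith(expected_target))
```

You would call it along the lines of `fetch_redirect("http://www.example.com/xref_interlux_antifoulingoutboards&keels.jsp")` with your real hostname swapped in, then pass the result to `is_clean_301` together with the target you expect. Anything other than a 301 (a 302, a 200, or a chain of hops) would be worth a closer look.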
One thing that may help is adding canonical tags to your current pages - make sure you utilize rel=canonical as well as rel=next/prev for your paginated pages.
Overall, I'm not sure =/ Try posting/submitting it to G, could be a bug.
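For the robots.txt idea earlier in this reply, you can sanity-check a rule offline before deploying it. This is just a sketch with Python's standard `urllib.robotparser`; the `is_blocked` helper is a name I made up. One assumption to flag: that parser does plain prefix matching and doesn't understand `*` inside a path, so the rule is written as `/xref_` here. Under Google's prefix-matching rules a trailing `*` is redundant anyway.

```python
from urllib.robotparser import RobotFileParser

# Same rule as suggested above, minus the trailing wildcard (see note above).
ROBOTS_TXT = """\
User-agent: *
Disallow: /xref_
"""


def is_blocked(path, user_agent="Googlebot"):
    """Return True if ROBOTS_TXT disallows path for the given user agent."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(user_agent, path)


# The old .jsp path is matched by the rule; the current page is not.
print(is_blocked("/xref_interlux_antifoulingoutboards&keels.jsp"))
print(is_blocked("/userportal/search_subCategory.do?categoryName=Bottom%20Paint"))
```

One thing to weigh before going this route: once a URL is disallowed, the crawler stops fetching it, so it would also stop seeing the 301 on that URL.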