Google suddenly indexing and displaying URLs that haven't existed for years?
-
We recently noticed google is showing approx 23,000 indexed .jsp urls for our site. These are ancient pages that haven't existed in years and have long been 301 redirected to valid urls. I'm talking 6 years.
Checking the serps the other day (and our current SEOMoz pro campaign), I see that a few of these urls are now replacing our correct ones in the serps for important, competitive phrases.
What the heck is going on here?
Is Google suddenly ignoring rewrite rules and redirects?
Here's an example of the rewrite rules that we've used for 6+ years:
RewriteRule ^(.*)/xref_interlux_antifoulingoutboards&keels.jsp$ $1/userportal/search_subCategory.do?categoryName=Bottom%20Paint&categoryId=35&refine=1&page=GRID [R=301]
Now, this 'bottom paint' url has been incredibly stable in the serps for over a half decade. All of a sudden, a google search for 'bottom paint' (no quotes) brings up the jsp page at position 2-3.
This is just one example of something very bizarre happening. Has anyone else had something similar happen lately?
Thank You
<colgroup><col width="64"></colgroup>
| RewriteRule ^(.*)/xref_interlux_antifoulingoutboards&keels.jsp$ $1/userportal/search_subCategory.do?categoryName=Bottom%20Paint&categoryId=35&refine=1&page=GRID [R=301] | -
Oleg
Thank you for the reply. I am going to submit to G as well. What's really interesting is that for some of those ancient pages that have somehow resurfaced, you can view the cache dates. Those pages seem to have cache dates from late nov and dec 2012. But for others, attempting to view the cached version yields a google 404!
IMO, this suggests to its a bug.
As an aside, you are certainly correct about canonical and pagination issues on our site. We have implemented canonical thus far only on product pages (over 10k prod pages), and I've had getting next/prev for pagination of subcategories as a top priority for months now.
Thanks
-
Is Google suddenly ignoring rewrite rules and redirects?
Shouldn't be.. pretty odd. You can try blocking the crawler from accessing the old .jsp pages if they all follow a format (below code is if every page starts with /xref_)
User-agent:*
Disallow: /xref_*Looks like you don't really need a RewriteRule line there.. just a redirect would do the trick
Redirect 301 /xref_interlux_antifoulingoutboards&keels.jsp /userportal/search_subCategory.do?categoryName=Bottom%20Paint&categoryId=35&refine=1&page=GRID
But I don't think that is the problem since its still sending a 301 response code when you visit the .jsp file.
One thing that may help is adding canonical tags to your current pages - make sure you utilize rel=canonical as well as rel=next/prev for your paginated pages.
Overall, I'm not sure =/ Try posting/submitting it to G, could be a bug.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why is my website not ranking for it's brand name in SERPs but has been indexed by Google?
The website https://christchurch.crowneplaza.com has been live for a couple of months but is not being found in Google search results - even when searching for it's own brand name 'crowne plaza christchurch.' Google has indexed the site - but we are still not showing - https://www.google.co.nz/search?q=site%3Ahttp%3A%2F%2Fchristchurch.crowneplaza.com&rlz=1C1NHXL_enNZ735NZ735&oq=site%3A&aqs=chrome.0.69i59j69i57j69i58j69i59l2j69i65.896j0j7&sourceid=chrome&ie=UTF-8 Any ideas as to why? I think it may be because their are two versions of the site, http and https, both with their own rel=canonical tags. Could this be the cause? Any help much appreciated.
Intermediate & Advanced SEO | | Timmy30 -
Getting into Google News, URL's & Sitemaps
Hello, I know that one of the 'technical requirements' to get into google news is that the URL's have unique numbers at the end, BUT, that requirement can be circumvented if you have a Google News Sitemap. I've purchased the Yoast Google News Sitemap (https://yoast.com/wordpress/plugins/news-seo/) BUT just found out that you cannot submit a google news Sitemap until you are accepted into google news. Thus, my question is that do you need to add the digits to the URL's temporarily until you get in and can submit a google news sitemap, OR, is it ok to apply without them and take care of the sitemap after you get in. If anyone has any other tips about getting into Google News that would be great! Thanks!
Intermediate & Advanced SEO | | stacksnew0 -
Help my site it's not being indexed
Hello... We have a client, that had arround 17K visits a month... Last september he hired a company to do a redesign of his website....They needed to create a copy of the site on a different subdomain on another root domain... so I told them to block that content in order to not affect my production site, cause it was going to be an exact replica of the content but different design.... The developmet team did it wrong and blocked the production site (using robots.txt), so my site lost all it's organica traffic, which was 85-90% of the total traffic and now only get a couple of hundreds visits a month... First I thought we had been somehow penalized, however when I the other site recieving new traffic and being indexed i realized so I switched the robots.txt and created 301 redirect from the subdomain to the production site. After resending sitemaps, links to google+ and many things I can't get google to reindex my site.... when i do a site:domain.com search in google I only get 3 results. Its been now almost 2 month and honestly dont know what to do.... Any help would be greatly appreciated Thanks Dan
Intermediate & Advanced SEO | | daniel.alvarez0 -
Most recent blog post isn't being indexed?
http://www.howlatthemoon.com/dueling_piano_bar/kids-activities-denver/ Even if I put the URL into Google it doesn't show up....
Intermediate & Advanced SEO | | howlusa0 -
We're indexed in Google News, any tips or suggestions for getting traffic from news?
We have a news sitemap, and follow all best practices as outlined by Google for news. We are covering breaking stories at the same time as other publications, but have only made it to the front page of Google News once in the last few weeks. Does anyone have any tips, recommended reading, etc for how to get to the front page of Google News? Thanks!
Intermediate & Advanced SEO | | nicole.healthline0 -
Adding Orphaned Pages to the Google Index
Hey folks, How do you think Google will treat adding 300K orphaned pages to a 4.5 million page site. The URLs would resolve but there would be no on site navigation to those pages, Google would only know about them through sitemap.xmls. These pages are super low competition. The plot thickens, what we are really after is to get 150k real pages back on the site, these pages do have crawlable paths on the site but in order to do that (for technical reasons) we need to push these other 300k orphaned pages live (it's an all or nothing deal) a) Do you think Google will have a problem with this or just decide to not index some or most these pages since they are orphaned. b) If these pages will just fall out of the index or not get included, and have no chance of ever accumulating PR anyway since they are not linked to, would it make sense to just noindex them? c) Should we not submit sitemap.xml files at all, and take our 150k and just ignore these 300k and hope Google ignores them as well since they are orhpaned? d) If Google is OK with this maybe we should submit the sitemap.xmls and keep an eye on the pages, maybe they will rank and bring us a bit of traffic, but we don't want to do that if it could be an issue with Google. Thanks for your opinions and if you have any hard evidence either way especially thanks for that info. 😉
Intermediate & Advanced SEO | | irvingw0 -
Removing a Page From Google index
We accidentally generated some pages on our site that ended up getting indexed by google. We have corrected the issue on the site and we 404 all of those pages. Should we manually delete the extra pages from Google's index or should we just let Google figure out that they are 404'd? What the best practice here?
Intermediate & Advanced SEO | | dbuckles0 -
Google indexing flash content
Hi Would googles indexing of flash content count towards page content? for example I have over 7000 flash files, with 1 unique flash file per page followed by a short 2 paragraph snippet, would google count the flash as content towards the overall page? Because at the moment I've x-tagged the roberts with noindex, nofollow and no archive to prevent them from appearing in the search engines. I'm just wondering if the google bot visits and accesses the flash file it'll get the x-tag noindex, nofollow and then stop processing. I think this may be why the panda update also had an effect. thanks
Intermediate & Advanced SEO | | Flapjack0