404's being re-indexed
-
Hi All,
We are experiencing issues with pages that have been 404'd being indexed. Originally, these were /wp-content/ index pages, that were included in Google's index. Once I realized this, I added in a directive into our htaccess to 404 all of these pages - as there were hundreds. I tried to let Google crawl and remove these pages naturally but after a few months I used the URL removal tool to remove them manually.
However, Google seems to be continually re/indexing these pages, even after they have been manually requested for removal in search console. Do you have suggestions? They all respond to 404's.
Thanks
-
Just to follow up - I have now actually 410'd the pages and the 410's are still being re-indexed.
-
I'll check this one out as well, thanks! I used a header response extension which reveals the presence of x-botots headers called web developer.
-
First it would be helpful to know how you are detecting that it isn't working. What indexation tool are you using to see whether the blocks are being detected? I personally really like this one: https://chrome.google.com/webstore/detail/seo-indexability-check/olojclckfadnlhnlmlekdihebmjpjnoa?hl=en-GB
Or obviously at scale - Screaming Frog
-
Thank you for the quick response,
The pages are truly removed, however, because there were so many of these types of pages that leaked into the index, I added a redirect to keep users on our site - no intentions of being "shady", I just didn't want hundreds of 404's getting clicked and causing a very high bounce rate.
For the x-robots header, could you offer some insight into why my directive isn't working? I believe it's a regex issue on the wp-content. I have tried to troubleshoot to no avail.
<filesmatch <strong="">"(wp-content)">
Header set X-Robots-Tag: "noindex, nofollow"</filesmatch>I appreciate the help!
-
Well if a page has been removed and has not been moved to a new destination - you shouldn't redirect a user anyway (which kind of 'tricks' users into thinking the content was found). That's actually bad UX
If the content has been properly removed or was never supposed to be there, just leave it at a 410 (but maybe create a nice custom 410 page, in the same vein as a decent UX custom 404 page). Use the page to admit that the content is gone (without shady redirects) but to point to related posts or products. Let the user decide, but still be useful
If the content is actually still there and, hence you are doing a redirect - then you shouldn't be serving 404s or 410s in the first place. You should be serving 301s, and just doing HTTP redirects to the content's new (or revised) destination URL
Yes, the HTTP header method is the correct replacement when the HTML implementation gets stripped out. HTTP Header X-Robots is the way for you!
-
Thank you! I am in the process of doing so, however with a 410 I can not leave my JS redirect after the page loads, this creates some UX issues. Do you have any suggestions to remedy this?
Additionally, after the 410 the non x-robots noindex is now being stripped so it only resolves to a 410 with no noindex or redirect. I am still working on a noindex header, as the 410 is server-side, I assume this would be the only way, correct?
-
You know that 404 means "temporarily gone but will be coming back" right? By saying a page is temporarily unavailable, you actively encourage Google to come back later
If you want to say that the page is permanently gone use status code 410 (gone)
Leave the Meta no-index stuff in the HTTP header via X-Robots, that was a good call. But it was a bad call to combine Meta no-index and 404, as they contradict each other ("don't index me now but then do come back and index me later as I'll probably be back at some point")
Use Meta no-index and 410, which agree with each other ("don't index me now and don't bother coming back")
-
Yes, all pages have a noindex. I have also tried to noindex them using htaccess, to add an extra layer of security, but it seems to be incorrect. I believe it is an issue with the regex. Attempting to match anything with wp-content.
<filesmatch "(wp-content)"="">Header set X-Robots-Tag: "noindex, nofollow"</filesmatch>
-
Back to basics. Have you marked those pages/posts as 'no-index'. With many wp plugins, you can no-index them in bulk then submit for re-indexation.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
I have recently re-done my website. My buyers guide and my category page are ranking for keywords I'm after.
I have recently re-done my entire site (only a few days). I believe Google is still re-crawling and updating (however, the amount of movement on other searches has been significant). My buyers guide is ranking very high for its intended keywords, as well as high for the keywords of the category page. Both are at the beginning of the second page and I wonder if its dragging me down. What do you think I should do? Is it to early to take action as everything has been completed redone.
Technical SEO | | Code2Chil0 -
Redirect 'keyword-url' to improve ranking?
I was wondering if a good url, with a keyword in it, can help you improve the position of that certain keyword by redirecting that url to your website. To make it clear: We run the website www.terello.nl, and have the possibility to let the url www.iphonereparatie.nl (translation: iphonerepair) redirect to our website. Would this help us to rank for the keyword 'iPhone reparatie'? I hope that I made myself clear this way:) Otherwise i'm more than happy to clearify myself!
Technical SEO | | Jan-Peter0 -
My 404 page is returning a 404
Hi there, Moz has highlighted that my 404 page is returning a 404... Looking at webmaster tools within crawl errors, it's the same story. The only big change on the website is that we recently moved to https for the entire site, so all pages have a 301 to the corresponding https page, including the old 404 http page. I don't know if that makes any difference? Any help or advice on how I reasolve this will be much appreciated. Thanks, Stuart
Technical SEO | | Stuart260 -
To integrate a blog tool onto site - or build a blog solution - what's better for SEO?
Currently looking at adding a blog to our company site subdirectory and wanted to know if there was a SEO distinction between the following methods: Integrating a bolt-on blog tool with the site to create the blog VS. just using the current site infrastructure to build blog functionality. What's better for SEO? (and if tool integration is the overwhelming response - which tool?). Cheers.
Technical SEO | | Oxfordcomma0 -
H2's are already ranking well. Should I rock the boat?
I recently began work for a company and discovered that they are not using h1's (using h2's) and rank in the top 5 for ~90% of their keywords. The site is one of the original players in their industry, has massive amounts of domain authority and tens of thousands of linking root domains. However, they are currently being beaten on some of their top keywords by a few of their younger competitors. Moving their current h2 text into h1 tags could be helpful. But to what extent? Since they already rank well for so many competitive keywords, Is it worth it to rock the boat by moving their h2 text into h1 tags and risk affecting their current rankings?
Technical SEO | | 5outhpaw0 -
What's the best format for a e-commerce URL product page
We have over 2000 non branded experiences and activities sold through our website. The website is having a face lift with the a new look and a stronger focus on SEO. As part of this, I am keen to establish what the best practice is for product based URLs. I've researched the market and come up with a few alternatives that are used: domain/category/subcategory/activity_name domain/activity_name/category/subcategory/activity_reference domain/generic_term/activity_reference/activity_name domain/category/activity_location/activity_name Activities are location based but the location can change (say once every 2 years). Activity names, category, subcategory and activity_reference rarely change. Are there any thoughts/ research on the best method? (If there is one) Many thanks in advance for your insights.
Technical SEO | | philwill0 -
Just relaunched a website - why did it fall in Google's SERPs?
I work for a marketing agency that just redesigned, rewrote and relaunched a client's website. They used to rank #4 on Google for the company's name (which is a fairly common one, for what it's worth). Now they're at #10 and want to know why. I'd like to explain to them what happened but don't know myself. Can someone explain it to me? And can I tell them if/when their ranking might go back up? In case this matters, I can tell you that it looks like Google hasn't yet crawled the new site. Anyway, thanks in advance for any help you can provide.
Technical SEO | | matt-145670 -
Switching ecommerce CMS's - Best Way to write URL 301's and sub pages?
Hey guys, What a headache i've been going through the last few days trying to make sure my upcoming move is near-perfect. Right now all my urls are written like this /page-name (all lowercase, exact, no forward slash at end). In the new CMS they will be written like this: /Page-Name/ (with the forward slash at the end). When I generate an XML sitemap in the new ecomm CMS internally it lists the category pages with a forward slash at the end, just like they show up through out the CMS. This seems sloppy to me, but I have no control over it. Is this OK for SEO? I'm worried my PR 4, well built ecommerce website is going to lose value to small (but potentially large) errors like this. If this is indeed not good practice, is there a resource about not using the forward slash at the end of URLS in sitemaps i can present to the community at the platform? They are usually real quick to make fixes if something is not up to standards. Thanks in advance, -First Time Ecommerce Platform Transition Guy
Technical SEO | | Hyrule0