Removing indexed pages
-
Hi all, this is my first post so be kind - I have a one page Wordpress site that has the Yoast plugin installed. Unfortunately, when I first submitted the site's XML sitemap to the Google Search Console, I didn't check the Yoast settings and it submitted some example files from a theme demo I was using. These got indexed, which is a pain, so now I am trying to remove them. Originally I did a bunch of 301's but that didn't remove them from (at least not after about a month) - so now I have set up 410's - These also seem to not be working and I am wondering if it is because I re-submitted the sitemap with only the index page on it (as it is just a single page site) could that have now stopped Google indexing the original pages to actually see the 410's?
Thanks in advance for any suggestions. -
Thanks for all the responses!
At the moment I am serving the 410's using the .htaaccess file as I removed the actual pages a while ago. The pages don't show in most searches, however, two of them do show up in some instances under the sitelinks which is the main pain. I manually asked for them to be removed using 'remove urls' however that only last a couple of months and they are now back.
So I guess the best way is to recreate the pages and insert a noindex?
Thanks again for everyone time, it's much appreciated.
-
I agree with ViviCa1's methods, so go with that.
One thing I just wanted to bring up though, is that unless people are actually visiting those pages you don't want indexed, or it does some type of brand damage, then you don't really need to make it a priority.
Just because they're indexed doesn't mean they're showing up for any searches - and most likely they aren't - so people will realistically never see them. And if you only have a one-page site, you're not wasting much crawl budget on those.
I just bring this up since sometimes we (I'm guilty of it too) can get bogged down by small distractions in SEO that don't really help much, when we should be creating and producing new things!
"These also seem to not be working and I am wondering if it is because I re-submitted the sitemap with only the index page on it (as it is just a single page site) could that have now stopped Google indexing the original pages to actually see the 410's?"
There was a good related response from Google employee Susan Moskwa:
“The best way to stop Googlebot from crawling URLs that it has discovered in the past is to make those URLs (such as your old Sitemaps) 404. After seeing that a URL repeatedly 404s, we stop crawling it. And after we stop crawling a Sitemap, it should drop out of your "All Sitemaps" tab.”
A bit older, but shows how Google discovers URLs through the sitemap. Take a look at the rest of that thread as well.
-
I'd suggest adding a noindex robots meta tag to the affected pages (see how to do this here: https://support.google.com/webmasters/answer/93710?hl=en) and until Google recrawls use the remove URLs tool (see how to use this here: https://support.google.com/webmasters/answer/1663419?hl=en).
If you use the noindex robots meta tag, don't disallow the pages through your robots.txt or Google won't even see the tag. Disallowing Google from crawling a page doesn't mean it won't be indexed (or removed from the index), it just means Google won't crawl the page.
-
Couple of ideas spring to mind
- Use the robots.txt file
- Demote the site link in Google search console (see https://support.google.com/webmasters/answer/47334)
Example of robots.txt file...
Disallow: /the-link/you-dont/want-to-show.html
Disallow: /the-link/you-dont/want-to-show2.htmlDon't include the domain just the link to the page, Plenty of tutorials out there worthwhile having a look at http://www.robotstxt.org
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Our protected pages 302 redirect to a login page if not a member. Is that a problem for SEO?
We have a membership site that has links out in our unprotected pages. If a non-member clicks on these links it sends a 302 redirect to the login / join page. Is this an issue for SEO? Thanks!
Technical SEO | | rimix1 -
Should you use google url remover if older indexed pages are still being kept?
Hello, A client recently did a redesign a few months ago, resulting in 700 pages being reduced to 60, mostly due to panda penalty and just low interest in products on those pages. Now google is still indexing a good number of them ( around 650 ) when we only have 70 on our sitemap. Thing is google indexes our site on average now for 115 urls when we only have 60 urls that need indexing and only 70 on our sitemap. I would of thought these urls would be crawled and not found, but is taking a very long period of time. Our rankings haven't recovered as much as we'd hope, and we believe that the indexed older pages are causes this. Would you agree and also would you think removing those old urls via the remover tool would be best option? It would mean using the url remover tool for 650 pages. Thank you in advance
Technical SEO | | Deacyde0 -
Why Are Some Pages On A New Domain Not Being Indexed?
Background: A company I am working with recently consolidated content from several existing domains into one new domain. Each of the old domains focused on a vertical and each had a number of product pages and a number of blog pages; these are now in directories on the new domain. For example, what was www.verticaldomainone.com/products/productname is now www.newdomain.com/verticalone/products/product name and the blog posts have moved from www.verticaldomaintwo.com/blog/blogpost to www.newdomain.com/verticaltwo/blog/blogpost. Many of those pages used to rank in the SERPs but they now do not. Investigation so far: Looking at Search Console's crawl stats most of the product pages and blog posts do not appear to be being indexed. This is confirmed by using the site: search modifier, which only returns a couple of products and a couple of blog posts in each vertical. Those pages are not the same as the pages with backlinks pointing directly at them. I've investigated the obvious points without success so far: There are a couple of issues with 301s that I am working with them to rectify but I have checked all pages on the old site and most redirects are in place and working There is currently no HTML or XML sitemap for the new site (this will be put in place soon) but I don't think this is an issue since a few products are being indexed and appearing in SERPs Search Console is returning no crawl errors, manual penalties, or anything else adverse Every product page is linked to from the /course page for the relevant vertical through a followed link. None of the pages have a noindex tag on them and the robots.txt allows all crawlers to access all pages One thing to note is that the site is build using react.js, so all content is within app.js. However this does not appear to affect pages higher up the navigation trees like the /vertical/products pages or the home page. So the question is: "Why might product and blog pages not be indexed on the new domain when they were previously and what can I do about it?"
Technical SEO | | BenjaminMorel0 -
Page titles in browser not matching WP page title
I have an issue with a few page titles not matching the title I have In WordPress. I have 2 pages, blog & creative gallery, that show the homepage title, which is causing duplicate title errors. This has been going on for 5 weeks, so its not an a crawl issue. Any ideas what could cause this? To clarify, I have the page title set in WP, and I checked "Disable PSP title format on this page/post:"...but this page is still showing the homepage title. Is there an additional title setting for a page in WP?
Technical SEO | | Branden_S0 -
How to Stop Google from Indexing Old Pages
We moved from a .php site to a java site on April 10th. It's almost 2 months later and Google continues to crawl old pages that no longer exist (225,430 Not Found Errors to be exact). These pages no longer exist on the site and there are no internal or external links pointing to these pages. Google has crawled the site since the go live, but continues to try and crawl these pages. What are my next steps?
Technical SEO | | rhoadesjohn0 -
New EMD update effected my mom's legit author page? From page 1 in SERP to nowhere for her name
I think my mom's site, MargaretTerry.com was hit by this update for her name "Margaret Terry". Went from bouncing around the first page on google.com and .ca all the time to nowhere on the index. The results are now very strange, a mix of Youtube, linked in, and small book stores that she has done events at recently to promote her first book. I was checking after some of my SEO buddys were freaking out about their EMD's getting hit on Sunday. She is an aspiring author with a book coming out this month. There is obviously no ads or spam content on the site... I have never done SEO for it either except a bit of on page I guess. It sucks that people might be grabbing her book soon and when they Google her name nothing shows up. This couldn't have really happened at a worse time. Not to mention the hours spent building the site to her liking, free of charge of course 🙂 Is there anyone I can contact there to help me out? Shouldn't and EMD that is someones name still rank when you search their name?
Technical SEO | | Operatic0 -
Page not being indexed
Hi all, On our site we have a lot of bookmaker reviews, and we are ranking pretty good for most bookmaker names as keywords, however a single bookmaker seems to have been shunned by Google. For a search "betsafe" in Denmark, this page does not appear among the top 50: http://www.betxpert.com/bookmakere/betsafe All of our other review pages rank in top 10-20 for the bookmaker name as keyword. What to do if Google has "banned" a page? Best regards, Rasmus
Technical SEO | | rasmusbang0 -
Importance of an optimized home page (index)
I'm helping a client redesign their website and they want to have a home page that's primarily graphics and/or flash (or jquery). If they are able to optimize all of their key sub-pages, what is the harm in terms of SEO?
Technical SEO | | EricVallee340