Removing pages from index
-
My client is running 4 websites on ModX CMS and using the same database for all the sites.
Roger has discovered that one of the sites has 2050 302 redirects pointing to the clients other sites.
The Sitemap for the site in question includes 860 pages. Google Webmaster Tools has indexed 540 pages. Roger has discovered 5200 pages and a Site: query of Google reveals 7200 pages.
Diving into the SERP results many of the pages indexed are pointing to the other 3 sites. I believe there is a configuration problem with the site because the other sites when crawled do not have a huge volume of redirects.
My concern is how can we remove from Google's index the 2050 pages that are redirecting to the other sites via a 302 redirect?
-
Thank you gentlemen for your answers.
As per the post the website is built with ModX CMS and the difficulty has come about because the client is using one database to serve 4 websites. One of the sites has been configured incorrectly and is creating pages that relate to listings on other sites and passing traffic via a 302 redirect.
We're working on fixing the configuration so new pages and 302 redirects are not created. I want to remove from the index the 2050 302 redirects we've discovered to date. GWT allows you to remove URLs one at a time although the removal request procedure does not seem appropriate in the situation.
Due to NDA can not publish URL.
-
You might be running into the 302 error due to the system that your site uses to redirect URLs. What CMS or cart is the site built in? A agree with Nakul, that we need to see the underlying code to diagnose the issue.
-
The first and the foremost suggestion would be to clean the underlying code and see if you can change the 302s to 301s. Can you post or send via private message the example URLs ?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to get a large number of urls out of Google's Index when there are no pages to noindex tag?
Hi, I'm working with a site that has created a large group of urls (150,000) that have crept into Google's index. If these urls actually existed as pages, which they don't, I'd just noindex tag them and over time the number would drift down. The thing is, they created them through a complicated internal linking arrangement that adds affiliate code to the links and forwards them to the affiliate. GoogleBot would crawl a link that looks like it's to the client's same domain and wind up on Amazon or somewhere else with some affiiiate code. GoogleBot would then grab the original link on the clients domain and index it... even though the page served is on Amazon or somewhere else. Ergo, I don't have a page to noindex tag. I have to get this 150K block of cruft out of Google's index, but without actual pages to noindex tag, it's a bit of a puzzler. Any ideas? Thanks! Best... Michael P.S., All 150K urls seem to share the same url pattern... exmpledomain.com/item/... so /item/ is common to all of them, if that helps.
Intermediate & Advanced SEO | | 945010 -
Fetch as Google -- Does not result in pages getting indexed
I run a exotic pet website which currently has several types of species of reptiles. It has done well in SERP for the first couple of types of reptiles, but I am continuing to add new species and for each of these comes the task of getting ranked and I need to figure out the best process. We just released our 4th species, "reticulated pythons", about 2 weeks ago, and I made these pages public and in Webmaster tools did a "Fetch as Google" and index page and child pages for this page: http://www.morphmarket.com/c/reptiles/pythons/reticulated-pythons/index While Google immediately indexed the index page, it did not really index the couple of dozen pages linked from this page despite me checking the option to crawl child pages. I know this by two ways: first, in Google Webmaster Tools, if I look at Search Analytics and Pages filtered by "retic", there are only 2 listed. This at least tells me it's not showing these pages to users. More directly though, if I look at Google search for "site:morphmarket.com/c/reptiles/pythons/reticulated-pythons" there are only 7 pages indexed. More details -- I've tested at least one of these URLs with the robot checker and they are not blocked. The canonical values look right. I have not monkeyed really with Crawl URL Parameters. I do NOT have these pages listed in my sitemap, but in my experience Google didn't care a lot about that -- I previously had about 100 pages there and google didn't index some of them for more than 1 year. Google has indexed "105k" pages from my site so it is very happy to do so, apparently just not the ones I want (this large value is due to permutations of search parameters, something I think I've since improved with canonical, robots, etc). I may have some nofollow links to the same URLs but NOT on this page, so assuming nofollow has only local effects, this shouldn't matter. Any advice on what could be going wrong here. I really want Google to index the top couple of links on this page (home, index, stores, calculator) as well as the couple dozen gene/tag links below.
Intermediate & Advanced SEO | | jplehmann0 -
What is the impact of an off-topic page to other pages on the site?
We are working with a client who has one irrelevant, off-topic post ranking incredibly well and driving a lot of traffic. However, none of the other pages on the site, that are relevant to this client's business, are ranking. Links are good and in-line with competitors for the various terms. Oddly, very few external links reference this off-topic post, most are to the home page. Local profile is also in-line with competitors, including reviews, categorization, geo-targeting, pictures, etc. No spam issues exist and no warnings in Google Search Console. The only thing that seems weird is this off-topic post but that could affect rankings on other pages of the site? Would removing that off-topic post potentially help increase traffic and rankings for the other more relevant pages of the site? Appreciate any and all help or ideas of where to go from here. Thanks!
Intermediate & Advanced SEO | | Matthew_Edgar0 -
Glossary index and individual pages create duplicate content. How much might this hurt me?
I've got a glossary on my site with an index page for each letter of the alphabet that has a definition. So the M section lists every definition (the whole definition). But each definition also has its own individual page (and we link to those pages internally so the user doesn't have to hunt down the entire M page). So I definitely have duplicate content ... 112 instances (112 terms). Maybe it's not so bad because each definition is just a short paragraph(?) How much does this hurt my potential ranking for each definition? How much does it hurt my site overall? Am I better off making the individual pages no-index? or canonicalizing them?
Intermediate & Advanced SEO | | LeadSEOlogist0 -
Removing Parameterized URLs from Google Index
We have duplicate eCommerce websites, and we are in the process of implementing cross-domain canonicals. (We can't 301 - both sites are major brands). So far, this is working well - rankings are improving dramatically in most cases. However, what we are seeing in some cases is that Google has indexed a parameterized page for the site being canonicaled (this is the site that is getting the canonical tag - the "from" page). When this happens, both sites are being ranked, and the parameterized page appears to be blocking the canonical. The question is, how do I remove canonicaled pages from Google's index? If Google doesn't crawl the page in question, it never sees the canonical tag, and we still have duplicate content. Example: A. www.domain2.com/productname.cfm%3FclickSource%3DXSELL_PR is ranked at #35, and B. www.domain1.com/productname.cfm is ranked at #12. (yes, I know that upper case is bad. We fixed that too.) Page A has the canonical tag, but page B's rank didn't improve. I know that there are no guarantees that it will improve, but I am seeing a pattern. Page A appears to be preventing Google from passing link juice via canonical. If Google doesn't crawl Page A, it can't see the rel=canonical tag. We likely have thousands of pages like this. Any ideas? Does it make sense to block the "clicksource" parameter in GWT? That kind of scares me.
Intermediate & Advanced SEO | | AMHC0 -
Removing Low Rank Pages Help Others Shine?
Good Morning! I have a handful of pages that are not ranking very well, if at all. They are not driving any traffic, and are realistically just sorta "there". I have already determined I will not be bringing them over to our new web redesign. My question, could it be in our best interest to try and save these pages with ZERO traction and optimize them? Re-purpose them? Or does having them on our site currently muddy up our other pages? Any help is greatly appreciated! Thanks!
Intermediate & Advanced SEO | | HashtagHustler0 -
Drop in indexed pages!
Hi everybody! I've been working on http://thewilddeckcompany.co.uk/ for a little while now. Until recently, everything was great - good rankings for the key terms of 'bird hides' and 'pond dipping platforms'. However, rankings have tanked over the past few days. I can't point my finger at it yet, but a site:thewilddeckcompany.co.uk search shows only three pages have been indexed. There's only 10 on the site, and it was fine beforehand. Any advice would be much appreciated,
Intermediate & Advanced SEO | | Blink-SEO0 -
How can you indexed pages or content on pages that are behind a pay wall or subscription login.
I have a client that has a boat of awesome content they provide to their client that's behind a pay wall ( ie: paid subscribers can only access ) Any suggestions mozzers? How do I get those pages index? Without completely giving away the contents in the front end.
Intermediate & Advanced SEO | | BizDetox0