Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
301s being indexed
-
A client website was moved about six months ago to a new domain. At the time of the move, 301 redirects were setup from the pages on the old domain to point to the same page on the new domain. New pages were setup on the old domain for a different purpose. Now almost six months later when I do a query in google on the old domain like site:example.com 80% of the pages returned are 301 redirects to the new domain. I would have expected this to go away by now. I tried removing these URLs in webmaster tools but the removal requests expire and the URLs come back. Is this something we should be concerned with?
-
Hi,
This is completely normal at the moment. Many 301 URLs stay in the index for 6-12 months.
Case in point, google this:
There isn't anything you can do. Verify your 301s are set-up correctly. Move on.
-
Hi there,
Have you run a crawl on your site to see if there are a lot of links pointing to the old URLs? If Google sees more links point to the old version of the URLs rather than the new version, it's possible that it thinks that the old pages aren't really gone for good.
- Kristina
-
Hi,
Thanks for your responses. There are no issues with robots or canonical tags that are apparent. The 301 redirects are accessible by Googlebot, I checked in Webmaster Tools. And the page that the 301 redirects to on the other domain has a canonical tag set to the proper URL (itself).
-
Hi IrvCo_Interactive,
I'd recommend digging in to the pages being 301 redirected to make sure there are no conflicting directives, e.g. a rel="canonical" tag pointing to another page on the old domain. I've seen this issue of conflicting directives affecting indexation before and wrote about it here: http://upstreamist.co/indexation-canonical-greater-than-301/
If there are no existing conflicting directives, it may be worth trying the canonical tag on top of the 301 redirect at least for a few pages to see if the canonical tag is more effective in removing the page from the index.
-Trung
-
If it's six months old - they yes - it might be a reason for concern as users might be set to the old domain. Can you check and see if you are blocking with robots.txt the old domain some how ? Since if that's the case the bot can't reach the old pages and see the redirection and if those pages are already in the index they will stay that way.
Alternatively check the logs and see if google bot did hit those pages in the last 6 mo - although I doubt it didn't - it's safe to check.
Cheers !
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
E-Commerce Site Collection Pages Not Being Indexed
Hello Everyone, So this is not really my strong suit but I’m going to do my best to explain the full scope of the issue and really hope someone has any insight. We have an e-commerce client (can't really share the domain) that uses Shopify; they have a large number of products categorized by Collections. The issue is when we do a site:search of our Collection Pages (site:Domain.com/Collections/) they don’t seem to be indexed. Also, not sure if it’s relevant but we also recently did an over-hall of our design. Because we haven’t been able to identify the issue here’s everything we know/have done so far: Moz Crawl Check and the Collection Pages came up. Checked Organic Landing Page Analytics (source/medium: Google) and the pages are getting traffic. Submitted the pages to Google Search Console. The URLs are listed on the sitemap.xml but when we tried to submit the Collections sitemap.xml to Google Search Console 99 were submitted but nothing came back as being indexed (like our other pages and products). We tested the URL in GSC’s robots.txt tester and it came up as being “allowed” but just in case below is the language used in our robots:
Intermediate & Advanced SEO | | Ben-R
User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /orders
Disallow: /checkout
Disallow: /9545580/checkouts
Disallow: /carts
Disallow: /account
Disallow: /collections/+
Disallow: /collections/%2B
Disallow: /collections/%2b
Disallow: /blogs/+
Disallow: /blogs/%2B
Disallow: /blogs/%2b
Disallow: /design_theme_id
Disallow: /preview_theme_id
Disallow: /preview_script_id
Disallow: /apple-app-site-association
Sitemap: https://domain.com/sitemap.xml A Google Cache:Search currently shows a collections/all page we have up that lists all of our products. Please let us know if there’s any other details we could provide that might help. Any insight or suggestions would be very much appreciated. Looking forward to hearing all of your thoughts! Thank you in advance. Best,0 -
How long to re-index a page after being blocked
Morning all! I am doing some research at the moment and am trying to find out, just roughly, how long you have ever had to wait to have a page re-indexed by Google. For this purpose, say you had blocked a page via meta noindex or disallowed access by robots.txt, and then opened it back up. No right or wrong answers, just after a few numbers 🙂 Cheers, -Andy
Intermediate & Advanced SEO | | Andy.Drinkwater0 -
Home page suddenly dropped from index!!
A client's home page, which has always done very well, has just dropped out of Google's index overnight!
Intermediate & Advanced SEO | | Caro-O
Webmaster tools does not show any problem. The page doesn't even show up if we Google the company name. The Robot.txt contains: Default Flywheel robots file User-agent: * Disallow: /calendar/action:posterboard/
Disallow: /events/action~posterboard/ The only unusual thing I'm aware of is some A/B testing of the page done with 'Optimizely' - it redirects visitors to a test page, but it's not a 'real' redirect in that redirect checker tools still see the page as a 200. Also, other pages that are being tested this way are not having the same problem. Other recent activity over the last few weeks/months includes linking to the page from some of our blog posts using the page topic as anchor text. Any thoughts would be appreciated.
Caro0 -
Wrong URLs indexed, Failing To Rank Anywhere
I’m struggling with a client website that's massively failing to rank. It was published in Nov/Dec last year - not optimised or ranking for anything, it's about 20 pages. I came onboard recently, and 5-6 weeks ago we added new content, did the on-page and finally changed from the non-www to the www version in htaccess and WP settings (while setting www as preferred in Search Console). We then did a press release and since then, have acquired about 4 partial match contextual links on good websites (before this, it had virtually none, save for social profiles etc.) I should note that just before we added the (about 50%) new content and optimised, my developer accidentally published the dev site of the old version of the site and it got indexed. He immediately added it correctly to robots.txt, and I assumed it would therefore drop out of the index fairly quickly and we need not be concerned. Now it's about 6 weeks later, and we’re still not ranking anywhere for our chosen keywords. The keywords are around “egg freezing,” so only moderate competition. We’re not even ranking for our brand name, which is 4 words long and pretty unique. We were ranking in the top 30 for this until yesterday, but it was the press release page on the old (non-www) URL! I was convinced we must have a duplicate content issue after realising the dev site was still indexed, so last week, we went into Search Console to remove all of the dev URLs manually from the index. The next day, they were all removed, and we suddenly began ranking (~83) for “freezing your eggs,” one of our keywords! This seemed unlikely to be a coincidence, but once again, the positive sign was dampened by the fact it was non-www page that was ranking, which made me wonder why the non-www pages were still even indexed. When I do site:oursite.com, for example, both non-www and www URLs are still showing up…. Can someone with more experience than me tell me whether I need to give up on this site, or what I could do to find out if I do? I feel like I may be wasting the client’s money here by building links to a site that could be under a very weird penalty 😕
Intermediate & Advanced SEO | | Ullamalm0 -
How is Google crawling and indexing this directory listing?
We have three Directory Listing pages that are being indexed by Google: http://www.ccisolutions.com/StoreFront/jsp/ http://www.ccisolutions.com/StoreFront/jsp/html/ http://www.ccisolutions.com/StoreFront/jsp/pdf/ How and why is Googlebot crawling and indexing these pages? Nothing else links to them (although the /jsp.html/ and /jsp/pdf/ both link back to /jsp/). They aren't disallowed in our robots.txt file and I understand that this could be why. If we add them to our robots.txt file and disallow, will this prevent Googlebot from crawling and indexing those Directory Listing pages without prohibiting them from crawling and indexing the content that resides there which is used to populate pages on our site? Having these pages indexed in Google is causing a myriad of issues, not the least of which is duplicate content. For example, this file <tt>CCI-SALES-STAFF.HTML</tt> (which appears on this Directory Listing referenced above - http://www.ccisolutions.com/StoreFront/jsp/html/) clicks through to this Web page: http://www.ccisolutions.com/StoreFront/jsp/html/CCI-SALES-STAFF.HTML This page is indexed in Google and we don't want it to be. But so is the actual page where we intended the content contained in that file to display: http://www.ccisolutions.com/StoreFront/category/meet-our-sales-staff As you can see, this results in duplicate content problems. Is there a way to disallow Googlebot from crawling that Directory Listing page, and, provided that we have this URL in our sitemap: http://www.ccisolutions.com/StoreFront/category/meet-our-sales-staff, solve the duplicate content issue as a result? For example: Disallow: /StoreFront/jsp/ Disallow: /StoreFront/jsp/html/ Disallow: /StoreFront/jsp/pdf/ Can we do this without risking blocking Googlebot from content we do want crawled and indexed? Many thanks in advance for any and all help on this one!
Intermediate & Advanced SEO | | danatanseo0 -
Getting Pages Requiring Login Indexed
Somehow certain newspapers' webpages show up in the index but require login. My client has a whole section of the site that requires a login (registration is free), and we'd love to get that content indexed. The developer offered to remove the login requirement for specific user agents (eg Googlebot, et al.). I am afraid this might get us penalized. Any insight?
Intermediate & Advanced SEO | | TheEspresseo0 -
Yoast SEO Plugin: To Index or Not to index Categories?
Taking a poll out there......In most cases would you want to index or NOT index your category pages using the Yoast SEO plugin?
Intermediate & Advanced SEO | | webestate0 -
Indexed Pages in Google, How do I find Out?
Is there a way to get a list of pages that google has indexed? Is there some software that can do this? I do not have access to webmaster tools, so hoping there is another way to do this. Would be great if I could also see if the indexed page is a 404 or other Thanks for your help, sorry if its basic question 😞
Intermediate & Advanced SEO | | JohnPeters0