Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Unnecessary pages getting indexed in Google for my blog
-
I have a blog dapazze.com and I am suffering from a problem for a long time. I found out that Google have indexed hundreds of replytocom links and images attachment pages for my blog.
I had to remove these pages manually using the URL removal tool. I had used "Disallow: ?replytocom" in my robots.txt, but Google disobeyed it. After that, I removed the parameter from my blog completely using the SEO by Yoast plugin.
But now I see that Google has again started indexing these links even after they are not present in my blog (I use #comment). Google have also indexed many of my admin and plugin pages, whereas they are disallowed in my robots.txt file.
Have a look at my robots.txt file here: http://dapazze.com/robots.txt
Please help me out to solve this problem permanently?
-
Me too have the same issue ! but not indexed in the Google ! but URL parameters in Google Webmasters shows there are 5K errors !
Should i use the URL Parameters settings or which one ?
Also make sure replytocom links are not blocked using Robots.txt, as it will stop Google bots from crawling and this your links won’t get deindexed. This is one mistake which I did, and later after removing replytocom parameter from robots.txt file, I was able to get most of my replytocom links deindexed. These are warning by the blogger ! http://www.shoutmeloud.com/how-to-fix-replytocom-links-issue-in-wordpress.html - he showed how to do that ! but my problem is different - It's Good that it's not indexed but i don't want to take any risk ! how to avooid them for future !
Someone else told me here that some plugins are doing/helping for you ! and not seen in your Robot.txt !
Confused confused ! so much confused ! Please help me !
-
Actually previously I had removed the links manually. But I am seeing them come up again even after removing the parameter completely.
Can you please point our the problem for me?
-
Please check that the comment pages are blocked by robots.txt file -
However, the blocked pages are now getting redirected to the main landing page of the blog posts.
Seems like it will take a while for Google to recrawl these pages and sort the issue.
In the mean time, could you please show some pages that are getting indexed by Google.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Page Indexing without content
Hello. I have a problem of page indexing without content. I have website in 3 different languages and 2 of the pages are indexing just fine, but one language page (the most important one) is indexing without content. When searching using site: page comes up, but when searching unique keywords for which I should rank 100% nothing comes up. This page was indexing just fine and the problem arose couple of days ago after google update finished. Looking further, the problem is language related and every page in the given language that is newly indexed has this problem, while pages that were last crawled around one week ago are just fine. Has anyone ran into this type of problem?
Technical SEO | | AtuliSulava1 -
Does Page speed matter for google ranking?
We are not sure that page does matter or not for google ranking as I am working for this keyword "flower delivery in Bangalore" from last few months and I saw some website's google first page who have low page speed but still ranking so I am really worried about my page that has also low page speed. will my Bangalore page rank on google the first page if the speed is low and kindly suggest me more tips for the ranking best factors which really works in 2020 and one more thing that domain authority really matters in this year? as I also saw some websites with low domain authority and ranking on google's first page. Home page: Flowerportal Bangalore page: https://flowerportal.in/flower-delivery/bangalore/ focus Keyword is: Flower delivery in Bangalore, send flowers to Bangalore
Technical SEO | | vidi34231 -
Should search pages be indexed?
Hey guys, I've always believed that search pages should be no-indexed but now I'm wondering if there is an argument to index them? Appreciate any thoughts!
Technical SEO | | RebekahVP0 -
Removed Subdomain Sites Still in Google Index
Hey guys, I've got kind of a strange situation going on and I can't seem to find it addressed anywhere. I have a site that at one point had several development sites set up at subdomains. Those sites have since launched on their own domains, but the subdomain sites are still showing up in the Google index. However, if you look at the cached version of pages on these non-existent subdomains, it lists the NEW url, not the dev one in the little blurb that says "This is Google's cached version of www.correcturl.com." Clearly Google recognizes that the content resides at the new location, so how come the old pages are still in the index? Attempting to visit one of them gives a "Server Not Found" error, so they are definitely gone. This is happening to a couple of sites, one that was launched over a year ago so it doesn't appear to be a "wait and see" solution. Any suggestions would be a huge help. Thanks!!
Technical SEO | | SarahLK0 -
De-indexing millions of pages - would this work?
Hi all, We run an e-commerce site with a catalogue of around 5 million products. Unfortunately, we have let Googlebot crawl and index tens of millions of search URLs, the majority of which are very thin of content or duplicates of other URLs. In short: we are in deep. Our bloated Google-index is hampering our real content to rank; Googlebot does not bother crawling our real content (product pages specifically) and hammers the life out of our servers. Since having Googlebot crawl and de-index tens of millions of old URLs would probably take years (?), my plan is this: 301 redirect all old SERP URLs to a new SERP URL. If new URL should not be indexed, add meta robots noindex tag on new URL. When it is evident that Google has indexed most "high quality" new URLs, robots.txt disallow crawling of old SERP URLs. Then directory style remove all old SERP URLs in GWT URL Removal Tool This would be an example of an old URL:
Technical SEO | | TalkInThePark
www.site.com/cgi-bin/weirdapplicationname.cgi?word=bmw&what=1.2&how=2 This would be an example of a new URL:
www.site.com/search?q=bmw&category=cars&color=blue I have to specific questions: Would Google both de-index the old URL and not index the new URL after 301 redirecting the old URL to the new URL (which is noindexed) as described in point 2 above? What risks are associated with removing tens of millions of URLs directory style in GWT URL Removal Tool? I have done this before but then I removed "only" some useless 50 000 "add to cart"-URLs.Google says themselves that you should not remove duplicate/thin content this way and that using this tool tools this way "may cause problems for your site". And yes, these tens of millions of SERP URLs is a result of a faceted navigation/search function let loose all to long.
And no, we cannot wait for Googlebot to crawl all these millions of URLs in order to discover the 301. By then we would be out of business. Best regards,
TalkInThePark0 -
How to block "print" pages from indexing
I have a fairly large FAQ section and every article has a "print" button. Unfortunately, this is creating a page for every article which is muddying up the index - especially on my own site using Google Custom Search. Can you recommend a way to block this from happening? Example Article: http://www.knottyboy.com/lore/idx.php/11/183/Maintenance-of-Mature-Locks-6-months-/article/How-do-I-get-sand-out-of-my-dreads.html Example "Print" page: http://www.knottyboy.com/lore/article.php?id=052&action=print
Technical SEO | | dreadmichael0 -
Does Google index XML files?
Does Google or other search engines include XML files in their index? More specifically, I am wondering how Google knows the difference between an xml filetype and an RSS feed.
Technical SEO | | nicole.healthline0 -
Why is a 301 redirected url still getting indexed?
We recently fixed a redirect issue in a website, and although it appears that the redirection is working fine, the url in question keeps on getting crawled, indexed and cached by google. The redirect was done a month ago, and google shows cached version of it, even for a couple of days ago. Manual checking shows that its being redirected, and also a couple of online tools i checked report a 301 redirect. Do you have any idea why this could be happening? The website I'm talking about is www.hotelmajestic.gr and its being redirected to www.hotel-majestic.gr
Technical SEO | | dim_d0