Robots.txt, Disallow & Indexed-Pages..

thekiller99

Hi guys,

hope you're well.

I have a problem with my new website. I have 3 pages with the same content:

The good page has rel=canonical and it is the only page that should appear in search results, but Google has indexed all 3 pages...

I don't know what to do now, but I am considering 2 possibilities:

Remove the filters (true, false), keep only the good page, and show a 404 page for the other pages.
Update robots.txt with a Disallow for these parameters and remove those URLs manually.

Thank you so much!

thekiller99

In the end, I decided to do the following:

Delete all the pages with filters from my site (I had the option and it wasn't a problem)
Delete the URLs individually using GWT

It works!

MattRoney

Hi thekiller99! Did this get worked out? We'd love an update.

solvid

Hi,

Did you actually implement canonical tags on the duplicate pages, and do they point to the original page?
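
One rough way to check this is to pull the canonical link out of each page's HTML and compare it with the URL you expect. The sketch below uses only Python's standard library; the `CanonicalFinder` class, the `find_canonicals` helper and the sample markup are made up for illustration, so swap in the real source of your own pages.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # rel can hold several space-separated tokens, so split before checking
        rel = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_canonicals(html: str) -> list:
    """Return all canonical hrefs found in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.hrefs


# Hypothetical markup standing in for one of the duplicate pages
sample = (
    '<html><head>'
    '<link rel="canonical" href="http://www.example.com/brand/brand1">'
    '</head><body></body></html>'
)
print(find_canonicals(sample))
```

An empty list means the page has no canonical tag at all, and more than one entry means the page is sending conflicting signals.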

Yoav-Blustein

Hi!

Not sure I understood how you implemented the canonical element on your pages, but it sounds like you have only added the canonical tag to what you call the "good page".

The scenario should be like this:
1. You have 3 pages with similar/exact content.
2. Obviously you want to index only one of them and in your case it is the one without the parameters ("good page")
3. You need to go ahead and implement the canonical elements in the following way:

page-1: http://example.examples.com/brand/brand1 (you do not have to, but if it makes it easier for you, you can use a self-referencing canonical)
page-2: http://example.examples.com/brand/brand1?show=false (canonical to page-1)
page-3: http://example.examples.com/brand/brand1?show=true (canonical to page-1)
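
If the pages come from a template, the mapping above can be generated rather than typed by hand. This is only a minimal sketch in Python (standard library only). It assumes every query-string variant should point to the parameter-free URL, which fits this case but not sites where a parameter really changes the content. The names `canonical_url` and `canonical_link_tag` are hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return the URL with its query string and fragment removed."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for the page at `url`."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


# The three pages from the list above all get the same tag
for page in (
    "http://example.examples.com/brand/brand1",
    "http://example.examples.com/brand/brand1?show=false",
    "http://example.examples.com/brand/brand1?show=true",
):
    print(canonical_link_tag(page))
```

The generated tag goes in the `<head>` of each page, so page-2 and page-3 both declare page-1 as the canonical version.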

PS. Google's best practice is that you should never use robots.txt to de-index a page from the search results: a Disallow rule blocks crawling, not indexing, and it also stops Google from seeing the canonical tag on the blocked URLs. If you decide to remove certain pages completely from the search results, the best practice is to 404 them and use Google Search Console to signal to Google that these pages are no longer available. But if you implement the canonical element as described above, you will have no problems.

Best

Yossi
