Robots.txt, Disallow & Indexed-Pages

thekiller99

Hi guys,

hope you're well.

I have a problem with my new website. I have 3 pages with the same content:

The good page has rel=canonical and is the only page that should appear in search results, but Google has indexed all 3 pages...

I don't know what to do now, but I am considering 2 possibilities:

Remove the filters (true, false), keep only the good page, and show a 404 page for the other pages.
Update robots.txt with a Disallow rule for these parameters and remove those URLs manually.

Thank you so much!

thekiller99

In the end, I decided to do the following:

Delete all pages with filters from my site (I had the option and it wasn't a problem)
Remove the URLs individually using GWT

It works!

MattRoney

Hi thekiller99! Did this get worked out? We'd love an update.

solvid

Hi,

Did you actually implement canonical tags on the duplicate pages, and do they point to the original page?

Yoav-Blustein

Hi!

I'm not sure I understood how you implemented the canonical element on your pages, but it sounds like you have only put the canonical code on what you call the "good page".

The scenario should be like this:
1. You have 3 pages with similar/exact content.
2. Obviously you want only one of them indexed, and in your case it is the one without the parameters (the "good page").
3. You need to go ahead and implement the canonical elements in the following way:

page-1: http://example.examples.com/brand/brand1 (you do not have to, but if it makes it easier for you, you can use a self-referencing canonical)
page-2: http://example.examples.com/brand/brand1?show=false (canonical to page-1)
page-3: http://example.examples.com/brand/brand1?show=true (canonical to page-1)
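
The mapping above can be sketched in a few lines of Python. This is only an illustration, not how any particular CMS does it: the helper names canonical_url and canonical_link_tag are made up for this example, and it assumes that every query parameter (such as show=true/false) is a filter that does not change the page content, so the whole query string can be dropped.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return the URL without query string or fragment.

    Assumption: all query parameters are display filters, so every
    variant should point at the parameter-free "good page".
    """
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link> element to place in the <head> of the given page."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


pages = [
    "http://example.examples.com/brand/brand1",
    "http://example.examples.com/brand/brand1?show=false",
    "http://example.examples.com/brand/brand1?show=true",
]

# All three pages end up with the same canonical tag pointing to page-1.
for page in pages:
    print(page, "->", canonical_link_tag(page))
```

If some parameter on your site does change the content (pagination, for example), it would need to be kept rather than stripped.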

PS. Google's best practice is that you should never use robots.txt to de-index a page from the search results. If you decide to remove certain pages completely from the search results, the best practice is to return a 404 for them and use Google Search Console to signal to Google that these pages are no longer available. But if you implement the canonical element as described above, you will have no problems.
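
To see why robots.txt is the wrong tool here, a rough sketch with Python's standard urllib.robotparser may help. The Disallow rule below is a hypothetical one for the show parameter, and the helper name crawlable is made up. Note that the standard-library parser does simple prefix matching and may not behave exactly like Googlebot. The point is only that a disallowed URL is never fetched, so the crawler cannot see a canonical tag or a 404 on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the filtered variants.
ROBOTS_TXT = """\
User-agent: *
Disallow: /brand/brand1?show=
"""


def crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt lets the agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


urls = [
    "http://example.examples.com/brand/brand1",
    "http://example.examples.com/brand/brand1?show=false",
    "http://example.examples.com/brand/brand1?show=true",
]

# Blocked URLs are not crawled, so any canonical tag on them goes unread.
for url in urls:
    print(url, "->", "crawlable" if crawlable(ROBOTS_TXT, url) else "blocked")
```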

Best

Yossi
