Robots.txt, Disallow & Indexed Pages
-
Hi guys,
hope you're well.
I have a problem with my new website. I have 3 pages with the same content:
- http://example.examples.com/brand/brand1 (good page)
- http://example.examples.com/brand/brand1?show=false
- http://example.examples.com/brand/brand1?show=true
The good page has rel=canonical and it is the only page that should appear in search results, but Google has indexed all 3 pages...
I don't know what I should do now, but I am thinking of 2 possibilities:
- Remove the filters (true, false), leave only the good page, and show a 404 page for the other pages.
- Update robots.txt with a disallow rule for these parameters and remove those URLs manually (something like the sketch below).
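For reference, a minimal sketch of what that robots.txt rule might look like, assuming Google's wildcard support and that show is the only filter parameter involved (the pattern is illustrative, not taken from the site's actual robots.txt):
User-agent: *
# block any URL whose query string starts with show= (matches the filter URLs above)
Disallow: /*?show=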
Thank you so much!
-
Finally, I decided to do the following:
- Delete all pages with filters from my site (I had the option, so it wasn't a problem)
- Delete the URLs individually using GWT
It works!
-
Hi thekiller99! Did this get worked out? We'd love an update.
-
Hi,
Did you actually implement canonical tags on the duplicate pages, and do they point to the original piece?
-
Hi!
Not sure if I understood how you implemented the canonical element on your pages, but it sounds like you have only added the canonical code to what you call the "good page".
The scenario should be like this:
1. You have 3 pages with similar/exact content.
2. Obviously you want to index only one of them, and in your case it is the one without the parameters (the "good page").
3. You need to go ahead and implement the canonical elements in the following way (see the markup sketch after this list):
- page-1: http://example.examples.com/brand/brand1 (you do not have to, but if it makes it easier for you, you can use a self-referencing canonical)
- page-2: http://example.examples.com/brand/brand1?show=false (canonical to page-1)
- page-3: http://example.examples.com/brand/brand1?show=true (canonical to page-1)
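A minimal sketch of the markup this implies, assuming the element goes inside each page's <head> (the href value is taken from the example URLs above):
<link rel="canonical" href="http://example.examples.com/brand/brand1" />
Placed on page-2 and page-3 (and optionally on page-1 itself as a self-canonical), this element tells Google that page-1 is the preferred version to index.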
PS: Google best practice suggests that you should never use robots.txt to de-index a page from the search results (a disallowed URL can stay in the index; robots.txt only stops crawling). If you decide to remove certain pages completely from the search results, the best practice is to 404 them and use Google Search Console to signal to Google that these pages are no longer available. But if you implement the canonical element as described above, you will have no problems.
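If you do go the removal route, a hedged sketch of how the 404/410 responses might be served, here using nginx (the location path and parameter names are assumptions based on the example URLs, so adapt them to your actual server):
# hypothetical nginx config: answer 410 Gone for the filter-parameter URLs
location /brand/ {
    # fires only when the entire query string is show=true or show=false
    if ($args ~ "^show=(true|false)$") {
        return 410;
    }
}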
Best
Yossi