Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Whats the best way to remove search indexed pages on magento?
-
A new client ( aqmp.com.br/ )call me yestarday and she told me since they moved on magento they droped down more than US$ 20.000 in sales revenue ( monthly)...
I´ve just checked the webmaster tool and I´ve just discovered the number of crawled pages went from 3.260 to 75.000 since magento started... magento is creating lots of pages with queries like search and filters. Example:
- http://aqmp.com.br/acessorios/lencos.html
- http://aqmp.com.br/acessorios/lencos.html?mode=grid
- http://aqmp.com.br/acessorios/lencos.html?dir=desc&order=name
Add a instruction on robots.txt is the best way to remove unnecessary pages of the search engine?
-
I have tried using them and didn´t do anything - furthermore, if you check this video out by Google themselves, you will find that using these parameters is a "hint/suggestion" as opposed to a solid directive.
http://www.youtube.com/watch?v=DiEYcBZ36poRel Canonical is also a hint.
But Meta Noindex,follow is a solid directive which they have to pay attention to.
Hope that helps - been there, done it got the t shirt through a lot of pain and frustration!
-
What do you think about Google URL parameters? http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1235687
-
Hi Ian,
You are right in that Yoast Meta Robots can be cranky - I installed it and had to play around with it to get it working.
However, it does offer a very nice feature that I think is worth it - you can apply various combinations of Meta Robots directives to product pages individually - so this adds more value than just being able to do NOINDEX on reviews, wishlists, etc... pages. But install it on your dev site before trying it live.
So my solution uses both Yoast and my custom code - you check the URL for any querystrings, such as ?manufacturer etc... and apply different logic according to what you wish to be indexed or not.
Feel free to PM me.
-
Hi,
can you expand on this and point me in the right direction if possible please BJS1976? I have the same problems too as originally asked by 'SEO Martin'.
I have obviously seen that the Yoast_MetaRobots plugin is recommended by others when searching for a solution to noindexing the non-content pages (search results, filters etc). However I am very reluctant to install this as many people who have tried said it has broken their sites.
If there is another way of implementing the noindex, follow meta tag, I would be very greatful to know how as like you I am really struggling to with this one.
Many Thanks
-
Hi,
I am quite familiar with Magento and struggling with the SEO of this ecommerce mammoth!
As far as I am aware, you should implement the meta tag "NOINDEX, FOLLOW" on those pages that you do not want indexed - as your pages are already in the index, this is the way to go - blocking them on robots.txt does not get pages out from the index if they are already in there.
I suggest you apply some "querystring" logic to your template - you will find the page here:
app/design/frontend/default/YOURTEMPLATE/template/page/html/head.phtmlThat way, you can apply the
depending on the page content.
Hope this helps you and let's stay in touch about Magento! (PM me)
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How long to re-index a page after being blocked
Morning all! I am doing some research at the moment and am trying to find out, just roughly, how long you have ever had to wait to have a page re-indexed by Google. For this purpose, say you had blocked a page via meta noindex or disallowed access by robots.txt, and then opened it back up. No right or wrong answers, just after a few numbers 🙂 Cheers, -Andy
Intermediate & Advanced SEO | | Andy.Drinkwater0 -
On 1 of our sites we have our Company name in the H1 on our other site we have the page title in our H1 - does anyone have any advise about the best information to have in the H1, H2 and Page Tile
We have 2 sites that have been set up slightly differently. On 1 site we have the Company name in the H1 and the product name in the page title and H2. On the other site we have the Product name in the H1 and no H2. Does anyone have any advise about the best information to have in the H1 and H2
Intermediate & Advanced SEO | | CostumeD0 -
Best way to remove full demo (staging server) website from Google index
I've recently taken over an in-house role at a property auction company, they have a main site on the top-level domain (TLD) and 400+ agency sub domains! company.com agency1.company.com agency2.company.com... I recently found that the web development team have a demo domain per site, which is found on a subdomain of the original domain - mirroring the site. The problem is that they have all been found and indexed by Google: demo.company.com demo.agency1.company.com demo.agency2.company.com... Obviously this is a problem as it is duplicate content and so on, so my question is... what is the best way to remove the demo domain / sub domains from Google's index? We are taking action to add a noindex tag into the header (of all pages) on the individual domains but this isn't going to get it removed any time soon! Or is it? I was also going to add a robots.txt file into the root of each domain, just as a precaution! Within this file I had intended to disallow all. The final course of action (which I'm holding off in the hope someone comes up with a better solution) is to add each demo domain / sub domain into Google Webmaster and remove the URLs individually. Or would it be better to go down the canonical route?
Intermediate & Advanced SEO | | iam-sold0 -
Best way to noindex an image?
Hi all, A client wanted a few pages noindexed, which was no problem using the meta robots noindex tag. However they now want associated images removed, some of which still appear on pages that they still want indexed. I added the images to their robots.txt file a few weeks ago (probably over a month ago actually) but they're all still showing when you do an image search. What's the best way to noindex them for good, and how do I go about implementing it? Many thanks, Steve
Intermediate & Advanced SEO | | steviephil0 -
How important is the number of indexed pages?
I'm considering making a change to using AJAX filtered navigation on my e-commerce site. If I do this, the user experience will be significantly improved but the number of pages that Google finds on my site will go down significantly (in the 10,000's). It feels to me like our filtered navigation has grown out of control and we spend too much time worrying about the url structure of it - in some ways it's paralyzing us. I'd like to be able to focus on pages that matter (explicit Category and Sub-Category) pages and then just let ajax take care of filtering products below these levels. For customer usability this is smart. From the perspective of manageable code and long term design this also seems very smart -we can't continue to worry so much about filtered navigation. My concern is that losing so many indexed pages will have a large negative effect (however, we will reduce duplicate content and be able provide much better category and sub-category pages). We probably should have thought about this a year ago before Google indexed everything :-). Does anybody have any experience with this or insight on what to do? Thanks, -Jason
Intermediate & Advanced SEO | | cre80 -
Disallowed Pages Still Showing Up in Google Index. What do we do?
We recently disallowed a wide variety of pages for www.udemy.com which we do not want google indexing (e.g., /tags or /lectures). Basically we don't want to spread our link juice around to all these pages that are never going to rank. We want to keep it focused on our core pages which are for our courses. We've added them as disallows in robots.txt, but after 2-3 weeks google is still showing them in it's index. When we lookup "site: udemy.com", for example, Google currently shows ~650,000 pages indexed... when really it should only be showing ~5,000 pages indexed. As another example, if you search for "site:udemy.com/tag", google shows 129,000 results. We've definitely added "/tag" into our robots.txt properly, so this should not be happening... Google showed be showing 0 results. Any ideas re: how we get Google to pay attention and re-index our site properly?
Intermediate & Advanced SEO | | udemy0 -
Best way to block a search engine from crawling a link?
If we have one page on our site that is is only linked to by one other page, what is the best way to block crawler access to that page? I know we could set the link to "nofollow" and that would prevent the crawler from passing any authority, and we can set the page to "noindex" to prevent it from appearing in search results, but what is the best way to prevent the crawler from accessing that one link?
Intermediate & Advanced SEO | | nicole.healthline0 -
Should I Allow Blog Tag Pages to be Indexed?
I have a wordpress blog with settings currently set so that Google does not index tag pages. Is this a best practice that avoids duplicate content or am I hurting the site by taking eligible pages out of the index?
Intermediate & Advanced SEO | | JSOC0