Need help with Robots.txt
-
I'm working on an eCommerce site built with Modx CMS. I found a lot of auto-generated duplicate page issues on the site, and now I need to disallow some pages from that category. Here is what the actual product page URL looks like:
product_listing.php?cat=6857
And here is the auto-generated URL structure:
product_listing.php?cat=6857&cPath=dropship&size=19
Can anyone suggest how to disallow this specific category through robots.txt? I am not very familiar with Modx or this kind of link structure.
Your help will be appreciated.
Thanks
-
I would actually add a canonical tag and then handle these using the Parameters section of Search Console. That's why it's there, for exactly this type of site with exactly this issue.
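For anyone wondering what the canonical approach looks like in practice, here is a minimal Python sketch. It assumes that `cat` is the only parameter that identifies the page and that `cPath` and `size` are just filters; the `www.example.com` host and the `canonical_url` / `canonical_tag` helper names are made up for illustration, and on the real site this logic would live in the page template rather than in a standalone script.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only `cat` identifies the page; every other parameter is a filter.
KEEP_PARAMS = {"cat"}


def canonical_url(url: str) -> str:
    """Return the URL with every query parameter outside KEEP_PARAMS removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in KEEP_PARAMS
    ]
    # Drop the fragment as well; it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for the page's <head>."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


if __name__ == "__main__":
    # Hypothetical host; the path and parameters are the ones from the question.
    variant = "https://www.example.com/product_listing.php?cat=6857&cPath=dropship&size=19"
    print(canonical_tag(variant))
```

Every auto-generated variant then points back at the plain `product_listing.php?cat=6857` page, so the duplicates can still be crawled but are consolidated onto one URL.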
-
Nahid, before you use the robots.txt file's disallow for those URLs, you may want to reconsider. You may want to use the canonical tag instead. In the case where you have different sizes, colors, etc. we typically recommend using the Canonical Tag and not the disallow in robots.txt.
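Anyhow, if you'd like to use the disallow, keep in mind that robots.txt rules match from the start of the URL path, so you need a wildcard to reach the query string on product_listing.php. You can use one of these:
Disallow: /*?
or
Disallow: /*?cat=
Both of these also block the main product_listing.php?cat=6857 page. If you only want to block the auto-generated variants, a narrower rule such as Disallow: /*&cPath= is the safer choice.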
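Before deploying any rule, it helps to check which URLs it would actually block. Below is a rough Python sketch of the matching behaviour that Google and RFC 9309 describe: a rule is a prefix match against the path plus query string, `*` matches any run of characters, and a trailing `$` anchors the end. It is not a full robots.txt parser (it ignores user-agent groups, Allow precedence, and percent-encoding), the `rule_matches` name is made up, and the `/*&cPath=` pattern is only one suggestion for a narrower rule.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """True if a robots.txt rule pattern matches the URL path (query string included)."""
    if not pattern:
        # An empty Disallow value blocks nothing.
        return False
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each `*` match any run of characters.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, which gives the prefix-match behaviour.
    return re.match(regex, path) is not None


if __name__ == "__main__":
    main_page = "/product_listing.php?cat=6857"
    variant = "/product_listing.php?cat=6857&cPath=dropship&size=19"
    for rule in ("/?", "/*?", "/*?cat=", "/*&cPath="):
        print(f"{rule!r}: main={rule_matches(rule, main_page)} variant={rule_matches(rule, variant)}")
```

In this sketch, a rule without a wildcard such as `/?` matches neither URL, because the path starts with `/product_listing.php` rather than `/?`. The `/*?` and `/*?cat=` rules match both the main page and the variant, while `/*&cPath=` matches only the variant.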