RegEx help needed for robots.txt potential conflict
-
I've created a robots.txt file for a new Magento install, based on an existing example from the Magento help forums, but there is one part I can't decipher. It seems that I am both allowing and disallowing access to the same pattern for pagination. My robots.txt file (and, it seems, a lot of other Magento robots.txt files) includes both:
Allow: /*?p=
and
Disallow: /?p=&
I've searched for help on RegEx and can't find what "&" does. It seems to me that I'm allowing crawler access to all pagination URLs, but then possibly disallowing access to any pagination URL that includes anything other than just the page number?
I've looked at several resources and there is practically no reference to what "&" does.
Can anyone shed any light on this, so I can be sure I am allowing suitable access to the shop?
Thanks in advance for any assistance.
-
Hey James
It looks to me like you are just disallowing access to any URLs that have more than the initial p= variable. So, you are reducing the impact of potential duplication through searches and the like.
Good
?p=1
Bad
?p=1&q=search string
I am no Magento expert, but this looks like a simple attempt to reduce the duplication that search pages and the like can create inside a complex CMS like Magento.
The SEOMoz crawler tool should give you some good insight. To be sure, try removing the 'Disallow: /?p=&' and see if you get a bucketload of duplicate content warnings.
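Ultimately, the thing to remember here is that the & is part of the URL (it separates query-string parameters) and not part of the pattern syntax. robots.txt does not use full regular expressions; crawlers such as Googlebot only treat * and $ as special characters.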
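To make that concrete, here is a minimal sketch in Python of how a wildcard-aware crawler might evaluate the two rules. It assumes Google-style matching (`*` matches any run of characters, a trailing `$` anchors the end, the longest matching pattern wins, and Allow wins a tie), and it assumes the Disallow rule is meant as the wildcard form `/*?p=*&`. Both are my assumptions rather than anything confirmed in this thread, and the helper names `rule_to_regex` and `is_allowed` and the sample paths are hypothetical.

```python
import re

# Assumed rules: the Allow line from the question, plus an assumed
# wildcard form of the Disallow line (not confirmed by the thread).
RULES = [
    ("Allow", "/*?p="),
    ("Disallow", "/*?p=*&"),
]


def rule_to_regex(pattern):
    """Turn a robots.txt path pattern into a compiled regex.

    '*' becomes '.*', a trailing '$' anchors the end, and every other
    character (including '?' and '&') is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(url_path, rules):
    """Return True if url_path may be crawled under the given rules.

    The longest matching pattern wins; on a tie, Allow wins.
    A path that matches no rule is allowed.
    """
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if rule_to_regex(pattern).match(url_path):
            is_allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


for path in ["/shoes?p=2", "/shoes?p=2&q=boots", "/shoes"]:
    print(path, is_allowed(path, RULES))
# /shoes?p=2 True
# /shoes?p=2&q=boots False
# /shoes True
```

Under those assumptions the plain pagination URL stays crawlable, while the same URL with an extra parameter after `&` is blocked. Crawlers do not all resolve conflicting rules the same way, so treat this as an illustration and confirm the real behaviour with a robots.txt testing tool.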
Hope that helps!
Marcus