Robots.txt help
-
Hi,
We have a blog that is killing our SEO.
We need to disallow:
Disallow: /Blog/?tag*
Disallow: /Blog/?page*
Disallow: /Blog/category/*
Disallow: /Blog/author/*
Disallow: /Blog/archive/*
Disallow: /Blog/Account/.
Disallow: /Blog/search*
Disallow: /Blog/search.aspx
Disallow: /Blog/error404.aspx
Disallow: /Blog/archive*
Disallow: /Blog/archive.aspx
Disallow: /Blog/sitemap.axd
Disallow: /Blog/post.aspx
But allow everything below /Blog/Post.
The disallow list keeps growing as we find issues. Rather than adding every area to disallow to our robots.txt, is there an easy way to just say "Allow /Blog/Post" and ignore the rest? How do we do that in robots.txt?
Thanks
-
These: http://screencast.com/t/p120RbUhCT
They appear on every page I looked at, and take up the entire area "above the fold" and the content is "below the fold"
-Dan
-
Thanks Dan, but what grey areas, what url are you looking at?
-
Ahh. I see. You just need to "noindex" the pages you don't want in the index. As far as how to do that with blogengine, I am not sure, as I have never used it before.
But I think a bigger issue is the giant box areas at the top of every page. They are pushing your content way down. That's definitely hurting UX and making the site a little confusing. I'd suggest improving that as well.
-Dan
-
Hi Dan, Yes sorry that's the one!
-
Hi There... that address does not seem to work for me. Should it be .net? http://www.dotnetblogengine.net/
-Dan
-
Hi
The blog is www.dotnetblogengine.com
The content is only on the blog once; it is just that it can be accessed in lots of different ways.
-
Andrew
I doubt that one thing made your rankings drop so much. Also, what type of CMS are you on? Duplicate content like that should be controlled through indexation for the most part, but I am not recognizing that type of URL structure as any particular CMS?
Are just the title tags duplicate or the entire page content? Essentially, I would either change the content of the pages so they are not duplicate, or if that doesn't make sense I would just "noindex" them.
-Dan
-
Hi Dan,
I am getting duplicate content errors in WMT like
This is because tag=ABC and page=1 are both different ways to get to www.mysite.com/Blog/Post/My-Blog-Post.aspx
To fix this I have removed the URLs www.mysite.com/Blog/?tag=ABC and www.mysite.com/Blog/?Page=1 from GWMT and set robots.txt up like this:
User-agent: *
Disallow: /Blog/
Allow: /Blog/post
Allow: /Blog/Post
I hope this solves the duplicate content issue and stops it happening again.
Since doing this my SERPs have dropped massively. Is what I have done wrong or bad? How would I fix it?
Hope this makes sense. Thanks for your help on this; it's appreciated.
Andrew
-
Hi There
Where are they appearing in WMT? In crawl errors?
You can also control crawling of parameters within webmaster tools - but I am still not quite sure if you are trying to remove these from the index or just prevent crawling (and if preventing crawling, for what reason?) or both?
-Dan
-
Hi Dan,
The issue is that my blog had tagging switched on, and it caused canonicalization mayhem.
I switched it off, but the tags still appear in Google Webmaster Tools (GWMT). I removed the URLs via GWMT, but they are still appearing. This has also caused me to plummet down the SERPs! I am hoping this is why my SERPs have dropped, anyway! I am now trying to get to a point where Google just sees my blog posts and not the ?Tag or ?Author or any other parameter that is going to cause me canonicalization pain. In the meantime I am sat waiting for Google to bring me back up the SERPs when things settle down, but it has been 2 weeks now, so maybe something else is up?
-
I'm wondering why you want to block crawling of these URLs - I think what you're going for is to not index them, yes? If you block them from being crawled, they'll remain in the index. I would suggest considering robots meta noindex tags - unless you can describe in a little more detail what the issue is?
-Dan
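For reference, the robots meta noindex tag mentioned above is `<meta name="robots" content="noindex">` placed in a page's `<head>`. Below is a minimal sketch, in Python with only the standard library, for checking whether a page's HTML carries it. The names `RobotsMetaFinder` and `has_noindex` and the two sample pages are made up for illustration; how BlogEngine itself emits the tag is not covered here.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # The name attribute is compared case-insensitively.
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for part in content.split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


# Hypothetical sample pages: a tag listing and a normal post.
TAG_PAGE = '<html><head><meta name="robots" content="noindex, follow"></head><body></body></html>'
POST_PAGE = "<html><head><title>My Blog Post</title></head><body></body></html>"

print(has_noindex(TAG_PAGE))   # True
print(has_noindex(POST_PAGE))  # False
```

A crawler has to be able to fetch the page to see this tag, which is why a URL that is blocked in robots.txt can stay in the index.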
-
Ok then you should be all set if your tests on GWMT did not indicate any errors.
-
Thanks it goes straight to www.mysite.com/Blog
-
Yup, I understand that you want to see your main site. This is why I recommended blocking only /Blog and not / (your root domain).
However, many blogs have a landing page. Does yours? In other words, when you click on your blog link, does it take you straight to Blog/posts or is there another page in between, eg /Blog/welcome?
If it does not go straight into Blog/posts you would want to also allow the landing page.
Does that make sense?
-
The structure is:
www.mysite.com - want to see everything at this level and below it
www.mysite.com/Blog - want to BLOCK everything at this level
www.mysite.com/Blog/posts - want to see everything at this level and below it
-
Well what Martijn (sorry, I spelled his name wrong before) and I were saying was not to forget to allow the landing page of your blog - otherwise this will not be indexed as you are disallowing the main blog directory.
Do you have a specific landing page for your blog or does it go straight into the /posts directory?
I'd say there's nothing wrong with allowing both Blog/Post and Blog/post just to be on the safe side...honestly not sure about case sensitivity in this instance.
-
"We're getting closer David, but after reading the question again I think we both miss an essential point ;-)" What was the essential point you missed? Sorry, I don't understand. I don't want to make a mistake in my robots.txt, so I would like to be 100% sure what you are saying.
-
Thanks guys so I have
User-agent: *
Disallow: /Blog/
Allow: /Blog/post
Allow: /Blog/Post
That works. My home page also works. Is there anything wrong with including both uppercase "Post" and lowercase "post"? It is lowercase on the site, but I want the uppercase "P" just in case. Is there a way to make the entry non-case-sensitive?
Thanks
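On the case question: as far as Google's robots.txt documentation and RFC 9309 describe it, the path part of a rule is matched case-sensitively, and neither describes a switch to turn that off, so listing both spellings is the usual workaround. A minimal Python sketch of that prefix comparison follows; the names `rule_matches` and `allowed_by_any` are hypothetical, and wildcards are ignored.

```python
def rule_matches(rule_path, url_path):
    """Case-sensitive prefix comparison, as used for robots.txt paths."""
    return url_path.startswith(rule_path)


# The two Allow lines from the robots.txt above.
ALLOW_RULES = ["/Blog/post", "/Blog/Post"]


def allowed_by_any(url_path, rules=ALLOW_RULES):
    """True if at least one Allow rule matches the URL path."""
    return any(rule_matches(rule, url_path) for rule in rules)


# The lowercase rule alone does not cover the uppercase URL.
print(rule_matches("/Blog/post", "/Blog/Post/My-Blog-Post.aspx"))  # False
# Listing both spellings covers both.
print(allowed_by_any("/Blog/Post/My-Blog-Post.aspx"))  # True
print(allowed_by_any("/Blog/post/My-Blog-Post.aspx"))  # True
# Any other casing is still not matched by either Allow line.
print(allowed_by_any("/Blog/POST/My-Blog-Post.aspx"))  # False
```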
-
Correct, Martijin. Good catch!
-
There was a reason that I said he should test this!
We're getting closer David, but after reading the question again I think we both miss an essential point ;-). We now also exclude the robots from crawling the 'homepage' of the blog. If you have such a homepage, don't forget to also Allow it.
-
Well, no point in a blog that hurts your seo
I respectfully disagree with Martijin; I believe what you would want to do is disallow the Blog directory itself, not the whole site. It would seem that if you use Disallow: / and Allow: /Blog/Post, you are telling SEs not to index anything on your site except for /Blog/Post.
I'd recommend:
User-agent: *
Disallow: /Blog/
Allow: /Blog/Post
This should block off the entire Blog directory except for your Post subdirectory. As Maritijin stated, always test before you make real changes to your robots.txt.
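As to why a narrower Allow can override a broader Disallow: under the precedence rule that Google documents and RFC 9309 specifies, the matching rule with the longest path is used, and Allow wins a tie. Other crawlers may resolve conflicts differently, which is one more reason to test. Here is a minimal Python sketch of that rule, assuming plain prefix rules without wildcards; `is_allowed` and `RULES` are illustrative names, not part of any library.

```python
def is_allowed(path, rules):
    """Longest matching prefix wins; Allow wins a tie; no match means allowed."""
    best_len = -1
    allowed = True
    for kind, prefix in rules:
        if prefix and path.startswith(prefix):
            longer = len(prefix) > best_len
            tie_for_allow = len(prefix) == best_len and kind == "allow"
            if longer or tie_for_allow:
                best_len = len(prefix)
                allowed = kind == "allow"
    return allowed


# The rules recommended above, as (directive, path) pairs.
RULES = [("disallow", "/Blog/"), ("allow", "/Blog/Post")]

print(is_allowed("/Blog/Post/My-Blog-Post.aspx", RULES))  # True
print(is_allowed("/Blog/?tag=ABC", RULES))                # False
print(is_allowed("/Blog/", RULES))                        # False
print(is_allowed("/", RULES))                             # True
```

Note that `/Blog/` itself comes out blocked in this sketch, so a blog landing page at that URL would need its own Allow line.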
-
That would be something like this. Please check or test it within Google Webmaster Tools to see if it works, because I don't want to screw up your whole site. What this does is disallow your complete site and allow just the /Blog/Post URLs.
User-agent: *
Disallow: /
Allow: /Blog/Post