What content should I block in wodpress with robots.txt?
-
I need to know if anyone has tips on creating a good robots.txt. I have read a lot of info, but I am just not clear on what I should allow and not allow on wordpress. For example there are pages and posts, then attachments, wp-admin, wp-content and so on. Does anyone have a good robots.txt guideline?
-
Here is what I have changed it to...found various articles including the one listed above and decided to go with this, not sure if it is good or bad.
-
The less you put in robots.txt the better...
If there exists anywhere (on your site or on another) a link to some page that you put in robots.txt, it won't stop google from indexing it, and it will look bad (no title, no description since the bot can't access them). Worse, if you wan to noindex that page in the future it won't work !
So block content if you know there's no link to it and there won't ever be, otherwise the right move is the noindex meta tag.
-
Here's a sample robots.txt from the wordpress.org site itself:
http://codex.wordpress.org/Search_Engine_Optimization_for_Wordpress#Robots.txt_Optimization
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Content From API - Remove or to Redirect ?
Hi Guys,
Intermediate & Advanced SEO | | PaddyM556
I am working on a site at the moment,
Previous developer used a API to pull in HealthCare content (HSE) .
So the API basically generates landing pages within the site, and generates the content.
To date it has over 2k in pages being generated.
Some actually rank organically and some don't. New site being launch: So a new site is being launched & the "health advice" where this content used to live be not included in the new site. So this content will not have a place to be displayed. My Query: Would you allow the old content die off in the migration process & just become 404's
Or
Would you 301 redirect the all or only ranking pages to the homepage ? Other considerations, site will be moved to https:// so site will be submitted to search console & re-indexed by Google. Would love to hear if anyone had similar situation or suggestions.
Best Regards
Pat0 -
Content update on 24hr schedule
Hello! I have a website with over 1300 landings pages for specific products. These individual pages update on a 24hr cycle through out API. Our API pulls reviews/ratings from other sources and then writes/updates that content onto the page. Is that 'bad"? Can that be viewed as spammy or dangerous in the eyes of google? (My first thought is no, its fine) Is there such a thing as "too much content". For example if we are adding roughly 20 articles to our site a week, is that ok? (I know news websites add much more than that on a daily basis but I just figured I would ask) On that note, would it be better to stagger our posting? For example 20 articles each week for a total of 80 articles, or 80 articles once a month? (I feel like trickle posting is probably preferable but I figured I would ask.) Is there any negatives to the process of an API writing/updating content? Should we have 800+ words of static content on each page? Thank you all mozzers!
Intermediate & Advanced SEO | | HashtagHustler0 -
Penalty for duplicate content on the same website?
Is it possible to get a penalty for duplicate content on the same website? I have a old custom-built site with a large number of filters that are pre-generated for speed. Basically the only difference is the meta title and H1 tag, with a few text differences here and there. Obviously I could no-follow all the filter links but it would take an enormous amount of work. The site is performing well in the search. I'm trying to decide whether if there is a risk of a penalty, if not I'm loath to do anything in case it causes other issues.
Intermediate & Advanced SEO | | seoman100 -
Faceted Navigation and Dupe Content
Hi, We have a Magento website using layered navigation - it has created a lot of duplicate content and I did ask Google in GWT to "No URLS" most of the querystrings except the "p" which is for pagination. After reading how to tackle this issue, I tried to tackle it using a combination of Meta Noindex, Robots, Canonical but still it was a snowball I was trying to control. In the end, I opted for using Ajax for the layered navigation - no matter what option is selected there is no parameters latched on to the url, so no dupe/near dupe URL's created. So please correct me if I am wrong, but no new links flow to those extra URL's now so presumably in due course Google will remove them from the index? Am I correct in thinking that? Plus these extra URL's have Meta Noindex on them too - I still have tens of thousands of pages indexed in Google. How long will it take for Google to remove them from index? Will having Meta No Index on the pages that need to be removed help? Any other way of removing thousands of URLS from GWT? Thanks again, B
Intermediate & Advanced SEO | | bjs20100 -
Blocked from google
Hi, i used to get a lot of trafic from google but sudantly there was a problem with the website and it seams to be blocked. We are also in the middle of changing the root domain because we are making a new webpage, i have looked at the webmaster tools and corrected al the errors but the page is still not visible in google. I have also orderd a new crawl. Anyone have any trics? do i loose a lot when i move the domainname, or is this a good thing in this mater? The old one is smakenavitalia.no The new one is Marthecarrara.no Best regards Svein Økland
Intermediate & Advanced SEO | | sveinokl0 -
Magento Duplicate Content Recovery
Hi, we switched platforms to Magento last year. Since then our SERPS rankings have declined considerably (no sudden drop on any Panda/Penguin date lines). After investigating, it appeared we neglected to No index, follow all our filter pages and our total indexed pages rose sevenfold in a matter of weeks. We have since fixed the no index issue and the pages indexed are now below what we had pre switch to Magento. We've seen some positive results in the last week. Any ideas when/if our rankings will return? Thanks!
Intermediate & Advanced SEO | | Jonnygeeuk0 -
Will blocking urls in robots.txt void out any backlink benefits? - I'll explain...
Ok... So I add tracking parameters to some of my social media campaigns but block those parameters via robots.txt. This helps avoid duplicate content issues (Yes, I do also have correct canonical tags added)... but my question is -- Does this cause me to miss out on any backlink magic coming my way from these articles, posts or links? Example url: www.mysite.com/subject/?tracking-info-goes-here-1234 Canonical tag is: www.mysite.com/subject/ I'm blocking anything with "?tracking-info-goes-here" via robots.txt The url with the tracking info of course IS NOT indexed in Google but IT IS indexed without the tracking parameters. What are your thoughts? Should I nix the robots.txt stuff since I already have the canonical tag in place? Do you think I'm getting the backlink "juice" from all the links with the tracking parameter? What would you do? Why? Are you sure? 🙂
Intermediate & Advanced SEO | | AubieJon0 -
How are they avoiding duplicate content?
One of the largest stores in USA for soccer runs a number of whitelabel sites for major partners such as Fox and ESPN. However, the effect of this is that they are creating duplicate content for their products (and even the overall site structure is very similar). Take a look at: http://www.worldsoccershop.com/23147.html http://www.foxsoccershop.com/23147.html http://www.soccernetstore.com/23147.html You can see that practically everything is the same including: product URL product title product description My question is, why is Google not classing this as duplicate content? Have they coded for it in a certain way or is there something I'm missing which is helping them achieve rankings for all sites?
Intermediate & Advanced SEO | | ukss19840