Default Robots.txt in WordPress - Should I change it?
-
I have a WordPress site using the Genesis theme, and I am using the default robots.txt. It has the line Allow: /wp-admin/admin-ajax.php. Is that okay, or could it cause a problem? Should I change it?
-
Yes, we're a news site as well, and in our case we want to make sure the low-quality pages on TNW aren't indexed.
-
Thank you both for your responses.
@Martijn, your robots.txt is a really nice example, but for my new site, is it good practice to block these areas?
@Peter, to be on the safe side, I was using the same robots.txt...
-
In addition to Martijn's, here is my robots.txt:
User-agent: *
Disallow:
Sitemap: http://peter.nikolow.me/sitemap_index.xml
But with Yoast, categories, tags, most archives, and other generated pages are excluded from indexing.
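For anyone unsure what an empty Disallow line does, here is a minimal sketch using Python's standard urllib.robotparser. It assumes the file reads as User-agent, an empty Disallow, and a Sitemap line; the /tag/seo/ URL is a made-up example, and site_maps() needs Python 3.8 or later. An empty Disallow blocks nothing, so crawling stays open and indexing is left to the noindex settings in Yoast.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from this post, assuming "Disallow:" is intentionally empty.
ROBOTS_TXT = """\
User-agent: *
Disallow:
Sitemap: http://peter.nikolow.me/sitemap_index.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical tag URL, used only to show that nothing is blocked.
tag_allowed = parser.can_fetch("*", "http://peter.nikolow.me/tag/seo/")

# Sitemap lines are collected separately from the allow/disallow rules.
sitemaps = parser.site_maps()

print(tag_allowed)
print(sitemaps)
```

So this file only advertises the sitemap; it does not keep crawlers out of any URL.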
-
Hi Peter,
Usually I would say it's not enough, as that robots.txt forgets to exclude the search pages, and in most cases you want to make sure the WP core files and tag pages are not included. Take a look at our robots.txt at http://thenextweb.com/robots.txt to see what we've included there. You'll notice we include, for example, these:
User-agent: *
Disallow: ?p=
Disallow: /wp-includes/
Disallow: /wp-login.php
Disallow: /wp-admin/*
Disallow: /wp-register.php
Disallow: /wp-content/themes/icetea/includes/*
Disallow: /tag/
Disallow: ?s=
Disallow: /search/*
Other cases in our robots.txt are there specifically because of our site and may not apply to others.
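To see how rules like these interact with the Allow: /wp-admin/admin-ajax.php line from the original question, here is a rough Python sketch of the longest-match evaluation described in RFC 9309 and in Google's robots.txt documentation: the most specific (longest) matching pattern wins, and Allow wins a tie. It is an illustration under stated assumptions, not how any particular crawler is implemented. The helper names, the example.com URLs, and the choice of rules are all hypothetical; the ?p= and ?s= rules are left out because they do not start with a slash, and this sketch does not try to model how crawlers treat such patterns.

```python
import re
from urllib.parse import urlsplit


def pattern_to_regex(pattern):
    """Turn a robots.txt path pattern into a regex anchored at the path start.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(rules, url):
    """Longest matching pattern wins; Allow wins ties; no match means allowed."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty rule value matches nothing
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive.lower() == "allow")
            # Tuple comparison: longer pattern first, then Allow over Disallow.
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


# A subset of the rules listed above, plus the Allow line from the question.
RULES = [
    ("Disallow", "/wp-includes/"),
    ("Disallow", "/wp-login.php"),
    ("Disallow", "/wp-admin/*"),
    ("Disallow", "/tag/"),
    ("Disallow", "/search/*"),
    ("Allow", "/wp-admin/admin-ajax.php"),
]

for url in (
    "http://example.com/wp-admin/admin-ajax.php",
    "http://example.com/wp-admin/options.php",
    "http://example.com/tag/seo/",
    "http://example.com/2014/01/some-post/",
):
    print(url, is_allowed(RULES, url))
```

Under this rule the Allow line is more specific than Disallow: /wp-admin/*, so admin-ajax.php stays crawlable while the rest of /wp-admin/ is blocked. That is why the default Allow line is generally harmless to keep.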