Robots.txt is blocking Wordpress Pages from Googlebot?
-
I have a robots.txt file on my server, which I did not develop, it was done by the web designer at the company before me. Then there is a word press plugin that generates a robots.txt file. How Do I unblock all the wordpress pages from googlebot?
-
Delete everything under the following directives and you should be good.
User-agent: Googlebot
Disallow: /*/trackback
Disallow: /*/feed
Disallow: /*/comments
Disallow: /?
Disallow: /*?
Disallow: /page/
As a rule of thumb, it's not a good idea to use wild cards in your robots.txt file - you may be excluding an entire folder inadvertently.
-
Here is my robots.txt from google webmaster tools. These are the pages that are being blocked and I am not sure which of these to get rid of in order to unblock blog posts from being searched.
http://ensoplastics.com/theblog/?cat=743
http://ensoplastics.com/theblog/?p=240
These category pages and blog posts are blocked so do I delete the /? ...I am new to SEO and web development so I am not sure why the developer of this robots.txt file would block pages and posts in wordpress. It seems to me like that is the reason why someone has a blog so it can be searched and get more exposure for SEO purposes.
Sitemap: http://www.ensobottles.com/blog/sitemap.xml
User-agent: Googlebot
Disallow: /*/trackback
Disallow: /*/feed
Disallow: /*/comments
Disallow: /?
Disallow: /*?
Disallow: /page/
User-agent: *Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /trackback
Disallow: /commentsDisallow: /feed
-
I'm not sure to what extent your website is being blocked with the robots.txt file but it's pretty easy to diagnose. You'll first need to identify and confirm that googlebot is being blocked by typing in your web browser ~> www.mywebsite.com/robots.txt
If you see an entry such as "User-agent: *" or "User-agent: googlebot" being used in conjunction with "Disallow" then you know your website is being blocked with the robots.txt file. Given your situation you'll need to go through a two step process.
First, go into your wordpress plugin page and deactivate the plugin which generates your robots.txt file. Second, login to the root folder of your server and look for the robots.txt file. Lastly, change "Disallow" to "Allow" and that should work but you'll need to confirm by typing in the robots URL again.
Given the limited information in your question I hope that helps. If you run into any more issues don't hesitate to post them here.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is robots met tag a more reliable than robots.txt at preventing indexing by Google?
What's your experience of using robots meta tag v robots.txt when it comes to a stand alone solution to prevent Google indexing? I am pretty sure robots meta tag is more reliable - going on own experiences, I have never experience any probs with robots meta tags but plenty with robots.txt as a stand alone solution. Thanks in advance, Luke
Intermediate & Advanced SEO | | McTaggart1 -
Search Results Pages Blocked in Robots.txt?
Hi I am reviewing our robots.txt file. I wondered if search results pages should be blocked from crawling? We currently have this in the file /searchterm* Is it a good thing for SEO?
Intermediate & Advanced SEO | | BeckyKey0 -
Best practice for disallowing URLS with Robots.txt
Hi Everybody, We are currently trying to tidy up the crawling errors which are appearing when we crawl the site. On first viewing, we were very worried to say the least:17000+. But after looking closer at the report, we found the majority of these errors were being caused by bad URLs featuring: Currency - For example: "directory/currency/switch/currency/GBP/uenc/aHR0cDovL2NlbnR1cnlzYWZldHkuY29tL3dvcmt3ZWFyP3ByaWNlPTUwLSZzdGFuZGFyZHM9NzEx/" Color - For example: ?color=91 Price - For example: "?price=650-700" Order - For example: ?dir=desc&order=most_popular Page - For example: "?p=1&standards=704" Login - For example: "customer/account/login/referer/aHR0cDovL2NlbnR1cnlzYWZldHkuY29tL2NhdGFsb2cvcHJvZHVjdC92aWV3L2lkLzQ1ODczLyNyZXZpZXctZm9ybQ,,/" My question now is as a novice of working with Robots.txt, what would be the best practice for disallowing URLs featuring these from being crawled? Any advice would be appreciated!
Intermediate & Advanced SEO | | centurysafety0 -
Robots.txt Allowed
Hello all, We want to block something that has the following at the end: http://www.domain.com/category/product/some+demo+-text-+example--writing+here So I was wondering if doing: /*example--writing+here would work?
Intermediate & Advanced SEO | | ThomasHarvey0 -
Substantial difference between Number of Indexed Pages and Sitemap Pages
Hey there, I am doing a website audit at the moment. I've notices substantial differences in the number of pages indexed (search console), the number of pages in the sitemap and the number I am getting when I crawl the page with screamingfrog (see below). Would those discrepancies concern you? The website and its rankings seems fine otherwise. Total indexed: 2,360 (Search Consule)
Intermediate & Advanced SEO | | Online-Marketing-Guy
About 2,920 results (Google search "site:example.com")
Sitemap: 1,229 URLs
Screemingfrog Spider: 1,352 URLs Cheers,
Jochen0 -
Robots.txt Blocking - Best Practices
Hi All, We have a web provider who's not willing to remove the wildcard line of code blocking all agents from crawling our client's site (user-agent: *, Disallow: /). They have other lines allowing certain bots to crawl the site but we're wondering if they're missing out on organic traffic by having this main blocking line. It's also a pain because we're unable to set up Moz Pro, potentially because of this first line. We've researched and haven't found a ton of best practices regarding blocking all bots, then allowing certain ones. What do you think is a best practice for these files? Thanks! User-agent: * Disallow: / User-agent: Googlebot Disallow: Crawl-delay: 5 User-agent: Yahoo-slurp Disallow: User-agent: bingbot Disallow: User-agent: rogerbot Disallow: User-agent: * Crawl-delay: 5 Disallow: /new_vehicle_detail.asp Disallow: /new_vehicle_compare.asp Disallow: /news_article.asp Disallow: /new_model_detail_print.asp Disallow: /used_bikes/ Disallow: /default.asp?page=xCompareModels Disallow: /fiche_section_detail.asp
Intermediate & Advanced SEO | | ReunionMarketing0 -
Redirecting thin content city pages to the state page, 404s or 301s?
I have a large number of thin content city-level pages (possibly 20,000+) that I recently removed from a site. Currently, I have it set up to send a 404 header when any of these removed city-level pages are accessed. But I'm not sending the visitor (or search engine) to a site-wide 404 page. Instead, I'm using PHP to redirect the visitor to the corresponding state-level page for that removed city-level page. Something like: if (this city page should be removed) { header("HTTP/1.0 404 Not Found");
Intermediate & Advanced SEO | | rriot
header("Location:http://example.com/state-level-page")
exit();
} Is it problematic to send a 404 header and still redirect to a category-level page like this? By doing this, I'm sending any visitors to removed pages to the next most relevant page. Does it make more sense to 301 all the removed city-level pages to the state-level page? Also, these removed city-level pages collectively have very little to none inbound links from other sites. I suspect that any inbound links to these removed pages are from low quality scraper-type sites anyway. Thanks in advance!2 -
301 - should I redirect entire domain or page for page?
Hi, We recently enabled a 301 on our domain from our old website to our new website. On the advice of fellow mozzer's we copied the old site exactly to the new domain, then did the 301 so that the sites are identical. Question is, should we be doing the 301 as a whole domain redirect, i.e. www.oldsite.com is now > www.newsite.com, or individually setting each page, i.e. www.oldsite.com/page1 is now www.newsite.com/page1 etc for each page in our site? Remembering that both old and new sites (for now) are identical copies. Also we set the 301 about 5 days ago and have verified its working but haven't seen a single change in rank either from the old site or new - is this because Google hasn't likely re-indexed yet? Thanks, Anthony
Intermediate & Advanced SEO | | Grenadi0