Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Robots.txt is blocking Wordpress Pages from Googlebot?
-
I have a robots.txt file on my server, which I did not develop, it was done by the web designer at the company before me. Then there is a word press plugin that generates a robots.txt file. How Do I unblock all the wordpress pages from googlebot?
-
Delete everything under the following directives and you should be good.
User-agent: Googlebot
Disallow: /*/trackback
Disallow: /*/feed
Disallow: /*/comments
Disallow: /?
Disallow: /*?
Disallow: /page/
As a rule of thumb, it's not a good idea to use wild cards in your robots.txt file - you may be excluding an entire folder inadvertently.
-
Here is my robots.txt from google webmaster tools. These are the pages that are being blocked and I am not sure which of these to get rid of in order to unblock blog posts from being searched.
http://ensoplastics.com/theblog/?cat=743
http://ensoplastics.com/theblog/?p=240
These category pages and blog posts are blocked so do I delete the /? ...I am new to SEO and web development so I am not sure why the developer of this robots.txt file would block pages and posts in wordpress. It seems to me like that is the reason why someone has a blog so it can be searched and get more exposure for SEO purposes.
Sitemap: http://www.ensobottles.com/blog/sitemap.xml
User-agent: Googlebot
Disallow: /*/trackback
Disallow: /*/feed
Disallow: /*/comments
Disallow: /?
Disallow: /*?
Disallow: /page/
User-agent: *Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /trackback
Disallow: /commentsDisallow: /feed
-
I'm not sure to what extent your website is being blocked with the robots.txt file but it's pretty easy to diagnose. You'll first need to identify and confirm that googlebot is being blocked by typing in your web browser ~> www.mywebsite.com/robots.txt
If you see an entry such as "User-agent: *" or "User-agent: googlebot" being used in conjunction with "Disallow" then you know your website is being blocked with the robots.txt file. Given your situation you'll need to go through a two step process.
First, go into your wordpress plugin page and deactivate the plugin which generates your robots.txt file. Second, login to the root folder of your server and look for the robots.txt file. Lastly, change "Disallow" to "Allow" and that should work but you'll need to confirm by typing in the robots URL again.
Given the limited information in your question I hope that helps. If you run into any more issues don't hesitate to post them here.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why does Google rank a product page rather than a category page?
Hi, everybody In the Moz ranking tool for one of our client's (the client sells sport equipment) account, there is a trend where more and more of their landing pages are product pages instead of category pages. The optimal landing page for the term "sleeping bag" is of course the sleeping bag category page, but Google is sending them to a product page for a specific sleeping bag.. What could be the critical factors that makes the product page more relevant than the category page as the landing page?
Intermediate & Advanced SEO | | Inevo0 -
What do you add to your robots.txt on your ecommerce sites?
We're looking at expanding our robots.txt, we currently don't have the ability to noindex/nofollow. We're thinking about adding the following: Checkout Basket Then possibly: Price Theme Sortby other misc filters. What do you include?
Intermediate & Advanced SEO | | ThomasHarvey0 -
Robots.txt - Do I block Bots from crawling the non-www version if I use www.site.com ?
my site uses is set up at http://www.site.com I have my site redirected from non- www to the www in htacess file. My question is... what should my robots.txt file look like for the non-www site? Do you block robots from crawling the site like this? Or do you leave it blank? User-agent: * Disallow: / Sitemap: http://www.morganlindsayphotography.com/sitemap.xml Sitemap: http://www.morganlindsayphotography.com/video-sitemap.xml
Intermediate & Advanced SEO | | morg454540 -
Hreflang and paginated page
Hi, I can not seem to find good documentation about the use of hreflang and paginated page when using rel=next , rel=prev
Intermediate & Advanced SEO | | TjeerdvZ
Does any know where to find decent documentatio?, I could only find documentation about pagination and hreflang when using canonicals on the paginated page. I have doubts on what is the best option: The way tripadvisor does it:
http://www.tripadvisor.nl/Hotels-g187139-oa390-Corsica-Hotels.html
Each paginated page is referring to it's hreflang paginated page, for example: So should the hreflang refer to the pagined specific page or should it refer to the "1st" page? in this case:
http://www.tripadvisor.nl/Hotels-g187139-Corsica-Hotels.html Looking foward to your suggestions.0 -
Baidu Spider appearing on robots.txt
Hi, I'm not too sure what to do about this or what to think of it. This magically appeared in my companies robots.txt file (literally magically appeared/text is below) User-agent: Baiduspider
Intermediate & Advanced SEO | | IceIcebaby
User-agent: Baiduspider-video
User-agent: Baiduspider-image
Disallow: / I know that Baidu is the Google of China, but I'm not sure why this would appear in our robots.txt all of a sudden. Should I be worried about a hack? Also, would I want to disallow Baidu from crawling my companies website? Thanks for your help,
-Reed0 -
Should I use meta noindex and robots.txt disallow?
Hi, we have an alternate "list view" version of every one of our search results pages The list view has its own URL, indicated by a URL parameter I'm concerned about wasting our crawl budget on all these list view pages, which effectively doubles the amount of pages that need crawling When they were first launched, I had the noindex meta tag be placed on all list view pages, but I'm concerned that they are still being crawled Should I therefore go ahead and also apply a robots.txt disallow on that parameter to ensure that no crawling occurs? Or, will Googlebot/Bingbot also stop crawling that page over time? I assume that noindex still means "crawl"... Thanks 🙂
Intermediate & Advanced SEO | | ntcma0 -
Soft 404's from pages blocked by robots.txt -- cause for concern?
We're seeing soft 404 errors appear in our google webmaster tools section on pages that are blocked by robots.txt (our search result pages). Should we be concerned? Is there anything we can do about this?
Intermediate & Advanced SEO | | nicole.healthline4 -
Blocking Pages Via Robots, Can Images On Those Pages Be Included In Image Search
Hi! I have pages within my forum where visitors can upload photos. When they upload photos they provide a simple statement about the photo but no real information about the image,definitely not enough for the page to be deemed worthy of being indexed. The industry however is one that really leans on images and having the images in Google Image search is important to us. The url structure is like such: domain.com/community/photos/~username~/picture111111.aspx I wish to block the whole folder from Googlebot to prevent these low quality pages from being added to Google's main SERP results. This would be something like this: User-agent: googlebot Disallow: /community/photos/ Can I disallow Googlebot specifically rather than just using User-agent: * which would then allow googlebot-image to pick up the photos? I plan on configuring a way to add meaningful alt attributes and image names to assist in visibility, but the actual act of blocking the pages and getting the images picked up... Is this possible? Thanks! Leona
Intermediate & Advanced SEO | | HD_Leona0