Issue with Robots.txt file blocking meta description
-
Hi,
Can you please tell me why the following error is showing up in the serps for a website that was just re-launched 7 days ago with new pages (301 redirects are built in)?
A description for this result is not available because of this site's robots.txt – learn more.
Once we noticed it yesterday, we made some changed to the file and removed the amount of items in the disallow list.
Here is the current Robots.txt file:
# XML Sitemap & Google News Feeds version 4.2 - http://status301.net/wordpress-plugins/xml-sitemap-feed/ Sitemap: http://www.website.com/sitemap.xml Sitemap: http://www.website.com/sitemap-news.xml User-agent: * Disallow: /wp-admin/ Disallow: /wp-includes/ Other notes... the site was developed in WordPress and uses that followign plugins:
- WooCommerce All-in-One SEO Pack
- Google Analytics for WordPress
- XML Sitemap
- Google News Feeds
Currently, in the SERPs, it keeps jumping back and forth between showing the meta description for the www domain and showing the error message (above).
Originally, WP Super Cache was installed and has since been deactivated, removed from WP-config.php and deleted permanently.
One other thing to note, we noticed yesterday that there was an old xml sitemap still on file, which we have since removed and resubmitted a new one via WMT. Also, the old pages are still showing up in the SERPs.
Could it just be that this will take time, to review the new sitemap and re-index the new site?
If so, what kind of timeframes are you seeing these days for the new pages to show up in SERPs? Days, weeks? Thanks, Erin ```
-
At the moment, it doesn't seem that rel=publisher is doing all that much for sites (aside from sometimes showing better info ion the knowledge graph listing on Brand searches) but personally I believe it's functionality and influence are going to be greatly expanded fairly soon, so well worth doing. As far as it contributing anything to help speed up indexing... doubt it.
P.
-
Paul,
Thanks... you hit upon my hunch, that we will just have to wait.
Much of the information in the SERPs (metadescriptions, titles and urls) are still old,even though they redirect to the new pages when I click.
Thanks for the tip... and about social media.
Do you think it will help to get the rel=publisher link to the Google+ page on the site?
Erin
-
A lot of people, especially WP users use modules that may block certain spiders crawling your site, but in your case, you don't seem to have any.
-
If you just changed the robots.txt file yesterday, my guess is you're going to have to be patient while the site gets recrawled, Erin. Any of the pages that are in the index and were cached before yesterday's robots update will still include the directive not to include the metadescription (since that's the condition they were under when they were cached.)
I suspect the pages you're seeing with metadescriptions were crawled since the robots update. Are you seeing the same page change whether it shows metadescription or not?
As far as old pages showing in the SERPs, again they'll all have to be crawled before the 301 redirects can be discovered and the SEs can begin to understand they should be dropped. (Even then it can take days to weeks for the originals to drop out.)
Another very effective way to help get the new site indexed faster is to attract some good-quality new links to the new pages. Social Media can be especially effective for this, Google+ in particular.
Paul
-
Thanks!
What do I need to look for in the .htaccess file?
Here is what is there... and the rest (not shown) are redirects:
BEGIN WordPress <ifmodule mod_rewrite.c="">RewriteEngine On RewriteBase / RewriteRule ^index.php$ - [L] RewriteCond %{REQUEST_FILENAME} !-f RewriteCond %{REQUEST_FILENAME} !-d RewriteRule . /index.php [L]</ifmodule> # END WordPress
BEGIN WordPress <ifmodule mod_rewrite.c="">RewriteEngine On RewriteBase / RewriteRule ^index.php$ - [L] RewriteCond %{REQUEST_FILENAME} !-f RewriteCond %{REQUEST_FILENAME} !-d RewriteRule . /index.php [L]</ifmodule> # END WordPress
-
Thanks for the tips! Let me check it out.
-
I'd also insure its not something to do with your .htacess file.
-
Make sure the pages aren't blocked with meta robots noindex tag
Fetch as Google in WMT to request a full site recrawl.
Run brokenlinkcheck.com and see if their crawler is successfully crawling or if it's blocked.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
What happens to crawled URLs subsequently blocked by robots.txt?
We have a very large store with 278,146 individual product pages. Since these are all various sizes and packaging quantities of less than 200 product categories my feeling is that Google would be better off making sure our category pages are indexed. I would like to block all product pages via robots.txt until we are sure all category pages are indexed, then unblock them. Our product pages rarely change, no ratings or product reviews so there is little reason for a search engine to revisit a product page. The sales team is afraid blocking a previously indexed product page will result in in it being removed from the Google index and would prefer to submit the categories by hand, 10 per day via requested crawling. Which is the better practice?
Intermediate & Advanced SEO | | AspenFasteners1 -
Hreflang implementation issue
We are currently handling search for a global brand www.example.com which has presence in many countries worldwide. To help Google understand that there is an alternate version of the website available in another language, we have used “hreflang” tags. Also, there is a mother website (www.example.com/global) which is given the attribution of “x-default” in the “hreflang” tag. For Malaysia as a geolocation, the mother website is ranking instead of the local website (www.example.com/my) for majority of the products. The code used for “hreflang” tag execution, on a product page, being: These “hreflang” tags are also present in the XML sitemap of the website, mentioning them below: <loc>http://www.example.com/my/product_name</loc> <lastmod>2017-06-20</lastmod> Is this implementation of “hreflang” tags fine? As this implementation is true across all geo-locations, but the mother website is out-ranking me only in the Malaysia market. If the implementation is correct, what could be other reasons for the same ranking issue, as all other SEO elements have been thoroughly verified and they seem fine.
Intermediate & Advanced SEO | | Starcom_Search0 -
If my website do not have a robot.txt file, does it hurt my website ranking?
After a site audit, I find out that my website don't have a robot.txt. Does it hurt my website rankings? One more thing, when I type mywebsite.com/robot.txt, it automatically redirect to the homepage. Please help!
Intermediate & Advanced SEO | | binhlai0 -
Disavow files on m.site
Hi I have a site www.example.com and finally have got the developers to add Google webmaster verification codes for: example.com m.example.com As I was advised this is best practice - however I was wondering does this mean I now need to add the disavow file. Thanks Andy
Intermediate & Advanced SEO | | Andy-Halliday0 -
meta robots no follow on page for paid links
Hi I have a page containing paid links. i would like to add no follow attribute to these links
Intermediate & Advanced SEO | | Kung_fu_Panda
but from technical reasons, i can only place meta robots no follow on page level (
is that enough for telling Google that the links in this page are paid and and to prevent Google penlizling the sites that the page link to? Thanks!0 -
Baidu Spider appearing on robots.txt
Hi, I'm not too sure what to do about this or what to think of it. This magically appeared in my companies robots.txt file (literally magically appeared/text is below) User-agent: Baiduspider
Intermediate & Advanced SEO | | IceIcebaby
User-agent: Baiduspider-video
User-agent: Baiduspider-image
Disallow: / I know that Baidu is the Google of China, but I'm not sure why this would appear in our robots.txt all of a sudden. Should I be worried about a hack? Also, would I want to disallow Baidu from crawling my companies website? Thanks for your help,
-Reed0 -
How to fix issues from 301s
Case: We are currently in the middle of a site migration from .asp to .net and Endeca PageBuilder, and from a homebrewed search provider to Endeca Search. We have migrated most of our primary landing pages and our entire e-commerce site to the new platforms. During the transition approximately 100 of our primary landing pages were inadvertently 302ed to the new version. Once this was caught they were immediately changed to 301s and submitted to the Google’s index through webmaster tools. We initially saw increases in visits to the new pages, but currently (approximately 3 weeks after the change from 301 to 302) are experiencing a significant decline in visits. Issue: My assumption is many of the internal links (from pages which are now 301ed as well) to these primary landing pages are still pointing to the old version of the primary landing page in Google’s cache, and thus have not passed the importance and internal juice to the new versions. There are no navigational links or entry points to the old supporting pages left, and I believe this is what is driving the decline. Proposed resolution: I intend to create a series of HTML sitemaps of the old version (.asp) of all pages which have recently been 301ed. I will then submit these pages to Google’s index (not as sitemaps, just normal pages) with the selection to index all linked pages. My intention is to force Google to pick up all of the 301s, thus enforcing the authority channels we have set up. Question 1: Is the assumption that the decline could be because of missed authority signals reasonable? Question 2: Could the proposed solution be harmful? Question 3: Will the proposed solution be adequate to resolve the issue? Any help would be sincerely appreciated. Thank you in advance, David
Intermediate & Advanced SEO | | FireMountainGems0 -
Improve change my Meta Description shows in SERP
I feel my meta descriptions are descriptive and fairly represent the info on each page of my site. However, Google frequently includes this "20+ items" in front of the snippet. I run a job site and each page list 20 jobs. What if I include a bit of coding in the Meta Description to include "Latest Jobs Posted TODAY's DATE" - since the jobs listed on the page will include a date. On each page there is also option to "Create Email Alert" and "save Jobs" maybe I should include writing about that as well? I have read all Google's documents on the importance of making Meta Des relevant for the page etc, so any good insight how increase my chances of getting the meta des displayed in the SERP would be appreciated. thank you, Kristian
Intermediate & Advanced SEO | | knielsen0