Noindex, nofollow on a blog since 2009
-
Just reviewed a WordPress blog that launched in 2009, but somehow the privacy setting was set to discourage search engines from indexing it, so all this time there's been a noindex, nofollow meta tag in the header. The client couldn't figure out why masses of content weren't showing up in search results.
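For anyone who wants to verify the tag is actually gone after flipping the setting, here's a rough sketch of a check in Python (standard library only; the sample HTML strings below are made up for illustration):

```python
# Parse HTML and report whether any <meta name="robots"> tag contains
# a noindex directive. Sample HTML strings are invented for illustration.
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the content of any <meta name="robots"> tags."""
    def __init__(self):
        super().__init__()
        self.robots_directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        if attrs.get("name", "").lower() == "robots":
            self.robots_directives.append(attrs.get("content", "").lower())

def is_noindexed(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    return any("noindex" in d for d in parser.robots_directives)

# The kind of tag WordPress's privacy setting emits, vs. a clean header:
blocked = '<html><head><meta name="robots" content="noindex,nofollow"></head></html>'
fixed = '<html><head><title>Blog</title></head></html>'
print(is_noindexed(blocked))  # True
print(is_noindexed(fixed))    # False
```

In practice you'd fetch the live page source and feed it to the same function, and re-check after the fix to confirm the tag no longer appears.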
I've fixed the setting and assume Google will crawl it in short order; the blog is a subdirectory of their main site. My question is whether there's anything else I can or should do. Can Google recognize the age of the content, or that it once had a noindex meta tag? Will it "date" the blog as of today? Has the client lost out on untold benefits from the long history of content creation? I imagine that link juice from any backlinks to the blog will now flow back to the main site; do you think that's true?
Just curious what others might think of this scenario and whether any other action is warranted.
-
Thanks Dan. One thing I found interesting is that Google Webmaster Tools doesn't offer any alerts about pages that aren't indexed because of meta tags, only about those blocked by the robots.txt file.
-
Hi
Great responses Matt and Ben, thanks!! The only things I could add are:
Webmaster Tools
- Check Google Webmaster Tools every few days for the first 2-3 weeks.
- You may turn up some 404s or other types of errors that should be corrected.
- And keep an eye out for any other warnings.
Analytics
- You're going to spike your traffic in analytics big time (potentially, hopefully), or at least skew the data.
- Use filters and advanced segments to separate blog traffic so you can still analyze things even after a potential spike in blog search traffic.
- At a minimum, make an annotation of the date you made the blog indexable.
Dates
- Regarding the dates, I did come across something recently on removing dates from the SERPs - I have not tested it, so please take it with a grain of salt - and I would only recommend trying it if the content is not "time sensitive" (like a cooking recipe).
Hope all this helps!
-Dan
-
Thanks for the clarification Ben. I think I'll leave the older posts as is. They've been actively posting several times a week, so there should be enough fresh content. My hope is that Google recognizes the age of the blog, because it's my understanding that age factors into the ranking algorithm.
-
Ahh yeah, my bad, ignore that bit. I think you'd still want to make a subtle change to each post so WordPress sets the last-updated date in the sitemap to today; that way Google will put a higher priority on the content when indexing your site.
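If you want to confirm those subtle edits actually moved the last-updated dates, the sitemap's `<lastmod>` values are what Google sees. A quick sketch of pulling them out with Python's standard library (the sample sitemap and URLs below are invented):

```python
# List each URL's <lastmod> from a sitemap so you can spot-check that
# touched posts now carry a fresh date. Sample sitemap is invented.
import xml.etree.ElementTree as ET

SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/blog/old-post/</loc><lastmod>2009-06-01</lastmod></url>
  <url><loc>https://example.com/blog/touched-post/</loc><lastmod>2012-03-15</lastmod></url>
</urlset>"""

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def lastmod_by_url(sitemap_xml):
    """Map each <loc> to its <lastmod> value (None if missing)."""
    root = ET.fromstring(sitemap_xml)
    return {
        url.findtext("sm:loc", namespaces=NS): url.findtext("sm:lastmod", namespaces=NS)
        for url in root.findall("sm:url", NS)
    }

for loc, lastmod in lastmod_by_url(SITEMAP).items():
    print(loc, "->", lastmod)
```

After re-saving a post you'd regenerate the sitemap and re-run this to see the date change.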
-
Thanks, the sitemaps are a good idea. Ben, I'm not sure what you mean about making the content different from what Google has in its index. Because of the meta tag, it doesn't have any of this content in its index, right?
-
You've done the most important step (removing the noindex/nofollow tags). The only additional thing I would do is submit (or resubmit) the XML sitemap to Google. Make sure that XML sitemap is perfect and error-free so that you don't create any additional problems.
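On the "perfect and error-free" point, a quick pre-submission lint can catch the obvious problems before Google does; at a minimum the file should be well-formed XML and every `<url>` should carry an absolute `<loc>`. A rough sketch (Python standard library; function and sample data are mine, not any official tool):

```python
# Minimal sitemap lint: well-formed XML, every <url> has an absolute <loc>.
# Sample sitemaps below are invented for illustration.
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def lint_sitemap(sitemap_xml):
    """Return a list of problems; an empty list means the basics check out."""
    try:
        root = ET.fromstring(sitemap_xml)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]
    problems = []
    for i, url in enumerate(root.findall(f"{NS}url")):
        loc = url.findtext(f"{NS}loc")
        if not loc:
            problems.append(f"entry {i}: missing <loc>")
        elif urlparse(loc.strip()).scheme not in ("http", "https"):
            problems.append(f"entry {i}: <loc> is not an absolute URL: {loc}")
    return problems

good = ('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        '<url><loc>https://example.com/blog/post-1/</loc></url></urlset>')
bad = ('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
       '<url><loc>/blog/post-2/</loc></url></urlset>')
print(lint_sitemap(good))  # []
print(lint_sitemap(bad))   # flags the relative URL
```

It's no substitute for the report Webmaster Tools gives you after submission, but it catches the self-inflicted errors before they get there.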
Google should be smart enough to recognize the dates. I've never had a situation where it was years between publish and index. I have, however, had situations where it was days or weeks in between, and in those cases Google has recognized the date. I'd imagine the same is true here (assuming, of course, you have the date in a recognizable format and don't change the dates to today).
I'd be curious to find out what happens. Definitely update this Q&A when you do!
-
I would probably rearrange some of the paragraphs (or add some more content) in the old posts and update them in WordPress; this makes the content different from what Google has in its index.
I would then use the Yoast WordPress SEO plugin to regenerate your sitemap. Since you've updated and added new content to the posts, their last-updated dates will have changed, so Google will probably see this as revised content. I would submit to all the major search engines as your first port of call.
In terms of the "link juice", I would say that Google will still count links to the articles as a ranking factor, but because of the noindex the content won't appear in search results. So the content may well have a fairly good PageRank, but it's being held back by its exclusion from the search engine index.
Now that the setting has been changed and the sitemap and content have been updated, you should start to see the pages appear in search results in due time.
You could also add a few new articles to the blog and publicise them over social media to help get back in the game a bit quicker.