Should I use meta noindex and robots.txt disallow?
-
Hi, we have an alternate "list view" version of every one of our search results pages
The list view has its own URL, indicated by a URL parameter
I'm concerned about wasting our crawl budget on all these list view pages, which effectively doubles the amount of pages that need crawling
When they were first launched, I had the noindex meta tag be placed on all list view pages, but I'm concerned that they are still being crawled
Should I therefore go ahead and also apply a robots.txt disallow on that parameter to ensure that no crawling occurs? Or, will Googlebot/Bingbot also stop crawling that page over time? I assume that noindex still means "crawl"...
Thanks
-
Hi,
Thanks, I will do some testing to confirm that this behaves how I would like it to
-
if all pages are 100#5 not indexed then I would block it in robots.txt, Google's John Muller confirmed to me that Googlebot will continue to crawl every link to check to see if a nofollow or noindex has changed status.
So as a result we blocked our pages with robots.txt and saw a great increases in index/crawl rates on pages we want Google to pay attention to. It also reduces waste in server resources.
However if there are any pages that are index, if you block them in robots.txt then Googlebot will never be able to crawl the link to determine that it should be noindex. This means it could stay in a permanent stage of indexed.
I hope that answers all your questions?
-
When you say:
nofollow will tell the crawlers to not crawl the page
I believe you mean to say that this will tell the crawlers not to crawl the links on the page, the page itself is itself still "crawled" is it not?
But yes, you are right to say, that once robots.txt disallow is in place, the meta tag will not be seen and thus be moot (at which point I may as well take it off).
It would be nice to be able to say "don't crawl this and don't put it in the index"... but is there a way?
-
noindex only tells the search crawlers to not include the page in the index but still allows for them to crawl the page. nofollow will tell the crawlers to not crawl the page.
robots.txt will accomplish this as well but both I think would be overkill.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Using a US CDN (Cloudflare) for a UK Site. Should I use a UK Based CDN as it says my server is based in USA
Hi All, We are a UK Company with Uk customers only and use CloudFlare CND. Our Site is hosted by a UK company with servers here but from looking online and checking where my site is hosted etc etc , some sites are telling me the name of our UK Hosted company and other sites are telling me my site is hosted in San Fran (USA) , where I presume the Cloudflare is based. I know Cloudflare has a couple of servers in the UK it uses but given all my customers are UK based ,I don't want this is affect rankings etc , as I thought it was a ranking benefit to be hosted in the country you are based. Is there any issue with this and should I change or is google clever enough to know so i shouldn't worry. thanks Pet
Intermediate & Advanced SEO | | PeteC120 -
Some Tools Not Recognizing Meta Tags
I am analyzing a site which has several thousands of pages, checking the headers, meta tags, and other on page factors. I noticed that the spider tool on SEO Book (http://tools.seobook.com/general/spider-test) does not seem to recognize the meta tags for various pages. However, using other tools including Moz, it seems the meta tags are being recognized. I wouldn't be as concerned with why a tool is not picking up the tags. But, the site suffered a large traffic loss and we're still trying to figure out what remaining issues need to be addressed. Also, many of those pages once ranked in Google and now cannot be found unless you do a site:// search. Is it possible that there is something blocking where various tools or crawlers can easily read them, but other tools cannot. This would seem very strange to me, but the above is what I've witnessed recently. Your suggestions and feedback are appreciated, especially as this site continues to battle Panda.
Intermediate & Advanced SEO | | ABK7170 -
Permanently using 301 for internal link
Hello Folks, Tried going through the 301 answers but could not find any question similar to what I had. The issue we have is we have got a listing page with the products like this: /used-peugeot/used-toyota-corolla As you can see this URL is not really ideal and I want to redirect it to /used-toyota/corolla using mod_rewrite. The redirect will be 301. My concern here is the URL in the listing page won't change to /used-toyota/corolla and hence the 301 will be 'permanently' placed and I was wondering if this will lose some link juice of the 301ed URL. Now with 301 being a 'permanent' redirect one would assume it should not be an issue but I just wanted to be sure that I am correct in assuming so. Thank you for your time.
Intermediate & Advanced SEO | | nirpan0 -
Should comments and feeds be disallowed in robots.txt?
Hi My robots file is currently set up as listed below. From an SEO point of view is it good to disallow feeds, rss and comments? I feel allowing comments would be a good thing because it's new content that may rank in the search engines as the comments left on my blog often refer to questions or companies folks are searching for more information on. And the comments are added regularly. What's your take? I'm also concerned about the /page being blocked. Not sure how that benefits my blog from an SEO point of view as well. Look forward to your feedback. Thanks. Eddy User-agent: Googlebot Crawl-delay: 10 Allow: /* User-agent: * Crawl-delay: 10 Disallow: /wp- Disallow: /feed/ Disallow: /trackback/ Disallow: /rss/ Disallow: /comments/feed/ Disallow: /page/ Disallow: /date/ Disallow: /comments/ # Allow Everything Allow: /*
Intermediate & Advanced SEO | | workathomecareers0 -
Why google index some meta titles I dont have?
Hi there, I have a problem with a website and I am desperate to find a solution because I have tried many things and nothing works! My website its: adtriboo.com Google does not find my main URL (main countro spain) www.adtriboo.com/es and I dont see this page its indexed in google. See link https://www.google.es/search?num=100&hl=es&site=&source=hp&q=site%3Aadtriboo.com&oq=site%3Aadtriboo.com&gs_l=hp.3...1189.4419.0.4586.17.17.0.0.0.0.223.1457.9j6j1.16.0...0.0...1c.1.8.hp.brTKX-zPwVI Also, google its showing some meta titles that are not in my page! For example my subfolder for the country Chile shows this title: Chile - Adtriboo but this its my real title Diseño logo, logotipos, video corporativo - adtriboo In webmaster tools everything looks good, and if I explore the webpage like google in webmaster tools the code its ok and everything lookd okay. If you see for example the URL from Chile (www.adtriboo.com/es_CL) the meta title is not the right one! Also i have a problem indexatión because i am not visible for any of my keywords even in the page 10! Please, somebody knows what happen?
Intermediate & Advanced SEO | | Comunicare0 -
How Should I Start Using SEO For Wordpress
I want to ranked my all keywords on top 10 position but unforunately only few keywords are in top 10 position.. Now i already using some recommend plugins like Yoast, Easy WP SEO. So my main question is how should i start promoting my wordpress topic. I already submitting to social network but not getting enough traffic except Stumbleupon... I am using SEO software like SenukeX and Bookmarking Demon but still no luck.. i am not getting quality backlinks.... please help i am very frustrated... please someone give proper guideline.. every day is over by thinking the strategy...
Intermediate & Advanced SEO | | mamuti0 -
Do you use your own Blog networks?
Do you use a network of sites you own for links to your clients in your seo efforts? I see so many seo companies doing this from such junk sites with all their clients in the blog roll, it seems totally crazy. It seems this stuff works do any of you do this if so how do you keep it white hat?
Intermediate & Advanced SEO | | DavidKonigsberg0 -
Googlebot + Meta-Refresh
Quick question, can Googlebot (or other search engines) follow meta refresh tags? Does it work anything like a 301 in terms of passing value to the new page?
Intermediate & Advanced SEO | | kchandler1