When you add 10,000 pages that aren't really intended to rank in the SERPs, should you use "follow,noindex" or disallow the whole directory through robots.txt? What is your opinion?
-
I just want a second opinion.
The customer doesn't want to lose any internal link value by diluting it across a large number of internal links. What would you do?
-
Hi Jeff,
Thanks for your answer. Please take a look at my reply to Federico.
-
Hi Federico,
In this case it's an affiliate website and the 10,000 pages are all product pages. They all come from data feeds, so it's duplicate content.
We don't want these indexed, that's for sure.
So: noindex,follow, or disallow the whole directory, or both?
We have our own opinion about this, but I want to hear what others think.
Thanks in advance!
-
Yep, I agree with belt and suspenders.
-
Wesley - I do agree with Federico.
That said, if they really don't want those pages indexed, use the belt-and-suspenders method (if you wear both a belt and suspenders, chances are greater that your pants won't fall down).
I'd put a robots.txt file in place to disallow the directory, and also noindex / nofollow each of the pages, too.
That way, if someone working on the site's pages changes them to followed, you're still covered. Likewise if someone blows away the robots.txt file.
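For illustration, here is a rough sketch of the robots.txt half of that approach, checked with Python's standard `urllib.robotparser`. The `/products/` directory and the `example.com` URLs are assumptions made up for the example, not details of the site in question. One thing worth keeping in mind: a crawler that obeys the disallow never fetches the page, so it also never sees a robots meta tag placed on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: /products/ stands in for the datafeed directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /products/
"""


def is_crawlable(url: str, user_agent: str = "*") -> bool:
    """Return True if the hypothetical robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://www.example.com/products/item-1",
        "https://www.example.com/about",
    ):
        print(url, "->", "crawlable" if is_crawlable(url) else "blocked")
```

Running the same check against the live file after each deploy is one way to notice if someone has blown away the robots.txt.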
Just my $0.02, but hope it helps…
-- Jeff -
What do they have? 10,000 pages of uninteresting content? A robots meta tag with noindex,follow will do to keep them out of the engines. But to decide, you really need to know what those pages contain. 10,000 isn't a few, and if there's valuable content worth sharing, a page could earn a link; if you disallow that page through robots.txt, the link won't even pass PageRank.
It all comes down to: what are those pages for?
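To make the noindex,follow option concrete, here is a small sketch that reads the robots meta tag from a page using Python's standard `html.parser`. The sample HTML and the helper names (`RobotsMetaParser`, `robots_directives`) are hypothetical, purely for illustration; it only shows what the tag would look like and how one might spot-check that it is present on the feed pages.

```python
from html.parser import HTMLParser

# Hypothetical product page carrying the tag discussed in this thread.
SAMPLE_PAGE = """\
<html>
  <head>
    <title>Example product</title>
    <meta name="robots" content="noindex,follow">
  </head>
  <body><p>Product description from a datafeed.</p></body>
</html>
"""


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for token in attr_map.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives present in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    found = robots_directives(SAMPLE_PAGE)
    print("noindex" in found, "follow" in found)
```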