Can you see the 'indexing rules' that are in place for your own site?
-
By 'indexing rules' I mean the conditions that determine whether or not a given page will be indexed.
If you can see them - how?
-
Unfortunately, that would be specific to your own platform and server-side code. When you look at the SEOmoz source code, you're either going to see a nofollow or you're not. The code that drives that is on our servers and is unique to our build (PHP/Cake, I think).
You'd have to dig into the source code that generates the robots.txt file. I don't think you can have a fully dynamic robots.txt (it has to have a .txt extension), so there must be a piece of code that generates a new robots.txt file, probably on a timer. It could be called something similar, like Robots.php, Robots.aspx, etc. Just a guess.
FYI, a dynamic robots.txt could be a little dicey - it might be better to do this with a META NOINDEX tag in the head of the user profile pages. That would also avoid the timer approach. The pages would dynamically NOINDEX themselves as they're created.
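To illustrate the META NOINDEX idea, here is a minimal sketch in Python. It is not the code from either site: the function names, the page structure, and the default threshold are assumptions made up for the example (the threshold of 10 files is borrowed from the rule described elsewhere in this thread).

```python
from html import escape

# Hypothetical threshold: the thread mentions 10 submitted files as one such rule.
MIN_FILES_FOR_INDEXING = 10


def robots_meta_tag(files_submitted: int, threshold: int = MIN_FILES_FOR_INDEXING) -> str:
    """Return a noindex meta tag for thin profiles, or '' if the page may be indexed."""
    if files_submitted < threshold:
        return '<meta name="robots" content="noindex">'
    return ""


def render_profile_head(username: str, files_submitted: int) -> str:
    """Build the <head> of a (hypothetical) profile page at request time."""
    parts = [f"<title>{escape(username)}</title>"]
    tag = robots_meta_tag(files_submitted)
    if tag:
        parts.append(tag)
    return "<head>" + "".join(parts) + "</head>"
```

Because the tag is decided on every request, a profile stops being NOINDEXed as soon as the user crosses the threshold, with no timer or regenerated file involved.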
-
To clarify what I'm talking about, here is an example: SEOmoz will remove the "nofollow" attribute from the first link in your profile once you reach 200 mozpoints.
This is a set rule which I believe is applied automatically once a user reaches the minimum. On my site, a similar rule exists where the meta noindex tag is removed from a user page once that user submits 10 'files'.
Other rules similar to this were created, and I need to know what they are. How can I find them?
-
On my site, a rule was created where user profile pages are blocked from robots unless the user has submitted a minimum number of 'files'. This was done to ensure that only quality user profile pages are indexed, and not spam or untouched profiles.
Other rules like this have been created, but I don't know what they are and I'd like to find out.
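If you have access to the site's server-side code, one way to start digging is to search it for the places that emit robots directives. The sketch below is only a starting point under assumptions: the file extensions and the search terms are guesses and would need adjusting to the actual platform.

```python
import re
from pathlib import Path

# Terms that usually appear wherever indexing rules are emitted (an assumption).
PATTERN = re.compile(r"noindex|nofollow|disallow", re.IGNORECASE)

# Hypothetical list of source file types; adjust to the platform in use.
SOURCE_SUFFIXES = {".php", ".ctp", ".aspx", ".cs", ".py", ".html", ".txt"}


def find_indexing_rules(root):
    """Return (path, line number, line) for every source line mentioning a robots directive."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in SOURCE_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Each hit points at a line worth reading in context: the condition wrapped around it (a points total, a file count, and so on) is the rule itself.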
-
Hi David,
Do you mean how robots.txt is configured, and whether the robots file is blocking a certain page from being indexed? If so, yes. If the file is complex and you're not sure whether it's blocking a particular page, you can go into Google Webmaster Tools, which has a robots.txt utility where you can input a particular URL and it will tell you if the robots.txt file you are using (or proposing) blocks that URL.
If you mean whether the page is of high enough quality for a search engine to choose to index it, then no. That's part of the algorithm, and none of the major engines are that nice and open.
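For a quick local version of the robots.txt check described above, Python's standard library includes a robots.txt parser. This is a rough sketch, not a replacement for Google's own tool: the example rules and URLs are made up, and the standard-library parser may not interpret every rule (wildcard patterns, for instance) the way Google does.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt content disallows `url` for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Made-up rules for illustration only.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /users/
"""
```

With these example rules, `is_blocked(EXAMPLE_RULES, "https://example.com/users/david")` reports the profile URL as blocked, while a URL outside `/users/` is allowed.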