Can you see the 'indexing rules' that are in place for your own site?
-
By 'index rules' I mean the stipulations that constitute whether or not a given page will be indexed.
If you can see them - how?
-
Unfortunately, that would be specific to your own platform and server-side code. When you look at the SEOmoz source code, you're either going to see a nofollow or you're not. The code that drives that is on our servers and is unique to our build (PHP/Cake, I think).
You'd have to dig into the source code generating the Robots.txt file. I don't think you can have a fully dynamic Robots.txt (it has to have a .txt extension), so there must be a piece of code that generates a new Robots.txt file, probably on a timer. It could be called something similar, like Robots.php, Robots.aspx, etc. Just a guess.
FYI, dynamic Robots.txt could be a little dicey - it might be better to do this with a META NOINDEX in the header of the user profile pages. That would also avoid the timer approach. The pages would dynamically NOINDEX themselves as they're created.
-
To hopefully clarify what I'm talking about, I want to provide this example: SEOmoz will remove the "no-follow" tag from the first link in your profile if you get 200 mozpoints.
This is a set rule which I believe will automatically occur once a user reaches the minimum. On my site, a similar rule exists where the meta noindex tag will be removed from a user page if you submit 10 'files'.
There were other rules similar to this created and I need to know what they are. How?
-
On my site, there was a rule created where users are blocked by robots unless they have submitted a minimum number of 'files'. This was done to ensure that only quality user profile pages are being indexed and not just spam/untouched profiles.
There have been other rules like this created but I don't know what they are and I'd like to find out.
-
Hi David,
Do you mean how robots.txt is configured and if the robots file is blocking a certain page from being indexed? If so, yes. If the file is complex and you're not sure if it's blocking a particular page, you can go into Google Webmaster Tool and they have a robots.txt utility where you can input a particular URL and it will tell you if the robots.txt file you are using (or proposing) blocks that URL.
If you mean whether the page is quality enough for a search engine to choose to index it? No, that's part of the algorithm and none of the major engines are that nice and open.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Can anyone help me diagnose an indexing/sitemap issue on a large e-commerce site?
Hey guys. Wondering if someone can help diagnose a problem for me. Here's our site: https://www.flagandbanner.com/ We have a fairly large e-commerce site--roughly 23,000 urls according to crawls using both Moz and Screaming Frog. I have created an XML sitemap (using SF) and uploading to Webmaster Tools. WMT is only showing about 2,500 urls indexed. Further, WMT is showing that Google is indexing only about 1/2 (approx. 11,000) of the urls. Finally (to add even more confusion), when doing a site search on Google (site:) it's only showing about 5,400 urls found. The numbers are all over the place! Here's the robots.txt file: User-agent: *
Intermediate & Advanced SEO | | webrocket
Allow: /
Disallow: /aspnet_client/
Disallow: /httperrors/
Disallow: /HTTPErrors/
Disallow: /temp/
Disallow: /test/ Disallow: /i_i_email_friend_request
Disallow: /i_i_narrow_your_search
Disallow: /shopping_cart
Disallow: /add_product_to_favorites
Disallow: /email_friend_request
Disallow: /searchformaction
Disallow: /search_keyword
Disallow: /page=
Disallow: /hid=
Disallow: /fab/* Sitemap: https://www.flagandbanner.com/images/sitemap.xml Anyone have any thoughts as to what our problems are?? Mike0 -
Viewing search results for 'We possibly have internal links that link to 404 pages. What is the most efficient way to check our sites internal links?
We possibly have internal links on our site that point to 404 pages as well as links that point to old pages. I need to tidy this up as efficiently as possible and would like some advice on the best way to go about this.
Intermediate & Advanced SEO | | andyheath0 -
The review stars for my ecommerce site in the organic search disappeared, how can I have them shown again?
We run www.prams.net, an ecommerce store for strollers and other baby products out of the Uk. We always followed schema and the stars appeared in our product snippets in the organic search, helping us a lot. A couple of weeks ago the stars stopped appearing. When I do the site:www.prams.net they appear but not in the search results anymore. We have a french site as well. www.poussette.com, their it still works. Has anybody got tips what we can do to make the stars appear again? Thanks in advance. Dieter Lang
Intermediate & Advanced SEO | | Storesco0 -
Can you no index a page in Wordpress from just Google news?
I'm trying to find a plugin for Wordpress that enables you to no-index an individual page from Google news but not from Google search results. We want to remove some of our pages from Google news without hurting others.
Intermediate & Advanced SEO | | uSw0 -
Google de-indexed a page on my site
I have a site which is around 9 months old. For most search terms we rank fine (including top 3 rankings for competitive terms). Recently one of our pages has been fluctuating wildly in the rankings and has now disappeared altogether from the rankings for over 1 week. As a test I added a similar page to one of my other sites and it ranks fine. I've checked webmaster tools and there is nothing of note there. I'm not really sure what to do at this stage. Any advice would me much appreciated!
Intermediate & Advanced SEO | | deelo5550 -
Google can't access/crawl my site!
Hi I'm dealing with this problem for a few days. In fact i didn't realize it was this serious until today when i saw most of my site "de-indexed" and losing most of the rankings. [URL Errors: 1st photo] 8/21/14 there were only 42 errors but in 8/22/14 this number went to 272 and it just keeps going up. The site i'm talking about is gazetaexpress.com (media news, custom cms) with lot's of pages. After i did some research i came to the conclusion that the problem is to the firewall, who might have blocked google bots from accessing the site. But the server administrator is saying that this isn't true and no google bots have been blocked. Also when i go to WMT, and try to Fetch as Google the site, this is what i get: [Fetch as Google: 2nd photo] From more than 60 tries, 2-3 times it showed Complete (and this only to homepage, never to articles). What can be the problem? Can i get Google to crawl properly my site and is there a chance that i will lose my previous rankings? Thanks a lot
Intermediate & Advanced SEO | | granitgash
Granit FvhvDVR.png dKx3m1O.png0 -
I have two sitemaps which partly duplicate - one is blocked by robots.txt but can't figure out why!
Hi, I've just found two sitemaps - one of them is .php and represents part of the site structure on the website. The second is a .txt file which lists every page on the website. The .txt file is blocked via robots exclusion protocol (which doesn't appear to be very logical as it's the only full sitemap). Any ideas why a developer might have done that?
Intermediate & Advanced SEO | | McTaggart0 -
My New(ish) Site Isn't Ranking Well And Recently Fell
I launched my site (jesfamilylaw.com) at the beginning of January. Since then, I've been trying to build high quality back links. I have a few back links with keyword targeted anchor text from some guest posts I've published (maybe 3 or so) and I have otherwise signed up for business directories and industry-specific directories. I have a few social media profiles and some likes on Facebook, both for the company page and some posts. Despite this, I've had a lot of trouble cracking Google's top ten for any term, long or tall tail. I was starting to climb for Evanston Family Law, which is the key term I believe I am best optimized for, but took a dive yesterday. I fell from maybe the 14th result to somewhere on the 4th page. For all my other target terms, I don't know if I've gotten into the 20s yet. To further complicate matters, my Google Places listing isn't showing and is on the second page of results for Places searches, after businesses that aren't located in the same city. The night before I fell, I resubmitted my site to Google because Webmaster tools was showing duplicate title tags when I had none. I had also made a couple changes to some internal links and title tags, but only for a small fraction of the site. Long story short, I don't know what's going on. I don't know why I fell in the rankings and why my site isn't competitive for some of my target key phrases. I've read so many horror stories about Penguin that I fear my onsite optimization may be hurting my rankings or my back links are insufficient. I've done plenty of competitor research and the sites that are beating me have very aggressive onsite optimization and few back links. In short, I am very confused. Any help would be immensely appreciated.
Intermediate & Advanced SEO | | JESFamilyLaw0