Google refuses to index our domain. Any suggestions?
-
A very similar question was asked previously. (http://www.seomoz.org/q/why-google-did-not-index-our-domain) We've done everything in that post (and comments) and then some.
The domain is http://www.miwaterstewardship.org/ and, so far, we have:
- put "User-agent: * Allow: /" in the robots.txt (We recently removed the "allow" line and included a Sitemap: directive instead.)
- built a few hundred links from various pages including multiple links from .gov domains
- properly set up everything in Webmaster Tools
- submitted site maps (multiple times)
- checked the "fetch as googlebot" display in Webmaster Tools (everything looks fine)
- submitted a "request re-consideration" note to Google asking why we're not being indexed
Webmaster Tools tells us that it's crawling the site normally and is indexing everything correctly. Yahoo! and Bing have both indexed the site with no problems and are returning results. Additionally, many of the pages on the site have PR0 which is unusual for a non-indexed site. Typically we've seen those sites have no PR at all.
If anyone has any ideas about what we could do I'm all ears. We've been working on this for about a month and cannot figure this thing out.
Thanks in advance for your advice.
-
You make excellent points. I'll escalate this to "the pros" and see if they're able to bring their guru powers to bear on the trouble.
Thanks again Ryan for all your advice. It is greatly appreciated.
-
Looking at the site I can confirm the following:
-
the home page is tagged index follow
-
the status code for the home page is 200, an OK response
-
the robots.txt file is valid and clear
-
your crawl reports look fine to me
-
you stated your sitemap is received and 73 of 75 pages are indexed
-
your site is clearly not in Google's index as a site:miwaterstewardship.org search shows nothing.
-
I looked at your sitemap. I am not familiar with .aspx sitemaps but it does contain valid html links which apparently is enough for Google to utilize
-
you stated your site is not under penalty, as per Google
The possibilities are:
-
one of the pieces of information we are depending on is incorrect
-
we are overlooking a key piece of information
-
our understanding of SEO in this case is not complete
-
there is an issue with Google which is preventing your site from being indexed.
At this point I would suggest using your 1 PRO question on this issue, and reference this Q&A thread. While I don't believe we missed anything we should get the team to look at this issue and rule out every last possibility.
-
-
Thanks for the suggestions, Ryan. All of the previous changes were made before Google did it's last crawl. Here's the other info...
URLs Submitted = 78 / URLs in Web Index = 75 / The sitemap status is green check mark and it was downloaded today--June 14.
Geographic target has not been selected. Google has always been able to determine the crawl rate. I just changed the preferred domain to www.miwaterstewardship.org. That's the only change made recently.
This is another piece that baffles me. Check out the crawl stats for the site here: http://netvantagemarketing.com/temp/miws-crawl-stats.png
The bot is crawling an average of 63 pages per day and there's crawling activity since the middle of March. STILL, though...the domain absolutely will not appear in the index.
We're working with the client right now to see if we can get the site changed to a new IP address. The thought is that perhaps Google has somehow historically blocked the IP that it lives on now and changing to a new IP might get us out of jail. SUPER long shot...but those are the kinds of things we're trying now.
Thanks again for the help.
-
I was able to verify your robots.txt file is correctly set up. Did you make the changes before Google crawled the site on Saturday? You are correct that your site is not in the index. I would guess the robots.txt file was not modified until after the crawl. If it was adjusted pre-crawl, below are the next steps you can take.
Let's take a fresh look at your site:
-
you have a valid robots.txt file
-
your home page has a valid "index, follow" tag (not necessary, but it doesn't hurt either)
-
you have checked WMT and confirmed your site was crawled
In WMT, Site Configuration > Sitempas there is a "URLs submitted" field and a "URLs in web index field". What are those numbers please?
- It's a bit far reaching but while you are there please go over to your Settings tab in WMT. Geographic Target should not be checked, or if it is then Target Users in US should be selected. Preferred domain should be "www.miwaterstewardship.org" and I would recommend "Let Google determine my crawl rate" option unless you have a specific reason for doing otherwise.
-
-
Well...we're still in the virtual doghouse so-to-speak...
I made the changes that Ryan suggested on Friday. Webmaster Tools reports that the GoogleBot crawled the site on Saturday and that everything is OK in its eyes. Google responded to our reconsideration request over the weekend and stated very specifically that no actions were taken by the Google Spam team which would affect the rankings of our site or domain.
Still, if you search info:miwaterstewardship.org in The Ol' Goog, it continues to report that the site does not exist in the Google database.
Are there any other ideas of what we might be able to try?
This is a Dot Net Nuke site. Is there a DNN setting somewhere which might indicate to Google that it should not report our domain in search results?
Thanks for the looks and the help.
-
I agree with Ryan, your backlinks look great, website is structured well and has a great looking design. No signs of you breaking any of googles TOS.
-
Thank you EGOL. I consider myself a student of SEO, definitely not a master.
The Q&A forums here have given me a great opportunity to learn about SEO. I have been spending all my time this past month un-learning all the bad information I gathered from the internet, and learning SEO the right way.
The Matt Cutts videos, SEOmoz webinars, blogs and pro Q&A have all been immensely helpful. Those resources, along with the replies you and other mozzers offer have provided me an incredibly rich learning experience.
Thanks for noticing.
-
Thanks, Ryan. I noticed this as well but figured I'd leave it alone since Webmaster Tools was telling me "all is well" with the robots.txt file. I'll add the code like you suggest and we'll see what happens.
Thanks again.
-
Ryan, you have been giving some really valuable answers. Nice. Keep up the great work!
-
Fix your robots.txt. It is not set up as you suggested.
http://www.miwaterstewardship.org/robots.txt
User-agent: * Sitemap: http://www.miwaterstewardship.org/SiteMap.aspx I am unsure of what action a search engine would take upon encountering your code, but based on your post it seems that it blocks all agents. The correct code would be:
User-agent: *
Disallow:Sitemap: http://www.miwaterstewardship.org/SiteMap.aspx For more information about robots.txt you can take a look at: [http://www.robotstxt.org/robotstxt.html](http://www.robotstxt.org/robotstxt.html)
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
URLs dropping from index (Crawled, currently not indexed)
I've noticed that some of our URLs have recently dropped completely out of Google's index. When carrying out a URL inspection in GSC, it comes up with 'Crawled, currently not indexed'. Strangely, I've also noticed that under referring page it says 'None detected', which is definitely not the case. I wonder if it could be something to do with the following? https://www.seroundtable.com/google-ranking-index-drop-30192.html - It seems to be a bug affecting quite a few people. Here are a few examples of the URLs that have gone missing: https://www.ihasco.co.uk/courses/detail/sexual-harassment-awareness-training https://www.ihasco.co.uk/courses/detail/conflict-resolution-training https://www.ihasco.co.uk/courses/detail/prevent-duty-training Any help here would be massively appreciated!
Technical SEO | | iHasco0 -
My brand name has 2 words but Google only indexing as 1 word. Is there a fix?
Hi all...I'm at a loss. I've never had this happen. Google only shows pages of my site when I search the brand name as one word. When I Google the site as one word BrandBrand- it only shows my blog page and about us page plus Twitter and Facebook on page 1. The homepage does not show up at all. When I Google the site as two words Brand Brand - My Facebook page is on page 1 but nothing else. The homepage isn't showing up at all. When I search both words on Bing and Yahoo both are indexing it as two words and shows on page 1. Any ideas?
Technical SEO | | TexasBlogger0 -
Forwarding a .org domain to a .com domain: any negative impact to consider?
Hello! I have a question I've been unable to find a clear answer to. My client's primary domain is a .com with a satisfactorily high DA. My client owns the .org version of its domain (which has a very low DA, I suppose due to inactivity) but has never forwarded it on. For branding/visibility/traffic reasons, I'd like to recommend they set up the .org domain to forward to the .com domain, but I wanted to ask a few questions first: 1. Does forwarding low-value DA domains to high-value DA domains have any negative authority/SEO impact? 2. If the .org domain was to be forwarded, am I correct that an SSL cert is not necessary for it if the .com domain has an SSL cert? Thanks in advance!
Technical SEO | | mollykathariner_ms1 -
What's going on with google index - javascript and google bot
Hi all, Weird issue with one of my websites. The website URL: http://www.athletictrainers.myindustrytracker.com/ Let's take 2 diffrenet article pages from this website: 1st: http://www.athletictrainers.myindustrytracker.com/en/article/71232/ As you can see the page is indexed correctly on google: http://webcache.googleusercontent.com/search?q=cache:dfbzhHkl5K4J:www.athletictrainers.myindustrytracker.com/en/article/71232/10-minute-core-and-cardio&hl=en&strip=1 (that the "text only" version, indexed on May 19th) 2nd: http://www.athletictrainers.myindustrytracker.com/en/article/69811 As you can see the page isn't indexed correctly on google: http://webcache.googleusercontent.com/search?q=cache:KeU6-oViFkgJ:www.athletictrainers.myindustrytracker.com/en/article/69811&hl=en&strip=1 (that the "text only" version, indexed on May 21th) They both have the same code, and about the dates, there are pages that indexed before the 19th and they also problematic. Google can't read the content, he can read it when he wants to. Can you think what is the problem with that? I know that google can read JS and crawl our pages correctly, but it happens only with few pages and not all of them (as you can see above).
Technical SEO | | cobano0 -
How to fix google index filled with redundant parameters
Hi All This follows on from a previous question (http://moz.com/community/q/how-to-fix-google-index-after-fixing-site-infected-with-malware) that on further investigation has become a much broader problem. I think this is an issue that may plague many sites following upgrades from CMS systems. First a little history. A new customer wanted to improve their site ranking and SEO. We discovered the site was running an old version of Joomla and had been hacked. URL's such as http://domain.com/index.php?vc=427&Buy_Pinnacle_Studio_14_Ultimate redirected users to other sites and the site was ranking for buy adobe or buy microsoft. There was no notification in webmaster tools that the site had been hacked. So an upgrade to a later version of Joomla was required and we implemented SEF URLs at the same time. This fixed the hacking problem, we now had SEF url's, fixed a lot of duplicate content and added new titles and descriptions. Problem is that after a couple of months things aren't really improving. The site is still ranking for adobe and microsoft and a lot of other rubbish and the urls like http://domain.com/index.php?vc=427&Buy_Pinnacle_Studio_14_Ultimate are still sending visitors but to the home page as are a lot of the old redundant urls with parameters in them. I think it is default behavior for a lot of CMS systems to ignore parameters it doesn't recognise so http://domain.com/index.php?vc=427&Buy_Pinnacle_Studio_14_Ultimate displays the home page and gives a 200 response code. My theory is that Google isn't removing these pages from the index because it's getting a 200 response code from old url's and possibly penalizing the site for duplicate content (which don't showing up in moz because there aren't any links on the site to these url's) The index in webmaster tools is showing over 1000 url's indexed when there are only around 300 actual url's. It also shows thousands of url's for each parameter type most of which aren't used. So my question is how to fix this, I don't think 404's or similar are the answer because there are so many and trying to find each combination of parameter would be impossible. Webmaster tools advises not to make changes to parameters but even so I don't think resetting or editing them individually is going to remove them and only change how google indexes them (if anyone knows different please let me know) Appreciate any assistance and also any comments or discussion on this matter. Regards, Ian
Technical SEO | | iragless0 -
Google Published Date - Does Google Lie?
Here's the scenario. I create a page called "ABC" and it gets published and found by Google lets say on the 13th of April. on the 15th (or 14th) i decide to update the URL, page Title, and content. (Redirect old URL to new URL as well) Will Google still show this page as being published on the 13th? or would it update the publish date according to the new URL? Greg | | | | | | <a id="question_reply-to-question-36769-description_codeblock" class="mceButton mceButtonEnabled mce_codeblock" style="color: #000000; border: 1px solid #f0f0ee; margin: 0px 1px 0px 0px; padding: 0px; background-color: transparent; cursor: default; vertical-align: baseline; width: 20px; border-collapse: separate; display: block; height: 20px;" title="Create Code Block" tabindex="-1"></a>Create Code Block | | | | | | | | | | | | | | |
Technical SEO | | AndreVanKets0 -
Does Google index XML files?
Does Google or other search engines include XML files in their index? More specifically, I am wondering how Google knows the difference between an xml filetype and an RSS feed.
Technical SEO | | nicole.healthline0 -
Why google index my IP URL
hi guys, a question please. if site:112.65.247.14 , you can see google index our website IP address, this could duplicate with our darwinmarketing.com content pages. i am not quite sure why google index my IP pages while index domain pages, i understand this could because of backlink, internal link and etc, but i don't see obvious issues there, also i have submit request to google team to remove ip address index, but seems no luck. Please do you have any other suggestion on this? i was trying to do change of address setting in Google Webmaster Tools, but didn't allow as it said "Restricted to root level domains only", any ideas? Thank you! boson
Technical SEO | | DarwinChinaSEO0