Why isn't our new site being indexed?
-
We built a new website for a client recently.
Site: https://www.woofadvisor.com/
It's been live for three weeks. Robots.txt isn't blocking Googlebot or anything.
Submitted a sitemap.xml through Webmasters but we still aren't being indexed.
Anyone have any ideas?
-
Hey Dirk,
No worries - I visited the question for the first time today and considered it unanswered, as the site is perfectly accessible in California. I like to confirm what Search Console says, as that is 'straight from the horse's mouth'.
Thanks for confirming that the IP redirect has changed; that is interesting. It is impossible for us to know when that happened - I would have expected things to get indexed quite fast once it changed.
With the extra info I'm happy to mark this as answered, but it would be good to hear from the OP.
Best,
-Tom
-
Hi Tom,
I am not questioning your knowledge - I re-ran the test on webpagetest.org and I see that the site is now accessible from a Californian IP (http://www.webpagetest.org/result/150911_6V_14J6/), which wasn't the case a few days ago (check the result on http://www.webpagetest.org/result/150907_G1_TE9/) - so there has been a change to the IP redirection. I also checked from Belgium - the site is now also accessible from here.
I also notice that if I now do a site:woofadvisor.com search in Google I get 19 pages indexed rather than the 2 I got a few days ago.
Apparently removing the IP redirection solved (or is solving) the indexation issue - but this question still remains marked as "unanswered".
rgds,
Dirk
-
I am in California right now, and can access the website just fine, which is why I didn't mark the question as answered - I don't think we have enough info yet. I think the 'Fetch as Googlebot' tool will help us resolve that.
You are correct that if there is no robots.txt then Google assumes the site is open, but my concern is that the developers on the team say that there IS a robots.txt file there and that it has some contents. I have, on at least two occasions, come across a team that was serving a robots.txt that was only accessible to search bots (once they were doing that 'for security', another time because they misunderstood how it worked). That is why I suggested checking Search Console to see what shows up for robots.txt.
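To make the two cases concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. It is only an approximation of how a crawler reads rules, not Googlebot's own parser, and the restrictive file contents are a made-up example rather than anything this site is known to serve.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_lines, user_agent, url):
    """Return True if the given robots.txt lines let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parses in-memory lines; no network access
    return parser.can_fetch(user_agent, url)


URL = "https://www.woofadvisor.com/travel-tips.php"

# No rules at all, which is how a missing robots.txt is commonly treated.
open_site = can_crawl([], "Googlebot", URL)

# Hypothetical restrictive contents of the kind that would block crawling.
blocked_site = can_crawl(["User-agent: *", "Disallow: /"], "Googlebot", URL)

print(open_site, blocked_site)
```

The point of the comparison is that a missing file is harmless, while a file that bots can see and humans cannot might contain rules like the second case, which is why the Search Console view matters.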
-
To be very honest - I am quite surprised that this question is still marked as "Unanswered".
The owners of the site decided to block access for all non-UK/Ireland addresses. The main Googlebot uses a Californian IP address to visit the site. Hence the only page Googlebot can see is https://www.woofadvisor.com/holding-page.php, which has no links to the other parts of the site (this is confirmed by the webpagetest.org test with a Californian IP address).
As Google indicates, Googlebot can also use other IP addresses to crawl the site ("With geo-distributed crawling, Googlebot can now use IP addresses that appear to come from other countries, such as Australia.") - however, it is very likely that these bots do not crawl with the same frequency/depth as the main bot (the article clearly states: "Google might not crawl, index, or rank all of your locale-adaptive content. This is because the default IP addresses of the Googlebot crawler appear to be based in the USA.").
This can easily be solved by adding a link on /holding-page.php to the Irish/UK version, which contains the full content (accessible for all IP addresses) and can be followed to index the full site (so only put the IP detection on the homepage, not on the other pages).
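As a rough illustration of that suggestion, the sketch below gates only the homepage by country and serves every deeper page to all visitors. The function name `route`, the country codes, and the idea that a country code is already available for each request are all assumptions for illustration; nothing here reflects how the site is actually built.

```python
# Assumption: ISO country codes for the UK and Ireland as the target markets.
ALLOWED_COUNTRIES = {"GB", "IE"}
HOLDING_PAGE = "/holding-page.php"


def route(path, country):
    """Return (status, location); location is None when the page is served directly."""
    # Only the homepage is geo-gated. Deeper pages stay reachable from any IP,
    # so a crawler that follows a link from the holding page can index them.
    if path == "/" and country not in ALLOWED_COUNTRIES:
        return 302, HOLDING_PAGE
    return 200, None


print(route("/", "US"))
print(route("/travel-tips.php", "US"))
print(route("/", "IE"))
```

With this arrangement a visitor (or crawler) from the USA is still sent to the holding page from the homepage, but any link on that page into the full site resolves normally.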
The fact that the robots.txt gives a 404 is not relevant: if no robots.txt is found Google assumes that the site can be indexed (check this link) - quote: "You only need a robots.txt file if your site includes content that you don't want Google or other search engines to index."
-
I'd be concerned about the 404ing robots.txt file.
You should check in Search Console:
- What does Search Console show in the robots.txt section?
- What happens if you fetch a page that is not indexed (e.g. https://www.woofadvisor.com/travel-tips.php) with the 'Fetch as Googlebot' tool?
I checked and do not see any obvious indicators of why the pages are not being indexed - we need more info.
-
I just did a quick check on your site with Webpagetest.org using a California IP address (http://www.webpagetest.org/result/150907_G1_TE9/) - as you can see there, these IPs also go to the holding page, which is logically the only page that can be indexed as it's the only one Googlebot can access.
rgds,
Dirk
-
Hi,
I can't access your site from Belgium - I guess you are redirecting your users based on IP address. If, like me, they are not located in your target country, they are 302 redirected to https://www.woofadvisor.com/holding-page.php, and there is only 1 page that is indexed.
Not sure which country you are actually targeting - but could it be that you're accidentally redirecting Googlebot as well?
Check also this article from Google on ip based targeting.
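For anyone who wants to reproduce the redirect check from their own location, here is a minimal sketch using Python's standard `http.client`, which does not follow redirects on its own. The helper names `describe_response` and `check_url` are invented for this example, and the answer depends entirely on the IP address the script runs from, so it cannot show what Googlebot itself sees.

```python
import http.client
from urllib.parse import urlsplit


def describe_response(status, location):
    """Summarise one HTTP response for a redirect check."""
    if status in (301, 302, 303, 307, 308):
        return f"redirect ({status}) to {location}"
    return f"no redirect ({status})"


def check_url(url, timeout=10):
    """Send a single HEAD request over HTTPS without following redirects."""
    parts = urlsplit(url)
    conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        return describe_response(resp.status, resp.getheader("Location"))
    finally:
        conn.close()
```

Calling `check_url("https://www.woofadvisor.com/")` from machines in different countries (or through a testing service) and comparing the answers is one way to spot an IP-based redirect.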
rgds
Dirk
-
Strangely, there are two pages indexed on Google Search: the homepage and one other.
-
I noticed the robots.txt file returned a 404 and asked the developers to take a look and they said the content of it is fine.
Sometimes developers say this stuff. If you are getting a 404, demonstrate it to them.
-
I noticed the robots.txt file returned a 404 and asked the developers to take a look and they said the content of it is fine.
But yes, I'll double-check the WordPress settings now.
-
Your sitemap all looked good, but when I tried to view the robots.txt file in your root, it returned a 404, so I was unable to determine whether there was an issue. Could any of the settings in your WordPress installation also be causing it to trip over?