Why isn't Google indexing our site?
-
Hi,
We have majorly redesigned our site. It is not a big site; it is a SaaS site, so it has the typical structure: Landing, Features, Pricing, Sign Up, Contact Us, etc.
The main part of the site is behind the login, so out of Google's reach.
Since the new release a month ago, Google has indexed some pages, mainly the blog, which is brand new. It has also reindexed a few of the original pages; I am guessing this because clicking "cached" on a site: search result shows the new site.
All new pages (of which there are 2) are totally missed. One is HTTP and one is HTTPS. Does HTTPS make a difference?
I have submitted the site via Webmaster Tools and it says "URL and linked pages submitted to index", but a site: search doesn't bring up all the pages.
What is going on here please? What are we missing? We just want Google to recognise that the old site has gone and ALL of the new site is here, ready and waiting for it.
Thanks
Andrew
-
Well, links/shares are good. But of course I'm just begging the question of how you can get those.
Rand gave a great talk called "Inbound Marketing for Startups" at a Hackers & Founders meetup that was focused more on Inbound as a whole than SEO in particular, but it's full of valuable insights: http://vimeo.com/39473593 [video]
Ultimately it'll come down to some kind of a publishing/promotional strategy for your startup. Sometimes your startup is so unique/interesting that it has its own marketing baked right in - in which case you can get a lot of traction by simply doing old-school PR to get your startup in front of the right people.
Other times, you've got to build up links/authority on the back of remarkable marketing.
BufferApp is a great example of a startup that built traction off their blog. Of course, they weren't necessarily blogging as an SEO play - it was more in the aim of getting directly in front of the right audience for direct signups for their product. But they definitely built up some domain authority as a result.
I'd also take a look at the guides Mailchimp has created - they created the dual benefit of getting in front of the right audience in a positive/helpful way (which benefits the brand and drives sign-ups directly) as well as building a considerable number of inbound links, boosting their domain authority overall.
Unfortunately there are no quick/easy ways to build your domain authority, but the things you do to build your authority can also get you immediately in front of the audience you're looking for - and SEO just becomes a lateral benefit to that.
-
Thank you all for your responses. It is strange. We are going to add a link to our Google+ page and then add a post.
As a new site, what is the best way to get our domain authority up so we get crawled quicker?
Thanks again
Andrew
-
I disagree. Unless the old pages have inbound links from external sites, there's not much reason to 301 them (and not much benefit). If they're serving up 404 errors, they will fall out of the index.
Google absolutely does have a way to know these new pages exist - by crawling the home page and following the links discovered there. Both of the pages in question are linked to prominently, particularly the Features page which is part of the main navigation. A sitemap is just an aid for this process - it can help move things along and help Google find otherwise obscure/deep pages, but it by no means is a necessity for getting prominent pages indexed, particularly pages that are 1-2 levels down from the home page.
-
If you didn't redirect the old URLs to the new ones when the new site went live, this will absolutely be the cause of your problem, Studio33. That, combined with having no (or a misdirected) sitemap, means there was essentially no way for Google to even know your site's pages existed.
Good catch Billy.
-
Hi Andrew,
-
Google has been indexing HTTPS URLs for years now without a problem, so HTTPS is unlikely to be part of the issue.
-
Your domain authority on the whole may be slowing Google down in indexing new pages. Bottom line is crawl rate and depth are both functions of how authoritative/important you appear based on links/shares/etc.
-
That said, I don't see any indication as to why these two particular pages are not being indexed by Google. I'm a bit stumped here.
I see some duplication between your Features page and your Facebook timeline, but not with the invoice page.
As above, your domain authority (17) is a bit on the low side. So this could simply be a matter of Google not dedicating enough resources to crawl/index all of your pages yet. But why these two pages would be the only ones is perplexing, particularly after a full month. There are no problems with your Robots.txt, no canonical tag issues, the pages are linked to properly.
Wish I had an easy answer here. One idea, a bit of a long shot: we've seen Google index pages faster when they're linked to from Google+ posts. I see you have a Google+ business page for this website - you might try simply writing a (public) post there that includes a link over to the Features page.
As weak as that is, that's all I've got.
Best of Luck,
Mike
-
OK - I would get a list of all of your old pages and start 301 redirecting them to your new pages asap. This could be part of your issue.
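Before setting up the redirects, it can help to see what each old URL returns today. The following is a minimal sketch using only Python's standard library; the function names `classify` and `head` are hypothetical, and you would supply your own list of old URLs.

```python
import http.client
from urllib.parse import urlsplit


def classify(status, location=None):
    """Describe what a crawler would see for one old URL."""
    if status in (301, 308):
        return f"permanent redirect to {location}"
    if status in (302, 303, 307):
        return f"temporary redirect to {location}"
    if status in (404, 410):
        return "gone (will eventually drop out of the index)"
    if 200 <= status < 300:
        return "still live"
    return f"unexpected status {status}"


def head(url, timeout=10):
    """Send one HEAD request without following redirects."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        # Return the raw status and Location header; nothing is followed.
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

Calling `classify(*head(old_url))` for each old URL gives a quick report of which ones already redirect, which return 404, and which still serve old content.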
-
Hi, I checked the XML. It's there if you view source; it just doesn't have a stylesheet.
-
Hi, thanks. About 1 month. The blog pages you are getting may be the old ones, as they are working at this end: http://www.invoicestudio.com/Blog . What you have mentioned regarding the blog is part of the problem: Google has the old site and not the new.
-
Getting this on your Blog pages:
The page cannot be displayed because an internal server error has occurred.
Were you aware?
Anyway - may I ask how old these pages are?
-
Thanks. I will look into the sitemap. That only went live about an hour ago whilst this thread has been going on.
-
Yeah - with no path specified, the directive is ignored (you don't have a '/', so the Disallow directive is ignored).
However, you do point to your XML sitemap, which appears to be empty. You might want to fix that.
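A sitemap served without a stylesheet can look blank in a browser even when it has entries, so counting the entries directly is more reliable than eyeballing it. Here is a minimal sketch, assuming the sitemap uses the standard sitemaps.org namespace; `sitemap_urls` and `SAMPLE` are hypothetical names for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text):
    """Return the <loc> value of every <url> entry in a sitemap document."""
    root = ET.fromstring(xml_text)
    urls = []
    for url_el in root.iter(SITEMAP_NS + "url"):
        loc = url_el.find(SITEMAP_NS + "loc")
        if loc is not None and loc.text:
            urls.append(loc.text.strip())
    return urls


# Illustrative sitemap text, not the site's real file.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.invoicestudio.com/Features</loc></url>
</urlset>"""
```

An empty list from `sitemap_urls` means the file really has no entries; a non-empty list means the browser view was just unstyled.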
-
Hi, no, I think it's fine as we do not have the forward slash after the Disallow. See
http://www.robotstxt.org/robotstxt.html
I wish it was as simple as that. Thanks for your help though its appreciated.
-
Hmmm. That link shows that the way you have it will block all robots.
-
Thanks, but I think robots.txt is correct. Excerpt from http://www.robotstxt.org/robotstxt.html:
To exclude all robots from the entire server
User-agent: *
Disallow: /
To allow all robots complete access
User-agent: *
Disallow:
(or just create an empty "/robots.txt" file, or don't use one at all)
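The two forms quoted above can be checked mechanically. This is a minimal sketch using Python's standard `urllib.robotparser`, which is one parser's reading of the rules and not Google's own crawler; `is_allowed` and the constants are hypothetical names for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The three variants discussed in this thread.
EMPTY_DISALLOW = "User-agent: *\nDisallow:"
DISALLOW_ROOT = "User-agent: *\nDisallow: /"
ALLOW_ROOT = "User-agent: *\nAllow: /"

FEATURES = "http://www.invoicestudio.com/Features"
```

With this parser, `is_allowed(EMPTY_DISALLOW, FEATURES)` and `is_allowed(ALLOW_ROOT, FEATURES)` both return `True`, while `is_allowed(DISALLOW_ROOT, FEATURES)` returns `False`. That matches the excerpt: an empty Disallow blocks nothing.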
-
It looks like your robots.txt file is the problem. http://www.invoicestudio.com/robots.txt has:
User-agent: *
Disallow:
When it should be:
User-agent: *
Allow: /
-
Hi,
The specific pages are
https://www.invoicestudio.com/Secure/InvoiceTemplate
http://www.invoicestudio.com/Features
I'm not sure what other pages are not indexed.
New site has been live 1 month.
Thanks for your help
Andrew
-
Without seeing the specific pages I can't check for things such as noindex tags or robots.txt blocking access, so I would suggest you double-check these aspects. The pages need to be accessible to search engines when they crawl your site, so if there are no links to those pages Google will be unable to reach them.
How long have they been live since the site relaunch? It may just be that they have not been crawled yet, particularly if they are deeper pages within your site hierarchy.
Here's a link to Google's resources on crawling and indexing sites in case you have not been able to check through them yet.
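The noindex check mentioned above can be done on a page's HTML source. Below is a minimal sketch using Python's standard `html.parser`; `RobotsMetaParser` and `has_noindex` are hypothetical names, and it only looks at meta tags, not at an `X-Robots-Tag` HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        # "googlebot" is the Google-specific variant of the robots meta tag.
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def has_noindex(html_text):
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in parser.directives or "none" in parser.directives
```

Running `has_noindex` on the source of each page that is missing from the index rules this cause in or out quickly.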