How do I find out which pages are being indexed on my site and which are not?
-
Hi,
I'm doing my first technical audit on my site. I am learning how to do an audit as I go and am a bit lost. I know some pages won't be indexed, but how do I:
1. Check the site for all pages, both indexed and not indexed
2. Run a report to show indexed pages only (I am presuming I can do this via Screaming Frog or Webmaster Tools)
3. Compare the two lists and work out which pages are not being indexed
I'll then need to figure out why. I'll cross that bridge once I get to it.
Thanks, Ben
-
Hi Ben,
I'd echo what Patrick has said and would probably recommend his first suggestion the most. Google Webmaster Tools is a good way of checking indexation, and if you have a large site with lots of categories, you can even break the sitemaps down by category so that you can see whether certain areas are having problems.
Here is an old, but still relevant post on the topic:
http://www.branded3.com/blogs/using-multiple-sitemaps-to-analyse-indexation-on-large-sites/
In terms of creating the sitemap, Screaming Frog has an option under Advanced Export for creating an XML sitemap file for you which works very well. You just need to make sure you're only including pages that you want indexed in there.
Cheers.
Paddy
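To make the per-category sitemap idea concrete, here is a rough Python sketch. It assumes a "category" is simply the first path segment of each URL, which may not match how your site is organised, and the function names and example.com URLs are made up for illustration. It only builds the XML; you would still save each sitemap to a file and submit it in Webmaster Tools yourself.

```python
from collections import defaultdict
from urllib.parse import urlparse
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def group_by_category(urls):
    """Group URLs by first path segment (assumption: that is the category)."""
    groups = defaultdict(list)
    for url in urls:
        path = urlparse(url).path.strip("/")
        # The homepage has an empty path, so file it under "root".
        category = path.split("/")[0] if path else "root"
        groups[category].append(url)
    return dict(groups)


def build_sitemap(urls):
    """Return a sitemap XML string listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in sorted(urls):
        loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
        loc.text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs, e.g. exported from a crawl of pages you want indexed.
pages = [
    "http://www.example.com/",
    "http://www.example.com/blog/first-post",
    "http://www.example.com/blog/second-post",
    "http://www.example.com/shop/widget",
]
sitemaps = {name: build_sitemap(urls)
            for name, urls in group_by_category(pages).items()}
```

Each entry in `sitemaps` can then be written out as its own file (one for blog, one for shop, and so on), so the indexed counts are reported per section rather than for the whole site.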
-
Hi Patrick,
Thanks for replying.
Can you recommend any tools for creating the sitemap? I've had a look around, and the few I've found all seem to deliver different results. One has been submitted previously, but I need to go through the process myself so I can understand these basics.
I've had a read up on robots.txt, so I understand what is happening there from an exclusion perspective, and once I understand how the XML sitemap works I'll be able to do an audit as mentioned above.
Ben
-
Ben,
You can check a couple of things:
- Have you submitted your XML sitemap to Google? If not, create one and get it submitted so you tell Google what pages you want indexed.
- Submit your domain and all pages through Google Webmaster Tools as well (Login > left side bar > Crawl > Fetch as Google)
- Screaming Frog is awesome software, so yes, if you have it, use it to scan your pages
- Try a simple "site:domainname.com" search in Google to see what is being indexed from your domain
Cross-reference it all and you will then have a better understanding. I do believe your sitemap is crucial in telling Google exactly what pages you do and do not want indexed. They will follow that. You're on the right track, and I hope my input was helpful! - Patrick
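The cross-referencing step can be roughed out in a few lines of Python. This is only a sketch under some assumptions: you already have two plain lists of URLs (one from a crawl, one of indexed pages gathered however you can), and URLs that differ only by a trailing slash or by upper/lower case in the domain are treated as the same page. The function names and example.com URLs are invented for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Lower-case scheme and host, drop trailing slash and fragment."""
    parts = urlsplit(url.strip())
    # Assumption: /about and /about/ are the same page on this site.
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def find_not_indexed(crawled, indexed):
    """Return crawled URLs that do not appear in the indexed list."""
    indexed_set = {normalize(u) for u in indexed}
    crawled_set = {normalize(u) for u in crawled}
    return sorted(crawled_set - indexed_set)


# Hypothetical data: a crawl export and a list of indexed pages.
crawled = [
    "http://www.example.com/",
    "http://www.example.com/about/",
    "http://www.example.com/contact",
]
indexed = [
    "http://WWW.example.com",
    "http://www.example.com/about",
]

missing = find_not_indexed(crawled, indexed)
print(missing)
```

If your site really does serve different content at /about and /about/, take the trailing-slash handling out of `normalize` so those are compared as separate pages.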