How to find all indexed pages in Google?
-
Hi,
We have an ecommerce site with around 4000 real pages. But our index count is at 47,000 pages in Google Webmaster Tools.
How can I get a list of all pages indexed of our domain? trying to locate the duplicate content.
Doing a "site:www.mydomain.com" only returns up to 676 results...
Any ideas?
Thanks,
Ben
-
You are absolutely right. But if you think that you have duplicate content issues, then Screaming Frog can help you tease that out.
That is also why I suggested the SEOmoz tool, since it is supposed to mimick a SE spider, it can give you a pretty good idea of any issues that you might have.
Using the advanced operator of site:domain makes sense, but if there are issues there like eyepaq said, it is going to be tough sledding.
My suggestion would be to download take a closer look at what GWT is telling you. Are there duplicates there? Is your CMS auto-generating URL's? That is probably going to be your best bet IMO.
Best of luck!
-
@BJS, I would export a file from GWT and filter the results. If your URLs are in GWT, then most likely it's indexed in Google.
-
Thank you to everyone that contributed.
@Zeph and @Francisco - I do use Screaming Frog, but actually, correct me if I am wrong, but it does not show a list of pages indexed, but rather pages that exist in the site - not what Google has already indexed. Thanks anyway
What I wanted was a way of creating a list of all indexed pages in Google - not a count.
But thank you all the same!
-
Hey Zeph! Hope your company is doing great.
@Ben, screaming frog is good for this. You will need to get the paid version of it. There is a video on the site http://www.screamingfrog.co.uk/seo-spider/. Use filters to get to your real URLs.
-
Hi,
There are tools that you can use - though for close 50k pages is harder to crawl. Best bet is the Web master tools count - although is not 100% exact either.
The site:domain is a good indicator but it's generated "on the fly" but it will show you a better result if you go "deeper" and click on page 10-20 and so on.
However right now it looks like there is an issue with site:domain. for more info see: http://www.seroundtable.com/google-site-command-cluster-16829.html
Cheers.
-
Use the tool Screaming Frog to see all your pages, that should help. Also, the SEOmoz toolset has a function that will show you all duplicate content (if you are a pro subscriber).
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Staging website got indexed by google
Our staging website got indexed by google and now MOZ is showing all inbound links from staging site, how should i remove those links and make it no index. Note- we already added Meta NOINDEX in head tag
Intermediate & Advanced SEO | | Asmi-Ta0 -
Page with metatag noindex is STILL being indexed?!
Hi Mozers, There are over 200 pages from our site that have a meta tag "noindex" but are STILL being indexed. What else can I do to remove them from the Index?
Intermediate & Advanced SEO | | yaelslater0 -
Google indexing wrong pages
We have a variety of issues at the moment, and need some advice. First off, we have a HUGE indexing issue across our entire website. Website in question: http://www.localsearch.com.au/ Firstly
Intermediate & Advanced SEO | | localdirectories
In Google.com.au, if you search for 'plumbers gosford' (https://www.google.com.au/#q=plumbers+gosford), the wrong page appears - in this instance, the page ranking should be http://www.localsearch.com.au/Gosford,NSW/Plumbers I can see this across the board, across multiple locations. Secondly
Recently I've seen Google reporting in 'Crawl Errors' in webmaster tools URLs such as:
http://www.localsearch.com.au/Saunders-Beach,QLD/Electronic-Equipment-Sales-Repairs&Sa=U&Ei=xs-XVJzAA9T_YQSMgIHQCw&Ved=0CIMBEBYwEg&Usg=AFQjCNHXPrZZg0JU3O4yTGjWbijon1Q8OA This is an invalid URL, and more specifically, those query strings seem to be referrer queries from Google themselves: &Sa=U&Ei=xs-XVJzAA9T_YQSMgIHQCw&Ved=0CIMBEBYwEg&Usg=AFQjCNHXPrZZg0JU3O4yTGjWbijon1Q8OA Here's the above example indexed in Google: https://www.google.com.au/#q="AFQjCNHXPrZZg0JU3O4yTGjWbijon1Q8OA" Does anyone have any advice on those 2 errors?0 -
Dev Subdomain Pages Indexed - How to Remove
I own a website (domain.com) and used the subdomain "dev.domain.com" while adding a new section to the site (as a development link). I forgot to block the dev.domain.com in my robots file, and google indexed all of the dev pages (around 100 of them). I blocked the site (dev.domain.com) in robots, and then proceeded to just delete the entire subdomain altogether. It's been about a week now and I still see the subdomain pages indexed on Google. How do I get these pages removed from Google? Are they causing duplicate content/title issues, or does Google know that it's a development subdomain and it's just taking time for them to recognize that I deleted it already?
Intermediate & Advanced SEO | | WebServiceConsulting.com0 -
Webmaster or analytics can we find pages that are 404
Hi, Webmaster or analytics can we find pages that are 404 404 landing pages? From which source to what page and from page i get the 404 error when someone accesses my webpage? Need to know which pages are live on sites that are broken in my site Thanks
Intermediate & Advanced SEO | | mtthompsons0 -
Wrong Page Indexing in SERPS - Suggestions?
Hey Moz'ers! I have a quick question. Our company (Savvy Panda) is working on ranking for the keyword: "Milwaukee SEO". On our website, we have a page for "Milwaukee SEO" in our services section that's optimized for the keyword and we've been doing link building to this. However, when you search for "Milwaukee SEO" a different page is being displayed in the SERP's. The page that's showing up in the SERP's is a category view of our blog of articles with the tag "Milwaukee SEO". **Is there a way to alert google that the page showing up in the SERP's is not the most relevant and request a new URL to be indexed for that spot? ** I saw a webinar awhile back that showed something like that using google webmaster sitelinks denote tool. I would hate to denote that URL and then loose any kind of indexing for the keyword.
Intermediate & Advanced SEO | | SavvyPanda
Ideas, suggestions?0 -
Should I prevent Google from indexing blog tag and category pages?
I am working on a website that has a regularly updated Wordpress blog and am unsure whether or not the category and tag pages should be indexable. The blog posts are often outranked by the tag and category pages and they are ultimately leaving me with a duplicate content issue. With this in mind, I assumed that the best thing to do would be to remove the tag and category pages from the index, but after speaking to someone else about the issue, I am no longer sure. I have tried researching online, but there isn't anything that provided any further information. Please can anyone with any experience of dealing with issues like this or with any knowledge of the topic help me to resolve this annoying issue. Any input will be greatly appreciated. Thanks Paul
Intermediate & Advanced SEO | | PaulRogers0