How to find all indexed pages in Google?
-
Hi,
We have an ecommerce site with around 4000 real pages. But our index count is at 47,000 pages in Google Webmaster Tools.
How can I get a list of all pages indexed of our domain? trying to locate the duplicate content.
Doing a "site:www.mydomain.com" only returns up to 676 results...
Any ideas?
Thanks,
Ben
-
You are absolutely right. But if you think that you have duplicate content issues, then Screaming Frog can help you tease that out.
That is also why I suggested the SEOmoz tool, since it is supposed to mimick a SE spider, it can give you a pretty good idea of any issues that you might have.
Using the advanced operator of site:domain makes sense, but if there are issues there like eyepaq said, it is going to be tough sledding.
My suggestion would be to download take a closer look at what GWT is telling you. Are there duplicates there? Is your CMS auto-generating URL's? That is probably going to be your best bet IMO.
Best of luck!
-
@BJS, I would export a file from GWT and filter the results. If your URLs are in GWT, then most likely it's indexed in Google.
-
Thank you to everyone that contributed.
@Zeph and @Francisco - I do use Screaming Frog, but actually, correct me if I am wrong, but it does not show a list of pages indexed, but rather pages that exist in the site - not what Google has already indexed. Thanks anyway
What I wanted was a way of creating a list of all indexed pages in Google - not a count.
But thank you all the same!
-
Hey Zeph! Hope your company is doing great.
@Ben, screaming frog is good for this. You will need to get the paid version of it. There is a video on the site http://www.screamingfrog.co.uk/seo-spider/. Use filters to get to your real URLs.
-
Hi,
There are tools that you can use - though for close 50k pages is harder to crawl. Best bet is the Web master tools count - although is not 100% exact either.
The site:domain is a good indicator but it's generated "on the fly" but it will show you a better result if you go "deeper" and click on page 10-20 and so on.
However right now it looks like there is an issue with site:domain. for more info see: http://www.seroundtable.com/google-site-command-cluster-16829.html
Cheers.
-
Use the tool Screaming Frog to see all your pages, that should help. Also, the SEOmoz toolset has a function that will show you all duplicate content (if you are a pro subscriber).
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to 301 Redirect /page.php to /page, after a RewriteRule has already made /page.php accessible by /page (Getting errors)
A site has its URLs with php extensions, like this: example.com/page.php I used the following rewrite to remove the extension so that the page can now be accessed from example.com/page RewriteCond %{REQUEST_FILENAME}.php -f
Intermediate & Advanced SEO | | rcseo
RewriteRule ^(.*)$ $1.php [L] It works great. I can access it via the example.com/page URL. However, the problem is the page can still be accessed from example.com/page.php. Because I have external links going to the page, I want to 301 redirect example.com/page.php to example.com/page. I've tried this a couple of ways but I get redirect loops or 500 internal server errors. Is there a way to have both? Remove the extension and 301 the .php to no extension? By the way, if it matters, page.php is an actual file in the root directory (not created through another rewrite or URI routing). I'm hoping I can do this, and not just throw a example.com/page canonical tag on the page. Thanks!0 -
Website Does not index in any page?
I created a website www.astrologersktantrik.com 4 days ago and fetch it with google but still my website does not index on google as the keywords I use is with low competition but still my website does not appear on any keywords?
Intermediate & Advanced SEO | | ramansaab0 -
Our client's web property recently switched over to secure pages (https) however there non secure pages (http) are still being indexed in Google. Should we request in GWMT to have the non secure pages deindexed?
Our client recently switched over to https via new SSL. They have also implemented rel canonicals for most of their internal webpages (that point to the https). However many of their non secure webpages are still being indexed by Google. We have access to their GWMT for both the secure and non secure pages.
Intermediate & Advanced SEO | | RosemaryB
Should we just let Google figure out what to do with the non secure pages? We would like to setup 301 redirects from the old non secure pages to the new secure pages, but were not sure if this is going to happen. We thought about requesting in GWMT for Google to remove the non secure pages. However we felt this was pretty drastic. Any recommendations would be much appreciated.0 -
Google indexed wrong pages of my website.
When I google site:www.ayurjeewan.com, after 8 pages, google shows Slider and shop pages. Which I don't want to be indexed. How can I get rid of these pages?
Intermediate & Advanced SEO | | bondhoward0 -
Site not indexed in Google UK
This site was moved to a new host by the client a month back and is still not indexed in Google UK if you search for the site directly. www.loftconversionswestsussex.com Webmaster tools shows that 55 pages have been crawled and no errors have been detected. The client also tried the "Fetch as Google Bot" tactic in GWT as well as running a PPC campaign and the site is still not appearing in Google. Any thoughts please? Cheers, SEO5..
Intermediate & Advanced SEO | | SEO5Team0 -
Website is not indexed in Google, please help with suggestions
Our client website was removed from Google index. Anybody could recommend how to speed up process of re index: Webmaster tools done SM done (Twitter, FB) sitemap.xml done backlinks in process PPC done Robots.txt is fine Guys any recommendations are welcome, client is very unhappy. Thank you
Intermediate & Advanced SEO | | ThinkBDW0 -
My landing page changed in google's serp. I used to have a product page now I have a pdf?
I have been optimizing this page for a few weeks now and and have seen our page for up from 23rd to 11th on the serp's. I come to work today and not only have I dropped to 15 but I've also had my relevant product page replaced by this page . Not to mention the second page is a pdf! I am not sure what happened here but any advice on how I could fix this would be great. My site is www.mynaturalmarket.com and the keyword I'm working on is Zyflamend.
Intermediate & Advanced SEO | | KenyonManu3-SEOSEM0 -
Will Google Revisit a 403 Page
Hi, We've got some pretty strict anti-scraping logic in our website, and it seems we accidentally snared a Googlebot with it. About 100 URL requests were responded to with a 403 Forbidden error. The logic has since been updated, so this should not happen again. I was just wondering if/when Googlebot will come back and try those URLs again. They are linked from other pages on the site, and they are also in our sitemap. Thanks in advance for any assistance.
Intermediate & Advanced SEO | | dbuckles0