Listing of all Google Indexed Pages
-
I started managing a site that has about 391,000 indexed pages. I want to get to the bottom of why there are so many in preparation for a ecommerce Migration and improving SEO. Anyone know of a tool? Many tools I have came across can only take 100 at a time. I would love to get them in excel or a database. I look forward to the suggestions.
-
Using site:yourdomain.com in Google, and then going to the end of the results and telling it to show you all of the results, is a good first start. It should get you enough to get an idea of why there are duplicated pages.
The Moz crawl can also help you figure it out, as often with ecommerce you'll have URLs for sorting products by price, name, pagination parameters, etc. We'll throw up a flag when we see a bunch of duplicate content or duplicate titles.
Also look for the easy stuff, such as non-www doesn't direct to www. Fix that, and you've cut your pages in half.
-
I may not be answering this correctly...
Are you looking for a list of URLs? If so, easy peasy to use screaming frog.
If it's all the pages Google has indexed, I don't really know and I'm sorry! However, I will come back to this thread to see if someone else has the answer for you, because I'm quite interested in it myself!!!
Best of luck,
Amelia
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Source page showsI have 2 h1 tags on my page. I can only find one.
When I grade my page it says I have more than one h1 tag. I view the source page and it shows there are two h1 headings with the same wording. If I delete the one h1 heading I can find, the page source shows I have deleted both of them. I don't know how to get to the other heading to delete it. And I'm off page one of google! Can anybody help? Clay Stephens
Moz Pro | | Coot0 -
Why SEOmoz bot consider these as duplicate pages?
Hello here, SEOmoz bot has recently marked the following two pages as duplicate: http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=mp3 http://www.virtualsheetmusic.com/score/PatrickCollectionFlPf.html?tab=pdf I don't personally see how these pages can be considered duplicate since their content is quite different. Thoughts??!!
Moz Pro | | fablau0 -
Finding page ranking when over 50-100+
How can I find my page ranking (what position i land on in google etc. for a certain keyword) when seomoz and other programs ive tried simply say "50+" or "100+" without giving me the actual number?
Moz Pro | | smarthidbiz0 -
Crawled pages are missing and showing just 1 page crawled
One of my campaign has got around 8500 pages crawled(seomoz) and reports are shown, but suddenly it is showing 1 page crawled. Why it is happened like this? How can i get back the previous reports?
Moz Pro | | Sulekha0 -
Google updated their algorithm. How up to date is OpenSiteExplorer's Domain and Page authority?
Hi guys, as we all know, Google Panda update is here and it changed how they rank inbound links and the linkers domain and page authority. But when I look at OpenSiteExplorer's results, I cant see a difference in the page and domain authority. Why is that?
Moz Pro | | Uds0 -
Why did SEOMoz only crawl 1 page?
I have multiple campaigns and on a few of them SEOMoz has only crawled one page. I think this may have to do with how I set up the campaign. How do I get SEOMoz to crawl more than one page on these campaigns.
Moz Pro | | HermanAdvertising0 -
Yellow Pages
We have just made a yellow pages site n in 3 weeks Google has just indexed 1700 pages out of 18000, so what can we do that Google index all the pages or how the process works? yellowpages.naitazi.com Regards
Moz Pro | | razasaeed0