Listing of all Google Indexed Pages
-
I started managing a site that has about 391,000 indexed pages. I want to get to the bottom of why there are so many, in preparation for an ecommerce migration and improving SEO. Does anyone know of a tool? Many tools I have come across can only take 100 at a time. I would love to get them into Excel or a database. I look forward to the suggestions.
-
Searching site:yourdomain.com in Google, then going to the end of the results and telling it to show you all of the results, is a good first step. It should show you enough to get an idea of why there are duplicated pages.
The Moz crawl can also help you figure it out, since ecommerce sites often have separate URLs for sorting products by price or name, pagination parameters, etc. We'll throw up a flag when we see a bunch of duplicate content or duplicate titles.
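If you already have a list of URLs from a crawl export, a short script can show which query parameters are multiplying your pages. This is only a rough sketch in Python: the function name `summarize_parameters` and the sample URLs are made up for illustration, and you would feed in your own list instead.

```python
from collections import Counter, defaultdict
from urllib.parse import urlsplit, parse_qsl


def summarize_parameters(urls):
    """Count query parameter names and group URLs by their parameter-free address.

    Hypothetical helper for illustration; not part of any Moz tool.
    """
    param_counts = Counter()      # parameter name -> number of URLs using it
    variants = defaultdict(set)   # URL without query string -> full URLs seen
    for url in urls:
        parts = urlsplit(url)
        base = f"{parts.scheme}://{parts.netloc}{parts.path}"
        variants[base].add(url)
        for name, _value in parse_qsl(parts.query, keep_blank_values=True):
            param_counts[name] += 1
    return param_counts, variants


# Made-up sample data standing in for a real crawl export.
sample_urls = [
    "http://www.example.com/shoes",
    "http://www.example.com/shoes?sort=price",
    "http://www.example.com/shoes?sort=name&page=2",
    "http://www.example.com/hats?page=3",
]

param_counts, variants = summarize_parameters(sample_urls)

# Parameters that appear most often are the likeliest source of duplicates.
for name, count in param_counts.most_common():
    print(name, count)

# Pages that exist under several URLs.
for base, urls_for_page in variants.items():
    if len(urls_for_page) > 1:
        print(base, len(urls_for_page))
```

Parameters that show up on a large share of the URLs (sorting and pagination are the usual suspects) are the ones to look at first.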
Also look for the easy stuff, such as non-www not redirecting to www. Fix that, and you may cut your page count roughly in half.
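To see how much of the count comes from the www/non-www split, you can group a URL list by address with the leading "www." ignored. Again this is a hedged sketch: `strip_www` and `find_host_duplicates` are hypothetical names, the URLs are invented, and it only compares hostnames, so it will not catch other kinds of duplication.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def strip_www(host):
    """Lowercase the hostname and drop a leading 'www.' if present."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def find_host_duplicates(urls):
    """Return pages that were seen under more than one hostname variant.

    Hypothetical helper for illustration only.
    """
    seen = defaultdict(set)  # (bare host, path, query) -> hostnames seen
    for url in urls:
        parts = urlsplit(url)
        key = (strip_www(parts.netloc), parts.path or "/", parts.query)
        seen[key].add(parts.netloc.lower())
    return {key: hosts for key, hosts in seen.items() if len(hosts) > 1}


# Made-up sample data standing in for a real list of indexed URLs.
indexed_urls = [
    "http://example.com/a",
    "http://www.example.com/a",
    "http://www.example.com/b",
]

duplicates = find_host_duplicates(indexed_urls)
for (host, path, query), hosts in duplicates.items():
    print(host + path, sorted(hosts))
```

If most pages show up in this output, a site-wide 301 from one hostname to the other is probably the first fix to make.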
-
I may not be answering this correctly...
Are you looking for a list of URLs? If so, it's easy peasy with Screaming Frog.
If it's all the pages Google has indexed, I don't really know and I'm sorry! However, I will come back to this thread to see if someone else has the answer for you, because I'm quite interested in it myself!!!
Best of luck,
Amelia