How to find all indexed pages in Google?
-
Hi,
We have an ecommerce site with around 4000 real pages. But our index count is at 47,000 pages in Google Webmaster Tools.
How can I get a list of all indexed pages for our domain? I'm trying to locate the duplicate content.
Doing a "site:www.mydomain.com" only returns up to 676 results...
Any ideas?
Thanks,
Ben
-
You are absolutely right. But if you think that you have duplicate content issues, then Screaming Frog can help you tease that out.
That is also why I suggested the SEOmoz tool: since it is supposed to mimic a search engine spider, it can give you a pretty good idea of any issues that you might have.
Using the advanced operator of site:domain makes sense, but if there are issues there like eyepaq said, it is going to be tough sledding.
My suggestion would be to download the data and take a closer look at what GWT is telling you. Are there duplicates there? Is your CMS auto-generating URLs? That is probably going to be your best bet IMO.
Best of luck!
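If it helps, here is a rough Python sketch of how auto-generated URL variants could be grouped once you have any list of URLs in hand. It assumes that duplicates differ only by query string, fragment, trailing slash, or host casing, which may not match how your CMS behaves; the function names and sample URLs are made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Lowercase the host, drop query string and fragment, trim a trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, "", ""))


def group_duplicates(urls):
    """Return {normalized_url: [original urls]} for groups with more than one distinct URL."""
    groups = defaultdict(set)
    for url in urls:
        if url.strip():
            groups[normalize(url)].add(url.strip())
    return {key: sorted(members) for key, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    # Hypothetical sample; replace with your own URL list.
    sample = [
        "http://www.mydomain.com/shoes",
        "http://www.mydomain.com/shoes?sort=price",
        "http://www.mydomain.com/shoes/?sessionid=123",
        "http://www.mydomain.com/hats",
    ]
    for key, members in group_duplicates(sample).items():
        print(key, len(members))
```

Any group with several members is a candidate for a canonical tag or parameter handling, but that call depends on whether the variants really serve the same content.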
-
@BJS, I would export a file from GWT and filter the results. If your URLs are in GWT, then most likely they're indexed in Google.
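As a minimal sketch of the filtering step, something like the following could work in Python. It assumes the export is a CSV with a column named "URL" and that the interesting rows are the ones containing a "?"; both are guesses, so check the header row of your own file and adjust `column` and `must_contain`.

```python
import csv
import io


def filter_urls(csv_file, column="URL", must_contain="?"):
    """Yield values from `column` that contain `must_contain`.

    `csv_file` is any open text file or file-like object with a header row.
    The column name "URL" is an assumption about the export format.
    """
    reader = csv.DictReader(csv_file)
    for row in reader:
        url = (row.get(column) or "").strip()
        if must_contain in url:
            yield url


if __name__ == "__main__":
    # Hypothetical in-memory export standing in for the downloaded file.
    export = io.StringIO(
        "URL,Clicks\n"
        "http://www.mydomain.com/a,1\n"
        "http://www.mydomain.com/a?x=1,2\n"
    )
    for url in filter_urls(export):
        print(url)
```

For a real file, pass `open("export.csv", newline="", encoding="utf-8")` instead of the in-memory sample; the filename here is only a placeholder.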
-
Thank you to everyone that contributed.
@Zeph and @Francisco - I do use Screaming Frog, but, correct me if I am wrong, it does not show a list of pages indexed; it shows the pages that exist on the site, not what Google has already indexed. Thanks anyway.
What I wanted was a way of creating a list of all indexed pages in Google - not a count.
But thank you all the same!
-
Hey Zeph! Hope your company is doing great.
@Ben, Screaming Frog is good for this. You will need to get the paid version of it. There is a video on the site http://www.screamingfrog.co.uk/seo-spider/. Use filters to get to your real URLs.
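One way to use a crawl export, sketched in Python: compare the crawled URLs against whatever URL list you can get from Google's side, and look at what appears only in the second list. The two inputs are assumed to be plain lists of one URL per line; the function names are hypothetical and nothing here is a Screaming Frog or GWT feature.

```python
def load_urls(lines):
    """Strip whitespace, skip blank lines, and return a set of URLs."""
    return {line.strip() for line in lines if line.strip()}


def compare(crawled_lines, reported_lines):
    """Split URLs into those only reported, only crawled, or present in both lists."""
    crawled = load_urls(crawled_lines)
    reported = load_urls(reported_lines)
    return {
        "only_reported": sorted(reported - crawled),
        "only_crawled": sorted(crawled - reported),
        "both": sorted(crawled & reported),
    }


if __name__ == "__main__":
    # Hypothetical samples; in practice read each list from a text file.
    crawled = ["http://www.mydomain.com/a\n", "http://www.mydomain.com/b\n"]
    reported = ["http://www.mydomain.com/b", "http://www.mydomain.com/b?ref=1"]
    result = compare(crawled, reported)
    print(result["only_reported"])
```

URLs that show up under `only_reported` are the ones a crawl of the site's own links never reached, which is often where parameter-driven duplicates sit.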
-
Hi,
There are tools that you can use, though a site with close to 50k pages is harder to crawl. Your best bet is the Webmaster Tools count, although that is not 100% exact either.
The site:domain operator is a good indicator, but its count is generated "on the fly"; it will show you a better result if you go "deeper" and click through to page 10-20 and so on.
However, right now it looks like there is an issue with site:domain. For more info see: http://www.seroundtable.com/google-site-command-cluster-16829.html
Cheers.
-
Use the tool Screaming Frog to see all your pages, that should help. Also, the SEOmoz toolset has a function that will show you all duplicate content (if you are a pro subscriber).