Google Search Results...
-
I'm trying to download every Google search result for my company using site:company.com. The most I can get is 100. I tried SEOquake, but it also stops at 100.
The reason for this? I would like to see which pages are indexed. The www pages and subdomain pages should only make up 7,000, but the search results show 23,000. I would like to see what the rest of the 23,000 are.
Any advice on how to go about this? I can check subdomains individually with site:www.company.com and site:static.company.com, but I don't know all the subdomains.
Has anyone cracked this? I tried using a scraper tool, but it was only able to retrieve 200.
-
I see. If you have some idea of which section of your site might be in there that you don't want, you can use site:company.com inurl:whatever to narrow it down. You should know the file name or call used for your search and shop pages, and you can put that name after the inurl: modifier.
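To make the operator idea concrete, here is a minimal sketch that only composes query strings; it does not fetch anything from Google. The helper name build_query and the path fragments are hypothetical. The -site: exclusion is not something mentioned in this thread; it is a standard operator added here as one way to surface subdomains you haven't listed yet.

```python
def build_query(domain, inurl=None, exclude_subdomains=()):
    """Compose a Google query string from the site: and inurl: operators."""
    parts = [f"site:{domain}"]
    # Assumption: -site:sub.domain drops an already-known subdomain,
    # so whatever remains comes from hosts you have not listed yet.
    parts += [f"-site:{sub}" for sub in exclude_subdomains]
    if inurl:
        parts.append(f"inurl:{inurl}")
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical URL fragments for pages that should not be indexed.
    for fragment in ("search", "basket", "checkout"):
        print(build_query("company.com", inurl=fragment))
    # Everything on the domain except the two subdomains already known.
    print(build_query(
        "company.com",
        exclude_subdomains=("www.company.com", "static.company.com"),
    ))
```

Paste each printed line into the search box. The result count per query gives a rough idea of where the extra pages sit.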
-
The goal is to identify which pages Google is indexing and whether there are any it shouldn't be. (We don't index search pages, and we don't index basket or checkout pages.)
I do know all of the subdomains, and searching them individually doesn't add up to the total count I get for site:company.com.
My Moz reports don't show duplicate pages, so it can't be that. If I were able to download the full Google search results into a spreadsheet, I could quickly filter them and see which pages are being indexed that shouldn't be.
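The spreadsheet filtering described above could be sketched like this, assuming a list of indexed URLs has already been obtained from some export. The function names and the path fragments in BLOCKED_FRAGMENTS are hypothetical placeholders; swap in whatever your search, basket and checkout URLs actually contain.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical path fragments for pages that should not be indexed.
BLOCKED_FRAGMENTS = ("/search", "/basket", "/checkout")


def count_by_host(urls):
    """Count URLs per hostname, to see which subdomains make up the total."""
    return Counter(urlsplit(u).hostname for u in urls)


def flag_unwanted(urls, fragments=BLOCKED_FRAGMENTS):
    """Return the URLs whose path contains any of the blocked fragments."""
    return [
        u for u in urls
        if any(f in urlsplit(u).path.lower() for f in fragments)
    ]


if __name__ == "__main__":
    # Made-up sample rows standing in for an exported URL column.
    sample = [
        "https://www.company.com/",
        "https://www.company.com/search?q=a",
        "https://static.company.com/a.pdf",
        "https://shop.company.com/basket/",
    ]
    print(count_by_host(sample))
    print(flag_unwanted(sample))
```

The per-host count shows whether an unexpected subdomain accounts for the gap between 7,000 and 23,000. The flagged list shows which URLs match pages that were never meant to be indexed.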
-
Ok, but what's your goal with this? And why don't you know your own subdomains that you've created? It seems like you could work backwards from a better starting point by applying those things.
-
My GA is only focused on a single domain, as the subdomains hold just PDFs, images, etc. Traffic reports from GA are focused on www.company.com pages.
The only way to know exactly which URLs have been indexed seems to be going through the Google search results, but they cap out after 7 pages.
-
Hi Cyto. Why don't you try exporting the pages receiving google/organic visits from Google Analytics, using Landing Page as a secondary dimension? It won't be all-inclusive, but it will give you a good idea of which pages are indexed and drawing in visitors. You can then compare that data against your sitemaps.
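The comparison against the sitemap could be sketched roughly as follows. It assumes the landing pages were exported as paths (for example /about) and that the sitemap follows the standard sitemaps.org urlset format. The function names are hypothetical, and query strings are dropped before comparing.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace used by standard sitemaps.org urlset files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_paths(xml_text):
    """Extract the path of every <loc> entry in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        urlsplit(loc.text.strip()).path or "/"
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
    }


def compare(landing_paths, xml_text):
    """Return (landing pages not in the sitemap, sitemap pages with no landings)."""
    in_sitemap = sitemap_paths(xml_text)
    landed = {p.split("?")[0] for p in landing_paths}
    return landed - in_sitemap, in_sitemap - landed


if __name__ == "__main__":
    # Made-up sitemap and landing-page export, for illustration only.
    xml_text = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://www.company.com/</loc></url>"
        "<url><loc>https://www.company.com/about</loc></url>"
        "</urlset>"
    )
    extra, missing = compare(["/", "/search?q=x"], xml_text)
    print("Getting organic visits but not in the sitemap:", extra)
    print("In the sitemap but no organic landings:", missing)
```

Pages in the first set are indexed and drawing visits even though the sitemap never listed them, which makes them the ones to review first.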