Export list of urls in google's index?
-
Is there a way to export an exact list of urls found in Google's index?
-
hmm, I actually need a near to complete list if possible
-
A good place to start would be to go to the Google Webmaster Tools. If you haven't set it up, I'd highly recommend it. Expand the "Your site on the web" section, and click on the "Search queries" report. The default tab is by "Top queries". Click the tab next to it for "Top pages". This will show all the pages with some information as well for each page, such as impressions, clicks, CTR, and average position on Google SERPs.
I'm not sure if this is a complete list, but it's what I go off of.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is it necessary to use Google's Structured Data Markup or alternative for my B2B site?
Hi, We are in the process of going through a re-design for our site. Am trying to understand if we need to use some sort of structured data either from Google Structured data or schema. org?
Intermediate & Advanced SEO | | Krausch0 -
Removing Parameterized URLs from Google Index
We have duplicate eCommerce websites, and we are in the process of implementing cross-domain canonicals. (We can't 301 - both sites are major brands). So far, this is working well - rankings are improving dramatically in most cases. However, what we are seeing in some cases is that Google has indexed a parameterized page for the site being canonicaled (this is the site that is getting the canonical tag - the "from" page). When this happens, both sites are being ranked, and the parameterized page appears to be blocking the canonical. The question is, how do I remove canonicaled pages from Google's index? If Google doesn't crawl the page in question, it never sees the canonical tag, and we still have duplicate content. Example: A. www.domain2.com/productname.cfm%3FclickSource%3DXSELL_PR is ranked at #35, and B. www.domain1.com/productname.cfm is ranked at #12. (yes, I know that upper case is bad. We fixed that too.) Page A has the canonical tag, but page B's rank didn't improve. I know that there are no guarantees that it will improve, but I am seeing a pattern. Page A appears to be preventing Google from passing link juice via canonical. If Google doesn't crawl Page A, it can't see the rel=canonical tag. We likely have thousands of pages like this. Any ideas? Does it make sense to block the "clicksource" parameter in GWT? That kind of scares me.
Intermediate & Advanced SEO | | AMHC0 -
Website with only a portion being 'mobile friendly' -- what to tell Google?
I have a website for desktop that does a lot of things, and have converted part of it do show pages in a mobile friendly format based on the users device. Not responsive design, but actual diff code with different formatting by mobile vs desktop--but each still share the same page url name. Google allows this approach. The mobile-friendly part of the site is not as extensive as desktop, so there are pages that apply to the desktop but not for mobile. So the functionality is limited some for mobile devices, and therefore some pages should only be indexed for desktop users. How should that page be handled for Google crawlers? If it is given a 404 not found for their mobile bot will Google properly still crawl it for the desktop, or will Google see that the url was flagged as 'not found' and not crawl it for the desktop? I asked a similar question yest, but it was not stated clearly. Thanks,Ted
Intermediate & Advanced SEO | | friendoffood0 -
Is it a good or bad idea (in Google's eyes) to add a forum to my website?
I have an active website with many users adding dozens of comments on the many pages of the site daily. I'm am wondering if it would be good for the overall ranking strength of the site if I were to add a forum to it (in a subdirectory, like forum.mysite.com). On one hand, I can see the forum posts as thin content, which Google wouldn't care for. On the other hand, I see the additional user engagement on the site, which I think Google would like. I know the benefits it can have to the users, but for this question, all I want to know is if this would be seen by Google as a plus or a minus for my site, assuming the forum succeeded in becoming popular. I don't want to do anything that will diminish the value of my site in Google's eyes. Thank you.
Intermediate & Advanced SEO | | bizzer0 -
Does Google Read URL's if they include a # tag? Re: SEO Value of Clean Url's
An ECWID rep stated in regards to an inquiry about how the ECWID url's are not customizable, that "an important thing is that it doesn't matter what these URLs look like, because search engines don't read anything after that # in URLs. " Example http://www.runningboards4less.com/general-motors#!/Classic-Pro-Series-Extruded-2/p/28043025/category=6593891 Basically all of this: #!/Classic-Pro-Series-Extruded-2/p/28043025/category=6593891 That is a snippet out of a conversation where ECWID said that dirty urls don't matter beyond a hashtag... Is that true? I haven't found any rule that Google or other search engines (Google is really the most important) don't index, read, or place value on the part of the url after a # tag.
Intermediate & Advanced SEO | | Atlanta-SMO0 -
'Nofollow' footer links from another site, are they 'bad' links?
Hi everyone,
Intermediate & Advanced SEO | | romanbond
one of my sites has about 1000 'nofollow' links from the footer of another of my sites. Are these in any way hurtful? Any help appreciated..0 -
What's the best way to check Google search results for all pages NOT linking to a domain?
I need to do a bit of link reclamation for some brand terms. From the little bit of searching I've done, there appear to be several thousand pages that meet the criteria, but I can already tell it's going to be impossible or extremely inefficient to save them all manually. Ideally, I need an exported list of all the pages mentioning brand terms not linking to my domain, and then I'll import them into BuzzStream for a link campaign. Anybody have any ideas about how to do that? Thanks! Jon
Intermediate & Advanced SEO | | JonMorrow0 -
What is the value of Google Crawling Dynamic URLS with NO SEO
Hi All I am Working on travel site for client where there are 1000's of product listing pages that are dynamically created. These pages are not SEO optimised and are just lists of products with no content other than the product details. There are no meta tags for title and description on the listings pages. You then click Find Out more to go to the full product details. There is no way to SEO these Dynamic pages This main product details has no content other than details and now meta tags. To help increase my google rankings for the rest of the site which is search optimised would it be better to block google from indexing these pages. Are these pages hurting my ability to improve rankings if my SEO of the content pages has been done to a good level with good unique Titles, descriptions and useful content thanks In advance John
Intermediate & Advanced SEO | | ingageseo0