Indexed sites
-
Hello, I'm a newbie here on SEOmoz. My question is simple, but I need to be sure about it. Referring to the report you can create in the "crawl diagnostic summary": does the CSV export list all and ONLY the URLs which are in the Google index? If not, is a report available which lists all and ONLY the URLs that are in the Google index? Many thanks! Henrik
-
Hi Henrik,
Like Tom and Chris mentioned, our tools won't be able to tell you which of your pages are indexed by Google.
That being said, the crawl diagnostics will tell you whether your redirected pages are reachable by search engines, and OSE will show you your internal and external links. OSE is updated every few weeks, so it might not reflect your most recent changes. Using a combination of your campaign crawls and OSE should get you the data you need to make sure you're in the best position for your site relaunch.
Hope this helps and good luck with relaunching your site!
Best,
Sam
Moz Helpster
-
Use your OSE report to find pages that are being linked to from external sources and be sure to redirect all of those. Use your analytics report to find pages that are bringing in search traffic and be sure to redirect those too. The rest are neither being linked to nor bringing you any traffic, so their priority is on the low end anyway.
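If you want to combine the two exports rather than eyeball them, a short script can merge them into one priority list. This is just a minimal sketch: the column names ("Target URL", "Landing Page") and the inline sample data are assumptions, not the actual headers Moz or your analytics package uses, so adjust them to match your real CSV files.

```python
import csv
import io

def load_urls(csv_text, url_column):
    """Read one column of URLs from a CSV export into a set."""
    return {row[url_column].strip() for row in csv.DictReader(io.StringIO(csv_text))}

# In practice you'd read the downloaded CSV files from disk; the inline
# text here is made-up sample data so the sketch runs standalone.
ose_export = "Target URL\nhttp://example.com/a\nhttp://example.com/b\n"
analytics_export = "Landing Page,Visits\nhttp://example.com/b,120\nhttp://example.com/c,15\n"

linked = load_urls(ose_export, "Target URL")            # pages with external links
traffic = load_urls(analytics_export, "Landing Page")   # pages bringing in search traffic

# Anything linked externally OR earning traffic gets a 301 first;
# everything else is low priority, as described above.
priority = sorted(linked | traffic)
print(priority)
```

The union of the two sets is exactly the "redirect these first" list from the advice above; anything outside it can wait.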
-
Thanks for your answers. My question was more about getting a full report of the pages which are indexed by Google. I'm going to do a relaunch soon. On the new website I changed a lot of categories and URLs. Now I want to make sure that all pages which are indexed are redirected to the new URLs with a 301. It would be very helpful to have a file with ALL URLs which are currently indexed by Google.
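Once you have settled on an old-to-new URL mapping for the relaunch, generating the 301 rules themselves is mechanical. Here is a hedged sketch that turns a mapping CSV into Apache `Redirect 301` lines; the column names ("old", "new") and the sample paths are invented for illustration, and the output assumes an Apache setup, so translate the rule format if your server differs.

```python
import csv
import io

def redirect_rules(mapping_csv):
    """Turn an old-URL -> new-URL mapping into Apache 301 redirect rules.

    Column names 'old' and 'new' are assumptions -- rename them to
    match whatever spreadsheet you actually maintain.
    """
    reader = csv.DictReader(io.StringIO(mapping_csv))
    return ["Redirect 301 {} {}".format(row["old"], row["new"]) for row in reader]

# Inline sample mapping so the sketch runs standalone; the paths are made up.
mapping = "old,new\n/shoes/red,/category/shoes/red\n/hats,/category/hats\n"
for rule in redirect_rules(mapping):
    print(rule)
```

You could paste the generated lines into an `.htaccess` file or server config, then spot-check a few old URLs in a browser to confirm they return a 301 to the new location.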
-
Hi goodcat, you'll see from the FAQs on this page http://www.seomoz.org/help/crawl-diagnostics that the report data comes from SEOmoz's own crawl, not from Google's index. Your question was asked and some answers provided not too long ago over here: http://www.seomoz.org/q/is-it-possible-to-get-a-list-of-pages-indexed-in-google.
-
Hi Henrik
I have just checked this for you and yes, SEOmoz will show you URLs that have been deindexed by Google. As proof, here is a URL from a site I've worked with that came up in the SEOmoz crawl but which, as you will see, has been de-indexed by Google (as has the whole domain):
http://www.doublebitbd.com/otherservices.htm
I'm not sure if there is a tool that will show you only the URLs that exist in the Google index, I'm afraid, although I could be wrong. I do know a tool that will tell you which links pointing to your website have been deindexed: the LinkDetox tool. It will show you which of those links are "toxic", meaning the URLs they sit on have been deindexed; the remaining links will be live ones that still point to your website.
You could also export the URLs in the SEOMoz report and run them through this IndexCheckingTool. I'm not sure how accurate it is, but if it flags URLs from your Moz list that are not indexed, you could manually filter them out.
Hope this helps.