Do you know any tool(s) to check if Google can crawl a URL?
-
Our site is currently blocking search bots, which is why I can't use Google Webmaster Tools' URL fetch tool.
In Screaming Frog, some dynamic pages aren't found when I start the crawl from the homepage.
Thanks in advance!
-
I'm using a tool called GSiteCrawler at the moment. I'm new to it, but it will list all crawlable pages and create a sitemap.xml for you too!
-
You could consider this online tool: http://web-sniffer.net
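If the block is done through robots.txt, one rough way to check individual URLs (including dynamic ones a crawler never discovers from the homepage) is to test them against the rules directly. The snippet below is only a sketch using Python's standard `urllib.robotparser`; the sample robots.txt, the example.com URLs, and the helper name `googlebot_allowed` are made up for illustration. Python's parser does not follow exactly the same matching rules as Google's, and it says nothing about other kinds of blocking (noindex tags, logins, server-side user-agent filtering), so treat the result as a hint rather than a verdict.

```python
from urllib.robotparser import RobotFileParser


def googlebot_allowed(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url.

    Hypothetical helper: it only evaluates robots.txt rules and uses
    Python's matching logic, which may differ from Google's in edge cases.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up robots.txt: Googlebot is kept out of /private/ only,
# every other bot is blocked from the whole site.
SAMPLE_ROBOTS = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""

if __name__ == "__main__":
    for url in (
        "https://www.example.com/products?id=42",
        "https://www.example.com/private/report",
    ):
        print(url, googlebot_allowed(SAMPLE_ROBOTS, url))
```

To test a live site, paste the contents of its own robots.txt into `SAMPLE_ROBOTS` and list the dynamic URLs you care about.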