ScreamingFrog won't crawl my site.
-
Hey guys,
My site is Netspiren.dk, and when I use a tool like Screaming Frog or Integrity, it only crawls my homepage and menus, not the product pages.
Examples
A menu: http://www.netspiren.dk/pl/Helse-Kosttilskud-Blandingsolie_57699.aspx
A product: http://www.netspiren.dk/pi/All-Omega-3-6-9-180-kapsler_1412956_57699.aspx
Is it because the products are being loaded with JavaScript?
What's your recommendation?
All best,
Fred
-
Hi,
Thank you for this question and the responses because we encountered the same issue; Screaming Frog was only crawling a handful of products out of hundreds, because of JS. We made significant changes to the redirect rules on our dev site, and we want to make sure that the changes will not cause any crawling errors before we deploy to the live site. Is there any way to disable JS just for the purpose of a Screaming Frog crawl?
Our dev site is: https://msc-nop.com
Our regular site is: https://medicalscrubscollection.com
Thanks in advance!
-
I'm not sure if this has been fixed already, and thank you to Dan for chiming in, but I was able to crawl around 700 URLs.
-
Cheers @Andy & @Patrick
Hi Fred,
I haven't performed an extensive check, but the SEO Spider crawls around 35 URLs with /pi/ in the string, which is presumably not all the products on the site.
Patrick actually mentions the issue in one of his points above. Essentially it looks like the site uses JavaScript on category pages for products, example - http://www.netspiren.dk/pl/Helse-Homøopati-Allergica-Ron-serien_58721.aspx
If you disable JS in your browser, you'll see a blank page where the products were. Our tool doesn't execute JS, although Google is much smarter and often can.
However, I'll leave you to verify that.
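To see what a crawler that doesn't execute JS receives from a category page, you can count the product links in the raw HTML. This is only a minimal sketch in Python using the standard library; the function names and the user agent string are made up for illustration, and the "/pi/" marker is simply the pattern seen in the product URLs in this thread.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags in raw, unrendered HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def product_links(html, marker="/pi/"):
    """Return the hrefs found in html that contain marker."""
    collector = LinkCollector()
    collector.feed(html)
    return [href for href in collector.links if marker in href]


def fetch_raw_html(url, user_agent="Mozilla/5.0 (compatible; link-check)"):
    """Download the page source without running any JavaScript."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=30) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling product_links(fetch_raw_html(url)) on a category URL gives the product links that are present before any script runs. If that list is empty or much shorter than what you see in the browser, the products are most likely being injected by JavaScript.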
Hope that helps!
Cheers
Dan
-
I have sent Dan from Screaming Frog a tweet for you Fred. I'm sure he will be along presently
-Andy
-
Hi there
It's crawling for me. Here is a list of reasons why Screaming Frog might not crawl a site:
- The site is blocked by robots.txt. A count of pages blocked by robots.txt is shown in the crawl overview pane on the top right-hand side of the user interface. You can configure the SEO Spider to ignore robots.txt by going to the “Basic” tab under Configuration->Spider.
- The site behaves differently depending on User Agent. Try changing the User Agent under Configuration->User Agent.
- The site requires JavaScript. Try looking at the site in your browser with JavaScript disabled.
- The site requires Cookies. Can you view the site with cookies disabled in your browser? Licenced users can enable cookies by going to Configuration->Spider and ticking “Allow Cookies” in the “Advanced” tab.
- The ‘nofollow’ attribute is present on links not being crawled. There is an option in Configuration->Spider under the “Basic” tab to follow ‘nofollow’ links.
- The page has a page-level ‘nofollow’ attribute. This could be set by either a meta robots tag or an X-Robots-Tag in the HTTP header. These can be seen in the “Directives” tab in the “Nofollow” filter.
- The website is using framesets. The SEO Spider does not crawl the frame src attribute.
- The Content-Type header did not indicate the page is html. This is shown in the Content column and should be either text/html or application/xhtml+xml.
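Several of the checks in the list above (robots.txt, Content-Type, X-Robots-Tag, meta robots, framesets) can be scripted. Here is a rough sketch in Python using only the standard library. The function names are made up for illustration, the default user agent string is an assumption about how the SEO Spider identifies itself (use whatever you have set under Configuration->User Agent), and the meta tag pattern is deliberately simple, so treat the results as hints rather than a verdict.

```python
import re
from urllib.robotparser import RobotFileParser

# Content types the list above names as crawlable.
CRAWLABLE_TYPES = ("text/html", "application/xhtml+xml")

# Rough pattern for <meta name="robots" ...>; real pages may vary.
META_ROBOTS = re.compile(r'<meta[^>]+name=["\']robots["\'][^>]*>', re.IGNORECASE)


def blocked_by_robots(robots_txt, url, user_agent="Screaming Frog SEO Spider"):
    """True if the given robots.txt text disallows url for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


def header_issues(headers):
    """Check response headers (a dict with lowercase keys) for crawl blockers."""
    issues = []
    content_type = headers.get("content-type", "").split(";")[0].strip().lower()
    if content_type not in CRAWLABLE_TYPES:
        issues.append(f"Content-Type is {content_type or 'missing'}, not HTML")
    if "nofollow" in headers.get("x-robots-tag", "").lower():
        issues.append("X-Robots-Tag header contains nofollow")
    return issues


def html_issues(html):
    """Check page source for a page-level nofollow or a frameset."""
    issues = []
    for tag in META_ROBOTS.findall(html):
        if "nofollow" in tag.lower():
            issues.append("meta robots tag contains nofollow")
    if re.search(r"<frameset\b", html, re.IGNORECASE):
        issues.append("page uses framesets")
    return issues
```

All three functions work on text you have already downloaded, so you can paste in a robots.txt file, the response headers, and the page source from your browser's developer tools and see which of the reasons above might apply.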
Run through your settings and check whether you may have turned something on inadvertently. One thing you can try is to go to Configuration > Spider and then to the last option, Ignore robots.txt. Tick the checkbox and try running the crawl again.
It could just be a slow connection on your end. Give it a few minutes and see if any of the above suggestions work.
Hope this helps! Good luck!