Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Screaming frog Advice
-
Hi
I am trying to crawl my site and it keeps crashing.
My sys admins keeps upgrading the virtual box it sits on and it now currently has 8GB of memory, but still crashes.
It gets to around 200k pages crawl and dies.
Any tips on how I can crawl my whole site, can u use screaming frog to crawl part of a site.
Thanks in advance for any tips.
Andy
-
Thanks, I tried all the tips on the screaming frog site, but I have just tried to 2 pages a second and lets hope that work.
-
Hi Andy. There are quite a few settings you can adjust to make the server load less while the crawl is running. These can be found with descriptions here: http://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/
For example, by not checking Images, CSS, SWF, and Javascript you'll be able to lessen load substantially, or if you'd like to crawl just a portion of the site you can set it to not check links outside of the start folder.
To have even more control over the crawl, you can use regular expressions to exclude certain pages, or sections that match a given pattern. The page above is fairly robust, so it should help you dial back the crawler to be friendlier to your server. Cheers!
-
Hey there mate,
Sorry to hear that you are having issues. You can actually ask Screaming Frog to use more RAM. If you haven't done that yet please give it a go.
You can find more here http://www.screamingfrog.co.uk/seo-spider/user-guide/general/
If you want to crawl part of your site it can surely do that. You can exclude pages or whole sections.
Find more here http://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/
Hope this helps!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Canonical and Alternate Advice
At the moment for most of our sites, we have both a desktop and mobile version of our sites. They both show the same content and use the same URL structure as each other. The server determines whether if you're visiting from either device and displays the relevant version of the site. We are in a predicament of how to properly use the canonical and alternate rel tags. Currently we have a canonical on mobile and alternate on desktop, both of which have the same URL because both mobile and desktop use the same as explained in the first paragraph. Would the way of us doing it at the moment be correct?
Intermediate & Advanced SEO | | JH_OffLimits3 -
Please help need some advice?
Can any of you guys please help me I have alerts on links coming in and it looks like recently someone did this, it looks maliciously done as it is only our domain mentioned and most are brand new posts? http://testosteroneclinicindenve53950.shotblogs.com/testosterone-clinic-in-denver-fundamentals-explained-6102386 http://claytondmnnp.ampedpages.com/Details-Fiction-and-testosterone-clinic-in-denver-16897309 http://vinylvehiclecarwrap38041.alltdesign.com/a-review-of-vinyl-vehicle-car-wrap-9574042 http://devinxccct.educationalimpactblog.com/1784474/little-known-facts-about-vinyl-vehicle-car-wrap http://keeganbsftf.ka-blogs.com/7488539/how-vinyl-vehicle-car-wrap-can-save-you-time-stress-and-money http://andybxoes.thezenweb.com/vinyl-vehicle-car-wrap-Fundamentals-Explained-17581028 http://kylerhfdzu.blogkoo.com/not-known-details-about-vinyl-vehicle-car-wrap-9029141 http://troyytkyn.timeblog.net/7695911/the-greatest-guide-to-vinyl-vehicle-car-wrap http://waylontyzab.pointblog.net/testosterone-clinic-in-denver-Secrets-16335972 http://testosteroneclinicindenve30516.onesmablog.com/Top-testosterone-clinic-in-denver-Secrets-17252737 http://emiliogkmop.blogofoto.com/7667522/top-guidelines-of-testosterone-clinic-in-denver http://caidenaczxt.blogs-service.com/7514172/testosterone-clinic-in-denver-fundamentals-explained http://daltonpyfms.mybjjblog.com/5-simple-statements-about-testosterone-clinic-in-denver-explained-6517932 Should I try to disavow these and submit to google or will google know our site which has been up for 5 years is not doing this? Should I do any of these https://tehnoblog.org/google-webmaster-tools-my-website-got-bombed-with-backlinks-what-to-do/
Intermediate & Advanced SEO | | BobAnderson0 -
SEO'ing a sports advice website
Hi Team Moz, Despite being in tech/product development for 10+ years, I'm relatively new to SEO (and completely new to this forum) so was hoping for community advice before I dive in to see how Google likes (or perhaps doesn't) my soon to be built content. I'm building a site (BetSharper, an early-stage work in progress) that will deliver practical, data orientated predictive advice prior to sporting events commencing. The initial user personas I am targeting would need advice on specific games so, as an example, I would build a specific page for the upcoming Stanley Cup Game 1 between the Capitals and the Tampa Bay Lighting. I'm in the midst of keyword research and believe I have found some easier to achieve initial keywords (I'm realistic, building my DA will take time!) that include the team names but don't reference dates or state of the tournament. The question is, hypothetically if I ranked for this page for this sporting event this year, would it make sense to refresh the same page with 2019 matchup content when they meet again next year, or create a new page? I am assuming I would be targeting the same intended keywords but wondering if I get google credit for 2018 engagement post 2019 refresh. Or should I start fresh with a new page and specifically target keywords afresh each time? I read some background info on canonical tabs but wasn't sure if it was relevant in my case. I hope I've managed to articulate myself on what feels like an edge case within the wonderful world of SEO. Any advice the community delivers would be much appreciated...... Kind Regards James.
Intermediate & Advanced SEO | | JB19770 -
302 > 302 > 301 Redirect Chain Issue & Advice
Hi everyone, I recently relaunched our website and everything went well. However, while checking site health, I found a new redirect chain issue (302 > 302 > 301 > 200) when the user requests the HTTP and non-www version of our URL. Here's what's happening: • 302 #1 -- http://domain.com/example/ 302 redirects to http://domain.com/PnVKV/example/ (the 5 characters in the appended "subfolder" are dynamic and change each time)
Intermediate & Advanced SEO | | Andrew_In_Search_of_Answers
• 302 #2 -- http://domain.com/PnVKV/example/ 302 redirects BACK to http://domain.com/example/
• 301 #1 -- http://domain.com/example/ 301 redirects to https://www.domain.com/example/ (as it should have done originally)
• 200 -- https://www.domain.com/example/ resolves properly We're hosted on AWS, and one of my cloud architects investigated and reported GoDaddy was causing the two 302s. That's backed up online by posts like https://stackoverflow.com/questions/46307518/random-5-alpha-character-path-appended-to-requests and https://www.godaddy.com/community/Managing-Domains/My-domain-name-not-resolving-correctly-6-random-characters-are/td-p/60782. I reached out to GoDaddy today, expecting them to say it wasn't a problem on their end, but they actually confirmed this was a known bug (as of September 2017) but there is no timeline for a fix. I asked the first rep I spoke with on the phone to send a summary, and here's what he provided in his own words: From the information gathered on my end and I was able to get from our advanced tech support team, the redirect issue is in a bug report and many examples have been logged with the help of customers, but no log will be made in this case due to the destination URL being met. Most issues being logged are site not resolving properly or resolving errors. I realize the redirect can cause SEO issues with the additional redirects occurring. Also no ETA has been logged for the issue being reported. I do feel for you since I now understand more the SEO issues it can cause. I myself will keep an eye out for the bug report and see if any progress is being made any info outside of this I will email you directly. Thanks. Issue being Experienced: Domains that are set to Go Daddy forwarding IPs may sometimes resolve to a url that has extra characters appended to the end of them. Example: domain1.com forwards to http://www.domain2.com/TLYEZ. However it should just forward to http://www.domain2.com. I think this answers what some Moz users may have been experiencing sporadically, especially this previous thread: https://moz.com/community/q/forwarded-vanity-domains-suddenly-resolving-to-404-with-appended-url-s-ending-in-random-5-characters. My question: Given everything stated above and what we know about the impact of redirect chains on SEO, how severe should I rate this? I told my Director that I would recommend we move away from GoDaddy (something I don't want to do, but feel we _**have **_to do), but she viewed it as just another technical SEO issue and one that didn't necessarily need to be prioritized over others related to the relaunch. How would you respond in my shoes? On a scale of 1 to 10 (10 being the biggest), how big of a technical SEO is this? Would you make it a priority? At the very least, I thought the Moz community would benefit from the GoDaddy confirmation of this issue and knowing about the lack of an ETA on a fix. Thanks!0 -
Screaming Frog returning both HTTP and HTTPS results...
Hi, About 10 months I switched from HTTP to HTTPS. I then switched back (long story). I noticed that Screaming Frog is picking up the HTTP and HTTPS version of the site. Maybe this doesn't matter, but I'd like to know why SF is doing that. The URL is: www.aerlawgroup.com Any feedback, including how to remove the HTTPS version, is greatly appreciated. Thanks.
Intermediate & Advanced SEO | | mrodriguez14400 -
Strange 404s in Screaming Frog
I just ran a website (Drupal) through screaming frog and the only 404s I found related to web pages which were the same as URLs already used on the website plus the company phone number so... www.company.com/[their phone number] - www.company.com/services[their phone number] - any ideas what might be causing this problem?
Intermediate & Advanced SEO | | McTaggart0 -
Can't crawl website with Screaming frog... what is wrong?
Hello all - I've just been trying to crawl a site with Screaming Frog and can't get beyond the homepage - have done the usual stuff (turn off JS and so on) and no problems there with nav and so on- the site's other pages have indexed in Google btw. Now I'm wondering whether there's a problem with this robots.txt file, which I think may be auto-generated by Joomla (I'm not familiar with Joomla...) - are there any issues here? [just checked... and there isn't!] If the Joomla site is installed within a folder such as at e.g. www.example.com/joomla/ the robots.txt file MUST be moved to the site root at e.g. www.example.com/robots.txt AND the joomla folder name MUST be prefixed to the disallowed path, e.g. the Disallow rule for the /administrator/ folder MUST be changed to read Disallow: /joomla/administrator/ For more information about the robots.txt standard, see: http://www.robotstxt.org/orig.html For syntax checking, see: http://tool.motoricerca.info/robots-checker.phtml User-agent: *
Intermediate & Advanced SEO | | McTaggart
Disallow: /administrator/
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /modules/
Disallow: /plugins/
Disallow: /tmp/0 -
SEO Advice for Angular JS
We are changing our homepage (and gradually the rest of the site) to Angular JS.
Intermediate & Advanced SEO | | theLotter
In order not to lose anything in terms of SEO we are implementing Hashbangs + escaped fragment snapshots. Are there any other SEO considerations you think we should have and/or additional elements that we could add to the page to improve it in terms of SEO?0