Screaming Frog Advice
-
Hi,
I am trying to crawl my site and Screaming Frog keeps crashing.
My sysadmins keep upgrading the virtual box it sits on, and it now has 8GB of memory, but it still crashes.
It gets to around 200k pages crawled and dies.
Any tips on how I can crawl my whole site? Can you use Screaming Frog to crawl just part of a site?
Thanks in advance for any tips.
Andy
-
Thanks. I tried all the tips on the Screaming Frog site, and I have just tried limiting the crawl to 2 pages a second, so let's hope that works.
-
Hi Andy. There are quite a few settings you can adjust to reduce the load on the server while the crawl is running. They are listed with descriptions here: http://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/
For example, unchecking Images, CSS, SWF, and JavaScript will lessen the load substantially, and if you'd like to crawl just a portion of the site you can set it to not check links outside of the start folder.
To have even more control over the crawl, you can use regular expressions to exclude pages or sections that match a given pattern. The page above is fairly thorough, so it should help you dial back the crawler to be friendlier to your server. Cheers!
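To get a feel for how exclude patterns like that behave before pasting them into the crawler, here is a minimal Python sketch. It assumes a pattern has to match the whole URL (check the configuration guide linked above for the tool's exact matching rules). The example.com domain, the paths, and the is_excluded helper are all made up for illustration and are not part of Screaming Frog.

```python
import re

# Hypothetical exclude patterns; the domain and paths are invented for this example.
EXCLUDE_PATTERNS = [
    r"https?://www\.example\.com/forum/.*",  # everything under one section
    r".*\?sort=.*",                          # any URL carrying a sort parameter
    r".*\.pdf",                              # URLs that end in .pdf
]


def is_excluded(url, patterns=EXCLUDE_PATTERNS):
    """Return True if any pattern matches the entire URL (assumed full-match rule)."""
    return any(re.fullmatch(pattern, url) for pattern in patterns)


if __name__ == "__main__":
    sample_urls = [
        "http://www.example.com/forum/thread-1",
        "http://www.example.com/products?sort=price",
        "http://www.example.com/products/widget",
    ]
    for url in sample_urls:
        print(url, "->", "excluded" if is_excluded(url) else "crawled")
```

Running a handful of real URLs from the site through a check like this is a cheap way to confirm a pattern drops the section you expect and nothing else.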
-
Hey there mate,
Sorry to hear that you are having issues. You can actually ask Screaming Frog to use more RAM. If you haven't done that yet, please give it a go.
You can find more here: http://www.screamingfrog.co.uk/seo-spider/user-guide/general/
If you want to crawl just part of your site, it can certainly do that. You can exclude pages or whole sections.
Find more here: http://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/
Hope this helps!