Screaming Frog Advice
-
Hi
I am trying to crawl my site and it keeps crashing.
My sys admin keeps upgrading the virtual machine it sits on, which now has 8 GB of memory, but it still crashes.
It gets to around 200k pages crawled and then dies.
Any tips on how I can crawl my whole site? Also, can you use Screaming Frog to crawl just part of a site?
Thanks in advance for any tips.
Andy
-
Thanks, I tried all the tips on the Screaming Frog site. I have now limited the crawl to 2 pages a second; let's hope that works.
-
Hi Andy. There are quite a few settings you can adjust to reduce the load on your server while the crawl is running. These are described here: http://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/
For example, by unchecking Images, CSS, SWF, and JavaScript you can reduce the load substantially, and if you'd like to crawl just a portion of the site, you can set it to ignore links outside of the start folder.
For even more control over the crawl, you can use regular expressions to exclude specific pages or whole sections that match a given pattern. The page above is fairly comprehensive, so it should help you dial back the crawler to be friendlier to your server. Cheers!
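To illustrate how those exclude rules behave, here is a minimal sketch in Python. The assumption (worth verifying against the configuration guide above) is that each exclude rule is a regular expression matched against the full URL; the site and patterns below are purely hypothetical examples.

```python
import re

# Hypothetical exclude patterns, in the style of Screaming Frog's
# Exclude feature: each is a regex matched against the full URL.
exclude_patterns = [
    r"https://www\.example\.com/blog/.*",   # skip the whole /blog/ section
    r".*\?sessionid=.*",                    # skip URLs carrying a session parameter
    r".*\.(jpg|png|gif)$",                  # skip image files
]

def is_excluded(url: str) -> bool:
    """Return True if the URL matches any exclude pattern."""
    return any(re.fullmatch(p, url) for p in exclude_patterns)

print(is_excluded("https://www.example.com/blog/post-1"))         # True
print(is_excluded("https://www.example.com/products/chair"))      # False
print(is_excluded("https://www.example.com/page?sessionid=abc"))  # True
```

Excluding a large, low-value section this way (faceted navigation, session URLs, media files) is often the quickest fix when a crawl runs out of memory partway through a big site.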
-
Hey there mate,
Sorry to hear that you are having issues. You can actually ask Screaming Frog to use more RAM. If you haven't done that yet, please give it a go.
You can find more here http://www.screamingfrog.co.uk/seo-spider/user-guide/general/
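As a rough sketch of what that involves: the memory allocation is set by a Java heap flag in a launcher config file in the install folder (the filename and default value vary by version and OS, so check the user guide above for your setup; the value below is just an example sized for an 8 GB machine).

```ini
# ScreamingFrogSEOSpider.l4j.ini (example only -- confirm the exact
# file name and location for your version in the user guide)
# Raise the Java heap limit to 6 GB, leaving headroom on an
# 8 GB machine for the OS and other processes.
-Xmx6g
```

Restart Screaming Frog after editing the file for the new limit to take effect.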
Screaming Frog can certainly crawl just part of your site: you can exclude individual pages or whole sections.
Find more here http://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/
Hope this helps!