Setting A Custom User Agent in Screaming Frog
-
Hi all,
Probably a dumb question, but I wanted to make sure I get this right.
How do we set a custom user agent in Screaming Frog? I know it's in the configuration settings, but what do I have to do to create a custom user agent specifically for a website?
Thanks much!
- Malika
-
Setting a custom user agent can affect things like HTTP/2, so there can be a big difference if you change it to a user agent that does not take advantage of HTTP/2.
Apparently, HTTP/2 support is coming to Pingdom very soon, just as it is to Googlebot:
http://royal.pingdom.com/2015/06/11/http2-new-protocol/
This is an excellent example of how the user agent can change the way your site is crawled, as well as how efficient the crawl is:
https://www.keycdn.com/blog/https-performance-overhead/
"It is important to note that we didn't use Pingdom in any of our tests because they use Chrome 39, which doesn't support the new HTTP/2 protocol. HTTP/2 in Chrome isn't supported until Chrome 43. You can tell this by looking at the User-Agent in the request headers of your test results."
Note: WebPageTest uses Chrome 47 which does support HTTP/2.
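If you want to check for yourself which protocol a server ends up using, here is a rough sketch, assuming Python with the httpx library installed with its HTTP/2 extra (the User-Agent string in it is just a made-up placeholder):

# Rough sketch: report which protocol version the server actually negotiated.
# Assumes httpx is installed with HTTP/2 support (pip install "httpx[http2]").
import httpx

headers = {"User-Agent": "ExampleChecker/1.0 (+https://example.com/info)"}  # hypothetical UA

with httpx.Client(http2=True, headers=headers) as client:
    response = client.get("https://www.google.com/")
    print(response.http_version)  # e.g. "HTTP/2" when the server supports it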
Hope that clears things up,
Tom
-
Hi Malika,
Think about what Screaming Frog has to detect. To do that correctly it needs a user agent with the correct syntax; otherwise it will not be able to produce a crawl that satisfies anyone.
Using proper syntax for a user agent is essential. I have tried to keep this explanation non-technical; I hope it works.
The reason Screaming Frog needs the user agent is that the User-Agent header was added to HTTP to help web application developers deliver a better user experience. By respecting the syntax and semantics of the header, we make it easier and faster for header parsers to extract useful information from the headers that we can then act on.
Browser vendors are motivated to make web sites work no matter what specification violations are made. When the developers building web applications don't care about following the rules, the browser vendors work to accommodate that. It is only by application developers developing a healthy respect for the standards of the web that browser vendors will be able to start tightening up their codebases, knowing they don't need to account for non-conformances.
With client libraries that do not enforce the syntax rules, you run the risk of using invalid characters that many server-side frameworks will not detect. It is possible that only certain users, in particular environments, would hit the syntax violation, which can lead to hard-to-track-down bugs.
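To make that concrete, here is a rough sketch (using Python's requests library purely as an illustration; the crawler name and URL are made up) of how a custom User-Agent header actually gets sent with a request:

import requests

# Hypothetical crawler name, following the product/version (comment) convention.
headers = {"User-Agent": "MyCustomCrawler/1.0 (+https://example.com/bot-info)"}

response = requests.get("https://example.com/", headers=headers)
print(response.status_code)
print(response.request.headers["User-Agent"])  # confirms what was actually sent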
I hope this is a good explanation; I've tried to keep it to the point.
Respectfully,
Thomas
-
Hi Thomas,
Would you have a simpler tutorial for me to understand? I'm struggling a bit.
Thanks heaps in advance
-
I think I want something that is dumbed down to my level for me to understand. The above tutorials are great, but not being a full-time coder, I get lost while reading them.
-
Hi Matt,
I haven't had any luck with this one yet.
-
Hi Malika! How'd it go? Did everything work out?
-
Happy I could be of help. Let me know if there's any issue and I will try to help with it. All the best
-
Hi Thomas,
That's a lot of useful information there. I will have a go on it and let you know how it went.
Thanks heaps!
-
Please let me know if I did not answer the question or if you have any other questions.
-
Please read this first; it gives you a very clear breakdown of user agents and their syntax rules: http://www.bizcoder.com/the-much-maligned-user-agent-header
The following is a valid example of a user agent that is full of special characters (a quick syntax check is sketched below it):
user-agent: foo&bar-product!/1.0a$*+ (a;comment,full=of/delimiters
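A rough sketch of that check in Python (purely illustrative), looking only at the product/version part; the parenthesised comment part of the example above is out of scope here:

import re

# Token characters allowed by the HTTP grammar (tchar in RFC 7230).
TCHAR = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]"
PRODUCT = TCHAR + r"+(?:/" + TCHAR + r"+)?"  # product ["/" product-version]

def looks_like_valid_product(fragment):
    return re.fullmatch(PRODUCT, fragment) is not None

print(looks_like_valid_product("foo&bar-product!/1.0a$*+"))  # True - those delimiters are legal token characters
print(looks_like_valid_product("not a single token"))        # False - spaces are not token characters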
Some more references, but you want to pay attention to the first URL:
https://developer.mozilla.org/en-US/docs/Web/HTTP/Gecko_user_agent_string_reference
Mozilla/5.0 (X11; Linux i686; rv:10.0) Gecko/20100101 Firefox/10.0
http://stackoverflow.com/questions/15069533/http-request-header-useragent-variable
-
If you format it correctly, following the grammar below,
User-Agent = product *( RWS ( product / comment ) )
and it is received in your request headers, then yes, you could fill in the blanks and test it. A quick way to check what the server actually receives is sketched below.
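A rough sketch of such a test, assuming Python's requests library and using the httpbin.org echo endpoint (any echo service would do):

import requests

# An example desktop Firefox string, used purely as a placeholder.
ua = "Mozilla/5.0 (X11; Linux i686; rv:10.0) Gecko/20100101 Firefox/10.0"

resp = requests.get("https://httpbin.org/user-agent", headers={"User-Agent": ua})
print(resp.json())  # httpbin echoes back what it received, e.g. {'user-agent': 'Mozilla/5.0 (X11; ...'}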
https://mobiforge.com/research-analysis/webviews-and-user-agent-strings
http://mobiforge.com/news-comment/standards-and-browser-compatibility
-
No, you cannot just put anything in there. The site has to be able to recognize it, so ask yourself why you are doing this.
Below I have listed how to build a user agent, some already-built ones, and what your browser itself creates, which you can check using useragentstring.com.
It must be formatted correctly and work as a header. It is not as easy as it sometimes seems, but it is not that hard either.
You can use this to see your browser's string and make your own from your Mac or PC:
http://www.useragentstring.com/
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2747.0 Safari/537.36
How to build a user agent (a rough sketch follows the links below):
- https://developer.mozilla.org/en-US/docs/Web/HTTP/Gecko_user_agent_string_reference
- https://developer.mozilla.org/en-US/docs/Setting_HTTP_request_headers
- https://msdn.microsoft.com/en-us/library/ms537503(VS.85).aspx
Lists of user agents:
https://support.google.com/webmasters/answer/1061943?hl=en
https://msdn.microsoft.com/en-us/library/ms537503(v=vs.85).aspx
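And here, as a very rough sketch in Python with made-up names, is the kind of pattern you would follow to assemble your own string:

# Rough sketch with hypothetical values - swap in your own product name,
# version and contact comment.
def build_user_agent(product, version, comment=None):
    ua = f"{product}/{version}"
    if comment:
        ua += f" ({comment})"
    return ua

print(build_user_agent("MyCompanyCrawler", "1.0", "+https://example.com/crawler-info"))
# -> MyCompanyCrawler/1.0 (+https://example.com/crawler-info)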
-
Hi Thomas,
Thanks for responding, much appreciated!
Does that mean, if I type in something like this:
HTTP request user agent: Crawler access V2
Robots user agent: Crawler access V2
it will work too?
-
To crawl using a different user agent, select ‘User Agent’ in the ‘Configuration’ menu, then select a search bot from the drop-down or type in your desired user agent strings.
http://i.imgur.com/qPbmxnk.png
&
Video http://cl.ly/gH7p/Screen Recording 2016-05-25 at 08.27 PM.mov
Also see:
http://www.seerinteractive.com/blog/screaming-frog-guide/
https://www.screamingfrog.co.uk/seo-spider/user-guide/general/#user-agent
https://www.screamingfrog.co.uk/seo-spider/user-guide/
https://www.screamingfrog.co.uk/seo-spider/faq/