Setting A Custom User Agent in Screaming Frog
-
Hi all,
Probably a dumb question, but I wanted to make sure I get this right.
How do we set a custom user agent in Screaming Frog? I know it's in the configuration settings, but what do I have to do to create a custom user agent specifically for a website?
Thanks much!
- Malika
-
The user agent you set determines things like whether HTTP/2 is used, so there can be a big difference if you change it to one that can't take advantage of HTTP/2.
Apparently, HTTP/2 support is coming to Pingdom very soon, just as it has to Googlebot:
http://royal.pingdom.com/2015/06/11/http2-new-protocol/
This is an excellent example of how a user agent can change the way your site is crawled, and how efficient the crawl is:
https://www.keycdn.com/blog/https-performance-overhead/
From that article: "It is important to note that we didn't use Pingdom in any of our tests because they use Chrome 39, which doesn't support the new HTTP/2 protocol. HTTP/2 in Chrome isn't supported until Chrome 43. You can tell this by looking at the User-Agent in the request headers of your test results."
Note: WebPageTest uses Chrome 47 which does support HTTP/2.
Hope that clears things up,
Tom
-
Hi Malika,
Think about Screaming Frog and what it has to detect: to do that correctly it needs the correct user agent syntax, or it will not be able to produce a crawl that satisfies anyone.
Using proper syntax for a user agent is essential. I have tried to keep this explanation non-technical; I hope it works.
The reason Screaming Frog needs the user agent is that the User-Agent header was added to HTTP to help web application developers deliver a better user experience. By respecting the syntax and semantics of the header, we make it easier and faster for header parsers to extract useful information that we can then act on.
Browser vendors are motivated to make web sites work no matter what specification violations are made. When the developers building web applications don't care about following the rules, the browser vendors work to accommodate that. It is only by us application developers developing a healthy respect for the standards of the web that browser vendors will be able to start tightening up their codebases, knowing that they don't need to account for non-conformances.
With client libraries that do not enforce the syntax rules, you run the risk of using invalid characters that many server-side frameworks will not detect. It is possible that only certain users, in particular environments, would trip the syntax violation, which leads to hard-to-track-down bugs.
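As a concrete illustration of "enforcing the syntax rules": here is a minimal sketch of a validator for the product part of a User-Agent value, using the token character set from the HTTP spec (RFC 7230). The function and constant names are my own, not from any library:

```python
import re

# Characters RFC 7230 allows in a "token" (product names and versions)
TCHAR = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]"
# product = token ["/" product-version]
PRODUCT = re.compile(rf"{TCHAR}+(?:/{TCHAR}+)?$")

def is_valid_product(token):
    """True if the string is a syntactically valid product token."""
    return PRODUCT.match(token) is not None

print(is_valid_product("MyCrawler/1.0"))   # True
print(is_valid_product("My Crawler/1.0"))  # False: spaces are not token chars
```

A library that runs a check like this before sending the header would catch the invalid-character bugs described above at the client, instead of leaving them for some server framework to choke on.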
I hope this is a good explanation; I've tried to keep it to the point.
Respectfully,
Thomas
-
Hi Thomas,
Would you have a simpler tutorial for me to understand? I am struggling a bit.
Thanks heaps in advance
-
I think I want something that is dumbed down to my level. The above tutorials are great but, not being a full-time coder, I get lost while reading them.
-
Hi Matt,
I haven't had any luck with this one yet.
-
Hi Malika! How'd it go? Did everything work out?
-
Happy I could be of help. Let me know if there's any issue and I will try to help with that too. All the best
-
Hi Thomas,
That's a lot of useful information there. I will have a go on it and let you know how it went.
Thanks heaps!
-
Please let me know if I did not answer the question or if you have any other questions.
-
This article gives you a very clear breakdown of user agents and their syntax rules:
http://www.bizcoder.com/the-much-maligned-user-agent-header
The following is a valid example of a user agent that is full of special characters:
user-agent: foo&bar-product!/1.0a$*+ (a;comment,full=of/delimiters
More references below, but you want to pay attention to the first URL:
https://developer.mozilla.org/en-US/docs/Web/HTTP/Gecko_user_agent_string_reference
Mozilla/5.0 (X11; Linux i686; rv:10.0) Gecko/20100101 Firefox/10.0
http://stackoverflow.com/questions/15069533/http-request-header-useragent-variable
-
If you formatted it correctly, per the grammar below:
User-Agent = product *( RWS ( product / comment ) )
and it was received in your headers, then yes, you could fill in the blanks and test it.
https://mobiforge.com/research-analysis/webviews-and-user-agent-strings
http://mobiforge.com/news-comment/standards-and-browser-compatibility
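As a rough sketch of that grammar in code: a User-Agent value is one or more product tokens separated by whitespace, optionally followed by a parenthesised comment. The product names below are made up for illustration:

```python
def build_user_agent(products, comment=None):
    """Join product tokens with spaces and append an optional (comment)."""
    ua = " ".join(products)
    if comment is not None:
        ua += f" ({comment})"
    return ua

print(build_user_agent(["MySiteAudit/2.0", "Mozilla/5.0"], "compatible"))
# prints: MySiteAudit/2.0 Mozilla/5.0 (compatible)
```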
-
No, you cannot just put anything in there. The site has to recognize it, and may well ask why you are doing this.
I have listed how to build a user agent, lists of already-built ones, and what your browser itself will create, which you can see at useragentstring.com.
It must be formatted correctly and work as a header. It is not as easy as it sometimes seems, but not that hard either.
You can use this to inspect your browser's string and make your own from your Mac or PC:
http://www.useragentstring.com/
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2747.0 Safari/537.36
How to build a user agent:
- https://developer.mozilla.org/en-US/docs/Web/HTTP/Gecko_user_agent_string_reference
- https://developer.mozilla.org/en-US/docs/Setting_HTTP_request_headers
- https://msdn.microsoft.com/en-us/library/ms537503(VS.85).aspx
Lists of user agents:
https://support.google.com/webmasters/answer/1061943?hl=en
https://msdn.microsoft.com/en-us/library/ms537503(v=vs.85).aspx
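If you want to check a custom string end-to-end before putting it into Screaming Frog, you can attach it to a plain HTTP request yourself. A minimal sketch with Python's standard library; the UA string and URL are placeholders, not anything real:

```python
import urllib.request

# Hypothetical custom user agent for a site-specific crawl
CUSTOM_UA = "MySiteAudit/1.0 (+http://www.example.com/bot-info)"

req = urllib.request.Request(
    "http://www.example.com/",
    headers={"User-Agent": CUSTOM_UA},
)
# urllib normalizes header names to "Xxxx-yyyy" capitalization,
# so the header is looked up as "User-agent"
print(req.get_header("User-agent"))  # prints the custom string

# urllib.request.urlopen(req) would then send it; check your server's
# access logs to confirm what actually arrived.
```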
-
Hi Thomas,
Thanks for responding, much appreciated!
Does that mean, if I type in something like this for both fields:
HTTP request user agent: Crawler access V2
Robots user agent: Crawler access V2
it will work too?
-
To crawl using a different user agent, select ‘User Agent’ in the ‘Configuration’ menu, then select a search bot from the drop-down or type in your desired user agent strings.
http://i.imgur.com/qPbmxnk.png
&
Video http://cl.ly/gH7p/Screen Recording 2016-05-25 at 08.27 PM.mov
Also see:
http://www.seerinteractive.com/blog/screaming-frog-guide/
https://www.screamingfrog.co.uk/seo-spider/user-guide/general/#user-agent
https://www.screamingfrog.co.uk/seo-spider/user-guide/
https://www.screamingfrog.co.uk/seo-spider/faq/