Can't crawl website with Screaming Frog... what is wrong?
-
Hello all - I've just been trying to crawl a site with Screaming Frog and can't get beyond the homepage. I've done the usual checks (turning off JS and so on) and found no problems with the navigation. The site's other pages are indexed in Google, btw.
Now I'm wondering whether there's a problem with this robots.txt file, which I think may be auto-generated by Joomla (I'm not familiar with Joomla...). Are there any issues here? [Just checked... and there aren't!]
# If the Joomla site is installed within a folder such as at
# e.g. www.example.com/joomla/ the robots.txt file MUST be
# moved to the site root at e.g. www.example.com/robots.txt
# AND the joomla folder name MUST be prefixed to the disallowed
# path, e.g. the Disallow rule for the /administrator/ folder
# MUST be changed to read Disallow: /joomla/administrator/
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/orig.html
# For syntax checking, see:
# http://tool.motoricerca.info/robots-checker.phtml
User-agent: *
Disallow: /administrator/
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /modules/
Disallow: /plugins/
Disallow: /tmp/
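For anyone who wants to double-check rules like these without a crawler, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The `blocked_paths` helper, the sample paths and the user-agent string are assumptions for illustration only; the rules are copied from the file above.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the robots.txt quoted above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /administrator/
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /modules/
Disallow: /plugins/
Disallow: /tmp/
"""


def blocked_paths(robots_txt, paths, user_agent="Screaming Frog SEO Spider"):
    """Return the paths that the given robots.txt text disallows.

    The default user agent is an assumption; any name works here because
    the file only has a wildcard (User-agent: *) group.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if not parser.can_fetch(user_agent, path)]


if __name__ == "__main__":
    # Hypothetical paths: the homepage, an admin page and a content page.
    sample = ["/", "/administrator/index.php", "/about-us"]
    print(blocked_paths(ROBOTS_TXT, sample))
```

Only the Joomla system folders come back as blocked, so ordinary content pages are not disallowed by this file, which fits the conclusion that robots.txt is not the cause here.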
For anyone wondering: the answer above by Ecommerce Site (odd name btw) works - 21-Nov-2016.
-
This is the best I could find, a reply to someone who had a similar problem with Joomla:
"In the premium version you can slow down the crawl rate under 'speed' in the configuration. In the free lite version, you can crawl the site and then right click on any URLs with a 403 response and press 're-spider'. The server will generally then allow you to crawl these pages (and return a 200 ok response) as you're not requesting too many at once, so you might have to re-spider them individually."
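The idea in that quote (request the 403 pages again, a few at a time) can be tried outside Screaming Frog too. Below is a rough Python sketch, not what the tool does internally: `fetch_status`, `respider`, the `example-crawler` user agent and the two-second delay are all made-up names and values for illustration.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url (for example 200 or 403)."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "example-crawler"}  # hypothetical UA
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises on 4xx/5xx; the status code is on the exception.
        return error.code


def respider(urls, delay_seconds=2.0, fetch=fetch_status, sleep=time.sleep):
    """Re-request each URL one at a time, pausing between requests.

    Returns a dict mapping each URL to the status code it returned.
    """
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait so the server is not hit in a burst
        results[url] = fetch(url)
    return results


if __name__ == "__main__":
    # Replace with the URLs that returned 403 during the crawl.
    print(respider(["http://www.example.com/"], delay_seconds=2.0))
```

If the pages come back as 200 when fetched slowly, that suggests the server is rate-limiting the crawler and is not blocking it outright.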