Can't crawl website with Screaming Frog... what is wrong?
-
Hello all - I've just been trying to crawl a site with Screaming Frog and can't get beyond the homepage. I've done the usual checks (turning off JS and so on) and the navigation works fine without it. The site's other pages are indexed in Google, btw.
Now I'm wondering whether there's a problem with this robots.txt file, which I think may be auto-generated by Joomla (I'm not familiar with Joomla...) - are there any issues here? [Just checked... and there aren't!]
# If the Joomla site is installed within a folder such as at
# e.g. www.example.com/joomla/ the robots.txt file MUST be
# moved to the site root at e.g. www.example.com/robots.txt
# AND the joomla folder name MUST be prefixed to the disallowed
# path, e.g. the Disallow rule for the /administrator/ folder
# MUST be changed to read Disallow: /joomla/administrator/
#
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/orig.html
#
# For syntax checking, see:
# http://tool.motoricerca.info/robots-checker.phtml
User-agent: *
Disallow: /administrator/
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /modules/
Disallow: /plugins/
Disallow: /tmp/
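For anyone who wants to double-check rules like these without relying on a crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. The host `www.example.com`, the sample paths, and the helper name `is_allowed` are assumptions for illustration only; the user-agent string is just a label here, since the file has a single `User-agent: *` group.

```python
from urllib.robotparser import RobotFileParser

# The rules pasted above, copied verbatim (comments omitted).
ROBOTS_TXT = """\
User-agent: *
Disallow: /administrator/
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /modules/
Disallow: /plugins/
Disallow: /tmp/
"""


def is_allowed(path, user_agent="Screaming Frog SEO Spider"):
    """Return True if these rules let user_agent fetch path.

    Hypothetical helper: www.example.com stands in for the real site.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, "http://www.example.com" + path)


if __name__ == "__main__":
    # Sample paths are made up; swap in real URLs from the site's navigation.
    for path in ["/", "/about-us", "/administrator/index.php"]:
        print(path, is_allowed(path))
```

With these rules, the homepage and ordinary content paths come back as allowed and only the listed system folders are blocked, which matches the conclusion that the robots.txt is not what stops the crawl.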
For anyone wondering: the answer above by Ecommerce Site (odd name btw) works - 21-Nov-2016.
-
This is the best I could find, from someone who had a similar problem with Joomla:
"In the premium version you can slow down the crawl rate under 'speed' in the configuration. In the free lite version, you can crawl the site and then right click on any URLs with a 403 response and press 're-spider'. The server will generally then allow you to crawl these pages (and return a 200 ok response) as you're not requesting too many at once, so you might have to re-spider them individually."
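The idea in that quote (request fewer pages at once, then retry the ones that came back 403) can be sketched outside Screaming Frog too. This is only a rough illustration in Python, not how Screaming Frog itself works: the function names `fetch_status` and `respider`, the `example-crawler` user-agent, and the 2-second delay are all assumptions.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url (hypothetical helper)."""
    request = urllib.request.Request(url, headers={"User-Agent": "example-crawler"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 403, 404 and similar arrive as exceptions; report the code instead.
        return error.code


def respider(urls, fetch=fetch_status, delay_seconds=2.0, sleep=time.sleep):
    """Re-request each URL one at a time, pausing between requests.

    Returns a dict mapping each URL to the status code it returned.
    """
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # throttle so the server sees a slow trickle
        results[url] = fetch(url)
    return results
```

Calling `respider` on the list of URLs that returned 403 in the first crawl would show whether they answer with 200 once the requests are spaced out. If they still return 403 at a slow rate, the block is probably based on something other than request speed.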