Having issues crawling a website
-
We tried to use the Screaming Frog tool to crawl this website and get a list of all meta titles from the site; however, it returned only one result: the homepage.
We then tried to obtain a list of the site's URLs by creating a sitemap using https://www.xml-sitemaps.com/. Once again, however, we got just one result: the homepage.
Something seems to be restricting these tools from crawling all the pages. If anyone can shed some light on what this could be, we'd be most appreciative.
-
That robots.txt should be fine; it's not blocking anything.
The reason the crawl is stopping on the homepage is this code:
<meta name="robots" content="nofollow">
Which tells bots to not follow any links on the page. Remove that and you should be good.
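If you want to confirm this on your own pages before and after the change, here is a minimal sketch in Python using only the standard library. The class and function names (`RobotsMetaParser`, `robots_directives`, `blocks_link_following`) are made up for illustration, and it assumes you already have the page's HTML as a string. It also treats `none` as blocking, since that value is shorthand for `noindex, nofollow`.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() == "robots":
            # The content attribute is a comma-separated list of directives.
            for directive in attr_map.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots meta directives present in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def blocks_link_following(html):
    """True if the page asks crawlers not to follow its links."""
    directives = robots_directives(html)
    return "nofollow" in directives or "none" in directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="nofollow"></head></html>'
    print(blocks_link_following(page))
```

Run it against the homepage source: if it reports that link following is blocked, a crawler that respects the tag will stop at that page.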
-
Hi,
I think it is your robots.txt file that is causing the issue. At the moment you have the following:
User-agent: *
Disallow:
I would recommend updating it to the following:
User-agent: *
Allow: /
Moz also has a good post about what else you can include in your robots.txt file as best practice:
https://moz.com/learn/seo/robotstxt
Hope that helps
Thanks
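For anyone comparing the two robots.txt variants quoted in this thread, here is a rough sketch using Python's standard `urllib.robotparser`. The helper name `allows`, the `ExampleBot` user agent and the example.com URL are placeholders, not anything from the site in question. As far as I can tell, an empty `Disallow:` and `Allow: /` both permit crawling of every path, so this change on its own is unlikely to affect how a crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# The two variants discussed in this thread.
CURRENT_ROBOTS = """User-agent: *
Disallow:
"""

SUGGESTED_ROBOTS = """User-agent: *
Allow: /
"""


def allows(robots_txt, url, user_agent="ExampleBot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    test_url = "https://www.example.com/some/inner/page"  # placeholder URL
    print("current:  ", allows(CURRENT_ROBOTS, test_url))
    print("suggested:", allows(SUGGESTED_ROBOTS, test_url))
```

If both lines print the same value for your real URLs, the robots.txt file is probably not what is stopping the crawl.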