Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we're not completely removing the content - many posts will still be viewable - we have locked both new posts and new replies. More details here.
Unsolved Crawling only the Home of my website
-
Hello,
I don't understand why Moz crawls only the homepage of our website https://www.modelos-de-curriculum.com. We added the website correctly and requested a crawl of all the pages, but the tool finds only the homepage. Why?
We are testing the tool before subscribing, but we need to be sure that it works for our website. Please help us if you can.
-
@Azurius
I understand your concern about Moz crawling only the homepage of your website despite adding it correctly and requesting a full crawl. It's frustrating to test a tool before subscribing and not get the expected results. To address this, I recommend double-checking your website's settings within the Moz platform: make sure the URL you entered matches your site's actual structure and contains no typos. Also review Moz's documentation, or contact their customer support, to confirm whether specific settings or configurations are needed for a comprehensive crawl. Sometimes minor adjustments to the tool's settings or the website's structure make a significant difference to the crawling process. I hope this helps, and I wish you success in resolving this and making an informed decision about your subscription. -
Crawling only the home page of your website is a common practice in web indexing and search engine optimization. This approach allows search engine bots to focus on your site's main landing page, ensuring that it's properly indexed and ranked. Here's how you can specify that only the home page should be crawled:
Robots.txt file
Canonical tag
Sitemap.xml
Noindex tags
Meta robots tag
By implementing these methods, you can direct search engine crawlers to focus primarily on your website's home page, ensuring that it receives the most attention in terms of indexing and ranking. This can be particularly useful if your home page is the most important and relevant page for your website's SEO strategy.
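As a sketch, a robots.txt along these lines would restrict crawling to the homepage alone (this assumes the crawler supports the `Allow` directive and the `$` end-of-URL anchor, as Googlebot does; plain `Disallow` rules alone cannot express "homepage only"):

```
User-agent: *
Disallow: /
Allow: /$
```

Note that this is the opposite of what the original poster wants; it only makes sense when the homepage really is the only page that should be crawled.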
-
@Azurius I have the same issue. I think the answer here is quite helpful.
-
Hello,
There could be several reasons why MOZ is only crawling the homepage of your website, https://www.modelos-de-curriculum.com. Here are a few possibilities:
Robots.txt file: Check your website's robots.txt file to ensure that it's not blocking MOZ's web crawlers from accessing other pages. Make sure there are no disallow rules that could restrict access to certain areas of your site.
Nofollow tags: Ensure that your website doesn't have "nofollow" tags on internal links that may be preventing MOZ from following and crawling those links.
JavaScript: If your website heavily relies on JavaScript for content rendering, MOZ may face difficulty crawling the content. Ensure that important content is accessible without JavaScript.
Canonical tags: Check for canonical tags in your website's HTML. If you have specified the homepage as the canonical page for all other pages, this could limit MOZ's ability to crawl additional pages.
Site structure: MOZ may have trouble crawling pages with complex or unconventional site structures. Ensure that your website follows standard navigation and linking practices.
Crawl settings: Double-check the settings in your MOZ account to confirm that you have requested a full site crawl, not just the homepage.
If you've verified all these aspects and MOZ is still not crawling your website correctly, you may want to reach out to their support team for further assistance. They can provide more specific guidance based on your account and settings.
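The robots.txt check above can be tried out with Python's standard library. This is a sketch with a hypothetical rule set; substitute the real robots.txt contents from your site:

```python
from urllib import robotparser

# Hypothetical robots.txt contents -- replace with the real file from your site
rules = """User-agent: *
Disallow: /private/
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

# Moz's crawler identifies itself as "rogerbot"
print(rp.can_fetch("rogerbot", "https://www.example.com/private/page"))  # False
print(rp.can_fetch("rogerbot", "https://www.example.com/about"))         # True
```

If `can_fetch` returns False for pages beyond the homepage, the robots.txt is the likely culprit. (Note the standard-library parser does not understand `*` wildcards inside paths, so it is only reliable for plain prefix rules like the one above.)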
Testing the tool before subscribing is a wise approach to ensure it meets your needs. I recommend contacting Moz's support for personalized assistance in resolving the crawling issue with your website.
-
Hello,
The reason Moz is only crawling the homepage of your website could be due to various factors. Here are a few possibilities:
Robots.txt File: Check your website's robots.txt file to ensure that it doesn't block search engine crawlers from accessing specific pages.
Meta Robots Tags: Make sure there are no "noindex" meta tags on your internal pages that might prevent them from being indexed.
Crawl Restrictions: Moz may not have had enough time to crawl all the pages on your website. It might take some time for the tool to explore and index your entire site.
Sitemap: Ensure that your website's sitemap is correctly submitted to Moz. A sitemap helps search engines find and index all the pages on your site.
Internal Linking: Check if there are internal links from your homepage to other pages. A lack of internal links can make it harder for search engines to discover and crawl your site.
Access Permissions: Make sure there are no access restrictions or password protection on certain pages.
To resolve this issue and ensure Moz crawls all the pages of your website, consider checking and addressing these factors. If the problem persists, it might be helpful to reach out to Moz's customer support for specific assistance with your subscription trial.
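As a quick way to run the "noindex" check above across a page's HTML, here is a sketch using only Python's standard library (the sample HTML is made up for illustration):

```python
from html.parser import HTMLParser

class MetaRobotsCheck(HTMLParser):
    """Collects the content of any <meta name="robots"> tag in a page."""
    def __init__(self):
        super().__init__()
        self.robots = []

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            a = dict(attrs)
            if a.get("name", "").lower() == "robots":
                self.robots.append(a.get("content", ""))

# Hypothetical page source -- fetch the real HTML of each page you check
html = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
parser = MetaRobotsCheck()
parser.feed(html)

blocked = any("noindex" in c.lower() for c in parser.robots)
print(blocked)  # True -- this page would be excluded from indexing
```

Running this over the pages Moz skips would quickly confirm or rule out the meta-robots explanation.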
-
Check your .htaccess file for rules that might be blocking crawlers.
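For reference, a crawler-blocking rule in .htaccess often looks something like this (a hypothetical mod_rewrite example; if you find anything similar matching rogerbot or a broad set of user agents, it would explain the failed crawl):

```apache
# Returns 403 Forbidden to any client whose User-Agent contains "rogerbot"
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} rogerbot [NC]
RewriteRule .* - [F,L]
```

Removing or narrowing such a rule would let the crawler through again.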
Related Questions
-
Advice on the right way to block country-specific users but not block Googlebot - and not be seen to be cloaking. Help please!
Hi, I am working on the SEO of an online gaming platform - a platform that can only be accessed by people in certain countries, where the games and content are legally allowed.
International SEO | MarkCanning
Example: The games are not allowed in the USA, but they are allowed in Canada.
Present situation: Presently, when a user from the USA visits the site, they get directed to a restricted-location page with the following message: RESTRICTED LOCATION
Due to licensing restrictions, we can't currently offer our services in your location. We're working hard to expand our reach, so stay tuned for updates!
Because USA visitors are blocked, Google, which primarily (but not always) crawls from the USA, is also blocked, so the company webpages are not being crawled and indexed.
Objective / what we want to achieve: The website will have multiple region and language locations. Some of these will exist as standalone websites and others will exist as folders on the domain. Examples below:
domain.com/en-ca [English Canada]
domain.com/fr-ca [French Canada]
domain.com/es-mx [Spanish Mexico]
domain.com/pt-br [Portuguese Brazil]
domain.co.in/hi [Hindi India]
If a user from the USA or another restricted location tries to access our site, they should not have access but should get a restricted-access message. However, we still want Google to be able to access, crawl and index our pages. Can you suggest how we do this without getting done for cloaking etc.? Would this approach be OK? (please see below) We would continue as at present, showing visitors from the USA a restricted message. However, rather than redirecting these visitors to a restricted-location page, we would just black out the page and show them a floating message as if it were a modal window, while Googlebot would be allowed to visit and crawl the website. I have also read that it would be good to put paywall schema on each webpage to let Google know that we are not cloaking and that it's a restricted paid page. All public pages are accessible, but only if the visitor is from a location that is not restricted. Any feedback and direction that can be given would be greatly appreciated, as I am new to this angle of SEO. Sincere thanks! -
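The paywall schema mentioned in the question above is presumably Google's paywalled-content structured data. A minimal sketch (the page type and CSS selector are assumptions for illustration, not values from the question):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebPage",
  "isAccessibleForFree": "False",
  "hasPart": {
    "@type": "WebPageElement",
    "isAccessibleForFree": "False",
    "cssSelector": ".restricted-content"
  }
}
</script>
```

The `cssSelector` should match the element that holds the restricted content, so Google can distinguish an access wall from cloaking.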
GoogleBot still crawling HTTP/1.1 years after website moved to HTTP/2
The whole website moved to the https://www. HTTP/2 version 3 years ago. When we review log files, it is clear that, for the home page, GoogleBot continues to access only via the HTTP/1.1 protocol.
The robots file is correct (simply allowing all and referring to the https://www. sitemap).
The sitemap references https://www. pages, including the homepage.
The hosting provider has confirmed the server is correctly configured to support HTTP/2 and has provided evidence of access via HTTP/2 working.
301 redirects are set up for the non-secure and non-www versions of the website, all to the https://www. version.
We are not using a CDN or proxy.
GSC reports the home page as correctly indexed (with the https://www. version canonicalised) but does still have the non-secure version of the website as the referring page in the Discovery section. GSC also reports the homepage as being crawled every day or so.
We totally understand it can take time to update the index, but we are at a complete loss to understand why GoogleBot continues to go only through HTTP/1.1, not HTTP/2. A possibly related issue - and of course what is causing concern - is that new pages of the site seem to index and perform well in the SERPs ... except the home page. This never makes it to page 1 (other than for the brand name) despite rating multiples higher in terms of content, speed etc. than other pages, which still get indexed in preference to the home page. Any thoughts, further tests, ideas, direction or anything will be much appreciated!
Technical SEO | AKCAC -
Dynamic Canonical Tag for Search Results Filtering Page
Hi everyone, I run a website in the travel industry where most users land on a location page (e.g. domain.com/product/location) before performing a search by selecting dates and times. This then takes them to a pre-filtered dynamic search results page, with options for their selected location, on a separate URL (e.g. /book/results). The /book/results page can only be accessed on our website by performing a search, and URLs with search parameters from this page have never been indexed in the past.
We work with some large partners who use our booking engine and who have recently started linking to these pre-filtered search results pages. This is not being done on a large scale, and at present we only have a couple of hundred of these search results pages indexed. I could easily add a noindex or self-referencing canonical tag to the /book/results page to remove them; however, it's been suggested that adding a dynamic canonical tag to our pre-filtered results pages pointing to the location page (based on the location information in the query string) could be beneficial for the SEO of our location pages. This makes sense, as the partner websites that link to our /book/results page are very high authority, and any way that this could be passed to our location pages (which are our most important in terms of rankings) sounds good. However, I have a couple of concerns:
• Is using a dynamic canonical tag in this way considered spammy / manipulative?
• Whilst all the content that appears on the pre-filtered /book/results page is present on the static location page where the search initiates, and which the canonical tag would point to, it is presented differently, and there is a lot more content on the static location page that isn't present on the /book/results page. Is this likely to see the canonical tag being ignored / link equity not being passed as hoped, and are there greater risks that I should be worried about?
I can't find many examples of other sites where this has been implemented, but the closest would probably be booking.com:
https://www.booking.com/searchresults.it.html?label=gen173nr-1FCAEoggI46AdIM1gEaFCIAQGYARS4ARfIAQzYAQHoAQH4AQuIAgGoAgO4ArajrpcGwAIB0gIkYmUxYjNlZWMtYWQzMi00NWJmLTk5NTItNzY1MzljZTVhOTk02AIG4AIB&sid=d4030ebf4f04bb7ddcb2b04d1bade521&dest_id=-2601889&dest_type=city&
The canonical points to https://www.booking.com/city/gb/london.it.html
In our scenario, however, there is a greater difference between the content on the two pages (and booking.com has a load of search results pages indexed, which is not what we're looking for). Would be great to get any feedback on this before I rule it out. Thanks!
Technical SEO | GAnalytics -
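The dynamic canonical described in the question above boils down to mapping the results URL's query string back to a location page. A minimal sketch (the `location` parameter name and the URL patterns are assumptions, since the real query-string format isn't shown):

```python
from urllib.parse import urlparse, parse_qs

def canonical_for_results(url: str) -> str:
    """Derive the canonical location-page URL for a pre-filtered
    /book/results URL, falling back to the URL itself (i.e. a
    self-referencing canonical) when no location parameter is present."""
    query = parse_qs(urlparse(url).query)
    location = query.get("location", [None])[0]
    if location:
        return f"https://example.com/product/{location}"
    return url

print(canonical_for_results(
    "https://example.com/book/results?location=rome&date=2024-05-01"))
# https://example.com/product/rome
```

The returned URL would then be emitted in the page's `<link rel="canonical" href="...">` tag.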
Website can't be crawled
Hi there, One of our websites can't be crawled. We did get the error emails from you (Moz), but we can't find the solution. Can you please help me? Thanks, Tamara
Product Support | Yenlo
Site Crawl Status code 430
Hello, In the site crawl report we have a few pages that are status 430 - but that's not a valid HTTP status code. What does this mean / refer to?
Product Support | ianatkins
https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#4xx_Client_errors
If I visit the URL from the report I get a 404 response code; is this a bug in the site crawl report? Thanks, Ian.
Crawl error robots.txt
Hello, when trying to access the site crawl to analyze our page, the following error appears: **Moz was unable to crawl your site on Nov 15, 2017. **Our crawler was banned by a page on your site, either through your robots.txt, the X-Robots-Tag HTTP header, or the meta robots tag. Update these tags to allow your page and the rest of your site to be crawled. If this error is found on any page on your site, it prevents our crawler (and some search engines) from crawling the rest of your site. Typically errors like this should be investigated and fixed by the site webmaster. Can you help us? Thanks!
Product Support | Mandiram
How to block Rogerbot From Crawling UTM URLs
I am trying to block rogerbot from crawling some UTM URLs we have created, but am having no luck. My robots.txt file looks like: User-agent: rogerbot Disallow: /?utm_source* This does not seem to be working. Any ideas?
Product Support | Firestarter-SEO
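The pattern in the question above only matches query strings on the root URL. Assuming rogerbot honours `*` wildcards in robots.txt paths (as most modern crawlers do), a rule like this would cover UTM-tagged URLs on any path:

```
User-agent: rogerbot
Disallow: /*?utm_source=
```

The leading `/*` matches any path before the query string, and the trailing `*` in the original rule is unnecessary since robots.txt rules are prefix matches.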
I have removed a subdomain from my main domain. We have stopped the subdomain completely. However, the crawl still shows errors for that subdomain. How do I remove them from the crawl reports?
Earlier I had a forum as a subdomain, and it was mentioned on my main domain. However, I have now discontinued the forum and removed all links and mentions of it from my main domain. But the crawler still shows errors for the subdomain. How can I clean up or delete these irrelevant crawl issues? I don't have the forum now, and there are no links on the main site, but it still shows crawl errors for the forum, which doesn't exist.
Product Support | potterharry