Unsolved Why does Moz Crawler start with the HTTP/1.1 version?
-
We have run Moz Pro's Site Crawl for example-domain.com.
Why has Moz's crawler decided to list http://www.example-domain.com (i.e. the non-secure version) as the zero crawl depth page and report the actually live https://www.example-domain.com (i.e. the secure HTTP/2 version) at a crawl depth of 2?
Surely the main live page should be the first crawled and reported as crawl depth = 0?
-
Hope you're doing well. I have a question regarding the behavior of the Moz crawler and a concern related to crawling issues on my website.
I've observed that the Moz crawler initiates its sessions with the HTTP/1.1 version. Could you please provide insights into why this is the case? I'm curious about whether this is a standard protocol for the Moz crawler and if there are any implications for website owners.
Additionally, I'm experiencing crawling issues with my website, CFMS BILL STATUS. Despite my efforts to optimize crawlability, I continue to encounter challenges. The website seems to have difficulty being effectively crawled by Moz.
Could you offer guidance or suggestions on how to address these crawling issues and ensure that CFMS BILL STATUS is properly indexed by the Moz crawler?
-
I'll provide a comprehensive response to the prompt, aiming for clarity and incorporating insights from Moz's best practices:
Understanding Crawl Depth and the Issue:
Crawl depth refers to the number of clicks (or links) it takes a crawler to reach a specific page from the starting point (usually the homepage).
In this case, Moz's crawler is reporting a crawl depth of 0 for the non-secure HTTP version of the homepage (http://www.example-domain.com), while the secure HTTPS version (https://www.example-domain.com) has a crawl depth of 2. This discrepancy suggests a potential issue with how the site is configured or how the crawler is interpreting it.
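To make the definition of crawl depth concrete, here is a minimal sketch of how a breadth-first crawl assigns depths. It assumes that each redirect counts as one hop, which is an assumption about how hops are counted and not confirmed Moz behavior. The `crawl_depths` function and the intermediate non-www redirect are hypothetical, chosen only to show how a live https page can end up at depth 2.

```python
from collections import deque


def crawl_depths(seed, edges):
    """Breadth-first search from the seed URL.

    `edges` maps each URL to the URLs reachable from it in one hop.
    A hop is either a followed link or a redirect (an assumption,
    not confirmed Moz behavior).
    """
    depths = {seed: 0}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        for target in edges.get(url, []):
            if target not in depths:
                # First time we see this URL: it is one hop deeper.
                depths[target] = depths[url] + 1
                queue.append(target)
    return depths


# Hypothetical redirect chain:
# http://www -> https:// (no www) -> https://www
edges = {
    "http://www.example-domain.com": ["https://example-domain.com"],
    "https://example-domain.com": ["https://www.example-domain.com"],
}

depths = crawl_depths("http://www.example-domain.com", edges)
print(depths["https://www.example-domain.com"])  # 2
```

If the crawl is seeded with the http URL, that URL is at depth 0 by definition, and every redirect hop pushes the live https page one level deeper.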
Potential Reasons for the Discrepancy:
Redirect Configuration: If http://www.example-domain.com redirects to https://www.example-domain.com, the crawler might initially treat the non-secure version as the starting point (crawl depth = 0) and the secure version as a secondary page (crawl depth = 2).
Canonical Tags: If the canonical tag on https://www.example-domain.com points to http://www.example-domain.com, Moz might prioritize the non-secure version.
Sitemap and Internal Linking: Ensure your sitemap lists the https version of URLs and that internal links use https URLs consistently.
Crawler Settings: Some tools allow specifying which version (http or https) to prioritize. Check for such settings in Moz Pro.
Historical Data: If the site recently migrated from http to https, historical data might influence crawl behavior.
Resolving the Issue:
Review Redirects: Ensure redirects are set up correctly to prioritize https.
Check Canonical Tags: Verify that canonical tags point to the https version.
Update Sitemap and Internal Links: Use https URLs consistently.
Adjust Crawler Settings: If possible, prioritize https in Moz Pro's settings.
Contact Moz Support: If the issue persists, seek guidance from Moz support.
-
@AKCAC When using Moz Pro's Site Crawl for your website and encountering a situation where the non-secure (http) version of your domain is reported as having a crawl depth of zero, while the secure (https) version shows a greater crawl depth, there are several potential reasons and implications to consider:
-
Redirect Configuration: The most common reason for this is how redirects are set up on your site. If http://www.example-domain.com is the primary address that Moz encounters due to your server's configuration, and it redirects to https://www.example-domain.com, Moz might initially treat the non-secure version as the starting point (crawl depth = 0) and the secure version as a secondary page (thus a greater crawl depth).
-
Canonical Tags: Check your canonical tags. If the canonical tag on your https pages points to the http version, Moz (and other search engines) might treat the http version as the primary page.
-
Sitemap and Internal Linking: Ensure that your sitemap lists the https version of your URLs and that internal linking on your site uses https URLs. If your internal links or sitemap reference the http version, crawlers may initially prioritize these.
-
Crawler Settings: In some tools, including Moz, you can specify which version of the site (http or https) to prioritize in a crawl. Check if such a setting is influencing the crawl behavior.
-
Historical Data: If your site recently migrated from http to https, and Moz has historical data from previous crawls, it might temporarily reflect the older structure until it fully updates its index with the new configuration.
-
DNS and Server Configuration: Verify your DNS and server settings to ensure that they correctly redirect all http traffic to https and that the https version is set as the primary endpoint.
-
Robots.txt File: Make sure your robots.txt file doesn't unintentionally block or deprioritize https URLs.
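The canonical tag check from the list above can be automated. The following is a rough sketch that uses only the Python standard library. The `CanonicalFinder` class and `canonical_scheme` function are hypothetical helpers, and the sample page is made up for illustration. In practice you would feed in the HTML of your own https pages.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens.
        rel = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel:
            self.canonical = attr_map.get("href")


def canonical_scheme(html):
    """Return the URL scheme of the canonical link, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return None
    return urlsplit(finder.canonical).scheme


# Made-up page whose canonical tag still points at the http version.
page = (
    '<html><head><link rel="canonical" '
    'href="http://www.example-domain.com/"></head></html>'
)
print(canonical_scheme(page))  # http
```

A result of `http` on a page served over https would be worth fixing before re-running the crawl.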
Steps to Resolve the Issue:
- Ensure Consistent Redirects: All http URLs should 301 redirect to their https counterparts.
- Update Canonical Tags: Canonical tags on all pages should point to the https versions.
- Verify Sitemap and Internal Links: Both should consistently use and reference https URLs.
- Re-crawl the Site: After making changes, re-run the Moz Site Crawl to
-
-
Moz Crawler, like many web crawlers, typically starts with HTTP/1.1 because it is a widely accepted and supported protocol for communication between web clients and servers. HTTP/1.1 improved on its predecessor, HTTP/1.0, with features such as persistent connections, chunked transfer encoding, and the ability to pipeline multiple requests, all of which make data transmission more efficient. Starting with HTTP/1.1 lets the crawler use these features when talking to web servers, which streamlines the crawling process.
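As an illustration of one of the HTTP/1.1 features mentioned above, here is a small sketch of how a client might decode a chunked transfer-encoded body. The `decode_chunked` function is a hypothetical helper, it ignores trailers, and it says nothing about how Moz's crawler is actually implemented.

```python
def decode_chunked(body: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked body (trailers are not handled)."""
    decoded = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # The chunk size is hexadecimal; drop any extension after ';'.
        size = int(body[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:
            break  # a zero-length chunk marks the end of the body
        decoded += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(decoded)


wire = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
print(decode_chunked(wire))  # b'Wikipedia'
```

Each chunk is prefixed with its own length, so the server can start sending a response before it knows the total size.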
-
The crawl depth reported by tools like Moz Pro is determined by the level of clicks it takes to reach a particular page from the homepage or root domain. It's not solely based on whether the page is HTTP or HTTPS.
In your scenario, if Moz Pro is reporting that the HTTP version (http://www.example-domain.com) has a crawl depth of 0, it means that this page is directly accessible from the root domain. On the other hand, if the HTTPS version (https://www.example-domain.com) is reported as having a crawl depth of 2, it implies that it takes two clicks (or two levels deep) from the homepage to reach this particular HTTPS page.
There could be various reasons for such a situation, such as the site structure, internal linking, or redirects. It's not uncommon for websites to have different versions (HTTP and HTTPS) of their pages, and the crawler may follow links or redirects differently, leading to variations in crawl depth.
To further investigate, you may want to examine your site's internal linking structure, make sure that there are no unexpected redirects or canonicalization issues, and ensure that your preferred version (HTTPS in this case) is correctly configured and prioritized in your website settings and sitemap. Additionally, Moz Pro may provide more detailed insights into the specific reasons for the reported crawl depth if you review the crawl report or log files.
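One way to check the sitemap part of that advice is to scan it for entries that are not https. This is only a sketch: `non_https_urls` is a hypothetical helper, the sample sitemap is made up, and it assumes a standard sitemaps.org `urlset` document.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def non_https_urls(sitemap_xml):
    """Return every <loc> URL in the sitemap that is not https."""
    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text
    ]
    return [url for url in urls if not url.startswith("https://")]


# Made-up sitemap with one leftover http entry.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example-domain.com/</loc></url>
  <url><loc>http://www.example-domain.com/about</loc></url>
</urlset>"""

print(non_https_urls(sample))  # ['http://www.example-domain.com/about']
```

Any URL this returns is a place where a crawler could still be pointed at the non-secure version of the site.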