Unsolved Crawled 404's
-
Hello, I've been receiving some very weird 404s that seem to be crawled by Moz. I contacted Rank Math first, but they told me to turn to you. So how are the URLs crawled? These seem to be 'glued' together: they start with myurl.nl/PAGE/NUMBER/ followed by myurl.nl again. These are two existing pages that are now stuck together and are causing 404s. Please help.
-
@loes2911 said in Crawled 404's:
Hello, I've been receiving some very weird 404s that seem to be crawled by Moz. I contacted Rank Math first, but they told me to turn to you. So how are the URLs crawled? These seem to be 'glued' together: they start with myurl.nl/PAGE/NUMBER/ followed by myurl.nl again. These are two existing pages that are now stuck together and are causing 404s. Please help.
These “glued” 404 URLs likely result from malformed internal links or redirects. Moz's crawler may be picking up on these if a plugin or incorrect URL structure is causing two URLs to combine into one. Start by inspecting any internal links and redirects on your site, especially near the affected areas, as a misconfiguration could be causing this issue. If you’re using any plugins that dynamically generate URLs, check their settings to ensure they aren’t inadvertently creating these concatenated URLs. This should help prevent future 404s.
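One common way such glued URLs arise (an assumption here, not something confirmed for this site) is an href written without a scheme, such as `myurl.nl/contact/`, which a crawler resolves as a path relative to the current page. A minimal sketch using only the Python standard library shows the effect and how to spot such links. The page URL and HTML snippet are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

SITE_HOST = "myurl.nl"  # placeholder domain taken from the question


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def schemeless_links(html, host=SITE_HOST):
    """Return hrefs that begin with the bare host name (no scheme, no leading slash)."""
    parser = HrefCollector()
    parser.feed(html)
    return [h for h in parser.hrefs if h.startswith(host)]


# Hypothetical page and markup, for illustration only.
page_url = "https://myurl.nl/blog/2/"
html = (
    '<a href="myurl.nl/contact/">Contact</a> '
    '<a href="https://myurl.nl/about/">About</a>'
)

bad = schemeless_links(html)
# A scheme-less href is treated as a relative path and appended to the page's path.
resolved = [urljoin(page_url, h) for h in bad]
print(bad)
print(resolved)
```

If the resolved URLs match the pattern in the crawl report, adding `https://` (or a leading `/`) to the offending links is the likely fix.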
-
https://goofyahhpictures.us/goofy-ahh-haircuts/ is one of the pages on my website that shows crawled 404 errors for images. Can anyone help?
-
Hi everyone,
I received 243 "404 Not Found" errors on my patent services website. Are these harmful to my site? If so, how can I resolve this issue?
-
@LE200 I've tried everything except adjusting the robots.txt. Could you please help me with how to adjust the robots.txt?
-
My site is now showing 404 errors. What should I do, and how can I resolve them? Here is my website: https://topfollowerpro.com
-
@LE200 Hi, thank you so much for your response! I've tried everything except updating my robots.txt. How should I adjust it to block those URLs?
-
Our web design company also uses Moz, along with Screaming Frog, to find the broken links that cause 404 errors.
-
Are the 404s also appearing in your company's GSC account? Have your web designers deleted many pages recently?
-
@loes2911
The 404 errors you're seeing likely result from malformed URLs being crawled by Moz, where two URLs are combined incorrectly. Here’s how to address it:
Check Your Links: Review your site's internal links and sitemap for any improper URL structures.
Canonical Tags: Ensure each page has a proper canonical tag to prevent merging.
Redirects: Look into your redirect settings to confirm there are no misconfigurations.
Crawl Settings: Update your robots.txt file to manage how crawlers like Moz access your site.
I was facing the same issue on my site vnapp and solved it by following these steps.
If the issue persists, reach out to Moz for additional support.
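To gauge how widespread the problem is before working through the steps above, the crawled URLs can be filtered for ones whose own host name reappears in the path. This is a rough sketch: the URL list is hypothetical and would in practice come from a crawl export, and the check is a simple heuristic rather than anything Moz provides.

```python
from urllib.parse import urlparse


def is_glued(url: str) -> bool:
    """Return True when the URL's own host name reappears inside its path."""
    parts = urlparse(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # compare without the www. prefix
    return bool(host) and host in parts.path.lower()


# Hypothetical sample; replace with the URLs from your own crawl report.
crawled = [
    "https://myurl.nl/blog/2/myurl.nl/contact/",
    "https://myurl.nl/contact/",
]

glued = [u for u in crawled if is_glued(u)]
print(glued)
```

The pages that link to the flagged URLs are the ones worth inspecting first.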
-
@loes2911
The 404 errors you're encountering may be due to malformed URLs being crawled by Moz. These URLs seem to merge two existing pages, leading to invalid links. Here are a few quick steps:
Check Sitemap and Links: Review your sitemap and internal links for any errors.
Canonical Tags: Ensure canonical URLs are correctly set on each page to prevent confusion.
Redirects: Investigate any misconfigured redirects in your .htaccess or server settings.
Robots.txt: Adjust your robots.txt to block invalid URLs.
If the issue continues, contact Moz for further guidance. For mobile video editing, you can try VN Video Editor here.
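For the robots.txt step, a rule can be tested against sample paths before it is deployed. The sketch below is a simplified matcher for the `*` and `$` special characters described in RFC 9309; the rule `/*myurl.nl` is a hypothetical example, and whether a given crawler honors wildcards is an assumption to verify in that crawler's documentation. Note that blocking only hides the 404s from the crawl; the links that produce them would still need fixing.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path rule ('*' wildcard, optional trailing '$') to a regex."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape literal parts and let each '*' match any run of characters.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + pattern + ("$" if anchored else ""))


def is_blocked(path: str, disallow_rules) -> bool:
    """Return True if any Disallow rule matches the path from its start."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)


# Hypothetical rule: block any path that contains the site's own domain.
DISALLOW = ["/*myurl.nl"]

print(is_blocked("/page/2/myurl.nl/contact/", DISALLOW))
print(is_blocked("/contact/", DISALLOW))
```

In robots.txt itself, this hypothetical rule would be written as `Disallow: /*myurl.nl` under the relevant `User-agent` line.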