Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Crawlers crawl weird long urls
- 
 I ran a crawl for the first time and got many errors, but the weird part is that the crawler follows long, duplicated URLs that don't exist. For example (to be clear): there is a real page at www.website.com/dogs/dog.html, but the crawler then keeps going:
 www.website.com/dogs/dog.html
 www.website.com/dogs/dogs/dog.html
 www.website.com/dogs/dogs/dogs/dog.html
 www.website.com/dogs/dogs/dogs/dogs/dog.html
 www.website.com/dogs/dogs/dogs/dogs/dogs/dog.html
 What can I do about this? Screaming Frog gave me the same issue, so I know it's something with my website.
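A quick way to see how far the problem spreads is to scan a crawl export for paths in which the same segment repeats back to back, as in the /dogs/dogs/ example above. The following is only a rough sketch: the function name and the threshold of two repeats are assumptions for illustration, and a site can legitimately contain repeated segments, so treat the result as a list of candidates to check by hand.

```python
from urllib.parse import urlsplit


def repeated_segment_run(url: str) -> int:
    """Return the longest run of identical consecutive path segments."""
    # urlsplit only recognises the host if a scheme is present, so add one
    # for bare URLs such as "www.website.com/dogs/dog.html".
    if "://" not in url:
        url = "http://" + url
    segments = [s for s in urlsplit(url).path.split("/") if s]
    longest = current = 1 if segments else 0
    for previous, segment in zip(segments, segments[1:]):
        current = current + 1 if segment == previous else 1
        longest = max(longest, current)
    return longest


# URLs taken from the question above.
crawled = [
    "www.website.com/dogs/dog.html",
    "www.website.com/dogs/dogs/dog.html",
    "www.website.com/dogs/dogs/dogs/dog.html",
]

# Flag anything where a segment appears at least twice in a row.
suspicious = [u for u in crawled if repeated_segment_run(u) >= 2]
for u in suspicious:
    print(u)
```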
- 
 Answer from Screaming Frog! The reason the SEO Spider is crawling these URLs is incorrect relative linking on the site, starting from the login URL.
 It happens when the spider crawls the login page, http://www.website.com/login?returnurl=%2F, which leads to http://www.website.com/Home/ctl/SendPassword?returnurl=http:/www.website.com/ and then to this /home/ subdirectory URL, http://www.website.com/Home/ctl/page/dogs.aspx, which links to http://www.website.com/Home/ctl/page/page/dogs.aspx, and so on and so forth. This is the path to the incorrect relative linking (attached for you). To stop this, you can correct the incorrect relative linking or, more easily, simply exclude the login page.
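To illustrate why a relative link can nest like this, here is a small sketch using Python's standard `urllib.parse.urljoin`, which resolves links the same way a browser or crawler does. The two `href` values are assumptions, since the site's actual markup is not shown in the thread; the page URL is the one quoted in the answer above.

```python
from urllib.parse import urljoin

page = "http://www.website.com/Home/ctl/page/dogs.aspx"

# Hypothetical href values, for illustration only.
document_relative = "page/dogs.aspx"        # no leading slash: resolved against the current directory
root_relative = "/Home/ctl/page/dogs.aspx"  # leading slash: resolved against the site root

nested = urljoin(page, document_relative)
fixed = urljoin(page, root_relative)
print(nested)  # one extra /page/ level
print(fixed)   # same URL as the page itself


def follow(start: str, href: str, hops: int) -> list[str]:
    """Follow the same href repeatedly, as a crawler would on each new page."""
    urls = [start]
    for _ in range(hops):
        urls.append(urljoin(urls[-1], href))
    return urls


for url in follow(page, document_relative, 3):
    print(url)
```

If every nested URL still returns a 200 page containing the same link, each hop adds one more directory level, which is why the crawl never ends on its own.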
- 
 Wow, big mistakes are being made on Home, maybe because of the .aspx extension? All pages have SEO-friendly URLs. Thanks Wesley and Paddy Displays.
- 
 I see a link to http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/HeutinkICT.aspx from http://www.odin-groep.nl/Home/ctl/OverOdin/ReindersICT.aspx. It's the bottom-left block that produces this link, and that is how you get a big nesting effect.
- 
 OK, found one problem: on this page, http://www.odin-groep.nl/Home/ctl/OverOdin/ReindersICT.aspx, you have a link to http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/LesscherIT.aspx which I think should be
- 
 OK, I did a quick Screaming Frog crawl and I think I have an idea: you just have to follow the breadcrumbs. You said in your example "In Links 9"; you need to find out what those pages are and follow them back to the point of origin, as I think it's just one bad link that causes this nested link effect. E.g. http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/OverOdin/OverOdin/HeutinkICT.aspx is being linked from http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/OverOdin/StationtoStation.aspx (as well as others). You just have to follow that trail until you find the source of the problem.
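Following the trail by hand gets tedious, so here is a rough sketch of the same idea in code. It assumes you have already turned the crawler's inlinks report into a dictionary mapping each URL to the pages that link to it; the `inlinks` data, the example.com URLs and the function names are all made up for illustration and are not the real export format.

```python
from urllib.parse import urlsplit


def path_segments(url: str) -> list[str]:
    return [s for s in urlsplit(url).path.split("/") if s]


def has_repeat(url: str) -> bool:
    """True if any path segment is immediately followed by itself."""
    segments = path_segments(url)
    return any(a == b for a, b in zip(segments, segments[1:]))


def trace_to_origin(url: str, inlinks: dict[str, list[str]]) -> list[str]:
    """Walk back through referring pages until reaching a non-nested URL."""
    trail = [url]
    seen = {url}
    while has_repeat(trail[-1]):
        candidates = [r for r in inlinks.get(trail[-1], []) if r not in seen]
        if not candidates:
            break  # no further referrer known; stop with what we have
        # Prefer the shallowest referrer, which is closest to the origin.
        nxt = min(candidates, key=lambda r: len(path_segments(r)))
        trail.append(nxt)
        seen.add(nxt)
    return trail


# Invented example data: URL -> pages that link to it.
inlinks = {
    "http://example.com/a/a/a/x.html": ["http://example.com/a/a/y.html"],
    "http://example.com/a/a/y.html": ["http://example.com/a/x.html"],
    "http://example.com/a/x.html": ["http://example.com/"],
}

trail = trace_to_origin("http://example.com/a/a/a/x.html", inlinks)
print(trail[-1])  # the first real page that emits a nested link
```

The last entry in the trail is the page whose source is worth opening first, since it is a normal URL that already links to a nested one.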
- 
 Every link, except the homepage itself.
- 
 I can't see any source. The pages look like this:
 | URL | www.website.com/page/ |
 | Status Code | 200 |
 | Status | OK |
 | Type | text/html; charset=utf-8 |
 | Size | 55811 |
 | Title | |
 | Level | 10 |
 | In Links | 9 |
 | Out Links | 38 |
- 
 Which URL(s) is/are causing problems?
- 
 Please feel free to check: http://tinyurl.com/lox7le9
- 
 You don't necessarily have to remove the link, as long as you can verify that it points to the right page. But I'm curious to see what caused the problem.
- 
 I think Screaming Frog will tell you the page on which it found the weird URL; then you can check that page's source and find out what's producing the link.
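Once you know which page to look at, checking its source for the offending link can be partly automated. This is a minimal sketch using only Python's standard library; the HTML snippet and the class name are invented for illustration, and in practice you would feed in the source of the page the crawler reported.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class RelativeLinkFinder(HTMLParser):
    """Collect <a href> values that are relative to the current directory."""

    def __init__(self):
        super().__init__()
        self.relative_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        parts = urlsplit(href)
        # Skip absolute URLs, mailto:/javascript: links, root-relative
        # paths, fragments and query-only links; none of these can nest.
        if parts.scheme or parts.netloc or href.startswith(("/", "#", "?")):
            return
        self.relative_hrefs.append(href)


# Invented page source, for illustration only.
html = """
<a href="/Home/ctl/page/dogs.aspx">root-relative</a>
<a href="page/dogs.aspx">document-relative</a>
<a href="http://www.website.com/">absolute</a>
<a href="#top">fragment</a>
"""

finder = RelativeLinkFinder()
finder.feed(html)
print(finder.relative_hrefs)
```

Not every document-relative link is a bug, but on a page that sits inside a subdirectory these are the links most likely to produce the nesting described in this thread.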
- 
 That is a good one! It's true that I have that same kind of link pointing to the page itself. I will remove all links of that kind first and crawl again. I'll keep you posted!
- 
 Are you somehow linking to www.website.com/dogs/dog.html from the page itself? There could be something wrong with that link.
 I made a small mistake not so long ago with a redirection plugin. I told it to go to domain.com, but the plugin combined the base URL with whatever I gave it, so it went to domain.com/domain.com. Perhaps you made a similar mistake. Maybe you can send me the URL and I can take a look at it?
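The domain.com/domain.com mistake can be reproduced with standard URL resolution. The sketch below assumes the plugin resolves the target against the base URL the way a browser resolves a link; how the plugin really works internally is not stated in the thread.

```python
from urllib.parse import urljoin

base = "http://domain.com/"

# Without a scheme, "domain.com" is treated as a relative path.
mistaken = urljoin(base, "domain.com")

# With a scheme (or a leading "//"), it is treated as a host.
intended = urljoin(base, "http://domain.com/")
also_fine = urljoin(base, "//domain.com/")

print(mistaken)
print(intended)
print(also_fine)
```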