Crawlers crawl weird long urls
-
I did a crawl start for the first time and i get many errors, but the weird fact is that the crawler tracks duplicate long, not existing urls.
For example (to be clear):
there is a page: www.website.com/dogs/dog.html
but then it is continuing crawling:
www.website.com/dogs/dog.html
www.website.com/dogs/dogs/dog.html
www.website.com/dogs/dogs/dogs/dog.html
www.website.com/dogs/dogs/dogs/dogs/dog.html
www.website.com/dogs/dogs/dogs/dogs/dogs/dog.htmlwhat can I do about this? Screaming Frog gave me the same issue, so I know it's something with my website
-
Answer from Screaming Frog!
The reason the SEO spider is crawling these URLs, is due to incorrect relative linking on the site from the login URL.
It's actually when the spider crawls the login page, http://www.website.com/login?returnurl=%2F which then leads to this URL http://www.website.com/Home/ctl/SendPassword?returnurl=http:/www.website.com/ and then this /home/ sub directory URL http://www.website.com/Home/ctl/page/dogs.aspx which links to http://www.website.com/Home/ctl/page/page/dogs.aspx and so on and so forth. This is the path to the incorrect relative linking (attached for you).To stop this, you can correct the incorrect relative linking, or easier, simply exclude the login page.
-
Wow, Big mistakes are made one Home
maybe because of the .aspx. extension? alle pages have seo-friendly urls
Thanks Wesley and Paddy Displays
-
I see a link to http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/HeutinkICT.aspx from http://www.odin-groep.nl/Home/ctl/OverOdin/ReindersICT.aspx.
It's the bottom left block which causes this link. This way you will get a big nesting effect.
-
OK found one problem
on this page
http://www.odin-groep.nl/Home/ctl/OverOdin/ReindersICT.aspx
you have a link to
http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/LesscherIT.aspx
which i think should be
-
ok I did a quick screaming fog and I think I have an idea, you just have to follow the breadcrumbs
You said in you example "In Links 9", you need to find out what those pages are and follow it back to the point of origin As I think its just one bad link that cause this nested link effect.
eg
http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/OverOdin/OverOdin/HeutinkICT.aspx
is being linked from
http://www.odin-groep.nl/Home/ctl/OverOdin/OverOdin/OverOdin/StationtoStation.aspx (as well as others)
You just have to follow that trail till you find the source of the problem
-
every link, except the hompage itself
-
I can't see any source:
The pages are like:
| URL | www.website.com/page/ |
| Status Code | 200 |
| Status | OK |
| Type | text/html; charset=utf-8 |
| Size | 55811 |
| Title | |
| Level | 10 |
| In Links | 9 |
| Out Links | 38 | -
Which URL(s) is/are causing problems?
-
please be free to check: http://tinyurl.com/lox7le9
-
You don't necessarily have to remove the link. As long as you can verify that it directs to the right page.
But curious to see what caused the problem
-
I think Screaming Frog will tell you the page it found the weird url, then you can check the source, and find out whats producing that link.
-
That is a good one! It's true that I have the same linking to the page itself. I will remove all that kind of links first and crawl again. I'll keep you in touch!
-
Are you somehow linking to www.website.com/dogs/dog.html from the page itself? There could be something wrong with that link.
I made a small mistake not so long ago with a redirection plugin. I told it to go to domain.com. This plugin was looking at the base + what i told it to. So it went to: domain.com/domain.com. Perhaps you made a similar mistake.Maybe you can send me the URL and i can take a look at it?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
My moz only one page was crawled
I recently moved my shopping cart from one provider to another and today moz only crawled one page, could this be because maybe google has not indexed it yet or should i be concerned? I pointed the DNS at the new cart monday night if that helps. I would have expected it to be indexed by now
Moz Pro | | SmartVapes0 -
Why is my crawl STILL in progress?
I'm a bit new here, but we've had a few crawls done already. They are always finished by Wednesday night. Our website is not large (by any means), but the crawl still says it's in progress now 3 days later. What's the deal here?!?
Moz Pro | | Kibin0 -
canonical URL tag
Hello, I was checking my ON page SEO, and one of the things i see Number of Canonical tags 2 Remove all but a single canonical URL tag I didn't fully understand, what is canonical URL tag? my website is http://novitasalonandspa.com Thanks for help
Moz Pro | | vlad_mezoz0 -
How do I force a crawl?
In the campaign overview it reads that 0 pages were crawled. Also got an email saying that a comprehensive audit will be done in 7 days. But the 'crawl in progress' wheel disappeared. I think it stopped, and I need to submit that report to substantiate buying the tool! How do I force a crawl?
Moz Pro | | ilhaam0 -
In Site Explorer My Blog.URL.com Shows "No Data Available for this URL"
Why when I use http://www.opensiteexplorer.org and I'm researching our Blog.URL.com's does the tool say "No Data Available for this URL"? Example: http://www.opensiteexplorer.org/links?site=blog.centurypayments.com
Moz Pro | | cfield_splashmedia.com0 -
Crawl Diagnostics Report Lacks Information
When I look at the crawl diagnostics, SEOMoz tells me there are 404 errors. This is understandable, because some pages were removed. What this report doesn't tell me is how those pages were discovered. This is a very important piece of information, because it would tell me there are links pointing to those pages, either internal or external. I believe the internal links have been removed. If the report told me how if found the link, I would be able to take immediate action. Without that information, I have to go so a lot of investigation. And when you have a million pages, that isn't easy. Some possibilities: The crawler remembered the page from the previous crawl. There was a link from an index page - i.e. it is in the database still There was an individual link from another story - so now there are broken links Ditto, but it in on a static index page The link was from an external source - I need to make a redirect Am I missing something, or is this a feature the SEO Moz crawler doesn't have yet? What can I do (other than check all my pages) to discover this?
Moz Pro | | loopyal0 -
SEOMoz Crawl Warnings, do they really hurt rankings?
SEOMoz reports 250 crawl warnings on my site. In most cases its too long title tags, with 4 of them its missing meta description. SEOMoz says it will hurt my rankings? However, I'm sure a recent whiteboard Friday contradicted this. So what is it?
Moz Pro | | sanchez19600 -
Why does my crawl report show just one page result?
I just ran a crawl report on my site: http://dozoco.com The result report shows results for just one page - the home page, but no other pages. The report doesn't indicate any errors or "do not follows" so I'm unclear on the issue, although I suspect user error - mine.
Moz Pro | | b1lyon0