Xenu Alternative for Large Sites
-
We're launching a new site and we're trying to crawl it to check for any problems. It has millions of pages, and Xenu seems to start encountering errors once the page count passes 500,000. Does anyone know of an alternative, free or paid, that could handle the size better?
-
A post went up in the last 24 hours about Xenu and Screaming Frog. It's worth a read to see a) if Screaming Frog might work for you and b) if the 64-bit version of Xenu mentioned in the comments can solve any of your problems.
http://www.seomoz.org/blog/crawler-faceoff-xenu-vs-screaming-frog
-
Try Web CEO and SEO PowerSuite. If those don't work, there's Google Webmaster Tools, but you will have to wait until it crawls your site. Last but not least, there's the SEOmoz crawler. It has a page limit (10,000 at minimum), but that will be enough to detect major problems.
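
If none of the tools fit, a capped crawl like the one described above can also be scripted. What follows is only a rough sketch in Python, not how any of the tools mentioned actually work: the `crawl` function, the `LinkParser` class, and the caller-supplied `fetch(url)` function returning `(status, html)` are all hypothetical names made up for illustration. It walks one host breadth-first, stops after `max_pages` fetches, and records every URL that did not return 200.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse


class LinkParser(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=10_000):
    """Breadth-first crawl of one host, stopping after max_pages fetches.

    fetch(url) is a caller-supplied (hypothetical) function that
    returns a (status_code, html_text) tuple.
    Returns a dict mapping each non-200 URL to its status code.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    problems = {}
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        status, html = fetch(url)
        fetched += 1
        if status != 200:
            problems[url] = status
            continue
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            link, _ = urldefrag(urljoin(url, href))
            # Stay on the starting host and skip URLs already queued.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return problems
```

A real `fetch` could be built on `urllib.request` or any HTTP client. Note that the `seen` set still grows with every URL discovered, so on a site with millions of pages memory use is something to watch, and a polite crawler would also need rate limiting and robots.txt handling, which this sketch leaves out.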