Tool to Generate All the URLs on a Domain
-
Hi all,
I've been using xml-sitemaps.com for a while to generate a list of all the URLs that exist on a domain. However, this tool only works for websites with under 500 URLs on a domain. The paid tool doesn't offer what we are looking for either. I'm hoping someone can help with a recommendation.
We're looking for a tool that can:
- Crawl, and list, all the indexed URLs on a domain, including .pdf and .doc files (ideally in a .xls or .txt file)
- Crawl multiple domains with unlimited URLs (we have 5 websites with 500+ URLs on them)
Seems pretty simple, but we haven't been able to find something that isn't tailored toward management of a single domain or that can crawl a huge volume of content.
-
@PatrickDelehanty The tool mentioned in the statement not only excels in the two areas mentioned earlier but also offers a wide range of additional capabilities. I recommend that you explore it for yourself! Best of luck!
-
It seems to crawl all the WordPress folders and media files.
Is there not a tool that will tell you just your live website URLs? I'm creating a sitemap and planning a mass content re-organisation exercise, so I want a list of URLs in Excel. Any tips welcome.
Thanks
Sarah
-
2nd vote for Screaming Frog. Tried a lot of tools to pull info on all the URLs, and this tool is by far the best one for the job.
-
Hi Felicia
Try Screaming Frog - it crawls the entire site (you can configure how you want it to crawl your site) and has ways of creating an XML sitemap for you.
The tool goes well beyond those two areas and can do much more. I suggest you check it out! Good luck!
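For anyone who would rather script this than use a desktop tool, here is a minimal sketch of the same idea in Python, using only the standard library. It is not how Screaming Frog works internally; it is just a breadth-first crawl that follows `<a href>` links, stays on the starting host, and lists every URL it finds, including linked .pdf and .doc files. The names `LinkParser`, `fetch_html`, `list_urls` and `write_url_list` are made up for this example, and the sketch assumes pages are plain server-rendered HTML. It finds linked URLs rather than indexed ones, and it does not read robots.txt.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Download a page; return its HTML, or "" for non-HTML files and errors."""
    try:
        with urlopen(url, timeout=10) as response:
            if "text/html" not in response.headers.get("Content-Type", ""):
                return ""  # PDFs, DOCs, images: listed, but not parsed for links
            return response.read().decode("utf-8", errors="replace")
    except OSError:
        return ""


def list_urls(start_url, fetch=fetch_html, max_urls=10000):
    """Breadth-first crawl that never leaves the start URL's host."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        page = queue.popleft()
        parser = LinkParser()
        parser.feed(fetch(page))
        for href in parser.links:
            # Resolve relative links and drop "#fragment" so pages are not listed twice.
            url, _fragment = urldefrag(urljoin(page, href))
            on_host = urlparse(url).netloc == host
            if on_host and url not in seen and len(seen) < max_urls:
                seen.add(url)
                queue.append(url)
    return sorted(seen)


def write_url_list(urls, path):
    """Write one URL per line to a .txt file that Excel can open."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(urls) + "\n")
```

To use it, call `write_url_list(list_urls("https://www.example.com/"), "urls.txt")` once per domain, swapping in your own start URLs (example.com is a placeholder). The `max_urls` cap is an arbitrary safety limit, not a restriction of the approach, and the `fetch` parameter is there so the crawl logic can be tried against canned pages before pointing it at a live site.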