Tool to Generate All the URLs on a Domain
-
Hi all,
I've been using xml-sitemaps.com for a while to generate a list of all the URLs that exist on a domain. However, this tool only works for websites with under 500 URLs on a domain. The paid tool doesn't offer what we are looking for either. I'm hoping someone can help with a recommendation.
We're looking for a tool that can:
- Crawl, and list, all the indexed URLs on a domain, including .pdf and .doc files (ideally in a .xls or .txt file)
- Crawl multiple domains with unlimited URLs (we have 5 websites with 500+ URLs on them)
Seems pretty simple, but we haven't been able to find something that isn't tailored toward management of a single domain or that can crawl a huge volume of content.
-
@PatrickDelehanty The tool mentioned in the statement not only excels in the two areas mentioned earlier but also offers a wide range of additional capabilities. I recommend that you explore it for yourself! Best of luck!
-
It seems to crawl all the WordPress folders and media files.
Is there not a tool that will list just your live website URLs? I'm creating a sitemap and doing a mass content reorganisation exercise, so I want a list of URLs in Excel. Any tips welcome.
Thanks
Sarah
-
2nd vote for Screaming Frog. I've tried a lot of tools to pull info on all the URLs, and this tool is by far the best one for the job.
-
Hi Felicia
Try Screaming Frog - it crawls the entire site (you can configure how you want it to crawl your site) and can create an XML sitemap for you.
The tool goes above and beyond those two areas as well and can do so much. I suggest you check it out! Good luck!
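If you'd rather script the list yourself, here is a minimal sketch of a same-domain crawler using only the Python standard library. It is not how Screaming Frog or xml-sitemaps.com work internally; the names `LinkParser`, `fetch_html`, `crawl`, and the `skip` folder list are all made up for illustration. It assumes every page is reachable through plain `<a href>` links, it does not run JavaScript, and it does not read robots.txt, so treat it as a starting point only. Linked .pdf and .doc files are listed but not parsed.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Download a page; return "" for non-HTML responses such as PDFs."""
    with urlopen(url, timeout=10) as response:
        if "text/html" not in response.headers.get("Content-Type", ""):
            return ""
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_html,
          skip=("/wp-admin/", "/wp-includes/"), limit=5000):
    """Breadth-first crawl of one domain; returns a sorted list of URLs found.

    `skip` is a hypothetical list of path fragments to leave out, and
    `limit` caps how many URLs are collected.
    """
    domain = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    while queue and len(seen) < limit:
        url = queue.popleft()
        try:
            html = fetch(url)
        except OSError:
            continue  # unreachable page: it stays in the list but is not parsed
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            link, _fragment = urldefrag(urljoin(url, href))
            parts = urlparse(link)
            if parts.scheme not in ("http", "https") or parts.netloc != domain:
                continue  # mailto:, other domains, etc.
            if any(folder in parts.path for folder in skip):
                continue
            if link not in seen and len(seen) < limit:
                seen.add(link)
                queue.append(link)
    return sorted(seen)
```

To get a .txt file, write `"\n".join(crawl("https://www.example.com/"))` to disk; for several sites, call `crawl` once per domain. The `skip` list is one way to keep WordPress system folders out of the results, and you can extend it to suit your own site.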