Tool to Generate All the URLs on a Domain
-
Hi all,
I've been using xml-sitemaps.com for a while to generate a list of all the URLs that exist on a domain. However, the free tool only works for sites with fewer than 500 URLs, and the paid tool doesn't offer what we're looking for either. I'm hoping someone can help with a recommendation.
We're looking for a tool that can:
- Crawl and list all the indexed URLs on a domain, including .pdf and .doc files (ideally exported to a .xls or .txt file)
- Crawl multiple domains with unlimited URLs (we have 5 websites with 500+ URLs on them)
Seems pretty simple, but everything we've found is either tailored toward managing a single domain or can't crawl a large volume of content.
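For anyone comfortable with a little scripting, the two requirements above can be approximated with a short stdlib-only crawler. This is a minimal sketch, not a replacement for a dedicated tool: it ignores robots.txt, JavaScript-rendered links, and rate limiting, and the domain shown is a placeholder.

```python
# Minimal same-domain crawler sketch (Python stdlib only): collects every
# URL reachable from a start page, including links to .pdf/.doc files,
# for export to a plain-text file.
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    """Collects absolute, fragment-free URLs from <a href> attributes."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                    self.links.add(absolute)

def crawl(start_url, max_pages=500):
    """Breadth-first crawl restricted to the start URL's host; returns
    every URL discovered, including links to .pdf/.doc documents."""
    host = urlparse(start_url).netloc
    seen, queue = set(), [start_url]
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        if url in seen or urlparse(url).netloc != host:
            continue
        seen.add(url)
        # Record document links but don't try to parse them as HTML.
        if url.lower().endswith((".pdf", ".doc", ".docx")):
            continue
        try:
            with urlopen(url, timeout=10) as response:
                if "text/html" not in response.headers.get("Content-Type", ""):
                    continue
                parser = LinkExtractor(url)
                parser.feed(response.read().decode("utf-8", errors="replace"))
                queue.extend(parser.links)
        except OSError:
            continue  # skip unreachable pages
    return sorted(seen)

# Example usage (placeholder domain):
#   urls = crawl("https://www.example.com/")
#   with open("urls.txt", "w") as f:
#       f.write("\n".join(urls))
```

Raising `max_pages` (or removing the cap) covers the 500+ URL sites; running the script once per domain covers multiple domains.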
-
@PatrickDelehanty Screaming Frog not only excels in the two areas you mentioned but also offers a wide range of additional capabilities. I recommend exploring it for yourself! Best of luck!
-
It seems to crawl all the WordPress folders and media files.
Is there a tool that will list just your live website URLs? I'm creating a site map and planning a mass content re-organisation exercise, so I want a list of the URLs in Excel. Any tips welcome.
Thanks
Sarah
-
2nd vote for Screaming Frog. I've tried a lot of tools to pull info on all the URLs, and this one is by far the best for the job.
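Once you have a crawl export, stripping out the WordPress system folders and media files Sarah mentions takes only a few lines of scripting. A hypothetical sketch (the folder and extension lists are examples; adjust them for your site):

```python
# Filter a crawled URL list down to "live" pages and export it to a CSV
# file that opens directly in Excel.
import csv
from urllib.parse import urlparse

# Hypothetical exclusion lists; tune these for your own site.
WP_FOLDERS = ("/wp-content/", "/wp-includes/", "/wp-admin/", "/wp-json/")
MEDIA_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".svg", ".css", ".js")

def live_pages(urls):
    """Keep only URLs that look like live site pages. Documents such as
    .pdf and .doc are kept; WordPress system folders and media are not."""
    kept = []
    for url in urls:
        path = urlparse(url).path.lower()
        if any(folder in path for folder in WP_FOLDERS):
            continue
        if path.endswith(MEDIA_EXTENSIONS):
            continue
        kept.append(url)
    return kept

def write_csv(urls, filename="urls.csv"):
    """Write one URL per row so the list opens cleanly in Excel."""
    with open(filename, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["URL"])
        writer.writerows([url] for url in urls)
```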
-
Hi Felicia
Try Screaming Frog: it crawls the entire site (you can configure how you want it to crawl), and it can create an XML sitemap for you.
The tool goes above and beyond those two areas as well and can do so much. I suggest you check it out! Good luck!
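Screaming Frog builds the sitemap for you, but the output format itself is simple: the sitemaps.org protocol is just a `<urlset>` of `<url>`/`<loc>` entries. A minimal sketch of turning a URL list into that shape, for illustration only:

```python
# Build a sitemaps.org-style XML sitemap from a list of URLs.
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a sitemaps.org <urlset> document as a string."""
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        loc = ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc")
        loc.text = url
    return ET.tostring(urlset, encoding="unicode", xml_declaration=True)

# Example usage (placeholder URLs):
#   with open("sitemap.xml", "w") as f:
#       f.write(build_sitemap(["https://www.example.com/"]))
```

The full protocol also allows optional `<lastmod>`, `<changefreq>`, and `<priority>` children per `<url>`, which a dedicated tool fills in for you.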