How to find all crawlable links on a particular page?
-
Hi! This might sound like a newbie question, but I'm trying to find all crawlable links (the ones Googlebot sees) on a particular page of my website. I'm trying to use Screaming Frog, but that gives me all the links on that particular page AND on all subsequent pages in the given sub-directory. What I want is ONLY the crawlable links pointing away from a particular page. What is the best way to go about this? Thanks in advance.
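For readers who prefer a script to a crawler, here is a minimal sketch of the idea in the question: parse the HTML of one page and list only the links pointing away from it. It uses only the Python standard library. The names `LinkCollector` and `outgoing_links` are made up for this illustration, and treating "crawlable" as "an `<a href>` without `rel="nofollow"`" is a simplifying assumption: the sketch does not render JavaScript or read robots.txt, so it only approximates what Googlebot sees.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect absolute http(s) URLs from <a href> tags on a single page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        rel_values = (attributes.get("rel") or "").lower().split()
        # Assumption: links marked rel="nofollow" are treated as not crawlable.
        if not href or "nofollow" in rel_values:
            return
        # Resolve relative links against the page URL and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(self.base_url, href))
        # Keep only web links (skips mailto:, javascript:, tel: and similar).
        if urlsplit(absolute).scheme in ("http", "https") and absolute not in self.links:
            self.links.append(absolute)


def outgoing_links(html, base_url):
    """Return the de-duplicated links found in `html`, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

The HTML can come from anywhere, for example a saved copy of the page or a fetch with `urllib.request.urlopen`. Because only one page is parsed and no link is followed, the result contains the links on that page and nothing from deeper pages.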
-
Thanks for sharing this information Thomas. Appreciate your time and help here. Regards.
-
I understand. Yes, what you are referring to is a parameter, or "how far from home". Here's some information on a tool I'm using right now:
http://www.internetmarketingninjas.com/seo-tools/google-sitemap-generator/
Here is an HTML file of the results; you can see the "how far from home" value on the left-hand side. I suggest you run the tool yourself so you can see the full results.
Using the IMN Google Site Map Generator
Links are critically important to webpages, not only for connecting to other, related pages to help end users find the information they want, but also for optimizing the pages for SEO. The Find Broken Links, Redirects & Google Sitemap Generator Free Tool allows webmasters and search engine optimizers to check the status of both external links and internal links on an entire website. The resulting report generated by the Google sitemap generator tool will give webmasters and SEOs insight into the link structure of a website, and identify link redirects and errors, all of which help in planning a link optimization strategy. We always offer the downloadable results and the sitemap generator free for everyone.
Get started
To start with the free sitemap generator, type (or paste) the full home page URL of the website you want scanned. Select the number of pages you want to scan (up to 500, up to 1,000, or up to 10,000). Note that the job starts immediately and runs in real time. For larger sites containing numerous pages, the process can take up to 30 minutes to crawl and gather data on 1,000 pages (and longer still for very large sites). You can set the Google sitemap generator tool to send you an email once the crawl is completed and the data report is prepared. The online sitemap generator offers several options and also acts as an XML sitemap generator or an HTML sitemap generator.
Note that the results table data of the online sitemap generator is interactive. Most of the data items are linked, either to the URLs referenced or to details about the data. For most cells that contain non-URL data, pause the mouse over the cell to see the full results.
Results Bar
When the tool starts, a results bar appears at the top of the page showing the following information:
- Status of the tool (Crawling or Done)
- Number of Internal URLs crawled
- Number of External links found
- Number of Internal HTTP Redirects found
- Number of External HTTP Redirects found
- Number of Internal HTTP error codes found
- Number of External HTTP error codes found
For those who need sitemaps provided by either an HTML sitemap generator or an XML sitemap generator, there are corresponding options offered here. Also shown are the following:
- Download XML Sitemap button
- Download tool results in Excel format
- Download tool results in HTML format
Lastly, if you love the free sitemap generator tool, you can tell the world by clicking any of the following social media buttons:
- Facebook Like
- Google+
Email notification
Next, you can submit your email address to have a copy of the report emailed to you if you choose not to wait for it to finish crawling. We offer this feature as well as the sitemap generator free to all users.
Tool results data
When results are ready, the HTML sitemap generator will organize the data into six tables:
- Internal links
- External links
- Internal errors (a subset of Internal Links)
- Internal redirects (another subset of Internal Links)
- External errors (a subset of External Links)
- External redirects (another subset of External Links)
The table data is typically linked to either page URLs or to details about the data. Click on column headers to sort the results.
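As a rough illustration of how the six tables relate to each other, the sketch below sorts a list of (URL, HTTP status code) pairs into the same six groups. It is not the tool's actual implementation. The function name `build_tables`, the input format, and the rule "internal means the hostname matches exactly" are assumptions made for this example; the status ranges (4xx and 5xx for errors, 301 and 302 for redirects) follow the table descriptions in this documentation.

```python
from urllib.parse import urlsplit


def build_tables(records, site_host):
    """Group (url, status_code) pairs into the six report tables."""
    tables = {
        name: []
        for name in (
            "internal_links", "external_links",
            "internal_errors", "internal_redirects",
            "external_errors", "external_redirects",
        )
    }
    for url, status in records:
        # Assumption: a URL is internal only if its hostname matches exactly,
        # so "wishpicker.com" and "www.wishpicker.com" count as different hosts.
        scope = "internal" if urlsplit(url).hostname == site_host else "external"
        tables[f"{scope}_links"].append(url)
        # Errors and redirects are subsets of the corresponding links table.
        if 400 <= status <= 599:
            tables[f"{scope}_errors"].append(url)
        elif status in (301, 302):
            tables[f"{scope}_redirects"].append(url)
    return tables
```

Every URL lands in one of the two links tables, and the error and redirect tables only repeat entries that already appear there, which matches the "subset" wording above.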
1. Internal Links table
The Internal links table created by the XML sitemap generator includes the following data fields:
- URLs crawled on the site
- Link to The On Page Optimization Analysis Free SEO Tool for that URL
- URL’s level from the domain root
- URL’s returned HTTP status code
- Number of internal links the URL has within the site (click to see the list of URLs)
- Link text used for the URL
- Number of internal links on the page (click to see the list of URLs)
- Number of external links on the page (click to see the list of URLs)
- Size of the page in kilobytes (click to see page load speed test results for this URL from Google)
- Link to the Check Image Sizes, Alt Text, Header Checks and More Free SEO Tool for that URL
- The title tag text from the URL’s page
- The description tag text from the URL’s page
- The keywords tag text from the URL’s page
- Contents, if used, of the anchor tag’s “rel=” attribute
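One plausible reading of the "URL's level from the domain root" field in the list above is the number of path segments in the URL. That is an assumption, not something the tool's documentation confirms, and the helper name `level_from_root` is made up for this sketch. Note that a level computed this way describes the URL structure only; it is not necessarily the number of clicks needed to reach the page.

```python
from urllib.parse import urlsplit


def level_from_root(url):
    """Count the non-empty path segments of an absolute URL."""
    # The URL must include a scheme (http:// or https://); otherwise
    # urlsplit treats the hostname as part of the path.
    path = urlsplit(url).path
    return len([segment for segment in path.split("/") if segment])
```

Under this reading, `http://www.wishpicker.com/` is level 0, `http://www.wishpicker.com/gifts-for` is level 1, and `http://www.wishpicker.com/gifts-for/boyfriend` is level 2.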
2. External Links table
The External links table includes the following data fields:
- URL’s returned HTTP status code
- Number of times that URL is linked to from within the site (click to see the list of affected URLs)
- External URL used in the link
- Link text used for the URL
- Internal page URL on which the link was first found
3. Internal HTTP code errors table
The Internal errors table gathers all of the pages returning HTTP code errors (4xx and 5xx level codes) in one place to help organize the effort to resolve the problems. It includes the following data fields:
- URL’s returned HTTP status code
- Number of times that URL is linked to from within the site (click to see the list of affected URLs)
- Internal URL used in the link
- Link text used for the URL
- Internal page URL on which the link was first found
The Internal errors table is a subset of the Internal links table showing just those pages returning HTTP status code errors.
4. Internal HTTP redirects table
The Internal redirects table combines all of the pages returning HTTP redirects in one list so you can easily review them. You should not have to rely on redirects internally. Instead, you can fix the source code containing the redirected link. This table contains the following data fields:
- URL’s returned HTTP status code (click it to go to the HTTP Response Code Checker tool)
- Number of times that URL is linked to from within the site (click to see the list of affected URLs)
- Internal URL used in the link
- Link text used for the URL
- Redirect’s target URL
- Internal page URL on which the link was first found
The Internal redirects table is a subset of the Internal links table showing just those pages returning 301 and 302 HTTP status code redirects.
5. External HTTP code errors table
The External errors table gathers all of the pages returning HTTP code errors (4xx and 5xx level codes) in one place to help organize the effort to resolve the problems. It includes the following data fields:
- URL’s returned HTTP status code (click it to go to the HTTP Response Code Checker tool)
- Number of times that URL is linked to from within the site (click to see the list of affected URLs)
- Internal URL used in the link
- Link text used for the URL
- Redirect’s target URL
- Internal page URL on which the link was first found
The External errors table is a subset of the External links table showing just those pages returning HTTP status code errors.
6. External HTTP redirects table
The External redirects table combines all of the pages returning HTTP redirects in one list so you can easily review them. As the redirect to the targeted page does not affect your page, fixing these URLs is a lower priority. This table contains the following data fields:
- URL’s returned HTTP status code (click it to go to the HTTP Response Code Checker tool)
- Number of times that URL is linked to from within the site (click to see the list of affected URLs)
- External URL used in the link
- Link text used for the URL
- Redirect’s target URL
- Internal page URL on which the link was first found
The External redirects table is a subset of the External links table showing just those pages returning 301 and 302 HTTP status code redirects.
-
Hi Thomas! When I say 1 click, I mean all links that can be reached directly from www.wishpicker.com. For example:
wishpicker.com/gifts-for can be reached directly from wishpicker.com
wishpicker.com/gifts-for/boyfriend cannot be reached directly from wishpicker.com. I would first need to go to wishpicker.com/gifts-for, and then go to wishpicker.com/gifts-for/boyfriend. So wishpicker.com/gifts-for is 1 click away, and wishpicker.com/gifts-for/boyfriend is 2 clicks away from wishpicker.com.
I am looking to crawl all links that are only 1 click away. Thanks for your help here. Really appreciate it.
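The "clicks away" idea described above is the shortest-path distance in the site's link graph, which a breadth-first search computes. The sketch below is an illustration only: the function name `click_depths` and the dictionary format (page URL mapped to the list of URLs it links to) are assumptions, and the example graph is simply the three pages mentioned in this post rather than real crawl data.

```python
from collections import deque


def click_depths(link_graph, start, max_depth=None):
    """Return the minimum number of clicks from `start` to each reachable page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        # Do not follow links from pages that are already at the depth limit.
        if max_depth is not None and depths[page] >= max_depth:
            continue
        for target in link_graph.get(page, []):
            if target not in depths:  # the first visit is the shortest route
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical graph built from the example pages in this post.
example_graph = {
    "wishpicker.com": ["wishpicker.com/gifts-for"],
    "wishpicker.com/gifts-for": ["wishpicker.com/gifts-for/boyfriend"],
}
one_click = click_depths(example_graph, "wishpicker.com", max_depth=1)
```

With `max_depth=1`, `one_click` contains the start page at depth 0 and `wishpicker.com/gifts-for` at depth 1, while `wishpicker.com/gifts-for/boyfriend` is left out because it is 2 clicks away.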
-
When you say one click away, are you talking about a parameter?
I will run this through Screaming Frog and a couple of other tools and see if I can get your answer.
-
Hi Thomas
Thanks for your response. Here is my website: www.wishpicker.com
What I am looking for is all the links present only 1 click away from the page www.wishpicker.com (both internal and external).
Performing a crawl with Screaming Frog gives me all links (1, 2, 3, 4, and more clicks away). I'm not sure how to limit the crawl to show only the links that are 1 click away and exclude links that are 2 or more clicks away from this page.
Look forward to your response.
Thanks!
-
Hi,
Screaming Frog does in fact show you the links that would be considered external links. Here is a great guide:
http://www.seerinteractive.com/blog/screaming-frog-guide
If you look at the external part of Screaming Frog you'll find what you're looking for; however, you may also do this using either the campaign tool or the browser plug-in.
I would suggest reading the SEER Interactive guide and sticking with Screaming Frog; it is an outstanding tool.
Here are some other tools which I hope will help you if that is not the route you wish to go.
If you could post a screenshot of what you are looking for, or of what you mean by it only showing you the internal link count, that would help. I know what you mean by that; I just want to see which screen you're looking at so I can get you the answer you're looking for.
Here are some more tools that will allow you to scan up to 1,000 pages of your website for free and will give you the information you're looking for.
http://www.internetmarketingninjas.com/tools
If you cannot find what you're looking for in there, you might want to try:
http://www.quicksprout.com/2013/02/04/how-to-perform-a-seo-audit-free-5000-template-included/
distilled.net/U might be the best way to learn these types of things; however, it is a complete search engine optimization training course.
Sincerely,
Thomas