Broken Links from Open Site Explorer
-
I am trying to find broken internal links within my site. I found a page that was non-existent but had a bunch of internal links pointing to that page, so I ran an Open Site Explorer report for that URL, but it's limited to 25 URLs.
Is there a way to get a report of all of my internal pages that link to this invalid URL? I tried using the link: search modifier in Google, but that returns no results.
-
Whew! Big thread.
Sometimes, when you can't find all the broken links to a page, it's easier simply to 301 redirect the page to a destination of your choice. This helps preserve link equity, even for those broken links you can't find on large sites (and for external links as well).
Not sure if this would help in your situation, but I hope you're getting things sorted out!
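For anyone wondering what the 301 idea might look like in practice, here is a minimal sketch written as a Python WSGI wrapper. Nothing in this thread says what server or framework the site runs on, so the WSGI setup, the `with_redirects` name and the `/about-us/` destination are all assumptions for illustration only; most sites would do the same thing in their web server or CMS redirect settings.

```python
# Hypothetical redirect map: dead path -> destination of your choice.
# The destination "/about-us/" is an assumption, not taken from the site.
REDIRECTS = {
    "/about-us/media-birchwood": "/about-us/",
}


def with_redirects(app, redirects=REDIRECTS):
    """Wrap a WSGI app so that listed paths answer with a 301."""
    def wrapped(environ, start_response):
        target = redirects.get(environ.get("PATH_INFO", ""))
        if target is not None:
            # Permanent redirect: browsers and crawlers follow Location.
            start_response(
                "301 Moved Permanently",
                [("Location", target), ("Content-Length", "0")],
            )
            return [b""]
        # Every other path is handled by the wrapped application.
        return app(environ, start_response)
    return wrapped
```

With this in place, any internal or external link to the dead path lands on the chosen destination, even the links nobody has tracked down yet.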
-
Jesse,
That's where I started my search, but GWMT wasn't showing this link. I can only presume that because the page isn't coming back as a 404 (it shows that "We're Sorry" message instead), they're treating that message as content.
Thanks!
-
Lynn, that was a BIG help. I had been running that report, but was restricted to 25 responses. When I saw your suggestion to filter for only internal links, I was able to see all 127.
Big props. Thanks!
-
One more thing to add - GWMT should report all 404 links and their location/referrer.
-
Oops! I did not know this. Thanks Irving.
-
Use the word FREE with an asterisk, because Screaming Frog now limits the free version to 500 pages. Xenu is better; even brokenlinkcheck.com lets you spider 3,000 pages.
500 pages makes the tool practically worthless for any site of decent size.
-
Indeed if it is not showing a 404, that makes things a bit difficult!
You could try another way, use OSE!
Use the exact page, filter for only internal links, boom 127 pages that link to it. There might be more, but this should get you going!
-
Jesse:
I appreciate your feedback, but am surprised that the ScreamingFrog report found no 404s. SEOmoz found 15 in Roger's last crawl, but those aren't the ones that I'm currently trying to solve.
The problem page is actually showing up as duplicate content, which is kinda screwy. When visiting the page, our normal 404 error doesn't appear (which our developers are still trying to figure out), but instead, an error message appears:
http://www.gallerydirect.com/about-us/media-birchwood
If this were a normal 404 page, we'd probably be able to find the links faster.
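A page that shows an error message but still answers with a 200 status is often called a "soft 404", and it can be spotted with a small script. The sketch below is only an illustration: the `classify` and `check` names are made up, and the "we're sorry" marker is an assumption based on the message described in this thread, so adjust it to whatever the error page actually says.

```python
from urllib.error import HTTPError
from urllib.request import urlopen

# Assumed marker text, based on the "We're Sorry" message mentioned above.
ERROR_PHRASES = ("we're sorry",)


def classify(status, body, phrases=ERROR_PHRASES):
    """Return 'hard-404', 'soft-404', 'ok' or 'other' for one response."""
    if status in (404, 410):
        return "hard-404"
    if status == 200:
        # Normalise curly apostrophes so both spellings of "we're" match.
        text = body.lower().replace("\u2019", "'")
        if any(phrase in text for phrase in phrases):
            return "soft-404"  # error text served with a success status
        return "ok"
    return "other"


def check(url):
    """Fetch a URL and classify the response (needs network access)."""
    try:
        with urlopen(url, timeout=10) as resp:
            body = resp.read().decode("utf-8", "replace")
            return classify(resp.status, body)
    except HTTPError as err:
        # urlopen raises for 4xx/5xx responses; classify by the code alone.
        return classify(err.code, "")
```

Calling `check()` on the problem URL would show whether the server is really sending a 200, which would explain why tools that only report 404s are not flagging it.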
-
I got tired of the confusion and went ahead and proved it. Not sure if this is the site you wanted results for, but I used the site linked in your profile (www.gallerydirect.com).
It took me about 90 seconds and I had a full list... no 404s, though.
Anyway, here's a screenshot to prove it:
http://gyazo.com/67b5763e30722a334f3970643798ca62.png
So what's the problem? Want me to crawl the FBI site next?
-
I understand. Thing is, there is a way, and the spider doesn't affect anything. Like I said, I have Screaming Frog installed on my computer, and I could run a report for your website right now and you or your IT department would never know it happened. I just don't understand the part where the software doesn't work for you, but to each their own, I suppose.
-
Jesse:
That movie was creepy, but John Goodman was awesome in it.
I started this thread because I was frustrated that OSE restricts my results to 25 links, and I simply wanted to find the rest for that particular URL. I was assuming that there was either:
a. A method for getting the rest of the links that Roger found
b. Another way of pulling these reports from someone who already spiders them (since I can't get any using the link:[URL] in Google and Webmaster Tools isn't showing them).
Thanks to all for your suggestions.
-
Run the spider-based app from outside their "precious network" then. Hell, I could run it right now for you from my computer at work if I wanted. Use your laptop or home computer. It's a simple spider; you don't have to be within any network to run it. You could run one for CNN.com if you'd like as well...
-
How else do you expect to trace these broken links without using a "spider?"
Obviously it's the solution. And the programs take up all of like 8 megs... so what's the problem/concern?
I second the Screaming Frog solution. It will tell you exactly what you need to know and has ZERO risk involved (or whatever it is that's hanging you up). The bazooka comparison is ridiculous, because a bazooka destroys your house. Do you really think a spider crawl will affect your website?
Spiders crawl your site and report findings. This happens often whether you download a simple piece of software or not. What do you think OSE is? Or Google?
I guess what we're saying is if you don't like the answer, then so be it. But that's the answer.
PS - OSE uses a spider to crawl your site...
PPS - Do you suffer from arachnophobia? That movie was friggin' awesome. Now I want to watch old Jeff Daniels films.
PPPS - Do you guys remember John Goodman being in that movie? Wow, the late 80s and early 90s were really somethin' special.
-
John, I certainly see your point, but our IT guys would not take too kindly to me running a spider-based app from inside their precious network, which is why I was looking for a less intrusive solution.
I'm planning on a campaign to revive "flummoxed" to the everyday lexicon next.
-
Hi Darin,
Both these softwares are made for exactly this kind of job and they are not huge system killing programs or anything. Seriously I use one or both almost every day. I suggest downloading them and seeing how you go, I think you will be happy enough with the results.
-
The way I see it, it's much like missing the last flight home. You have a choice: get the bus, which means you might take a little longer, or wait for the next flight, which happens to be tomorrow evening. The bus will get you home that night.
I get the bus each and every time. I get home later than expected, I grant you, but I get home a lot quicker than waiting for the plane tomorrow.
Bewildered? I didn't realise it had fallen out of use; it's a common word (I think) in Ireland. Oh, and I am still young (ish).
-
John:
Bewildered. There's a good word, and I'm happy to see someone keeping it alive for the younger generations.
I'm not ungrateful for your suggestions, but both involve downloading and installing a spider, which seems like overkill, much like using a bazooka to kill a housefly.
-
I am bewildered by this. I have told you about one piece of free software that will do this for you, and Lynn has told you about another.
Anyway, good luck with however you resolve your issues.
-
Lynn, part of the problem is definitely template-based, and one of our developers is working on that fix now. However, I also found a number of non-template created links to this page simply due to UBD error (an old Cobol programming term meaning User Brain Dead).
I need to find all of the non-template based, UBD links that may have been created and fix them.
-
Xenu will do a similar job and doesn't have the page limit that, as I recall, the free version of Screaming Frog has: http://home.snafu.de/tilman/xenulink.html
If you have loads of links to this missing page, it sounds like you may have a template problem, with the link getting inserted on every page or on lots of pages. In that case, if you find the spot in the template, you will have fixed them all at once (if indeed that is what's happening).
-
Darin
It's a standalone piece of software. You run it, it crawls your website and finds broken inbound, outbound or internal links and tells you where they are, and you go and fix them.
Enter your URL, be it a page or a directory, and run it; it will give you all the bad links. And it won't limit you to 25.
You don't need to implement anything... run the software once, use it, and bin it afterwards if you wish.
But by all means, you can do as you suggest with SE ...
Regards
John
-
John,
While I could look at implementing such a spider to run the check sitewide on a regular basis, I am not looking to go that far at the moment. For right now, I'm only looking for all of the pages on my site that link to a single incorrect URL. I would have to think that there's a solution available for such a limited search.
If I have to, I suppose I can fix the 25 that Open Site Explorer displays, wait a few days for the crawler to run again, then run the report again, fix the next 25, then so on and so on, but that's going to spread the fix out potentially over a number of weeks.
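For a search this narrow (every page on one site that links to one URL), a short script is another option. What follows is a rough sketch using only the Python standard library, not a replacement for a proper crawler: the function names are invented for illustration, it ignores robots.txt, JavaScript-generated links and rate limiting, and the `max_pages` cap of 500 is an arbitrary assumption.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(base_url, html):
    """Return absolute, fragment-free URLs linked from the given HTML."""
    parser = LinkParser()
    parser.feed(html)
    return {urldefrag(urljoin(base_url, h)).url for h in parser.hrefs}


def fetch_html(url):
    """Download a page as text (needs network access)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", "replace")


def pages_linking_to(start_url, target_url, fetch=fetch_html, max_pages=500):
    """Crawl same-host pages breadth-first; list those linking to target."""
    host = urlparse(start_url).netloc
    target = target_url.rstrip("/")
    seen, queue, referrers = {start_url}, deque([start_url]), []
    fetched = 0
    while queue and fetched < max_pages:
        page = queue.popleft()
        fetched += 1
        try:
            links = extract_links(page, fetch(page))
        except OSError:
            continue  # skip pages that fail to load
        if any(link.rstrip("/") == target for link in links):
            referrers.append(page)
        for link in sorted(links):
            same_host = urlparse(link).netloc == host
            if same_host and link not in seen and link.rstrip("/") != target:
                seen.add(link)
                queue.append(link)
    return referrers
```

Something like `pages_linking_to("http://www.gallerydirect.com/", "http://www.gallerydirect.com/about-us/media-birchwood")` would return the linking pages it manages to reach. The `fetch` parameter is injectable so the logic can be tried against canned HTML before pointing it at a live site.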
-
Free tool, non SEO Moz related
http://www.screamingfrog.co.uk/seo-spider/ - run that and it will find all broken links, where they are coming from, etc.
Hope I ain't breaking any rules posting it.