Any tools for scraping blogroll URLs from sites?
-
This question is entirely in the whitehat realm...
Let's say you've encountered a great blog - with a strong blogroll of 40 sites.
The 40-site blogroll is interesting to you for any number of reasons, from link building targets to simply subscribing in your feedreader. Right now, it's tedious to extract the URLs from the site. There are some "save all links" tools, but they are also messy.
Are there any good tools that will
a) allow you to grab the blogroll (only) of any site into a list of URLs (yeah, ok, it might not be perfect since some sites call it "sites I like" etc.)
b) same, but export as OPML so you can subscribe.
Thanks!
Scott
-
Not at all. I guess my feeling here is that there is a sort of untapped social graph defined by blogrolls. If it were simple to harvest them upon visiting a blog (e.g. this blogger recommends...) one could do a stumble-on-steroids approach to a niche.
-
I thought you might be able to use the outbound link scraper to grab the outbound links on the page. Pop in the URLs of the pages you want to scrape and it will spit out a list of those domains and URLs. You can take those URLs and put them into the contact finder, and it will return the contact details for those sites. Combine the two spreadsheets for an epic list of blogs to contact for your outreach.
This is obviously for link building rather than subscribing. Sorry if I have misunderstood what you were trying to do.
-
Hi Keri,
That is a very cool tool, but it is overkill for this. It takes far too many steps to accomplish only part of the desired goal: grabbing all blogroll URLs (within the blogroll DIV tag) and exporting the list to a valid OPML file or URL list.
Thanks!
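For anyone who wants to throw something together, the goal described above (links inside the blogroll DIV only, then a URL list or OPML) could be sketched roughly as below, using only the Python standard library. This is a minimal sketch, not an existing tool: the names `BlogrollParser`, `extract_blogroll`, and `to_opml` are made up for illustration, and it assumes the blogroll container has an `id` or `class` containing the word "blogroll", which will not hold for every site. The OPML output uses `type="link"` outlines with the site URL; actually subscribing would need each blog's feed URL, which this sketch does not try to discover.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET


class BlogrollParser(HTMLParser):
    """Collect links found inside an element whose id/class hints at a blogroll."""

    def __init__(self, base_url, hints=("blogroll",)):
        super().__init__()
        self.base_url = base_url
        self.hints = hints          # assumption: container id/class contains one of these
        self.container_tag = None   # tag name of the blogroll container, once found
        self.depth = 0              # nesting depth of same-named tags inside it
        self.links = []             # (title, absolute_url) pairs
        self._href = None
        self._text = []

    def _is_blogroll(self, attrs):
        marker = " ".join(v or "" for k, v in attrs if k in ("id", "class")).lower()
        return any(h in marker for h in self.hints)

    def handle_starttag(self, tag, attrs):
        if self.container_tag is None:
            if self._is_blogroll(attrs):
                self.container_tag, self.depth = tag, 1
            return
        if tag == self.container_tag:
            self.depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self.container_tag is None:
            return
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip() or self._href
            self.links.append((title, self._href))
            self._href = None
        elif tag == self.container_tag:
            self.depth -= 1
            if self.depth == 0:
                self.container_tag = None  # left the blogroll container


def extract_blogroll(html, base_url, hints=("blogroll",)):
    parser = BlogrollParser(base_url, hints)
    parser.feed(html)
    parser.close()
    return parser.links


def to_opml(links, title="Blogroll"):
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for text, url in links:
        ET.SubElement(body, "outline", type="link", text=text, url=url)
    return ET.tostring(opml, encoding="unicode")


# Made-up sample page: only the links in the "blogroll" div should be kept.
SAMPLE = """
<div id="content"><a href="/about">About</a></div>
<div class="widget blogroll"><ul>
  <li><a href="http://example.com/">Example Blog</a></li>
  <li><a href="http://example.org/feed-me">Another <b>Blog</b></a></li>
</ul></div>
<div id="footer"><a href="http://example.net/">Footer link</a></div>
"""
links = extract_blogroll(SAMPLE, "http://myblog.example/")
url_list = "\n".join(url for _, url in links)
opml_text = to_opml(links)
```

For sites that label the section "sites I like" or similar, a different `hints` tuple can be passed to `extract_blogroll`; the page HTML itself would still need to be fetched separately.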
-
Nothing I saw there would do this. It looks like it could manage to list all external links, and I suppose you could manually pick the blogroll out of it.
-
Hi there,
Well, Keri's response reminded me of this question and the fact that I found a tool for scraping these kinds of lists.
Here it is (with some other cool tools), have fun:
-
Hi Scott,
I'm going through older questions. Did you ever find a tool to do what you wanted to do here?
-
One thing to look at is Outwit Hub for Firefox. It might be able to help with that. It can scrape data from a page and do a lot with it. http://www.outwit.com/products/hub/. Don't know that it meets all of your needs, but I also haven't seen a response with anything better at the moment.
-
Hey Scott,
What a great question and (sigh) I don't have the answer. I am going to check back to find out what people come up with here. Surely there is someone who lurks in these parts who can throw something together?