Any tools for scraping blogroll URLs from sites?
-
This question is entirely in the whitehat realm...
Let's say you've encountered a great blog - with a strong blogroll of 40 sites.
The 40-site blogroll is interesting to you for any number of reasons, from link building targets to simply subscribing in your feedreader. Right now, it's tedious to extract the URLs from the site. There are some "save all links" tools, but they are also messy.
Are there any good tools that will
a) allow you to grab the blogroll (only) of any site into a list of URLs (yeah, ok, it might not be perfect since some sites call it "sites I like" etc.)
b) same, but export as OPML so you can subscribe.
Thanks!
Scott
-
Not at all. I guess my feeling here is that there is a sort of untapped social graph defined by blogrolls. If it were simple to harvest them upon visiting a blog (e.g. this blogger recommends...) one could do a stumble-on-steroids approach to a niche.
-
I thought you might be able to use the outbound link scraper to grab the outbound links on the page. Pop in the URLs of the pages you want to scrape and it will spit out a list of those domains and URLs. You can take those URLs and put them into the contact finder and it will return the contact details for those sites. Combine the two spreadsheets for an epic list of blogs to contact for your outreach.
This is obviously for link building rather than subscribing - sorry if I have misunderstood what you were trying to do.
-
Hi Keri,
That is a very cool tool, but is overkill for this. It takes far too many steps to accomplish only part of the desired goal of grabbing all blogroll URLs (within the blogroll DIV tag) and exporting the list to a valid OPML file or URL list.
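For what it's worth, the two-step goal described here (grab the links inside a blogroll container, emit OPML) is small enough to sketch in a few dozen lines of stdlib Python. This is only a rough illustration, not any of the tools discussed in this thread: it assumes the blogroll sits in an element whose `class` or `id` contains "blogroll" (many themes label it differently), and the `BlogrollParser`/`to_opml` names are made up for the example. Since a blogroll links to site homepages rather than feeds, the `xmlUrl` written here is just the site URL; a real subscription list would need feed autodiscovery on top of this.

```python
# Sketch: pull (href, anchor text) pairs out of a "blogroll" container
# in raw HTML, then emit them as an OPML outline list. Stdlib only.
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}

class BlogrollParser(HTMLParser):
    """Collect links that appear inside an element marked 'blogroll'."""
    def __init__(self):
        super().__init__()
        self.depth = 0      # nesting depth while inside the blogroll container
        self.links = []     # (url, anchor_text) pairs, in document order
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        ident = ((a.get("class") or "") + " " + (a.get("id") or "")).lower()
        entering = self.depth == 0 and "blogroll" in ident
        if self.depth and tag == "a" and a.get("href"):
            self._href = a["href"]
            self._text = []
        if (self.depth or entering) and tag not in VOID:
            self.depth += 1

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None
        if self.depth and tag not in VOID:
            self.depth -= 1

def to_opml(links, title="Blogroll"):
    """Render (url, text) pairs as a minimal OPML 2.0 document."""
    items = "\n".join(
        '    <outline type="rss" text=%s xmlUrl=%s htmlUrl=%s/>'
        % (quoteattr(text or url), quoteattr(url), quoteattr(url))
        for url, text in links
    )
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            '<opml version="2.0">\n'
            '  <head><title>%s</title></head>\n'
            '  <body>\n%s\n  </body>\n</opml>' % (escape(title), items))
```

Feeding a page's HTML to `BlogrollParser.feed()` and passing `parser.links` to `to_opml()` gives something a feed reader can import; links outside the blogroll container are ignored.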
thanks!
-
Nothing I saw there would do this. It looks like it could manage to list all external links, and I suppose you could manually pick the blogroll out of that list.
-
Hi there,
Well, Keri's response reminded me of this question and the fact that I found a tool for scraping these kinds of lists:
Here it is (with some other cool tools), have fun:
-
Hi Scott,
I'm going through older questions. Did you ever find a tool to do what you wanted to do here?
-
One thing to look at is Outwit Hub for Firefox. It might be able to help with that. It can scrape data from a page and do a lot with it. http://www.outwit.com/products/hub/. Don't know that it meets all of your needs, but I also haven't seen a response with anything better at the moment.
-
Hey Scott,
What a great question and <sigh>I don't have the answer. I am going to come back to find out what people come up with here. Surely there is someone that lurks these parts who can throw something together?</sigh>