Any tools for scraping blogroll URLs from sites?
-
This question is entirely in the whitehat realm...
Let's say you've encountered a great blog - with a strong blogroll of 40 sites.
The 40-site blogroll is interesting to you for any number of reasons, from link building targets to simply subscribing in your feedreader. Right now, it's tedious to extract the URLs from the site. There are some "save all links" tools, but they are also messy.
Are there any good tools that will
a) allow you to grab the blogroll (only) of any site into a list of URLs (yeah, ok, it might not be perfect since some sites call it "sites I like" etc.)
b) same, but export as OPML so you can subscribe.
Thanks!
Scott
-
Not at all. I guess my feeling here is that there is a sort of untapped social graph defined by blogrolls. If it were simple to harvest them upon visiting a blog (e.g. this blogger recommends...) one could do a stumble-on-steroids approach to a niche.
-
I thought you might be able to use the outbound link scraper to grab the outbound links on the page. Pop in the URLs of the pages you want to scrape and it will spit out a list of those domains and URLs. You can take those URLs and put them into the contact finder, and it will return the contact details for those sites. Combine the two spreadsheets for an epic list of blogs to contact for your outreach.
This is obviously for link building rather than subscribing. Sorry if I have misunderstood what you were trying to do.
-
Hi Keri,
That is a very cool tool, but it is overkill for this. It takes far too many steps to accomplish only part of the desired goal: grabbing all blogroll URLs (within the blogroll DIV tag) and exporting the list to a valid OPML file or URL list.
thanks!
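
For what it's worth, the goal described above (grab only the links inside the blogroll container, then export them as a URL list or OPML) can be roughly sketched with Python's standard library. This is a minimal sketch, not an existing tool: `BlogrollParser`, `to_opml`, and the sample HTML are made up for illustration, and it assumes the container carries "blogroll" somewhere in its `id` or `class` and that the markup has properly closed tags. The OPML uses `type="link"` outlines because a blogroll only gives site URLs; a feed reader would still need each site's feed URL (`xmlUrl`), which this sketch does not discover.

```python
from html.parser import HTMLParser
from xml.etree import ElementTree as ET

# Elements that never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class BlogrollParser(HTMLParser):
    """Collect (title, url) pairs for links inside a 'blogroll' container."""

    def __init__(self, marker="blogroll"):
        super().__init__()
        self.marker = marker  # assumed id/class substring; themes vary
        self.depth = 0        # > 0 while inside the blogroll container
        self.links = []       # finished (title, url) pairs
        self._href = None     # href of the <a> currently open, if any
        self._text = []       # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self.depth:
            self.depth += 1
            if tag == "a" and attrs.get("href"):
                self._href, self._text = attrs["href"], []
        else:
            label = f"{attrs.get('id') or ''} {attrs.get('class') or ''}"
            if self.marker in label.lower():
                self.depth = 1  # entered the container

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip() or self._href
            self.links.append((title, self._href))
            self._href = None
        self.depth -= 1  # reaches 0 when the container itself closes


def to_opml(links, title="Blogroll"):
    """Serialize (title, url) pairs as OPML 2.0 'link' outlines."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for text, url in links:
        ET.SubElement(body, "outline", type="link", text=text, url=url)
    return ET.tostring(opml, encoding="unicode")


# Invented sample page: only the two links inside the blogroll should survive.
sample = """
<div id="sidebar"><a href="/about">About</a>
  <ul class="blogroll">
    <li><a href="http://example.com/">Example Blog</a></li>
    <li><a href="http://example.org/">Another Blog</a></li>
  </ul>
</div>
<a href="http://example.net/">Footer link</a>
"""
parser = BlogrollParser()
parser.feed(sample)
print("\n".join(url for _, url in parser.links))  # plain URL list
print(to_opml(parser.links))                      # OPML document
```

For sites that label the list differently ("sites I like" and so on), the `marker` argument would have to be changed by hand, which is the imperfect part the original question already anticipated.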
-
Nothing I saw there would do this. It looks like it could manage to list all external links, and I suppose you could manually pick the blogroll out of that list.
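
The "list all external links, then pick by hand" fallback is easy to approximate yourself. Below is a minimal sketch using only the Python standard library; `LinkCollector` and `external_links` are hypothetical names, and "external" is assumed to mean "different hostname than the page", so `www.` and bare-domain variants count as different hosts.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the raw href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def external_links(html, page_url):
    """Return unique http(s) links that point away from page_url's host."""
    collector = LinkCollector()
    collector.feed(html)
    own_host = urlparse(page_url).hostname
    seen, result = set(), []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue                        # skip mailto:, javascript:, etc.
        if parts.hostname == own_host or absolute in seen:
            continue                        # skip internal links and repeats
        seen.add(absolute)
        result.append(absolute)
    return result
```

The output still mixes blogroll entries with every other outbound link on the page, so the manual picking step remains.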
-
Hi there,
Well, Keri's response reminded me of this question and the fact that I found a tool for scraping this kind of list.
Here it is (with some other cool tools), have fun:
-
Hi Scott,
I'm going through older questions. Did you ever find a tool to do what you wanted to do here?
-
One thing to look at is Outwit Hub for Firefox. It might be able to help with that. It can scrape data from a page and do a lot with it. http://www.outwit.com/products/hub/. Don't know that it meets all of your needs, but I also haven't seen a response with anything better at the moment.
-
Hey Scott,
What a great question and, sigh, I don't have the answer. I am going to check back to find out what people come up with here. Surely there is someone who lurks in these parts who can throw something together?