Any tools for scraping blogroll URLs from sites?
-
This question is entirely in the whitehat realm...
Let's say you've encountered a great blog - with a strong blogroll of 40 sites.
The 40-site blogroll is interesting to you for any number of reasons, from link building targets to simply subscribing in your feedreader. Right now, it's tedious to extract the URLs from the site. There are some "save all links" tools, but they are also messy.
Are there any good tools that will
a) allow you to grab the blogroll (only) of any site into a list of URLs (yeah, ok, it might not be perfect since some sites call it "sites I like" etc.)
b) same, but export as OPML so you can subscribe.
Thanks!
Scott
-
Not at all. I guess my feeling here is that there is a sort of untapped social graph defined by blogrolls. If it were simple to harvest them upon visiting a blog (e.g. this blogger recommends...) one could do a stumble-on-steroids approach to a niche.
-
I thought you might be able to use the outbound link scraper to grab the outbound links on the page. Pop in the URLs of the pages you want to scrape and it will spit out a list of those domains and URLs. You can take those URLs and put them into the contact finder, and it will return the contact details for those sites. Combine the two spreadsheets for an epic list of blogs to contact for your outreach.
This is obviously for link building rather than subscribing. Sorry if I have misunderstood what you were trying to do.
-
Hi Keri,
That is a very cool tool, but it is overkill for this. It takes far too many steps to accomplish only part of the desired goal, which is to grab all blogroll URLs (within the blogroll DIV tag) and export the list to a valid OPML file or URL list.
Thanks!
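For anyone who wants to script this, here is a minimal sketch of that goal using only the Python standard library. It assumes the blogroll sits in a container whose `id` or `class` contains the word "blogroll" (many sites name it differently, so the `marker` parameter is a guess you would adjust), and it assumes reasonably well-formed markup with closing tags. The names `BlogrollParser` and `to_opml` are made up for illustration. Note that the OPML it writes lists site URLs as `type="link"` outlines; a feed reader would still need each site's feed URL to subscribe.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements that never have a closing tag, so they must not affect depth counting.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class BlogrollParser(HTMLParser):
    """Collect (title, url) pairs for links inside a container marked 'blogroll'."""

    def __init__(self, marker="blogroll"):
        super().__init__()
        self.marker = marker   # assumed substring of the container's id/class
        self.depth = 0         # > 0 while we are inside the blogroll container
        self.links = []        # collected (title, url) pairs
        self._href = None      # href of the <a> currently being read
        self._text = []        # text fragments of that <a>

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self.depth:
            self.depth += 1
            if tag == "a" and attrs.get("href"):
                self._href, self._text = attrs["href"], []
        else:
            ident = f"{attrs.get('id') or ''} {attrs.get('class') or ''}".lower()
            if self.marker in ident:
                self.depth = 1

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip() or self._href
            self.links.append((title, self._href))
            self._href = None
        self.depth -= 1


def to_opml(links, title="Blogroll"):
    """Serialize (title, url) pairs as an OPML 2.0 document of link outlines."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for text, url in links:
        ET.SubElement(body, "outline", type="link", text=text, url=url)
    return ET.tostring(opml, encoding="unicode")


if __name__ == "__main__":
    sample = """
    <div id="sidebar">
      <ul class="blogroll">
        <li><a href="http://example.com/">Example Blog</a></li>
        <li><a href="http://example.org/">Another Blog</a></li>
      </ul>
      <a href="http://example.net/">Not in the blogroll</a>
    </div>
    """
    parser = BlogrollParser()
    parser.feed(sample)
    print("\n".join(url for _, url in parser.links))  # plain URL list
    print(to_opml(parser.links))                      # OPML export
```

Pages that leave tags such as `<li>` unclosed would throw off the depth counting, so a more forgiving HTML parser may be a better fit for messy sites.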
-
Nothing I saw there would do this. It looks like it could manage to list all external links, and I suppose you could manually pick the blogroll out of it.
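As a rough illustration of that fallback (list every external link, then pick the blogroll entries out by hand), here is a small standard-library Python sketch. The names `LinkCollector` and `external_links` are hypothetical, and "external" is assumed to mean simply "a different host than the page itself", so `www.example.com` and `example.com` would count as different sites.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def external_links(html, page_url):
    """Return de-duplicated absolute http(s) links that point to other hosts."""
    own_host = urlparse(page_url).netloc.lower()
    collector = LinkCollector()
    collector.feed(html)
    seen, result = set(), []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue                        # skip mailto:, javascript:, etc.
        if parts.netloc.lower() == own_host:
            continue                        # skip internal links
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


if __name__ == "__main__":
    page = '<a href="/about">About</a> <a href="http://example.org/">A blog I like</a>'
    for url in external_links(page, "http://example.com/"):
        print(url)
```

The output is still every outbound link on the page, so the manual step of picking out the blogroll remains.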
-
Hi there,
Well, Keri's response reminded me of this question and the fact that I found a tool for scraping these kinds of lists.
Here it is (with some other cool tools), have fun:
-
Hi Scott,
I'm going through older questions. Did you ever find a tool to do what you wanted to do here?
-
One thing to look at is Outwit Hub for Firefox. It might be able to help with that. It can scrape data from a page and do a lot with it. http://www.outwit.com/products/hub/. Don't know that it meets all of your needs, but I also haven't seen a response with anything better at the moment.
-
Hey Scott,
What a great question and (sigh) I don't have the answer. I am going to check back to find out what people come up with here. Surely there is someone who lurks in these parts who can throw something together?