Any tools for scraping blogroll URLs from sites?
-
This question is entirely in the whitehat realm...
Let's say you've encountered a great blog - with a strong blogroll of 40 sites.
The 40-site blogroll is interesting to you for any number of reasons, from link building targets to simply subscribing in your feedreader. Right now, it's tedious to extract the URLs from the site. There are some "save all links" tools, but they are also messy.
Are there any good tools that will
a) allow you to grab the blogroll (only) of any site into a list of URLs (yeah, ok, it might not be perfect since some sites call it "sites I like" etc.)
b) same, but export as OPML so you can subscribe.
Thanks!
Scott
-
Not at all. I guess my feeling here is that there is a sort of untapped social graph defined by blogrolls. If it were simple to harvest them upon visiting a blog (e.g. this blogger recommends...) one could do a stumble-on-steroids approach to a niche.
-
I thought you might be able to use the outbound link scraper to grab the outbound links on the page. Pop in the URLs of the pages you want to scrape and it will spit out a list of those domains and URLs. You can take those URLs and put them into the contact finder, and it will return the contact details for those sites. Combine the two spreadsheets for an epic list of blogs to contact for your outreach.
This is obviously for link building rather than subscribing. Sorry if I have misunderstood what you were trying to do.
-
Hi Keri,
That is a very cool tool, but it is overkill for this. It takes far too many steps to accomplish only part of the desired goal: grabbing all blogroll URLs (within the blogroll DIV tag) and exporting the list to a valid OPML file or URL list.
Thanks!
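For anyone who wants to script the goal described above, here is a minimal sketch using only the Python standard library. It assumes the blogroll sits in a container whose id or class contains the word "blogroll" (many themes do this, but not all), and the names `BlogrollParser` and `to_opml` are made up for illustration. It also assumes reasonably well-formed HTML: unclosed tags such as a bare `<li>` would throw off the nesting count. Note that the OPML it writes lists site URLs as `type="link"` outlines; a feed reader generally needs each site's feed URL, which would have to be discovered separately.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class BlogrollParser(HTMLParser):
    """Collect (title, url) pairs for links inside a container whose id or
    class contains a marker such as "blogroll" (an assumed convention)."""

    def __init__(self, base_url, marker="blogroll"):
        super().__init__()
        self.base_url = base_url
        self.marker = marker
        self.depth = 0       # nesting depth inside the blogroll container
        self.links = []      # collected (title, url) pairs
        self._href = None    # href of the <a> currently being read
        self._text = []      # text fragments of that <a>

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self.depth:
            self.depth += 1
            if tag == "a" and attrs.get("href"):
                self._href = urljoin(self.base_url, attrs["href"])
                self._text = []
        else:
            ident = f"{attrs.get('id') or ''} {attrs.get('class') or ''}"
            if self.marker in ident.lower():
                self.depth = 1  # entered the blogroll container

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        if tag == "a" and self._href:
            title = "".join(self._text).strip() or self._href
            self.links.append((title, self._href))
            self._href = None
        self.depth -= 1

    def handle_data(self, data):
        if self._href:
            self._text.append(data)


def to_opml(links, title="Blogroll"):
    """Serialize (title, url) pairs as an OPML 2.0 document of link outlines."""
    opml = ET.Element("opml", {"version": "2.0"})
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for text, url in links:
        ET.SubElement(body, "outline",
                      {"type": "link", "text": text, "url": url})
    return ET.tostring(opml, encoding="unicode")
```

Usage would be along the lines of `p = BlogrollParser(page_url); p.feed(html_text)`, after which `p.links` holds the plain URL list and `to_opml(p.links)` gives the OPML text. Fetching the page is left out on purpose.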
-
Nothing I saw there would do this. It looks like it could manage to list all external links, and I suppose you could manually pick the blogroll out of that list.
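The "list all external links, then pick by hand" fallback is easy to approximate in a few lines. This is only a rough sketch with a made-up helper name, `external_links`; it treats any http(s) link on a different host as external, ignores a leading `www.`, and keeps the first link seen per host so the list stays short enough to review manually.

```python
from urllib.parse import urljoin, urlsplit


def external_links(page_url, hrefs):
    """Return absolute http(s) links whose host differs from the page's
    host, keeping only the first link seen for each external host."""
    def host(url):
        h = (urlsplit(url).hostname or "").lower()
        return h[4:] if h.startswith("www.") else h

    own = host(page_url)
    seen, result = set(), []
    for href in hrefs:
        url = urljoin(page_url, href)  # resolve relative links
        if urlsplit(url).scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar
        h = host(url)
        if h and h != own and h not in seen:
            seen.add(h)
            result.append(url)
    return result
```

The `hrefs` argument is assumed to be whatever list of raw `href` values you already pulled from the page.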
-
Hi there,
Well, Keri's response reminded me of this question and of the fact that I found a tool for scraping this kind of list.
Here it is (with some other cool tools), have fun:
-
Hi Scott,
I'm going through older questions. Did you ever find a tool to do what you wanted to do here?
-
One thing to look at is Outwit Hub for Firefox. It might be able to help with that. It can scrape data from a page and do a lot with it. http://www.outwit.com/products/hub/. Don't know that it meets all of your needs, but I also haven't seen a response with anything better at the moment.
-
Hey Scott,
What a great question and (sigh) I don't have the answer. I am going to check back to find out what people come up with here. Surely there is someone lurking in these parts who can throw something together?