Any tools for scraping blogroll URLs from sites?
-
This question is entirely in the whitehat realm...
Let's say you've encountered a great blog - with a strong blogroll of 40 sites.
The 40-site blogroll is interesting to you for any number of reasons, from link building targets to simply subscribing in your feedreader. Right now, it's tedious to extract the URLs from the site. There are some "save all links" tools, but they are also messy.
Are there any good tools that will:
a) grab only the blogroll of any site as a list of URLs (yeah, OK, it might not be perfect, since some sites call it "sites I like", etc.)
b) do the same, but export the list as OPML so you can subscribe?
Thanks!
Scott
-
Not at all. I guess my feeling here is that there is a sort of untapped social graph defined by blogrolls. If it were simple to harvest them upon visiting a blog (e.g. this blogger recommends...) one could do a stumble-on-steroids approach to a niche.
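As a rough illustration of the "social graph" idea, here is a minimal Python sketch of a breadth-first walk over blogrolls. Everything in it is an assumption for illustration: `crawl_blogrolls` and `max_depth` are made-up names, and `get_blogroll` is a hypothetical callable you would supply yourself that returns the blogroll URLs for a given blog.

```python
from collections import deque


def crawl_blogrolls(start_url, get_blogroll, max_depth=2):
    """Breadth-first walk of the blogroll graph.

    get_blogroll is a hypothetical callable: url -> iterable of blogroll URLs.
    Returns a dict mapping each visited blog to the blogs it recommends.
    """
    graph = {}
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        graph[url] = list(get_blogroll(url))
        # Only follow recommendations up to max_depth hops from the start.
        if depth < max_depth:
            for recommended in graph[url]:
                if recommended not in seen:
                    seen.add(recommended)
                    queue.append((recommended, depth + 1))
    return graph
```

The `seen` set keeps the walk from looping when two blogs list each other, which is common within a niche.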
-
I thought you might be able to use the outbound link scraper to grab the outbound links on the page. Pop in the URLs of the pages you want to scrape and it will spit out a list of the domains and URLs they link to. You can take those URLs and put them into the contact finder, and it will return the contact details for those sites. Combine the two spreadsheets for an epic list of blogs to contact for your outreach.
This is obviously for link building rather than subscribing. Sorry if I have misunderstood what you were trying to do.
-
Hi Keri,
That is a very cool tool, but it is overkill for this. It takes far too many steps to accomplish only part of the desired goal: grabbing all blogroll URLs (within the blogroll DIV tag) and exporting the list to a valid OPML file or URL list.
Thanks!
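For anyone who wants to script this, here is a minimal sketch in Python using only the standard library. It is not an existing tool. It assumes the blogroll container has "blogroll" somewhere in its id or class (the `marker` parameter, a made-up name), and that the markup closes its tags properly; a page with unclosed `<li>` or `<p>` tags would need a more lenient parser. `BlogrollParser` and `to_opml` are illustrative names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

# Elements with no closing tag; they must not affect the depth counter.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class BlogrollParser(HTMLParser):
    """Collect (title, url) pairs from links inside a blogroll container."""

    def __init__(self, base_url, marker="blogroll"):
        super().__init__()
        self.base_url = base_url
        self.marker = marker  # assumed substring of the container's id/class
        self.depth = 0        # > 0 while inside the blogroll container
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self.depth:
            self.depth += 1
            if tag == "a" and attrs.get("href"):
                self._href = urljoin(self.base_url, attrs["href"])
                self._text = []
        else:
            ident = f"{attrs.get('id') or ''} {attrs.get('class') or ''}"
            if self.marker in ident.lower():
                self.depth = 1

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip() or self._href
            self.links.append((title, self._href))
            self._href = None
        self.depth -= 1


def to_opml(links, title="Blogroll"):
    """Build an OPML 2.0 string with one link-type outline per site."""
    opml = ET.Element("opml", {"version": "2.0"})
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for text, url in links:
        ET.SubElement(body, "outline",
                      {"text": text, "type": "link", "url": url})
    return ET.tostring(opml, encoding="unicode")
```

Feed the page HTML to `BlogrollParser(page_url).feed(html)`, then pass `parser.links` to `to_opml`. One caveat: a blogroll gives you site URLs, not feed URLs, so the outlines here are plain links. To subscribe in a feed reader you would still need to find each site's feed and write it as an `xmlUrl` attribute on an `rss`-type outline.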
-
Nothing I saw there would do this. It looks like it could manage to list all external links, and I suppose you could manually pick the blogroll out of that list.
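The "list all external links" fallback is easy enough to sketch in Python with the standard library. This is only an illustration under a few assumptions: `external_links` and `LinkCollector` are made-up names, "external" is taken to mean a different hostname than the page (ignoring a leading "www."), and only http/https links are kept.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the raw href of every <a> tag on the page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def _host(url):
    """Hostname of url without a leading 'www.' ('' if there is none)."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def external_links(html, page_url):
    """Return unique absolute http(s) links pointing to other hosts."""
    own_host = _host(page_url)
    collector = LinkCollector()
    collector.feed(html)
    seen, result = set(), []
    for href in collector.hrefs:
        url = urljoin(page_url, href)  # resolve relative links
        host = _host(url)
        is_web = urlsplit(url).scheme in ("http", "https")
        if is_web and host and host != own_host and url not in seen:
            seen.add(url)
            result.append(url)
    return result
```

This returns every outbound link in page order, so you would still have to pick the blogroll entries out by hand.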
-
Hi there,
Well, Keri's response reminded me of this question and of the fact that I found a tool for scraping this kind of list.
Here it is (with some other cool tools), have fun:
-
Hi Scott,
I'm going through older questions. Did you ever find a tool to do what you wanted to do here?
-
One thing to look at is Outwit Hub for Firefox: http://www.outwit.com/products/hub/ It might be able to help with that. It can scrape data from a page and do a lot with it. I don't know whether it meets all of your needs, but I also haven't seen a response with anything better at the moment.
-
Hey Scott,
What a great question and, sigh, I don't have the answer. I am going to check back to find out what people come up with here. Surely there is someone who lurks in these parts who can throw something together?