Any tools for scraping blogroll URLs from sites?
-
This question is entirely in the whitehat realm...
Let's say you've encountered a great blog - with a strong blogroll of 40 sites.
The 40-site blogroll is interesting to you for any number of reasons, from link building targets to simply subscribing in your feedreader. Right now, it's tedious to extract the URLs from the site. There are some "save all links" tools, but they are also messy.
Are there any good tools that will:
a) allow you to grab the blogroll (only) of any site into a list of URLs (yeah, ok, it might not be perfect since some sites call it "sites I like" etc.)
b) same, but export as OPML so you can subscribe.
Thanks!
Scott
-
Not at all. I guess my feeling here is that there is a sort of untapped social graph defined by blogrolls. If it were simple to harvest them upon visiting a blog (e.g. this blogger recommends...) one could do a stumble-on-steroids approach to a niche.
-
I thought you might be able to use the outbound link scraper to grab the outbound links on the page. Pop in the URLs of the pages you want to scrape and it will spit out a list of those domains and URLs. You can take those URLs and put them into the contact finder, and it will return the contact details for those sites. Combine the two spreadsheets for an epic list of blogs to contact for your outreach.
This is obviously for link building rather than subscribing. Sorry if I have misunderstood what you were trying to do.
-
Hi Keri,
That is a very cool tool, but it is overkill for this. It takes far too many steps to accomplish only part of the desired goal of grabbing all blogroll URLs (within the blogroll DIV tag) and exporting the list to a valid OPML file or URL list.
Thanks!
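For reference, the goal described above (links inside the blogroll container only, exported as a URL list or OPML) could be sketched in Python with just the standard library. This is a rough sketch under assumptions, not an existing tool: it assumes the container's `id` or `class` contains the word "blogroll", that the markup inside it is reasonably well nested, and the names `BlogrollParser`, `extract_blogroll` and `to_opml` are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

# Elements that never have a closing tag; ignored when tracking nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class BlogrollParser(HTMLParser):
    """Collect [title, url] pairs for links inside a 'blogroll' container."""

    def __init__(self, base_url, marker="blogroll"):
        super().__init__()
        self.base_url = base_url
        self.marker = marker   # assumption: substring of the container's id/class
        self.depth = 0         # > 0 while we are inside the container
        self.links = []
        self._in_link = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth:
            if tag not in VOID_TAGS:
                self.depth += 1
        elif tag not in VOID_TAGS:
            ident = f"{attrs.get('id') or ''} {attrs.get('class') or ''}".lower()
            if self.marker in ident:
                self.depth = 1
        if self.depth and tag == "a" and attrs.get("href"):
            # Resolve relative links against the page URL.
            self.links.append(["", urljoin(self.base_url, attrs["href"])])
            self._in_link = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self._in_link:
            self.links[-1][0] += data


def extract_blogroll(html, base_url, marker="blogroll"):
    """Return (title, url) tuples; the URL stands in for a missing title."""
    parser = BlogrollParser(base_url, marker)
    parser.feed(html)
    return [(title.strip() or url, url) for title, url in parser.links]


def to_opml(links, title="Blogroll"):
    """Serialize the links as OPML 2.0 outlines of type 'link'."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for text, url in links:
        ET.SubElement(body, "outline", type="link", text=text, url=url)
    return ET.tostring(opml, encoding="unicode")


sample = """
<div id="sidebar">
  <a href="/about">About</a>
  <ul class="blogroll">
    <li><a href="http://example.com/">Example Blog</a></li>
    <li><a href="http://example.org/">Another Blog</a></li>
  </ul>
</div>
"""
links = extract_blogroll(sample, "http://myblog.example/")
print("\n".join(url for _, url in links))   # plain URL list
print(to_opml(links))                       # OPML export
```

Fetching the page itself is left out. Note also that the OPML here records the site URLs only; a feed reader generally wants each outline to carry an `xmlUrl` pointing at the site's feed, so a feed discovery step for each blog would still be needed before subscribing. Sites that label the section "sites I like" or similar would need a different `marker`.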
-
Nothing I saw there would do this. It looks like it could manage to list all external links, and I suppose you could manually pick the blogroll out of it.
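The fallback mentioned here, listing every external link and picking the blogroll out by hand, is simple enough to sketch in Python. This is only an illustration under assumptions: "external" is taken to mean a different hostname from the page (ignoring a leading "www."), and the names `LinkCollector`, `host` and `external_links` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href> on the page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if tag == "a" and href:
            self.urls.append(urljoin(self.base_url, href))


def host(url):
    """Lowercased hostname without a leading 'www.'."""
    name = (urlparse(url).hostname or "").lower()
    return name[4:] if name.startswith("www.") else name


def external_links(html, base_url):
    """Return unique http(s) links that point away from the page's own host."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    own = host(base_url)
    seen, result = set(), []
    for url in collector.urls:
        is_web = urlparse(url).scheme in ("http", "https")
        if is_web and host(url) != own and url not in seen:
            seen.add(url)
            result.append(url)
    return result
```

The output still mixes blogroll links with ads, comment links and so on, which is why the manual picking step remains.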
-
Hi there,
Well, Keri's response reminded me of this question and the fact that I found a tool for scraping this kind of list.
Here it is (with some other cool tools), have fun:
-
Hi Scott,
I'm going through older questions. Did you ever find a tool to do what you wanted to do here?
-
One thing to look at is Outwit Hub for Firefox. It might be able to help with that. It can scrape data from a page and do a lot with it. http://www.outwit.com/products/hub/. Don't know that it meets all of your needs, but I also haven't seen a response with anything better at the moment.
-
Hey Scott,
What a great question and, *sigh*, I don't have the answer. I am going to check back to find out what people come up with here. Surely there is someone who lurks in these parts who can throw something together?