Any tools for scraping blogroll URLs from sites?
-
This question is entirely in the whitehat realm...
Let's say you've encountered a great blog - with a strong blogroll of 40 sites.
The 40-site blogroll is interesting to you for any number of reasons, from link building targets to simply subscribing in your feedreader. Right now, it's tedious to extract the URLs from the site. There are some "save all links" tools, but they are also messy.
Are there any good tools that will
a) allow you to grab the blogroll (only) of any site into a list of URLs (yeah, ok, it might not be perfect since some sites call it "sites I like" etc.)
b) same, but export as OPML so you can subscribe.
Thanks!
Scott
-
Not at all. I guess my feeling here is that there is a sort of untapped social graph defined by blogrolls. If it were simple to harvest them upon visiting a blog (e.g. this blogger recommends...) one could do a stumble-on-steroids approach to a niche.
-
I thought you might be able to use the outbound link scraper to grab the outbound link onto the page. Pop in your URLS of the pages you want to scrape and it will spit out our a list of those domaind and urls. You can take those urls and put them into the contact finder and it will return the contact details for those sites. Combine the two spreadsheets for an epiuc list of blogs to contact for your outreach.
This is obviously for link building rather than subscribing - sorry if I have misunderstood what you were trying to do
-
Hi Keri,
That is a very cool tool, but is overkill for this. It takes far too many steps to accomplish only part of the desired goal of grabbing all blogroll URLs (within the blogroll DIV tag) and exporting the list to a valid OMPL file or URL list.
thanks!
-
nothing I saw there would do this. It looks like it could manage to list all external links, and I suppose you could manually pick the blogroll out of it.
-
Hi there,
Well, Keris response reminded me of this question and the fact that I found a tool for scraping these kind of lists:
Here it is (with some other cool tools) , have fun:
-
Hi Scott,
I'm going through older questions. Did you ever find a tool to do what you wanted to do here?
-
One thing to look at is Outwit Hub for Firefox. It might be able to help with that. It can scrape data from a page and do a lot with it. http://www.outwit.com/products/hub/. Don't know that it meets all of your needs, but I also haven't seen a response with anything better at the moment.
-
Hey Scott,
What a great question and <sigh>I don't have the answer. I am going to back to find out what people come up with here. Surely there is someone that lurks these parts that can throw something together?</sigh>
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Can check my site
can you please tell me what should I do with this site I trying a lot to solve its issues. the link is here https://alazharquranteaching.com/
Moz Pro | | saharali150 -
Duplicate Site Content found in Moz; Have a URL Parameter set in Google Webmaster Tools
Hey, So on our site we have a Buyer's Guide that we made. Essentially it is a pop-up with a series of questions that then recommends a product. The parameter ?openguide=true can be used on any url on our site to pull this buyer's guide up. Somehow the Moz Site Crawl reported each one of our pages as duplicate content as it added this string (?openguide=true) to each page. We already have a URL Parameter set in Google Webmaster Tools as openguide ; however, I am now worried that google might be seeing this duplicate content as well. I have checked all of the pages with duplicate title tags in the Webmaster Tools to see if that could give me an answer as to whether it is detecting duplicate content. I did not find any duplicate title tag pages that were because of the openguide parameter. I am just wondering if anyone knows:
Moz Pro | | MitchellChapman
1. a way to check if google is seeing it as duplicate content
2. make sure that the parameter is set correctly in webmaster tools
3. or a better way to prevent the crawler from thinking this is duplicate content Any help is appreciated! Thanks, Mitchell Chapman
www.kontrolfreek.com0 -
Recovering rankings after a botched url change
Hi there, I have for a long time had a bicycle maintenance website at madegood.org. Over the years the film branch of this business has taken off and moved in a slightly different direction, so I thought in March I decided to move madegood.org to madegoobikes.com, and create a new website for my film business at madegood.com. I thought I did a good job of telling google about my change of domain, but my rankings completely died, so about a month I moved madegoodbikes.com back to madegood.org. So far I haven't seen any sign of a recovery in my rankings, I'm getting almost no visits. I've check all my top pages on OSE and everything seems to be in place. https://moz.com/researchtools/ose/pages?site=http%3A%2F%2Fwww.madegood.org%2F&no_redirects=0&sort=page_authority&filter=all&page=1 Is it normal to wait over a month for my rankings to recover, or is there anything else I should be doing? Any tips/ideas/advice whatsoever will of huge help!
Moz Pro | | madegood0 -
Does SEOmoz have a tool to find mirror sites?
I heard from a company that is trying to get my clients SEO business that they discovered multiple sites mirroring our site's content. Does SEOmoz have a tool to find these websites? Or does Google?
Moz Pro | | thomas.wittine0 -
Site Explorer Results - Linking Domains Tab
I have a competitor who has the following Linking Root Domains numbers on the Linking Domains tab and was wondering if some one could explain what the numbers mean? opera.com - 205,515
Moz Pro | | bridgeway04-34677
dmoz.org - 97,752 prweb.com - 198,092 quantcast.com - 115,748 tinyurl.com - 443,540 jquery.com - 57,886 URL: http://www.opensiteexplorer.org/domains?site=www.verypdf.com%2F Thanks, Brad0 -
Keyword rankings tool is not working properly
My website http://www.logobite.com/ is in 29th position for the keyword "logo inspiration" but your keyword rankings tool is not showing up 😞 why?
Moz Pro | | logobite0 -
Keyword Difficulty tool
Hello Guys, Please someone explain how the keyword difficulty tool works and decides the difference super competitive v easy to rank keyword. Also the performance of this tool has been really bad since August i have been getting errors i.e. try again in 20 minutes. Thank you in advance for your replies 🙂
Moz Pro | | um090 -
Is there a tool that tracks and records your links to your site
What I mean by this is we have a linkbuilder working for us and I'm looking for to record there progress with link building I've seen somthing in Majestic but is there one in SEOMOZ All teh best Steve
Moz Pro | | ibexinternet0