Any way to pull anchor text?
-
Hi guys,
So basically I have a list of URLs/domains and their backlinks (example: http://s29.postimg.org/ujxm0c4lj/screenshot_677.jpg), but I'm missing the anchor text. Can anyone recommend a tool that can scan a backlink page, locate the URL/domain on the page, and then pull the anchor text?
Cheers, Chris
-
Hi Matt!
No, I have not yet found a tool which can do this.
The ScrapeBox Anchor Text plugin CleverPhD mentioned can only do this for one domain at a time. I need it for multiple domains.
Any other suggestions?
-
Hi Jay! Did you get this worked out?
-
Thanks Jay. If I look on the backlinks side, they all seem to have the same subdomain in some form or another. You would just need to set up the regex in Screaming Frog to look for that keyword in the subdomain, so it should match all the variants of it.
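To make the "keyword in the subdomain" idea a bit more concrete, here is a minimal Python sketch. The keyword `mybrand` and the sample URLs are made up for illustration, and the pattern is only a starting point that would likely need tweaking before being pasted into a crawler's regex field.

```python
import re

KEYWORD = "mybrand"  # hypothetical keyword shared by all the subdomains

# Match the keyword only inside the host part of the URL, i.e. between
# "://" and the first "/", ":", quote or whitespace character.
HOST_WITH_KEYWORD = re.compile(
    r"https?://[^/\s:\"']*" + re.escape(KEYWORD) + r"[^/\s:\"']*",
    re.IGNORECASE,
)

urls = [
    "http://mybrand.example.com/page",
    "https://www.mybrand-shop.example.net/",
    "http://example.org/mybrand",  # keyword only in the path, so no match
]

matches = [u for u in urls if HOST_WITH_KEYWORD.match(u)]
print(matches)
```

Because the character class stops at the first slash, a keyword that only shows up in the path or query string is not counted as a subdomain match.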
That said, ignore everything I just posted. I was thinking earlier, "Surely there is scraper software out there that does this already." I did not take the time to look. Your mention of Scrapebox reminded me of that.
Scrapebox has a separate addon that does this:
http://www.scrapebox.com/anchor-text-checker
The ScrapeBox Anchor Text Checker allows you to enter your domain and then load a list of URLs that contain your backlink. It will scan all the URLs containing your link and extract the anchor text used by the websites that link to you.
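If you need the same idea for several domains at once, here is a minimal Python sketch of the general technique using only the standard library. This is not how ScrapeBox works internally; the class `AnchorTextParser`, the function `anchor_texts`, and the domains `site-one.com` and `site-two.net` are hypothetical names for illustration. Fetching the backlink pages is left out, so the sketch works on HTML you have already downloaded.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class AnchorTextParser(HTMLParser):
    """Collect (target_domain, href, anchor_text) for links to any target domain."""

    def __init__(self, target_domains):
        super().__init__()
        self.target_domains = [d.lower() for d in target_domains]
        self.found = []
        self._current = None  # (domain, href) while inside a matching <a>
        self._text_parts = []

    def _match_domain(self, href):
        # Compare the link's hostname against each target domain,
        # accepting the bare domain or any subdomain of it.
        host = (urlparse(href).hostname or "").lower()
        for domain in self.target_domains:
            if host == domain or host.endswith("." + domain):
                return domain
        return None

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        domain = self._match_domain(href)
        if domain:
            self._current = (domain, href)
            self._text_parts = []

    def handle_data(self, data):
        if self._current is not None:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            # Join the text pieces and collapse runs of whitespace.
            text = " ".join("".join(self._text_parts).split())
            self.found.append((*self._current, text))
            self._current = None


def anchor_texts(html, target_domains):
    parser = AnchorTextParser(target_domains)
    parser.feed(html)
    parser.close()
    return parser.found


sample_html = (
    '<p><a href="http://www.site-one.com/">Site <em>One</em> home</a> and '
    '<a href="https://blog.site-two.net/post">read this</a> and '
    '<a href="http://unrelated.org/">skip me</a></p>'
)
print(anchor_texts(sample_html, ["site-one.com", "site-two.net"]))
```

Each result row carries the matched domain, so one pass over a backlink page can report anchor text for every domain in the list.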
-
Basically, I want the anchor text so I can easily identify the location of the link on the page without needing to view source and search for the URL.
This export is directly from the Scrapebox backlink checker, which doesn't give you anchor text: http://s29.postimg.org/ujxm0c4lj/screenshot_677.jpg
-
Ok. Can you be more specific on what you are trying to accomplish with this data? I think that may help my understanding of what you are trying to do.
-
Thanks CleverPhD. Sorry, I should have mentioned I'm looking to do this for multiple domain names, not just one. The method you describe works great for a single domain.
-
Screaming Frog can do this with custom extraction and list mode. If I am reading your question correctly, you have a list of URLs and which pages on your site they link to.
You would upload the list of URLs into Screaming Frog so it knows what pages to scan, and run it in list mode:
http://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/#15
You would then use the custom extraction tool to grep for the a href code that has a link to your domain:
http://www.screamingfrog.co.uk/web-scraper/
You would need to plug in a regular expression that looks for your domain (or versions of it) and then includes the rest of the HTML tag that contains the anchor text, all the way through the closing </a>.
You should then be able to import that data into a spreadsheet and use text to columns to split the anchor text into its own column.
It is a little tricky as the regular expression may have to be tweaked depending on how other sites link to your site. Run the Frog on a test group of 10 or so to make sure it works. If you have a bunch of errors, take the error examples and tweak the regular expression based on those.
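As a rough illustration of the regex approach described above, here is a minimal Python sketch. The domain `example.com`, the sample HTML, and the function name `extract_anchor_links` are assumptions for illustration. Screaming Frog's custom extraction takes the pattern itself rather than Python code, and the pattern will probably need tweaking for the way real sites write their links.

```python
import re

# Hypothetical target domain; replace with your own.
TARGET_DOMAIN = "example.com"

# Match an <a> tag whose href points at the target domain (optionally with a
# subdomain such as www.), then capture everything up to the closing </a>.
LINK_PATTERN = re.compile(
    r'<a\b[^>]*\bhref\s*=\s*["\'](https?://(?:[\w-]+\.)*'
    + re.escape(TARGET_DOMAIN)
    + r'[^"\']*)["\'][^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)

TAG_PATTERN = re.compile(r"<[^>]+>")


def extract_anchor_links(html):
    """Return (href, anchor_text) pairs for links to TARGET_DOMAIN."""
    results = []
    for href, inner_html in LINK_PATTERN.findall(html):
        # Strip nested tags (e.g. <b>, <span>) and collapse whitespace.
        text = " ".join(TAG_PATTERN.sub(" ", inner_html).split())
        results.append((href, text))
    return results


sample = (
    '<p>See <a class="x" href="http://www.example.com/page">our <b>best</b> guide</a> '
    'and <a href="http://other.org/">something else</a>.</p>'
)
print(extract_anchor_links(sample))
```

Returning the link URL and the anchor text as separate fields plays the role of the text-to-columns step: each pair can be written out as two spreadsheet columns. Regex on HTML is fragile, so testing on a small batch first, as suggested above, still applies.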