Any way to pull anchor text?
-
Hi guys,
So basically I have a list of URLs/domains and their backlinks (example: http://s29.postimg.org/ujxm0c4lj/screenshot_677.jpg), but I'm missing the anchor text. Can anyone recommend a tool that can scan a backlink page, locate the URL/domain on that page, and then pull the anchor text?
Cheers, Chris
-
Hi Matt!
No, I have not yet found a tool that can do this.
The ScrapeBox Anchor Text plugin CleverPhD mentioned can only do this for one domain at a time. I need it for multiple domains.
Any other suggestions?
-
Hi Jay! Did you get this worked out?
-
Thanks Jay. If I look on the backlinks side, they all seem to have the same subdomain in some form or another. You would just need to set up the regex in Screaming Frog to look for that keyword in the subdomain, so it should match all the variants of it.
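To make the "keyword in the subdomain" idea concrete, here is a minimal Python sketch of that kind of pattern. The keyword `example` and the function name `host_matches` are hypothetical placeholders (nothing here comes from the export in the screenshot), and the same expression would still need adapting to whatever regex field the crawler provides.

```python
import re

# Hypothetical keyword shared by every variant of the linked host.
KEYWORD = "example"

# Match URLs whose host part (everything between "://" and the first "/")
# contains the keyword, so www., blog., hyphenated and other variants all hit.
URL_PATTERN = re.compile(
    r"^https?://[^/]*" + re.escape(KEYWORD) + r"[^/]*",
    re.IGNORECASE,
)


def host_matches(url):
    """Return True when the keyword appears in the host part of the URL."""
    return bool(URL_PATTERN.match(url))
```

Because `[^/]*` cannot cross a slash, a keyword that only appears in the path (for instance `http://other.org/example`) is not treated as a match.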
That said, ignore everything I just posted. I was thinking earlier, "Surely there is scraper software out there that does this already." I did not take the time to look. Your mention of Scrapebox reminded me of that.
ScrapeBox has a separate add-on that does this:
http://www.scrapebox.com/anchor-text-checker
The ScrapeBox Anchor Text Checker allows you to enter your domain and then load a list of URL’s that contain your backlink. It will scan all the URL’s containing your link and extract the anchor text used by the websites that link to you.
-
Basically I want the anchor text so I can easily identify the location of the link on the page without needing to view source and search for the URL.
This export (http://s29.postimg.org/ujxm0c4lj/screenshot_677.jpg) comes directly from the ScrapeBox backlink checker, which doesn't give you anchor text.
-
Ok. Can you be more specific on what you are trying to accomplish with this data? I think that may help my understanding of what you are trying to do.
-
Thanks CleverPhD. Sorry, I should have mentioned I'm looking to do this for multiple domain names, not just one. The method you describe works great for a single domain.
-
Screaming Frog can do this with custom extraction and list mode. If I am reading your question correctly, you have a list of URLs and the pages on your site that they link to.
You would upload the list of URLs into Screaming Frog so it knows which pages to scan, and run it in list mode:
http://www.screamingfrog.co.uk/seo-spider/user-guide/configuration/#15
You would then use the custom extraction tool to grep for the a href code that contains a link to your domain:
http://www.screamingfrog.co.uk/web-scraper/
You would need to plug in a regular expression that looks for your domain (or versions of it) and then includes the rest of the HTML tag, including the anchor text, all the way through the closing </a> tag.
You should then be able to import that data into a spreadsheet and use text to columns to split the anchor text into its own column.
It is a little tricky as the regular expression may have to be tweaked depending on how other sites link to your site. Run the Frog on a test group of 10 or so to make sure it works. If you have a bunch of errors, take the error examples and tweak the regular expression based on those.
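As a rough illustration of the kind of regular expression described above, here is a minimal Python sketch. It is not Screaming Frog's own syntax or output; the function names `anchor_pattern` and `extract_anchor_texts` and the domain `example.com` are hypothetical, and the page HTML is assumed to have been downloaded already. A real HTML parser would cope better with unusual markup, so treat this as a starting point for the test group of 10 or so pages.

```python
import re

# Strips any tags nested inside the link, such as <b> or <img>.
TAG = re.compile(r"<[^>]+>")


def anchor_pattern(domain):
    """Build a regex for <a> tags whose href points at the domain or a subdomain."""
    host = re.escape(domain)
    return re.compile(
        r"<a\b[^>]*?\bhref\s*=\s*[\"']"      # opening tag up to the quoted href
        r"(?:https?:)?//(?:[\w-]+\.)*"       # scheme plus optional subdomains
        + host +
        r"(?=[/?#:\"'])[^\"']*[\"']"         # host must end here, then rest of URL
        r"[^>]*>(.*?)</a>",                  # anchor content up to the closing tag
        re.IGNORECASE | re.DOTALL,
    )


def extract_anchor_texts(html, domain):
    """Return the visible anchor text of every link to the domain in the HTML."""
    texts = []
    for inner in anchor_pattern(domain).findall(html):
        text = TAG.sub("", inner)            # drop nested markup
        texts.append(" ".join(text.split())) # collapse whitespace
    return texts


# Hypothetical sample page, standing in for one downloaded backlink page.
sample = (
    '<p><a href="http://www.example.com/page" rel="nofollow">Best <b>widgets</b></a>'
    ' and <a href="https://other.org/">other</a></p>'
)
anchors = extract_anchor_texts(sample, "example.com")
```

Since the domain is a parameter, the same function can be called once per row of a backlink export (backlink page plus target domain), which covers the multiple-domain case. Links that use relative URLs or unquoted href values are not matched by this pattern and would need it to be tweaked.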