TF*IDF Analysis Tools
-
Hi guys,
I was wondering if anyone knows of free TF*IDF analysis tools on the market?
I know about onpage.org and Text-tools.net, but both are paid.
Does anyone know of other tools?
Cheers,
Chris
-
Hi Chris,
I don't know of any free tools that do this, unless you want to write some code yourself. If you go that route, we have some open-source libraries that you might find useful, especially qdr, which implements TF-IDF scoring, and dragnet, for parsing and cleaning the HTML. Good luck in your search!
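For anyone taking the write-it-yourself route, the scoring itself is small. The following is a minimal sketch in plain Python, assuming the common tf × log(N/df) variant and a very naive tokenizer. It does not use qdr or dragnet and does not reflect their APIs; the function names `tokenize` and `tf_idf` are made up for illustration, and the input is assumed to be text that has already been extracted from the HTML.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf(documents):
    """Return one {term: score} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on empty documents
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


if __name__ == "__main__":
    pages = [
        "free seo tools for keyword analysis",
        "paid seo tools",
        "keyword research guide",
    ]
    for page, page_scores in zip(pages, tf_idf(pages)):
        best = sorted(page_scores.items(), key=lambda kv: kv[1], reverse=True)[:3]
        print(page, "->", best)
```

With this variant, a term that appears in every document scores zero, so the corpus you compare against (for example, the top-ranking pages for a query) matters as much as the page itself.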
-
Hi Chris,
It's not the TF-IDF solution you're after, but it may help: SEO Quake (available as a free Chrome plug-in: https://chrome.google.com/webstore/detail/seoquake/akdgnmcogleenhbclghghlkkdndkjdjc) approximates some of this data for you.
It shows the most commonly recurring one-, two-, three- and four-word phrases on a web page. It won't compare them against a corpus (e.g. your whole site), so you get the term-frequency side only. For each phrase it gives a Density % (broadly, how often the word or phrase appears) and a Prominence % (based on density, but also on where the phrase appears: title, description, keywords etc.).
Hope that helps.
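The phrase counting described above is easy to approximate yourself. This is a rough sketch, not SEO Quake's actual method: the density formula (occurrences divided by the number of phrases of that length on the page) is an assumption, the function name `phrase_density` is made up, and Prominence is left out because the thread does not say how it is weighted.

```python
import re
from collections import Counter


def phrase_density(text, max_words=4, top=5):
    """Count the most frequent 1- to max_words-word phrases in text.

    Density here is an assumed definition: occurrences of a phrase as a
    percentage of all phrases of the same length on the page.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    report = {}
    for n in range(1, max_words + 1):
        # Sliding window of n consecutive words.
        grams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
        total = len(grams)
        counts = Counter(grams)
        report[n] = [
            (phrase, count, 100.0 * count / total)
            for phrase, count in counts.most_common(top)
        ]
    return report


if __name__ == "__main__":
    page_text = "free seo tools and free seo tips for seo beginners"
    for n, rows in phrase_density(page_text, max_words=2, top=3).items():
        for phrase, count, density in rows:
            print(f"{n}-word  {phrase!r}: {count} times, {density:.1f}%")
```

Like the plug-in, this looks at a single page in isolation; it says nothing about how unusual a phrase is compared with other pages.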