Tons of Crappy links in new OSE (Open Site Explorer)
-
I am starting to miss the old OSE. I've found that for a lot of the pages on our site, the new OSE is showing WAY more links and most of them are garbage nonsense links from China, Russia, and the rest of the internet Wild West.
For instance, in the old OSE, this page used to show 9 linking domains:
http://www.uncommongoods.com/gifts/by-recipient/gifts-for-him
It now shows 454 links. Some of the new links (about 5 of them) are legitimate. The other 400+ are garbage. Some are porn sites; most of them don't even open a web page, they just initiate some shady download. I've seen this for other sites as well (like Urban Outfitters). This is making it much harder for me to do backlink analysis because I have no clue how many "normal" links they have. Is anyone else having this problem? Is there any way to filter all this crap out? See the attached screenshot of the list of links I'm getting from OSE.
-
Ok thank you. I will email directly.
-
Hey Zack,
Sorry to hear you're still having problems - we've seen an improvement on most sites at this point. Would you want to send me info on the site you're searching and any filters you are using?
If you don't feel comfortable posting that info on this thread, feel free to email me directly: carin@seomoz.org.
Thanks!
Carin
-
Hey Carin,
I just wanted to follow up on this...I'm still seeing these spammy binary files show up as links. Unfortunately it makes OSE quite useless for me in regards to exploring our own backlinks.
What is the status of this problem? Has there been any headway? Why does our site have problems when most others don't?
Thanks!
-Zack
-
Hey Zack,
Thanks so much for understanding! We are doing everything we can to get the bug resolved. Binary files are the downloadable files you see as links - .pdf, .exe, .img, etc.
I'm really sorry, but we don't have a URL to the old OSE. I saw Steven's response as a workaround - is that possible or are there too many file types to filter out?
Our crawlers that provide the metrics to OSE are always crawling, but it will take about a month for our fix to propagate through to all the pages we crawl. Once we have removed these links from our crawlers, we'll have to process the metrics. This is why it's looking like late September for the fix to show up.
I really appreciate your patience and understanding, we're doing everything we can to fix it!!
Thanks,
Carin
-
Hey Carin-
Thank you so much for this in-depth response. Glad to hear that you guys are aware of it and trying to sort it out. Very interesting info... I'd never heard of "binary" links before, but I hope you guys can figure out how to handle these. Seems like a tough task to tackle. Just by looking at my CSV, it looks like these come in several different forms and they could be hard to identify. I have a few questions:
1. Is there by chance a URL you could give me that points to the old OSE?
2. How often does OSE crawl? Is it a constant process or are there scheduled crawls?
Thanks!!
-Zack
-
Hey Zack, I saw the ticket you filed was answered by Aaron, but I just wanted to follow up with you as well. We have made some really exciting changes to the crawler, but, unfortunately, there is a pretty obvious bug as well...
The “questionable” links coming from the Internet Wild West are due to the crawler reaching much deeper into sites where there are more download (i.e. binary) links. The first issue is that the crawler is counting a binary file as a link, but the larger issue is that the crawler doesn’t really know how to handle these types of files. This bug is causing some links to be improperly associated with certain domains. This is probably what you're seeing with all the crazy links from China and Russia which don't actually link to the site you're researching.
There are two steps to addressing this issue: changing how the crawler sees these file types and then fixing how the crawler handles these file types. We have made improvements to our algorithm so that we will be able to handle the majority of these files correctly; however, this update will need about a month to propagate. The fix for this issue probably won’t be seen for two more updates, meaning late September. Our improvements should catch most of the issues, but there still could be a few cases we haven't addressed. If this happens, don't hesitate to let us know; we love feedback since it helps us improve and make our index even better!
The next step is to fix how our crawlers handle binary file links and prevent them from being improperly associated with certain domains. We are in the process of working through that issue right now. We’re doing everything we can to resolve this bug as we know it is alarming to see these “questionable” links associated with your sites.
I hope this helps and thanks so much for being patient :)
Thanks,
Carin
-
2 ways:
- Get as CSV and spend the time going through it
- Wait it out
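For the CSV route, a small script can do most of the sorting for you. Below is a rough Python sketch, not anything official from OSE: it drops rows whose link URL ends in a download-style file extension. The column name `URL` and the extension list (beyond the .pdf, .exe, and .img mentioned in this thread) are assumptions, so check them against your own export before relying on it.

```python
import csv
from urllib.parse import urlparse

# Assumed list of "binary" extensions; only .pdf, .exe and .img come from
# this thread, the rest are guesses you should adjust for your own data.
BINARY_EXTENSIONS = (".pdf", ".exe", ".img", ".zip", ".rar", ".dmg", ".msi")


def is_binary_link(url):
    """Return True if the URL path ends in one of the binary extensions."""
    path = urlparse(url.strip()).path.lower()
    return path.endswith(BINARY_EXTENSIONS)


def filter_links(rows, url_column="URL"):
    """Keep only the rows whose URL does not look like a file download.

    `url_column` is a hypothetical header name; use whatever your CSV has.
    """
    return [row for row in rows if not is_binary_link(row.get(url_column) or "")]


def filter_csv(in_path, out_path, url_column="URL"):
    """Read a link export, write a copy without binary links, return rows kept."""
    with open(in_path, newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames or []
        kept = filter_links(list(reader), url_column)
    with open(out_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(kept)
    return len(kept)
```

You would call it with something like `filter_csv("ose_export.csv", "ose_clean.csv")`, where both file names are placeholders. It only catches links that end in a known extension, so links that hide the download behind a normal-looking URL would still need a manual pass.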
-
OK cool, good info, hope they fix it soon!! Any good ideas on how you can filter this crap out?
-
Hello Zack,
That is an issue that they are working on. I know this because I already discussed it with one of their help desk people. Here is the page that describes the changes: http://www.seomoz.org/blog/brand-new-open-site-explorer-is-here
In addition to that, here is some additional information I can share with you:
You may see “questionable” links with weird file extensions. This is due to the crawler reaching much deeper into sites where there are more download links. We are looking into fixing this bug as soon as we can so these won’t be counted as links.
Related Questions
-
OSE and Facebook
Hi, I recall being able to use OSE for Facebook. Take https://www.facebook.com/VICE/ which we know as a URL would have many backlinks. It's not registering any. Has this always been the case?
Moz Pro | | wearehappymedia0 -
Working with Open Site Explorer
Hi everyone, I'm new to keyword analysis, and am in the process of consuming a lot of SEOmoz articles and resources on the subject. I wanted to see if I'm correct in my analysis of two compared sites, and hope you can shed some light on the matter. I've been to the Google Keyword Tool and looked for my informational keywords for the project I'm working on, since the user intent is all about information. A not-so-great keyword phrase I've found with 12,100 local monthly searches is: "programa de inglês" (english programme) I'm just using this as a quick example. I have performed a Google search query for the above phrase from google.com.br (Brazil), and I'm comparing the #2 and #4 results from the 1st page of the SERPs which are: #2) www.programa-ingles.net and #4) http://www.baixaki.com.br/categorias/educacao-e-diversao.htm. What's confusing me is that in Open Site Explorer, the #4 result gets a much higher page authority compared to the #2 result, and beats #2 on every category except for internal–external link ratio and all the social categories. Here's an image attached of the comparison. Is it the fact that the external links of #2 account for 100% of the links pointing to it, or that the #2 position beats (rather pitifully) #5 on social sharing, or is it something that I've not stumbled across yet? Thanks in advance for helping out a n00b. 6HU6hBi.png
Moz Pro | | featherseo0 -
How can competition outrank you if your site has better Domain/Page Authority, More links, and More Social sharing?
Say you have a site that has better Domain/page authority, more links, more social media sharing, and a lot more indexed pages (thanks to blogging) than the competition. Of course, all of these metrics are based on data from the SEOMoz Open Site Explorer tool, and I am not sure whether it produces accurate data. 1. Other than exact match domains or the age of a domain, what would be other reasons why the competition would outrank you? 2. Can anyone suggest other ways to help increase a site's domain/page authority besides creating more indexed pages, link building, etc.?
Moz Pro | | webestate0 -
In Site Explorer My Blog.URL.com Shows "No Data Available for this URL"
Why when I use http://www.opensiteexplorer.org and I'm researching our Blog.URL.com's does the tool say "No Data Available for this URL"? Example: http://www.opensiteexplorer.org/links?site=blog.centurypayments.com
Moz Pro | | cfield_splashmedia.com0 -
Alexa Ranking Sites
I found these two sites giving my competitor link juice: http://www.webnamelist.com/alexa/Alexa_186.html http://www.list-of-domains.org/alexa/Alexa_185.html I have seen these sites before and I just don't get why they are authoritative. The funny thing is I did a search for my competitor's link on the page and it's not showing up. Is this a problem in Site Explorer? Why is Site Explorer mentioning these sites as my competition's best links when these links do not exist on their site?
Moz Pro | | SEODinosaur0 -
How to Use Open Site Explorer
I've used Open Site Explorer here at SEOmoz for the first time and I'm confused by the results. I'm wondering how dated the results are? And, what are they based on? For example, I'm certain my Facebook shares and likes are higher... same with the Twitter links. It seems kind of old?! One of my competitors who gets about 2x more traffic than me DOES have great backlinks. I know that. BUT, it's odd that her Facebook and Twitter results are what they are compared to mine: they're WAY higher in Site Explorer AND her links seem on par with her Facebook page, whereas mine seem WAY lower than what they are in reality. She barely tweets and posts on Facebook any more. Maybe once per month. She started out gangbusters, but doesn't do it much any more. That's kinda why I'm wondering if it's based on older stuff and not updated often? Anyone know?
Moz Pro | | annasus0 -
Links from External Sites not Showing on OpenSite or Google Webmaster Tools
Hi, I asked this in another post, but didn't get a solid answer. For several websites I manage, I have access to their Google Webmaster Tools. When I put the site through there, or the link:// or SEOMoz's OpenSite link reporter, I get not even a fraction of the links that I know are pointing to the websites in question. Some of these links are from home pages or side columns, and some are from articles kept within a website, like at Inc.com. I wondered if the links in question were not themselves being indexed, but if I do a very specific search, I can pull them up in the search results. Is it just common knowledge that these link tools are not accurate? And should I assume that somehow search engines really do see all of the inbound links, even if our reporting tools don't? Thanks in advance!
Moz Pro | | lilactree0 -
Total number of links in OSE not clear to me.
I have compared 3 sites in Open Site Explorer. The total # of links in the subdomain metrics report for one of the URLs is 46,787. Where is that number coming from? There are 125 total external followed links and 132 total external links. I cannot see the number for internal links, but I'm sure it isn't 40,000+. So 46,787 is the result of the addition of what? Thanks a lot for your help, and sorry for this newbie question 🙂
Moz Pro | | gerardoH0