Tons of Crappy links in new OSE (Open Site Explorer)
-
I am starting to miss the old OSE. I've found that for a lot of the pages on our site, the new OSE is showing WAY more links and most of them are garbage nonsense links from China, Russia, and the rest of the internet Wild West.
For instance, in the old OSE, this page used to show 9 linking domains:
http://www.uncommongoods.com/gifts/by-recipient/gifts-for-him
It now shows 454 links. About 5 of the new links are legitimate. The other 400+ are garbage. Some are porn sites, and most of them don't even open a web page; they just initiate some shady download. I've seen this for other sites as well (like Urban Outfitters). This is making it much harder for me to do backlink analysis, because I have no clue how many "normal" links they have. Is anyone else having this problem? Is there any way to filter all this crap out? See attached screenshot of the list of links I'm getting from OSE.
-
Ok thank you. I will email directly.
-
Hey Zack,
Sorry to hear you're still having problems - we've seen an improvement on most sites at this point. Would you want to send me info on the site you're searching and any filters you are using?
If you don't feel comfortable posting that info on this thread, feel free to email me directly: carin@seomoz.org.
Thanks!
Carin
-
Hey Carin,
I just wanted to follow up on this... I'm still seeing these spammy binary files show up as links. Unfortunately, it makes OSE quite useless for me when it comes to exploring our own backlinks.
What is the status of this problem? Has there been any headway? Why does our site have problems but most others don't?
Thanks!
-Zack
-
Hey Zack,
Thanks so much for understanding! We are doing everything we can to get the bug resolved. Binary files are the downloadable files you see as links - .pdf, .exe, .img, etc.
I'm really sorry, but we don't have a URL to the old OSE. I saw Steven's response as a workaround - is that possible or are there too many file types to filter out?
Our crawlers that provide the metrics to OSE are always crawling, but it will take about a month for our fix to propagate through to all the pages we crawl. Once we have removed these links from our crawlers, we'll then have to process the metrics. This is why it's looking like late September for the fix to show up.
I really appreciate your patience and understanding, we're doing everything we can to fix it!!
Thanks,
Carin
-
Hey Carin-
Thank you so much for this in-depth response. Glad to hear that you guys are aware of it and trying to sort it out. Very interesting info... I'd never heard of "binary" links before, but I hope you guys can figure out how to handle these. Seems like a tough task to tackle: just by looking at my CSV, it looks like these come in several different forms and they could be hard to identify. I have a few questions:
1. Is there by chance a URL you could give me that points to the old OSE?
2. How often does OSE crawl? Is it a constant process or are there scheduled crawls?
Thanks!!
-Zack
-
Hey Zack, I saw the ticket you filed was answered by Aaron, but I just wanted to follow up with you as well. We have made some really exciting changes to the crawler, but, unfortunately, there is a pretty obvious bug as well...
The “questionable” links coming from the Internet Wild West are showing up because the crawler now reaches much deeper into sites, where there are more download (i.e. binary) links. The first issue is that the crawler is counting a binary file as a link, but the larger issue is that the crawler doesn’t really know how to handle these types of files. This bug is causing some links to be improperly associated with certain domains. This is probably what you're seeing with all the crazy links from China and Russia, which don't actually link to the site you're researching.
There are two steps to addressing this issue: changing how the crawler sees these file types and then fixing how the crawler handles these file types. We have made improvements to our algorithm so that we will be able to handle the majority of these files correctly; however, this update will need about a month to propagate. The fix for this issue probably won’t be seen for two more updates, meaning late September. Our improvements should catch most of the issues, but there still could be a few cases we haven't addressed. If this happens, don't hesitate to let us know; we love feedback since it helps us improve and make our index even better!
The next step is to fix how our crawlers handle binary file links and prevent them from being improperly associated with certain domains. We are in the process of working through that issue right now. We’re doing everything we can to resolve this bug, as we know it is alarming to see these “questionable” links associated with your sites.
I hope this helps and thanks so much for being patient :)
Thanks,
Carin
-
2 ways:
- Get as CSV and spend the time going through it
- Wait it out
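For the CSV route, here is a rough sketch of how the going-through-it part could be scripted in Python. It is only an illustration, not anything official from the OSE team: the `URL` column name is a guess at what the export header looks like, and apart from .pdf, .exe and .img (the examples mentioned in this thread) the extension list is assumed and should be adjusted to whatever shows up in your own export.

```python
import csv
import io
from posixpath import splitext
from urllib.parse import urlparse

# .pdf, .exe and .img are the examples named in this thread; the rest are
# assumed additions and should be adjusted to what appears in your export.
BINARY_EXTENSIONS = {".pdf", ".exe", ".img", ".zip", ".rar", ".dmg", ".msi"}


def is_binary_link(url):
    """Return True if the URL path ends in one of the listed extensions."""
    # Only the path is inspected, so query strings such as ?dl=1 are ignored.
    path = urlparse(url.strip()).path
    return splitext(path)[1].lower() in BINARY_EXTENSIONS


def filter_links(rows, url_column="URL"):
    """Keep only rows whose linking URL does not look like a binary file.

    url_column is a hypothetical header name; check your CSV's header row.
    """
    return [row for row in rows if not is_binary_link(row[url_column])]


def filter_csv_text(csv_text, url_column="URL"):
    """Parse CSV text and return the rows that survive the filter."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return filter_links(list(reader), url_column)
```

To use it on a real export, read the file's contents into a string and pass it to `filter_csv_text`. This only catches links that give themselves away by file extension, so anything that comes in a different form would still need a manual look.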
-
OK cool, good info, hope they fix it soon!! Any good ideas on how you can filter this crap out?
-
Hello Zack,
That is an issue that they are working on. I know this because I already discussed it with one of their help desk people. Here is the page that describes the changes: http://www.seomoz.org/blog/brand-new-open-site-explorer-is-here
In addition to that, here is some more information I can share with you:
You may see “questionable” links with weird file extensions. This is due to the crawler reaching much deeper into sites where there are more download links. We are looking into fixing this bug as soon as we can so these won’t be counted as links.