[Very Urgent] More 100 "/search/adult-site-keywords" Crawl errors under Search Console
-
I just opened my G Search Console and was shocked to see more than 150 Not Found errors under Crawl errors. Mine is a Wordpress site (it's consistently updated too):
Here's how they show up:
Example 1:
- URL: www.example.com/search/adult-site-keyword/page2.html/feed/rss2
- Linked From: http://an-adult-image-hosting.com/search/adult-site-keyword/page2.html
Example 2 (this surprised me the most when I looked at the linked from data):
-
URL: www.example.com/search/adult-site-keyword-2.html/page/3/
-
Linked From:
-
www.example.com/search/adult-site-keyword-2.html/page/2/ (this is showing as if it's from our own site)
-
http://a-spammy-adult-site.com/search/adult-site-keyword-2.html
Example 3:
- URL: www.example.com/search/adult-site-keyword-3.html
- Linked From: http://an-adult-image-hosting.com/search/adult-site-keyword-3.html
How do I address this issue?
-
Here is what I would do
-
Disavow the domain that is linking to you from the adult site(s).
-
The fact that Google search console is showing that you have an internal page linking as well makes me want to know a) have you always owned this domain and maybe someone previously did link internally like this or b) you may have been or are hacked
In the case of b) this can be really tricky. I once had a site that in a crawl it was showing sitewide links to various external sites that we should not be linking to. When I looked at the internal pages via my browser, there was no link as far as I could see even though it showed up on the crawler report.
Here was the trick. The hacker had setup a script to only show the link when a bot was viewing the page. Plus, we were running mirrored servers and they had only hacked one server. So, the links only showed up when you were spidering a specific mirrored instance as a bot.
So thanks to the hacking, not only were we showing bad links to bad sites, we were doing this through cloaking methodology. Two strikes against us. Luckily we picked this up pretty quick and fixed immediately.
Use a spidering program or browser program to show a user agent of Googlebot and go visit your pages that are linking internally. You might be surprised.
Summary
Googlebot has a very long memory. It may be that this was an old issue that was fixed long ago. If that was the case, just show the 404s for the pages that do not exist, and disavow the bad domain and move on. Make sure that you have not been hacked as this would also be why this is showing.
Regardless, the fact that Google did find it at one point, you need to make sure you resolve. Pull all the URLs into a spreadsheet and run Screaming Frog in list mode to check them all to make sure you fix all of it.
-
-
Yep.. Looking if anyone can help with this..
-
Oh yea, I missed that. That's very strange, not sure how to explain that one!
-
Thanks for the response Logan. What you are saying definitely makes sense.. But it makes think why do I see something like Example 2 under Crawl errors. Why Google Search Console shows linked from as 2 URL - one the spammy site's and other is from my own website. How is that even possible?
-
I've seen similar situations, but never in bulk and not with adult sites. Basically what's happening is somehow a domain (or multiple) are linking to your site with inaccurate URLs. When bots crawling those sites find the links pointing to yours, they obviously hit a 404 page which triggers the error in Search Console.
Unfortunately, there's not too much you can do about this, as people (or automated spam programs) can create a link to any site and any time. You could disavow links from those sites, which might help from an SEO perspective, but it won't prevent the errors from showing up in your Crawl Error report.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Site Title When Your Domain name is the keyword
Hello, I've been struggling with an issue for years. I own UsedCubicles.com. I'm ranked very well, the site generates leads and income on a regular basis. However, I never know what to name the site on the home page. The main keyword I go after is obviously "Used Cubicles". I also have product category named "Used Cubicles". I know google doesn't recognize my Used Cubicles product category as conical and reverts to the home page. I would rather Google use my categories more effectively to rank content or pages. My current site title on the home page is NOT GOOD. Used Cubicles | Usedcubicles.com. Im nervous to change it, however because Im not sure if its helping me rank for the keyword, which I do. For years I was #1 across the country but lately its been dropping. Any advice is appreciated.
Intermediate & Advanced SEO | | Grant06970 -
Scary bug in search console: All our pages reported as being blocked by robots.txt after https migration
We just migrated to https and created 2 days ago a new property in search console for the https domain. Webmaster Tools account for the https domain now shows for every page in our sitemap the warning: "Sitemap contains urls which are blocked by robots.txt."Also in the dashboard of the search console it shows a red triangle with warning that our root domain would be blocked by robots.txt. 1) When I test the URLs in search console robots.txt test tool all looks fine.2) When I fetch as google and render the page it renders and indexes without problem (would not if it was really blocked in robots.txt)3) We temporarily completely emptied the robots.txt, submitted it in search console and uploaded sitemap again and same warnings even though no robots.txt was online4) We run screaming frog crawl on whole website and it indicates that there is no page blocked by robots.txt5) We carefully revised the whole robots.txt and it does not contain any row that blocks relevant content on our site or our root domain. (same robots.txt was online for last decade in http version without problem)6) In big webmaster tools I could upload the sitemap and so far no error reported.7) we resubmitted sitemaps and same issue8) I see our root domain already with https in google SERPThe site is https://www.languagecourse.netSince the site has significant traffic, if google would really interpret for any reason that our site is blocked by robots we will be in serious trouble.
Intermediate & Advanced SEO | | lcourse
This is really scary, so even if it is just a bug in search console and does not affect crawling of the site, it would be great if someone from google could have a look into the reason for this since for a site owner this really can increase cortisol to unhealthy levels.Anybody ever experienced the same problem?Anybody has an idea where we could report/post this issue?0 -
Search engine keyword rank - easiest way to check the keywords that rank across website
What is the best keyword ranking tool you have used? I have used various tools by which I am expected to identify and input the keyword I want to track... However, recently I was introduced to Searchmetrics, which I think automatically pulls in the keywords a website ranks for, without the need for manual input from the SEO (I haven't used this tool yet, so apologies if I am incorrect!). Do any other rank trackers work like this? Thanks, Luke
Intermediate & Advanced SEO | | McTaggart1 -
Something happened within the last 2 weeks on our WordPress-hosted site that created "duplicates" by counting www.company.com/example and company.com/example (without the 'www.') as separate pages. Any idea what could have happened, and how to fix it?
Our website is running through WordPress. We've been running Moz for over a month now. Only recently, within the past 2 weeks, have we been alerted to over 100 duplicate pages. It appears something happened that created a duplicate of every single page on our site; "www.company.com/example" and "company.com/example." Again, according to our MOZ, this is a recent issue. I'm almost certain that prior to a couple of weeks ago, there existed both forms of the URL that directed to the same page without be counting as a duplicate. Thanks for you help!
Intermediate & Advanced SEO | | wzimmer0 -
Search engine blocked by robots-crawl error by moz & GWT
Hello Everyone,. For My Site I am Getting Error Code 605: Page Banned by robots.txt, X-Robots-Tag HTTP Header, or Meta Robots Tag, Also google Webmaster Also not able to fetch my site, tajsigma.com is my site Any expert Can Help please, Thanx
Intermediate & Advanced SEO | | falguniinnovative0 -
URL Errors for SmartPhone in Google Search Console/Webmaster Tools
Howdy all, In recent weeks I have seen a steady increase in the number of smartphone related url errors on Googles Search Console (formerly webmaster tools). THe crawler appears to be searching for a /m/ or /mobile/ directory within the URLs. Why is it doing this? Any insight would be greatly appreciated. Unfortunately this is for an unresponsive site, would setting the viewport help stop the issue for know until my new responsive site is launched shortly. Cheers fello Mozzers 🙂 Tim NDh1RNs
Intermediate & Advanced SEO | | TimHolmes1 -
Domain Authority... http://www.domain.com/ vs. http://domain.com vs. http://domain.com/
Hey Guys, Looking at Page Authority for my Site and ranking them in Decending Order, I see these 3 http://www.domain.com/ | Authority 62 http://domain.com | Authority 52 http://domain.com/ | Authority 52 Since the first one listed has the highest Authority, should I be using a 301 redirects on the lower ranking variations (which I understand how works) or should I be using rel="canonical" (which I don't really understand how it works) Also, if this is a problem that I should address, should we see a significant boost if fixed? Thanks ahead of time for anyone who can help a lost sailor who doesn't know how to sail and probably shouldn't have left shore in the first place. Cheers ZP!
Intermediate & Advanced SEO | | Mr_Snack0 -
Improving keyword stuffing and site menu
I'm doing some work on http://flyawaysimulation.com/ and would like some tips to avoid keyword stuffing for the term "flight simulator" on this page. Currently, there are about 42 occurrences of the term "flight simulator" on this page - which is above the SEOmoz 15 term recommended limit. However, looking at the page - they have used the term fairly and it doesn't appear that "inappropriate". Could you guys have a look and let me know your thoughts? What can be done? Where can the repetitions be removed from without hurting internal SEO? Also, from an SEO standpoint how could we improve the site menu? For example, if the word "downloads" were removed from the Downloads-->FSX Downloads, FS2004 Downloads, X-Plane Downloads menu (for example just FSX, FS2004, X-Plane) to remove repetition of the phrase - would engines understand that those are "Downloads" because of it being under the "Downloads" drop down list? Your thoughts, suggestions and ideas are greatly appreciated.
Intermediate & Advanced SEO | | Peter2640