Issue with site not being properly found in Google
-
We have a website [domain name removed] that is not being properly found in Google. When we run it through Screaming Frog, it indicates that there is a problem with the robots.txt file.
However, I am unsure exactly what this problem is, and why this site is no longer properly being found.
Any help here on how to resolve this would be appreciated!
-
Note: We've edited and removed select links and images in this thread as requested by the OP for privacy.
-
Hi Thomas,
Thanks for all your help here. You've been fantastic!
We have had an issue generating a sitemap for our website using our usual sitemap creation tools. Could you explain why this is?
-
--
I noticed your robots.txt is fixed, but I would recommend two things to get your site back into the index faster: fetch your site as Googlebot, and add your XML sitemap to Webmaster Tools.
Please do not forget to add all four versions of your website to Webmaster Tools if they have not been added already.
By that I mean add every URL below to Google Webmaster Tools, with and without www.
Then target the site to the fourth, or canonical, URL. Choose the one with www.
Here is a reference from Google:
https://support.google.com/webmasters/answer/34592?hl=en&ref_topic=4564315
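To make the "four versions" concrete, here is a minimal Python sketch that lists them. It assumes the four versions are the http/https and www/non-www combinations of the same host; `site_variants` is a hypothetical helper, and `www.website.com` is the placeholder domain used in this thread.

```python
from urllib.parse import urlsplit, urlunsplit


def site_variants(url):
    """Return the http/https and www/non-www variants of a site's home URL."""
    host = urlsplit(url).netloc.lower()
    # Strip a leading "www." so both host forms can be rebuilt from the bare name.
    bare = host[4:] if host.startswith("www.") else host
    variants = []
    for scheme in ("http", "https"):
        for name in (bare, "www." + bare):
            variants.append(urlunsplit((scheme, name, "/", "", "")))
    return variants


variants = site_variants("https://www.website.com/")
for variant in variants:
    print(variant)
```

Each printed URL would be added as its own property, with the https www one chosen as the preferred version.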
First, I would add my sitemap to my robots.txt file, because it is going to help you if you use search tools.
You should set up your robots.txt just like this:
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://www.website.com/sitemap_index.xml
You can reference
https://yoast.com/ultimate-guide-robots-txt/
Allow directive
While not in the original “specification”, there was talk of an allow directive very early on. Most search engines seem to understand it, and it allows for simple, and very readable directives like this:
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
The only other way of achieving the same result without an allow directive would have been to specifically disallow every single file in the wp-admin folder.
I suggest this because you don't want your login showing up in Google.
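For anyone wondering how the Disallow and Allow lines interact, here is a simplified Python sketch of the "most specific rule wins" behavior that Google documents: the longest matching path wins, and Allow wins a tie. It only handles plain path prefixes (no `*` or `$` wildcards), and `is_allowed` and `RULES` are hypothetical names for illustration, not part of any library.

```python
def is_allowed(path, rules):
    """Decide whether a path may be crawled.

    rules is a list of (directive, prefix) pairs. The longest matching
    prefix wins; on equal length, Allow wins. No match means allowed.
    """
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        if prefix and path.startswith(prefix):
            is_allow = directive.lower() == "allow"
            longer = len(prefix) > best_len
            tie_for_allow = len(prefix) == best_len and is_allow
            if longer or tie_for_allow:
                best_len = len(prefix)
                allowed = is_allow
    return allowed


RULES = [
    ("Disallow", "/wp-admin/"),
    ("Allow", "/wp-admin/admin-ajax.php"),
]

for path in ("/wp-admin/admin-ajax.php", "/wp-admin/options.php", "/blog/hello/"):
    print(path, is_allowed(path, RULES))
```

Note that not every parser follows this rule. As far as I know, Python's built-in `urllib.robotparser` checks rules in file order instead, so it can give a different answer for the same file.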
After that, I would go into Webmaster Tools/Search Console and fetch as Googlebot.
Ask Google to re-crawl your URLs
If you’ve recently made changes to a URL on your site, you can update your web page in Google Search with the _Submit to Index_ function of the Fetch as Google tool. This function allows you to ask Google to crawl and index your URL.
See
http://searchengineland.com/how-to-use-fetch-as-googlebot-like-seo-samurai-214292
https://support.google.com/webmasters/answer/6066468?hl=en
Ask Google to crawl and index your URL
- Click Submit to Index, shown next to the status of a recent, successful fetch in the Fetches Table.
- Select **Crawl only this URL** to submit one individual URL to Google for re-crawling. You can submit up to 500 individual URLs in this way within a 30 day period.
- Select **Crawl this URL and its direct links** to submit the URL as well as all the other pages that URL links to for re-crawling. You can submit up to 10 requests of this kind within a 30 day period.
- Click Submit to let Google know that your request is ready to be processed.
Adding your XML sitemap to Google Webmaster Tools
https://www.website.com/sitemap_index.xml
will help Google determine that you are back online. You should not see any real fallout from this. Submitting a complete XML sitemap also gets a lot of images into Google Images.
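If you want to check what a sitemap index actually contains before submitting it, a small Python sketch like this can list the child sitemaps. The sample XML and the child file names (`post-sitemap.xml`, `page-sitemap.xml`) are made up for illustration; only the sitemaps.org namespace is standard. In practice you would feed in the contents of your own sitemap_index.xml.

```python
import xml.etree.ElementTree as ET

# Standard namespace used by sitemap and sitemap index files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample; a real index would be downloaded from the site.
SAMPLE_INDEX = """<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://www.website.com/post-sitemap.xml</loc></sitemap>
  <sitemap><loc>https://www.website.com/page-sitemap.xml</loc></sitemap>
</sitemapindex>
"""


def child_sitemaps(xml_text):
    """Return the <loc> URL of every child sitemap in a sitemap index."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [loc.text.strip() for loc in root.findall("sm:sitemap/sm:loc", SITEMAP_NS)]


for url in child_sitemaps(SAMPLE_INDEX):
    print(url)
```

An empty list here would suggest the file is not a valid sitemap index, which is worth ruling out before submitting it.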
I hope this helps,
Tom
-
Hi, it seems your robots.txt file is blocking Google and all other bots that search the web and obey robots.txt (basically the good ones). So if you would like your site to be seen and indexed by Google and other search engines, you need to remove the forward slash "/" from the Disallow line,
shown here in your robots.txt file:
Block all web crawlers from all content
User-agent: *
Disallow: /
Go here to see it:
https://www.website.com/robots.txt
Please read https://moz.com/learn/seo/robotstxt
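To see the effect of that single slash, here is a short Python sketch using the standard library's `urllib.robotparser`. It parses the two files from strings instead of fetching them, and the page URL is just an example on the placeholder domain.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# The file as it was: every crawler is blocked from everything.
BLOCKING = "User-agent: *\nDisallow: /\n"
# The same file with the slash removed: nothing is blocked.
OPEN = "User-agent: *\nDisallow:\n"

blocked = build_parser(BLOCKING)
opened = build_parser(OPEN)
page = "https://www.website.com/some-page/"

print(blocked.can_fetch("Googlebot", page))  # False
print(opened.can_fetch("Googlebot", page))   # True
```

The same check can be pointed at a live file with `set_url()` and `read()` once the fix is deployed.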
-
Use this tool to make the file: http://tools.seobook.com/robots-txt/generator/
It looks like you're using WordPress, so if you're using Apache or Yoast SEO you can go in and set it to use the rules below. I added your XML sitemap https://www.brightonpanelworks.com.au/sitemap_index.xml
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://www.website.com/sitemap_index.xml
You can use tools like this to analyze and fix robots.txt, and you can always see it by adding /robots.txt after the .com or other TLD.
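As a rough illustration of both points, this Python sketch builds the robots.txt address from any page URL and reads the Sitemap line back out of the suggested file. `robots_url` is a hypothetical helper, the contact page URL is made up, and `site_maps()` needs Python 3.8 or newer.

```python
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


def robots_url(page_url):
    """robots.txt always lives at the root of the host."""
    return urljoin(page_url, "/robots.txt")


SUGGESTED = """User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://www.website.com/sitemap_index.xml
"""

location = robots_url("https://www.website.com/contact/")
parser = RobotFileParser(location)
parser.parse(SUGGESTED.splitlines())  # parsed from the string, nothing is fetched

print(location)            # https://www.website.com/robots.txt
print(parser.site_maps())  # ['https://www.website.com/sitemap_index.xml']
```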
I hope that helps,
Tom
-