Disallow: /search/ in robots but soft 404s are still showing in GWT and Google search?
-
Hi guys, I've already added the following rule to robots.txt to prevent search engines from crawling the dynamic pages produced by my website's search feature: Disallow: /search/. But soft 404s are still showing in Google Webmaster Tools. Do I need to wait? (It's been almost a week since I added the rule to my robots.txt.) Thanks, JC
-
You could also look at using the meta robots noindex tag on /search/ pages, rather than just blocking them in robots.txt, as this will remove existing URLs from the index once they are recrawled.
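For anyone who wants to check whether a /search/ page actually carries the tag, here is a minimal sketch using only Python's standard library. The sample HTML and the helper names (RobotsMetaParser, has_noindex) are made up for illustration and are not taken from the site in question. One thing to keep in mind: a crawler has to be able to fetch the page to see the tag, so a noindex on a URL that is also disallowed in robots.txt may never be read.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta robots>), so default to "".
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page declares noindex in a meta robots tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical search results page, for illustration only.
SAMPLE_PAGE = """<html><head>
<title>Search results</title>
<meta name="robots" content="noindex, follow">
</head><body>No results found.</body></html>"""

print(has_noindex(SAMPLE_PAGE))
```

This only inspects the HTML you pass in; it does not look at HTTP headers, so a noindex sent some other way would not be detected by this sketch.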
-
Glad to help
-
Thanks a lot Dan!
-
That is a good recommendation, but ultimately search engines make the final decision on crawl frequency. Take a look at your 'Crawl Stats' in GWT; this will give you an idea of how often your site is crawled.
-
Is the waiting time related to the crawl frequency of the URLs in my sitemap?
Thanks Dan, appreciate it.
-
You will probably need to wait a little longer; it depends on how often your site usually gets crawled and indexed.
However, robots.txt does not always stop search engines from indexing your pages. It stops them from crawling a page on your site, but it does not forbid them from indexing that URL. If they find links to it from external sites, the URL may still appear in the SERPs.
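To see what the Disallow rule does and does not cover, here is a rough sketch using Python's built-in urllib.robotparser. The example.com URLs and the may_crawl helper are placeholders for illustration, and the robots.txt content assumes the rule sits under a "User-agent: *" group, which the original post does not spell out.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: the rule from the question under a
# catch-all user-agent group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url: str, agent: str = "Googlebot") -> bool:
    """Return True if robots.txt allows this agent to fetch the URL."""
    return parser.can_fetch(agent, url)


# A search results URL is blocked from crawling...
print(may_crawl("https://www.example.com/search/?q=widgets"))
# ...while a normal page is still crawlable.
print(may_crawl("https://www.example.com/products/widgets"))
```

Note that this check only answers "may this URL be fetched?". It says nothing about whether the URL can appear in the index, which is the distinction made above.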
Related Questions
-
Website URL, Robots.txt and Google Search Console (www. vs non www.)
Hi MOZ Community,
I would like to request your kind assistance on domain URLs: www. vs non-www. Recently, my team moved to a new website, and a 301 redirect has been put in place.
Original URL: https://www.example.com.my/ (with www.)
New URL: https://example.com.my/ (without www.)
Our current robots.txt sitemap: https://www.example.com.my/sitemap.xml (with www.)
Our Google Search Console property: https://www.example.com.my/ (with www.)
Questions:
1. How should I standardize these so that Google's crawler can crawl my website effectively?
2. Do I have to change my website URLs back to the www. version, or do I just need to update my robots.txt?
3. How can I update my Google Search Console property to the non-www. version? I cannot see the option in the dashboard.
4. Are there any to-dos such as canonicalization, or should I wait for Google to detect and apply the change automatically, especially for the GSC property?
Really appreciate your kind assistance. Thank you,
Badiuzz
Technical SEO | Badiuzz
-
Google Search Console and User-declared canonical is actually Hreflang tag
Hey, we recently launched a US version of a UK-based ecommerce website on the us.example.com subdomain. Both websites are on Shopify, so canonical tags are handled automatically, and we have implemented hreflang tags across both websites. Suddenly our rankings in the UK have dropped, and after looking in Search Console for the UK site I've found that a lot of pages are no longer indexed in Google because the user-declared canonical is the hreflang tag for the US URL. Below is an example:
https://www.example.com/products/pac-man-arcade-cabinet - the product page, which is also the canonical URL
<link rel="alternate" href="https://www.example.com/products/pac-man-arcade-cabinet" hreflang="en-gb" /> - UK hreflang tag
<link rel="alternate" href="https://us.example.com/products/pac-man-arcade-cabinet" hreflang="en-us" /> - US hreflang tag
In Google Search Console, the user-declared canonical is https://us.example.com/products/pac-man-arcade-cabinet, but it should be https://www.example.com/products/pac-man-arcade-cabinet.
The UK website has been set to target the United Kingdom in Search Console, and the US website has been set to target the United States. Unfortunately, we also do not have access to the robots.txt file. Any help or insight would be greatly appreciated.
Technical SEO | PeterRubber
-
Should a login page for a payroll / timekeeping company be nofollow for robots.txt?
I am managing a timekeeping/payroll company. My question is about the customer login page. Would this typically be nofollow for robots?
Technical SEO | donsilvernail
-
Meta description not showing as per view source on Google results
On our website recyclingbins.co.uk, the meta description of the homepage under view source is: Recycling bins offers the largest range of recycling bins for schools, homes, offices and other venues. With free delivery on everything and lowest prices guaranteed.
But if you search for our website in Google, the meta description it shows is: Offers recycling bins for offices, schools and the home.
Someone has already suggested it must be cached. I do not think this is possible, as we are crawled fairly regularly and it has been like this for weeks and weeks. No one seems to have much idea. Could you possibly shed any light? I am not concerned from an SEO perspective, but more from a click-through perspective.
Thank you,
Jon
Technical SEO | imrubbish
-
Google ignores Meta name="Robots"
Ciao from 24 degrees C Wetherby, UK. On this page, http://www.perspex.co.uk/products/palopaque-cladding/, this line was added to block indexing: But it has not worked. When you google "Palopaque PVC Wall Cladding", the page appears in the SERPs. I'm going to upload a robots.txt file as a second attempt to block indexing, but my question is:
Why is it being indexed?
Grazie,
David
Technical SEO | Nightwing
-
How to remove my CDN subdomains from Google search results?
A few months ago I moved all my WordPress images to a subdomain. After I purchased a CDN service, I moved those images back to my root domain. I added User-agent: * Disallow: / to my CDN domain. But now, when I perform a site search on Google, I find that my CDN subdomains are indexed by Google. I think this will cause a duplicate content issue. I have already been hit by Penguin. How do I remove these search results from Google? Should I add my CDN domain to Webmaster Tools to submit a URL removal request? The problem is, if I use cdn.mydomain.com it shows my www.mydomain.com. My blog: http://goo.gl/58Utt Site search result: http://goo.gl/ElNwc
Technical SEO | Godad
-
Will same language, different region (US/UK) geotargeting via subdirectory (& GWT) cause dupe content or other issues?
A UK-hosted site on a .com now needs to target the US too, but for keywords that are spelt differently in the US. Is it OK to create a duplicate version of the UK-hosted .com site, put it in a subdirectory (.com/us/) and geotarget it to the USA via Webmaster Tools? I take it that in this scenario there are no dupe content issues (or other issues) so long as it is geotargeted via GWT? Or are there? Comments from anyone with experience doing something similar (same language, different region geotargeting of duplicate content, with keyword spelling being the only difference, via a subdirectory or another route) would be much appreciated. Many thanks 🙂
Technical SEO | Dan-Lawrence
-
SERP Meta Dependent Upon Search Query (strange Google bug?)
Hi, I have on-page optimised a client's website. Now take a look at the title tag and meta description of the front page. These are the correct updates I have made:
Title: Practice Management and Financial Consultants to the Health Industry
Description: Award winning Health and Life have been providing accounting, tax and practice management services for Medical, Dental, Allied Health businesses.
Now, take a look when the business name is Googled. Notice how the title tag switches back to the original, yet the description is correct. Now, take a look when the owner's name is Googled. The title tag is now correct, but the description is incorrect. I've set the preferred URL to be the www version. I've spent ages in the custom CMS trying to find what could be causing this. The developer says it's a "Google thing". Anyone have any ideas?
Technical SEO | LukeyJamo