Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. We are not removing the content completely, and many posts can still be viewed, but both new posts and new replies are locked.
Robots.txt file in Shopify - Collection and Product Page Crawling Issue
-
Hi, I am working on a large eCommerce store that has more than 1,000 products. We just moved platforms from WP to Shopify and are now getting a noindex issue. When I checked robots.txt, I found the code below, which is very confusing to me. **I don't understand what the rules below mean.**
- Disallow: /collections/*+*
- Disallow: /collections/*%2B*
- Disallow: /collections/*%2b*
- Disallow: /blogs/*+*
- Disallow: /blogs/*%2B*
- Disallow: /blogs/*%2b*
My understanding is that my robots.txt stops search engines from crawling and indexing all of my product pages (/collections/*+*). Is this the rule that is affecting the indexing of the product pages?
Please explain how this robots.txt works in Shopify. Also, once my pages have been crawled and indexed by google.com, what is the use of Disallow?
Thanks.
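For readers puzzling over the same rules, here is a minimal sketch of how such Disallow lines can be read. It assumes Google-style matching, where a rule is compared from the start of the URL path, `*` stands for any run of characters, and a trailing `$` anchors the end. The helper names and the sample paths are hypothetical, and the sketch ignores Allow rules and user-agent groups.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Turn one robots.txt path rule into a regex (assumed Google-style wildcards)."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Everything except "*" is literal, so "+" and "%2B" are plain characters here.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    if anchored:
        pattern += "$"
    return re.compile(pattern)


def is_blocked(path: str, disallow_rules: list[str]) -> bool:
    """Return True if any non-empty Disallow rule matches the path from its start."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules if rule)


# The wildcard form of the collection rules quoted in the question.
rules = ["/collections/*+*", "/collections/*%2B*", "/collections/*%2b*"]

# Hypothetical paths, for illustration only.
for path in (
    "/collections/shirts",
    "/collections/shirts+blue",
    "/collections/shirts%2Bblue",
    "/collections/shirts/products/product1",
):
    print(path, "blocked" if is_blocked(path, rules) else "allowed")
```

Under these assumptions, only collection URLs that contain a plus sign (or its %2B encoding) are blocked. Plain collection pages and product pages are not matched by these rules.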
-
Make sure products are in your sitemap and it has been re-submitted. You can also submit your products to request indexing for them in Google Search Console.
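One way to check the first point is to compare the product URLs you expect against what the sitemap lists. The sketch below is a minimal example that assumes a single standard `urlset` sitemap passed in as text. The function names and sample URLs are hypothetical, and a sitemap index that points to child sitemaps would need each child fetched and checked in the same way.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_in_sitemap(xml_text: str) -> set[str]:
    """Collect every <loc> value found in the sitemap XML."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text}


def missing_from_sitemap(expected_urls: list[str], xml_text: str) -> list[str]:
    """Return the expected URLs that the sitemap does not list, in input order."""
    listed = urls_in_sitemap(xml_text)
    return [url for url in expected_urls if url not in listed]


# Hypothetical sitemap content and product URLs, for illustration only.
sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://www.abc.com/products/product1</loc></url>"
    "</urlset>"
)
expected = [
    "https://www.abc.com/products/product1",
    "https://www.abc.com/products/product2",
]
print(missing_from_sitemap(expected, sample_sitemap))
```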
-
Thank you for replying.
Our main issue is that all of our collection pages have already been crawled, but the product pages have not been crawled yet. We can't figure out whether this is a robots.txt issue or some other crawling issue.
For example: the "www.abc.com/collection/" page is crawled, but the "www.abc.com/collection/product1/" page has not been crawled.
Please share some tips.
-
While you may not want that content indexed, it is still valuable for it to be crawled so that search engines can reach your most important content, such as products.
If you block your /collections pages in robots.txt, Google cannot see the noindex in those pages' meta robots tag, which causes a problem for you. Consider allowing robots to crawl your /collections pages but noindexing them if they are low-value or duplicative.
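The interaction described above can be sketched in a few lines. This is only an illustration with hypothetical names: it assumes a crawler that never fetches a page blocked by robots.txt, so a noindex in that page's meta robots tag is only ever read when crawling is allowed.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def noindex_is_visible(html: str, blocked_by_robots_txt: bool) -> bool:
    """Return True only if the crawler may fetch the page and the page says noindex."""
    if blocked_by_robots_txt:
        return False  # the page is never fetched, so the tag is never read
    parser = MetaRobotsParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical collection page, for illustration only.
page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(noindex_is_visible(page, blocked_by_robots_txt=False))
print(noindex_is_visible(page, blocked_by_robots_txt=True))
```

The second call shows the problem: with crawling blocked, the noindex instruction is never seen, so the two settings work against each other.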