Robots.txt file in Shopify - Collection and Product Page Crawling Issue
-
Hi, I am working on a large eCommerce store with more than 1,000 products. We just moved the platform from WP to Shopify and are now getting a noindex issue. When I checked robots.txt, I found the rules below, which are very confusing to me. **I don't understand what the rules below mean.**
- Disallow: /collections/+
- Disallow: /collections/%2B
- Disallow: /collections/%2b
- Disallow: /blogs/+
- Disallow: /blogs/%2B
- Disallow: /blogs/%2b
As I understand it, my robots.txt is stopping search engines from crawling and indexing all of my product pages. Is this rule ( collection/*+* ) the one that is affecting the indexing of product pages?
Please explain how this robots.txt works in Shopify, and once my pages have been crawled and indexed by Google, what is the use of Disallow?
Thanks.
-
Make sure products are in your sitemap and it has been re-submitted. You can also submit your products to request indexing for them in Google Search Console.
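On the robots.txt rules themselves, here is a rough sketch for testing which paths those Disallow lines would catch. It assumes Google-style matching, where a rule is a path prefix, `*` matches any run of characters, and `$` anchors the end. The function names and sample paths are made up for illustration, and the `*+*` wildcard form is only an assumption based on the pattern mentioned in the question; check it against the live robots.txt. Percent-decoding the whole path is a simplification of what real crawlers do.

```python
import re
from urllib.parse import unquote

# Rules as quoted in the question.
LITERAL_RULES = ["/collections/+", "/collections/%2B", "/collections/%2b",
                 "/blogs/+", "/blogs/%2B", "/blogs/%2b"]
# Assumption: the wildcard form hinted at by "collection/*+*" in the question.
WILDCARD_RULES = ["/collections/*+*", "/collections/*%2B*", "/collections/*%2b*",
                  "/blogs/*+*", "/blogs/*%2B*", "/blogs/*%2b*"]

def rule_to_regex(rule: str) -> re.Pattern:
    """Turn a Disallow path into a regex: '*' is a wildcard, a trailing '$' anchors the end."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Percent-decode each literal piece so '%2B' and '%2b' behave like '+'.
    pieces = [re.escape(unquote(piece)) for piece in rule.split("*")]
    return re.compile("^" + ".*".join(pieces) + ("$" if anchored else ""))

def is_blocked(path: str, rules: list[str]) -> bool:
    """Return True if any Disallow rule matches the (decoded) URL path."""
    decoded = unquote(path)
    return any(rule_to_regex(rule).match(decoded) for rule in rules)

if __name__ == "__main__":
    # Hypothetical paths, not taken from the real store.
    samples = [
        "/collections/shoes",
        "/collections/shoes/products/product1",
        "/collections/shoes/red+leather",
        "/collections/shoes/red%2Bleather",
    ]
    for path in samples:
        print(path, "blocked" if is_blocked(path, WILDCARD_RULES) else "allowed")
```

Under these assumptions, only paths under /collections/ or /blogs/ that contain a "+" (or its encoded form) are matched, so an ordinary collection or product URL would not be blocked by these particular lines.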
-
Thank you for replying,
But our main issue is that all of our collection pages have already been crawled, while the product pages haven't been crawled yet. We can't figure out whether it's a robots.txt issue or some other crawling issue.
For example: the "www.abc.com/collection/" page has been crawled, but the "www.abc.com/collection/product1/" page hasn't.
Please share some tips.
-
While you may not want certain content indexed, it's still valuable for crawlers to be able to reach your most important content, like products.
If you are blocking your /collections pages in robots.txt, Google will not be able to see the meta robots tag set to noindex on those pages, causing an issue for you. You may consider allowing robots to crawl your /collections pages but noindexing them if they are low value or duplicative.
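To see which pages actually carry a noindex directive, something like the following could help. It is a minimal sketch that inspects HTML you have already fetched; the class and function names and the sample markup are hypothetical, and it only looks at meta tags, not at the `X-Robots-Tag` HTTP header.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]

def has_noindex(html: str) -> bool:
    """Return True if the page's meta robots directives include noindex (or 'none')."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # 'none' is shorthand for 'noindex, nofollow'.
    return "noindex" in parser.directives or "none" in parser.directives

if __name__ == "__main__":
    # Hypothetical page source for illustration.
    sample = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(has_noindex(sample))
```

A crawler can only read this tag if robots.txt lets it fetch the page in the first place, which is why blocking and noindexing the same URL work against each other.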