Any idea why pages are not being indexed?
-
Hi Everyone,
One section of our website is not being indexed. The product pages are, but some of the subcategories are not. These are very old pages, so I thought it was strange. Here is one example:
https://www.moregems.com/loose-cut-gemstones/prasiolite-loose-gemstones.html
If you search for a chunk of text from the page, it is not found in Google. There are no issues in Bing/Yahoo, only Google. Do you think it takes a submission to Search Console?
Jeff
-
So I am testing removing some of the restrictions in the robots.txt file to see if that helps, as I still can't get the page indexed.
-
Yeah...it's very close to what I have. I also checked other websites I own with the same category structure and robots.txt file...no issues.
I even checked other subcats on www.moregems.com, and no issues. It seems to be all the pages under "GEMSTONES" that are not being indexed. Any thoughts there?
-
I usually write a custom robots.txt for Magento sites, but I did find a good example to use. Check out this site: https://www.magikcommerce.com/blog/set-up-robots-txt-in-magento/
I would edit anything that doesn't fit your site.
Hope this helps!
-
So I submitted https://www.moregems.com/loose-cut-gemstones/prasiolite-loose-gemstones.html and fetched it in Google Search Console a few hours ago. Still not being indexed. I don't see any issues in the robots.txt file. Any thoughts?
-
Hi Nicholas,
I asked him this as well, but do you have any resources for a "good" Magento specific robots.txt file? I want to try updating it, as it has been the same for about 7 years.
The strange thing is the deeper product pages are indexed, but not the subcats.
-
Hi Christian,
Do you have any resources for a recommended Magento robots.txt file? I added this probably 6-7 years ago, and have not updated it since. I can definitely try that.
Jeff
-
Hi Jeff,
In addition to Christian's recommendation (which I would do first), use Google Search Console's Fetch & Render Tool to submit your non-indexed pages to Google's index. Sometimes this tool in GSC will get them indexed immediately.
It is not uncommon for deep links, or internal pages of internal pages, to not be indexed immediately. It is definitely important to use new pages to link out to other pages on your website and, if possible, to go in and link to your new pages from older (already indexed) pages on your website.
-
Hey Jeff,
I just ran a quick scan of the site. It looks like you have a lot of links, pages, and directories being blocked by your robots.txt file: https://www.moregems.com/robots.txt
I would make sure the pages you want to be indexed by search engines are not being blocked in your robots.txt.
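One way to run that check yourself is to test a few URLs against the robots.txt rules with Python's standard `urllib.robotparser`. This is only a rough sketch: the rules in `SAMPLE_ROBOTS_TXT` are made up for illustration (they are not the actual moregems.com file), the second URL and the `blocked_urls` helper are hypothetical, and Python's parser uses simple prefix matching, so it may not treat wildcard rules the same way Googlebot does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT the real moregems.com robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Disallow: /loose-cut-gemstones/
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


urls = [
    # Subcategory URL from the question above.
    "https://www.moregems.com/loose-cut-gemstones/prasiolite-loose-gemstones.html",
    # Hypothetical product URL, used only as a contrast case.
    "https://www.moregems.com/some-product.html",
]

for url in blocked_urls(SAMPLE_ROBOTS_TXT, urls):
    print("blocked:", url)
```

To test the live file instead, paste the contents of your own robots.txt into `SAMPLE_ROBOTS_TXT` and list the subcategory URLs that are missing from the index.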