Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Problems in indexing a website built with Magento
- Hi all,
  My name is Riccardo and I work for a web marketing agency. Recently we have been having problems getting this website indexed: www.farmaermann.it, which is built on Magento. According to Google Webmaster Tools, the sitemap is fine (no errors) and was uploaded correctly. However, only 72 of 1,772 URLs have been indexed; we submitted the sitemap in Google Webmaster Tools 8 days ago. We also checked the structure of the robots.txt against several Magento guides, and it looks well structured too.
  In addition, we noticed that some pages show titles in Google search results that do not match the page titles defined in the Magento backend. In short, we cannot tell whether these indexing problems are related to the sitemap, the robots.txt or something else.
  Has anybody had the same kind of problem? Thank you all for your time and consideration.
  Riccardo
- Hi Dan! Thank you very much for your help and suggestions. I will try to follow your guidelines as well.
  Riccardo
- Thank you Linda! We will try it and see what happens.
  Riccardo
- However, you should allow Google to crawl your JavaScript and CSS (which are currently blocked). Here's some background info on that:
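One way to confirm whether asset files are blocked is to test a few script and stylesheet URLs against the robots.txt rules. The sketch below uses Python's standard `urllib.robotparser`. The Disallow lines and the example.com asset URLs are assumptions for illustration, since the thread does not quote the rules that block JavaScript and CSS. Note that this parser does plain prefix matching and does not interpret Google-style wildcards such as `*` or `$` inside paths.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: these rules are illustrative only; the thread does not show
# which robots.txt lines block the site's JavaScript and CSS.
ASSUMED_RULES = [
    "User-agent: *",
    "Disallow: /js/",
    "Disallow: /skin/",
]

# ASSUMPTION: placeholder asset URLs, not taken from the real site.
ASSET_URLS = [
    "https://www.example.com/js/app.js",
    "https://www.example.com/skin/frontend/styles.css",
    "https://www.example.com/media/logo.png",
]


def blocked_urls(rules, urls, agent="Googlebot"):
    """Return the URLs that the given robots.txt rules forbid for `agent`."""
    parser = RobotFileParser()
    parser.parse(rules)  # accepts an iterable of robots.txt lines
    return [url for url in urls if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    for url in blocked_urls(ASSUMED_RULES, ASSET_URLS):
        print("blocked:", url)
```

Running the same check again after editing the rules shows whether the asset URLs have become crawlable.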
- Hi Riccardo,
  Yes, to confirm: the site is indexed and crawlable. Counting how many sitemap URLs are reported as indexed isn't the most reliable way to see whether your content is indexed. A site: search on your domain in Google is probably one of the most reliable ways. You can also try crawling the site with a tool like Screaming Frog SEO Spider; if the tool can crawl everything, there may just be a delay on Google's end. But in your case, all looks good now!
  -Dan
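The crawl check described above can be approximated by comparing the sitemap's URL list with whatever a crawler managed to reach. This is a minimal sketch using only the Python standard library. The two-entry sample sitemap and the crawled list are assumptions for illustration (the URLs themselves are ones mentioned elsewhere in this thread), and `sitemap_urls` and `not_crawled` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# ASSUMPTION: a two-entry sample; the real sitemap lists 1,772 URLs.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.farmaermann.it/integratori.html</loc></url>
  <url><loc>http://www.farmaermann.it/contraccettivi-e-gravidanza.html</loc></url>
</urlset>
"""

# ASSUMPTION: URLs a crawler reported as reachable (hypothetical export).
SAMPLE_CRAWLED = ["http://www.farmaermann.it/integratori.html"]


def sitemap_urls(xml_text):
    """Extract every <loc> value from a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]


def not_crawled(sitemap, crawled):
    """Return sitemap URLs the crawler never reached, in sitemap order."""
    seen = set(crawled)
    return [url for url in sitemap if url not in seen]


if __name__ == "__main__":
    urls = sitemap_urls(SAMPLE_SITEMAP)
    print(len(urls), "URLs in sitemap")
    for url in not_crawled(urls, SAMPLE_CRAWLED):
        print("not reached by crawler:", url)
```

An empty result suggests the site is crawlable and any gap in reported indexing may simply be a delay.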
- Hi Riccardo,
  Since I do not know which pages exist on your site, I cannot be 100% sure. You can remove the following lines from your robots.txt, though, and see what happens (in Google Search Console & Bing Webmaster Tools):
    Allow: /*?p=
    Allow: /catalog/seo_sitemap/category/
    Allow: /catalogsearch/result/
  Good luck!
- Hi Linda!
  Unfortunately we didn't develop the website, but we have to work on its optimization. You are probably right about the robots.txt, because the sitemap looks OK. I will try to remove the crawl delay. On the other hand, which disallow rules should I remove, or what changes should I make in particular? Thank you very much for your help!
  Riccardo
- Hi Josh! Thank you very much for your help!
  So there is probably a delay in the Webmaster Tools data. Unfortunately we didn't develop the site; we only work on its optimization, so we are a little confused by these data.
- Hi Riccardo,
  Your home page is indexed. Your problems are most likely caused by the robots.txt: http://www.farmaermann.it/robots.txt
  1. You set a crawl delay of 10 seconds for all bots, which is quite long.
    User-agent: *
    Crawl-delay: 10
  2. Some of your pages are not allowed to be crawled, like these in your menu: http://www.farmaermann.it/integratori.html and http://www.farmaermann.it/contraccettivi-e-gravidanza.html
    Allow: /*?p=
    Allow: /catalog/seo_sitemap/category/
    Allow: /catalogsearch/result/
  My advice is to modify your robots.txt: remove the crawl delay (and check whether your server can handle that) and make sure the pages in your menu can be crawled.
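To get a feel for what the crawl delay costs and to test whether specific pages are blocked, the rules can be loaded into Python's standard `urllib.robotparser`. This is only a sketch: the `User-agent` and `Crawl-delay` lines are the ones quoted in the answer, but the `Disallow` line is an assumption made up for illustration, because the thread does not show which rule blocks the menu pages. The helper names are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The User-agent and Crawl-delay lines are quoted in the answer above.
# ASSUMPTION: the Disallow line is invented for illustration; the thread
# does not show which rule blocks the menu pages.
ROBOTS_LINES = [
    "User-agent: *",
    "Crawl-delay: 10",
    "Disallow: /integratori.html",
]

# Menu pages named in the answer.
MENU_URLS = [
    "http://www.farmaermann.it/integratori.html",
    "http://www.farmaermann.it/contraccettivi-e-gravidanza.html",
]

SITEMAP_URL_COUNT = 1772  # figure reported in the question


def load_rules(lines):
    """Parse robots.txt lines into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


def hours_for_one_pass(url_count, delay_seconds):
    """Lower bound on crawl time for a bot that waits the full delay per URL."""
    return url_count * delay_seconds / 3600


if __name__ == "__main__":
    rules = load_rules(ROBOTS_LINES)
    delay = rules.crawl_delay("*")
    print("crawl delay (s):", delay)
    print("hours per full pass:", round(hours_for_one_pass(SITEMAP_URL_COUNT, delay), 1))
    for url in MENU_URLS:
        print(url, "allowed" if rules.can_fetch("*", url) else "blocked")
```

With a 10 second delay, a bot that honors the directive would need roughly 4.9 hours for a single pass over 1,772 URLs, which is why removing the delay is worth testing if the server can handle the load.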