Problems in indexing a website built with Magento
-
Hi all
My name is Riccardo and I work for a web marketing agency. Recently we have been having some problems indexing this website, www.farmaermann.it, which is based on Magento.
In particular, according to Google Webmaster Tools, the website's sitemap is OK (no errors) and was uploaded correctly. However, only 72 of 1,772 URLs have been indexed; we submitted the sitemap in Google Webmaster Tools 8 days ago. We checked the structure of the robots.txt against several Magento guides, and it also looks well structured.
In addition, we noticed that some pages appear in Google search results with titles that do not match the page titles defined in the Magento backend. In short, we cannot tell whether these indexing problems are related to the sitemap, the robots.txt or something else.
Has anybody had the same kind of problem? Thank you all for your time and consideration.
Riccardo
-
Hi Dan!
Thank you very much for your help and suggestions. I will try to follow your guidelines also.
Riccardo
-
Thank you Linda!
We will try and we will see what happens.
Riccardo
-
However, you should allow Google to crawl your JavaScript and CSS (which is now blocked). Here's some background info on that:
-
Hi Riccardo
Yes, to confirm: the site is indexed and crawlable. Checking how many URLs from a sitemap are indexed isn't the most reliable way to see whether your content is indexed. One of the most reliable ways is to run a site: search on your domain in Google (for example, site:farmaermann.it). You can also try crawling the site with a tool like Screaming Frog SEO Spider; if the tool can crawl everything, there may just be a delay on Google's end. But in your case, all looks good now!
-Dan
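
For anyone who wants to run the comparison Dan describes, a first step is to pull the list of URLs out of the sitemap so they can be fed to a crawler or checked one by one. The following is only a minimal sketch: the sample sitemap and the `sitemap_urls` helper are made up for illustration, and the only fixed detail is the standard sitemaps.org XML namespace.

```python
import xml.etree.ElementTree as ET

# Standard namespace used by sitemaps.org-style sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content for illustration; not the real site's sitemap.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/category.html</loc></url>
  <url><loc>http://www.example.com/product-1.html</loc></url>
</urlset>
"""


def sitemap_urls(xml_text):
    """Return every <loc> value listed in a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


urls = sitemap_urls(SAMPLE_SITEMAP)
print(len(urls), "URLs listed in the sitemap")
for url in urls:
    print(url)
```

The resulting list can then be compared against what a crawler actually reaches, which is a more direct check than the indexed count reported for the sitemap.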
-
Hi Riccardo,
Since I do not know which pages exist on your site, I cannot be 100% sure. You can remove the following from your robots.txt, though, and see what happens (in Google Search Console and Bing Webmaster Tools):
Allow: /*?p=
Allow: /catalog/seo_sitemap/category/
Allow: /catalogsearch/result/
Good luck!
-
Hi Linda!
Unfortunately, we didn't develop the website, but we have to work on its optimization. You are probably right about the robots.txt, because the sitemap looks OK. I will try removing the crawl delay. Which disallow rules should I remove, or which changes should I make in particular?
Thank you very much for your help!
Riccardo
-
Hi Josh!
Thank you very much for your help!
So there is probably a delay in the Webmaster Tools data. Unfortunately, we didn't develop the site; we only work on its optimization, so we are a little confused by this data.
-
Hi Riccardo,
Your home page is indexed.
Your problems are most likely caused by the robots.txt: http://www.farmaermann.it/robots.txt
1. You set a crawl delay of 10 seconds for all bots, which is quite long.
User-agent: *
Crawl-delay: 10
2. Some of your pages are not allowed to be crawled, like these in your menu: http://www.farmaermann.it/integratori.html and http://www.farmaermann.it/contraccettivi-e-gravidanza.html
Allow: /*?p=
Allow: /catalog/seo_sitemap/category/
Allow: /catalogsearch/result/
My advice is to modify your robots.txt: remove the crawl delay (and check whether your server can handle that) and make sure the pages in your menu can be crawled.
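
A quick way to test this kind of robots.txt advice before and after a change is to parse the file locally and ask which URLs a bot may fetch. Below is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt text, the example.com URLs and the `check_urls` helper are all hypothetical and are not taken from the site discussed here. One caveat: as far as I know, the standard-library parser matches rules by simple path prefix and does not implement Google-style `*` wildcards, so patterns such as `/*?p=` may be evaluated differently than Googlebot would evaluate them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration; not the site's actual file.
SAMPLE_ROBOTS = """\
User-agent: *
Crawl-delay: 10
Disallow: /checkout/
"""


def check_urls(robots_text, urls, user_agent="*"):
    """Return the crawl delay and a {url: allowed} map for one user agent."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    delay = parser.crawl_delay(user_agent)
    allowed = {url: parser.can_fetch(user_agent, url) for url in urls}
    return delay, allowed


delay, allowed = check_urls(
    SAMPLE_ROBOTS,
    [
        "http://www.example.com/integratori.html",
        "http://www.example.com/checkout/cart/",
    ],
)
print("Crawl delay:", delay)
for url, ok in allowed.items():
    print("allowed" if ok else "blocked", url)
```

Running the same check against the menu URLs after editing the file gives a fast sanity check, although the robots.txt testing tools in Google Search Console remain the authority on how Googlebot itself reads the rules.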