Problem of indexing
-
Hello, sorry, I'm French and my English is not necessarily correct.
I have a problem indexing in Google.
Only the home page is referenced: http://bit.ly/yKP4nD.
I am looking for several days but I do not understand why.
I looked at:
-
The robots.txt file is ok
-
The sitemap, although it is in ASP, is valid with Google
-
No spam, no hidden text
-
I made a request for reconsideration via Google Webmaster Tools and it has no penalties
-
We do not have noindex
So I'm stuck and I'd like your opinion.
thank you very much
A.
-
-
Hello Rasmus,
i think it's ok now.
Indexing is better http://bit.ly/yKP4nD
Thank you so much.
Take care
A.
-
Hi,
very interesting, good idea !!!
I think you're right.
I will tell you
Best regards
A.
-
Ah!
I've found it!
You have a canonical link on each page?
| rel="canonical" href="http://www.syrahetcompagnie.com/Default.asp" /> |
This is not so good, as it is on http://www.syrahetcompagnie.com/vins-vallee-du-rhone-nord.htm AND http://www.syrahetcompagnie.com/PBHotNews.asp?PBMInit=1
If you remove that (and keep it on the start page) you should experience a whole lot of indexing in the following days
Best regards
Rasmus
-
You are correct. I've just found this page:
http://www.robotstxt.org/robotstxt.html
It says:
User-agent: *
Disallow:
Allows all robots to all pages.So that was my mistake. I am truly sorry for the confusion.
I will have a look at it later to see if I can find a good explanation...
-
Hi Rasmus,
User-agent: *
Disallow:means that all robots can enter the site
User-agent: *
Disallow: /block all robots to enter.
User-agent: WebCrawler
Disallow:block WebCrawler robot, but other can enter
Always first line of robots.txt tells what robots can crawl a site and * means all. Second and next lines are pointing specific catalogues on a server e.g. Disallow: /admin/
So I think that is not a robots.txt issue - please ensure me
-
Hi again,
Do you use Google Webmaster tools?
In Webmaster tools you can see how many URLs on your site that has been restricted due to robots.txt file. Perhaps that could give you a clue.
I would recommend that you take a look at webmaster tools. All in all there are a lot of good information in there for optimizing your site.
Best regards
Rasmus
-
Thanks for your answer.
OK I will edit the file but I am not convinced that this is causing my problem because it was written that way.
Take care
-
Actually your robots.txt is NOT ok. It says:
Sitemap: http://www.syrahetcompagnie.com/Sitemap.asp?AccID=27018&LangID=0 User-agent: * Disallow: Which means that all pages are to be disallowed. You should have: User-agent: * Allow: /
If you change that, it should fix it!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How do internal search results get indexed by Google?
Hi all, Most of the URLs that are created by using the internal search function of a website/web shop shouldn't be indexed since they create duplicate content or waste crawl budget. The standard way to go is to 'noindex, follow' these pages or sometimes to use robots.txt to disallow crawling of these pages. The first question I have is how these pages actually would get indexed in the first place if you wouldn't use one of the options above. Crawlers follow links to index a website's pages. If a random visitor comes to your site and uses the search function, this creates a URL. There are no links leading to this URL, it is not in a sitemap, it can't be found through navigating on the website,... so how can search engines index these URLs that were generated by using an internal search function? Second question: let's say somebody embeds a link on his website pointing to a URL from your website that was created by an internal search. Now let's assume you used robots.txt to make sure these URLs weren't indexed. This means Google won't even crawl those pages. Is it possible then that the link that was used on another website will show an empty page after a while, since Google doesn't even crawl this page? Thanks for your thoughts guys.
Intermediate & Advanced SEO | | Mat_C0 -
Why Google isn't indexing my images?
Hello, on my fairly new website Worthminer.com I am noticing that Google is not indexing images from my sitemap. Already 560 images submitted and Google indexed only 3 of them. Altough there is more images indexed they are not indexing any new images, and I have no idea why. Posts, categories and other urls are indexing just fine, but images not. I am using Wordpress and for sitemaps Wordpress SEO by yoast. Am I missing something here? Why Google won't index my images? Thanks, I appreciate any help, David xv1GtwK.jpg
Intermediate & Advanced SEO | | Worthminer1 -
Is it a problem that Google's index shows paginated page urls, even with canonical tags in place?
Since Google shows more pages indexed than makes sense, I used Google's API and some other means to get everything Google has in its index for a site I'm working on. The results bring up a couple of oddities. It shows a lot of urls to the same page, but with different tracking code.The url with tracking code always follows a question mark and could look like: http://www.MozExampleURL.com?tracking-example http://www.MozExampleURL.com?another-tracking-examle http://www.MozExampleURL.com?tracking-example-3 etc So, the only thing that distinguishes one url from the next is a tracking url. On these pages, canonical tags are in place as: <link rel="canonical<a class="attribute-value">l</a>" href="http://www.MozExampleURL.com" /> So, why does the index have urls that are only different in terms of tracking urls? I would think it would ignore everything, starting with the question mark. The index also shows paginated pages. I would think it should show the one canonical url and leave it at that. Is this a problem about which something should be done? Best... Darcy
Intermediate & Advanced SEO | | 945010 -
"No Index" Extensions
Hi there, We run an e-commerce website and we are aware of our duplicate page content/title problems. We know about the "rel canonical" tag and the "no index" tag but I am more interested in the latter. We use a CMS called Magento. Now, Magento has an extension that allows you to use the "no follow" and "no index" tag on products. Google has indexed many of our pages and I wanted to know if applying the "no index" tag on duplicate pages will instruct Google to remove the duplicate url's it has already indexed. I know the tag will tell Google not to index a page but what if I apply it to a product already indexed?
Intermediate & Advanced SEO | | iBags0 -
Index, Nofollow Issue
We are having on our site a couple of pages that we want the page to be indexed, however, we don't want the links on the page to be followed. For example url: http://www.printez.com/animal-personal-checks.html. We have added in our code: . Bing Webmaster Tools, is telling us the following: The pages uses a meta robots tag. Review the value of the tag to see if you are not unintentionally blocking the page from being indexed (NOINDEX). Question is, is the page using the right code as of now or do we need to do any changes in the code, if so, what should we use for them to index the page, but not to follow the links on the page? Please advise, Morris
Intermediate & Advanced SEO | | PrintEZ0 -
Can too many "noindex" pages compared to "index" pages be a problem?
Hello, I have a question for you: our website virtualsheetmusic.com includes thousands of product pages, and due to Panda penalties in the past, we have no-indexed most of the product pages hoping in a sort of recovery (not yet seen though!). So, currently we have about 4,000 "index" page compared to about 80,000 "noindex" pages. Now, we plan to add additional 100,000 new product pages from a new publisher to offer our customers more music choice, and these new pages will still be marked as "noindex, follow". At the end of the integration process, we will end up having something like 180,000 "noindex, follow" pages compared to about 4,000 "index, follow" pages. Here is my question: can this huge discrepancy between 180,000 "noindex" pages and 4,000 "index" pages be a problem? Can this kind of scenario have or cause any negative effect on our current natural SEs profile? or is this something that doesn't actually matter? Any thoughts on this issue are very welcome. Thank you! Fabrizio
Intermediate & Advanced SEO | | fablau0 -
How accurate are the index figures in GWT?
I've been looking at a site in GWT and the number of indexed urls is very low when compared with the number or submitted urls on the xml sitemaps. The site has several stores which are all submitted using different sitemaps. When you perform a search in Google, eg site:domain.com/store1 site:domain.com/store2 site:domain.com/store3 The results are similar to the webmaster urls. However, looking in the analytics for landing pages used for organic traffic from Google shows a much higher number of pages. If these pages aren't indexed as reported in GMT, how could they be found in the results and be recorded as landing pages?
Intermediate & Advanced SEO | | edwardlewis0 -
Software to monitor indexed pages
Dear SEO moz, As a SEO marketer on a pretty big website I noticed a HUGE amount of dropping pages indexed by google. We did not do anything to block googleblot in the past 6 months, but since November the number of indexed pages decreased from 3.4 milion (3,400.000) to 7 hundred thousand (700,000). Obviously I want to know which pages are de-indexed. Does anyone you know a tool which can do this?
Intermediate & Advanced SEO | | JorisHas1