Problem of indexing
-
Hello, sorry, I'm French and my English is not necessarily correct.
I have a problem indexing in Google.
Only the home page is referenced: http://bit.ly/yKP4nD.
I am looking for several days but I do not understand why.
I looked at:
-
The robots.txt file is ok
-
The sitemap, although it is in ASP, is valid with Google
-
No spam, no hidden text
-
I made a request for reconsideration via Google Webmaster Tools and it has no penalties
-
We do not have noindex
So I'm stuck and I'd like your opinion.
thank you very much
A.
-
-
Hello Rasmus,
i think it's ok now.
Indexing is better http://bit.ly/yKP4nD
Thank you so much.
Take care
A.
-
Hi,
very interesting, good idea !!!
I think you're right.
I will tell you
Best regards
A.
-
Ah!
I've found it!
You have a canonical link on each page?
| rel="canonical" href="http://www.syrahetcompagnie.com/Default.asp" /> |
This is not so good, as it is on http://www.syrahetcompagnie.com/vins-vallee-du-rhone-nord.htm AND http://www.syrahetcompagnie.com/PBHotNews.asp?PBMInit=1
If you remove that (and keep it on the start page) you should experience a whole lot of indexing in the following days
Best regards
Rasmus
-
You are correct. I've just found this page:
http://www.robotstxt.org/robotstxt.html
It says:
User-agent: *
Disallow:
Allows all robots to all pages.So that was my mistake. I am truly sorry for the confusion.
I will have a look at it later to see if I can find a good explanation...
-
Hi Rasmus,
User-agent: *
Disallow:means that all robots can enter the site
User-agent: *
Disallow: /block all robots to enter.
User-agent: WebCrawler
Disallow:block WebCrawler robot, but other can enter
Always first line of robots.txt tells what robots can crawl a site and * means all. Second and next lines are pointing specific catalogues on a server e.g. Disallow: /admin/
So I think that is not a robots.txt issue - please ensure me
-
Hi again,
Do you use Google Webmaster tools?
In Webmaster tools you can see how many URLs on your site that has been restricted due to robots.txt file. Perhaps that could give you a clue.
I would recommend that you take a look at webmaster tools. All in all there are a lot of good information in there for optimizing your site.
Best regards
Rasmus
-
Thanks for your answer.
OK I will edit the file but I am not convinced that this is causing my problem because it was written that way.
Take care
-
Actually your robots.txt is NOT ok. It says:
Sitemap: http://www.syrahetcompagnie.com/Sitemap.asp?AccID=27018&LangID=0 User-agent: * Disallow: Which means that all pages are to be disallowed. You should have: User-agent: * Allow: /
If you change that, it should fix it!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Removing Parameterized URLs from Google Index
We have duplicate eCommerce websites, and we are in the process of implementing cross-domain canonicals. (We can't 301 - both sites are major brands). So far, this is working well - rankings are improving dramatically in most cases. However, what we are seeing in some cases is that Google has indexed a parameterized page for the site being canonicaled (this is the site that is getting the canonical tag - the "from" page). When this happens, both sites are being ranked, and the parameterized page appears to be blocking the canonical. The question is, how do I remove canonicaled pages from Google's index? If Google doesn't crawl the page in question, it never sees the canonical tag, and we still have duplicate content. Example: A. www.domain2.com/productname.cfm%3FclickSource%3DXSELL_PR is ranked at #35, and B. www.domain1.com/productname.cfm is ranked at #12. (yes, I know that upper case is bad. We fixed that too.) Page A has the canonical tag, but page B's rank didn't improve. I know that there are no guarantees that it will improve, but I am seeing a pattern. Page A appears to be preventing Google from passing link juice via canonical. If Google doesn't crawl Page A, it can't see the rel=canonical tag. We likely have thousands of pages like this. Any ideas? Does it make sense to block the "clicksource" parameter in GWT? That kind of scares me.
Intermediate & Advanced SEO | | AMHC0 -
How do I handle this 301/indexing mess?
I'm working on a client's site and noticed a brisk drop in rankings. In doing some digging I found that the homepage (domain.com) is 301'd to domain.com/home.html. Here's my problem/questions: 1. domain.com is indexed by Google 2. domain.com/home.html is not indexed by Google 3. both domains have some healthy linking 4. Is the fact that domain.com/home.html impacting rankings? 5. How do carefully handle this situation (ex. redirect domain.com/home.html back to domain.com?) 6. See the attached jpeg for a visual representation of my debacle. hcIiPAs
Intermediate & Advanced SEO | | rhoadesjohn0 -
Site Search Results in Index -- Help
Hi, I made a mistake on my site, long story short, I have a bunch of search results page in the Google index. (I made a navigation page full of common search terms, and made internal links to a respective search results page for each common search term.) Google crawled the site, saw the links and now those search results pages are indexed. I made versions of the indexed search results pages into proper category pages with good URLs and am ready to go live/ replace the pages and links. But, I am a little unsure how to do it /what the effects can be: Will there be duplicate content issues if I just replace the bad, search results links/URLs with the good, category page links/URLs on the navi. page? (is a short term risk worth it?) Should I get the search results pages de-indexed first and then relaunch the navi. page with the correct category URLs? Should I do a robots.txt disallow directive for search results? Should I use Google's URL removal tool to remove those indexed search results pages for a quick fix, or will this cause more harm than good? Time is not the biggest issue, I want to do it right, because those indexed search results pages do attract traffic and the navi. page has been great for usability. Any suggestions would be great. I have been reading a ton on this topic, but maybe someone can give me more specific advice. Thanks in advance, hopefully this all makes sense.
Intermediate & Advanced SEO | | IOSC1 -
Is Affiliate masking a problem for Google?
Does Google consider affiliate masking as unethical? I have a offer website that has 1000's of affiliate links that are masked. I feel google does not like masking URL's which is why my ranking started dropping. Can any one explain the context of affiliate masking please?
Intermediate & Advanced SEO | | SEOMad0 -
How can I block unwanted urls being indexed on google?
Hi, I have to block unwanted urls (not that page) from being indexed on google. I have to block urls like example.com/entertainment not the exact page example.com/entertainment.aspx . Is there any other ways other than robot.txt? If i add this to robot.txt will that block my other url too? Or should I make a 301 redirection from example.com/entertainment to example.com/entertainment.aspx. Because some of the unwanted urls are linked from other sites. thanks in advance.
Intermediate & Advanced SEO | | VipinLouka780 -
No index, follow vs. canonical url
We have a site that consists almost entirely as a directory of videos. Example here: http://realtree.tv/channels/realtreeoutdoorsclassics We're trying to figure out the best way to handle pagination and utility features such as sort for most recent, most viewed, etc. We've been reading countless articles on this topic, but so far have been unable to determine what might be considered the industry standard. Two solutions seem to stand out... Using the canonical url on all the sorted and paginated pages. However, after reading many blog posts, it seems that you should NEVER use the canonical url to solve the issue of paginated, and thus duplicated content because the search bots will never crawl past the first page leaving many results not in the index. (We are considering ruling this method out.) Another solution seems to be using the meta tag for noindex, follow so that a search engine like Google will crawl your directory pages but not add them to the index themselves. All links are followed so content is crawled and any passing link juice remains unchanged. However, I did see a few articles skeptical of this solution as well saying that there are always better alternatives, or that there is no verification that search engines obey this meta tag. This has placed some doubt in our minds. I was hoping to get some expert advice on these methods as it would pertain to our site. Thank you.
Intermediate & Advanced SEO | | grayloon0 -
I have a duplicate content problem
The website guy that made the website for my business Premier Martial Arts Austin disappeared and didn't set up that www. was to begin each URL, so I now have a duplicate content problem and don't want to be penalized for it. I tried to show in Webmaster tools the preferred setup but can't get it to OK that I'm the website owner. Any idea as what to do?
Intermediate & Advanced SEO | | OhYeahSteve0 -
Can a XML sitemap index point to other sitemaps indexes?
We have a massive site that is having some issue being fully crawled due to some of our site architecture and linking. Is it possible to have a XML sitemap index point to other sitemap indexes rather than standalone XML sitemaps? Has anyone done this successfully? Based upon the description here: http://sitemaps.org/protocol.php#index it seems like it should be possible. Thanks in advance for your help!
Intermediate & Advanced SEO | | CareerBliss0