I haven't contacted the forum yet but that's my next step.
Pages indexed: 91k
Blocked by robots.txt: 8.4 million
I don't even know how you could create 8.4 million indexable pages from our content.
If the text is shown via an iframe, it won't count as any kind of beneficial content. If it's actually scraped and rendered on your site, it will likely be classified as duplicate content.
The best organizations recognize that creating original content is important and they put the grueling hours in to do it. It's hard but it's worth it.
Any help out there? Since the original question was posted, I've seen some improvement, but even with aggressive canonicalization and noindexing I'm still seeing a boatload of indexed pages. I'm still seeing pages indexed that I've explicitly disallowed in robots.txt (/search.aspx and */filter). I'm guessing it's just going to take a while to deindex what's there. Still, 91k pages indexed is quite a lot when you consider we only have about 3-4k pages and some articles.
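One gotcha that may explain part of this (general Google behavior, not specific to your setup): robots.txt only blocks crawling, it doesn't remove URLs that are already in the index. Assuming your rules look something like this:

    User-agent: *
    Disallow: /search.aspx
    Disallow: /*/filter

Google can still keep previously discovered URLs indexed without recrawling them. The usual deindexing trick is the reverse of what feels intuitive: temporarily allow crawling and serve a noindex on those pages so Googlebot can actually see the directive, then re-block them once they drop out.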
Is anyone aware of any significant releases by Google?
Subdirectories can work very well, and they benefit from the link equity of the domain above them. I would also consider separate sitemaps for the different languages. Again, I like your third option from a user experience standpoint. Given the search engines' emphasis on user experience, coupled with improved click-through rates, you should do well as long as the content end of things is good.
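If you go the separate-sitemaps route, a minimal sketch would be a sitemap index pointing at one sitemap per language (the domain and filenames here are just placeholders):

    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <sitemap>
        <loc>http://www.example.com/sitemap-en.xml</loc>
      </sitemap>
      <sitemap>
        <loc>http://www.example.com/sitemap-es.xml</loc>
      </sitemap>
    </sitemapindex>

Submitting the index in Webmaster Tools lets you watch indexation per language separately.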
Quite recent. We were actually seeing a nice downward trend in the huge number of pages indexed, and then the number tripled. Crazy is an understatement. I would have thought the count would fall, given how many pages now use canonicals.
There's a lot of good info out there on this subject. The ideal for ranking in a specific country would be domain.es, hosted in Spain, with a local Spanish billing address. Aside from that, I personally like option 3 in your list, but provide links to the other languages in case the user is on a VPN that confounds geolocation.
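To reinforce that at the page level, hreflang annotations tell Google which version targets which audience even when geolocation is confounded. A sketch, assuming a subdirectory setup like your option 3 (URLs are placeholders):

    <link rel="alternate" hreflang="en" href="http://www.example.com/en/" />
    <link rel="alternate" hreflang="es" href="http://www.example.com/es/" />
    <link rel="alternate" hreflang="x-default" href="http://www.example.com/" />

Each language version should carry the full set of annotations, including one pointing back at itself.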
If you're using WordPress, make sure search engines aren't being blocked under Settings > Reading ("Discourage search engines from indexing this site"). If not, check your robots.txt file. Does the site come up when you hit the URL? If so, Google ".htaccess force www" and put a rule in your .htaccess file that 301 redirects non-www pages to their www brethren. All those links were to your non-www URL, so depending on your system, the non-www version may be throwing a 404, which would certainly get you pushed down the SERPs.
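For reference, the generic mod_rewrite version of that rule looks something like this (example.com is a placeholder; adjust for your domain and for https, and test on a staging copy first):

    RewriteEngine On
    RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
    RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]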
Blank white pages usually mean a misconfiguration is breaking page rendering. It could be something as simple as an illegal character in a page title, or a more general config problem on your category pages. What ecom system are you using, if you don't mind me asking?
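If the cart happens to be PHP-based (an assumption on my part; many are), a blank white page is very often a fatal error with error display switched off. Temporarily enabling it on a dev copy will usually surface the underlying problem:

    <?php
    // Debugging only: never leave this enabled on a live store.
    ini_set('display_errors', '1');
    error_reporting(E_ALL);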
I've noticed an enormous spike in pages indexed reported in Google Webmaster Tools (WMT) in the last week. Now I know WMT can be a bit (OK, a lot) off base in its reporting, but this was pretty hard to explain. See, we're in the middle of a huge campaign against duplicate content and we've put a number of measures in place to fight it. For example:
Implemented a strong canonicalization effort
Programmatically NOINDEX'd content we know to be duplicate
Are currently fixing true duplicate content issues by rewriting titles, descriptions, etc.
So I was pretty surprised to see the blow-up. Any ideas as to what else might cause such a counterintuitive trend? Has anyone else seen Google suddenly glom onto a bunch of phantom pages?
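For context, the first two measures translate to on-page tags like these (the URL is a placeholder; your CMS may emit them differently):

    <!-- On duplicate/parameter URLs: point at the preferred version -->
    <link rel="canonical" href="http://www.example.com/widgets/" />

    <!-- On pages we programmatically flag as duplicates -->
    <meta name="robots" content="noindex, follow" />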