How do we ensure our new dynamic site gets indexed?
-
Just wondering if you can point me in the right direction. We're building a 'dynamically generated' website, so basically, pages don’t technically exist until the visitor types in the URL (or clicks an on page link), the pages are then created on the fly for the visitor.
The major concern I’ve got is that Google won’t be able to index the site, as the pages don't exist until they're 'visited', and to top it off, they're rendered in JSPX, which makes things tricky to ensure the bots can view the content
We’re going to build/submit a sitemap.xml to signpost the site for Googlebot but are there any other options/resources/best practices Mozzers could recommend for ensuring our new dynamic website gets indexed?
-
Hi Ryan,
Mirroring what Alan said, if the links are html text links - and they should be - then you will reduce your crawling problem with Google.
If you must use javascript links, make sure to duplicate them using
<noscript>tags so that Google will follow them.</p> <p><a href="http://support.google.com/webmasters/bin/answer.py?hl=en&answer=66355">http://support.google.com/webmasters/bin/answer.py?hl=en&answer=66355</a></p> <p>But be careful, Google doesn't treat <noscript> links like regular html links. At best, it's a poor alternative.</p> <p>Google derives so many signals from HTML links (anchor text, page rank, context, etc) that it's almost essential for a search engine friendly site to include them.</p> <p>The Beginners Guide to SEO has a relevant chapter on the basics of Search Engine Friendly Design and Development:</p> <p><a href="http://www.seomoz.org/beginners-guide-to-seo/basics-of-search-engine-friendly-design-and-development">http://www.seomoz.org/beginners-guide-to-seo/basics-of-search-engine-friendly-design-and-development</a></p> <p>Best of luck!</p></noscript>
-
Definitely want to get it right before launch. It's not going anywhere until it is absolutely ready!
-
The project this reminds me of took six months to complete and the 301's alone were a full time job.
Get it right the first time... you do not want to restructure like this on a large dynamic site.
I must say the project worked out but I got all my grey hair the day we threw the switch...
-
When I say its costly to rewrite 200,000+ URLS I mean it. Correcting mistakes here can cost big dollars.
In this case it wascostly to the tune of $60,000+ in costs and loss, however the bottle of bubbly at the end of the six month project was tasty.
Point being is to do it right the first time.
As I said before your best bet is documentation. Large dynamic sites generate large dynamic problems very quickly if not watched closely.
-
Thank you Khem, very helpful replies.
-
One more thing, I missed. Internal linking, make sure each of the page is linked with some text link. But avoid over linking. don't try to link all the pages from home page. Generally we links all the categories, pages from footer or site-wide links
-
Okay, lets do it step by step.
First, if it's a product website, create a separate feed for products and submit the sitemap with Google.
if not, that may you would have separate news/articles/videos sections, create separate xml sitemap for each section and submit with Google
If not, make sure to have only search engine friendly URLs, who says rewriting 200,000+ pages is costly, compare this cost with the business you'll loose when all your products would be listed in Google. So, make sure to rewrite all the dynamic URLs, if you feel that Google might face problem in crawling your website's URLs
Second, study webmaster tool's data very carefully for warnings, errors, so that you can figure out the issues which Google might have been facing while visits your websites.
Avoid duplicate entries of products, generally we don't pay attention to these things, and show same products on different pages in different categories. Google will filter all those duplicate pages, and can even penalize your website because of the duplicate content issue.
Third, keep promoting, but avoid grey/black hat techniques, there is no shortcut to the success. you'll have to spend time and money.
-
It's definitely something we're taking a very close look at. Another thing not mentioned is the use of canonical tags to head off duplicate content issues, which I'll be ensuring is implemented.
My next mugshot might have significantly grayer hair after this is all done...
-
Thanks very much for the replies.
I'll ensure proper cross linking from navigation, on pages themselves and submit a full XML sitemap, along with the social media options suggested. My other concern is that the content itself won't be visible to Googlebot due to the site being largely javascript driven, but that's something I'm working with the developers to resolve.
-
As you can tell from the response above indexation is not what you should be worried about.
Dynamic content is not fool proof. The mistakes are costly and you never want to be involved rewriting 200,000+ pages of dynamic rats nest.
Sorting abilities can cause dynamic urls and duplicate content.
Structure changes or practice changes can cause crawl errors. I looked at a report for a client early today that had 3000+ errors today compared to 20 last week. This was all due to a request made by the owner to the developer.
When enough attention is not paid to this stuff it causes real issues.
The best advice I can offer is to make sure you have a best practices document that must be followed by all developers.
-
Make sure every page you would like to be crawled is linked to in any matter. You can create natural links to them, e.g. from your navigation or in text links, or you can put them in a sitemap.
You can also link to these pages from websites like facebook, twitter to have fast crawling.
Tell Google in your robots.txt that it can access your website and make sure non of the pages you would like to be indexed carry the noindex-value in the robots meta-tag.
Good luck!
-
any link, but i should correct what i said, they will be crawled, not necessary indexwed
-
Thanks for the reply Alan, do you mean links from the sitemap?
-
If you have links to the pages they will be indexed, dynamic of static it does not matter
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Site not getting indexed by googlebot.
The following question is in regards to http://footeschool.org/. This site is not getting indexed with google(googlebot) This only happens when the user agent is set googlebot. This is a recent issue. We are using DNN as CMS. Are there any suggestion to help resolve this issue?
Technical SEO | | bcmull0 -
Why are these blackhat sites so successful?
Here's an interesting conundrum. Here are three sites with their respective ranking for "dental implants [city]:" http://dentalimplantsvaughan.ca - 9 (on google.ca) http://dentalimplantsinhonoluluhi.com - 2 (on google.com) http://dentalimplantssurreybc.ca - 7 (on google.ca) These markets are not particularly competitive, however, all of these sites suffer from: Duplicate content, both internally and across sites (all of this company's implant sites have the same exact content, minus the bio pages and the local modifier). Average speed score. No structured data No links And these sites are ranking relatively quickly. The Vaughan site went live 3 months ago. But, what's boggling my mind is that they rank on the first page at all. It seems they're doing the exact opposite of what you're supposed to do, yet they rank relatively well.
Technical SEO | | nowmedia10 -
Site indexed by Google, but (almost) never gets impressions
Hi there, I have a question that I wasn't able to give it a reasonable answer yet, so I'm going to trust on all of you. Basically a site has all its pages indexed by Google (I verified with site:sitename.com) and it also has great and unique content. All on-page grades are A with absolutely no negative factors at all. However its pages do not get impressions almost at all. Of course I didn't expect it to be on page 1 since it has been launched on Dec, 1st, but it looks like Google is ignoring (or giving it bad scores) for some reason. Only things that can contribute to that could be: domain privacy on the domain, redirect from the www to the subdomain we use (we did this because it will be a multi-language site, so we'll assign to each country a subdomain), recency (it has been put online on Dec 1st and the domain is just a couple of months old). Or maybe because we blocked crawlers for a few days before the launch? Exactly a few days before Dec 1st. What do you think? What could be the reason for that? Thanks guys!
Technical SEO | | ruggero0 -
Then why my site is not ranking
My website's DA and PAs are good compare with my competitors. Then why my site is not ranking.
Technical SEO | | Somanathan0 -
Site Navigation
Hello, I have some questions about best practices with site navigation & internal linking. I'm currently assisting aplossoftware.com with its navigation. The site has about 200 pages total. They currently have a very sparse header with a lot of links in the footer. The three most important keywords they want to rank for are nonprofit accounting software, church accounting software and file 990 online. 1. What are your thoughts about including a drop down menu in the header for the different products? (they have 3 main products). This would allow us to include a few more links in the header and give more real estate to include full keywords in anchor text. 2. They have a good blog with content that gets regularly updated. Currently it's linked in the footer and gets a tiny amount of visits. What are your thoughts about including it as a link in the header instead? 3. What are best practices with using (or not using) no follow with site navigation and footer links? How about with links to social media pages like Facebook/Twitter? Any other thoughts/ideas about the site navigation for this site (www.aplossoftware.com) would be much appreciated. Thanks!
Technical SEO | | stageagent0 -
Why are URLs like www.site.com/#something being indexed?
So, everything after a hash (#) is not supposed to be crawled and indexed. Has that changed? I see a clients site with all sorts of URLs indexed like ... http://www.website.com/#!category/c11f For the above URL, I thought it was the same as simply http://www.website.com/. But they aren't, they're getting indexed and all the content on the pages with these hash tags are getting crawled as well. Thanks!
Technical SEO | | wiredseo0 -
Merging sites, ensuring traffic doesn't die
Wondering if I could get a second opinion on this, please. I have just taken on a new client, they own about 6 different niched car experience websites (hire an Aston Martin for the day, type thing). All the six sites they have seem to perform reasonably well for the brand of car they deal with, the average DA of the sites is about 24. The client wishes to move all of these different manufacturers into one site and have sections of the site, they can then also target more generic experience day type keywords. The obvious way of dealing with this move would be to 301 the old sites to the relevant places on the new site and wait for that to rank. However, looking at the backlinks profile of the niched sites, they seem to have very few backlinks and i feel the reason they are ranking so well for all the individual manufacturers is because they all feature the name in the domain. Not exact match, but the name is there. If I am thinking right, with the 301 we want to tell Google page x is now page y, index this one instead. Because the new site has a more generic name I don't think it will enjoy any of the domain keyword benefits which are helping the sub sites, and as a result I expect the rankings and traffic to drop (at least in the short term). Am I reading this correct. Would people use a 301 in this case? The easiest thing to do would be to leave the 6 sub sites up and running on their own domain and launch the new site to run alongside them, however the client doesn't want this. Thanks, Carl
Technical SEO | | GrumpyCarl0 -
Will SEO Moz index our keywords if the site is ALL https?
We have a site coming into beta next week. Playing around with SEO Moz, I had trouble getting the keywords to rank at all. Was this because the site is entirely https? If yes, what else can SEO Moz NOT do if the site is all https? Thanks!
Technical SEO | | OTSEO0