How do we ensure our new dynamic site gets indexed?
-
Just wondering if you can point me in the right direction. We're building a 'dynamically generated' website, so pages don't technically exist until the visitor types in the URL (or clicks an on-page link); they are then created on the fly for that visitor.
My major concern is that Google won't be able to index the site, as the pages don't exist until they're 'visited'. To top it off, they're rendered in JSPX, which makes it tricky to ensure the bots can view the content.
We're going to build and submit a sitemap.xml to signpost the site for Googlebot, but are there any other options/resources/best practices Mozzers could recommend for ensuring our new dynamic website gets indexed?
-
Hi Ryan,
Mirroring what Alan said, if the links are HTML text links (and they should be), then you will reduce your crawling problem with Google.
If you must use JavaScript links, make sure to duplicate them using <noscript> tags so that Google will follow them.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=66355
But be careful, Google doesn't treat <noscript> links like regular HTML links. At best, it's a poor alternative.
Google derives so many signals from HTML links (anchor text, PageRank, context, etc.) that it's almost essential for a search engine friendly site to include them.
The Beginners Guide to SEO has a relevant chapter on the basics of Search Engine Friendly Design and Development:
http://www.seomoz.org/beginners-guide-to-seo/basics-of-search-engine-friendly-design-and-development
Best of luck!
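For anyone who wants to audit a rendered page for this, here is a minimal Python sketch that separates plain HTML links from links that only exist inside <noscript>. It is not how Googlebot works; it only shows which links are present as ordinary anchors in the markup. The sample markup, the URLs, and the names LinkCollector and collect_links are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags, split by whether they sit inside <noscript>."""

    def __init__(self):
        super().__init__()
        self.noscript_depth = 0
        self.html_links = []
        self.noscript_links = []

    def handle_starttag(self, tag, attrs):
        if tag == "noscript":
            self.noscript_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            # Skip anchors with no href or with a javascript: pseudo-URL.
            if href and not href.lower().startswith("javascript:"):
                target = self.noscript_links if self.noscript_depth else self.html_links
                target.append(href)

    def handle_endtag(self, tag):
        if tag == "noscript" and self.noscript_depth:
            self.noscript_depth -= 1


def collect_links(html):
    """Return (plain HTML links, links found only inside <noscript>)."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.html_links, parser.noscript_links


# Hypothetical page fragment: one real link, one script-only link with a fallback.
sample = """
<a href="/products/blue-widget">Blue widget</a>
<span onclick="go('/products/red-widget')">Red widget</span>
<noscript><a href="/products/red-widget">Red widget</a></noscript>
"""
html_links, noscript_links = collect_links(sample)
print("HTML links:", html_links)
print("noscript-only links:", noscript_links)
```

Any page that shows up only in the second list is relying on the weaker fallback and is a candidate for a proper text link.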
-
Definitely want to get it right before launch. It's not going anywhere until it is absolutely ready!
-
The project this reminds me of took six months to complete, and the 301s alone were a full-time job.
Get it right the first time... you do not want to restructure like this on a large dynamic site.
I must say the project worked out but I got all my grey hair the day we threw the switch...
-
When I say it's costly to rewrite 200,000+ URLs, I mean it. Correcting mistakes here can cost big dollars.
In this case it was costly to the tune of $60,000+ in costs and losses; however, the bottle of bubbly at the end of the six-month project was tasty.
The point is to do it right the first time.
As I said before, your best bet is documentation. Large dynamic sites generate large dynamic problems very quickly if not watched closely.
-
Thank you Khem, very helpful replies.
-
One more thing I missed: internal linking. Make sure each page is linked to with a text link, but avoid over-linking. Don't try to link to every page from the home page. Generally, we link to all the categories and key pages from the footer or other site-wide links.
-
Okay, let's do it step by step.
First, if it's a product website, create a separate feed for products and submit the sitemap to Google.
If not, you may have separate news/articles/videos sections; create a separate XML sitemap for each section and submit each one to Google.
In any case, make sure to use only search engine friendly URLs. Who says rewriting 200,000+ pages is costly? Compare that cost with the business at stake in getting all your products listed in Google. So, make sure to rewrite all the dynamic URLs if you feel that Google might have problems crawling your website's URLs.
Second, study the Webmaster Tools data very carefully for warnings and errors, so that you can figure out the issues Google might be facing when it visits your website.
Avoid duplicate entries of products. Generally we don't pay attention to these things and show the same products on different pages in different categories. Google will filter all those duplicate pages, and can even penalize your website because of the duplicate content issue.
Third, keep promoting, but avoid grey/black hat techniques. There is no shortcut to success; you'll have to spend time and money.
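The per-section sitemaps suggested above could be generated along these lines. This is only a rough Python sketch using the standard library: the example.com URLs and the function names are invented for illustration, and the 50,000-URLs-per-file split follows the limit in the sitemaps.org protocol, which matters on a site with 200,000+ pages.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # per-file URL limit in the sitemaps.org protocol


def _document(root_tag, child_tag, locations):
    """Serialize a list of locations as <root><child><loc>...</loc></child>...</root>."""
    root = ET.Element(root_tag, xmlns=SITEMAP_NS)
    for location in locations:
        ET.SubElement(ET.SubElement(root, child_tag), "loc").text = location
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def build_sitemaps(urls):
    """Split one section's URLs into as many sitemap files as the limit requires."""
    urls = list(urls)
    return [
        _document("urlset", "url", urls[start:start + MAX_URLS_PER_FILE])
        for start in range(0, len(urls), MAX_URLS_PER_FILE)
    ]


def build_sitemap_index(sitemap_urls):
    """Build the index file that points at every per-section sitemap."""
    return _document("sitemapindex", "sitemap", sitemap_urls)


# Hypothetical sections and URLs, one sitemap (or more) per section.
product_files = build_sitemaps(
    "https://www.example.com/products/%d" % n for n in range(3)
)
news_files = build_sitemaps(["https://www.example.com/news/launch"])
index = build_sitemap_index([
    "https://www.example.com/sitemap-products-1.xml",
    "https://www.example.com/sitemap-news-1.xml",
])
print(product_files[0])
print(index)
```

Only the index file then needs to be submitted; the per-section files make it easier to see which part of the site has indexing trouble.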
-
It's definitely something we're taking a very close look at. Another thing not mentioned is the use of canonical tags to head off duplicate content issues, which I'll be ensuring is implemented.
My next mugshot might have significantly grayer hair after this is all done...
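On the canonical tags, one way to keep them consistent on a dynamic site is to derive the canonical URL from the requested URL in a single place. The Python sketch below assumes that parameters such as sort and order only change presentation; that list, the example.com URLs, and the function names are assumptions for illustration and would need to match the real site.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameters change presentation only, never the content itself.
PRESENTATION_PARAMS = {"sort", "order", "view", "sessionid"}


def canonical_url(url):
    """Drop presentation-only parameters and put the rest in a stable order."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in PRESENTATION_PARAMS
    ]
    kept.sort()
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def canonical_tag(url):
    """Render the <link> element to place in the page's <head>."""
    return '<link rel="canonical" href="%s">' % escape(canonical_url(url), quote=True)


# Two hypothetical URLs for the same listing, reached through different sort options.
first = "https://www.example.com/widgets?sort=price&colour=blue"
second = "https://www.example.com/widgets?colour=blue&order=desc"
print(canonical_tag(first))
print(canonical_url(first) == canonical_url(second))
```

Both variants resolve to the same canonical URL, so the sorted views point back at one page instead of competing with it.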
-
Thanks very much for the replies.
I'll ensure proper cross-linking from the navigation and on the pages themselves, and submit a full XML sitemap, along with the social media options suggested. My other concern is that the content itself won't be visible to Googlebot because the site is largely JavaScript-driven, but that's something I'm working with the developers to resolve.
-
As you can tell from the response above, indexation is not what you should be worried about.
Dynamic content is not foolproof. The mistakes are costly, and you never want to be involved in rewriting a 200,000+ page dynamic rat's nest.
Sorting options can generate dynamic URLs and duplicate content.
Structure changes or practice changes can cause crawl errors. Earlier today I looked at a report for a client that showed 3,000+ errors, compared to 20 last week. This was all due to a request made by the owner to the developer.
When enough attention is not paid to this stuff it causes real issues.
The best advice I can offer is to make sure you have a best practices document that must be followed by all developers.
-
Make sure every page you would like to be crawled is linked to in some manner. You can create natural links to them, e.g. from your navigation or in text links, or you can put them in a sitemap.
You can also link to these pages from websites like Facebook and Twitter to get them crawled faster.
Tell Google in your robots.txt that it can access your website, and make sure none of the pages you would like to be indexed carry the noindex value in the robots meta tag.
Good luck!
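Both checks can be scripted before launch. Here is a rough Python sketch using only the standard library; the robots.txt content, the page markup, and the example.com URLs are invented for illustration, and the meta check only looks at the robots and googlebot meta tags, not at HTTP headers.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt, url, user_agent="Googlebot"):
    """Check a URL against robots.txt rules supplied as text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class RobotsMetaParser(HTMLParser):
    """Flags pages whose robots (or googlebot) meta tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        content = (attributes.get("content") or "").lower()
        if name in ("robots", "googlebot") and "noindex" in content:
            self.noindex = True


def has_noindex(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


# Hypothetical robots.txt and page markup.
robots_txt = "User-agent: *\nDisallow: /admin/\n"
page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'

print(allowed_by_robots(robots_txt, "https://www.example.com/products/blue-widget"))
print(allowed_by_robots(robots_txt, "https://www.example.com/admin/settings"))
print(has_noindex(page))
```

Running this over the URLs in the sitemap would catch pages that are listed for indexing but blocked or marked noindex at the same time.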
-
Any link. But I should correct what I said: they will be crawled, not necessarily indexed.
-
Thanks for the reply Alan, do you mean links from the sitemap?
-
If you have links to the pages they will be indexed; dynamic or static, it does not matter.