Hi and thank you for your help. The site in question is the following: http://www.universalpr.com/
In this platform the URL holds session information such as the language and navigational state, among other things. The platform has a normalization process that inspects the user-agent and looks for bots such as Googlebot. Once Portal detects a crawler, it strips most of the session information out of the URL. The end result is still not pretty, but it aims for consistency; otherwise you could get a huge number of URLs that reference the same content.
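To illustrate the idea, here is a minimal Python sketch of that kind of normalization. It is not how WebSphere Portal actually does it: the bot list, the query-parameter names (`navstate`, `sid`), the choice to keep `lang`, and the example.com URLs are all assumptions made up for the example, and the real Portal URL encoding is different.

```python
import re
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical crawler signatures; a real list would be longer.
BOT_PATTERN = re.compile(r"googlebot|bingbot|slurp", re.IGNORECASE)

# Hypothetical parameter names standing in for per-session state.
SESSION_PARAMS = {"navstate", "sid"}


def is_crawler(user_agent):
    """Return True if the user-agent string looks like a known bot."""
    return bool(BOT_PATTERN.search(user_agent or ""))


def normalize_url(url, user_agent):
    """Strip session parameters from the URL, but only for crawlers."""
    if not is_crawler(user_agent):
        return url
    parts = urlsplit(url)
    # Keep every query parameter that is not session state.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
```

With this sketch, a crawler requesting `http://example.com/page?lang=en&navstate=abc123&sid=42` would get `http://example.com/page?lang=en`, while a regular browser would keep the full URL.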
When using WebSphere Portal with Lotus WCM (the CMS), content is displayed as modules or sections of the page called portlets; such portlets don't have direct access to the HEAD. This makes using canonical URLs a bit of a challenge. However, we're working on finding a way to write to the HEAD to update the TITLE and add canonical elements to the pages.
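For reference, what we would like to end up with in the HEAD looks roughly like the output of this small Python sketch. The function names and the example URL are hypothetical, and it assumes the canonical URL is simply the page URL without its query string and fragment, which may not hold for Portal URLs where the state lives in the path.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url):
    """Drop the query string and fragment and lowercase the host."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, "", ""))


def head_fragment(title, url):
    """Build the TITLE and canonical LINK elements for a page's HEAD."""
    return "<title>{}</title>\n<link rel=\"canonical\" href=\"{}\">".format(
        escape(title),
        escape(canonical_url(url), quote=True),
    )
```

Calling `head_fragment("News & Events", "http://WWW.Example.com/news/item?sid=42#top")` would give a title specific to the content plus a canonical link pointing at `http://www.example.com/news/item`.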
The following are two other examples that use this platform:
- http://www.ncaa.org/ - This site uses WCM standalone and redirects to WebSphere Portal only when necessary (i.e. the login). The benefits of this approach are that they can use canonical elements and that they can update their titles depending on which content is being displayed.
- http://miamidade.gov/ - Miami Dade is very similar to my example, www.universalpr.com. In Miami Dade's case they use the Lotus WCM portlet. If you click around this site you'll see that their title is always the same and that they don't use canonical elements in their pages either.
It goes without saying that I could benefit from using the NCAA's approach; however, this would require quite a bit of rework.
These are some of the shortcomings that I can identify with my limited experience in SEO. If you need any more details at all please let me know.
Thank you,