Does Googlebot Read Session IDs?
-
I did a raw export from AHREFs yesterday and one of our sites has 18,000 backlinks coming from the same site. But they're all the same link, just with a different session ID. The structure of the URL is:
[website].com/resources.php?UserID=10031529
And we have 18,000 of these with a different ID.
Does Google read each of these as a unique backlink or does it realize there's just one link and the session ID is throwing it off? I read different opinions when researching this so I'm hoping the Moz community can give some concrete answers.
-
Safest bet, set up canonicals that point to the page minus the parameter so even if Google does read the session IDs it will understand that they relate to the canon link. Honestly, I'm not 100% sure if Google reads those sessions IDs or not either and have seen conflicting information. I know they read other parameters as separate URLs... I had a few issues with the way one of our sites handled products (sometimes it was ?model= and sometimes it was ?prod_id= and some old products also had ?sku=). But adding the canonicals will solve this problem if it exists and if the problem doesn't exist it won't hurt having a self-referential canonical sitting in the code in case someone scrapes your site.
-
You have to inform yourself and really watch out for this kind of stuff and SE bots.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Crawl and Indexation Error - Googlebot can't/doesn't access specific folders on microsites
Hi, My first time posting here, I am just looking for some feedback on a indexation issue we have with a client and any feedback on possible next steps or items I may have overlooked. To give some background, our client operates a website for the core band and a also a number of microsites based on specific business units, so you have corewebsite.com along with bu1.corewebsite.com, bu2.corewebsite.com. The content structure isn't ideal, as each microsite follows a structure of bu1.corewebsite.com/bu1/home.aspx, bu2.corewebsite.com/bu2/home.aspx and so on. In addition to this each microsite has duplicate folders from the other microsites so bu1.corewebsite.com has indexable folders bu1.corewebsite.com/bu1/home.aspx but also bu1.corewebsite.com/bu2/home.aspx the same with bu2.corewebsite.com has bu2.corewebsite.com/bu2/home.aspx but also bu2.corewebsite.com/bu1/home.aspx. Therre are 5 different business units so you have this duplicate content scenario for all microsites. This situation is being addressed in the medium term development roadmap and will be rectified in the next iteration of the site but that is still a ways out. The issue
Intermediate & Advanced SEO | | ImpericMedia
About 6 weeks ago we noticed a drop off in search rankings for two of our microsites (bu1.corewebsite.com and bu2.corewebsite.com) over a period of 2-3 weeks pretty much all our terms dropped out of the rankings and search visibility dropped to essentially 0. I can see that pages from the websites are still indexed but oddly it is the duplicate content pages so (bu1.corewebsite.com/bu3/home.aspx or (bu1.corewebsite.com/bu4/home.aspx is still indexed, similiarly on the bu2.corewebsite microsite bu2.corewebsite.com/bu3/home.aspx and bu4.corewebsite.com/bu3/home.aspx are indexed but no pages from the BU1 or BU2 content directories seem to be indexed under their own microsites. Logging into webmaster tools I can see there is a "Google couldn't crawl your site because we were unable to access your site's robots.txt file." This was a bit odd as there was no robots.txt in the root directory but I got some weird results when I checked the BU1/BU2 microsites in technicalseo.com robots text tool. Also due to the fact that there is a redirect from bu1.corewebsite.com/ to bu1.corewebsite.com/bu4.aspx I thought maybe there could be something there so consequently we removed the redirect and added a basic robots to the root directory for both microsites. After this we saw a small pickup in site visibility, a few terms pop into our Moz campaign rankings but drop out again pretty quickly. Also the error message in GSC persisted. Steps taken so far after that In Google Search Console, I confirmed there are no manual actions against the microsites. Confirmed there is no instances of noindex on any of the pages for BU1/BU2 A number of the main links from the root domain to microsite BU1/BU2 have a rel="noopener noreferrer" attribute but we looked into this and found it has no impact on indexation Looking into this issue we saw some people had similar issues when using Cloudflare but our client doesn't use this service Using a response redirect header tool checker, we noticed a timeout when trying to mimic googlebot accessing the site Following on from point 5 we got a hold of a week of server logs from the client and I can see Googlebot successfully pinging the site and not getting 500 response codes from the server...but couldn't see any instance of it trying to index microsite BU1/BU2 content So it seems to me that the issue could be something server side but I'm at a bit of a loss of next steps to take. Any advice at all is much appreciated!0 -
Uppercase/Lowercase Reading As Duplicate Permalinks
I cannot figure out if this is an actual SEO issue or just a crawl reader error. I use Screaming Frog to crawl my site and use their SEO features. When I look at page titles and duplicates it shows all our pages twice... some with 1 letter capitalized and the other not. I don't REALLY have duplicate permalinks do I? I also noticed when I use some open site explorers and paste in both permalinks the specs will show for the permalink that's all lowercase but it won't find anything for the "duplicate" permalink that is capitalized. Below I included a few screenshots. Thank you Moz Fam! Q6866xZNUfpF cxXacVajCBGb
Intermediate & Advanced SEO | | LindsayE0 -
How to solve our duplicate content issue? (Possible Session ID problem)
Hi there, We've recently took on a new developer who has no experience in any technical SEO and we're currently redesigning our site www.mrnutcase.com. Our old developer was up to speed on his SEO and any technical issues we never really had to worry about. I'm using Moz as a tool to go through crawl errors on an ad-hoc basis. I've noticed just now that we're recording a huge amount of duplicate content errors ever since the redesign commenced (amongst other errors)! For example, the following page is duplicated 100s of times: https://www.mrnutcase.com/en-US/designer/?CaseID=1128599&CollageID=21&ProductValue=2293 https://www.mrnutcase.com/en-US/designer/?CaseID=1128735&CollageID=21&ProductValue=3387 https://www.mrnutcase.com/en-GB/designer/?CaseID=1128510&CollageID=21&ProductValue=3364 https://www.mrnutcase.com/en-GB/designer/?CaseID=1128511&CollageID=21&ProductValue=3363 etc etc. Does anyone know how I should be dealing with this problem? And is this something that needs to be fixed urgently? This problem has never happened before so i'm hoping it's an easy enough fix. Look forward to your responses and greatly appreciate the help. Many thanks, Danny
Intermediate & Advanced SEO | | DannyNutcase0 -
Robots.txt - Googlebot - Allow... what's it for?
Hello - I just came across this in robots.txt for the first time, and was wondering why it is used? Why would you have to proactively tell Googlebot to crawl JS/CSS and why would you want it to? Any help would be much appreciated - thanks, Luke User-Agent: Googlebot Allow: /.js Allow: /.css
Intermediate & Advanced SEO | | McTaggart0 -
I'm a newb, built a website with Wix want to redirect it to a domain I own, but am reading that Wix is bad for this
Hi, I am building this site for my boss http://charlesfridmanpr.wix.com/real-estate and am still working on it. I'm getting close to the stage where I want to redirect it to the URL we want to use, but in reading these forums, it says that because all of subpages (?) have a # in them, they will not be read or indexed by google. I am very new to this, and while it may not look like it, the website has taken me quite a while to design. Is there a way to fix this? We want to appear high up for a non competitive keyword. Thanks
Intermediate & Advanced SEO | | Charlesfridmanpr0 -
Does Google bot read embedded content?
Is embedded content "really" on my page? There are many addons nowadays that are used by embedded code and they bring the texts after the page is loaded. For example - embedded surveys. Are these read by the Google bot or do they in fact act like iframes and are not physically on my page? Thanks
Intermediate & Advanced SEO | | BeytzNet0 -
Best way to view Global Navigation bar from GoogleBot's perspective
Hi, Links in the global navigation bar of our website do not show up when we look at Google cache --> text only version of the page. These links use "style="<a class="attribute-value">display:none;</a>" when we looked at HTML source. But if I use "user agent switcher" add-on in Firefox and set it to Googlebot, the links in global nav are displayed. I am wondering what is the best way to find out if Google can/can not see the links. Thanks for the help! Supriya.
Intermediate & Advanced SEO | | SShiyekar0 -
Google Webmasters- Message ID CDO8151
Sorry if this question is asked before, but I dont know how to deal with latest issue in one of my website. I got this warning message today in GWT: The URL (URL) is no longer appearing in Google search results because our algorithms have selected (URL) instead. Please see this Help Center article for more information about this message. Message ID: CDO81511. Please include this ID in any messages you post in our Help forum. Sincerely, Google Search Quality Team
Intermediate & Advanced SEO | | eddieodriscoll0