Do you want the .sg site to only rank regionally in Singapore? You could use rel=alternate hreflang to designate the language/region for the two sites, and help Google more accurately know when to display which sites. This also acts as a soft canonicalization signal and tells Google that the pages are known duplicates:
Posts made by Dr-Pete
-
RE: Using canonical for duplicate contents outside of my domain
-
RE: Can someone help with Canonical?
I'm not sure what's tripping our warning (although that can occasionally be hyperactive), but I do see one problem. You're using the non-www version (which is fine, in theory), but your "Home" link goes to the "www..." version. You're sending Google a mixed signal with this. It doesn't seem to be causing major problems (no "www..." pages are in Google's index for your site), but I'd change the internal link to conform to the non-www default.
The basic problem is that Google and other crawlers (including ours) will keep visiting that "www..." and then get redirects and/or hit the canonical. It creates an odd loop that could waste crawler bandwidth and sends a mixed signal about your intent.
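In other words, with the non-www version as your default, the nav link should match it - something like
<a href="http://example.com/">Home</a>
(using example.com as a stand-in for your domain) rather than pointing at "http://www.example.com/". The existing redirect/canonical will still catch stray "www" visits, but your internal links shouldn't lean on it.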
-
RE: Sitemap created on client's Joomla site but it is not showing up on site reports as existing? (Thumbs Up To Answers)
Sorry, when you say "run site reports", are you referring to Google Webmaster Tools? You'll have to either submit that dynamic URL to GWT or use a work-around, as Thomas mentioned, to map it to something a bit more standard. For GWT, they don't care what the filename is (I've submitted dynamic, PHP-based sitemaps). For other tools, it's possible the tool is looking for a specific filename.
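If you do go the work-around route and you're on Apache, a simple rewrite can expose the dynamic sitemap under a standard filename - a rough sketch, with the Joomla query string below as a placeholder (swap in whatever URL your sitemap component actually generates):
RewriteEngine On
RewriteRule ^sitemap\.xml$ /index.php?option=com_xmap&view=xml [L]
That way, any tool that insists on "/sitemap.xml" can still find it, and GWT will accept either version.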
-
RE: Google contradictory communications about manual action being applied
Sorry, I'm completely confused - you said that there's no notice of a manual action in Google Webmaster Tools, but the manual action page isn't empty and Google is giving you examples of URLs? If the page isn't empty, then you have a notice of manual action. I think I'm missing something. Did the message come and go in GWT?
-
RE: Do we need to remove Google Authorship from the blog?
Generally agree with everyone (well, everyone so far) - there's no need to remove authorship in general. Google isn't penalizing sites with authorship - they're just showing authorship thumbnails less often, as they found they were overdoing it (at least in the eyes of users).
The only exception I'd give is to be careful about applying authorship tags to every page on a site - home page, search results, product pages, etc. Some CMSs go crazy, unfortunately. This is technically against Google authorship guidelines. Again, most likely, they'll just ignore your authorship mark-up (it's not a penalty situation), but if you want that mark-up to appear in SERPs, then use it appropriately.
-
RE: Disavow
I think actual removal of the worst offenders is a better step all-around. Leave Google out of it (at least in a direct sense).
I think there is such a thing as "natural" sitewide links. For example, I once made a comment on a pretty well-trafficked blog, and that got promoted to a "top comment" which went into a sitewide sidebar. I got a lot of links from that site for a while, and it was fine. Google can see that it's just one site.
It's when you're obviously using this tactic repeatedly that the trouble starts. In most cases, I think it's just devaluation. There's not, in theory, anything wrong with a design company putting "Designed by Company X" in a footer to drive some business. Google doesn't value those links, because they're too easy to get in some industries. On the other hand, you aren't necessarily trying to cheat the system. In general, I'd be cautious these days, but sitewide links aren't an instant penalty.
-
RE: Disavow
That's a really tough call. Truthfully, even links that could potentially harm you down the road might actually be helping you short-term. So, you could proactively disavow the sitewide footer links and lose ground, even if it's a smart long-term move.
In most cases, these links are devalued, so odds are that they aren't helping you much and the risk is low. I'd just urge caution. You may want to ease into it with links from a few sites that you know for a fact are low quality.
-
RE: Dealing with Redirects and iFrames - getting "product login" pages to rank
Could you explain the business model a bit? If people can land directly on the product page, why do you want to take them to the log-in page? What are they logging into? Is the product an actual, physical product, or a service, SaaS, etc.? I realize you may not be able to provide all details, but if I can understand the business logic a bit, that would help. This is as much a business/conversion issue as a technical SEO issue.
-
RE: Disavow
To answer the technical aspect of the question, Google does not seem to store previous files or maintain a historical record of disavowed links. You should consider the current submission to be the only submission, and keep it as complete as possible.
I'm not overly concerned about disavow coming back to bite people (I haven't seen any reports of that floating around the industry), but down the road, I think Robert's concerns are valid. If you're constantly adding new links to disavow, Google may start to wonder if there's anything fishy going on.
Maybe the bigger question is - why are people consistently building these links? Is this a negative SEO attack, or is something about your business model, content, structure, etc. creating quality problems in your link profile? If there are deeper issues, it's better to address them than to put a bandage over them.
-
RE: Comparing Domain Authority Scores
Talked to Dr. Matt, and he said that, if you just want an estimate, take the log (base 10). You'll get a value from zero to two that will be roughly linear, and then you can scale it up to whatever range you need.
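To make that concrete (assuming the input is the 1-100 score itself): log10(10) = 1.0 and log10(100) = 2.0, so a 10 and a 100 end up one unit apart instead of ninety. If you want the result back on a 0-100 scale, multiply by 50 - e.g., a raw 40 becomes log10(40) x 50, which is about 80.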
-
RE: Rel canonical between mirrored domains
Wow - that's a huge impact. It's hard for me to believe this one change would have such an impact, but hopefully these new numbers stick.
-
RE: Disavow Links Notification
I'm not 100% sure, but their messaging on this has been pretty inconsistent, and it's clear they still treat the disavow tool as a beta. If you didn't get any errors and the file seems to be in place, you're probably ok, and it's just going to take some time. I wouldn't take lack of confirmation as necessarily a bad sign, as long as you know the file format was ok. At best, they acknowledge receipt - the timeline of when they actually disavow and when that could impact a Penguin problem could be weeks or months, unfortunately.
-
RE: Rel canonical between mirrored domains
Happy to help - hope it does the trick.
-
RE: Rel canonical between mirrored domains
I'm not sure there's a one-size-fits-all answer. If the .com is more geared to an audience outside of Singapore, and the .edu.sg site is more geared to the local audience, you could set a region and/or language with rel="alternate" hreflang:
https://support.google.com/webmasters/answer/189077?hl=en
This is a somewhat more subtle canonicalization signal that Google can use to sort out sites with language and/or regional copies (the regional aspect may be relevant even if both sites are in English).
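As a rough sketch (placeholder URLs), each home page would reference both versions, something like:
<link rel="alternate" hreflang="en" href="http://www.example.com/" />
<link rel="alternate" hreflang="en-sg" href="http://www.example.edu.sg/" />
...with the .com tagged as the general English version and the .edu.sg tagged for English/Singapore. Treat the exact language/region codes as an illustration - they depend on how you want to split the audiences.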
The next question would be: where does your traffic come from? If you want to consolidate, but most of your traffic is coming from outside of Singapore, then I'd probably stick with the .com - it still has a "generic" status. The .edu.sg may rank more strongly in Singapore but fall off everywhere else.
I wouldn't worry much about the DMOZ link, especially if you have a solid link profile. DMOZ links have gotten buried over the years and typically don't carry nearly the value most people think. At some point, they could be a solid boost to a new site with a small link profile, but I think even those days are well behind us.
-
RE: Can't get auto-generated content de-indexed
I don't think there's any harm in submitting a new/full list, even if it duplicates past lists. The URLs haven't been removed, and you did fix the tags. This isn't like disavowing links - it's more of a technical issue. Worst case, it doesn't work, from what I've seen.
-
RE: Can't get auto-generated content de-indexed
It honestly sounds like you're on the right track - you do need to explicitly mark those (and META NOINDEX should be fine). Could you just request removal for all private pages? Worst case, Google removes some that aren't in the index, or attempts to. Since the public/private setting can be changed, you can't really put the private pages all in one folder (real or virtual) - that would make life easier, long-term, but probably isn't useful/appropriate for your case.
I'd also recommend having a clean XML sitemap with just the public entries (updated dynamically). That won't deindex the other pages, but it's one more cue Google can use. You want all of the signals you're sending to be consistent.
I agree with Doug, though - this is really tricky, because ideally you would want people to share these pages, and if you NOINDEX them then you're losing out on that. My gut feeling is that, until your site is stronger, you probably can't support 3K near-duplicates (and counting). If you want to get sophisticated, though, you could apply the NOINDEX dynamically, only to posts that have very little content or are obvious dupes. As people fill out or share a product, you could remove the NOINDEX.
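For reference, the tag itself is just
<meta name="robots" content="noindex, follow">
placed in the <head> of each private/thin page. For the dynamic version, your template would only emit that line while a post is private or still thin (below whatever content threshold makes sense for your data - that part is a sketch on my end), and drop it once the post fills out or goes public.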
-
RE: Authorship's Back. Could a custom URL be why?
Interesting - thanks. It's a bit hard to pin down, because Google has been changing the "volume" on authorship mark-up a lot these past two months, and that means both up and down on any given day. Authorship also seems to be page-based and probably query-based, which means any given site could show the mark-up for some pages or queries and not for others. It's a real-time evaluation on Google's part.
-
RE: Do you choose PA/DA over PR when purchasing expiring domains?
Obviously, we at Moz have some bias toward PA/DA, but I think there's a lot of danger in focusing on any one metric when buying a domain. For example, PA/DA look more at the strength of your link profile than at certain spam factors, so we may miss it if a domain has been penalized or devalued (something we're actively working on). The raw strength of a link profile, even without mitigating circumstances, can be misleading, too - for example, if the domain isn't relevant to the new use, then buying and repurposing it could result in a site with substantially lower value than the original.
The biggest issue with (toolbar) PR is that it's often out of date. It can also be gamed. Shady domain sellers can do things like mass 301-redirects that make a domain look stronger for a while (until the toolbar PR gets updated) but disappear as soon as they pull the plug - usually after you've written the check.
Long story short, it depends a lot on your use case, and I'd strongly recommend against relying on just one metric.
-
RE: Hi guys What the best way to adress duplicate content on photo gallery?
When you say that "each one has its own canonical tag to its own individual page", do you mean that "./name-of-the-page-categoryone" has a canonical pointing back to "/name-of-the-page"? Are there other variations (sorts, filters, etc.)? Typically, it's best not to have different categories land you on custom URLs with the same content. If, for example, a product appears in multiple categories, you should still land on one unified URL when you actually click through to the product, regardless of the path/category used to get you there.
That said, rel=canonical should work, and our tools should generally honor it. I have a feeling that there's something more complicated going on here.
-
RE: Should I Disavow Links if there is No Manual Action
There are many link-based algorithmic actions that can hit a site, including Penguin, so just because you don't have a warning of a manual action doesn't mean that you're not in trouble. I can't see the data, but this doesn't seem to be speculative - you're basically saying that certain pages and keywords have clearly been devalued.
If you've ruled out technical issues with the pages (including duplicate/thin content issues), and you know the spammy links are targeting these pages, then I think disavowing is probably a good way to go. Ideally, try to have some of the links removed first, as that will make Google take the request more seriously (and disavow is basically a request, although it's semi-automated).
It's entirely possible, too, that you're already headed for a manual action, so even if you do nothing, the situation could get worse. If you were unaffected, I might suggest pruning some of the bad links and focusing your future link-building on better tactics, but you're already taking damage.
-
RE: Dropped ranking - new domain same IP????
I think many of the commenters have good points here. It's really tough to make a call like this without a deep knowledge of the situation, and for any of us to tell you what to do would probably be irresponsible. Generally, I don't think Google transfers penalties across IPs like they used to occasionally do. With IPv4 space running out, sharing IPs is just a lot more common than it used to be. Google also has other cues, like domain ownership (they're technically a registrar, so they have access to a lot of data) to go by.
To be safe, you could isolate it and get a new IP. I'm not sure it's necessary, but if you're going to go so far as to start over, you might as well do it as cleanly as possible.
The question is whether starting over with a new domain will solve the problem. If you want to avoid the penalty transferring, you can't 301-redirect the old site, which means that you'll lose all of your link equity and leave past visitors stranded. That's a huge loss to take, and it's going to take time to rebuild (as Michael B. said). Will the content be the same? There may be other aspects of the site that caused you problems, and if they're related to the content or site structure, they could just come back.
-
RE: Should I deindex my pages?
Glad to hear it! Yeah, patience isn't easy, that's for sure.
-
RE: Do you think this site has been hit by penguin?
First, I have to second Marie - the easiest way right now to detect a Penguin drop is to look at the publicly released update dates and compare them to your search traffic. Penguin often hits hard and fast - it's not usually subtle.
One thing I think you have to be careful about is your definition of "quality" vs. Google's. For example, diversity matters to Google. If you guest blog on generally decent sites (non-spammy, for lack of a better word), but that's your only link-building tactic, you could still be in trouble. It's just not natural, in Google's eyes, and they're going to think you're only guest blogging to get links (which may be the truth). It's not that your posts are bad, but you're relying too much on one tactic to the point that it could look manipulative. I'm not saying this is what you're doing - just making a general observation (and one that affects a lot of people right now, IMO).
At first glance (and, please note, link profile analysis can be tricky), I'm seeing a lot of keyword-loaded anchor text. It's a bit tricky, since you have an EMD (so your "brand" is keyword-based), but you may be pushing the variants of "banner(s)" too hard. Again, that can be a sign of artificial link building patterns.
Frankly, you've also got some links that look outright spammy. Take this one, for example:
http://www.constructionindustryscheme.org/
Your link is at the bottom (not even really in a footer) with no context or relevance to the page. This looks incredibly artificial.
-
RE: Help with redirects
I'll put the content/user considerations aside for a moment, since I can't comment on content I can't see, and focus on the technical SEO aspect. You could reverse the 301-redirect, and send it back to the old page, but I'll be honest - it's likely to take a while for Google to process it. They're likely to be confused by the reversal.
It's a tough call, but what I think I'd do in this case is the following. Let's say the original URL is (A), the "new" URL is (B). Currently, you're 301-redirecting (A) --> (B), and now you're going to put the content back on (A)...
STEP 1
Remove the 301-redirects
Rel-canonical (B) --> (A)
Rel-canonical (A) --> (A)
Changing the signals from a 301-redirect to rel-canonical may help nudge Google. It will also allow you to place a self-referencing canonical on (A), which could help offset the old 301-redirect.
STEP 2
Once the URLs are cleared, put a 301-redirect back in place from (B) --> (A)
You could also just leave the canonicals, if they seem to be working. These situations often require some monitoring, to make sure Google is processing the new directives correctly. Give it time, though - don't panic and change things every week, or you could make the situation worse.
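To put placeholder URLs on those two steps (example.com standing in for your domain): during Step 1, both (A) and (B) would carry
<link rel="canonical" href="http://example.com/page-a/" />
in the <head> - pointing across to (A) on page (B), and self-referencing on page (A). For Step 2, the eventual redirect on Apache could be as simple as
Redirect 301 /page-b/ http://example.com/page-a/
The paths are made up, obviously - it's the pattern that matters.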
-
RE: Should I deindex my pages?
Just checking in - has the situation improved in the month since you posted the question? I tend to agree with Thomas that it's usually just a waiting game, assuming the 301-redirects are working properly. It never hurts to use a header checker, just in case (it's amazing how often redirects get implemented poorly).
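If you don't have a favorite header checker, even curl from the command line does the job - something like curl -I -L http://example.com/old-page (placeholder URL) shows each hop and its status code, so you can confirm you're getting a single clean 301 to the final destination.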
You could re-create the old sitemap if the transition is stalled. I'd avoid actively removing the old URLs, as that could interfere with making sure the link equity from the old URLs passes to the new ones. The only issue would be if you suspect duplicate content problems are harming you.
The devil is in the details on these situations, and it depends a lot on the size of the site, etc.
-
RE: Canonicals for product pages - confused, anyone help?
I agree with Lynn, but I'm a little confused about the intent. If you create the new URLs with product categories in them, you'll need to move the old URLs somehow, such as with 301-redirects. The new canonical tags won't help those old URLs, so you're potentially creating even more duplicate content by creating a new canonical version.
Generally, I don't think adding categories to the URLs is a great idea. You can squeeze in a few more keywords, but the impact of that in 2013 is very small. As you said, you're also making the URLs longer and you're pushing back the unique keywords. So, Google is going to see more repetition toward the beginning of the URL and less unique information (as are users, although most people don't read URLs closely, IMO).
-
RE: Best way to implement canonical tags on an ecommerce site with many filter options?
This is generally an exception Google supports - for example, they say that you can use rel=prev/next and rel=canonical in conjunction, where one handles pagination and the other handles sorts/filters. In the case of a sort (like ascending/descending) the actual results could be very different, but the intent is still legitimate. They generally understand you're trying to clean up these pages.
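As a quick sketch (made-up URLs), page 2 of a filtered category might carry:
<link rel="canonical" href="http://example.com/widgets?page=2" />
<link rel="prev" href="http://example.com/widgets/blue?page=1" />
<link rel="next" href="http://example.com/widgets/blue?page=3" />
...where prev/next keep you within the current (filtered) series and the canonical drops the filter. That's one common way to wire it up, not the only valid one.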
In a perfect world, these filters wouldn't create unique URLs, honestly, but now that they already exist, you have to manage them. The other option would just be to META NOINDEX those filter URLs or set them up in parameter handling in Google Webmaster Tools. I tend to prefer the canonical here, personally.
-
RE: Disavow wn.com?
Good point by Kevin, too, that it does depend on the rest of your link profile and how solid it is. If you have thousands of linking root domains, just one domain isn't going to make or break you. Your overall profile is the key.
-
RE: Disavow wn.com?
First, to Kevin's question, a high DA doesn't mean a site isn't spammy. It means the site has a lot of seemingly high-authority links (or just a large link profile from generally large sites, or a healthy mix). Some of the modelling controls for quality, but not necessarily for spam factors - which is something we're actively working on.
I suspect the "articles." sub-domain carries less authority than the overall root domain, but it's tough to say. With so many links, you're probably getting some credit from the root domain.
Unfortunately, the weight of any one link or even 2,000 links from one domain is almost impossible to measure. So, it comes down to a risk/reward scenario. Are you just proactively cleaning things up, or are you fighting a serious fire, like an outright penalty that's killing traffic? If you're being proactive, I'd probably leave this alone, especially if you haven't solicited these links, paid for them, etc. If you're fighting a serious penalty, then you need to risk cutting deep, especially if you're doing a Penguin recovery.
-
RE: Matt Cutts or No Action Is Required
Google tends to downplay the existence of negative SEO, because: (1) frankly, it makes them look bad, and (2) people tend to over-estimate how common it is (we all like to blame the competition). In most cases, if Google thinks a link is malicious and being created by a third-party to harm you, they'll just devalue it.
The problem is that Google is far from perfect at detecting who created a link. If they had that down, they'd be a lot better at dealing with spam. So, I wouldn't trust them to simply figure it out. If a link or linking domain is clearly bad, and especially if you've suffered a penalty, be proactive.
The very fact that Google is warning you in Webmaster Tools suggests that they haven't simply devalued these links. They think that the links are suspect.
-
RE: Can I dissavow links on a 301'd website?
I tend to agree with Federico's concerns. If the 301 transfers a penalty, the impact could be long-term, and it could be harder to rescue site B. The short-term ranking gains may not be worth it.
Google hasn't been clear on how this operates with 301 redirects. John's suggestion to disavow on both sites seems safe. Worst case, it's wasted effort, but it's not much effort (once you've built one file, building two is easy). Still, you've got to wait for that to process, and if the algorithmic penalty is something like Penguin, then you'd have to wait for a data refresh. This could take months, so I'd be really hesitant to risk site B until you've cleaned up the mess.
Once you disavow to site A, the 301-redirect should be fairly safe, but it does depend on the extent of the penalty. The risk/reward trade-off is definitely a "devil is in the details" sort of situation.
-
RE: Algorithm Penalty?
I'm getting a 403 error for the entire site at the moment (?) Could you double-check the URL?
Interesting that you mentioned geo-targeted pages and that this happened around when Panda hit. Do you have a ton of pages of the form "[brand] [city]" (dozens or hundreds, maybe)? If those pages are very similar - with only a few details switched out - that's definitely the kind of thin content that Panda hit pretty hard. Google views those pages as low value, and can devalue the entire domain based on that perception. This wouldn't appear as an error in GWT, since it's an algorithmic action.
Unfortunately, I can only speculate without taking a peek at what Google has indexed.
-
RE: Should I change the URL now?
Yeah, I'm with Federico - 100 doesn't seem like a ton, unless your total link profile is very small. I'm not clear on what you're trying to accomplish with changing the category in WordPress - is it the category page that's a problem? If the links are all to one page and you can live without that page, you could let it 404. If it's a category page, then yes, I guess you could change the URL. Just don't 301-redirect the old URL to the new one, in this case, because you'd carry the links and any penalty.
If this was something like Penguin, you'd still have to wait for a data refresh. If it's a manual penalty, you'd need reconsideration. So, even total removal may not instantly fix the situation.
-
RE: Duplicate content on partner site
Cross-domain canonical is the most viable option here. As Mike and Chris said, it is possible for Google to ignore the tag in some cases, but it's a fairly strong suggestion. There are two main reasons I'd recommend it:
(1) Syndicated content is the entire reason Google allowed the use of rel=canonical across domains. SEOs I know at large publishers have used it very effectively. While your situation may not be entirely the same, it sounds similar to a syndicated content scenario.
(2) It's really your only viable option. While a 301-redirect is almost always honored by Google, as Chris suggested, it's also very different. A 301 will take the visitors on the partner site page directly to your page, and that's not your intent. Rel=canonical will leave visitors on the partner page, but tell search engines to credit that page to the source. Google experimented with a content syndication tag, but that tag's been deprecated, so in most cases rel=canonical is the best choice we have left.
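For reference, the tag on the partner's copy would look something like (placeholder URL):
<link rel="canonical" href="http://www.yoursite.com/original-article/" />
...placed in the <head> of their version of the page and pointing at your original URL, so visitors stay where they landed but the ranking credit consolidates to the source.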
-
RE: Getting links from spammy websites on the same IP
Shared IPs are a bit tough. It used to be the case that, on rare occasions, Google might accidentally penalize a domain if another domain on that IP had a spammy history, carried a penalty, etc. These days, that seems very unlikely, probably because shared IPs are a lot more common. Once IPv4 addresses started running out, many legitimate sites ended up sharing.
On the other hand, it is still more likely for links to be devalued or look suspicious if they come from the same IP, because that's often how low-quality link networks are built. In most cases, I suspect Google would simply devalue those links and not outright penalize the domains, but I'm not certain either way.
The tough part is that it's hard to separate whether you're being hit for the links themselves, for them coming from a shared IP, or for something entirely different. In a perfect world, I'm a fan of having a unique IP, but my gut says that's probably not going to magically fix things here. Since they're all coming from one site, and if they're definitely spammy, I'd go the disavow route first (especially since you can disavow the entire domain). Give it a couple of weeks, and then you can try to submit for reconsideration. If/when you do, I'd specifically explain why you have a shared IP and that your site has no affiliation with this other site. As others said, reconsideration is generally for manual penalties, but it's unclear from your statements whether this penalty is manual or algorithmic. Disavow seems to be applicable to some cases of either situation, although Google has been a bit inconsistent on this topic.
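For what it's worth, if you do disavow, the file itself is just plain text - a sketch with a placeholder domain:
# spammy site on our shared IP - no affiliation with us
domain:spammy-site-example.com
The "domain:" prefix covers every link from that site, so you don't have to list individual URLs.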
At the same time, make sure nothing else is going on with your link profile. For links from just one domain to cause a serious penalty is very rare.
-
RE: Using unique content from "rel=canonical"ized page
Yeah, I tend to agree with Maximilian and Mike - I'm not clear on the use-case scenario here and, technically, pages 1 and 2 aren't duplicated. Rel=canonical probably will still work, in most cases, and will keep page 2 from looking like a duplicate (and from ranking), but I'd like to understand the situation better.
If Google did honor the canonical tag on page 2, then the duplication between pages 2 and 3 shouldn't be a problem. I'm just thinking there may be a better way.
-
RE: How to bulk check the ranking for more than 800 domains?
Sorry, I don't think we're clear (judging by other responses) on what you're trying to check. Are you looking for some core stats on those 800 domains, or do you have a list of keywords for each domain that you want to check rankings for? If it's a list of keywords, how many total (ballpark) keywords are you talking about?
-
RE: Is this a Penguin hit; how to recover?
There was a big local shake-up around 10/25 - were you potentially ranking in local markets for some of these phrases (they don't seem like local queries, but I just thought I'd ask)? I'd have to agree with Marie that the timing suggests this wasn't Penguin, but Google hasn't been very forthcoming about Penguin data refreshes.
Your link profile looks pretty clean to me, and your site isn't really large enough to have large-scale content issues. Those services pages look a little keyword-targeted and border on thin, but if you're talking about a handful of them, I doubt it's enough to cause you serious problems. If you rolled out hundreds of them that were all just variations on the same core keyword phrases, that would be different.
On the other hand, if those pages specifically target the terms that dropped, it's hard to ignore that fact. Did you do any targeted link-building to those pages? I'm seeing one weird thing - OSE is showing a ton of recent links from Scoop.it - across a large number of pages - but when I check the source site, I'm not seeing any links. It's possible you got a temporary boost from some links that were since removed, but that's really hard to track.
-
RE: How to associate content on one page to another page
Yeah, I'm afraid Chris is right. There's really no way to tell Google to index both pages but then not give them control over which one ranks. Google is naturally going to prefer the full content page, because they want to get people to the best "answer", in a sense.
Truthfully, I think it's a better search user experience, in most cases. Internal search users can travel from the snippet to the post, but search users may get frustrated at going from Google's results to your results, and not straight to the resource. If you force the first step on search users, you may actually increase your bounce rate and harm your overall performance.
-
RE: I sent a reconsideration request after submitting a disavow list. Response to the reconsideration request had examples of the very same domains listed in the submitted disavow list. Was my disavow list not taken into consideration? What next?
Wish I could add some clarity, but I'm seeing what everyone else is seeing - the process is opaque, inconsistent, and takes time. If you've made an effort to remove links, disavowed properly, and outlined the removals in the request, then I'd give it time. What was the time lapse between the disavow and the reconsideration request?
-
RE: Penguin 2.0 Recovery - Penguin Update Rerun yet or not
Yeah, Penguin 2.1 was confirmed. We're not clear if there are data updates outside of the official updates or not - Google hasn't been very forthcoming on that. I have heard of recoveries since 2.0, though.
In general, recovery stories are very limited. Penguin is brutal, and the people I know who have been successful have had to make deep cuts. Also, keep in mind that if you cut deep but don't make the right cuts, that may not work either. It's a difficult road, and it depends a lot on the depth of the problem and whether the site has enough good links and a solid enough base for Google to take the link removals seriously.
-
RE: Is there an efficient way to use Open Site Explorer to find unnatural or harmful links
I'm honestly not sure if there's a great way to use that process to drill back into the links to see which ones are low quality - it was intended more as a big picture look at your entire profile. We're actively working on more spam detection metrics, but it's a tricky business and we're really concerned about potentially giving people false alarms, especially with so many people hacking at their links.
I know some folks use OSE to export links and pull out specific bits of data, like very low PA pages, pages with exact-match anchor text on keyword-heavy phrases (non-brand), etc., but even then you've got to be a bit careful. A low PA page isn't necessarily "bad", just relatively weak, from a link perspective.
One spam marker we've found pretty relevant is sites with disproportionately low MozTrust relative to their MozRank (or to their DA). Sites with high authority but low trust may have spammy link profiles in general. These are the kinds of factors that 3rd-party link evaluation tools look at. Some of them are a good starting point, but you still need to fact-check.
I'd definitely start with some kind of domain-based analysis. Your 40K links might be coming from 4K domains, or even 400. Getting a broad sense of linking-domain quality will be an easier starting point.
-
RE: Canonical nightmare! Help!
Yeah, a status code of 200 is generally a good thing. Could you direct message me through the site and tell me which campaign you're seeing the errors on? I can log in and try to take a deeper look.
-
RE: Tweet favourites or re-tweets, how does it affect ?
I think it's important to note that we currently have no strong evidence that tweets (favorites or RTs) directly impact rankings. Google cut off the Twitter "firehose" data and claims they don't factor in social as a direct ranking factor (even Google+). I think "direct" is an important word there, and that makes sense - social is relatively easy to manipulate, at least in terms of raw signals. They're still trying to figure out the right mix.
That caveat aside:
(1) I agree with Ratan that RT's are generally more advantageous indirectly. They expose more people to your tweet, and those people will click through, drive up engagement, and potentially link to you. Eventually, this can have an indirect but very real impact on SEO. It's unlikely that favoriting has much impact even indirectly, IMO.
(2) Social signals, like RTs, can definitely be used by Google for indexing new content. Content posted on G+, for example, is indexed incredibly fast from decent accounts (not just big names, but any account that's clearly real). This isn't "ranking" per se, but you can't win if you don't play, so it matters.
(3) I strongly suspect social will be a corroborating layer, if it isn't already. In other words, if you have a piece of content with 500 +1s but no tweets, no Likes, and no links, that's going to look like spam to Google. If that same piece shows signals across several dimensions, then those 500 +1s may have an impact. RTs may eventually be part of that equation (personally, I don't think they are right now).
-
RE: Google Webmaster tools: Sitemap.xml not processed everyday
My initial reaction was that this is more likely technical than something Google is doing - checking the load-time is a good idea. Make sure the sitemap validates and there's nothing odd about it. If you manually re-submit it, does it seem to take?
-
RE: Appropriate use of rel canonical
Yeah, I'd really tackle this error first, as the other one could be a false alarm. It sounds like you've got multiple canonical tags on a single page, which Google can't interpret very well (they might just ignore them or use the wrong one). This often indicates that your CMS is double-placing tags and could signal broader problems.
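To illustrate (placeholder URLs), a page that ends up with
<link rel="canonical" href="http://example.com/page-a" />
<link rel="canonical" href="http://example.com/page-b" />
in its <head> is sending two conflicting hints, and Google may ignore both or pick the one you didn't intend.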
-
RE: Is Inter-linking websites together good or bad for SEO?
Google's advice on this is a bit vague, and the practical consequences can vary a lot. Linking together a couple of sites is usually fine - linking together dozens or hundreds could get you marked as a link network and get all of your sites penalized.
Usually, as Richard and James said, it's more that Google will simply devalue the links, especially if those sites share ownership/hosting/etc. It's just too easy to cross-link your own properties. I don't think getting too fancy with hosting, C-blocks, etc. is the answer. That's a lot of work, and Google can still connect you on ownership and other cues. To erase all of those cues is a lot more time, effort, and money than links from a couple of sites are really worth.
The best advice I can give is that, if you cross-link, do it in a way that's clearly of value to users. In other words, just linking these sites to each other in the footer is almost guaranteed to get those links ignored by Google. If, however, you can link specific content to directly relevant content on another site, they're much more likely to let those links carry equity, and that's going to be valuable for your visitors and let them usefully traverse your sites. So, think of it more as a CRO task - how can you get visitors from one site to meaningfully engage and convert on your other sites? If you can do that, and if you're only talking about a handful of sites, you have some chance at making those links carry value.