Google Indexed a version of my site w/ MX record subdomain
-
We're doing a site audit and found "internal" links to a page in search console that appear to be from a subdomain of our site based on our MX record. We use Google Mail internally. The links ultimately redirect to our correct preferred subdomain "www", but I am concerned as to why this is happening and if it can have any negative SEO implications.
Example of one of the links:
Links aspmx3.googlemail.com.sullivansolarpower.com/about/solar-power-blog/daniel-sullivan/renewable-energy-and-electric-cars-are-not-political-footballs I did a site operator search, site:aspmx3.googlemail.com.sullivansolarpower.com on google and it returns several results.
-
You appear to have the MX sub-domain also set up as an A record.
If you have a mac / linux you can run the command: host aspmx3.googlemail.com.sullivansolarpower.com
You get the result aspmx3.googlemail.com.sullivansolarpower.com has address 72.10.48.198
Where you should get the result "not found".
I think you want to delete the A record (though check the documentation of your email provider first). You should only need them set up as MX records and shouldn't need the A record.
You've done the right thing by setting up the redirect - which should mean that the pages drop out of the index and those links disappear. (Note that there is also an https error on the aspmx3 sub-domain - but given that you don't actually want it, I don't suppose that matters that much).
Hope that helps.
-
I did not explain the problem thoroughly. The problem is, the link does not actually exist anywhere. To make a very long story short. There was an issue with server configuration for a period of a couple months. During that time, an unknown number of non-existent subdomains got indexed. Basically, if anyone had a typo in the subdomain when accessing our site, it would get cached and if Google crawled our site before we cleared the cache, the typo subdomain would get indexed. Over a period of a couple months, many bad subdomains were accidentally created and indexed by Google. We do not have any way of finding a comprehensive list of all of them. This problem has been resolved so we are not getting new bad subdomains created and indexed, but the damage has been done.
The way our site is setup currently, any attempt to reach our site with any subdomain other than "www" gets redirected to "www.sullivan..." Also, any nonsecure protocol gets resolved to https://
The actual problem, simply put is this: Google has an index which includes some number of unknown, non existent subdomains. We need to get rid of them and cannot figure out how.
Example: Copy and paste the following into google and search it:
site:aspmx3.googlemail.com.sullivansolarpower.com
Google will return two results. If you click on either, it resolves to the "https://www. version of the page.
I know it is confusing, but does that make sense? I have searched everywhere, but the reason this happened was because of a perfect storm of server configuration issues and I cannot find anyone else who has had the same problem.
If it were one or two bad subdomains, we would just put them into search console and then get "remove URL" for the entire subdomain. But it is not 1 or 2. It is at least 10 that I know of and could be hundreds for all I know.
Does anyone have any ideas? Any and all would be welcome.
Thank you.
-
You should find the locations of those links and correct them to point to the proper URL. I find that Screaming Frog's crawl is the easiest for this, you can find every link and see where they are located.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Site address change: new site isn't showing up in Google, old site is gone.
We just transitioned mccacompanies.com to confluentstrategies.com. The problem is that when I search for the old name, the old website doesn't come up anymore to redirect people to the new site. On the local card, Google has even taken off the website altogether. (I'm currently still trying to gain access to manage the business listing) When I search for confluent strategies, the website doesn't come up at all. But if I use the site: operator, it is in the index. Basically, my client has effectively disappeared off the face of the Google. (In doing other name changes, this has never happened to me before) What can I do?
Technical SEO | | MichaelGregory0 -
Blog on subdomain of e-commerce site
Hi guys. I've got an e-commerce site which we have very little control over. As such, we've created a subdomain and are hosting a WordPress install there, instead. This means that all the great content we're putting out (via bespoke pages on the subdomain) are less effective than if they were on the main domain. I've looked at proxy forwarding, but unfortunately it isn't possible through our servers, leaving the only option I can see being permenant redirects... What would be the best solution given the limitations of the root site? I'm thinking of wildcard rewrite rules (eg. link site.com/blog/articleTitle to blog.site.com/articleTitle) but I'm wondering if there's much of an SEO benefit in doing this? Thanks in advance for everyone's help 🙂
Technical SEO | | JAR8970 -
WMT "Index Status" vs Google search site:mydomain.com
Hi - I'm working for a client with a manual penalty. In their WMT account they have 2 pages indexed.If I search for "site:myclientsdomain.com" I get 175 results which is about right. I'm not sure what to make of the 2 indexed pages - any thoughts would be very appreciated. google-1.png google-2.png
Technical SEO | | JohnBolyard0 -
Will blocking the Wayback Machine (archive.org) have any impact on Google crawl and indexing/SEO?
Will blocking the Wayback Machine (archive.org) by adding the code they give have any impact on Google crawl and indexing/SEO? Anyone know? Thanks! ~Brett
Technical SEO | | BBuck0 -
301 Redirect / cross-domain canonical to a URL w/ Ampersand
I have a question regarding ampersands, we are needing to redirect to a URL w/ an ampersand in the URL: http://local.sfgate.com/b18915250/Sam-&-Associates-Insurance-Agency Will Google pass page authority/juice despite the fact that there is an ampersand in the URL, if we were to 301 redirect or cross-domain canonical to the url? Should we 301 redirect to http://local.sfgate.com/b18915250/Sam-%26-Associates-Insurance-Agency instead of http://local.sfgate.com/b18915250/Sam-&-Associates-Insurance-Agency? I don't have the option of removing the ampersand Thank you for your time!
Technical SEO | | Gatelist0 -
Fixing a wordpress 404 error for /feed and /comments/feed?
I have a wordpress site that does not have a blog currently. I'm getting a 404 for /feed and /comments/feed. Anyone know how I can fix?
Technical SEO | | DM50 -
Google seems to be penalizing my site for some reason
I recently took control of a website which did have some pretty big SEO problems, duplicate content being one of the main ones!! Looking back at ranking data the website ranked very well for it's main keyword, #5 for Google, Yahoo and Bing. The ranking then dropped in February 2012 for Google to #64 but stayed the same for Yahoo and Bing. I scrapped the dodgy content and completely rewrote it using a Wordpress framework about 6 weeks ago, still targeting the same keywords and 301 redirecting the old pages to the new pages where applicable. My rankings for Yahoo and Bing are still maintaining their page 1 rankings but Google is still ranking the website on page 5/6. My question is. Is the website getting punished for something that was part of the old website? If so how can I find out what it is and fix it? This website ranked on page 1 for Google for most of it's popular keywords but now it doesn't. I appreciate any feedback Many Thanks : )
Technical SEO | | alexhowe0 -
Google indexing directory folder listing page
Google somehow managed to find several of our images index folders and decided to include them into their index. Example: websitesite.com/category/images/ is what you'll see when doing a site:website.com search. So, I have two-part question: 1) Does this hurt our site's ability to rank in any way?
Technical SEO | | invision
Because all Google sees is just a directory listing page with a bunch of links to images in the folder. 2) If there could be any negative effect, what is the best way to get these folders out of Google's index?
I could block via robots.txt, but I'm afraid it will also block all the images in that folder from being indexed in Google image search. I could also turn off directory listing in cpanel / htaccess, but then that gives is a 403 forbidden. Will this hurt the site in anyway and would it prevent Google from indexing the images in the directory? Thanks,
Tony0