Host sitemaps on S3?
-
Hey guys,
I run a dynamic web service and I will start building static sitemaps for it pretty soon. The fact that my app lives in a multitude of servers doesn't make it easy to distribute frequently updated static files throughout the servers.
My idea was to host the files in AWS S3 and point my robots.txt sitemap directive there. I'll use a sitemap index so, every other sitemap will be hosted on S3 as well.
I could dynamically mirror the content from the files in S3 through my app, but that would be a little more resource intensive than just serving the static files from a common place.
Any ideas? Thanks!
-
My general take on this sort of scenario is first to eliminate all the redundant hostnames with round-robin DNS, through adding extra server power with software-based load-balancing in the interim with a solution like InterWorx, and breaking out database servers. If you do that, you should have a nice little server cluster that's crazy efficient.and scalable. You can add a CDN to the mix if you like as well. With all of that, SEO should work the same way as on a single server.
Sitemaps can then be generated dynamically really easily (in under 25 lines of code, most of the time).
If you just want a way to mirror static files, you'll want to look at rsync.
And finally, as for S3, my personal opinion is to stay away. I'm an SEO, but I also spent 7 years building a hosting company. Those solutions sound great in their marketing, but are scientifically less reliable than standard hosting, and you can verify that via public uptime tracking sites like HyperSpin.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Control indexed content on Wordpress hosted blog...
I have a client with a blog setup on their domain (example: blog.clientwebsite.com) and even though it loads at that subdomain it's actually a Wordpress-hosted blog. If I attempt to add a plugin like Yoast SEO, I get the attached error message. Their technical team says this is a brick wall for them and they don't want to change how the blog is hosted. So my question is... on a subdomain blog like this... if I can't control what is in the sitemap with a plugin and can't manually add a sitemap because the content is being pulled from a Wordpress-hosted install, what can I do to control what is in the index? I can't add an SEO plugin... I can't add a custom sitemap... I can't add a robots.txt file... The blog is setup with domain mapping so the content isn't actually there. What can I do to avoid tags, categories, author pages, archive pages and other useless content ending up in the search engines? 7Zo93b2.png
Technical SEO | | ShawnW0 -
Go Daddy Ultimate Hosting?
I host several websites for my clients in my Go Daddy account. I've currently got each site on it's own hosting plan (about $5-$9 per month per site) but I was recently on the phone with Go Daddy and they suggested migrating everything to an Ultimate hosting plan, which allows me to host an unlimited number of websites, sql databases, etc for a set price. Will this negatively effect my SEO as all of my sites will essentially be tied together? I can save a few hundred dollars per year, but it's definitely not worth it if all of my clients' sites tank.
Technical SEO | | socialfirestarter0 -
302 redirect used, submit old sitemap?
The website of a partner of mine was recently migrated to a new platform. Even though the content on the pages mostly stayed the same, both the HTML source (divs, meta data, headers, etc.) and URLs (removed index.php, removed capitalization, etc) changed heavily. Unfortunately, the URLs of ALL forum posts (150K+) were redirected using a 302 redirect, which was only recently discovered and swiftly changed to a 301 after the discovery. Several other important content pages (150+) weren't redirected at all at first, but most now have a 301 redirect as well. The 302 redirects and 404 content pages had been live for over 2 weeks at that point, and judging by the consistent day/day drop in organic traffic, I'm guessing Google didn't like the way this migration went. My best guess would be that Google is currently treating all these content pages as 'new' (after all, the source code changed 50%+, most of the meta data changed, the URL changed, and a 302 redirect was used). On top of that, the large number of 404's they've encountered (40K+) probably also fueled their belief of a now non-worthy-of-traffic website. Given that some of these pages had been online for almost a decade, I would love Google to see that these pages are actually new versions of the old page, and therefore pass on any link juice & authority. I had the idea of submitting a sitemap containing the most important URLs of the old website (as harvested from the Top Visited Pages from Google Analytics, because no old sitemap was ever generated...), thereby re-pointing Google to all these old pages, but presenting them with a nice 301 redirect this time instead, hopefully causing them to regain their rankings. To your best knowledge, would that help the problems I've outlined above? Could it hurt? Any other tips are welcome as well.
Technical SEO | | Theo-NL0 -
Any idea why our sitemap images aren't indexed?
Here's our sitemap: http://www.driftworks.com/shop/sitemap/dw_sitemap.xml In google webmaster tools, I can see the sitemap report and it says: Items:Web Submitted:2,798 Indexed:2,910 Items:Images Submitted:3,178 Indexed:0 Do you have any idea why our images are not being indexed according to webmaster tools? I checked a few of the image URLs and they worked nicely. Thanks in advance, J
Technical SEO | | DWJames0 -
Do we need to manually submit a sitemap every time, or can we host it on our site as /sitemap and Google will see & crawl it?
I realized we don't have a sitemap in place, so we're going to get one built. Once we do, I'll submit it manually to Google via Webmaster tools. However, we have a very dynamic site with content constantly being added. Will I need to keep manually re-submitting the sitemap to Google? Or could we have the continually updating sitemap live on our site at /sitemap and the crawlers will just pick it up from there? I noticed this is what SEOmoz does at http://www.seomoz.org/sitemap.
Technical SEO | | askotzko0 -
Generating a Sitemap for websites linked to a wordpress blog
Greetings, I'm looking for a way to generate a sitemap that will include static pages on my home directory, as well as my wordpress blog. The site that I'm trying to build this for is in a temporary folder, and can be accessed at http://www.lifewaves.com/Website 3.0 I plan on moving the contents of this folder to the root directory for lifewaves.com whenever we are go for launch. What I'm wondering is, is there a way to build a sitemap or sitemap index that will point to the static pages of my site, as well as the wordpress blog while taking advantage of the built in wordpress hierarchy? If so, what's an easy way to do this. I have generated a sitemap using Yoast, but I can't seem to find any xml files within the wordpress folder. Within the plugin is a button that I can click to access the sitemap index, but it just brings me to the homepage of my blog. Can I build a sitemap index that points to a sitemap for the static pages as well as the sitemap generated by yoast? Thank you in advance for your help!! P.S. I'm kind of a noob.
Technical SEO | | WaveMaker0 -
Is there a penalty for linking to sites that are all hosted on the same IP address?
Hi... We're doing some reciprocal link building and a gentleman has been kind enough to offer me sever additional links for the exchange. All of them (5) are on the same IP address as one of his links to which we have already linked. They are in a related field of endeavor, legal websites. If I make the swap with him, is Google going to disregard, penalize or otherwise marginalize my efforts? Thanks!
Technical SEO | | hornsbylaw0 -
Mobile sitemaps - how much value?
Hi, We have a main www website with a standard sitemap. We also have a m. site for mobile content (the mobile site only contains our top pages and doesn't include the entire site). If a mobile client accesses one of our www pages we redirect to the m. page. If we don't have a m. version we keep them on the www site. Since we already have a www sitemap, is there much value in creating a mobile site map? The mobile site (although missing all pages) is pretty robust and contains most content people are looking for. Will the mobile sitemap help for Mobile searches (more so than our standard sitemap)? I'm also planning on rel canonical the m. pages to the www. pages (per other suggestios on SEOMoz) Thanks
Technical SEO | | NicB10