Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Why is Amazon crawling my website? Is this hurting us?
-
Hi mozzers,
I discovered that Amazon is crawling our site and exploring thousands of profile pages. In a single day it crawled 75k profile pages.
- Is this related to AWS?
- Is this something we should worry about or not? If so what could be a solution to counter this?
- Could this affect our Google Analytics organic traffic?
-
@jacombo Amazon requires sellers to offer a minimum 30-day return window, accept returns of new, unopened items, and process refunds within two business days. Sellers must cover return shipping costs for errors or defective items. Compliance with safety and security standards is also essential.
To ensure your return policies meet these standards:
Stay Updated: Regularly review Amazon’s guidelines.
Clear Communication: Clearly state your return policy on listings.
Quality Control: Implement a rigorous quality control process.
Amazon Consulting Service: Consider using an Amazon consulting service to audit and improve your policies. -
What are the basic guidelines and requirements for third party sellers' return policies on Amazon, and how can I ensure they meet safety and security standards for buyers?
-
My website https://www.avenuessouthresidences.com also has traffic coming from Amazon bots. I wonder if it's my competitors doing something funny or it's Amazon itself doing it.
-
this is not about moz
-
- Is this related to AWS? Yes, most likely a customer on AWS scraping your site.
- Is this something we should worry about or not? If so what could be a solution to counter this? Monitor it. If the crawling get's ridiculous, you could have site performance issues. Could deny IP addresse(s) in .htacces or try via Robot.txt file.
- Could this affect our Google Analytics organic traffic? If your site performance suffers, could affect organic position and decrease traffic.
Good luck!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Crawled page count in Search console
Hi Guys, I'm working on a project (premium-hookahs.nl) where I stumble upon a situation I can’t address. Attached is a screenshot of the crawled pages in Search Console. History: Doing to technical difficulties this webshop didn’t always no index filterpages resulting in thousands of duplicated pages. In reality this webshops has less than 1000 individual pages. At this point we took the following steps to result this: Noindex filterpages. Exclude those filterspages in Search Console and robots.txt. Canonical the filterpages to the relevant categoriepages. This however didn’t result in Google crawling less pages. Although the implementation wasn’t always sound (technical problems during updates) I’m sure this setup has been the same for the last two weeks. Personally I expected a drop of crawled pages but they are still sky high. Can’t imagine Google visits this site 40 times a day. To complicate the situation: We’re running an experiment to gain positions on around 250 long term searches. A few filters will be indexed (size, color, number of hoses and flavors) and three of them can be combined. This results in around 250 extra pages. Meta titles, descriptions, h1 and texts are unique as well. Questions: - Excluding in robots.txt should result in Google not crawling those pages right? - Is this number of crawled pages normal for a website with around 1000 unique pages? - What am I missing? BxlESTT
Intermediate & Advanced SEO | | Bob_van_Biezen0 -
Problems in indexing a website built with Magento
Hi all My name is Riccardo and i work for a web marketing agency. Recently we're having some problem in indexing this website www.farmaermann.it which is based on Magento. In particular considering google web master tools the website sitemap is ok (without any error) and correctly uploaded. However only 72 of 1.772 URL have been indexed; we sent the sitemap on google webmaster tools 8 days ago. We checked the structure of the robots.txt consulting several Magento guides and it looks well structured also.
Intermediate & Advanced SEO | | advmedialab
In addition to this we noticed that some pages in google researches have different titles and they do not match the page title defined in Magento backend. To conclude we can not understand if this indexing problems are related to the website sitemap, robots.txt or something else.
Has anybody had the same kind of problems? Thank you all for your time and consideration Riccardo0 -
How is Google crawling and indexing this directory listing?
We have three Directory Listing pages that are being indexed by Google: http://www.ccisolutions.com/StoreFront/jsp/ http://www.ccisolutions.com/StoreFront/jsp/html/ http://www.ccisolutions.com/StoreFront/jsp/pdf/ How and why is Googlebot crawling and indexing these pages? Nothing else links to them (although the /jsp.html/ and /jsp/pdf/ both link back to /jsp/). They aren't disallowed in our robots.txt file and I understand that this could be why. If we add them to our robots.txt file and disallow, will this prevent Googlebot from crawling and indexing those Directory Listing pages without prohibiting them from crawling and indexing the content that resides there which is used to populate pages on our site? Having these pages indexed in Google is causing a myriad of issues, not the least of which is duplicate content. For example, this file <tt>CCI-SALES-STAFF.HTML</tt> (which appears on this Directory Listing referenced above - http://www.ccisolutions.com/StoreFront/jsp/html/) clicks through to this Web page: http://www.ccisolutions.com/StoreFront/jsp/html/CCI-SALES-STAFF.HTML This page is indexed in Google and we don't want it to be. But so is the actual page where we intended the content contained in that file to display: http://www.ccisolutions.com/StoreFront/category/meet-our-sales-staff As you can see, this results in duplicate content problems. Is there a way to disallow Googlebot from crawling that Directory Listing page, and, provided that we have this URL in our sitemap: http://www.ccisolutions.com/StoreFront/category/meet-our-sales-staff, solve the duplicate content issue as a result? For example: Disallow: /StoreFront/jsp/ Disallow: /StoreFront/jsp/html/ Disallow: /StoreFront/jsp/pdf/ Can we do this without risking blocking Googlebot from content we do want crawled and indexed? Many thanks in advance for any and all help on this one!
Intermediate & Advanced SEO | | danatanseo0 -
How to structure articles on a website.
Hi All, Key to a successful website is quality content - so the Gods of Google tell me. Embrace your audience with quality feature rich articles on your products or services, hints and tips, how to, etc. So you build your article page with all the correct criteria; Long Tail Keyword or phrases hitting the URL, heading, 1st sentance, etc. My question is this
Intermediate & Advanced SEO | | Mark_Ch
Let's say you have 30 articles, where would you place the 30 articles for SEO purposes and user experiences. My thought are:
1] on the home page create a column with a clear heading "Useful articles" and populate the column with links to all 30 articles.
or
2] throughout your website create link references to the articles as part of natural information flow.
or
3] Create a banner or impact logo on the all pages to entice your audience to click and land on dedicated "articles page" Thanks Mark0 -
Archiving a festival website - subdomain or directory?
Hi guys I look after a festival website whose program changes year in and year out. There are a handful of mainstay events in the festival which remain each year, but there are a bunch of other events which change each year around the mainstay programming.This often results in us redoing the website each year (a frustrating experience indeed!) We don't archive our past festivals online, but I'd like to start doing so for a number of reasons 1. These past festivals have historical value - they happened, and they contribute to telling the story of the festival over the years. They can also be used as useful windows into the upcoming festival. 2. The old events (while no longer running) often get many social shares, high quality links and in some instances still drive traffic. We try out best to 301 redirect these high value pages to the new festival website, but it's not always possible to find a similar alternative (so these redirects often go to the homepage) Anyway, I've noticed some festivals archive their content into a subdirectory - i.e. www.event.com/2012 However, I'm thinking it would actually be easier for my team to archive via a subdomain like 2012.event.com - and always use the www.event.com URL for the current year's event. I'm thinking universally redirecting the content would be easier, as would cloning the site / database etc. My question is - is one approach (i.e. directory vs. subdomain) better than the other? Do I need to be mindful of using a subdomain for archival purposes? Hope this all makes sense. Many thanks!
Intermediate & Advanced SEO | | cos20300 -
Export Website into XML File
Hi, I am having an agency optimize the content on my sites. I need to create XML Schema before I export the content into XML. What is best way to export content including meta tags for an entire site along with the steps on how to?
Intermediate & Advanced SEO | | Melia0 -
How to check a website's architecture?
Hello everyone, I am an SEO analyst - a good one - but I am weak in technical aspects. I do not know any programming and only a little HTML. I know this is a major weakness for an SEO so my first request to you all is to guide me how to learn HTML and some basic PHP programming. Secondly... about the topic of this particular question - I know that a website should have a flat architecture... but I do not know how to find out if a website's architecture is flat or not, good or bad. Please help me out on this... I would be obliged. Eagerly awaiting your responses, BEst Regards, Talha
Intermediate & Advanced SEO | | MTalhaImtiaz0 -
SeoMoz Crawler Shuts Down The Website Completely
Recently I have switched servers and was very happy about the outcome. However, every friday my site shuts down (not very cool if you are getting 700 unique visitors per day). Naturally I was very worried and digged deep to see what is causing it. Unfortunately, the direct answer was that is was coming from "rogerbot". (see sample below) Today (aug 5) Same thing happened but this time it was off for about 7 hours which did a lot of damage in terms of seo. I am inclined to shut down the seomoz service if I can't resolve this immediately. I guess my question is would there be a possibility to make sure this doesn't happen or time out like that because of roger bot. Please let me know if anyone has answer for this. I use your service a lot and I really need it. Here is what caused it from these error lines: 216.244.72.12 - - [29/Jul/2011:09:10:39 -0700] "GET /pregnancy/14-weeks-pregnant/ HTTP/1.1" 200 354 "-" "Mozilla/5.0 (compatible; rogerBot/1.0; UrlCrawler; http://www.seomoz.org/dp/rogerbot)" 216.244.72.11 - - [29/Jul/2011:09:10:37 -0700] "GET /pregnancy/17-weeks-pregnant/ HTTP/1.1" 200 51582 "-" "Mozilla/5.0 (compatible; rogerBot/1.0; UrlCrawler; http://www.seomoz.org/dp/rogerbot)"
Intermediate & Advanced SEO | | Jury0