My automated build system is creating a duplicate website
-
Because of the tools my company is using for CI/CD (A CI/CD pipeline helps you automate steps in your software delivery process, such as initiating code builds, running automated tests, and deploying to a staging or production environment.) an extra URL is generated. The canonical for the generated site is that of our main website, but other than that it is the same website.
- Could this new URL compete with our website?
- Will Google count it against us since it is the same content BUT with canonical (it is not noindex-ed)?
- Does it matter?
- Surely others are using this method?
Answers/thoughts will be greatly appreciated. Thank you.
-
Do you have any control over the CI/CD pipeline URL?
If you control the domain enough so that you can be one to have validated and searched console them by all means. But it does not seem like you have the ability to control domain?
my correct?
https://support.google.com/webmasters/answer/7440203?hl=en
If the domain is 3ed party domain then you must trust the third-party or if you control the domain of pages which links or third-party domain URLs are embedded on you can add noindex nofollow
https://www.deepcrawl.com/blog/best-practice/noindex-disallow-nofollow/
I hope that helps,
Tom
-
Unfortunately, since URL is generated from the original site, I cannot change the robots.txt. It uses the same one as the main site. That would exclude adding a noindex meta tag, as well. Any other ideas?
Is there a way to add the duplicate URL to search console & tell google not to crawl?
Thank you.
-
I understand using CI cool
i agree get the bad content being made by CI blocked ASAP
“have an extra URL is generated. The canonical for the generated site is that of our main website, but other than that it is the same website.”
but it’s not the same content being made that will hurt you unless you’re pointing the canonicals to a similar page (get the automated content off your domain)
Remember to add using self pointing canonicals on the good pages you want to be indexed by Google or Search Engines
Hope this is of help,
Tom
-
To answer your questions:
- Technically it could compete with your current site as it's on its own domain, in reality, it's unlikely as you're canonicalizing the pages back to its original and making sure that the content itself through that way is attributed to your original site.
- What I would recommend is excluding the CI/CD site from the engines, through a robots.txt or a similar technique. That way you're making sure that the staging site itself isn't being crawled at all. In the end, I'd say there's very little upside of having that be the case currently.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate meta descriptions
Hi all,
Reporting & Analytics | | Roi_Bar
Google webmaster point me to duplicate meta descriptions problem between pages with difference URL but have the same canonical URL.
What could be the reason? https://treato.com/ParaGard,tired/?a=s
https://treato.com/ParaGard,Tiredness/?a=s
https://treato.com/ParaGard,tired/?a=s
https://treato.com/ParaGard,Tiredness/?a=s gKUSXsU1 -
Main Website Redirects to Mobile Website, Mobile Website counts this as direct traffic, is there a way to tell what the source/medium is?
Hello, The situation is that someone is arriving on my main website https://www.example.com and being redirected to http://m.example.com. When this happens my analytics says that the traffic is all direct coming to my mobile site. However, I know people clicking on my google cpc, and some google organic users are hitting the main website and being redirected. Before we didn't have as good of a redirect on our main website so I could tell organic and cpc traffic coming in, now my main website has a huge drop in these categories because they are redirecting to mobile but I can't tell on my mobile how much traffic from each is going to the mobile site. Is there a way to fix this? Is it because my main website is https:// and mobile is a http:// (as I know that sometimes makes traffic direct) or is it a bigger problem that can't be resolved? Thanks
Reporting & Analytics | | oxfordseminars0 -
Moz Crawler suddenly reporting 1000s of duplicates (BE.net)
In the last 3-4 days we've had several thousand 'duplicate content' warnings appear in our crawl report, 99% of them related to our on-site blog. The blog is BlogEngine.Net, but the pages simply don't exist. The majority seem to be Roger trying quasi-random URLs like:
Reporting & Analytics | | Progauto
/?page=410 /?page=151 Etc. etc. The blog will present content for these requests, but it is of course the same empty page since there's only unique content for up to /?Page=10 or so. Two questions: 1. Did something change recently? These blogs have been up for months, and this problem has only come up this week. Did Roger change to become more aggressive lately? 2. Suggested remediation? On one of the blogs I've put no-index no-follow for any page that has a /?page querystring, and we'll see what effect that has come next crawl next week. However, I'm not sure this will work as per: http://moz.com/community/q/functionality-of-seomoz-crawl-page-reports Anyone else had dynamic blogs suddenly blossom into thousands of duplicate content warnings? Google (rightly) ignores these pages completely.0 -
Solving link and duplicate content errors created by Wordpress blog and tags?
SEOmoz tells me my site's blog (a Wordpress site) has 2 big problems: a few pages with too many links and duplicate content. The problem is that these pages seem legit the way they are, but obviously I need to fix the problem, sooooo... Duplicate content error: error is a result of being able to search the blog by tags. Each blog post has mutliple tags, so the url.com/blog/tag pages occasionally show the same articles. Anyone know of a way to not get penalized for this? Should I exclude these pages from being crawled/sitemapped? Too many links error: SEOmoz tells me my main blog page has too many links (both url.com/blog/ and url.com/blog-2/) - these pages have excerpts of 6 most recent blog posts. I feel like this should not be an error... anyone know of a solution that will keep the site from being penalized by these pages? Thanks!
Reporting & Analytics | | RUNNERagency0 -
How serious are the Duplicate page content and Tags error?
I have a travel booking website which reserves flights, cars, hotels, vacation packages and Cruises. I encounter a huge number of Duplicate Page Title and Content error. This is expected because of the nature of my website. Say if you look for flights between Washington DC and London Heathrow you will at least get 60 different options with same content and title tags. How can I go about reducing the harm if any of duplicate content and meta tags on my website? Knowing that invariably I will have multiple pages with same content and tags? Would appreciate your advice? S.H
Reporting & Analytics | | sherohass0 -
OSE shows URLs redirecting to our custom created error page, is this a problem?.
When I check the link metrics for my product pages in OSE, it shows a message saying that the page redirects to our custom error page. This page was recently created to display when there is an error with the website. Do I need to be concerned that OSE is seeing all product pages as redirecting to this error page? Will it affect page authority etc,? I have attached a screen shot of the message that OSE displays for reference. YWbpM.jpg
Reporting & Analytics | | pugh0 -
Duplicate content? Split URLs? I don't know what to call this but it's seriously messing up my Google Analytics reports
Hi Friends, This issue is crimping my analytics efforts and I really need some help. I just don't trust the analytics data at this point. I don't know if my problem should be called duplicate content or what, but the SEOmoz crawler shows the following URLS (below) on my nonprofit's website. These are all versions of our main landing pages, and all google analytics data is getting split between them. For instance, I'll get stats for the /camp page and different stats for the /camp/ page. In order to make my report I need to consolidate the 2 sets of stats and re-do all the calculations. My CMS is looking into the issue and has supposedly set up redirects to the pages w/out the trailing slash, but they said that setting up the "ref canonical" is not relevant to our situation. If anyone has insights or suggestions I would be grateful to hear them. I'm at my wit's end (and it was a short journey from my wit's beginning ...) Thanks. URL www.enf.org/camp www.enf.org/camp/ www.enf.org/foundation www.enf.org/foundation/ www.enf.org/Garden www.enf.org/garden www.enf.org/Hante_Adventures www.enf.org/hante_adventures www.enf.org/hante_adventures/ www.enf.org/oases www.enf.org/oases/ www.enf.org/outdoor_academy www.enf.org/outdoor_academy/
Reporting & Analytics | | DMoff0 -
Open explorer is not finding my website information yet?
Majestic SEO and even yahoo sitemap see's it... why not SEOMOZ? The website is a few months old already. www.lackofsleephq.com
Reporting & Analytics | | sleepmaster0