Should I disallow crawl of my Job board?
-
MOZ crawler is telling me we have loads of duplicate content issues. We use a Job Board plugin on our Wordpress site and we have allot of duplicate or very similar jobs (usually just a different location), but the plugin doesn't allow us to add any rel canonical tags to the individual jobs.
Should I disallow the /jobs/ url in the robots.txt file? This will solve the duplicate content issue but then Google wont be able to crawl any of the individual job listings
Has anyone had any experience working with a job board plugin on Wordpress and had a similar issue, or can advise on how best to solve our duplicate content??
Thanks
-
Hi David! Did Dan's answer help? Let us know if there's anything else we can do to help you work this out.
-
Hi David
You can probably leave the pages as-is and allow Google to crawl them. But you may want to update the part of the content that's triggering the duplicate errors. In other words - are your title tags and meta descriptions unique for each page? Or maybe the H1's are duplicates? Since the pages do have slight differences, I would use those differences to make the content unique.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Very wierd pages. 2900 403 errors in page crawl for a site that only has 140 pages.
Hi there, I just made a crawl of the website of one of my clients with the crawl tool from moz. I have 2900 403 errors and there is only 140 pages on the website. I will give an exemple of what the crawl error gives me. | http://www.mysite.com/en/www.mysite.com/en/en/index.html#?lang=en | http://www.mysite.com/en/www.mysite.com/en/en/en/index.html#?lang=en | http://www.mysite.com/en/www.mysite.com/en/en/en/en/index.html#?lang=en | http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/index.html#?lang=en | http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/index.html#?lang=en | http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/index.html#?lang=en | http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/en/en/en/en/en/en/index.html#?lang=en | http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/en/en/en/en/en/en/en/index.html#?lang=en | http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/index.html#?lang=en | http://www.mysite.com/en/www.mysite.com/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/en/index.html#?lang=en | | | | | | | | | | There are 2900 pages like this. I have tried visiting the pages and they work, but they are only html pages without CSS. Can you guys help me to see what the problems is. We have experienced huge drops in traffic since Septembre.
Technical SEO | | H.M.N.0 -
Disallow wildcard match in Robots.txt
This is in my robots.txt file, does anyone know what this is supposed to accomplish, it doesn't appear to be blocking URLs with question marks Disallow: /?crawler=1
Technical SEO | | AmandaBridge
Disallow: /?mobile=1 Thank you0 -
Does Bing Last Crawl Update?
Do those of you using Bing Webmaster tools find your sitemap last crawl date gets updated? I noticed mine never updates, just shows the date of the last time I submitted it. It seems like it would update daily as it looks like bing crawls my site daily. See attached image. bing-screen-shot.png
Technical SEO | | Kyle Eaves0 -
Google crawl rate dropped after we activated CloudFront
Hello! Previously we've been using Amazon CloudFront for our static content (js, css etc). But to be able to reduce load on our origin servers and to be able to give our international users a good user experience we decided to deliver a couple of our sites through CloudFront. We noticed very nice drops in page load time, but when checking Google webmaster tools we noticed that all CloudFront-activated sites got a huge drop in pages crawled per day (from avg ~3500 to ~150). Also one of the sites have issues with the Google sitemaps (just marked as "Pending" in GWT) and no new pages or updated pages seems to be updated in the Google SERP. The rest of the sites gets some updates on the Google SERP, but very few compared to before CloudFront activation. Is there anybody here who have experience in full site delivery through CloudFront (or other CDNs) and effects on SEO/Google? Would be very glad for any insights or suggestions. The risk is that we need to remove CloudFront if this just continues.
Technical SEO | | Ludde0 -
Robots.txt crawling URL's we dont want it to
Hello We run a number of websites and underneath them we have testing websites (sub-domains), on those sites we have robots.txt disallowing everything. When I logged into MOZ this morning I could see the MOZ spider had crawled our test sites even though we have said not to. Does anyone have an ideas how we can stop this happening?
Technical SEO | | ShearingsGroup0 -
Mysterious drop in the Number of Pages Crawled
The # of crawled pages on my campaign dashboard has been 90 for months. Approximate a week ago it dropped down to 25 crawled pages, and many links went with it. I have checked with my web master, and he said no changes have been made which would cause this to happen. I am looking for suggestions on how I can go about trouble shooting this issue, and possible solutions. Thanks in advance!
Technical SEO | | GladdySEO0 -
Job/Blog Pages and rel=canonical
Hi, I know there are several questions and articles concerning the rel=canonical on SEOmoz, but I didn't find the answer I was looking for... We have some job pages, URLs are: /jobs and then jobs/2, jobs/3 etc.. Our blog pages follow the same: /blog, /blog2, /blog/3... Our CMS is self-produced, and every job/blog-page has the same title tag. According to SEOmoz (and the Webmaster Tools), we have a lots of duplicate title tags because of this problem. If we put the rel=canonical on each page's source code, the title tag problem will be solved for google, right? Because they will just display the /job and /blog main page. That would be great because we dont want 40 blog pages in the index. My concern (a stupid question, but I am not sure): if we put the rel=canonical on the pages, does google crawl them and index our job links? We want to keep our rankings for our job offers on pages 2-xxx. More simple: will we find our job offers on jobs/2, jobs/3... in google, if these pages have the rel=canonical on them? AND ONE MORE: does the SEOmoz bot also follow the rel=canonical and then reduce the number of duplicate title-tags in the campaigns??? Thanx........
Technical SEO | | accessKellyOCG0 -
Http VS https and google crawl and indexing ?
Is it true that https pages are not crawled and indexed by Google and other search engines as well as http pages?
Technical SEO | | sherohass0