Should I disallow crawling of my job board?
-
The Moz crawler is telling me we have loads of duplicate content issues. We use a job board plugin on our WordPress site and we have a lot of duplicate or very similar jobs (usually differing only by location), but the plugin doesn't allow us to add rel=canonical tags to the individual jobs.
Should I disallow the /jobs/ URL in the robots.txt file? This would solve the duplicate content issue, but then Google won't be able to crawl any of the individual job listings.
Has anyone worked with a job board plugin on WordPress and had a similar issue, or can anyone advise on how best to solve our duplicate content problem?
Thanks
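To make the trade-off in the question concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt rules, the `example.com` domain, and the job URLs are all placeholders assumed for illustration, not taken from the actual site. It only shows how a compliant crawler would interpret a `Disallow: /jobs/` rule.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content matching the proposal in the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /jobs/
"""


def allowed(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules above let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URLs: every path under /jobs/ is blocked, not only the duplicates.
print(allowed("https://example.com/jobs/sales-manager-london/"))
print(allowed("https://example.com/about/"))
```

Under these assumed rules, the job listing is blocked and the other page is not, which is the side effect the question is worried about: the rule cannot tell duplicate listings from unique ones.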
-
Hi David! Did Dan's answer help? Let us know if there's anything else we can do to help you work this out.
-
Hi David
You can probably leave the pages as-is and allow Google to crawl them. But you may want to update the part of the content that's triggering the duplicate errors. In other words, are your title tags and meta descriptions unique for each page? Or maybe the H1s are duplicates? Since the pages do have slight differences, I would use those differences to make the content unique. For example, if two listings differ only by location, include the location in each listing's title tag and H1.
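One way to check which of these elements are actually duplicated is to group pages by their title, H1, and meta description. The sketch below is only an illustration using Python's standard `html.parser`. The `HeadFields` class, the `find_duplicates` helper, and the sample pages are hypothetical, and it assumes you already have each page's HTML as a string.

```python
from collections import defaultdict
from html.parser import HTMLParser


class HeadFields(HTMLParser):
    """Collect the <title>, first <h1>, and meta description of one page."""

    def __init__(self):
        super().__init__()
        self.fields = {"title": "", "h1": "", "description": ""}
        self._current = None  # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "h1") and not self.fields[tag]:
            self._current = tag
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.fields["description"] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current:
            self.fields[self._current] += data


def find_duplicates(pages):
    """Map (field, normalized value) to the URLs sharing it (2+ pages only)."""
    groups = defaultdict(list)
    for url, html in pages.items():
        parser = HeadFields()
        parser.feed(html)
        parser.close()
        for field, value in parser.fields.items():
            normalized = " ".join(value.split()).lower()
            if normalized:
                groups[(field, normalized)].append(url)
    return {key: urls for key, urls in groups.items() if len(urls) > 1}


# Hypothetical job pages that differ only in the H1.
PAGES = {
    "/jobs/driver-leeds/": "<title>Delivery Driver</title>"
    "<meta name='description' content='Apply now'><h1>Delivery Driver in Leeds</h1>",
    "/jobs/driver-york/": "<title>Delivery Driver</title>"
    "<meta name='description' content='Apply now'><h1>Delivery Driver in York</h1>",
}

for (field, value), urls in find_duplicates(PAGES).items():
    print(field, repr(value), urls)
```

In this made-up sample the titles and descriptions are reported as shared while the H1s are not, which would suggest adding the location to the title tag and description as well.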
Related Questions
-
Good to use disallow or noindex for these?
Hello everyone, I am reaching out to seek your expert advice on a few technical SEO aspects related to my website. I highly value your expertise in this field and would greatly appreciate your insights. Below are the specific areas I would like to discuss:
a. Double and triple filter pages: I have identified certain URLs on my website that have a canonical tag pointing to the main /quick-ship page. These URLs are as follows:
https://www.interiorsecrets.com.au/collections/lounge-chairs/quick-ship+black
https://www.interiorsecrets.com.au/collections/lounge-chairs/quick-ship+black+fabric
Considering the need to optimize my crawl budget, I would like to seek your advice on whether it would be advisable to disallow or noindex these pages. My understanding is that by disallowing or noindexing these URLs, search engines can avoid wasting resources on crawling and indexing duplicate or filtered content. I would greatly appreciate your guidance on this matter.
b. Page URLs with parameters: I have noticed that some of my page URLs include parameters such as ?variant and ?limit. Although these URLs already have canonical tags in place, I would like to understand whether it is still recommended to disallow or noindex them to further conserve crawl budget. My understanding is that by doing so, search engines can prevent the unnecessary expenditure of resources on indexing redundant variations of the same content. I would be grateful for your expert opinion on this matter.
Additionally, I would be delighted if you could provide any suggestions regarding internal linking strategies tailored to my website's structure and content. Any insights or recommendations you can offer would be highly valuable to me. Thank you in advance for your time and expertise in addressing these concerns. I genuinely appreciate your assistance. If you require any further information or clarification, please let me know. I look forward to hearing from you. Cheers!
Technical SEO | | williamhuynh0 -
Client suffered a malware attack. Removed links not being crawled by Google!
Hi all, My client suffered a malware attack a few weeks ago in which an external site somehow created more than 700 links on my client's site containing their content. I removed all of the content and redirected the pages to the home page. I then created a new temporary XML sitemap with those 700 links and submitted it to Google 9 days ago. Google has crawled the sitemap a few times but not the individual links. When I click on the crawl report for the sitemap in GSC, I see that the individual links still have the last crawled date from before they were removed. So in Google's eyes, that old malicious content still exists. What do I do to ensure Google knows the content is gone and redirected? Thanks!
Technical SEO | | sk19900 -
How does Google crawl images, and which URL shows as the source?
Hi, I noticed that some websites host their images on a different URL from the one their actual website is hosted on, but in the end Google links to the one where the site is hosted. Here is an example. This is the page of a hotel on booking.com: http://www.booking.com/hotel/us/harrah-s-caesars-palace.en-gb.html When I search for this hotel in Google Images, one of the images from the slideshow shows up. When I click on the image in Google search and choose the Visit Page button, it links to the URL above, but the actual image is located at a totally different URL: http://r-ec.bstatic.com/images/hotel/840x460/135/13526198.jpg My question is: can you host your images on one site but show them on another site, and in the end Google will lead to the second one?
Technical SEO | | Tz_Seo0 -
Google not crawling the website since 22nd October
Hi, This is Suresh. I made changes to my website and I see that Google has been unable to crawl it since 22nd October. It doesn't even show any content when I use cache:www.vonexpy.com. Can anybody help me understand why Google is unable to crawl my website? Is there a technical issue with the website? The website is www.vonexpy.com. Thanks in advance.
Technical SEO | | sureshchowdary1 -
Disallow: /404/ - Best Practice?
Hello Moz Community, My developer has added this to my robots.txt file: Disallow: /404/ Is this considered good practice in the world of SEO? Would you do it with your clients? I feel he has great development knowledge but isn't too well versed in SEO. Thank you in advance, Nico.
Technical SEO | | niconico1011 -
Robots.txt: crawler is crawling URLs we don't want it to
Hello, We run a number of websites and underneath them we have testing websites (sub-domains). On those sites we have a robots.txt disallowing everything. When I logged into Moz this morning I could see the Moz spider had crawled our test sites even though we have told it not to. Does anyone have any ideas how we can stop this happening?
Technical SEO | | ShearingsGroup0 -
Google Webmaster tools vs SeoMOZ Crawl Diagnostics
Hi guys, I was just looking over my weekly report and crawl diagnostics. What I've noticed is that the data gathered by SEOmoz is different from the Google Webmaster Tools diagnostics. The number of errors, in particular duplicate page titles, duplicate content, and pages not found, is much higher than what Google Webmaster Tools reports. I'm a bit confused and don't know which data is more accurate. Please help.
Technical SEO | | Tolod0 -
Crawl Errors and Duplicate Content
SEOmoz's crawl tool is telling me that I have duplicate content at "www.mydomain.com/pricing" and at "www.mydomain.com/pricing.aspx". Do you think this is just a glitch in the crawl tool (because obviously these two URLs are the same page rather than two separate ones), or do you think this is actually an error I need to worry about? If so, how do I fix it?
Technical SEO | | MyNet0