Disallow: /jobs/? Is this stopping the SERPs from indexing job posts?
-
Hi,
I was wondering what this would be used for, as it's in the robots.txt of a recruitment agency website that posts jobs. Should it be removed?
Disallow: /jobs/?
Disallow: /jobs/page/*/
Thanks in advance.
James
-
Hi James,
So far as I can see you have the following architecture:
- job posting: https://www.pkeducation.co.uk/job/post-name/
- jobs listing page: https://www.pkeducation.co.uk/jobs/
Because robots.txt blocks the pagination of the listing page, only the first 15 job postings are reachable in a normal crawl.
I would remove the blocking from robots.txt and focus on implementing pagination correctly. Which method you choose is your decision, but allow the crawler to access all of your job posts. Check https://yoast.com/pagination-seo-best-practices/
Another thing I would change is to make the job post title the anchor text for the job posting (at the moment every single job is linked with "Find out more").
Also, if possible, create a separate sitemap.xml for your job posts and submit it in Search Console; this way you can keep track of any indexation anomalies.
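As a rough illustration of the separate job sitemap idea, here is a minimal Python sketch that builds a sitemap.xml string from a list of job URLs. The function name build_sitemap and the second job slug are made up for the example; only the /job/post-name/ pattern comes from the site. How you collect the URLs and where you publish the file depends on your setup.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical job URLs; the second slug is invented for illustration.
job_urls = [
    "https://www.pkeducation.co.uk/job/post-name/",
    "https://www.pkeducation.co.uk/job/another-post-name/",
]

sitemap_xml = build_sitemap(job_urls)
print(sitemap_xml)
```

Keeping job posts in their own file means the indexed vs. submitted numbers in Search Console refer to job posts only.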
Last but not least, focus on the quality of your content (just as Matt proposed in the first answer).
Good luck!
-
Hi Istvan,
Sorry I've been away for a while. Thanks for all of your advice guys.
Here is the url if that helps?
https://www.pkeducation.co.uk/jobs/
Cheers,
James
-
The idea, which we both highlighted, is that blocking your listing page in robots.txt is wrong. For pagination there are several methods to choose from; which one you use really depends on the technical possibilities you have on the project.
Regarding James' original question, my feeling is that he is somehow blocking the posting pages. Cutting off access to these pages makes it really hard for Google, or any other search engine, to index them. But without a URL in front of us we cannot really answer his question; we can only offer theories that he can test.
-
Ah yes, when it's pointed out like that, it's a conflicting signal, isn't it? It makes sense in theory, but if you're setting a page to noindex and then passing that on via a canonical, it's probably not the best, is it?
There was a link in that thread to a discussion of people who still do that with success, but after reading that I would just use noindex only, as you said. (I still prefer the noindex over the robots.txt block, though.)
-
Sorry Richard, but using noindex together with a canonical link is not good practice.
It's an old entry, but still true: https://www.seroundtable.com/noindex-canonical-google-18274.html
-
I don't think it should be blocked by robots.txt at all. It's stopping Google from crawling the site fully, and they may even treat it negatively, as they've been really clamping down on blocking folders with robots.txt lately. I've seen sites with warnings in Search Console for: Disallow: /wp-admin
You may want to consider just using a noindex tag on those pages instead. And then also use a canonical tag that points back to the main job category page. That way Google can crawl the pages and perhaps pass all the juice back to the main job category page via the canonical. Then just make sure those junk job pages aren't in the sitemap either.
-
Hi James,
Regarding the robots.txt syntax:
Disallow: /jobs/? blocks every URL whose path starts with /jobs/? (robots.txt rules are matched from the start of the path).
For example: domain.com/jobs/?sort-by=... will be blocked.
If you want to disallow query parameters on any URL under /jobs/, the correct implementation would be Disallow: /jobs/*? or you can even specify which query parameter you want to block, for example Disallow: /jobs/*?page=
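To make the difference between these rules concrete, here is a small Python sketch of wildcard matching as Google documents it (* matches any run of characters, a trailing $ anchors the end, and rules match from the start of the path). The helpers rule_to_regex and is_blocked are written for this example and are not part of any library; the sketch ignores Allow rules and user-agent groups, so treat it as an approximation rather than a full robots.txt parser.

```python
import re


def rule_to_regex(rule):
    """Translate one Disallow path into a regex anchored at the path start."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape everything literally, then turn the escaped '*' back into '.*'.
    pattern = re.escape(rule).replace(r"\*", ".*")
    return re.compile(pattern + ("$" if anchored else ""))


def is_blocked(path, disallow_rules):
    """True if any Disallow rule matches the beginning of the path."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)


current_rules = ["/jobs/?", "/jobs/page/*/"]   # rules from the question
wildcard_rule = ["/jobs/*?"]                    # alternative suggested above

for path in [
    "/jobs/",
    "/jobs/?sort-by=date",
    "/jobs/page/2/",
    "/jobs/teaching/?page=2",   # hypothetical sub-listing URL
    "/job/post-name/",
]:
    print(path, is_blocked(path, current_rules), is_blocked(path, wildcard_rule))
```

With the current rules, the listing page itself and the individual /job/ posts stay crawlable, but anything directly under /jobs/ with a query string and every paginated page is blocked.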
My question to you is whether these jobs are linked from any other page and/or sitemap, or only from the listing page, whose pagination, sorting, etc. are blocked by robots.txt. If they are not linked, it could be a simple case of orphan pages, where the crawler cannot access the job posting pages because there is no actual link to them. I know it is an old rule, but it is still true: Crawl > Index > Rank.
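The orphan-page scenario can be sketched with a toy crawl in Python. Everything here is hypothetical: the link graph, the job slugs, and the blocked() check are invented to show the idea, not taken from the real site. The crawl follows links from the home page and skips any URL the robots rule blocks; job posts it never reaches are effectively orphaned.

```python
from collections import deque


def reachable(start, links, is_blocked):
    """Breadth-first crawl over an in-memory link graph, skipping blocked URLs."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in seen and not is_blocked(target):
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: one listing page, one paginated page, three job posts.
links = {
    "/": ["/jobs/"],
    "/jobs/": ["/job/teacher-a/", "/jobs/page/2/"],
    "/jobs/page/2/": ["/job/teacher-b/", "/job/teacher-c/"],
}
job_posts = {"/job/teacher-a/", "/job/teacher-b/", "/job/teacher-c/"}


def blocked(url):
    # Stand-in for "Disallow: /jobs/page/*/".
    return url.startswith("/jobs/page/")


orphans = job_posts - reachable("/", links, blocked)
print(sorted(orphans))
```

In this toy graph, the two jobs that are linked only from page 2 of the listing are never discovered, which is the situation a sitemap or unblocked pagination would fix.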
BTW, I don't know why you would block your pagination. There are other, better implementations.
And there is always the scenario that was already described by Matt. But I believe in that case you would have at least some of the pages indexed, even if they are not going to rank well.
Also, make sure other technical implementations are not stopping your job posting pages from being indexed.
-
I'd guess that the jobs get pulled from a job board. If this is the case, then the content (job description, title, etc.) will just be a duplicate of content that can be found in many other locations. If a plugin is used, it sometimes automatically adds a disallow to the robots.txt file so as not to hurt the parent version of the job page by creating thousands of duplicate content issues.
I'd recommend creating some really high-quality hub pages based on job type, or location and pulling the relevant jobs into that page, instead of trying to index and rank the actual job pages.