Disallow: /jobs/? Is this stopping search engines from indexing job posts?
-
Hi,
I was wondering what the following rules are used for, as they're in the robots.txt of a recruitment agency website that posts jobs. Should they be removed?
Disallow: /jobs/?
Disallow: /jobs/page/*/
Thanks in advance.
James
-
Hi James,
So far as I can see you have the following architecture:
- job posting: https://www.pkeducation.co.uk/job/post-name/
- jobs listing page: https://www.pkeducation.co.uk/jobs/
Since the robots.txt blocks the listing page's pagination, only the first 15 job postings are available to a normal crawl.
I would say you should remove the block from robots.txt and focus on implementing correct pagination. Which method you choose is your decision, but allow the crawler to access all of your job posts. Check https://yoast.com/pagination-seo-best-practices/
Another thing I would change is to make the job post title the anchor text for the link to the job posting. (Currently every single job is linked with "Find out more".)
Also, if possible, create a separate sitemap.xml for your job posts and submit it in Search Console; this way you can keep track of any indexation anomalies.
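As a rough sketch of what a job-posts-only sitemap could look like, here is a small Python example. The function name and the single URL in the list are placeholders (the URL reuses the job posting pattern above); in practice the list would come from wherever the site stores its live jobs.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: list[str]) -> str:
    """Return a minimal sitemap XML document listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    # ElementTree omits the XML declaration for unicode output, so add it by hand.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder list: replace with the real job posting URLs.
job_urls = ["https://www.pkeducation.co.uk/job/post-name/"]
sitemap_xml = build_sitemap(job_urls)
print(sitemap_xml)
```

Keeping the job posts in their own file means the indexed count that Search Console reports for that sitemap refers to job posts only.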
Last but not least, focus on the quality of your content (just as Matt proposed in the first answer).
Good luck!
-
Hi Istvan,
Sorry, I've been away for a while. Thanks for all of your advice, guys.
Here is the URL, if that helps:
https://www.pkeducation.co.uk/jobs/
Cheers,
James
-
The idea (which we both highlighted) is that blocking your listing page in robots.txt is wrong. For pagination there are several methods available; which one you use really depends on the technical possibilities you have on the project.
Regarding James' original question, my feeling is that he is somehow blocking the posting pages. Cutting off access to these pages makes it really hard for Google, or any other search engine, to index them. But without a URL in front of us we cannot really answer his question; we can only create theories that he can test.
-
Ah yes, when it's pointed out like that, it's a conflicting signal, isn't it? It makes sense in theory, but if you're setting a page to noindex and then passing that on via a canonical, it's probably not the best approach.
There was a link in that thread to a discussion of people who still do that with success, but after reading it I would just use noindex only, as you said. (I still prefer the noindex over the robots.txt block, though.)
-
Sorry Richard, but using noindex together with a canonical link is not good practice.
It's an old entry, but still true: https://www.seroundtable.com/noindex-canonical-google-18274.html
-
I don't think it should be blocked by robots.txt at all. It's stopping Google from crawling the site fully. And they may even treat it negatively, as they've been really clamping down on blocking folders with robots.txt lately. I've seen sites with warnings in Search Console for: Disallow: /wp-admin
You may want to consider just using a noindex tag on those pages instead. And then also use a canonical tag that points back to the main job category page. That way Google can crawl the pages and perhaps pass all the juice back to the main job category page via the canonical. Then just make sure those junk job pages aren't in the sitemap either.
-
Hi James,
Regarding the robots.txt syntax:
Disallow: /jobs/? basically blocks every URL whose path starts with /jobs/?
For example: domain.com/jobs/?sort-by=... will be blocked
If you want to disallow URLs with query parameters anywhere under /jobs/, the correct implementation would be Disallow: /jobs/*? or you can specify which query parameter you want to block, for example Disallow: /jobs/*?page=
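To make the difference between these rules concrete, here is a minimal Python sketch of Google-style robots.txt path matching (prefix match, `*` as a wildcard, `$` as an end anchor). It is an illustration only: the helper names are made up, it ignores Allow rules and rule precedence, and the test paths other than /jobs/ and /job/post-name/ are hypothetical.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path rule into a regex anchored at the start of the path."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal parts and join them with '.*' where the rule had '*'.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + pattern + ("$" if anchored else ""))


def is_blocked(path: str, disallow_rules: list[str]) -> bool:
    """True if any Disallow rule matches the path (Allow rules are not modelled)."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)


current_rules = ["/jobs/?", "/jobs/page/*/"]   # the two rules from the question
wildcard_rule = ["/jobs/*?"]                   # the suggested alternative

for path in ["/jobs/", "/jobs/?sort-by=date", "/jobs/page/2/",
             "/job/post-name/", "/jobs/teaching/?page=2"]:
    print(path, is_blocked(path, current_rules), is_blocked(path, wildcard_rule))
```

Under these assumptions, the first page of the listing and the individual job posts are not matched by either rule set, while the paginated listing pages are matched by /jobs/page/*/.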
My question to you: are these jobs linked from any other page and/or sitemap? Or only from the listing page, whose pagination, sorting, etc. are blocked by robots.txt? If they are not linked, it could be a simple case of orphan pages, where the crawler cannot access the job posting pages because there is no actual link to them. I know it is an old rule, but it is still true: Crawl > Index > Rank.
BTW, I don't know why you would block your pagination. There are better implementations.
And there is always the scenario that Matt already described. But I believe in that case you would have at least some of the pages indexed, even if they are not going to rank well.
Also, make sure other technical implementations are not stopping your job posting pages from being indexed.
-
I'd guess that the jobs get pulled from a job board. If this is the case, then the content (job description, title, etc.) will just be a duplication of content that can be found in many other locations. If a plugin is used, it sometimes automatically adds a disallow to the robots.txt file so as not to hurt the parent version of the job page by creating thousands of duplicate content issues.
I'd recommend creating some really high-quality hub pages based on job type, or location and pulling the relevant jobs into that page, instead of trying to index and rank the actual job pages.