Why are crawlers not picking up these pages?
-
Hi there,
I've been asked to audit a new subdomain for a travel company. It's all a bit messy, so it's going to take some time to remedy. However, one thing I couldn't understand was the low number of pages that certain crawlers pick up.
The subdomain has many pages: a homepage, category pages and then product pages. Unfortunately, tools like Screaming Frog and xml-sitemaps.com are only picking up 19 pages, and I can't figure out why. Google has so far indexed around 90 pages. This is by no means all of them, but that's probably because the domain is new and has no sitemap, etc.
Looking at the crawl results, only the homepage and category (continent) pages are showing; none of the product pages are. For example, tours.statravel.co.uk/trip/Amsterdam_Kings_Day_(Start_London_end_London)-COCCKDM11 is not appearing in the crawl results. After reviewing the source code, I can't see anything that would prevent this page from being crawled. Am I missing something?
At the moment, the crawl should be picking up 400+ product pages, but it's not picking up any.
Thanks
-
Hi,
I would think it is the JavaScript used on the pages. Google can, in theory, render a page as a browser would; Screaming Frog and similar tools on the whole cannot. If you visit the homepage with JS turned off, you see a pretty empty page with a list of links (region, activity, country), which are the same links that Screaming Frog is picking up. If you go into one of the search results pages with JS turned off, you don't see much of anything at all. Google is obviously doing a better job of crawling the JS content! A solution would be to present the data in a simpler, crawlable format for browsers without JS enabled, but that is (probably a big) conversation with your developers.
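If you want to check this without switching JS off in a browser, here is a minimal sketch that lists the links present in the raw HTML, which is roughly what a crawler that does not run JavaScript can follow. The sample markup, the `example.com` base URL and the `loadTrips` call are made up for illustration; they are not taken from the site in question.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags found in raw, unrendered HTML."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip in-page anchors and javascript: pseudo-links.
        if href and not href.startswith(("#", "javascript:")):
            self.links.append(urljoin(self.base_url, href))


def static_links(html, base_url):
    """Return the sorted, de-duplicated links visible without running any JS."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return sorted(set(parser.links))


# Hypothetical page: category links are in the markup, while product links
# would only be added to the empty #results container by a script at runtime.
SAMPLE_HTML = """
<html><body>
  <a href="/region/europe">Europe</a>
  <a href="/region/asia">Asia</a>
  <div id="results"></div>
  <script>loadTrips("#results");</script>
</body></html>
"""

if __name__ == "__main__":
    for link in static_links(SAMPLE_HTML, "https://example.com/"):
        print(link)
```

To try it on a real page, fetch the HTML yourself (for example with `urllib.request.urlopen`) and pass the decoded text to `static_links`. If the product URLs are missing from the result, a non-rendering crawler has no way to discover them.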