Crawler doesn't discover the links in the main nav
-
Hi Moz Community,
We have a headless ecommerce (Magento) client whose site I'm trying to crawl. When I start the crawl from the homepage, the tool (Screaming Frog) cannot discover the sub-category URLs in the main navigation.
Similarly, when I start the crawl from one of the sub-category pages, it doesn't crawl any of the product URLs on that page.
When I inspect product and sub-category URLs in Search Console, they show as indexed, and if I view how Googlebot rendered the sub-category page, I can see the product URLs there too.
If you have any idea what the issue with Screaming Frog is and would like to help me out, I'd be so grateful!
Thanks in advance
-
Hi Kate,
Thank you! I've followed you on Twitter; my username is @curetuvana.
-
Find me on Twitter (@katemorris) and follow me. Tell me your name and I'll follow you back so we can DM.
-
Hi Kate,
Thank you for taking the time to respond! Is there any way I can contact you directly?
By the way, I tried crawling again after changing the rendering configuration to JavaScript; however, it still didn't discover the product URLs.
Thank you!
-
Ah, I might know your problem. What is your site? We had this issue at my last company; it had to do with crawling using JS. If you send me the site, I can take a look.
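In the meantime, one rough way to narrow this down is to check which links actually exist as plain `<a href>` anchors in the raw HTML, since that is what a crawler without rendering has to work with. Below is a minimal sketch using only Python's standard library. The sample markup and the names `LinkCollector` and `extract_links` are made up for illustration; they are not taken from the site in question, and this is not necessarily the cause of the issue here.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags in an HTML string."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only real anchors with a non-empty href count as crawlable links here.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the list of anchor hrefs found in the given HTML."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


# Hypothetical nav markup: one real anchor, one JS-driven item with no href.
sample_html = """
<nav>
  <a href="/category-a">Category A</a>
  <span onclick="navigate('/category-b')">Category B</span>
</nav>
"""

print(extract_links(sample_html))
```

If you run the same function on the saved page source of the homepage and the sub-category URLs are missing from the result, the navigation is probably being built client-side or with non-anchor elements, which would be worth comparing against what the crawler's rendered view shows.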