Moz Crawl Test: WordPress sites with and without /feed and /trackback entries?
-
I have multiple WP websites, and on some of them my Moz Crawl Test shows an entry for every blog post plus separate /feed and /trackback entries for that same post. For example:
www...com/someArticle
www....com/someArticle/feed
www...com/someArticle/trackback
1. Can anyone explain why the Crawl test is picking up the /feed and /trackback items? Is it simply because they are 301 redirects to the original post (www...com/someArticle)?
2. What setting(s) in WordPress are making this information appear? Or are the sites that show /feed and /trackback simply displaying "normal" behavior for a WP site with a lot of trackbacks and feed entries?
3. Should /feed and /trackback, as well as /author, be blocked in robots.txt?
Thanks in advance for your advice and input!
-
I have the same issue, but instead of redirecting to the parent post, it's just going to a 404 page.
-
So I solved the problem (or at least figured out where it was coming from). On this particular site, under the comments area, there is a link for "trackback url" and a link for "comments rss feed". Naturally these are ../trackback and ../blog, so that's why the crawl is picking them up. They are 301 redirected to the "parent" page, so that's why they are not a duplicate content issue. Thanks to everyone for their help!
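For anyone who wants to confirm the same thing on their own site, here is a minimal Python sketch that requests a post's /feed and /trackback URLs without following redirects and reports whether each one redirects back to the parent post, returns a 404, or is served as its own page. The names `fetch_status` and `classify` and the example.com URLs are placeholders, not anything provided by Moz or WordPress, and the sketch assumes a HEAD request gets the same status as a normal page load, which not every server honors.

```python
import http.client
from urllib.parse import urljoin, urlsplit


def fetch_status(url, timeout=10):
    """Return (status, absolute Location or None) without following redirects.

    Only the path part of the URL is requested; query strings are ignored.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        location = resp.getheader("Location")
        # Resolve relative Location headers against the requested URL.
        return resp.status, urljoin(url, location) if location else None
    finally:
        conn.close()


def classify(post_url, status, location):
    """Describe what a /feed or /trackback response means for a crawl report."""
    if status in (301, 302, 307, 308):
        # Compare ignoring a trailing slash, since WordPress often adds one.
        if location and location.rstrip("/") == post_url.rstrip("/"):
            return "redirects to parent post"
        return "redirects elsewhere"
    if status == 404:
        return "not found"
    if status == 200:
        return "served as its own page"
    return "other"
```

With a hypothetical post URL stored in `post`, calling `classify(post, *fetch_status(post + "/feed"))` and the same for `post + "/trackback"` shows which of the cases described in this thread (301 to the parent, or a 404) applies to that site.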
-
1. If you check the source code of your blog posts, there must be some sort of link to the feeds, possibly even in the header. I'm not 100% sure how the Moz crawler operates (whether it only spiders anchor-tag links or also spiders links referenced in the header; I'm pretty sure it's the latter), but either way that's how it's finding them: through some sort of link on the page.
You could try running a crawl with Screaming Frog SEO Spider to see if it also picks up the feed URLs. Screaming Frog will show you where it found the links as well.
2. Good question. Your theme may be displaying links to these things somewhere. The best way to find out is to crawl with Screaming Frog, which will show you which pages link to your feed and trackback URLs. Then, if you don't need them, you can go into the editor and remove them from the code.
3. I agree with Thomas here: I would not block them with robots.txt. Rather, I would see if you can fix them at the source and remove the links if they are not needed.
-Dan
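If you'd rather check a single page by hand before running a full crawl, the idea in point 1 can be sketched in a few lines of Python: parse the page source and list every anchor or header link whose URL ends in /feed or /trackback. This is only an illustration using the standard library's `html.parser`; the class and function names and the sample markup are made up for the example, and it says nothing about how the Moz crawler or Screaming Frog actually discover links.

```python
from html.parser import HTMLParser

# URL endings we are looking for (a trailing slash is ignored).
SUFFIXES = ("/feed", "/trackback")


class FeedLinkFinder(HTMLParser):
    """Collect (tag, href) pairs for <a> and <link> tags pointing at feeds or trackbacks."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        href = dict(attrs).get("href")
        if href and href.rstrip("/").endswith(SUFFIXES):
            self.found.append((tag, href))


def find_feed_links(html):
    """Return every feed/trackback link found in the given page source."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.found


# Hypothetical page source, similar to what a WordPress theme might output.
sample = """
<head><link rel="alternate" type="application/rss+xml"
      href="https://example.com/some-article/feed/"></head>
<body>
<a href="https://example.com/some-article/trackback/">Trackback URL</a>
<a href="https://example.com/about">About</a>
</body>
"""

for tag, href in find_feed_links(sample):
    print(tag, href)
```

A `link` result means the URL is referenced in the header, while an `a` result means the theme prints a visible link, which is the kind you could remove in the theme editor.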
-
Thanks, I'll check it out!
-
Hi, you should never block feeds; they're really pretty beneficial to your site. Take a look at this post from Joost, which explains it much better than I can:
http://yoast.com/example-robots-txt-wordpress/
All the best sincerely, Thomas
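If you're not sure whether an existing robots.txt is already blocking your feeds, a rough check like the following may help. It is a sketch using Python's standard `urllib.robotparser`; the `blocked_urls` helper, the rules, and the example.com URLs are invented for illustration. Note that this parser only does simple path-prefix matching, so it will not interpret wildcard rules the way some search engines do.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="*"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt that blocks the site-wide feed.
rules = """\
User-agent: *
Disallow: /feed/
"""

urls = [
    "https://example.com/feed/",
    "https://example.com/some-article/feed/",
    "https://example.com/some-article",
]

for url in blocked_urls(rules, urls):
    print("blocked:", url)
```

In this example only the site-wide feed is reported as blocked; the per-post feed is not, because the rule matches on the start of the path.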
-
Thank you.
When you say "TrackBacks are from people posting either identical or similar content to WordPress.com", what do you mean? I thought trackbacks were notifications of links back when someone links to your content?
And why does the WordPress Codex recommend blocking feeds and trackbacks in robots.txt?
Thanks again!
-
The TrackBacks are from people posting either identical or similar content to WordPress.com. I would follow up on that, unless that person is you.
No, do not block a feed with robots.txt, and do not block the TrackBacks. Use Automattic's Digital Millennium Copyright Act (DMCA) takedown process if somebody is stealing your content.
Sincerely,
Thomas