Phantom URLs causing 404 errors
-
I have a very strange problem. When I run SEOmoz diagnostics on my site, it reveals URLs that I never created. It seems to combine two slugs into a new URL. For example, I created the pages http://www.naplesrealestatestars.com/abaco-bay-condos-naples/ and http://www.naplesrealestatestars.com/beachwalk-naples-florida/, and now the URL http://www.naplesrealestatestars.com/abaco-bay-condos-naples/beachwalk-naples-florida/ is reported in addition to the two I created. There are over 100 of these phantom URLs, and they all return a 404 error when clicked on or crawled by SEOmoz. Does anybody know how to correct this?
-
It looks like you are using Yoast's WordPress SEO plugin. Are you also using it to rewrite the URLs?
I would update Yoast's WordPress SEO to v0.4.2. See the changelog: http://wordpress.org/extend/plugins/wordpress-seo/changelog/
Take a look at that and report back, please, so we can assist further.
-
In order to offer the best assistance, more specific information would be quite helpful.
-
What is the name of the tool you are using?
-
Where exactly are you seeing the problem?
I will take a guess and assume you are using the Crawl report. If that is the case, look at the "referrer" field, which shows the page where the crawler found the bad link.
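Once the "referrer" field points to a page, one thing worth checking there is whether any link uses a document-relative href (no leading slash). This is only one possible cause and is not confirmed in this thread, but such a link resolves against the current page's path and would produce exactly the combined URL described in the question. The following is a minimal Python sketch of that check; `LinkCollector`, `find_relative_links`, and `SAMPLE_HTML` are hypothetical names and made-up markup for illustration, not output from the Crawl report or the actual site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the raw href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_relative_links(referrer_url, html):
    """Return (href, resolved_url) pairs for document-relative hrefs."""
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href in collector.hrefs:
        parsed = urlparse(href)
        # Skip absolute URLs, root-relative paths, fragments and query-only links.
        if parsed.scheme or parsed.netloc or href.startswith(("/", "#", "?")):
            continue
        # A document-relative href is resolved against the referrer's path.
        flagged.append((href, urljoin(referrer_url, href)))
    return flagged


# Assumed inputs: the referrer URL from the question and invented sample markup.
REFERRER = "http://www.naplesrealestatestars.com/abaco-bay-condos-naples/"
SAMPLE_HTML = """
<a href="/beachwalk-naples-florida/">root-relative link</a>
<a href="beachwalk-naples-florida/">document-relative link</a>
"""

results = find_relative_links(REFERRER, SAMPLE_HTML)
for href, resolved in results:
    print(href, "->", resolved)
```

In this sketch only the second link is flagged, because the first begins with a slash and therefore resolves from the site root. If the referrer pages do contain links like the second one, changing them to root-relative or absolute URLs would be the likely fix.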
-
Related Questions
-
Page Tracking using Custom URLs - is this viable?
Hi Moz community! I’ll try to make this question as easy to understand as possible, but please excuse me if it isn’t clear. I joined a new team a few months ago and found out that on some of our most popular pages we use “custom URLs” to track page metrics within Google Analytics. NOTE: I say “custom URLs” because that is the best way for me to describe them.
As an example, this page exists to our users: http://usnews.rankingsandreviews.com/cars-trucks/Ram_HD/2012/photos-interior/ But this is the URL we have coded on the page: cars-trucks/used-cars/reviews/2012-Ram-HD/photos-interior/ (within the custom variance script labeled as “var l_tracker=”). It is this custom URL that we use within GA to look up metrics about this page. This is just one example of many across our site set up to do the same thing.
Here is a second example. Available page to user: http://usnews.rankingsandreviews.com/cars-trucks/Cadillac_ATS/2015/ Custom “var l_tracker=”: /cars-trucks/2015-Cadillac-ATS/overview/
NOTE: There is a small amount of fear that the above method was implemented years ago as a work-around to a poorly structured URL architecture. Not validated, but that is a question that arose.
Main questions:
Is the above implementation a normal and often used method to track pages in GA? (Coming from an Omniture company before, this would not be how we handled page-level tracking.)
Team members at my current company are divided on this method. Some believe this is not a proper implementation and are concerned that trying to hide these from Google will raise red flags (i.e. fake URLs in general = bad).
I cannot find any reference to this method anywhere on the InterWebs.
If the method is not normal: any recommendations on a solution to address this?
Potential problems? GA is currently cataloguing these tracking URLs in the Crawl Error report. Any concerns about this? The team wants to hide the URLs in the robots.txt file, but some team members are concerned this may raise a red flag with Google and hurt us more than help us. Thank you in advance for any insight and/or advice. Chris
Reporting & Analytics | | usnseomoz0 -
How can I make sure that we are only tracking for single URLs?
Is there a way to track pages in Google Analytics with part of the URL excluded? For example, we need to track when customers complete an application form; however, whenever a new form is completed, a new URL is created. This makes it difficult to track pages in GA as there are so many URLs.
Reporting & Analytics | | Sable_Group0 -
Migrated website but Google Analytics still displays old URLs and none of the new ones?!
I migrated a website from .aspx to .php and hence had to 301 all the old URLs to the new PHP ones. It has been months since then and I'm not seeing any of the PHP pages showing results, but I'm still getting results from the old .aspx pages. Has anyone had any experience with this issue, or does anyone know what to do? Many thanks,
Reporting & Analytics | | CoGri0 -
Subdomain and relative link paths cause crawl errors
I have a WordPress blog on our subdomain and we use relative paths on our domain. It appears as though Googlebot is crawling from the subdomain categories back to the domain relative paths. This of course results in hundreds of 404 pages. Any suggestions as to how to resolve this issue without changing the relative path structure of our domain? I can provide more information if need be. While I realize these issues are not that pressing, I'd obviously like to remove as many errors as possible. If anyone has encountered this problem, especially in WordPress, I'd really like to hear your solution or lack thereof. Thank you in advance.
Reporting & Analytics | | BethA0 -
My first campaign identified long URLs
Hello! 🙂 I've just created my first campaign, and the crawling process has detected posts with long URLs (more than 70 characters). If I change them, i.e., alter the URLs, could that cause problems for my blog? Or should I disregard this problem and just "work correctly" from now on? Thanks in advance for your help!
Reporting & Analytics | | Andarilho0 -
Why do I have a few different index URL addresses?
Yes, I know, sorry guys, but I also have a problem with duplicate pages. It shows that almost every page of my site has a duplicate content issue, and looking at my folders on the server, I don't see all these pages... This is a static website with no shopping cart or anything fancy. The first on the list is my [index] page, and this is giving me a hint about some sort of bad settings on my end with the SEOmoz crawler??? Please advise, and thank you! index-variations.jpg
Reporting & Analytics | | cssyes0 -
Duplicate content? Split URLs? I don't know what to call this but it's seriously messing up my Google Analytics reports
Hi Friends, This issue is crimping my analytics efforts and I really need some help. I just don't trust the analytics data at this point. I don't know if my problem should be called duplicate content or what, but the SEOmoz crawler shows the following URLs (below) on my nonprofit's website. These are all versions of our main landing pages, and all Google Analytics data is getting split between them. For instance, I'll get stats for the /camp page and different stats for the /camp/ page. In order to make my report I need to consolidate the 2 sets of stats and re-do all the calculations. My CMS is looking into the issue and has supposedly set up redirects to the pages w/out the trailing slash, but they said that setting up the "ref canonical" is not relevant to our situation. If anyone has insights or suggestions I would be grateful to hear them. I'm at my wit's end (and it was a short journey from my wit's beginning ...) Thanks.
URL
www.enf.org/camp
www.enf.org/camp/
www.enf.org/foundation
www.enf.org/foundation/
www.enf.org/Garden
www.enf.org/garden
www.enf.org/Hante_Adventures
www.enf.org/hante_adventures
www.enf.org/hante_adventures/
www.enf.org/oases
www.enf.org/oases/
www.enf.org/outdoor_academy
www.enf.org/outdoor_academy/
Reporting & Analytics | | DMoff0 -
Why are Seemingly Randomly Generated URLs Appearing as Errors in Google Webmaster Tools?
I've been confused by some URLs that are showing up as errors in our GWT account. They seem to just be randomly generated alphanumeric strings that Google is reporting as 404 errors. The pages do 404 because nothing ever existed there or was linked to. Here are some examples that are just off of our root domain: /JEzjLs2wBR0D6wILPy0RCkM/WFRnUK9JrDyRoVCnR8= /MevaBpcKoXnbHJpoTI5P42QPmQpjEPBlYffwY8Mc5I= /YAKM15iU846X/ymikGEPsdq 26PUoIYSwfb8 FBh34= I haven't been able to track down these character strings in any internet index or anywhere in our source code so I have no idea why Google is reporting them. We've been pretty vigilant lately about duplicate content and thin content issues and my concern is that there are an unspecified number of urls like this that Google thinks exist but don't really. Has anyone else seen GWT reporting errors like this for their site? Does anyone have any clue why Google would report them as errors?
Reporting & Analytics | | kimwetter0