Thanks - please mark as answered and like if you please!
Posts made by CleverPhD
-
RE: Help with strange 404 Errors.
Hey there. The %C2 %94 %3A are simply URL-encoded (percent-encoded) versions of special characters in the URL.
http://www.w3schools.com/tags/ref_urlencode.asp
%3A is the same as a colon
%C2 is Â
%94 is "
Encoding simply puts those characters into a format that can travel safely in a URL (URLs can only contain a limited set of characters); the browser and server then decode them back into the characters they represent.
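If you want to see the encoding and decoding for yourself, here is a quick check in Python (just my own illustration, nothing specific to your site):

from urllib.parse import quote, unquote

print(quote(":"))       # '%3A'
print(unquote("%3A"))   # ':'
# Characters outside plain ASCII get encoded as a run of several %XX bytes,
# which is where sequences like %C2 %94 come from.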
Couple of things to check on where this comes from.
Get a spider program and see if, waaay out there in the back end of your content library, you have some crazy goofed-up link that got planted somewhere. Find it and delete it.
Other than that, somewhere out on the internets, as my developer likes to say, "A bunch of monkeys banged heads on keyboards." There are site scrapers that do not do a good job: they take your content and repost it, they screw up all kinds of formatting, and you end up with links like the above pointing to your site. The spiders find those links and you get a 404.
I just did a Google search on
www.nfl.com/gamecenter?game_id=29528&season=2008&displayPage=tab_gamecenter/
and you get all kinds of random pages linking to that.
Here is what I would do. You mention most errors start with
You can 301 all those to another page. Or, show a simple helpful page for the user to navigate off of with a noindex, nofollow meta tag. The noindex tag would get those pages out of the index at least and not show a 404 error.
-
RE: How elated do you get?
I find that taking all the punched holes from my hole puncher and tossing them in the air and then pretending I am in a parade while I wave to all my admirers feels pretty good.
-
Entireweb.com
http://www.entireweb.com/express_inclusion/
Anybody used this? There is an express inclusion that you pay for, vs. the free option. Wanted to see what the group thinks. Worth it or not? It is not included in the SEOmoz list of directories.
-
RE: Can't canonical, but need pages to show in Google News
Another alternative to canonical would be to simply link to the source in the reprint of the article itself.
You would then be giving that signal to Google on where the content came from without claiming ownership/authorship.
-
RE: Checking for content duplication against content on your own site.
I assume that you have an admin section in the CMS where you are editing and entering these articles before they go live.
You need to get a developer to write a search algorithm so that when you create a new article, before it goes live, it takes sections of your content and looks for matches/duplicates against what is already on the site. You can set a requirement that it has to match on a minimum 4 to 5 word string, plus other limits, to make sure you are not matching too many items. It will take a few tests to find the sweet spot between too many matches and not enough.
With 17K pages, this is the only way you can really do this efficiently; you need some IT support/development. They may have to create a reporting layer as well to help you sift through the results.
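If it helps your developer get started, here is a bare-bones sketch of the matching idea in Python - my own illustration, and the word-string size and threshold are things you would tune:

import re

def shingles(text, size=5):
    """All consecutive `size`-word strings in a piece of text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}

def overlap(new_article, existing_article, size=5):
    """Share of the new article's 5-word strings that already appear in an existing article."""
    new, old = shingles(new_article, size), shingles(existing_article, size)
    return len(new & old) / max(len(new), 1)

# Flag anything over a threshold you arrive at by testing, e.g. 0.10.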
Good luck.
-
RE: News sites & Duplicate content
Ditto on what Donnie said. Purple Cow, if you want that site to be an authority, it needs to be authoritative. Why would anyone buy the Washington Post if it just copied all its articles from the New York Times? Get a few staff writers to combine and tweak articles as Donnie mentioned, or to write original content.
Good luck!
-
RE: Is "no follow"ing a folder influences also its subfolders?
Are you also blocking the sub-folders with robots.txt? I think of nofollow as being used on a page by page basis vs a folder basis, and only with robots.txt can you block a whole folder.
As these pages are already in the index and you want them out, you really need to add a noindex meta tag on the offending pages. I would not use the robots.txt approach as you want Google to be able to spider the page then read the noindex meta tag to remove the page.
Cheers.
-
RE: Architecture questions.
Hi James,
It sounds like when you consolidated widgets, you gave Google one focused page for people to search for vs a larger number of pages on the same product. This is interesting, as it is the inverse of the long tail effect. You would think that more pages around a given product would be better. I guess this is a case where too many pages were a bad thing. Makes me think of how we set up pagination to make sure Google does not focus on pages 2, 3, 4, 5, etc., but instead use noindex to keep the focus on page 1 of the pagination.
Thanks for the post!
-
RE: Site Structure Question
The only other thing to consider is how you want to parse all this out in Analytics. If you were to go to the folder structure, it may be easier to parse things out, as you can look at it on a folder by folder basis. GA automatically creates reports based on a folder (aka slash) hierarchy. You would not have that with the dashes.
Also, you can set up regular expressions to parse on the slashes and build some really cool filters and advanced segments. You may get the same thing with your dashes, as long as you are consistent with how you use them. So the first slot is always gender, the second slot is always style, and the last is product name.
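For example (an illustrative pattern, assuming the folder version of the URL looks like /gender/style/product-name/), a regex along these lines would pull the pieces apart:

import re

PATH = re.compile(r"^/(?P<gender>[^/]+)/(?P<style>[^/]+)/(?P<product>[^/]+)/?$")

m = PATH.match("/womens/sandals/rose-gold-wedge/")
if m:
    print(m.group("gender"), m.group("style"), m.group("product"))
    # -> womens sandals rose-gold-wedge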
I am with EGOL that it may be a moo point, but wanted to put in another variable for you to consider in what works best for you.
Cheers
-
RE: Could longtail keywords really produce up to 80% more organic traffic long-term?
I would second what Kevin B. stated. Do not ignore the long tail. That said, we had to put in a lot of work with the production of content to reflect the long tail keywords. You don't want to produce a bunch of spammy junk, but real content pages.
Identifying the keywords is fairly straight forward. You should be able to look in your Google Analytics reports to get you started.
FYI - the long tail is not just an SEO term/concept. Chris Anderson over at Wired wrote an article and a book on it. Interesting read if you want a broader perspective.
-
RE: 404 with a Javascript Redirect to the index page...
Don't think you are gaming Googlebot. You are showing it a 404 (that is good) so GB knows that the page is dead. You are showing a custom page to the user that should let them know the page is dead. I think the custom page should say something like "Oops, this page no longer exists; redirecting you to another, more helpful page in 10 seconds (click here if you are not redirected)."
I might give it 15-20 seconds to make sure the person has time to read it.
You are showing the same thing to Gbot and the user, so no cloaking issues, i.e. you are not trying to game GB.
You may want to reconsider a 301 redirect instead. You then get the redirect automatically for the user (no pause or clicking needed) and the 301 tells google where the new page is etc. The only caveat to that, if you are redirecting a bunch of pages to the home page (or some other index page) - your solution above is probably better as you don't want to have a many to one situation with the 301s. Google prefers more of a one to one on 301 redirects - otherwise they may tag your 301 as a soft 404.
-
RE: Indexing non-indexed content and Google crawlers
Wow! Nice detective work! I could see how that one would slip under the radar.
Congrats on finding a needle in a haystack!
You should buy yourself the adult beverage of your choice and have a little toast!
Cheers!
-
RE: On-Page Report Card is lacking
The On-Page report is simply that - an "On-Page" report. By definition, it does not check things that are outside of the page in question. What you want to look at is a ranking report to see how a page is ranking.
On-page factors are just some of the 200+ things that Google looks at - so you have to look at the whole picture. If everyone could just use a simple tool to reverse engineer the Google ranking algo, then I bet Rand would charge way more than $99 to access it!
On-page factors are SEO 101 though, so to delete it would be extreme. I think people don't write about them as much because they are pretty well established (and straightforward), so they do not need to be rehashed over and over again.
If there was not a tool, you would still need to read some Blog articles to find out best practices for on-page optimization. You could still obsess over the details and implementing them for hours on end. You would still end up in the same place that you are right now.
-
RE: On-Page Report Card is lacking
You probably need to wait until SEOMoz sends Roger to do his weekly check and see what has changed to give a new grade.
-
RE: Changing Server IP Addresses. Should I be concerned?
Generally speaking, if you transition it correctly and have the exact same site up and running on the new IP before you change the DNS, you should be fine. I did some Googling on the subject, and Mark D. has a much more specific and detailed description of what you should do as far as making sure you have the exact same site running:
http://malteseo.com/seo/changing-ip-address-without-losing-google-ranking/
What you do not want to do at this point is change up your URL structure, title tags etc. Those changes alone can impact your rankings and you don't want to compound the issues. Less change, more gradual change is always better.
-
RE: Temporary landing pages and SEO
Agreed! Like with anything the answer is, "Well, it depends."
-
RE: What is the impact of excessive code on rankings
Page speed is a factor in rankings. You have to make it a systematic part of how your developer thinks, and how the rest of your team thinks, when setting up features. Easier said than done, but worth the long term effort.
The adage, "when you measure it, you can improve it," is key with developers. Either let them see the GWT numbers or set up something like New Relic reporting or some other tool.
The page load time on one of our sites was creeping up, and then I also started noticing some 500 errors in GWT. I had our IT guy drill down and figure it out. It was a caching issue: each time a page loaded, a full database query would run instead of running off the cache. Sometimes it was enough that a page would not load at all; it was intermittent. He fixed it, and look at how our page speed improved (see attached).
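If your platform does not cache query results for you, the idea behind that kind of fix looks roughly like this - a minimal sketch in Python, assuming a generic run_query function, not our actual setup:

import time

_cache = {}          # query string -> (expires_at, result)
CACHE_TTL = 300      # seconds; tune to how stale the data is allowed to be

def cached_query(sql, run_query):
    """Serve a recent result from memory instead of hitting the database on every page load."""
    now = time.time()
    hit = _cache.get(sql)
    if hit and hit[0] > now:
        return hit[1]
    result = run_query(sql)                    # the expensive full query
    _cache[sql] = (now + CACHE_TTL, result)
    return result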
This was in the last part of May, first part of June. We had a great July (despite a slow first week with the holiday) due to improvements in organic ranking. Our improvement in speed was a part of this.
One more on this. I find the GWT Labs Speed tool can suck - it does not seem to sample very many pages. Also, on one site we had, it focused on the site search page, and of course that is slower than the rest of the site, so the number was inflated. I like the Health > Crawl Stats report (that is where the screenshot is from). Then I can use other tools to test on a page by page basis.
Cheers.
-
RE: On-Page Report Card is lacking
Agree with Ruth. Plus, when I want to have deep thoughts, I leave SEOMoz and I go get me some Jack Handey
http://www.deepthoughtsbyjackhandey.com/
but that's just how I roll.......
-
RE: Indexing issue or just time?
You shut down and relaunched your site 3 weeks ago, and let's say you also changed your URL structure and title tags, do not have 301 redirects from old to new content, and don't have a sitemap.
All those things, even if you did them all "correctly," can cause Google to take a while to respider and reindex your pages. Google ranks "pages" vs "sites," generally speaking, so that can impact rankings.
Looks like you canonical all the pages to themselves? Example
<link rel="canonical" href="http://thetechblock.com/the-ios-interface-concept" />
Were you www.thetechblock.com before, and now you are trying to change to the non-www? If you did change to non-www, then Google would see this as a new site, and so the rankings would start over.
If you want to look at crawl rate, you should be able to go into Google Webmaster Tools and see how often they are spidering. Similarly, you can submit a sitemap and see how many pages are indexed.
-
RE: On-Page Report Card is lacking
On-page factors are just on-page factors, and while they are important, they are just one part of the ranking process.
I think of on-page optimization as what "gets you in the door" with Google/Bing/etc. If you get it wrong, then you may not get into the search engines to start with, or your ranking is much worse than it might otherwise be.
That said, just because you get on page optimization right, does not then automatically get you into the top 50 places for a keyword.
If it were that simple, then what would Google do with 100 pages that are focused on the same keywords that all got an A in onpage optimization? How would Google rank them as far as relevancy?
There are several hundred factors that come into play when getting a page to rank for a keyword. What is the age of the page/domain? Is this on a trusted domain? How many links are coming into that page? Where are the links coming from? What is the anchor text used on the link? How many other pages are trying to rank for the same key words? What are the link profiles for those pages? etc. etc.
Take a look at the SEOmoz ranking factors article for more info
http://www.seomoz.org/article/search-ranking-factors
Not to sound sarcastic, but congrats on getting an A for your on page factors! Many people never get that far and have poor rankings due to things that could be easily fixed!
Look at your SEO checklist http://www.seomoz.org/blog/an-seo-checklist-for-new-sites-whiteboard-friday and check that one off! Onto other tasks!
-
RE: Temporary landing pages and SEO
I have to digress on 301s and then bring up soft 404s
You have to watch out for sending too many 301s to a single page. Sometimes you have to do this, but on sites we manage I have also seen Google show soft 404 errors when a bunch of pages 301 to a single page.
There is a subtle thing we have found on how Google thinks about 301s as it relates to 404s
http://googlewebmastercentral.blogspot.com/2010/06/crawl-errors-now-reports-soft-404s.html
They state that to correct soft 404s one thing you should look at is 2.b "Should redirect to a more accurate URL"
Google prefers that a 301 is more of a one-to-one or several to one.
There was also a post in SEOmoz on this
http://www.seomoz.org/blog/301-redirect-or-relcanonical-which-one-should-you-use
If you send too many 301s to a single page - like the home page, this may look like you are trying to manipulate link juice.
I would be so bold as to say Google prefers a one to one relationship for 301s vs a many to one.
Options
- If you want to just get rid of pages already in the index
After 20 days, show a 200 on the page, update the message to say that this product is no longer available (with a link to your search page), and then add the noindex meta tag to that page. This will allow Google to spider the page but remove it from the index. You would also need to remove all links on your site to this page after 20 days as well (see the rough sketch at the end of this post).
Leave the page up for 6 months to 1 year and then set up a 404, as by then the page is out of the Google index and the 20-day deal is long gone.
This will get those pages out of the index, tell users where they need to go in case they land on the page and minimize any 404 or soft 404 errors in Google webmaster tools.
- Keep them out of the index to start with
I am assuming that since these pages are only up for 20 days, they do not have time to really gain any search traction to start with and so would not show up in the SERPs (or rank that high if they did).
If that is the case, why not put all these pages in a separate folder that you block with robots.txt and then no follow all links to them. Keep them out of the index to start with.
Sounds like you are optimizing a category type page above the product pages anyway and so you just focus on optimizing the category page vs the product pages themselves.
Beyond that, I would need more specifics on the how and the why of what you are doing to try and figure out the if and the when of next steps.
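If it helps, here is a very rough sketch of the first option (200 + noindex, then a 404 much later) in Python - the function name, field names and time windows are just assumptions to show the flow, not your actual setup:

from datetime import date, timedelta

PROMO_WINDOW = timedelta(days=20)    # deal pages are live this long
INDEX_GRACE = timedelta(days=365)    # then serve 200 + noindex for roughly a year

def deal_page_response(start_date, today=None):
    """Status code and meta robots value for an expiring deal page."""
    today = today or date.today()
    age = today - start_date
    if age <= PROMO_WINDOW:
        return 200, "index, follow"          # live deal, normal page
    if age <= PROMO_WINDOW + INDEX_GRACE:
        # Expired: still return 200 so Google can crawl the page and see the
        # noindex tag; show the user a "deal is over" message plus a search link.
        return 200, "noindex, follow"
    return 404, None                         # long dead, a plain 404 is fine now

print(deal_page_response(date(2012, 1, 1), today=date(2012, 1, 15)))   # (200, 'index, follow')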
-
RE: What is the best approach: abbreviated citations or fully spelled out citations?
I would also cross-check it against the USPS format if you are just starting out, so you land on a good, consistent format that matches USPS conventions.
-
RE: Tracking on Analytics .ca domain When Redirect From GoDaddy Control Panel?
You do not list what analytics tool you are using, but here is how I would do this in Google Analytics, and this is probably similar to how your analytics tool works.
Google Analytics only tracks a page view when there is a page view and the GA code is executed on that page.
If you are having GoDaddy 301 redirecting all the domains to the .com then there is no page that is shown/viewed and therefore no GA code to execute to record the redirect.
What you need to do is set up the redirect to the .com so that it includes referral/campaign data in the URL:
http://support.google.com/googleanalytics/bin/answer.py?hl=en&answer=55578
So instead of redirecting from the .fr domain to
http://www.pilatesboisfranc.com
You would instead redirect from the .fr domain to a tagged version of that URL, for example:
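http://www.pilatesboisfranc.com/?utm_source=pilatesboisfranc-fr&utm_medium=redirect&utm_campaign=domain-consolidation
(Those utm values are just ones I made up for illustration.)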
You will then see where the referrals came from on your .com domain as a function of your GA reporting.
You will need to decide how to label your source, medium, and campaign appropriately; I just took a quick stab, but it may not be how you want to label things.
-
RE: Url canonicalization: www. to http://
Supposedly, over time Google will pass the credit from the old URLs to the new ones, but that is a process that I have seen take around 6 months. It is a long process.
If you already setup 301s from the www to the non www you are now like some politicians and flip flopping on where your site is located
This is your call. If your old URLs had been around for a long time and had a ton of link equity, then I would lean towards reverting back. It will still take a while for Google to sort it all out, but it should work. Short term loss, long term gain.
You have to consider links from other sites that use the old urls etc etc, things beyond Google. Sorry not to have a simple answer.
-
RE: Best practices for handling https content?
Don't go the whole site https route. You are just creating duplicate site nightmares.
Since you are working with cart and auth pages, you need to add a noindex, nofollow meta tag on those pages. This way they don't get into the index in the first place, and any pages that are in the index now will be dropped. Do not use robots.txt for this; use the noindex, nofollow meta tag.
You need to set up 301 redirects from the https to the http version for all pages except the cart and auth pages (i.e. those pages that are supposed to be https). If Google has found an https version of a page that is supposed to be http, the 301 will correct that, plus you get the user back to the right version of the page for bookmarking and other purposes.
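As a rough sketch of the rule itself (my own Python illustration - the list of secure paths is an assumption, swap in your real cart/auth paths, and in practice you would implement this in your server or application config):

from urllib.parse import urlsplit, urlunsplit

SECURE_PREFIXES = ("/cart", "/checkout", "/login", "/account")   # assumed cart/auth paths

def corrected_url(url):
    """Return the URL a request should be 301'd to, or None if the scheme is already right."""
    parts = urlsplit(url)
    wants_https = parts.path.startswith(SECURE_PREFIXES)
    if parts.scheme == "https" and not wants_https:
        return urlunsplit(("http",) + tuple(parts[1:]))
    if parts.scheme == "http" and wants_https:
        return urlunsplit(("https",) + tuple(parts[1:]))
    return None

print(corrected_url("https://www.example.com/blue-widgets"))   # -> http://www.example.com/blue-widgets
print(corrected_url("http://www.example.com/cart/view"))       # -> https://www.example.com/cart/view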
-
RE: Url canonicalization: www. to http://
www is seen as a separate subdomain from non-www. Same thing with http vs https - this is why you see the drop in domain authority.
Here are your options:
1. Get your www back on. Setup 301 redirects from the non www to the www.
2. Setup 301 redirects from the www to the non www and keep the new structure.
Option 1 is the better way to go with this if you can.
-
RE: SEO Overly-Dynamic URL Website with thousands of URLs
You bet. Just to be clear, I was talking about pulling the content from the page in some automated fashion into the title. Finding elements from each page that should be in variables and then inserting them into the title, description and H1 in a way that you can make each page unique.
You would need to make a final call on whether you think the content is unique enough. We had 5000 locations, and we used the data in the database to make 5000 unique pages, as each location has the name of the business, city, state, zip, address, phone number, etc.
We are now working to have user generated content/comments/reviews on each page so that each page becomes more unique and more useful over time.
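To give a feel for the kind of automation I mean, here is a tiny sketch in Python - the field names and templates are made up, and your database schema will differ:

# Hypothetical field names - the real ones depend on your schema.
TITLE_TEMPLATES = [
    "{name} | {city}, {state} {zip}",
    "{name} in {city}, {state} - Address, Phone & Hours",
    "{city}, {state} - {name} at {address}",
]

def build_title(location):
    # Rotate templates based on the record id so the phrasing varies across
    # pages but stays the same every time the page is rebuilt.
    template = TITLE_TEMPLATES[location["id"] % len(TITLE_TEMPLATES)]
    return template.format(**location)

print(build_title({"id": 42, "name": "Acme Pilates", "city": "Austin",
                   "state": "TX", "zip": "78701", "address": "123 Main St"}))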
Having an IT guy who appreciates SEO is key for this and for the URLs. I would talk first about how he organizes the data in his system and then how this translates into the URL. You can then just have him rename the URLs using the same logic.
Show him some data on click thru rates on more readable URLs and how Google prefers not to spider them. I work to educate the IT guys as much as I can without making it sound like I know it all.
-
RE: SEO Overly-Dynamic URL Website with thousands of URLs
Wow! Yep, you gotta lock this one down. This needs heavy automation support. Have you looked at whether the content from each build can be pulled into the title tag, etc. automatically? That can help diversify how the tags look. If the users love all the builds, then is there a way to use automation to help Google see this too?
If you work it right, you can have an awesome opportunity for long tail search. I worked on a site that had a yellow pages type setup. We had all the pages with title tags and descriptions that pulled in city, state, zip, address, and name of the location automatically. We even changed up the order in which it was presented and had different options for filler/connective words.
Worked pretty well to show off the unique content on each page as best we could automatically. We then paid an intern to go in and optimize page by page from there starting with the most viewed pages.
It may not be that each page is too weak, as you suggested - you also said that users love the pages - so I wanted to toss that out there.
-
RE: /forum/ or /hookah-forum/
Agreed - you already have the key term in the URL; all you are doing is adding to the length of the URL, and that can make it look spammy to users.
You will have other parameters added to the end of that URL and so very quickly you would be getting close to the 70 character max that SEOMoz suggests.
-
RE: Pagination for Ecommerce Site - How do I fix this?
Hey there Kristy O,
You want to put a noindex, follow meta tag on your paginated pages beyond page 1, as you want page 1 of a series to rank for the keywords that you are optimizing for. By preventing pages 2 through X in a given series from being indexed, you "clear up" for Google which pages you want it to focus on.
You have Google follow the links on all of those pages, as they all point to product pages that you do want to have indexed and ranked. There are also the rel next and prev attributes, but I like the noindex, follow meta tag approach as it gives me more control.
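The logic is dead simple - something like this (a trivial sketch; how your templates wire it into the meta tag is up to you):

def robots_meta_for(page_number):
    """Meta robots value for a paginated category listing.

    Page 1 is the page we want ranking; pages 2..N get noindex, follow so
    Google still follows the product links but keeps them out of the index.
    """
    return "index, follow" if page_number == 1 else "noindex, follow"

print(robots_meta_for(1))   # index, follow
print(robots_meta_for(4))   # noindex, follow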
If you want to use the canonical tag in this situation, the recommendation was to use the canonical to link back to a "view all" master page that contains everything from pages 1, 2, 3, 4 ... X. They want you to use canonical to have the parts point back to a whole. On most larger sites this is not practical, as a page with everything on it would take too long to load.
See here for when to use view all page vs not
http://googlewebmastercentral.blogspot.com/2011/09/view-all-in-search-results.html
I think that the use of the canonical in this case was more for a 3 part article vs pages after pages of a product catalog.
I really only use the canonical to help with things like having the printer friendly version point back to the main version etc.
If you want the Guru on pagination - Google Adam Audette
http://searchengineland.com/five-step-strategy-for-solving-seo-pagination-problems-95494
Good luck!
-
RE: 301 Redirect How Long until the juice passes through to new site
On one of our sites it took about 6 months before things got back to "normal" in the SERPs. As far as how long to leave the 301s in place, I would say indefinitely. There are links from other sites with old URLs that we still get traffic from, and we want to make sure we get that referral traffic.
Having the old 301s in place I do not think hurts anything as over time they will be used less and less. You could also look at your server logs and determine that some 301s are not used anymore and then turn them off.
It just always surprises me when I see Google looking for old pages that we have 301ed for over a year and so I leave them in place.
Good luck.
-
RE: Thoughts about stub pages - 200 & noindex ok, or 404?
I would agree with all the comments on how to technically deal with the random pages, but it is a losing battle until you get your website database/templates under control. I once had a similar issue and had to work months to get a solution in place as the website would create all kinds of issues like this.
We had to implement a system so that the creation of these pages would be minimized. The key is to make sure that any random page request gets a 404 from the start, so that the URL never gets indexed in the first place.
That said, for all the random URLs that are already indexed, I like the 200 option with the noindex meta tag. My reasons: otherwise, with 404s, you get all these meaningless error messages in GWT. The noindex also gets the page out of the index. I have seen Google retry 404s on one of our sites - crazy. Ever since Google started showing soft 404s for 301s that redirect many pages to a single URL, I only use 301s on more of a one-to-one basis.
Good luck.
-
RE: Duplicate content
Wouldn't you want to noindex one of the agreements, rather than nofollow it? That way only one page gets indexed.
The other thing is you could canonical link the copy page back to the original.
-
RE: Is it a problem to have a homepage with a slug / URL ?
Yes, this is a big problem. You are showing Google two pages with the same content. This would be the case for any two pages with the same content on your site.
You definitely need to set up a 301 redirect (not a 302) from the slug version to the main page.
You have to be careful with CMS systems. They may be generating a ton of duplicate pages and you are not even aware of it. Check with your IT person, but then look in Google Webmaster Tools and also use other tools to check (e.g. spidering tools).
You want each URL on your site to have a unique title tag, description, and page content.
Good luck!
-
RE: Duplicate title tags and meta description tags
Check that your canonical is indeed linking to the right page.
You can go into Google Webmaster Tools and specify that they need to ignore that parameter and that it is the same page.
Look under Configuration > URL Parameters
More info here
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1235687
-
RE: Block all search results (dynamic) in robots.txt?
I would agree with BK Search. You want to minimize what Google has to crawl (I know this sounds backwards) so that Google focuses on the pages that you want to rank.
Long term, why would you waste GoogleBot's time on pages that don't matter as much? What if you had an update on a more important page and GoogleBot was too busy crawling this infinite loop of pages?
At this point, I would use the noindex meta tag vs robots.txt, so that Google will crawl the pages and remove all the URLs from the index. Then you can drop the section into robots.txt later so it will stop crawling them. Otherwise you may end up with a lot of junk in the index.
-
RE: On-Page optimization for the Long-Tail
I think you answered your own question. You have to compare search traffic potential for "dog training twin cities" vs "training german shepherd". If you see enough traffic on the German Shepherd stuff, create a new page and link to it. Maybe focus on the top 5 types of dogs and have that as a section.
You are talking about stealing from Peter to pay Paul, but it looks like you are being smart and using analytics to try and answer your question. There is your science. You have to look at the long tail of search terms from analytics, and you should see some patterns there. That will guide you on where to build content over time.
There are some cool regular expressions you can use in Google Analytics to build keyword reports around 2 and 3 word queries. One example:
http://secretswede.net/seo/measure-longtail-traffic-google-analytics-mayday-update/
Take this and set up a segment to see, as a whole, whether this type of traffic is really driving conversions. It may be that you get traffic but fewer conversions. This adds another layer to how you look at this and whether it is worth spending time building out content.
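As a concrete illustration (my own example pattern, not one from the article above), a filter like this keeps only queries of three or more words - in GA you would paste just the pattern into a filter or segment:

import re

LONG_TAIL = re.compile(r"^\s*(\S+\s+){2,}\S+\s*$")   # 3+ word queries

print(bool(LONG_TAIL.match("dog training twin cities")))   # True
print(bool(LONG_TAIL.match("dog training")))                # False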
Good luck!
-
RE: Soft 404's from pages blocked by robots.txt -- cause for concern?
Me too. It was that video that helped to clear things up for me. Then I could see when to use robots.txt vs the noindex meta tag. It has made a big difference in how I manage sites that have large amounts of content that can be sorted in a huge number of ways.
-
RE: Soft 404's from pages blocked by robots.txt -- cause for concern?
Take a look at
http://www.youtube.com/watch?v=KBdEwpRQRD0
to see what I am talking about.
Robots.txt does prevent crawling according to Matt Cutts.
-
RE: SIte Redesign - Disaster for Organic Traffic
You can use the analytics to tell where the traffic drops are coming from but I think that you need to get to why Google or some other search engine traffic is dropping. So I totally agree with what EGOL mentions, but I think you already have a global grasp that traffic has dropped. Here are what I would suggest are next steps to then fix the issues of why the traffic has dropped.
Odds are that you have a technical problem going on.
Go through your GWT reports for crawl errors and HTML optimization etc. We relaunched and it gave us all kinds of clues on what to fix.
You need to double check all of your 301 paths - I bet that there are some holes. This is down and dirty detail work, and they are probably not as correct as you think. Don't assume that if you test a few, the rest are OK. Don't assume that because the developer says they are working, they are working. Run them through a tool to make sure that you are sending a 301 response. I had a site that was using 302 responses vs 301 (a temporary vs a permanent redirect). Verify, verify, verify.
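Even a quick script will do it - something like this, using the Python requests library (the URLs below are placeholders for your real old URLs):

import requests

OLD_URLS = [
    "http://www.example.com/old-page-1",   # hypothetical; use your actual old URLs
    "http://www.example.com/old-page-2",
]

for url in OLD_URLS:
    r = requests.head(url, allow_redirects=False, timeout=10)
    print(url, r.status_code, r.headers.get("Location"))
    # You want to see 301 (not 302) and a Location header pointing at the right new page.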
Did you change title tags on your pages? Titles are a big signal, and if you change them, you can see a drop. Look at your traffic and ranking analysis for sample pages. If you did make a radical change, you may want to go back to your original title tag approach.
Run site speed tools
https://developers.google.com/speed/pagespeed/insights#url=www.brickhousesecurity.com&mobile=false
See if your speed has dropped. The GWT will also show over time if your site is speeding up or slowing down. You may need to look at your server setup.
Verify that the HTML validates.
It looks like you may have changed a bunch of things at once, so it is hard to tell which change impacted what. Usually in cases like this, Google is trying to figure out the changes, and it may take a while to sort things out. I just gave some examples, but you need to review everything that has changed and see what the differences are. Even with 301 redirects that are correct, if you changed the URL structure and title tags and the new site has HTML that does not validate, Google may take a while to sort it out. I had a site that we did a complete overhaul on, and it was 6 months before the traffic came back, and that was with pretty good controls in place.
Good luck!
-
RE: Soft 404's from pages blocked by robots.txt -- cause for concern?
Just a couple of under the hood things to check.
- Are you sure your robots.txt is set up correctly? Check in GWT to see that Google is reading it.
- This may be a timing issue. Errors take 30-60 days to drop out (from what I have seen), so did they show as soft 404s and then you added them to robots.txt? If so, this may be a sequence issue: if Google finds a soft 404 (or some other error), then comes back to spider and is not able to crawl the page due to robots.txt, it does not know the current status of the page, so it may just leave the last status that it found.
- I tend to see soft 404s for pages that have a 301 redirect with a many-to-one association. In other words, you have a bunch of pages that 301 to a single page. You may want to consider changing where some of the 301s redirect, so that they go to a specific page vs an index page.
- If you have a page in robots.txt, you do not want it in Google, so here is what I would do: show a 200 on that page, but then put a noindex, nofollow in the meta tags.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93710
"When we see the noindex meta tag on a page, Google will completely drop the page from our search results, even if other pages link to it"
Let Google spider it so that it can see the 200 code - you get rid of the soft 404 errors. Then toss in the noindex, nofollow meta tags to have the page removed from the Google index. It sounds backwards that you have to let Google spider a page to get it removed, but it works if you walk through the logic.
Good luck!
-
RE: Seo Moz says there are 124 links on this page - do you see it?
Take the page and select View Source.
Search for <a href
I get about that many links. You have the nav on the right side (weddings by color, by season, by theme), you have 2 links each to the different weddings (the title and the read more), social media links, etc.
I think what may confuse you is that your theme does not underline links, so maybe you are not seeing them with your eyes. View Source and look at the HTML - they are all there. :-)
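If you want to double-check the count without eyeballing the source, a quick and dirty script gets you in the ballpark (the URL below is just a placeholder for your page):

import re
import requests

html = requests.get("http://www.example.com/some-page", timeout=10).text
links = re.findall(r"<a\s[^>]*href", html, flags=re.IGNORECASE)
print(len(links))   # should land in the same ballpark as the SEOmoz count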
-
RE: Indexing non-indexed content and Google crawlers
I think Screaming Frog has a trial version; I forget if it limits the total number of pages, etc., as we bought it a while ago. At least you can try it out and see. There may be others who know of more tools as well.
-
RE: Changing Hosting Companies - Site Downtime - Google Indexing Concern
I would say it is preferable to go ahead and setup the site on the new host first before you take down the old one. While you should be "ok" with the downtime, I would not recommend it. You never know when the spiders come along. You would probably not be de-indexed, but Google would see a bunch of errors and you could potentially see a drop in the SERPS and then traffic to your site as a result. This should all recover.
I have seen on our sites drops in traffic when we have had technical difficulties. I usually see issues in GWT or other tools and I get them fixed and the traffic comes back.
Here is the other thing: what if something goes wrong during that 12 hours? What if 12 hours becomes 24, 48, etc. due to unforeseen issues? A site being down for any amount of time is just bad for business and users in general; you do not want that, let alone the search engine issue. What if something goes wrong with the new host and you need to revert back to the old? This has happened to me and trust me, you do not want this to happen to you. Murphy likes to play games with scenarios like this; I do not mess with Mr. Murphy and his laws.
If I were you here is what I would do
- setup the new host
- setup your site on the new host
- test test test on the new host
- change the DNS from the old to the new host
- watch the traffic move
- test test test on the new host
- shut down the old host
We have overlapped old and new hosts for up to 2 months just to make sure everything is set. You always back up your data, yes? Why would you not want a backup of your entire website?
Good luck!
-