Are the CSV downloads malformed when a comma appears in a URL?
-
Howdy folks, we've been a PRO member for about 24 hours now and I have to say we're loving it! One problem I am having, however, is with a CSV export that I've downloaded from our crawl diagnostics summary.
The CSV contains all the data fine, but I can't parse it properly when a URL contains a comma. I am making a little tool to work with the CSVs we download, and some URLs contain commas yet aren't quoted the way other fields, such as meta_description_tag, are.
Is there something simple I'm missing or is it something that can be fixed?
Looking forward to learning more about the various tools. Thanks for the help.
-
I won't be too hard on the programmers - I'm a programmer myself. Our small business has developers and designers doing the bulk of the SEO. I can see you've looked into it as I have - there are many factors involved if I were to decide to "fix" this myself. To be honest, I don't fancy it - I'm hoping the better approach will come from the wonderful SEO Moz developers who might put in a fix. Hint hint.
-
The first rule in this business is "You can't trust programmers"
I should know, I am a programmer and I used to manage teams of them.
You can't trust them to write something perfect, because they will always make huge assumptions, based on what they know.
They should know that URLs can contain commas, and they should quote them.
If they didn't do that in the final field, it is a deficiency in the code and your stuff isn't going to work unless you fix it manually.
What you need to do to fix this is to add a quote after the 10th comma and also add one at the end of each line.
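That manual repair can be sketched in a few lines of Python. This is only a rough sketch under loud assumptions: that the unquoted URL is the last field, that exactly 10 comma-free fields come before it, and that no field contains a line break. The function name `quote_last_field` and the `leading_commas` parameter are made up for illustration.

```python
def quote_last_field(line: str, leading_commas: int = 10) -> str:
    """Wrap everything after the Nth comma in double quotes.

    Assumes the first `leading_commas` fields contain no commas
    themselves, which may not hold for a real export.
    """
    line = line.rstrip("\r\n")
    # Split at most `leading_commas` times; the remainder is the last field.
    parts = line.split(",", leading_commas)
    if len(parts) <= leading_commas:
        return line  # fewer commas than expected: leave the line untouched
    tail = parts[-1]
    if len(tail) >= 2 and tail.startswith('"') and tail.endswith('"'):
        return line  # the last field is already quoted
    # Double any embedded quotes, then wrap the field in quotes.
    tail = '"' + tail.replace('"', '""') + '"'
    return ",".join(parts[:-1] + [tail])


raw = "a,b,c,d,e,f,g,h,i,j,http://example.com/x,y,z"
print(quote_last_field(raw))
```

It counts commas blindly, so it breaks as soon as an earlier field contains a comma of its own.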
Unfortunately, even that is a problem.
The problem is there are other fields that may not be quoted, some of which can start with http://
There can also be line breaks in the title field, and possibly even in the link text field.
Quotes inside a quoted field are escaped by doubling them.
Titles and link text can also contain commas, so it is very complex.
Some of the fields are a bigger mess because it depends on the link text, and if the link text contains an image, you'll have quotes and equals signs, commas and all kinds of stuff. You can also have upper ASCII characters and multibyte characters.
They did actually quote the first URL when it contains commas.
They really should have quoted every field.
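For comparison, here is a minimal sketch of what "quote every field" looks like, using Python's standard `csv` module. It assumes the input is already well-formed CSV, so it will not rescue a file where a URL with commas was left unquoted. The helper name `requote_all` and the sample data are invented for illustration and are not taken from a real export.

```python
import csv
import io


def requote_all(text: str) -> str:
    """Parse well-formed CSV text and write it back with every field quoted."""
    reader = csv.reader(io.StringIO(text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Invented sample: a title with a line break, a URL with a comma,
# and a field containing doubled (escaped) quotes.
sample = (
    "title,url\n"
    '"Line one\nline two","http://example.com/a,b"\n'
    '"Say ""hi""",http://example.com/c\n'
)

print(requote_all(sample))
```

When every field is quoted consistently, a standard parser copes with commas, line breaks and doubled quotes inside fields without any comma counting.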
Related Questions
-
Solving URL Too Long Issues
Moz.com is reporting that many URLs are too long. This particularly affects product URLs, where the URL is typically https://www.domainname.com/collections/category-name/products/product-name (you guessed it, we're using Shopify). However, we use canonicals that ignore most of the URL and are just structured as https://www.domainname.com/product-name, so Google should be reading the canonical and not the long-winded version. However, Moz cannot seem to spot this... does anyone else have this problem, and how can we solve it so that we satisfy the Moz.com crawl engine?
Moz Pro | | Tildenet0 -
Any tool built into MOZ that can help tell who the owner of a URL is?
I'd like to know if there's any tool which would let us know who the owner of a web domain is.
Moz Pro | | daleseppie0 -
[Moz Help] Re: Trying to add a valid URL into MOZ account
See below and pls let us know what we have to do solve this : | | Joel Day (Moz Help) Mar 07 05:03 PM Hey Tracy, It looks like there's a redirect loop on your site. greatwesternflooring.com redirects to www.greatwesternflooring.com/ which in turn 302 redirects back into itself. You'll likely need to fix the redirect before you can continue configuring the campaign. 🙂 Thanks!
Moz Pro | | Britewave
Joel. Moz
t: @HelpWizard | | | Tracy Mar 07 03:14 PM I sent an email, and this is the response I got. The help forum sent me here, so here I am 🙂 An answer was posted to this question:
Question I have a valid URL greatwesternflooring.com, but when I try to add this campaign I get an "opps" message telling me it's not a valid URL. Can you help me? Answer
This looks like a bug. Please reach out to us via support so that we can forward this along to our Developers for review. Thanks!(https://moz.com/help/contact)
See where this question was originally asked. |0 -
How to get past PA and DA value for a specific URL ?
Hi everyone, I was wondering if there is a way to get the past PA and DA values for a specific URL? I ran a small SEO campaign targeting a couple of deep pages on my site over a month, and I would like to measure the efficiency of this campaign, but I forgot to write down the PA of those pages (I know the DA more or less) before starting the campaign. Is there a way to retrieve the historical PA/DA data? Thanks
Moz Pro | | Gus_Martin0 -
Links not appearing on Open Site Explorer
My site gained several new inbound links during December and only two of them are not all showing up on the latest Linkscape update. It seems to be the links that were created at the end of the month which are showing up, whereas a handful at the beginning of the month are nowhere to be seen. All the linking pages have been indexed by Google, the links are do-follow, and one of the sites in particular is not obscure and has a DA in the 90s. I appreciate that Linkscape doesn't index everything, but I would have thought that more of the results of my efforts would have shown up in OSE. I'd be really grateful if anyone could explain this to me please. Thanks Ben
Moz Pro | | atticus70 -
Getting relevant keywords from URL with Google KW Tool.
Hi, When I first start researching a site, I like to see what Google "thinks" it is relevant to. I use the Google KW Tool and enter the website URL only. I sort the results by relevance. I can then show the prospective client what Google thinks his site is optimized for and use that info to show him what opportunities exist to rank for terms more relevant to his business. I show him keyword, volume and I also get current SERP rank for his site. For larger sites, I do this for the top pages based Domain Authority. I want to automate this process using excel and APIs but Google refused my API token request. I told them I wanted to use the "Google AdWords API Extension for Excel" from http://seogadget.co.uk/google-adwords-plugin-excel. The Google API token team replied: Please note, after reviewing your application in detail, we are sorry to let you know that we won't be able to approve your token. We understand that you are planning to use the AdWords API mainly for Targeting Idea Service (TIS) and Traffic Estimation Service (TES) such as 'keyword research'. Please note that as per the Required Minimum Functionality (RMF) outlined in the API Terms & Conditions, using the AdWords API exclusively for TIS and TES type of services is not allowed. Q1: What does the KW Tool relevancy data mean, anyway? Q2: is there another way to get it or is there another way to do this? Q3: Is there a better approach I should take with the Google API team? Q4: Are there other APIs and Excel plugins that can do this, including the SEOMoz APIs? Thanks,
Moz Pro | | phersh
Phil0 -
Rel Canonical issues for two urls sharing same IP address
Our client built a wordpress site on url A, then opted for a better url B. Rather than moving all the wordpress files/website over to the new url B, they just contacted GoDaddy, who hosted BOTH urls under the same IP address. When I do a term target on url B, I'm flagged for rel canonical use. I can only get a B grade for each keyword. (I've also tried using url A, but I get the same flag and B grade results). I'm not sure if this set-up will thwart our seo efforts for the site, because only the homepage comes up when you type in url B anyway. Every subsequent page displays the original url A. Somewhere, wordpress is also adding a rel canonical link on the homepage source to url A, too, which we can't seem to edit. So, question is: is it ok to leave this set up as is with both urls hosted on the same IP address, or should we move the whole site over to the desired url B? Thanks much!
Moz Pro | | GravitateOnline0 -
Exporting .csv
I love all the data Roger gives me if I ask him politely. It's awesome to turn that data into a nice-looking Excel file for analysis. There is, however, one situation that gets me into trouble. When I export a CSV, open it in Excel and convert text to columns (separated by comma), and e.g. a Page Title contains a comma (which often happens), my file separation is messed up. Anyone got some tips to handle that? Thanks in advance, mozzers
Moz Pro | | Partouter0