Writing A Data Extraction To Web Page Program
-
In my area, there are a few different law enforcement agencies that post real-time data on car accidents. One is http://www.flhsmv.gov/fhp/traffic/crs_h501.htm. They post the accidents by county, and then in the location heading they add the intersection and the city. For most of these counties and cities, our website, http://www.kempruge.com/personal-injury/auto-and-car-accidents/, has city- and county-specific pages. I need to figure out a way to pull the information from the FHP site and other real-time crash sites so that it will automatically post on our pages. For example, if there's an accident in Hillsborough County on I-275 in Tampa, I'd like to have that immediately post on our "Hillsborough County car accident attorney" page and our "Tampa car accident attorney" page.
I want our pages to have something comparable to a stock ticker widget, but for car accidents specific to each page's location, and one that combines all the info from the various law enforcement agencies. Any thoughts on how to go about creating this?
As always, thank you all for taking time out of your work to assist me with whatever information or ideas you have. I really appreciate it.
-
Write a Perl program (or a script in another language) that will: a) read the target webpage, b) extract the data relevant to your geographic locations, and c) write a small HTML file to your server that formats the data into a table that will fit on the webpage where you want it published.
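To make steps a-c concrete, here is a minimal sketch in Python rather than Perl (the idea is the same either way). The real FHP page's markup is unknown here, so `extract_incidents()` assumes one incident per `<tr>` with the location text in its cells; the filter list, function names, and output file name are all my own placeholders to adjust for the page you actually scrape.

```python
# Sketch only: patterns assume a simple <tr>/<td> table, which the real
# FHP page may not use. Adjust the regexes to the actual markup.
import re
import urllib.request

TARGET_LOCATIONS = ("Hillsborough", "Tampa")  # hypothetical filter list

def extract_incidents(html, locations=TARGET_LOCATIONS):
    """Return table rows (lists of cell strings) mentioning a target location."""
    incidents = []
    for row in re.findall(r"<tr[^>]*>(.*?)</tr>", html, re.S | re.I):
        cells = [re.sub(r"<[^>]+>", "", c).strip()
                 for c in re.findall(r"<td[^>]*>(.*?)</td>", row, re.S | re.I)]
        if any(loc.lower() in " ".join(cells).lower() for loc in locations):
            incidents.append(cells)
    return incidents

def render_table(incidents):
    """Format the extracted rows as a small HTML fragment for inclusion."""
    rows = "\n".join(
        "<tr>" + "".join("<td>%s</td>" % cell for cell in row) + "</tr>"
        for row in incidents)
    return '<table class="accident-ticker">\n%s\n</table>\n' % rows

def main():
    # a) read the target webpage
    page = urllib.request.urlopen(
        "http://www.flhsmv.gov/fhp/traffic/crs_h501.htm").read().decode(
            "utf-8", errors="replace")
    # b) extract the relevant rows, c) write the small include file
    with open("accidents.html", "w") as out:
        out.write(render_table(extract_incidents(page)))
```

The scheduled job described later in this answer would simply run `main()` every hour.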
-
Save that Perl program in your /cgi-bin/ folder. (You will need to change file permissions to allow the Perl program to execute and the small HTML file to be overwritten.)
-
Most servers allow you to execute files in your /cgi-bin/ on a schedule, such as hourly or daily; these scheduled runs are usually called "cron jobs". Find this in your server's control panel and set up a cron job that will execute your Perl program automatically.
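For example, a raw crontab entry that runs the scraper at the top of every hour might look like this (the interpreter path and script name are hypothetical and depend on your server's layout):

```
# m h dom mon dow  command
0 * * * * /usr/bin/perl /home/youraccount/cgi-bin/accidents.pl >/dev/null 2>&1
```

Many shared hosts hide the raw crontab behind a "Cron Jobs" screen in cPanel, where you fill in the same schedule fields and command.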
-
Place a server-side include, sized to fit your data table, on the webpage where you want the information to appear.
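If your host supports SSI, the include directive on the page might look like the following (the file path is hypothetical, and the including page typically needs a .shtml extension unless the server is configured otherwise):

```
<!--#include virtual="/includes/accidents.html" -->
```

The cron job then only ever rewrites the small included file, and the surrounding page never has to change.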
This setup will work until the URL or format of the target webpage changes; then your script will produce errors or write garbage. When that happens, you will need to update the URL in the script and/or the parsing logic to match the new format.
-
You need a developer who understands HTTP requests well — someone who knows how to run a spidering program that pings the website, looks for changes, and scrapes data off of those sites. You will also need the program to check whether the coding on the page changes, because if it does, the scraping program will need to be rewritten to account for it.
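The "check whether the coding on the page changes" step can be sketched by hashing the page's tag skeleton and comparing it between fetches: new accident data leaves the skeleton alone, while a redesign changes it. The function names here are my own, not from any library.

```python
# Sketch: fingerprint only the markup structure, ignoring the text between
# tags, so routine data updates don't trigger a false "layout changed" alarm.
import hashlib
import re

def markup_fingerprint(html):
    """Hash the sequence of tags, ignoring the text content between them."""
    skeleton = "".join(re.findall(r"<[^>]+>", html))
    return hashlib.sha256(skeleton.encode("utf-8")).hexdigest()

def layout_changed(html, stored_hash):
    """True when the page's tag structure differs from the remembered one."""
    return markup_fingerprint(html) != stored_hash
```

The scraper would store the fingerprint after each successful run and alert a human (rather than publish garbage) whenever `layout_changed()` comes back true.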
Ideally, those sites would offer a data API, an XML feed, or something similar to pull from, but odds are they do not. It would be worth asking, since the programming (and the programmer) would have a much easier time. It looks like the site is using CMS software from http://www.cts-america.com/ - they may be the better group to talk to about this, since you would potentially be interfacing with the software they develop rather than with some minion at the help desk for the dept of motor vehicles.
Good luck and please do produce a post here or a YouMoz post to show the finished product - it should be pretty cool!