Writing A Data Extraction To Web Page Program
-
In my area, there are a few different law enforcement agencies that post real-time data on car accidents. One is http://www.flhsmv.gov/fhp/traffic/crs_h501.htm. They post the accidents by county, and then under the location heading they add the intersection and the city. For most of these counties and cities, our website, http://www.kempruge.com/personal-injury/auto-and-car-accidents/, has city- and county-specific pages. I need to figure out a way to pull the information from the FHP site and other real-time crash sites so that it will automatically post on our pages. For example, if there's an accident in Hillsborough County on I-275 in Tampa, I'd like to have that immediately post on our "Hillsborough county car accident attorney" page and our "Tampa car accident attorney" page.
I want our pages to have something comparable to a stock ticker widget, but for car accidents specific to each page's location, and one that combines all the info from the various law enforcement agencies. Any thoughts on how to go about creating this?
As always, thank you all for taking time out of your work to assist me with whatever information or ideas you have. I really appreciate it.
-
-
Write a Perl program (or a script in another language) that will: a) read the target webpage, b) extract the data relevant to your geographic locations, c) write a small HTML file to your server that formats the data into a table that will fit on the webpage where you want it published.
-
Save that Perl program in your /cgi-bin/ folder. (You will need to change file permissions to allow the Perl program to execute and the small HTML file to be overwritten.)
-
Most servers allow you to execute files from your /cgi-bin/ on a schedule such as hourly or daily. These are usually called "cron jobs". Find this in your server's control panel. Set up a cron job that will execute your Perl program automatically.
-
Place a server-side include, sized and shaped to fit your data table, on the webpage where you want the information to appear.
This set-up will work until the URL or format of the target webpage changes. Then your script will produce errors or write garbage. When that happens you will need to update the URL in the script and/or the way the script parses the page.
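The read / extract / write steps above can be sketched roughly as follows. This is a minimal sketch in Python rather than Perl, using only the standard library. It assumes the target page lists crashes in an HTML table whose cells are time, county, and location in that order; that layout, the `COUNTY_COLUMN` index, the `crash-ticker` class name, and the function names are all assumptions for illustration, not the real structure of the FHP page.

```python
from html import escape
from html.parser import HTMLParser
from pathlib import Path
from urllib.request import urlopen

# Hypothetical layout: each data row is <td>time</td><td>county</td><td>location</td>.
COUNTY_COLUMN = 1


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Join the text chunks and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows made of <th> cells stay empty and are skipped
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(page_html, county):
    """Return the table rows whose county cell matches `county` (case-insensitive)."""
    parser = TableRowParser()
    parser.feed(page_html)
    parser.close()
    wanted = county.lower()
    return [
        row for row in parser.rows
        if len(row) > COUNTY_COLUMN and row[COUNTY_COLUMN].lower() == wanted
    ]


def render_fragment(rows):
    """Format the rows as a small HTML table, escaping every cell."""
    body = "\n".join(
        "<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f'<table class="crash-ticker">\n{body}\n</table>\n'


def update_page(url, county, out_path):
    """Fetch the page, keep one county's rows, and overwrite the fragment file."""
    with urlopen(url, timeout=30) as response:
        page_html = response.read().decode("utf-8", errors="replace")
    fragment = render_fragment(extract_rows(page_html, county))
    Path(out_path).write_text(fragment, encoding="utf-8")
```

The cron job would then call something like `update_page(url, "Hillsborough", "hillsborough.html")` on a schedule, and the server-side include would pull in the file it writes. The output file name here is a placeholder.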
-
-
You need a developer who understands HTTP requests well. That developer will need to know how to run a spidering program that polls the website, looks for changes, and scrapes the data off those sites. The program should also check whether the markup of the page has changed, because if it has, the scraper will need to be rewritten to account for it.
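As a rough illustration of those two checks (has the content changed, and has the page layout changed), here is a minimal Python sketch using only the standard library. The expected header labels are hypothetical; a real scraper would record whatever column headings the target page actually uses.

```python
import hashlib
from html.parser import HTMLParser

# Hypothetical column headings the scraper was written against.
EXPECTED_HEADERS = ["Time", "County", "Location"]


class HeaderParser(HTMLParser):
    """Collect the text of every <th> cell on the page."""

    def __init__(self):
        super().__init__()
        self.headers = []
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "th":
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "th" and self._cell is not None:
            self.headers.append(" ".join("".join(self._cell).split()))
            self._cell = None


def layout_unchanged(page_html, expected=EXPECTED_HEADERS):
    """True if the page still has exactly the headings the scraper expects."""
    parser = HeaderParser()
    parser.feed(page_html)
    parser.close()
    return parser.headers == list(expected)


def content_fingerprint(page_html):
    """Hash of the page; compare with the previous run to see if anything changed."""
    return hashlib.sha256(page_html.encode("utf-8")).hexdigest()
```

A scheduled run could skip the rewrite when the fingerprint matches the previous one, and alert someone instead of publishing when `layout_unchanged` returns False, so the page never shows garbage.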
Ideally, those sites would have some sort of data API or XML feed to pull from, but odds are they do not. It is worth asking, because a feed would make the programming much easier. It looks like the site is using CMS software from http://www.cts-america.com/ - they may be the better group to talk to, since you would potentially be interfacing with the software they develop rather than with some minion at the help desk for the dept of motor vehicles.
Good luck and please do produce a post here or a YouMoz post to show the finished product - it should be pretty cool!