
Exploratory Data Analysis of Tropical Storms in R


The disastrous impact of the recent hurricanes Harvey and Irma generated a large influx of data within the online community. I was curious about the history of hurricanes and tropical storms, so I found a data set on data.world and started some basic exploratory data analysis (EDA).

EDA is crucial to starting any project. Through EDA you can start to identify errors and inconsistencies in your data, find interesting patterns, see correlations and start to develop hypotheses to test. For most people, basic spreadsheets and charts are handy and provide a great place to start. They are an easy way to manipulate and visualize your data quickly. Data scientists may cringe at the idea of using a graphical user interface (GUI) to kick off the EDA process, but those tools are very effective and efficient when used properly. However, if you’re reading this, you’re probably trying to take EDA to the next level. The best way to learn is to get your hands dirty, so let’s get started.


US Immigration Enforcement – Part 1

Trend of US Immigration Enforcement

In the coming months I’ll be digging into the immigration enforcement data posted on data.world. I encourage anyone to take this data and either add to the project or do something on their own. I will be bringing in external data sources to merge as well (which I did for this first plot).

If you’re only here for a “high-level nugget” of information, the basic thing you can see is:

Things have changed since 1925!


Google Vision API in R – RoogleVision

Using the Google Vision API in R

Utilizing RoogleVision

After doing my post last month on OpenCV and face detection, I started looking into other algorithms used for pattern detection in images. As it turns out, Google has done a phenomenal job with their Vision API. It’s absolutely incredible the amount of information it can spit back to you by simply sending it a picture.

Also, it’s free for light use! I believe the free tier covers 1,000 images per month. Amazing!

In this post I’m going to walk you through the absolute basics of accessing the power of the Google Vision API using the RoogleVision package in R.

Face Detection in R


OpenCV is an incredibly powerful tool to have in your toolbox. I have had a lot of success using it in Python but very little success in R. I haven’t done much beyond searching Google, but it seems that “imager” and “videoplayR” provide much of the functionality, though not all of it.

I had never actually called Python functions from R before. Initially, I tried the “rPython” library. It has a lot of advantages but was completely unnecessary for my purposes, and system() worked absolutely fine. While this example is extremely simple, it should help to illustrate how easy it is to utilize the power of Python from within R. I need to give credit to Harrison Kinsley for all of his efforts and work at PythonProgramming.net. I used a lot of his code and ideas for this post (especially the Python portion).
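To give a feel for the system() approach, here is a minimal sketch. It is not the code from the post: the helper name run_python() is hypothetical, and it assumes a python3 executable is available on your PATH.

```r
# Minimal sketch (not the post's actual code): run a Python one-liner from R
# with system() and capture what it prints.
run_python <- function(code, python = Sys.which("python3")) {
  # Sys.which() returns "" when no interpreter is found on the PATH.
  if (!nzchar(python)) {
    stop("No python3 executable found on the PATH")
  }
  # shQuote() protects the path and the code string from the shell.
  cmd <- paste(shQuote(python), "-c", shQuote(code))
  # intern = TRUE returns the command's stdout as a character vector,
  # one element per printed line.
  system(cmd, intern = TRUE)
}

# Only attempt the call when an interpreter is actually available.
if (nzchar(Sys.which("python3"))) {
  result <- run_python("print(1 + 1)")
  print(result)
}
```

For a real script you would likely pass a file path instead of "-c" plus a code string, but the pattern of building a command and reading back its output stays the same.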



Hospital Infection Scores – R Shiny App

Medicare Data – R Shiny App

About two weeks ago I created an extremely rough version of an R Shiny Application surrounding Medicare data. Right after publishing the blog post, I received a lot of input for improvement and help from others.

Here’s a look at the latest version of the Medicare Shiny App. It uses data from data.gov.

I was traveling for two weeks and had very little time to do any work on it. After I created a GitHub repository for it, the user Ginberg played a huge role in cleaning it up and adding a lot more functionality. I found it incredible that a complete stranger would put such effort into something like this. In fact, he isn’t even a resident of the USA, so Medicare probably isn’t on his radar as often as it is for some of us. Fantastic generosity!

Ultimately, I will be looking to keep this project alive and grow it to make use of much more of the Medicare data available. The infections data set was very simple and easy to use, so I started off with it, but there are a lot more tables listed on data.gov. The purpose of this application is to let people who don’t want to spend time digging through tables use the information available. It isn’t just for people deciding where to seek care; it could also be useful to others doing research on hospitals in the US.

The R Shiny App allows you to filter by location and infection information. These filters help you find the information you actually care about.

Three key tabs were created by @Ginberg:

  • Sorting hospitals by infection score
  • Maps of hospitals in the area
  • Data table of hospital data

Sorting hospital data by score:

  • This is a tricky plot because “score” is different for each type of metric
  • Higher “scores” aren’t necessarily bad because they can be skewed by more heavily populated (or denser) areas
  • Notice the use of plotly and its interactivity
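
The filtering and score sorting described above can be sketched in base R. This is a toy illustration rather than the app’s code: the data frame, its column names and the helper rank_hospitals() are all hypothetical, and every value in it is made up. Because “score” means something different for each metric, the sketch only ranks hospitals within a single measure.

```r
# Toy data: every name and value below is invented for illustration only.
hospitals <- data.frame(
  hospital = c("A", "B", "C", "D", "E", "F"),
  state    = c("OH", "OH", "OH", "MI", "OH", "OH"),
  measure  = c("m1", "m1", "m2", "m1", "m1", "m2"),
  score    = c(1.2, 0.4, 30, 0.9, NA, 12),
  stringsAsFactors = FALSE
)

# Keep one state and one measure, drop missing scores, and sort so the
# highest score comes first. Scores are never compared across measures.
rank_hospitals <- function(data, state, measure) {
  keep <- data$state == state & data$measure == measure & !is.na(data$score)
  rows <- data[keep, , drop = FALSE]
  rows[order(rows$score, decreasing = TRUE), , drop = FALSE]
}

ranked <- rank_hospitals(hospitals, state = "OH", measure = "m1")
print(ranked)
```

In a Shiny app, the state and measure arguments would typically come from input controls, and the sorted result would feed the plot or data table.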
