Data Cleaning, Analysis, and Visualization

Sorry, but in this blog post you will have to scroll down and try to stay awake. Visualizing data is only sometimes fun, and that is fine!

For our final project, we have been exploring, cleaning, analyzing, and visualizing data related to the potential causes of immigrants’ food insecurity. In the process, I found the Massachusetts DOR Income and EQV Per Capita by municipality data set. Even though the data set from the Data Analytics and Resources Bureau was manageable, the visualization was a struggle.

Link for the original dataset: dlsgateway.dor.state.ma.us

Before cleaning the data, I needed to clarify the terminology. I was not familiar with economic topics in any depth, so I had to do some research to understand what I needed to keep and what I needed to remove. Here is some of the terminology I got used to while making sense of income in Massachusetts:

Massachusetts Department of Revenue (DOR): DOR manages state taxes and child support. It also helps cities and towns manage their finances and administers the Underground Storage Tank Program.

Equalized Valuation (EQV) is the determination of an estimate of the full and fair cash value (FFCV) of all property in the Commonwealth as of a certain taxable date.

Cherry Sheet FY: The Cherry Sheet is named for the cherry-colored paper on which it was originally printed. It is the official notification from the Commissioner of Revenue of the upcoming fiscal year’s state aid and assessments to cities, towns, and regional school districts.

After understanding that a good share of the data was based on estimates, I decided to remove the DOR code, LEA code, Cherry Sheet FY, DOR income estimate, EQV, and EQV Per Capita columns. Leaving extra columns in could have caused problems later when loading the data into visualization software, so I kept decluttering.

After cleaning, the dataset had four columns: municipality, county, population, and income per capita.
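The same kind of column cleanup can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names and the two sample rows below are hypothetical, not the real headers or figures of the DOR export, and the helper `keep_columns` is made up for this example.

```python
import csv
import io

# Hypothetical sample data: headers and values are illustrative assumptions,
# not the actual contents of the DOR data set.
RAW_CSV = """DOR Code,LEA Code,Municipality,County,Cherry Sheet FY,Population,DOR Income,Income Per Capita,EQV,EQV Per Capita
1,1,Town A,County X,2023,1000,50000000,50000,200000000,200000
2,2,Town B,County Y,2023,2000,80000000,40000,300000000,150000
"""

# The four columns kept after cleaning.
KEEP = ["Municipality", "County", "Population", "Income Per Capita"]


def keep_columns(rows, columns):
    """Return new rows that contain only the requested columns."""
    return [{name: row[name] for name in columns} for row in rows]


rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
clean_rows = keep_columns(rows, KEEP)

for row in clean_rows:
    print(row)
```

Listing the columns to keep, rather than the ones to drop, means any unexpected extra column in the export is left out automatically.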

Then I loaded the clean data into Tableau and checked that every field was accessible and free of null values.
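A null check like this can also be done before the data ever reaches Tableau. The sketch below is a hedged example, not the check Tableau performs: the sample rows, column names, and the helper `find_missing` are all hypothetical, and "missing" is assumed to mean an absent or blank value.

```python
def find_missing(rows, columns):
    """Return (row_index, column) pairs where a value is absent or blank."""
    missing = []
    for index, row in enumerate(rows):
        for name in columns:
            value = row.get(name)
            if value is None or value.strip() == "":
                missing.append((index, name))
    return missing


# Hypothetical rows: the second one has a blank county on purpose.
sample_rows = [
    {"Municipality": "Town A", "County": "County X",
     "Population": "1000", "Income Per Capita": "50000"},
    {"Municipality": "Town B", "County": "",
     "Population": "2000", "Income Per Capita": "40000"},
]
columns = ["Municipality", "County", "Population", "Income Per Capita"]

problems = find_missing(sample_rows, columns)
print(problems)  # [(1, 'County')]
```

An empty list means every row has a value in every column, which is the state the data should be in before mapping it.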

Next, I created a calculated parameter from the geographical data and a calculated field combining the geographical data with the income, hoping to get a map that would show the sum of income across the municipalities:




This was the result of the very first draft:



However, the map was different from what I expected. I needed additional data to build a more practical and dynamic map, rather than dots conveying information that could be communicated to the audience more easily in another form.


Even though I am not entirely content with the final product, it seems cohesive enough to explain the essential context of income across Massachusetts municipalities. I still want to keep practicing with this data set and add other data tables, such as zip codes by municipality, to get a functioning data layer on a gradient scale instead of the dots.

I will update you when I solve my discontent with this data set. Thank you for reading!

 

Previously posted on Medium by Sara Valentina Alvarez Echavarria.


