Voting in Arizona and Navajo Nation (for the Mapbox Elections Challenge)

Lily Lou
6 min read · Nov 25, 2020

This project began in response to an article I read about a lawsuit from Navajo Nation over mail-in ballot deadlines. Having spent a semester studying American Indian activism (at UNC-Chapel Hill) through oral history research and part of a summer researching rural and tribal public education systems in New Mexico, I wanted to use data to better understand how people voted in the different regions of Arizona, particularly Navajo Nation and other tribal territories. It is important to note that I focused on tribal territories, which are often rural and sometimes lack access to postal service, rather than on the Native American vote in Arizona as a whole (67 percent of all Native Americans live in urban areas). To see voting trends, I looked at Apache County, where Native Americans make up 73% of the population, and Navajo County, where Native Americans make up 43% of the population, and compared them to other regions, such as Maricopa County, which contains Phoenix, and Pima County, which contains Tucson.

I used US Census datasets for information on the voting population in each Arizona county and for the shapefiles of counties and tribal territories. I downloaded the shapefiles, edited them in QGIS to include only regions in Arizona, and later re-uploaded them to Mapbox as a tileset. I also found a Google map of mail-in ballot drop box locations through Arizona’s official state voting site.

To add polling place locations, I downloaded The Center for Public Integrity’s polling places data set from 2018. It listed Arizona polling locations and their addresses (sometimes these were blank, especially in more rural areas). I mapped the addresses to longitude and latitude coordinates using the GeoCode by Awesome Table extension in Google Sheets, then downloaded the Google Sheet as a CSV file and plotted it in QGIS.

Since the 2020 election led to a higher percentage of votes via mail-in ballots across the US and the Navajo Nation lawsuit was over mail-in ballot deadlines, I wanted to analyze how often mail-in ballots were rejected in each county (this was inspired by this article on race and vote by mail rejections in North Carolina). I tried contacting local elections offices but was unable to obtain voter data broken down by race and ethnicity for Arizona. Instead, I used the U.S. Election Assistance Commission’s Election Administration and Voting Survey reports from 2018. I used 2018 data because it was the most recent general election for which the survey was available; however, turnout is often higher for presidential elections (and turnout was even higher in Arizona for 2020 compared to 2016). The voting survey dataset was incredibly detailed and had information on voting in each county, from the number of mail-in ballots turned in to counts of different registration methods to the number of poll workers.

I joined this data set (filtered to Arizona counties only) with a census dataset on total population and citizen voting age population per county, and wrote a SQL query to calculate the vote by mail percentage, the percentage of mail-in ballots rejected because of a missed deadline, the total number of registered voters, the percentage of votes cast in person and by mail, and voter turnout. I also validated the data (for example, I wanted to make sure the total number of rejected mail-in ballots was the same as the sum of the various categories for why mail-in ballots were rejected).

SQL query in DB Browser for SQLite

These totals didn’t always match. For example, Navajo County showed 21 total mail-in ballot rejections but 128 mail-in ballots rejected because of a deadline. Pinal County showed 0 rejections but 128 ballots rejected because of a deadline. I later corrected for these errors by manually recalculating the ballot rejection rates for counties where the total ballot rejections did not match the sum of the breakdowns.

Excel table of ballot rejections (C4a) and the breakdowns for reasons they were rejected (C4b-C4r)
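The validation step described above could be sketched roughly as follows. This is a minimal illustration, not the query I actually ran: the table name `eavs`, the sample rows, and the use of only two breakdown columns are all assumptions (the real survey has breakdown columns running from C4b through C4r).

```python
import sqlite3

# Hypothetical table and made-up rows. Column names echo the survey's
# C4a (total rejected) and C4b/C4c (reasons) labels, but the schema and
# all values here are assumptions for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE eavs (county TEXT, c4a INTEGER, c4b INTEGER, c4c INTEGER)"
)
conn.executemany(
    "INSERT INTO eavs VALUES (?, ?, ?, ?)",
    [
        ("County A", 30, 20, 10),  # total matches the breakdown
        ("County B", 5, 40, 2),    # total is smaller than the breakdown
        ("County C", 0, 12, 0),    # total reported as zero
    ],
)

# Flag counties where the reported total differs from the sum of reasons.
mismatches = conn.execute(
    """
    SELECT county, c4a AS reported_total, c4b + c4c AS breakdown_sum
    FROM eavs
    WHERE c4a <> c4b + c4c
    ORDER BY county
    """
).fetchall()

for county, reported, summed in mismatches:
    print(f"{county}: reported {reported}, breakdown sums to {summed}")
```

Any county returned by a check like this would be a candidate for recalculating the rejection rate by hand.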

Another mismatch involved the Census’s estimates of the citizen voting age population. To calculate the percentage of the population that was registered to vote, I divided the number of registered voters by the citizen voting age population for each county (which is estimated by the Census). The resulting map showed that in most counties a majority of eligible voters were registered, but in Apache and Santa Cruz counties there were more registered voters than eligible voters. Because of this, I was unable to use this dataset for the final visualization.

Apache and Santa Cruz counties show more registered voters than the Census’s citizen voting age population estimate
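A simple way to catch this kind of problem before mapping is to flag any county whose registration rate exceeds 100%. The sketch below uses made-up county names and counts, and the function and variable names are hypothetical:

```python
# Hypothetical counts for illustration only; these are not real figures.
counties = {
    "County A": {"registered": 45_000, "cvap": 60_000},
    "County B": {"registered": 52_000, "cvap": 50_000},
}


def registration_rate(registered, cvap):
    """Registered voters as a share of the citizen voting age population."""
    return registered / cvap


# A rate above 1.0 means more registered voters than estimated eligible
# voters, which points to a problem with one of the two inputs.
over_100 = [
    name
    for name, c in counties.items()
    if registration_rate(c["registered"], c["cvap"]) > 1.0
]

print(over_100)
```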

To get the mail-in rejection rate, I divided the total number of rejected ballots by the sum of the rejected ballots and the number of votes cast via mail-in ballot. This count did not include absentee ballots sent under the Uniformed and Overseas Citizens Absentee Voting Act.

Output of SQL query
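Written out as code, the rejection-rate calculation described above looks roughly like this. The function name and the handling of counties with no returned mail-in ballots are assumptions added for the sketch:

```python
def mail_rejection_rate(rejected, counted_by_mail):
    """Share of returned mail-in ballots that were rejected.

    The denominator is rejected ballots plus mail-in votes that were cast,
    as described in the text. Returns None when no mail-in ballots were
    returned at all (an assumption for this sketch, to avoid dividing by 0).
    """
    total_returned = rejected + counted_by_mail
    if total_returned == 0:
        return None
    return rejected / total_returned


# Made-up example: 25 rejected ballots and 975 counted mail-in votes.
print(mail_rejection_rate(25, 975))
```

Because the number of rejected ballots per county is small, a handful of ballots can move this rate noticeably.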

Once the data was collected, I appended the values I wanted to visualize (mail-in rejection rates, mail-in rejection rates due to ballots being sent in late, voter turnout, and votes cast via mail-in ballot) to a GeoJSON file with the county boundaries and uploaded this to Mapbox.
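Appending values to a GeoJSON file amounts to adding keys to each feature's `properties` object. Here is a minimal sketch of that step, assuming the features carry a county name under a `NAME` property; that key, the statistic names, and all values are hypothetical:

```python
import json

# Hypothetical minimal GeoJSON. Geometry is left null to keep the example
# short; real county features would carry polygon coordinates.
county_geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"NAME": "County A"}, "geometry": None},
        {"type": "Feature", "properties": {"NAME": "County B"}, "geometry": None},
    ],
}

# Made-up statistics keyed by county name.
stats = {"County A": {"mail_rejection_rate": 0.012, "turnout": 0.58}}


def attach_stats(geojson, stats_by_county, name_key="NAME"):
    """Copy per-county statistics into matching feature properties.

    Returns the names of features that had no matching statistics so
    that gaps are visible instead of silently skipped.
    """
    unmatched = []
    for feature in geojson["features"]:
        name = feature["properties"].get(name_key)
        if name in stats_by_county:
            feature["properties"].update(stats_by_county[name])
        else:
            unmatched.append(name)
    return unmatched


unmatched = attach_stats(county_geojson, stats)
output = json.dumps(county_geojson)  # text ready to write to a .geojson file
print(unmatched)
```

Returning the unmatched names makes it easier to spot spelling differences between the county names in the two datasets.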

Once I finished compiling the data, I hardcoded the site in HTML and CSS (a fatal mistake that I will avoid next time I want to include more than one map on a site), using various Mapbox tutorials for reference.

Since the election margins for the 2020 presidential race were very narrow, I wanted to see how different regions, particularly Navajo Nation, affected the total vote, so I used Mapbox’s presidential vote tileset from 2016.

Reports showed that voter turnout for American Indians increased in 2020, and voting precincts within the Navajo Nation ranged from 60–90% in favor of Biden. However, many enrolled members of Navajo Nation do not live on the reservation and vote in other counties.

In future iterations of this project, I hope to gather more data specific to race and ethnicity and possibly to break down results by precinct to see which candidates won (specifically examining voting precincts on tribal territories). I also hope to compare rejection rates across multiple election years to reduce the margin of error, since the number of ballots rejected in each county is always relatively small. See the full project here (best viewed on a laptop monitor) and the GitHub code here.
