On Monday, April 26, 2021, the Census Bureau released the Census 2020 population by state data, also known as apportionment data. These counts are used to divide up the seats in the U.S. House of Representatives among the 50 states. We can use this first Census 2020 data release to calculate population growth by state for 2020.
My partner, Anthony, built the data viz below so you can see how your state(s) of interest grew. The idea behind this visualization is that you can tell at a glance that “this state is growing [faster than | about the same as | slower than] the US or other states as well as itself.”
Population Growth by State 2020 Data Visualization (interactive chart: 2020 resident population and percent population change for each state and the District of Columbia)
Sources for the above visualization
1990 All Geographies Except Puerto Rico – https://www.census.gov/data/tables/1990/dec/1990-apportionment-data.html
1990 Puerto Rico – https://www2.census.gov/programs-surveys/popest/tables/1990-2000/municipios/totals/pr-99-1.txt
You can get the same visualization above when you purchase Radius Reports for counties. Using the population projection data in the report, you can say “this county, where I’m interested in opening my new business, is growing [faster than | about the same as | slower than] the state.”
Where’s the rest of the Census 2020 data?
It’s coming. I still haven’t heard a release date for the Demographic and Housing Characteristics File (aka the good stuff that we all want — data for small geographies). I will be updating the 2020 Census Data Release Update blog post as I hear more.
Other Highlights from the Census Bureau’s 2020 Apportionment Data Release
Yesterday, the Census Bureau released apportionment data which includes the total U.S. Population. As of April 1, 2020, we were 331,449,281 people strong.
As a country, our population is increasing, but the growth rate slowed a bit over the past 10 years. As you can see below, the percent change dropped from 9.7% in 2000 – 2010 to 7.4% in 2010 – 2020. In fact, the Census Bureau staff mentioned that this is one of the slowest population growth periods we’ve had in our nation’s comparatively short history.
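For the data nerds: the percent change is simple arithmetic. Here is a minimal Python sketch of it. The 2020 count is the one quoted above; the 2010 resident population of 308,745,538 is not from this post (it is the 2010 Census count), and the function name is just made up for illustration.

```python
def percent_change(old: int, new: int) -> float:
    """Return the percent change going from old to new."""
    return (new - old) / old * 100


# Assumption: 2010 Census resident population (not quoted in this post).
POP_2010 = 308_745_538
# 2020 Census resident population from the apportionment release.
POP_2020 = 331_449_281

us_growth = percent_change(POP_2010, POP_2020)
print(f"US growth 2010-2020: {us_growth:.1f}%")  # prints 7.4%
```

You can run the same function on any state's 2010 and 2020 counts and compare the result to the US number to see whether the state grew faster or slower than the country.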
The South and West regions are growing faster than the other regions. Must be all of that sunshine and warmth!
California and Wyoming are not that different if you are looking at land area but look at the huge difference in the population.
Most states grew in population during the 2010 – 2020 time period with Utah being the fastest growing state. Only 3 states had a decrease in population with West Virginia declining the most.
Not any time soon. It’s complicated. Here’s a table with important dates.
The boundaries (think outlines)
April 30, 2021
Population by state
Aug 16 & Sep 30 2021
Limited demographics by state. Aug = legacy format; Sept = “easy to use” format.
Demographic & Housing Characteristics File AND Detailed File
**The GOOD stuff** Staggered release by state for all demographics for small geographies.
December 9, 2022
American Community Survey 5 year estimates
Not 2020 Census but has important data that’s not in the 2020 Census like income
2020 Census Data Release Table
The Long Answer (that only my mom will read)
February 2021 – Spatial data
The US Census Bureau has already released the 2020 geographies, and we’re still waiting on the demographics. It’s like getting the tender, flaky cannoli shells first and then having to wait for the creamy ricotta filling.
April 30, 2021 – Apportionment count
We’ve received state level resident population (+ the overseas federal employees) in this data release which is used for determining seats in Congress. Here’s a new blog post all about the apportionment data called Population Growth by State 2020.
August 16 & September 30, 2021 – Redistricting data
This data release will be the first file that includes demographic and housing characteristics. The good news is that the redistricting dataset will be available for small geographies like Census blocks. The bad news is that this dataset will only include:
Race & ethnicity
Housing units & occupancy status
Group quarters << don’t worry if you don’t know what this is
???? – Demographic Profile
This new dataset will include demographic & housing data for cities only (technically: places/minor civil divisions – but will it be all cities or just big cities?) and is supposed to be released “as soon after the release of the Redistricting product as possible.”
This is the dataset that you and I and everyone who isn’t doing redistricting really wants — the luscious filling for our cannoli – with all of the available 2020 Census demographics for large & small geographies.
Rumor has it that this dataset will be released on a state by state basis and won’t be fully available until December 2021. Data nerd aside: The DHC will include many of the demographic and housing tables previously included in the Summary Files. DHC subjects include:
Race & ethnicity
Household & family type & relationship
Occupancy status (occupied vs vacant)
Tenure (owner vs renter)
But don’t forget that we’ll still have to use the American Community Survey for important data like income.
December 2021 – American Community Survey 2020 5 Year Estimates
The American Community Survey (ACS) is the ongoing, annual survey of 3.5 million addresses that collects the social, economic, housing, and demographic characteristics of the nation’s population. The US Census Bureau will use ACS surveys collected in 2016, 2017, 2018, 2019 and 2020 to produce demographic estimates for small geographies like zip codes/ZCTAs for 2020. Historically, the ACS is released in December.
Updated March 29, 2021. The Census Bureau has communicated that the 2020 American Community Survey (ACS) will use the 2010 ZCTA boundaries rather than the 2020 ZCTA boundaries. As of right now, whenever the 2020 Census demographics are released for zips/ZCTAs, there won’t be 2020 income data for those same zips/ZCTAs. The Census Bureau is planning on using the updated 2020 ZCTA boundaries in the 2021 ACS release.
But I’m curious to learn if the 2020 ACS (to be released in 2021) or the 2021 ACS (to be released in 2022) will use the same geographies as the 2020 Census. I asked the Census Bureau this question and their reply is below:
“The 2020 ACS Data Release schedule will be posted within the next week or two. We are planning on updating the Geography Boundaries by Year page at the same time, which will tell you which boundaries will be used for each level of geography in the 2020 ACS data products. This geography boundaries by year page is usually posted at the same time as the data release in September, but we are posting it early this year because we have gotten a lot of questions about which boundaries will be used due to the 2020 Census.”
Below are some lovely graphics explaining how the ACS and the Decennial Census fit together and are different.
Now if you’ll excuse me, I’m hungry for cannoli – I can’t imagine why. Send me an email if you have any more burning questions about the 2020 Census, and I’ll reply after my cannoli run.
Summary: You can use Google Ads data to estimate demand for your business — which is cool because it’s harder to get data about demand than it is to get data about who lives where.
One of the most popular demographic data pulls that we do each day is a radius report which provides demographics for a radius (or a ring) around a location. The US Census Bureau doesn’t provide radius reports, so our clients who need them – small business owners, real estate folks and health care companies – purchase them from us as part of their marketing research for opening new locations or exploring real estate development projects.
Starting in 2014, I made $200-ish a month in profit selling radius reports via Google Ads. Although $200 is nothing to brag about, it was a nice source of new clients, who often bought future reports, for a minimal marketing effort on my part.
Unfortunately, all easy things in business come to an end. (Or is this just for my business?)
Since February 2019, I’ve consistently lost money each month on my Google Ad campaigns except for 1 month when I turned them off in disgust. I’ve run experiments to improve my ads, testing different settings & geographic restrictions and even hired an expert — who lost way more money than I did (ha!). So it’s time to try something new.
Google Ads has DEMAND DATA
The good news is that Google Ads has something more valuable to me than $200 a month. It has data on the exact search terms used by visitors who then went on to buy a radius report from me (conversions). I call this data demand data.
So based on my analysis of this demand data, my partner, Anthony, built the website Demographics By Radius for 5 mile radiuses around all US city centers. We provide these demographics for free hoping to attract customers that need demographics for a location other than the city center.
Well that’s great for you, Kristen, but how do I use Google Ad data when I don’t have 6 years of demand data ready to analyze or a developer partner to build me a website?
I’m so glad you asked. Let’s pretend that you want to open a new Pilates studio in Austin, Texas. First, you build a custom map to identify areas with lots of wealthy women (just keeping things simple). By filtering on high median incomes ($84,000+) and population density (522+) by zip code, you identify 3 initial areas of interest.
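If you would rather do that filter in code than on a map, here is a rough Python sketch. Everything in the list is made up for illustration (the zip labels and the values are hypothetical), and I am assuming cutoffs of $84,000 for median income and 522 for population density, in whatever density unit your data uses.

```python
# Hypothetical zip code records; labels and values are invented for illustration.
zips = [
    {"zip": "ZIP-A", "median_income": 112_000, "density": 1_450},
    {"zip": "ZIP-B", "median_income": 91_000, "density": 310},
    {"zip": "ZIP-C", "median_income": 67_000, "density": 2_900},
    {"zip": "ZIP-D", "median_income": 84_000, "density": 522},
]

MIN_INCOME = 84_000  # assumption: median household income cutoff in dollars
MIN_DENSITY = 522    # assumption: population density cutoff (unit depends on your data)


def areas_of_interest(records, min_income, min_density):
    """Keep zips that meet both the income and the density cutoff."""
    return [
        r["zip"]
        for r in records
        if r["median_income"] >= min_income and r["density"] >= min_density
    ]


print(areas_of_interest(zips, MIN_INCOME, MIN_DENSITY))
```

Both cutoffs are inclusive here, so a zip sitting exactly on the threshold stays in. Flip `>=` to `>` if you want it out.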
You overlay other Pilates businesses (blue dots) over your wealthy women areas. And you decide to exclude the wealthiest area of West Lake, because there are already so many competitors.
Next up, you need to decide between Circle C and Cedar Park. And before you dive into cost data like price per square foot for commercial real estate, it’s a good idea to explore the demand for Pilates in Circle C versus Cedar Park.
Here’s where Google Ads comes in. Open Google Ads and open Tools / Keyword Planner / Get search volumes and forecasts.
Then you’d enter your search terms like Pilates Cedar Park (90 monthly searches) versus Pilates Circle C (no data because there are so few searches). Click on Historical Metrics to see the following data:
Be curious here. You might compare pilates Austin (720 searches) with yoga Austin (1,600 searches). So there’s more than double the interest in yoga compared with pilates in Austin. You might also check other Texas cities like pilates corpus christi (70 searches) or pilates san marcos (40 searches) that might not have as many competitors.
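Once you have jotted down a handful of search volumes, a few lines of Python make the comparisons easy to repeat. This is only a sketch: the numbers are the monthly searches quoted above, typed in by hand, and the dictionary and function names are my own invention rather than anything that comes out of Keyword Planner.

```python
# Monthly search volumes quoted in this post, typed in by hand.
monthly_searches = {
    "pilates austin": 720,
    "yoga austin": 1600,
    "pilates cedar park": 90,
    "pilates corpus christi": 70,
    "pilates san marcos": 40,
}


def interest_ratio(term_a: str, term_b: str, volumes: dict) -> float:
    """How many times more searches term_a gets than term_b."""
    return volumes[term_a] / volumes[term_b]


ratio = interest_ratio("yoga austin", "pilates austin", monthly_searches)
print(f"yoga vs pilates in Austin: {ratio:.1f}x")  # prints 2.2x

# Rank every term from most to least searched.
ranked = sorted(monthly_searches.items(), key=lambda kv: kv[1], reverse=True)
for term, searches in ranked:
    print(f"{searches:>5}  {term}")
```

Terms with no data (like Pilates Circle C) are left out of the dictionary on purpose, since dividing by a missing or zero volume would not tell you anything.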
Probably because my personal business is so driven by Google searches, I would lean on this data to help me pick a name for my studio. For example, you would want to use My Brand Name – Pilates South Austin Studio versus My Brand Name – Pilates Circle C Studio, because more people search for “pilates south austin.”
That said, I wouldn’t hang up a shingle only using Google Ad data for demand estimation. Ideally, you’d use this data to give yourself permission to run a small Google Ad campaign or similar in both Circle C and Cedar Park. Maybe you’d set up a Coming soon to Circle C | Cedar Park landing page and ask for email opt-in. You could also give away free online classes in exchange for a telephone call survey. Better yet, you’d go talk to active people in parks in Circle C and Cedar Park about their interest in Pilates. Maybe start a class in the park and see how many people stop by to chat. These are just a few low cost examples of additional experiments that you can run to estimate demand for your new location.
Still with me? Wow! You’re my type of data-driven business owner for actually finishing a long-ish data blog post. Get more tips by signing up in the Monthly Email Newsletter section. I normally write about what’s available in different datasets rather than deeper dives into how to use 1 dataset. Was this deeper dive too much in the weeds, and would you prefer the normal “what’s out there” data overviews? Or did you find this how-to to be a helpful walk-through about pulling specific data to solve a specific problem? Let me know what you think.
The outbreak of Covid-19 has shifted more of the world online. Kids are going to school from home and need internet access for their classes. Staff are working remotely and meeting online. As a result, interest in data about broadband availability and home computers has increased as government agencies, businesses and non-profits identify areas that lack fast internet.
There are 2 government datasets that I use to help our clients identify areas lacking broadband internet:
US Census American Community Survey
FCC’s Fixed Broadband Deployment Data
Many state and local governments use these datasets to evaluate broadband access in communities and institute policies and programs to increase access for areas with less connectivity. Businesses and non-profits can also use these statistics to analyze internet access in the communities that they serve. Below is a quick introduction to both of the datasets as well as how to map them.
About the US Census Computer, Internet & Broadband data
Dataset: US Census Bureau’s American Community Survey
Most Current Year: 2018 (with 2019 data to be released in December 2020)
First Year Data Are Available:
2017 5 Year – For Small Geographies like zips/ZCTAs, Census tracts & block groups
2013 1 Year – For Large Geographies like states, MSAs, counties & cities
You may be aware of the US Census that counts US residents every 10 years. The same agency collects internet and broadband statistics annually. The most current data can be found in the U.S. Census Bureau’s release of the 2018 American Community Survey (ACS). The 2019 dataset will be available in December 2020.
The computer and internet use questions were mandated by the 2008 Broadband Data Improvement Act and added to the ACS in 2013. The questions are not asked of the group quarters population, so the data do not cover people living in housing such as dorms, prisons, nursing homes, etc.
If you need data about the ownership and usage of all types of computers, including desktops, laptops, smartphones, tablets, etc., the 2018 ACS is the right dataset. The data also includes whether any member of the household has access to the internet. “Access” refers to someone in the household using or connecting to the internet, regardless of the service fee they pay.
The ACS provides information about the type of internet service used by the U.S. population:
Cellular Data – a plan for a smartphone or other mobile device;
Broadband (high speed) Internet service such as fiber optic or DSL;
Dial-up or some other service.
Let’s take a glimpse into the most popular computer and internet access tables that I pull from the 2018 American Community Survey.
The above tables can be even more valuable when you add additional dimensions like computer or internet subscription by:
Labor Force Status, and
For example, you can get income by internet access or computing device by race/ethnicity.
About the FCC Fixed Broadband Deployment Data
Dataset: FCC’s Fixed Broadband Deployment Data
Most Current Year: 2018 (with 2019 data to be released in December 2020)
Update: Data appears to be updated twice a year
Geographies: Census blocks
Census blocks are typically bounded by streets, roads or creeks. In cities, a census block may correspond to a city block, but in rural areas where there are fewer roads, blocks may be limited by other features. The population of a census block varies greatly. As of the 2010 census, there were 4,871,270 blocks with a reported population of zero, while a block that is entirely occupied by an apartment complex might have several hundred inhabitants.
The Federal Communications Commission (FCC) regulates interstate and international communications in all 50 states, the District of Columbia, and U.S. territories. The commission, overseen by Congress, stands as the United States’ primary authority for communications law, regulation, and technological innovation. The FCC uses the Fixed Broadband Deployment statistics to measure the nationwide development of broadband access, as well as the successful deployment of the next generation of broadband technology.
Here’s a sample of the raw FCC data.
The projects that I’ve done in the past with this data involved identifying areas without high speed internet or broadband. To do this, we aggregate the data by Census block to list all of the providers and identify the largest Max Advertised Download Speed and Max Advertised Upstream Speed. So that would look like the example below where I’ve concatenated all businesses offering internet service for a single Census block and identified the max speeds for all providers.
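For readers who want to try this roll-up themselves, here is a small Python sketch of the idea. It is not the exact process we use: the rows, block labels, provider names and field names are all hypothetical, and the 25 Mbps down / 3 Mbps up cutoff for "broadband" is an assumption based on the FCC's benchmark at the time, so adjust it to fit your project.

```python
from collections import defaultdict

# Hypothetical rows shaped like the raw data: one row per provider per Census block.
rows = [
    {"block": "BLOCK-A", "provider": "Provider One", "max_down": 100.0, "max_up": 10.0},
    {"block": "BLOCK-A", "provider": "Provider Two", "max_down": 25.0, "max_up": 3.0},
    {"block": "BLOCK-B", "provider": "Provider Three", "max_down": 10.0, "max_up": 1.0},
]


def summarize_by_block(rows):
    """Concatenate providers and keep the largest advertised speeds per block."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["block"]].append(row)

    summary = {}
    for block, block_rows in grouped.items():
        providers = sorted({r["provider"] for r in block_rows})
        summary[block] = {
            "providers": "; ".join(providers),
            "max_down": max(r["max_down"] for r in block_rows),
            "max_up": max(r["max_up"] for r in block_rows),
        }
    return summary


def lacks_broadband(block_summary, min_down=25.0, min_up=3.0):
    """Assumption: a block lacks broadband if its best speeds fall below 25/3 Mbps."""
    return block_summary["max_down"] < min_down or block_summary["max_up"] < min_up


summary = summarize_by_block(rows)
underserved = [block for block, s in summary.items() if lacks_broadband(s)]
print(summary["BLOCK-A"])
print("Blocks lacking broadband:", underserved)
```

To roll up to a larger geography, group on that geography's ID instead of the block ID.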
Aside: a client mentioned that the FCC data can be misleading because the FCC allows a provider to report the highest download and upstream speeds if just 1 household in the block has internet access at those speeds. I haven’t verified if this statement is true or false.
You can map data from the U.S. Census Bureau and the FCC datasets to identify areas with slow internet connection. Here’s what a quick broadband map looks like:
Here are some helpful hints for using the map.
Turn layers on and off by checking the check-boxes in the upper left.
Turn on choropleth display (or colors) by clicking on the teardrop.
Click on the bar graph to make handles appear to filter.
And here’s a quick video showing you how to use the checkboxes, teardrop and bar graph in the map to identify areas that lack broadband access.
Whew! Still reading? I’m impressed. So if you’ve gotten this far and you’re thinking, “Yeah, sure I could pull all of this data and build this map myself in a couple of days, but I have other important things to do and I really don’t want to sift through all of the data dictionaries, methodology statements and tool instructions to make sure that I have the most current data,” you are not alone. You sound just like our other clients at Cubit who depend on us to provide clean, accurate and easy-to-work-with data as well as human-to-human customer support. Prices start at $399 with a 3 business day turnaround. Tell me what data you need for what geography & I’ll get you a free quote & turnaround estimate.
“Happy families are all alike; every unhappy family is unhappy in its own way.” ― Leo Tolstoy, Anna Karenina
Not only are crime data about an unhappy topic, I also feel like each source of crime data that we pull for clients has a major drawback or “is unhappy in its own way.” While crime data is a bit outside of my area of expertise (which is demographics & business data), I do occasionally pull crime statistics for clients. Below is an overview of who uses crime data and why as well as 10 resources along with each source’s unhappy limitation.
Most Popular Uses of Crime Data
Businesses building models. Our business clients who are interested in crime data tend to be building models, and crime data is just one of many different types of data that they want to use in their models. For example, a prospect was interested in “associating medical trends with risk data to potentially identify problem areas sooner and determine where what type of resources are needed most based on risk and their associated medical needs. For instance, we may want to create a burn victims unit in areas with higher fire risk vs those with low fire risk, or staff more mental health professionals in higher crime.”
Non-Profits. We also pull crime statistics for non-profit clients who need this information for their community needs assessments or when writing grants.
Real Estate Developers. Crime data are important to developers who need to show that they are building in a low crime neighborhood so they can get their affordable housing developments approved by government agencies. Crime statistics can be central to approval and are often subject to dispute if the data show crime is too high.
Small Geographies Available?
Law Enforcement Agencies
No. Entire US
No. States & US
Applied Geographic Solutions
Block groups, Census tracts, Zip Codes & more
Map of Incidents
Map of Incidents
LexisNexis Community Crime Map
Map of Incidents
Custom Data Pull
We pull the best geography for your project.
Table of 10 Crime Data Resources
FBI Uniform Crime Reporting Program/National Incident-Based Reporting System
If you need crime counts by type of crime for cities or counties and you aren’t worried about geographic differences, then the FBI crime datasets are the right fit. Data are provided by law enforcement agencies rather than standard geographies like a zip code. We most often pull FBI crime data for the geographies of County Agencies and City Agencies. There are also data for Metropolitan Statistical Areas or MSAs, but these geographies tend to be too large for the types of projects that we pull data for. There are also data provided for Universities and Colleges as well as State, Tribal and Other Agencies, but honestly, most of our clients just ignore these agencies and just get city and county data.
There are 2 drawbacks to this dataset:
Crimes are reported by agency rather than rolled up to a geography. To quote Neighborhood Scout “…crimes are reported by individual law enforcement agencies, rather than by city or town, and many cities – even small ones – have more than one agency responsible for law enforcement (municipal, university, county, transit, etc.). Even FBI data are reported by agency not by city or town, providing an incomplete assessment of city-wide crime counts. It is an agency-centric rather than locality-centric reporting method. If you use FBI data, you only get city-wide general counts, and only from one agency in the city, so it is generally incomplete for the city overall, as well as not specific to a neighborhood or address.”
There are no FIPS codes or other unique geography IDs that make it easy to join this dataset to other datasets like Census data. You have to do name joins, which are error prone and painful.
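To show what a name join looks like and why it hurts, here is a small Python sketch. All of the place names, counts and IDs below are made up for illustration, and the field names are hypothetical rather than the FBI's or the Census Bureau's. The approach is to clean up both names the same way, match on state plus cleaned name, and set aside whatever doesn't match for a human to review.

```python
import re


def normalize(name: str) -> str:
    """Lowercase, turn punctuation into spaces and collapse extra whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


# Hypothetical crime rows keyed by name only (no geography ID).
crime_rows = [
    {"state": "TX", "city": "Exampleville", "violent_crimes": 10},
    {"state": "TX", "city": "  NORTH  TESTON ", "violent_crimes": 4},
    {"state": "TX", "city": "St. Sample", "violent_crimes": 2},
]

# Hypothetical Census-style rows that do carry an ID (placeholder values).
census_rows = [
    {"state": "TX", "place": "exampleville", "geo_id": "ID-001"},
    {"state": "TX", "place": "North Teston", "geo_id": "ID-002"},
    {"state": "TX", "place": "Saint Sample", "geo_id": "ID-003"},
]


def name_join(crime_rows, census_rows):
    """Attach geo_id by (state, normalized name); return matched and unmatched rows."""
    lookup = {(r["state"], normalize(r["place"])): r["geo_id"] for r in census_rows}
    matched, unmatched = [], []
    for row in crime_rows:
        key = (row["state"], normalize(row["city"]))
        if key in lookup:
            matched.append({**row, "geo_id": lookup[key]})
        else:
            unmatched.append(row)
    return matched, unmatched


matched, unmatched = name_join(crime_rows, census_rows)
print(len(matched), "matched;", [r["city"] for r in unmatched], "need a manual look")
```

Case and spacing differences are easy to clean up, but "St." versus "Saint" still slips through, which is exactly the kind of thing that makes name joins painful.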
An Example of Data Collected by the FBI for the Uniform Crime Reporting Program
The UCR also provides additional details about the persons arrested, such as age, race and ethnicity, the weapons used and the value of items stolen.
The FBI offers a tool called the Crime Data Explorer in an effort to make crime data more user friendly. The Crime Data Explorer currently includes violent crime statistics (murder and nonnegligent manslaughter, rape, robbery, and aggravated assault) and property crime statistics (burglary, larceny-theft, motor vehicle theft and arson).
Crime Data Explorer
State crime rates can be compared to national crime rates.
This is an example of the Arrest Data found in the Crime Data Explorer tool.
The National Crime Victimization Survey (NCVS) is an annual data collection conducted by the U.S. Census Bureau. The purpose of the NCVS is to fill in the gaps between crimes reported to law enforcement and those that are not. The collection reports national statistics only and doesn’t provide data for smaller geographies. Data includes nonfatal personal crimes (rape or sexual assault, robbery, aggravated and simple assault and personal larceny) and household property crimes (burglary/trespassing, motor-vehicle theft, and other theft).
Here is an example of the difference in crimes reported to the police and those that aren’t. The table below provides the crime rates for different types of crimes.
Office of Juvenile Justice and Delinquency Prevention (OJJDP)
Juvenile offenses, as well as crimes where juveniles are victims, are included in the OJJDP’s crime report for each US state. The Statistical Briefing Book is a tool that provides access to online information about juvenile crime and victimization and about youth involved in the juvenile justice system.
Below is an example of the options provided to search for juvenile crime statistics, as well as an example of the data returned.
Applied Geographic Solutions (AGS)
Link: You can buy this data through us since we are a vendor.
Price: Base Price: $3,750 for all US zips; $1,250 for all US counties *
If you are doing model building and need crime data for small geographies like zip codes, the Applied Geographic Solutions dataset with crime indexes can be a helpful resource. Using advanced analysis of a rolling seven-year database of FBI and local agency statistics, AGS provides relative crime risk, not just crime occurrence. Crime risk is an index of the probability of crime in a geographical region compared to the entire US. Zip code is the most popular geography that’s requested, but county and city data as well as smaller geographies are available as well. FIPS codes are also provided with the data, which makes it easy to connect geographies between datasets – super important for doing modeling work when you need data from multiple sources.
The AGS crime data comes from a private data vendor rather than a public agency. Unlike the FBI data above, there will be restrictions on how you use the data. The prices above are rough quotes, and I’ll have to adjust the pricing based on number of users and how you are using the data (e.g. internal modeling versus publishing online). Contact me if you are interested in continuing the conversation here.
Below is an example of the type of information that AGS provides:
Price: $39.99 – $119.99/month for a report subscription
Neighborhood Scout is geared toward real estate investors who want to identify opportunities with low crime risk. Other free web sites collect data by local law enforcement agency, not by locality, which tends to leave holes in the data. Not all agencies elect to report data and some localities have more than one agency. Neighborhood Scout fills in the holes by using crime data from similar neighborhoods, leading to seamless 100% US coverage.
By entering an address, an investor can get a full report on the crime risk of a neighborhood. The report includes crime risk ratings for several crime types, a resident’s risk of becoming a victim and 5-year trends and forecasts.
Below is an example of the type of information that Neighborhood Scout provides:
City Protect, Spot Crime and LexisNexis Community Crime Map all provide interactive maps of crime data from law enforcement agencies. You can search for an address and get a map of where or what neighborhoods have the most crimes and what types of crimes were committed in the past. If you are getting a radius report, you might want to also check one of these map solutions for recent crimes in your area of interest. By doing a simple search on zip code, city, or state, you have the ability to see incident details, statistics and reports. Like the FBI UCR, the data available is dependent on reports provided by law enforcement agencies. Some areas may not have any crime information available due to lack of participation by the local agency.
Here’s what these resources look like.
Spot Crime: In addition to an interactive map, Spot Crime also has a Trend tab with a summary of crime trends.
LexisNexis Community Crime Map lets you filter by date and type of incident. It also has an option to buffer addresses.
CUSTOM DATA PULL
Price: Starting at $399 and depends on geographies and years needed.
Geographies: All US geographies from large geographies like states and counties to small geographies like zips, census tracts and blocks.
Whew! Still reading? I’m impressed. So if you’ve gotten this far and you’re thinking, “Yeah, sure I could pull all of this data myself in a couple of days, but I have other important things to do and I really don’t want to sift through all of the data dictionaries, methodology statements and tool instructions to make sure that I have the most current data,” you are not alone. You sound just like our other clients at Cubit who depend on us to provide clean, accurate and easy-to-work-with data as well as human-to-human customer support. Prices start at $399 with a 3 business day turnaround. Tell me what data you need for what geography & I’ll get you a free quote & turnaround estimate.