Crunch: Housing Cost Affordability for US Metro Areas

Are you aware of how housing costs impact your customers’ spending habits? Understanding the financial burdens of your customers can help you tailor your marketing and sales strategies. For instance, in states like Florida, Nevada, and California, a higher percentage of renters are housing-burdened, which impacts their discretionary spending.

The Definition

The Department of Housing and Urban Development considers those who spend more than 30% of their income on housing to be “housing-burdened.”
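That threshold is easy to express as code. Here’s a minimal sketch in Python (the function name and the use of monthly gross figures are my own illustration, not an official HUD formula):

```python
def is_housing_burdened(monthly_housing_cost: float, monthly_gross_income: float) -> bool:
    """HUD's rule of thumb: housing costs above 30% of gross income."""
    return monthly_housing_cost / monthly_gross_income > 0.30

# A renter paying $1,400 on $4,000 of monthly income spends 35% -> burdened.
print(is_housing_burdened(1400, 4000))  # True
print(is_housing_burdened(1000, 4000))  # False (25%)
```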

Renters vs. Homeowners: Who’s More Burdened by Housing Costs?

It turns out, over half of the renters in the US are shouldering a significant housing cost burden. In 2022, a staggering 52% of renters and 23% of homeowners spent more than 30% of their income on housing (source: US Census). While these figures have fluctuated over the past decade, they remain high.

Home Prices Are Rising

Nationally, the cost of homes has soared by 74% from 2010 to 2022, outpacing wage growth which saw a 54% increase (sources: Bureau of Labor Statistics, Federal Housing Finance Agency). This widening gap can significantly impact consumer spending power.  
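To make that gap concrete, here’s the back-of-the-envelope arithmetic using those two growth figures:

```python
home_price_growth = 0.74  # FHFA, 2010-2022
wage_growth = 0.54        # BLS, 2010-2022

# How much further the same paycheck has to stretch for the same home:
# the ratio of cumulative price growth to cumulative wage growth.
affordability_ratio = (1 + home_price_growth) / (1 + wage_growth)
print(f"{affordability_ratio:.2f}")  # 1.13: homes cost ~13% more relative to wages than in 2010
```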

Because of this trend in rising home prices, we recently added more high-value categories to our radius reports. The example below is a location in New York City.

Let’s go Low, Low, Low, Low, Low, Low, Low, Low

While I love USAFacts’ analysis of housing burden by state and tenure, state geographies are usually too large for most business owners to use in their decision-making. So let’s look at similar data for metro areas.

List of metro areas sorted by housing-burdened owners

You can download this Google Sheet to sort and filter for your markets by going here, then clicking File/Download.

Most housing-burdened markets for owners with mortgages

The 2022 American Community Survey data presented lists the percentage of housing units with a mortgage that cost the owner 30% or more of their income in various U.S. metropolitan areas. Two areas tie for the highest percentage: Kahului-Wailuku-Lahaina in Hawaii and Aguadilla-Isabela in Puerto Rico, both at 43.7%. The Yauco metro area in Puerto Rico follows at 41.6%, and the Los Angeles-Long Beach-Anaheim area in California is close behind at 41.5%.

The data suggests that homeowners in these regions are likely to spend a significant portion of their income on housing costs, which can be indicative of high housing prices, low incomes, or a combination of both. Notably, three of the top five areas with the highest owner costs are in Puerto Rico, highlighting a potential regional pattern of housing affordability challenges.

Most housing-burdened markets for renters

  1. High Rental Cost Burden: All the listed MSAs have more than half of their renters spending over 30% of their income on rent, indicating a high rental cost burden in these areas. This suggests that affordability is a significant issue for renters in these regions.
  2. Florida's Rental Market: Florida appears prominently on this list with five MSAs featured: Gainesville, Punta Gorda, The Villages, Naples-Marco Island, and Miami-Fort Lauderdale-Pompano Beach, the last of which has the highest percentage at 62.6%. This points towards a statewide trend of high rental costs relative to income in Florida.
  3. Diversity of Locations: The MSAs are geographically diverse, covering various parts of the country including the East Coast, West Coast, and the Mountain States. However, there is a noticeable cluster in the state of Florida, suggesting regional market dynamics that affect rental affordability.
  4. College Towns: Several of the MSAs listed, like Gainesville, FL (University of Florida), Boulder, CO (University of Colorado), and Ithaca, NY (Cornell University), are known as college towns. This could imply that the presence of a large student population may drive up rental prices due to demand, possibly impacting the affordability for non-student renters as well.
  5. Variation in Cost Burden: While the percentage of renters experiencing a high cost burden is significant in all these MSAs, there is a noticeable variation, with Gainesville at the lower end (58.7%) and Miami-Fort Lauderdale at the higher end (62.6%). This variation might reflect differences in local economies, housing supply constraints, and demographic pressures.

Need housing burden data for smaller geographies than MSAs?

We can help. You can request this data as your one free table add-on when you purchase a spreadsheet report. Or for $100, you can get a custom manual calculation added to your radius report.

Got more questions about housing burden data? Send me a message, and we’ll be talking about data in no time.


Bureau of Labor Statistics

Federal Housing Finance Agency

US Census Bureau

Unwrap the New 2022 Census ACS Data

The US Census Bureau released the updated 2022 American Community Survey demographics for all geographies earlier this month. And we’ve been scurrying around like Santa’s elves to bring you the latest data and geographies as well as new features.

New 2022 Demographics & Geographies

The most remarkable change was a complete reconfiguration of Connecticut counties. As you can see in the maps below, the 2022 Connecticut counties don’t play nice with the 2021 counties (the new boundaries don’t line up with the old ones), which is going to make historical comparisons tricky.

Map of 2022 Connecticut Counties


2021 Connecticut Counties


Radius Report Updates

You now get 4 new types of data included in your Radius Reports for no additional fee.

  • 1. Median Home Value Estimates and 2. High Home Value Categories

Back in 2010 – which was around when we first started offering radius reports – about 10% of US homes had value estimates of over $500,000. According to the latest 2022 data, over 26% of US homes now have value estimates of over $500,000.

Now your radius reports include more detailed categories describing these high-value homes. The new fields are highlighted below in an example report for New York City.

  • 3. Population Density in people per square mile
  • 4. The count and percentage of Families in Poverty

Income By Zip Code Lists and Demographics By Lists Updates

Income By Zip Code lists and Demographics By Zips/Cities/Counties have been polished up with the following improvements.

  • Improved human-readable headers to help you scan the data and understand it
  • Improved database-friendly headers so you can upload the file to ChatGPT, and it natively understands what’s in each column
  • Moved GEOIDs to the end to get them out of your way

Income By Zip Code Maps

New Feature! You can now export data for selected zips from the Income By Zip Code map interface. Here’s how.

Got questions about 2022 Census data or the new features above? Have ideas for additional features that save you time? Send me a message, and we’ll be geeking out about data in no time.

How to find Current Wage Data by Job Title for the US, States and Metro Areas

Occasionally, we get a custom data request for wage data by job title and city to help HR professionals figure out appropriate salaries for their teams. Below are 2 different current government datasets with wage data by job title.

Census Bureau Data: A Peek into the National Job Landscape

First up, the Census Bureau offers insights into detailed occupation data through their American Community Survey with tables for Detailed Occupation (B24114) and the corresponding Median Earnings (B24121). Unfortunately, the most detailed occupation tables they offer are only available at the national level, but are still a handy first step.

These tables provide a window into the job market in the United States, offering crucial insights into the population of workers and the earnings they bring home. Let’s use Project Management Specialists as an example:

For the 5-year estimate in 2021, the number of Project Management Specialists was 737,973, with median earnings of $93,970.
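If you’d rather pull these tables programmatically than click through data.census.gov, the Census API exposes ACS 5-year variables by table ID. This sketch only builds the request URL; note that `B24121_001E` as the table’s overall “total” column is my assumption, so confirm it against the API’s variable list before relying on it:

```python
from urllib.parse import urlencode

def acs5_url(year: int, variables: list[str], geography: str = "us:*") -> str:
    """Build a Census API query URL for ACS 5-year estimates."""
    base = f"https://api.census.gov/data/{year}/acs/acs5"
    params = urlencode({"get": ",".join(["NAME"] + variables), "for": geography}, safe=",:*")
    return f"{base}?{params}"

# B24121 is the median earnings by detailed occupation table; _001E as its
# overall total column is an assumption - check the variable list.
url = acs5_url(2021, ["B24121_001E"])
print(url)
```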

Not sure what the American Community Survey is? No problem! You can check out this handy FAQ on our website: What is the American Community Survey?

BLS Data: Zooming In on Salaries

The Bureau of Labor Statistics (BLS) takes it a step further by offering detailed data on salaries, not just at the national level but also by state and even metropolitan areas. The metropolitan area data are as close as you can get to city wage data using government datasets. At the moment the most current data BLS has is for 2022, and here’s how to access it:

With the BLS data, we now know that for Project Management Specialists in 2022 there are:

Career Level Wages

Along with the salary data from the Bureau of Labor Statistics, you’ll also have the option to download additional hourly and annual 10th, 25th, 75th, and 90th percentile wages.

These can help you better understand entry-level wages vs senior-level wages for the same jobs. Awareness of the wage ranges at different career levels is crucial to remain competitive in the job market.

With this, we can now see that the wage for a junior-level project manager in Austin-Round Rock is about $67K annually, compared to around $151K at the senior level.
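If you’re ever working from raw wage observations rather than BLS’s pre-computed tables, Python’s standard statistics module can reproduce the same percentile cut points. The wage list below is purely hypothetical:

```python
import statistics

# Hypothetical annual wages for one job title in one metro area.
wages = [62_000, 67_000, 71_000, 78_000, 85_000, 93_000,
         104_000, 118_000, 135_000, 151_000]

# statistics.quantiles with n=10 returns nine cut points: the 10th-90th percentiles.
deciles = statistics.quantiles(wages, n=10)
p10, p90 = deciles[0], deciles[-1]

print(f"entry-level (10th percentile): ${p10:,.0f}")
print(f"senior-level (90th percentile): ${p90:,.0f}")
```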

Don’t have time to pull this data yourself? Or are you also interested in other datasets like demographics of the area workforce? We’re here to help! Let us know what data you need in a Custom Data Request, or call us at 1-800-939-2130.

Estimating White-Collar Workers Using Census Data

Photo by Israel Andrade on Unsplash

Are you curious about the number of white-collar workers in your area? Well, I recently embarked on a journey to find white-collar worker categories from the Census Bureau, and let me tell you, it was quite the adventure! In this blog post, I’ll take you through my process of estimating white-collar workers using the American Community Survey and the key variables.

Not sure what the American Community Survey is? No problem! You can check out this handy FAQ on our website: What is the American Community Survey?

Does the ACS Estimate White Collar Workers?

Not exactly. My search began on the official Census Bureau website. The Census Bureau’s American Community Survey collects data on the industry and occupation of workers in the labor force, but it does not include a specific table or variable to identify white-collar workers. It seemed like my quest for white-collar worker categories had hit a roadblock right out of the gate.

Identifying Key Variables

While I couldn’t find exactly what I needed on the Census website, I did explore the alternative avenue of the American Community Survey’s Users Group, the perfect place to connect with fellow data enthusiasts who might have the answers I was looking for.
Here I found this promising reply listing table C24010 and the variables that could be used to estimate a “working class,” which helped me identify the variables needed for a “white collar” estimate.

After I downloaded the full 2005 documentation for table C24010 to review the actual variable descriptions, it turned out that many of the variables did not align with what the reply described. So this search for white-collar categories wasn’t over yet.

The Answer

Moving on, I instead looked through the most recent 2021 documentation. Now (using the most generous interpretation of what a white-collar job is),  I decided to use these variables to estimate white-collar workers:

  • Management, business, science, and arts occupations
  • Sales and office occupations

If you wanted to estimate blue-collar workers, you could then use the variables for:

  • Service occupations
  • Natural resources, construction, and maintenance occupations
  • Production, transportation, and material moving occupations

Using these categories, you can now estimate “white-collar” workers for your geography of choice. (Remember to sum both male and female variables in the ACS table to get the total.)
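Here’s what that estimate looks like as code, a sketch using made-up counts (real values come from the male and female variables in ACS table C24010 for your geography):

```python
# Hypothetical counts from ACS table C24010; each occupation group has a
# male and a female variable that must be summed.
counts = {
    "management_business_science_arts": {"male": 60_000, "female": 65_000},
    "sales_and_office": {"male": 20_000, "female": 35_000},
    "service": {"male": 15_000, "female": 18_000},
    "natural_resources_construction_maintenance": {"male": 12_000, "female": 1_000},
    "production_transportation_material_moving": {"male": 14_000, "female": 4_000},
}

WHITE_COLLAR = {"management_business_science_arts", "sales_and_office"}

white_collar = sum(sum(group.values()) for name, group in counts.items() if name in WHITE_COLLAR)
employed = sum(sum(group.values()) for group in counts.values())

print(white_collar, f"{white_collar / employed:.0%}")  # 180000 74%
```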

As an example, let’s look at Williamson County, TX. Williamson County has about 222,454 white-collar workers for 2021, making up about 72% of the employed population. Below you can check out the highlighted variables used to get this total:

Where do these occupation categories come from?

For the occupation data, the Census Bureau uses the Standard Occupational Classification (SOC).

“The SOC is the federal government’s own regularly-updated system for classifying occupations, which are grouped according to the nature of the work performed. This system provides a mechanism for cross-referencing and aggregating occupation-related data collected by social and economic statistical reporting programs.”

Want to learn more about Census demographics, occupation data or anything else data-related?
We’re here to help. You can fill out the Custom Data Request form, or call us at 1-800-939-2130.

Using Code Interpreter to Analyze US Census Data

Photo by Headway on Unsplash.

Using Code Interpreter to Analyze US Census Data: The Good, the Impressive & the Ugly

Let’s kick the tires of ChatGPT’s Code Interpreter using the Census Bureau’s latest American Community Survey data. I’ll share my favorite prompt, what impressed me most, and what Code Interpreter got flat wrong.


  • The Good: Code Interpreter can open data files and make pretty darn good guesses about what’s inside.
  • The Impressive: It can also produce simple weighted scoring models and adjust the weights.
  • The Ugly: But sometimes, it produces obviously wrong calculations.

My favorite prompt:

What’s Code Interpreter?

Code Interpreter is a (terribly named) beta feature of ChatGPT that lets you load data files and analyze the data.

If you want to follow along with me, you need a $20-a-month ChatGPT account. Then you need to turn on Code Interpreter under your Account and then in Settings and Beta.

Once Code Interpreter is on, you can upload data files using the + button.

The Good – Code Interpreter makes good guesses of what’s in a file.

I accidentally uploaded the entire zip file for our DemographicsByCitiesForTexas, which has both a data file and a notes-and-citations file. Code Interpreter effortlessly unzipped the file and identified the data file versus the citations & notes file. It also cut off the human-readable headers and started working with the machine-readable headers – without me having to tell it to.

Furthermore, Code Interpreter successfully described what key columns were included in the file.  

That said, it’s not all sparkles and unicorns. In the above example, Code Interpreter says that hhi_total is the total number of households. And this is correct. But when I was working with a different dataset, Code Interpreter said that hhi_total was the total household income – which is incorrect.

Lessons Learned

  1. You can load data files that you aren’t familiar with into Code Interpreter and see if it can make heads or tails of them.
  2. I may need to update the database headers in Cubit’s files to make it easier for AI tools to “understand” the fields.
  3. Don’t assume that Code Interpreter will always “understand” the data fields even if it correctly “understood” the fields in a previous analysis.

Identifying the Highest Income Cities in Texas

Now let’s dig in! Can Code Interpreter figure out the highest-income cities in Texas using the most recent American Community Survey Census data? Yes, it produced a top-ten list of cities based on the correct median household income column in the file. It even called out that the median income doesn’t go higher than $250,001.

But I’m not impressed yet as I can do the same thing with a simple sort in Excel. So now I want to see something that I can’t do out of the box in Excel, and that’s build a map of these high-income cities so I can see where they are clustered in Texas.
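That “simple sort” really is just a few lines of code. The city names below are real Texas places, but the income values are made up for illustration; note the $250,001 top-code:

```python
# Hypothetical (city, median household income) rows; ACS top-codes
# median household income at $250,001.
cities = [
    ("Hilshire Village", 250_001),
    ("Alamo Heights", 158_000),
    ("Laredo", 55_000),
    ("West Lake Hills", 232_000),
    ("El Paso", 52_000),
]

# Sort descending by income and keep the top three.
top = sorted(cities, key=lambda row: row[1], reverse=True)[:3]
print(top)
```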

Visualizing the High-Income Cities on a Map

But Code Interpreter can’t build maps directly.

It did, however, suggest some tools to help visualize this data, such as Python mapping libraries like Folium – which doesn’t help me as I don’t know Python. Also, Code Interpreter clarified that it needs coordinates for map building.

Lessons Learned

  1. Code Interpreter can’t produce maps – bummer! But it can write code for other technologies to produce maps.
  2. I need to consider whether we should add latitude/longitude data to our data files.

Locating the Top 10 Cities in Texas

So I still want to know where these high-income cities are in Texas. Can Code Interpreter help me do this without a map?

Code Interpreter uses its own data to locate each city and ignores the county data in the file that I provided. But this is only problematic for “Redfield CDP,” as Code Interpreter doesn’t have data for this geography whereas the file that I provided does.

Could a different prompt give us what we need? Maybe.

I asked Code Interpreter to provide a graph of the counts of cities with the max median income by county, and it provided a description of the graph and what data was considered. Tada! Ok, I now roughly know where these high-income cities in Texas are located.

Show Me Something I Don’t Know.

I’m done exploring high-income cities in Texas, and I’m ready to be impressed. And what could be more impressive than Code Interpreter figuring out something about this dataset that I don’t already know? Here’s the prompt I used.

But the results were not as impressive as I hoped: a distribution of median household income across Texas cities, the top 10 counties by total population (even though the totals in the file are for cities, not counties), and the distribution of population densities across the cities. Honestly, I’m underwhelmed.

I’m going to skip a bunch of stuff that didn’t work to get you straight into the good stuff.

The Impressive: Weighted Scoring Model

Sometimes, I need to identify geographies that have large populations AND large income AND {insert other variable here}. Let’s see if Code Interpreter can do this.

And it completely failed. I tried a bunch of different prompts, and they all failed.


I was explaining what I was trying to do to Sara of FromThePage, and she asked me how I’d solve this problem without Code Interpreter. I told her that I’d build a simple model and apply weights. And she brilliantly asked, “I wonder what Code Interpreter would do if you told it that?” Good point! So I did, but this time using our Texas county dataset.

And that’s just what I wanted – a simple weighted model. But I don’t want Harris County to ALWAYS be at the top with its outlier population of 4 million people. So let’s see if Code Interpreter will tweak the weights.

This simple weighted model was the most interesting thing that I got Code Interpreter to do. I’ve been playing around with projections and change over time data, and I’m hopeful that I’ll get something even more impressive soon.
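For anyone who wants to reproduce the idea without Code Interpreter, the model boils down to min-max normalizing each variable and taking a weighted sum. The weights and county figures below are illustrative, not Code Interpreter’s actual output:

```python
# Illustrative Texas county rows: (name, population, median household income).
counties = [
    ("Harris", 4_700_000, 65_000),
    ("Travis", 1_300_000, 85_000),
    ("Collin", 1_100_000, 105_000),
    ("Hidalgo", 880_000, 45_000),
]

def minmax(values):
    """Scale a list of numbers to the 0-1 range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

pop_scores = minmax([c[1] for c in counties])
inc_scores = minmax([c[2] for c in counties])

# Lowering the population weight keeps an outlier like Harris County
# from always landing at the top.
w_pop, w_inc = 0.3, 0.7
scores = {c[0]: w_pop * p + w_inc * i
          for c, p, i in zip(counties, pop_scores, inc_scores)}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

With these weights, high-income Collin County outranks Harris despite Harris’s much larger population, which is exactly the kind of tweak I wanted.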

Lesson Learned

  1. Code Interpreter can’t solve data problems for you – beyond simple sorts and graphs. To get it to do something impressive, you must already know the solution to your problem AND you must figure out exactly how to tell it to produce what you want. Then again, maybe I just need more practice at prompt writing.

The Ugly: Obvious Calculation Errors

I was on the phone with a client who wanted to identify zips where many Hispanics live. And since I had already loaded demographics for Texas cities into Code Interpreter, I thought I’d see how well it would do.

First off, Code Interpreter had problems locating a “hispanic” column in the dataset even though there’s a clearly named column: “race_and_ethnicity_hispanic”. It thinks it fixes the problem but ends up using the wrong universe, producing Hispanic percentages over 100%, which is impossible.

So this is dumb, but to be fair, Code Interpreter points out the error.

I tried to get Code Interpreter to fix the problem on its own, but it couldn’t.

When I pointed Code Interpreter to the right columns to use, then it corrected the calculation. But if I’m going to have to spell out columns, then I’ll probably just stick with a database or Tableau or {insert other data tool that I know better}.
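The underlying fix is just picking the right denominator, i.e. the right universe. A sketch with made-up numbers (the column pairing mirrors Cubit’s race_and_ethnicity_hispanic field plus a total-population column):

```python
# Made-up rows: (zip code, Hispanic population, total population).
rows = [
    ("78521", 186_000, 192_000),
    ("78701", 5_200, 26_000),
]

for zip_code, hispanic, total_pop in rows:
    # The universe for a race/ethnicity share is total population,
    # not household income totals or any other column.
    pct = hispanic / total_pop
    assert pct <= 1.0, "a share over 100% means the wrong universe was used"
    print(zip_code, f"{pct:.1%}")
```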

Lessons Learned

  1. Double-check all Code Interpreter calculations.
  2. When you start getting results that are obviously wrong, reload the file and start over rather than trying to get Code Interpreter to find and fix the error.

And One Bonus Lesson Learned that Doesn’t Fit Anywhere Else

  1. You could use Code Interpreter like a flow in Tableau Prep. You drop in standardized data, run a series of prompts, and get a standardized output in text or data visualizations.


I’ve never incorporated a tool into my daily workflow as quickly as I have ChatGPT. Every day, I use it to do something a little different, be it writing email subject lines, rewriting this wordy blog post, or producing Google Sheets formulas that I can just copy and paste and they work (mostly).

As you can see from the above post, I’m still a novice in terms of using Code Interpreter to analyze Census data. In fact, my favorite use cases for Code Interpreter aren’t when I’ve asked it to analyze Census data, but when I’ve asked it to analyze data for my business, Cubit.

For example, I wanted to know what days of the week were most popular for purchases of one of our products. I was able to load product data into Code Interpreter, and it spit out the graph slightly faster than I could have built the same thing in Excel. But I didn’t have to spend my time fixing date format issues – Code Interpreter did this for me.

Also, I wanted to know what hours of the day I receive the most phone calls. Code Interpreter was able to clean up different time formats and produce the following graph – again slightly faster than I could have done AND saving me the brainpower from having to fix data format issues.

So my final lessons learned are:

  1. Code Interpreter is fun to use with internal business data as it makes simple graphs that I can use to answer simple questions.
  2. I need to keep using Code Interpreter daily with Census data or internal data to improve my prompt writing and learn what it can and can’t do.

Wow! You’ve read to the end. Color me impressed. You, my friend, are EXACTLY the type of person that I want to hear from, and here’s where you can send me a message.