For a recent radius report, I needed a way to estimate the number of investors within a radius area. In a perfect world, I would have been able to produce the number of investors with a liquid net-worth of over $250,000 and an income level of over $75,000 a year by radius.
Unfortunately, I don’t have a dataset that identifies the number of investors by net worth and by income level in my 306,063,689 KB databases, so I couldn’t solve the “perfect world” scenario above. But I did come up with a back-of-the-envelope way to say “there are likely to be more investors in Radius X than in Radius Y.”
The IRS releases the number of tax returns with taxable interest, ordinary dividends and qualified dividends by income range by zip code. That data looks like this.
Now we need to calculate a radius around a point. I’m going to use a 1 mile radius so that it makes the images below clearer, but in the real world, a 1 mile radius would be too small for this type of estimate.
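As a rough illustration of what "a radius around a point" means in code, here is a minimal Python sketch that checks whether a latitude/longitude point falls within a given number of miles of a center point, using the haversine formula. The function names, the Earth-radius constant and the sample coordinates are assumptions made for this example; a GIS package would normally build the radius polygon for you.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius (assumption)


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(center, point, radius_miles=1.0):
    """True if `point` (lat, lon) lies inside the radius around `center`."""
    return haversine_miles(center[0], center[1], point[0], point[1]) <= radius_miles


# Hypothetical center point, for illustration only.
center = (39.80, -105.10)
print(within_radius(center, (39.81, -105.10)))  # about 0.7 miles north
print(within_radius(center, (39.82, -105.10)))  # about 1.4 miles north
```

One degree of latitude is roughly 69 miles, so the first sample point sits inside a 1 mile radius and the second sits outside it.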
Now we need to know what zip codes are in this area. Below are images of general zip codes in the area that I took from our Income By Zip Code site.
Next we can identify zip codes/ZCTAs that intersect the radius.
Then, the GIS analysis calculates the percent of each zip code/ZCTA's area that falls within the radius. Again, the sample radius is too small for this estimate to be meaningful, but a 1 mile radius makes the images easier to read than a 5 or 10 mile radius would.
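To make the overlap idea concrete, here is a minimal Python sketch that approximates the share of a polygon's area lying inside a circle by sampling a regular grid of points. This is not how a GIS tool does it (GIS software computes the exact intersection area); it is a stand-in technique that needs no libraries. It assumes the polygon and the circle are already in flat, projected coordinates measured in miles, and the function names and the sample square are made up for illustration.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; `polygon` is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def overlap_fraction(polygon, center, radius, steps=200):
    """Approximate fraction of the polygon's area that lies inside the circle."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    min_x, min_y = min(xs), min(ys)
    dx = (max(xs) - min_x) / steps
    dy = (max(ys) - min_y) / steps
    cx, cy = center
    in_polygon = in_both = 0
    for i in range(steps):
        x = min_x + (i + 0.5) * dx  # sample at the middle of each grid cell
        for j in range(steps):
            y = min_y + (j + 0.5) * dy
            if point_in_polygon(x, y, polygon):
                in_polygon += 1
                if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                    in_both += 1
    return in_both / in_polygon if in_polygon else 0.0


# Hypothetical 2 x 2 mile square "zip code" with a 1 mile radius at its corner.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
print(round(overlap_fraction(square, (0.0, 0.0), 1.0), 3))
```

For the sample square, the circle covers a quarter-disc of the polygon, so the result should land near π/16, or a little under 20 percent.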
Next, we multiply the overlap percent by the number of returns for each zip code/ZCTA.
Finally, the radius estimate equals the sum, across all zip codes/ZCTAs that intersect the radius, of the overlap percent multiplied by the number of returns. So we’d do the same step as in the image above for zips 80004 and 80033 and add the results to the results for 8002 to produce the estimate for the radius area. (Just to be clear, I’d do these calculations in SQL rather than Excel, but Excel produces images that are easier to understand.)
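The arithmetic is just a weighted sum, which can be sketched in a few lines of Python. The zip labels, overlap fractions and return counts below are placeholders invented for illustration, not real IRS figures, and the function name is an assumption.

```python
def radius_estimate(zip_rows):
    """Sum of (overlap fraction x number of returns) over intersecting zips.

    `zip_rows` maps a zip/ZCTA label to a tuple of
    (fraction of the zip inside the radius, number of returns).
    """
    return sum(overlap * returns for overlap, returns in zip_rows.values())


# Placeholder values only: three zips that intersect a hypothetical radius.
sample = {
    "zip_a": (0.25, 1000),  # 25% of zip_a is inside the radius
    "zip_b": (0.10, 2000),
    "zip_c": (0.05, 400),
}
print(radius_estimate(sample))  # 250 + 200 + 20, so roughly 470
```

In SQL, the same step would be a `SUM` of overlap times returns over the rows for one radius.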
The end result is a back-of-the-napkin estimate of the number of households/individuals in a radius area with taxable interest, ordinary dividends and qualified dividends, which could be used as indicators of the number of investors in an area.
Off the top of my head, here are a couple of problems with the above estimate.
- The radius report methodology assumes that the population and investors are evenly distributed throughout a zip/ZCTA. This could be a problem when, for example, all of the investors live in a wealthy neighborhood in the northern portion of the zip/ZCTA, and none live in the southern portion that actually intersects the radius area.
- As of today, the most current IRS data is from 2012.
- I have no data to support the assumption that the number of households/individuals in a radius area with taxable interest, ordinary dividends and qualified dividends aligns with the number of investors with a liquid net-worth of over $250,000 and an income level of over $75,000 a year by radius.
- It would be far easier (less work/less data) to assume that the number of investors correlates with the number of persons with high incomes. Then you could do the above analysis using 2013 Census data and Census geographies (smaller than zip codes, so more precise), based either on high median incomes or on the number of households that make over $200,000 a year.
Got questions or comments? Know of a better dataset than IRS data that I should consider to identify investors? Contact me & let me know your thoughts.