Methodology & sources

Data colophon

The argument in this book rests on public data that anyone can obtain. This page explains where the data came from, how it was processed, and how to replicate or extend the analysis, whether you are a journalist, a researcher, a policy advocate, or a curious resident of any American city. The files, field names, and code tables described here are specific to King County, Washington. Other jurisdictions publish equivalent data under different names and with different structures; the methodology transfers, but the field names almost certainly do not.


Data sources

Dataset | Source | Notes
EXTR_Parcel.csv | King County Assessor extracts | Parcel characteristics, zoning, present use codes. Encoding: latin-1.
EXTR_RPAcct_NoName.csv | King County Assessor extracts | Appraised and taxable land and improvement values. Tax status field. Encoding: latin-1.
EXTR_LookUp.csv | King County Assessor extracts | Code tables. PresentUse codes live in LUType = 102.
EXTR_RPSale.csv | King County Assessor extracts | REET sale transactions. Primary source for the post-sale assessment tracker (forthcoming).
Parcel shapefile | King County GIS Open Data | Search "Parcels for King County with Address, Property and Ownership Information." Includes LAT, LON, APPRLNDVAL, APPR_IMPR. Projection: NAD83(HARN) Washington North (ftUS).
Census tract boundaries | King County GIS Open Data | 2020 Census tracts. Same projection as the parcel shapefile, so the spatial join requires no reprojection.
Census population | Census Bureau API | 2020 Decennial Census, variable P1_001N. No API key required for basic variables. King County: state:53&county:033.
Note on the parcel address dataset: King County announced the retirement of the parcel_address_area dataset effective June 1, 2026. If you are reading this after that date, the shapefile source URL or dataset name may have changed. Check the GIS open data portal for the current equivalent.

Key methodological decisions

Seattle filter

Parcel shapefile: LEVY_JURIS == 'SEATTLE'. Assessor CSVs: DistrictName == 'SEATTLE'. These fields do not always agree for edge parcels; the shapefile field is more reliable for geographic filtering.
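One way to see how often the two fields disagree is to match the sources on PIN and compare the flags. The sketch below uses made-up rows. In practice the left side would come from the parcel shapefile and the right side from the assessor CSVs, and it assumes both already carry a 10-character PIN.

```python
import pandas as pd

# Made-up rows for illustration; PINs and jurisdictions are not real parcels
shp = pd.DataFrame({
    'PIN': ['0000010001', '0000010002', '0000010003'],
    'LEVY_JURIS': ['SEATTLE', 'SEATTLE', 'SHORELINE'],
})
csv = pd.DataFrame({
    'pin': ['0000010001', '0000010002', '0000010003'],
    'DistrictName': ['SEATTLE', 'SHORELINE', 'SEATTLE'],
})

# Keep only parcels present in both sources, then compare the two Seattle flags
both = shp.merge(csv, left_on='PIN', right_on='pin', how='inner')
in_shp = both['LEVY_JURIS'] == 'SEATTLE'
in_csv = both['DistrictName'] == 'SEATTLE'
disagree = both[in_shp != in_csv]

print(f"Parcels where the two filters disagree: {len(disagree):,}")
```

Reviewing the disagreeing rows on a map is a quick way to confirm that they sit on the city boundary.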

PIN construction

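King County parcel IDs (PINs) are 10-character strings: Major zero-padded to 6 digits followed by Minor zero-padded to 4, that is, Major.zfill(6) + Minor.zfill(4). The shapefile provides a PIN column in this format. The assessor CSVs store Major and Minor in separate columns, which must be padded and concatenated before the two sources can be matched.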
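A minimal sketch of the padding rule, using made-up Major and Minor values. The helper name make_pin is hypothetical and exists only for this illustration.

```python
import pandas as pd

def make_pin(major: pd.Series, minor: pd.Series) -> pd.Series:
    """Build 10-character PINs: Major padded to 6 digits + Minor padded to 4."""
    return major.astype(str).str.zfill(6) + minor.astype(str).str.zfill(4)

# Made-up values, not real parcels
demo = pd.DataFrame({'Major': [12345, 987654], 'Minor': [67, 10]})
demo['pin'] = make_pin(demo['Major'], demo['Minor'])

print(demo['pin'].tolist())  # ['0123450067', '9876540010']
```

Without the padding, Major 12345 and Minor 67 would produce a 7-character string that matches nothing in the shapefile.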

Stub threshold

Parcels with ApprImpsVal <= 1000 are classified as stubs: their appraised improvement value is $1,000 or less. This is a deliberately conservative threshold. In practice, the placeholder values that appear at the bottom of the data are $0 and $1,000; both are treated as stubs.

Taxable filter

TaxStat == 'T' in EXTR_RPAcct_NoName.csv identifies parcels subject to property tax. Exempt parcels ('X') include parks, public utilities, federal land, and qualifying nonprofit housing. This filter removes most but not all public parcels — the underlying data is inconsistent.

Vacant parcel codes

PresentUse codes for vacant land (LUType 102): 300 = Vacant (Single-family), 301 = Vacant (Multi-family), 309 = Vacant (Commercial), 316 = Vacant (Industrial). Code 299 = Historic Vacant Land.
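The four vacant codes can be kept in a small lookup so that a filter covers all of them at once. This is a sketch on made-up rows; it leaves code 299 out of the lookup because the text lists it separately, and whether to include it is a judgment call for the analyst.

```python
import pandas as pd

# Codes and labels as listed above (LUType 102 in EXTR_LookUp.csv)
VACANT_CODES = {
    300: 'Vacant (Single-family)',
    301: 'Vacant (Multi-family)',
    309: 'Vacant (Commercial)',
    316: 'Vacant (Industrial)',
}

# Made-up rows for illustration; code 2 stands in for any non-vacant use
demo = pd.DataFrame({
    'pin': ['0000010001', '0000010002', '0000010003', '0000010004'],
    'PresentUse': [309, 300, 2, 316],
})

# Label each parcel, leaving non-vacant codes as NaN
demo['vacant_type'] = demo['PresentUse'].map(VACANT_CODES)
vacant = demo[demo['vacant_type'].notna()]

print(vacant[['pin', 'vacant_type']])
```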

Split assessment exclusion

King County sometimes records land and improvements on companion parcels with linked minor numbers. PropName strings containing 'imps on', 'associated with', 'econ unit', or 'imp data on' identify these cases. Excluding them produces a cleaner count of genuinely vacant or underimproved commercial parcels.

Population density scaling

Population per acre is capped at the 95th percentile (48.6 people/acre for King County tracts) before normalization. The raw maximum (194 people/acre) compresses the color scale so that most of Seattle reads as uniform. Clipping at p95 makes mid-range variation visible while still showing the densest tracts at maximum intensity.
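The clipping step can be sketched as follows. The densities are made up, and dividing by the cap to reach a 0 to 1 range is an assumption; the text above says only that values are capped before normalization.

```python
import numpy as np

def clip_and_scale(values, pct=95):
    """Cap values at the pct-th percentile, then divide by the cap (0 to 1)."""
    arr = np.asarray(values, dtype=float)
    cap = np.nanpercentile(arr, pct)
    if cap <= 0:
        return np.zeros_like(arr), cap
    return np.minimum(arr, cap) / cap, cap

# Made-up densities (people per acre): 39 ordinary tracts plus one extreme tract
density = np.append(np.linspace(1, 48, 39), 194.0)

raw_scaled = density / density.max()          # the outlier compresses everything else
clipped_scaled, cap = clip_and_scale(density)

print(f"cap at p95: {cap:.1f}")
print(f"median intensity, raw scale:     {np.median(raw_scaled):.2f}")
print(f"median intensity, clipped scale: {np.median(clipped_scaled):.2f}")
```

On the raw scale the median tract sits near the bottom of the color range. After clipping it lands near the middle, which is the effect described above.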


Known limitations

Public parcel contamination

Parks, utilities, and federal land sometimes appear as taxable vacant parcels because TaxStat is inconsistently recorded. Cross-referencing PROP_NAME against known public entities removes obvious cases but is not exhaustive. The vacant parcel counts in this book should be treated as upper bounds.
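A keyword cross-reference of this kind might look like the sketch below. The keyword list is hypothetical and deliberately short; a real list needs local knowledge, and every flagged parcel should be reviewed by hand because names such as "Park Place Apartments" will match too.

```python
import re
import pandas as pd

# Hypothetical keywords for illustration only
PUBLIC_KEYWORDS = ['city of', 'county', 'state of', 'port of', 'school district', 'park']

def flag_public(names: pd.Series, keywords=PUBLIC_KEYWORDS) -> pd.Series:
    """True where a name contains any keyword as a whole word or phrase."""
    # Word boundaries keep 'park' from matching 'parking'
    pattern = r'\b(?:' + '|'.join(re.escape(k) for k in keywords) + r')\b'
    return names.fillna('').str.lower().str.contains(pattern, regex=True)

# Made-up names, not real records
names = pd.Series([
    'SEATTLE CITY OF PARKS DEPT',
    'ACME PARKING LLC',
    'KING COUNTY METRO',
    None,
    'CORNER DINER',
])
flags = flag_public(names)

print(flags.tolist())  # [True, False, True, False, False]
```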

Partial exemptions

Some parcels are partially exempt — the legal description will say "PORTION TAXABLE." The assessed values in the shapefile reflect only the taxable portion. Full market value may be higher.

Coordinate gaps

Some parcels have LAT = 0 and LON = 0 — the assessor did not geocode them. These drop out of map visualizations. The polygon geometry in the shapefile is always present and can be used to compute centroids as a fallback.
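The centroid fallback can be sketched with two made-up square parcels. The sketch assumes the shapefile's projection corresponds to EPSG:2926 (NAD83(HARN) / Washington North, ftUS); check the CRS reported by geopandas for the actual file before relying on that code.

```python
import geopandas as gpd
from shapely.geometry import box

# Two made-up parcels; the second has the 0/0 placeholder coordinates
gdf = gpd.GeoDataFrame(
    {
        'PIN': ['0000010001', '0000010002'],
        'LAT': [47.61, 0.0],
        'LON': [-122.33, 0.0],
    },
    geometry=[
        box(1270000, 230000, 1270100, 230100),
        box(1271000, 231000, 1271100, 231100),
    ],
    crs='EPSG:2926',  # assumed code for the shapefile's projection
)

# Compute centroids in the projected CRS, then convert the points to lat/lon
centroids = gdf.geometry.centroid.to_crs('EPSG:4326')

# Fill in only the parcels the assessor did not geocode
missing = (gdf['LAT'] == 0) & (gdf['LON'] == 0)
gdf.loc[missing, 'LAT'] = centroids.y[missing]
gdf.loc[missing, 'LON'] = centroids.x[missing]

print(gdf[['PIN', 'LAT', 'LON']])
```

Computing the centroid before reprojecting avoids the planar-geometry warning that geopandas raises for centroids taken in a geographic CRS.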

Monthly snapshots

The King County Assessor extracts update monthly. All figures in this book and the accompanying visualizations reflect the May 2026 extract. The ongoing monitoring project tracks changes over time; see the tracker at otherwise-books.com/rent.


Python environment

The analysis requires Python 3.9 or later. Install dependencies:

pip install geopandas pandas numpy requests

For the web visualizations:

pip install pydeck

All scripts assume the working directory contains the data files. King County CSV extracts require encoding='latin-1' — the default UTF-8 encoding will fail on some records.
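The encoding failure is easy to reproduce without the real files. The sketch below builds one made-up CSV row containing a latin-1 accented character and reads it both ways.

```python
import io
import pandas as pd

# One made-up row; 'É' is the single byte 0xC9 in latin-1, which is invalid UTF-8 here
raw = 'Major,Minor,PropName\n123456,10,CAFÉ ROW\n'.encode('latin-1')

# Default UTF-8 decoding fails on the accented byte
try:
    pd.read_csv(io.BytesIO(raw))
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# The same bytes read cleanly as latin-1
df = pd.read_csv(io.BytesIO(raw), encoding='latin-1')

print(f"UTF-8 read succeeded: {utf8_ok}")
print(df.loc[0, 'PropName'])
```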


Core pipeline

Load and join the assessor CSVs

import pandas as pd

parcel = pd.read_csv('EXTR_Parcel.csv', encoding='latin-1', low_memory=False)
rpacct = pd.read_csv('EXTR_RPAcct_NoName.csv', encoding='latin-1', low_memory=False)

parcel['pin'] = parcel['Major'].astype(str).str.zfill(6) + \
                parcel['Minor'].astype(str).str.zfill(4)
rpacct['pin'] = rpacct['Major'].astype(str).str.zfill(6) + \
                rpacct['Minor'].astype(str).str.zfill(4)

df = parcel.merge(rpacct, on='pin', how='left', suffixes=('_p', '_r'))
seattle = df[df['DistrictName'] == 'SEATTLE'].copy()

for col in ['ApprLandVal', 'ApprImpsVal', 'SqFtLot']:
    seattle[col] = pd.to_numeric(seattle[col], errors='coerce').fillna(0)

Find stub parcels

stubs = seattle[seattle['ApprImpsVal'] <= 1000]
taxable_stubs = stubs[stubs['TaxStat'] == 'T']

print(f"Taxable stubs: {len(taxable_stubs):,}")
print(f"Total land value: ${taxable_stubs['ApprLandVal'].sum():,.0f}")

Find taxable vacant commercial parcels

vacant_commercial = taxable_stubs[taxable_stubs['PresentUse'] == 309].copy()

# Remove split assessments
exclude = ['imps on', 'associated with', 'econ unit', 'imp data on']
mask = vacant_commercial['PropName'].str.lower() \
       .str.contains('|'.join(exclude), na=False)
clean = vacant_commercial[~mask].copy()

print(f"Taxable vacant commercial (clean): {len(clean):,}")
print(f"Land value: ${clean['ApprLandVal'].sum():,.0f}")
print(f"Acres: {(clean['SqFtLot']/43560).sum():,.1f}")

Spatial join to census tracts

import geopandas as gpd
import pandas as pd

parcels = gpd.read_file('Parcels_for_King_County_with_Address,...shp')
tracts  = gpd.read_file('2020_Census_Tracts_for_King_County.shp')

# Both use NAD83(HARN) Washington North — no reprojection needed for join
seattle_shp = parcels[parcels['LEVY_JURIS'] == 'SEATTLE'].copy()
for col in ['APPRLNDVAL', 'APPR_IMPR', 'LOTSQFT']:
    seattle_shp[col] = pd.to_numeric(seattle_shp[col], errors='coerce').fillna(0)

seattle_shp['is_stub'] = (seattle_shp['APPR_IMPR'] <= 1000).astype(int)

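# predicate='within' matches only parcels lying entirely inside one tract.
# Parcels that straddle a tract boundary get no match and drop out of the
# groupby below.
joined = gpd.sjoin(seattle_shp,
                   tracts[['GEO_ID_TRT', 'NAME20', 'geometry']],
                   how='left', predicate='within')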

agg = joined.groupby('GEO_ID_TRT').agg(
    tract_name   = ('NAME20',      'first'),
    parcel_count = ('PIN',         'count'),
    land_value   = ('APPRLNDVAL',  'sum'),
    imps_value   = ('APPR_IMPR',   'sum'),
    stub_count   = ('is_stub',     'sum'),
    lotsqft      = ('LOTSQFT',     'sum'),
).reset_index()

agg['imps_ratio']    = agg['imps_value'] / (agg['land_value'] + agg['imps_value'])
agg['stub_pct']      = agg['stub_count'] / agg['parcel_count']
agg['land_per_acre'] = agg['land_value'] / (agg['lotsqft'] / 43560)

result = tracts.merge(agg, on='GEO_ID_TRT', how='left').to_crs('EPSG:4326')
result.to_file('tracts_enriched.geojson', driver='GeoJSON')

Add Census population

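import json
import urllib.request

import geopandas as gpd
import numpy as np
import pandas as pd

# Tract-level total population (P1_001N) for King County, Washington
url = ('https://api.census.gov/data/2020/dec/pl'
       '?get=P1_001N,NAME&for=tract:*&in=state:53+county:033')
with urllib.request.urlopen(url) as r:
    raw = json.loads(r.read())

# raw[0] is the header row: P1_001N, NAME, state, county, tract.
# state + county + tract gives the 11-digit tract GEOID.
pop_df = pd.DataFrame([
    {'GEO_ID_TRT': row[2]+row[3]+row[4], 'population': int(row[0])}
    for row in raw[1:]
])

# Join population to enriched tracts and rewrite
enriched = gpd.read_file('tracts_enriched.geojson')
enriched = enriched.merge(pop_df, on='GEO_ID_TRT', how='left')

# ALAND20 is divided by 4046.86 (square meters per acre).
# Zero or missing land area becomes NaN instead of dividing by zero.
land_acres = pd.to_numeric(enriched['ALAND20'], errors='coerce') \
    .replace(0, np.nan) / 4046.86
enriched['pop_per_acre'] = enriched['population'] / land_acres
enriched.to_file('tracts_enriched.geojson', driver='GeoJSON')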

Replicating for another city

The methodology is not specific to King County. Any jurisdiction that publishes parcel-level assessment data with land and improvement values separated can be analyzed the same way. The general steps:

1. Obtain the data. Most county assessors publish annual extracts. Search for "[county name] assessor parcel data download." Key fields needed: parcel ID, land value, improvement value, present use or property class code, lot square footage, and coordinates or a shapefile.

2. Identify the stub threshold. $1,000 is specific to King County's conventions. Other jurisdictions may use $0, $100, or $500 as their placeholder. Examine the distribution of improvement values — the placeholder will appear as a spike at a round number.

3. Filter to taxable parcels. Most datasets include an exemption or tax status flag. Remove exempt parcels before counting — parks, schools, government buildings, and qualifying nonprofits will otherwise inflate the numbers.

4. Cross-reference with sales data. The most powerful analysis compares assessed land value to sworn sale prices from transfer tax (REET) filings. A parcel that sold for $48.75 million and is assessed at $13.3 million tells the story without any theory attached.

5. Check the methodology against known cases. Before publishing any numbers, pull the records for two or three properties you can verify independently — a recent commercial sale, a named building, a surface parking lot. If the data matches what you can confirm from other sources, the methodology is sound.
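Steps 2 and 4 can be sketched in a few lines. The improvement values below are made up, with a spike at $500 standing in for another jurisdiction's placeholder, and the helper name placeholder_candidates is hypothetical. The ratio uses the two figures quoted in step 4.

```python
import pandas as pd

def placeholder_candidates(improvement_values, top_n=5):
    """Return the most frequent improvement values with their counts."""
    vals = pd.to_numeric(pd.Series(improvement_values), errors='coerce').dropna()
    return vals.value_counts().head(top_n)

# Step 2: made-up distribution with 40 parcels at a $500 placeholder
demo = [0] * 3 + [500] * 40 + [82000, 125000, 310000, 97500, 640000]
candidates = placeholder_candidates(demo)
print(candidates)

# Step 4: assessed value as a share of the sworn sale price
assessed = 13.3e6
sale_price = 48.75e6
ratio = assessed / sale_price
print(f"Assessed at {ratio:.1%} of sale price")  # Assessed at 27.3% of sale price
```

A real placeholder shows up as one round number with a count far above its neighbors. Genuine improvement values rarely repeat that often.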


How this analysis was conducted

The findings in this book did not emerge from a pre-designed study. They emerged from a process of iterative questioning — pulling a thread, following it, finding something unexpected, and pulling that thread too. The methodology is worth describing because it is reproducible, and because the messiness is part of the point.

The starting question was simple: how many parcels in Seattle carry improvement values of $1,000 or less? The answer — more than six thousand, once public parcels are included — led immediately to a second question: what is actually on those parcels? Which led to a third: why does the same $1,000 figure appear on a surface parking lot, a James Beard award-winning restaurant, a philanthropic campus whose improvements peaked at more than $400 million in assessed value before being written down to $1,000 over a decade, and three large transit-oriented-development zoned parcels adjacent to a light rail station?

Each answer required cleaning a layer of data problems first. The taxable filter removes most public parcels but not all — parks and utilities appear in the taxable population because historical records were never corrected. Split assessments make some commercial parcels look vacant when the improvement value is recorded on a companion parcel. The "PORTION TAXABLE" notation in legal descriptions signals partial exemptions that suppress the visible land value. None of these problems invalidate the analysis. They require acknowledging that the published numbers are upper bounds, not precise counts.

Where the data could be checked against a known case — a named restaurant, a well-documented campus, a specific parcel — it was. The aggregate numbers rest on the same methodology as the individual cases that passed that check.

This is the standard for data journalism, not academic research. The bar is not a peer-reviewed methodology section — it is reproducibility and transparency. The data is public. The code is on this page. The known limitations are documented. A reporter, a researcher, or a policy advocate who disagrees with the findings can download the same files, run the same queries, and check the work. That accountability is the argument for doing it this way.

For those who want to conduct a similar analysis — in King County or elsewhere — the practical advice is this: start with a question you can answer with a single filter, then let the anomalies guide you. The most important findings in this analysis were not planned. They surfaced because the data was asked a question, gave a surprising answer, and that answer was taken seriously enough to follow.


Further reading

McMillen, Daniel and Ruchi Singh. "Land Value Estimation using Teardowns." Lincoln Institute of Land Policy, 2022. Establishes the teardown sale methodology for estimating land values in built-up markets where vacant land sales are scarce — directly relevant to the reproduction cost arguments in this book.

Missemer, Antoine and Gauthier Pottier. "The Neoclassical Domestication of Land." Land Economics, Vol. 101 No. 4, 2025. Documents the ideological capture of land economics by neoclassical theory and the deliberate erasure of land as a distinct factor of production.

Washington State Department of Revenue. Property Tax Guide for Washington State. The statutory framework including RCW 84.40.030 (true and fair value), RCW 84.40.340 (assessor's authority to require financial records), and the 1930 Uniformity Clause history.