Crime Trend Analysis Methodology

Public Analyst.ai is built around three commitments: plain English for readers, strict statistical thresholds for the anomaly engine, and open backtests so you know how often we've been wrong. The platform-wide rules come first; per-city data sources, NIBRS migration year, reporting lag, and the live forecast backtest table for each city follow at the bottom.

The 10 categories we track

Every neighborhood and citywide chart on this site uses the same 10 categories, chosen as the multi-city common denominator: each is an FBI UCR Part 1 / NIBRS Group A category that every modern US police department reports under federal standard. Adding new cities means writing per-city ingest mappers, not redesigning the analytics layer.

  • Homicide. Homicide + manslaughter (FBI UCR convention).
  • Robbery. All robbery subtypes (commercial, street, carjacking, residential).
  • Aggravated Assault. Aggravated Assault subcategory only — the NIBRS Part 1 measure of serious violence. Simple assault is excluded because it varies enormously by city/policing practice and would dominate the assault chart.
  • Sexual Assault. Rape + Sex Offense (excl. prostitution) + Human Trafficking. Some cities pre-redact location data for these categories per state law — see the per-city sections below.
  • Burglary. Residential, commercial, hot-prowl, other.
  • Theft from Vehicle. Larceny – From Vehicle, Theft From Vehicle, and Larceny – Auto Parts (catalytic-converter wave). The Bay Area's defining crime category — kept separate from larceny.
  • Other Larceny. Larceny: shoplifting, pickpocket, from building, bicycle, purse-snatch, other.
  • Motor Vehicle Theft. Stolen vehicles (completed + attempted). Distinct from theft-from-vehicle. Recovered vehicles are not counted (they're status updates on a previously reported MVT).
  • Vandalism. Malicious Mischief + Vandalism.
  • Arson. Arson. Low volume but FBI Part 1 — surfaced via rare-event / streak-break signals.

What we deliberately exclude

These categories appear in raw incident data but don't appear in our trend charts — the reason is documented per-category so you can decide whether you agree.

  • Simple Assault. Varies in policing/reporting practice across cities and time periods; would dominate the assault chart and obscure trends in serious violence.
  • Domestic Violence + family-against-children offenses. Each city's bundle is opaque (DV, child abuse, family disputes mixed). Combining with stranger-violence trends would mislead. Deserves its own module with proper DV reporting nuance.
  • Drug offenses + quality-of-life arrests. Reflect policing policy, not victim behavior. Enforcement priorities have shifted multiple times in the analysis window.
  • Weapons-possession charges. Possession ≠ act of violence; conflating inflates the 'violent' trend.
  • White-collar (Fraud, Forgery, Embezzlement). Systematically under-reported in police data; misleading to compare.
  • Admin records, traffic, suicide. Not crimes against persons or property. Counted incidents would inflate volume without reflecting actual harm.

On demographics

The City page includes a city-level demographic profile (median income, poverty rate, age distribution, etc.) for background context. Demographics are never juxtaposed with neighborhood-level crime data — no choropleth maps colored by income, no per-nbhd crime-vs-poverty tables, no demographic covariates in anomaly detection or forecasts. The bias risk is the juxtaposition, not the existence of the numbers: when a reader sees a high-crime neighborhood next to its income or race stats, the brain reads correlation as causation regardless of authorial intent. So we expose city-level facts as background context and stop there.

Anomaly thresholds

Strict thresholds (≈ p < 0.01) keep the false-positive rate manageable — every (city × neighborhood × category × signal-type) cell gets evaluated each month, which is hundreds to thousands of tests. Each rule pairs a statistical test with an absolute-count floor — statistically “significant” movement on tiny counts isn't meaningful.

  • Spike. Current 12-month total > baseline mean + 2.5σ AND ≥ 20 incidents AND current 6-month total > 6-month baseline + 1σ.
  • Drop. Current 12-month total < baseline mean − 2.5σ AND baseline mean ≥ 20 AND current 6-month total < 6-month baseline − 1σ. Drops are surfaced with the same prominence as spikes; otherwise the platform reads as fear-mongering.
  • Rare event. Any incident in the last 90 days AND no prior comparable incident in the previous 5 years.
  • Streak break. Incident in the current month AND a gap of ≥ 24 months since the previous one AND baseline rate < 6/year (so the “streak” was real).
  • Sustained shift. Recent 12-month total vs. prior 12-month total: |Z| > 2.576 (≈ p < 0.01) AND ratio differs from 1.0 by ≥ 25% AND both windows ≥ 20 incidents.
  • Zero event. Zero incidents in the full analysis window for the city. Informational backdrop only — never a chip.
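The spike, drop, and sustained-shift rules above can be sketched as plain predicates. This is a minimal illustration, not the contents of pipeline/src/flags/detectors.py: the `Window` field names are hypothetical, "6-month baseline + 1σ" is read as the mean and standard deviation of historical 6-month totals, and the Z statistic is an assumed two-sample Poisson comparison, since the text does not name the test.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Window:
    """Summary statistics for one (city, neighborhood, category) cell."""
    current_12m: float       # total over the most recent 12 months
    baseline_mean: float     # mean of historical 12-month totals
    baseline_sd: float       # standard deviation of those totals
    current_6m: float        # total over the most recent 6 months
    baseline_6m_mean: float  # mean of historical 6-month totals
    baseline_6m_sd: float    # standard deviation of those totals


def is_spike(w: Window) -> bool:
    """Statistical test AND absolute-count floor AND 6-month confirmation."""
    return (
        w.current_12m > w.baseline_mean + 2.5 * w.baseline_sd
        and w.current_12m >= 20
        and w.current_6m > w.baseline_6m_mean + w.baseline_6m_sd
    )


def is_drop(w: Window) -> bool:
    """Mirror of the spike rule; the count floor applies to the baseline."""
    return (
        w.current_12m < w.baseline_mean - 2.5 * w.baseline_sd
        and w.baseline_mean >= 20
        and w.current_6m < w.baseline_6m_mean - w.baseline_6m_sd
    )


def is_sustained_shift(recent_12m: int, prior_12m: int) -> bool:
    """Recent vs. prior 12-month totals: |Z| > 2.576, >= 25% change, both >= 20."""
    if recent_12m < 20 or prior_12m < 20:
        return False
    # ASSUMPTION: difference of two Poisson counts; the source does not specify the Z-test.
    z = (recent_12m - prior_12m) / sqrt(recent_12m + prior_12m)
    ratio = recent_12m / prior_12m
    return abs(z) > 2.576 and abs(ratio - 1.0) >= 0.25
```

Keeping the count floor inside each predicate means a statistically large move on a tiny series can never raise a flag, which is the point of pairing the two conditions.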

Forecasts

Forecasts use Prophet on monthly counts. Two skip rules apply:

  • Low count. If a (neighborhood, category) cell averages < 2 incidents/month over the trailing 24 months, no forecast is produced, because point estimates on near-zero series yield uselessly wide intervals.
  • Violent at neighborhood level. Homicide / Robbery / Aggravated Assault / Sexual Assault are skipped per neighborhood and surfaced via rare-event and streak-break signals instead. Citywide forecasts of these still run.
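The two skip rules can be expressed as a single gate in front of the forecaster. The sketch below is illustrative only: the function name, the `level` values, and the reason strings are hypothetical, and it assumes the low-count rule applies only to neighborhood-level cells, as the rule is worded.

```python
VIOLENT = {"Homicide", "Robbery", "Aggravated Assault", "Sexual Assault"}


def forecast_skip_reason(category, level, trailing_24m_counts):
    """Return None when a forecast should run, otherwise a short reason string.

    level is "neighborhood" or "city" (hypothetical labels).
    trailing_24m_counts holds the last 24 monthly incident counts.
    """
    if len(trailing_24m_counts) != 24:
        raise ValueError("expected exactly 24 monthly counts")
    if level != "neighborhood":
        return None  # citywide forecasts still run, including violent categories
    if category in VIOLENT:
        return "violent_at_neighborhood_level"
    if sum(trailing_24m_counts) / 24 < 2:
        return "low_count"
    return None
```

A cell that averages exactly 2 incidents/month is forecast, since the rule is a strict "less than".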

Every forecast shows a 95% prediction interval. Point estimates without ranges are irresponsible — we always pair the two. Horizon is 12 months max, after which intervals become useless and credibility evaporates on year-2 checks. Each city's actual backtest table is below.

Time of Day Analysis: Understanding Reporting Bias

The neighborhood and city pages include hour-of-day, day-of-week, and month-of-year distributions per category. One platform-wide caveat: hour 0 is mildly inflated in some cities. Some incident reports default the time field to midnight when the actual time is unknown, so the 12am bar overstates true activity. The real diurnal pattern still shows through (lowest at 4–5am, peaks at noon and in the evening); the inflation is consistent across cities and doesn't change the relative shape.

Platform-wide caveats

  • Methodology may evolve. Calibration after each ingest cycle may surface threshold adjustments. The rules in effect at any given monthly run are documented in the source-of-truth file pipeline/src/flags/detectors.py.
  • Per-capita rates exclude non-residential geographies. Park-only or industrial-only neighborhoods (e.g. Golden Gate Park, the Presidio) have near-zero residents but real visitor populations. We show absolute counts for these and suppress the per-capita ratio.
  • Sensitive crime locations are pre-redacted upstream. Some police departments aggregate sexual-assault and domestic-violence locations to district centroids before publication, per state law. Counts and trends are accurate; we're just displaying them at the location precision the source provides. See each city's section below for its specific policy.

San Francisco

41 neighborhoods · NIBRS-era data from 2018+ · reporting lag 7 days.

Data sources

  • Crime incidents. SFPD Incident Reports (DataSF) — Socrata resource wg3w-h783. NIBRS-era only (2018+).
  • Neighborhood polygons. 41 polygons. DataSF Analysis Neighborhoods (resource j2bu-swwd) — the official boundary set used by city government for analysis.
  • Population. US Census ACS 5-year (variable B01003_001E), summed across the city's tract crosswalk.
  • Census county. 06-075 (San Francisco County, coterminous with the city).

Analysis window

January 2018 to April 2026. Pre-2018 records are excluded. With the 7-day lag, the briefing currently shown reports on April 2026, which is earlier than calendar-current because we wait for the buffer to clear before treating a month as settled.
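One way to read the lag rule is that a month counts as settled once the lag has elapsed after its last day. The helper below is a sketch under that assumption; the function name is hypothetical, and whether the boundary day itself counts (`<=` versus `<`) is a guess.

```python
from datetime import date, timedelta


def latest_settled_month(today: date, lag_days: int) -> tuple[int, int]:
    """Return (year, month) of the most recent month whose lag buffer has cleared."""
    first_of_month = today.replace(day=1)
    while True:
        # The day before the first of a month is the last day of the previous month.
        last_day_prev = first_of_month - timedelta(days=1)
        if last_day_prev + timedelta(days=lag_days) <= today:
            return (last_day_prev.year, last_day_prev.month)
        first_of_month = last_day_prev.replace(day=1)
```

Under this reading, a run on 10 May 2026 with a 7-day lag reports on April 2026, while a run on 5 May 2026 would still report on March.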

Forecast backtest

We trained the forecast model on data ending the previous year and predicted the following 12 months. Coverage is the share of months whose actual count fell inside the 95% CI; MAPE is mean absolute percentage error; bias is mean of (point − actual): positive means we systematically over-predicted.

Category | Months | Coverage | MAPE | Bias / mo
No backtest results yet for San Francisco. Run pipeline backtest --city san-francisco.
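The three backtest metrics defined above (coverage, MAPE, bias) can be computed from four aligned monthly series. This is a sketch rather than the pipeline's own code: the function name and dictionary keys are hypothetical, and skipping zero-actual months in MAPE is an assumption, since the text does not say how those are handled.

```python
def backtest_metrics(actual, point, lower, upper):
    """Coverage (%), MAPE (%), and mean bias per month for one category."""
    n = len(actual)
    if n == 0 or not (n == len(point) == len(lower) == len(upper)):
        raise ValueError("all four series must be non-empty and the same length")
    # Coverage: share of months whose actual count fell inside the interval.
    covered = sum(lo <= a <= hi for a, lo, hi in zip(actual, lower, upper))
    # MAPE is undefined when the actual count is zero; those months are skipped (assumption).
    pct_errors = [abs(p - a) / a for a, p in zip(actual, point) if a != 0]
    mape = 100 * sum(pct_errors) / len(pct_errors) if pct_errors else float("nan")
    # Bias: mean of (point - actual); positive means over-prediction.
    bias = sum(p - a for a, p in zip(actual, point)) / n
    return {"coverage_pct": 100 * covered / n, "mape_pct": mape, "bias_per_month": bias}
```

For example, with actuals `[100, 80, 120, 50]`, point forecasts `[110, 80, 100, 70]`, and intervals `[90, 130]`, `[70, 90]`, `[85, 115]`, `[55, 85]`, two of the four months are covered and the mean bias is +2.5 per month.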

Spatial spillover of spike events

For every spike flag we've fired in San Francisco, we ask: did an adjacent neighborhood spike the same category in the next 3 months? The answer tells us whether shocks are local or regional. Adjacency is computed from the 41 neighborhood polygons (avg 4.9 neighbors per neighborhood). Rates are suppressed below 5 events.

Overall: across 32 historical spike events, an adjacent neighborhood spiked the same category within 3 months 56.2% of the time.

San Francisco historical spike-event spillover by crime category (3-month lookahead, adjacent neighborhoods via shared boundary).
Category | Spike events | Same-category spillover
Aggravated assault | 2 | too few
Burglary | 3 | too few
Other larceny | 27 | 66.7%

Window is forward-only: we ask whether a spike in neighborhood A was followed by a same-category spike in any of A's adjacent neighborhoods within the next 3 months. Co-occurring same-month spikes count. Categories with too few historical spikes for a stable rate (homicide, arson, and any other low-volume bucket) are listed but their rates are suppressed.
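The forward-only spillover check can be sketched as a lookup over a set of spike events. The representation here is an assumption: each event is a hypothetical `(neighborhood, category, month_index)` tuple with months numbered consecutively, and every flagged month counts as its own event, which may not match how the platform groups runs of flags.

```python
def spillover_rates(spikes, adjacency, lookahead=3, min_events=5):
    """Per-category (event count, spillover %), with % suppressed below min_events.

    spikes: iterable of (neighborhood, category, month_index) tuples.
    adjacency: dict mapping a neighborhood to its shared-boundary neighbors.
    """
    spike_set = set(spikes)
    tallies = {}
    for nbhd, cat, month in spike_set:
        # Same-month co-occurrence counts, so the offset k starts at 0.
        followed = any(
            (neighbor, cat, month + k) in spike_set
            for neighbor in adjacency.get(nbhd, ())
            for k in range(lookahead + 1)
        )
        events, hits = tallies.get(cat, (0, 0))
        tallies[cat] = (events + 1, hits + int(followed))
    return {
        cat: (events, 100 * hits / events if events >= min_events else None)
        for cat, (events, hits) in tallies.items()
    }
```

A rate of `None` corresponds to the "too few" rows in the tables: the event count is still listed, but no percentage is shown.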

San Francisco-specific caveats

  • Sexual-assault and DV locations are pre-redacted. SFPD aggregates incident locations for sexual-assault and domestic-violence reports to police-district centroids before publication, per California Penal Code §293. Counts and trends are accurate; we simply display them at the precision the source provides.
  • 2024 larceny reclassification. SFPD revised some Larceny coding categories during 2024. The change shifts where certain reports land between 'theft from vehicle' and 'other larceny', so cross-2024 comparisons within those two buckets carry a footnote on the per-month archive pages.
  • Park and non-residential neighborhoods. Golden Gate Park, Lincoln Park, McLaren Park, and the Presidio have near-zero residents but real visitor populations. Per-capita rates are suppressed for these geographies; only absolute counts are shown.

Chicago

77 neighborhoods · NIBRS-era data from 2018+ · reporting lag 7 days.

Data sources

  • Crime incidents. CPD Crimes - 2001 to Present (Chicago Data Portal) — Socrata resource ijzp-q8t2. NIBRS-era only (2018+).
  • Neighborhood polygons. 77 polygons. Chicago's 77 community areas — defined by University of Chicago researchers in the 1920s and still the city's standard analytical unit. Sourced from the Chicago Data Portal (resource igwz-8jzy).
  • Population. US Census ACS 5-year (variable B01003_001E), summed across the city's tract crosswalk.
  • Census county. 17-031 (Cook County — broader than the city; only city tracts are summed).

Analysis window

January 2018 to April 2026. Pre-2018 records are excluded. With the 7-day lag, the briefing currently shown reports on April 2026, which is earlier than calendar-current because we wait for the buffer to clear before treating a month as settled.

Forecast backtest

We trained the forecast model on data ending the previous year and predicted the following 12 months. Coverage is the share of months whose actual count fell inside the 95% CI; MAPE is mean absolute percentage error; bias is mean of (point − actual): positive means we systematically over-predicted.

Category | Months | Coverage | MAPE | Bias / mo
No backtest results yet for Chicago. Run pipeline backtest --city chicago.

Spatial spillover of spike events

For every spike flag we've fired in Chicago, we ask: did an adjacent neighborhood spike the same category in the next 3 months? The answer tells us whether shocks are local or regional. Adjacency is computed from the 77 neighborhood polygons (avg 5.1 neighbors per neighborhood). Rates are suppressed below 5 events.

Overall: across 255 historical spike events, an adjacent neighborhood spiked the same category within 3 months 40% of the time. Other larceny (46.6%) is the most regionally co-moving category; robbery (0%) is the most local.

Chicago historical spike-event spillover by crime category (3-month lookahead, adjacent neighborhoods via shared boundary).
Category | Spike events | Same-category spillover
Robbery | 11 | 0%
Aggravated assault | 25 | 24%
Sexual assault | 54 | 46.3%
Burglary | 1 | too few
Other larceny | 116 | 46.6%
Motor vehicle theft | 20 | 35%
Vandalism | 27 | 37%
Arson | 1 | too few

Window is forward-only: we ask whether a spike in neighborhood A was followed by a same-category spike in any of A's adjacent neighborhoods within the next 3 months. Co-occurring same-month spikes count. Categories with too few historical spikes for a stable rate (homicide, arson, and any other low-volume bucket) are listed but their rates are suppressed.

Chicago-specific caveats

  • IUCR codes, not NIBRS. Chicago publishes incidents under the IUCR coding scheme (the predecessor to NIBRS). CPD started submitting NIBRS to the FBI in 2021, but the public dataset has stayed on IUCR for stability — so our category mapper translates IUCR primary types and descriptions into the same UCR Part 1 buckets used elsewhere on the site. The buckets are equivalent; the source vocabulary is different.
  • Theft-from-vehicle is undercounted before 2024. CPD didn't break out theft from a vehicle as its own IUCR code until 2024 (when code 0710 “THEFT FROM MOTOR VEHICLE” was introduced under THEFT, joined in 2025 by code 0760 “BURGLARY FROM MOTOR VEHICLE” under BURGLARY). Pre-2024 incidents that today would be coded as theft from a vehicle were filed under the generic “FORCIBLE ENTRY” / “UNLAWFUL ENTRY” burglary codes with no vehicle subdivision, so the theft-from-vehicle bucket is artificially low and the burglary bucket is artificially high before 2024. Year-over-year comparisons across that boundary should be read with that shift in mind.
  • Aggravated battery folded into aggravated assault. IUCR splits BATTERY and ASSAULT as separate primary types. UCR Part 1 has no “battery” category — aggravated battery and aggravated assault are both reported under “aggravated assault.” We follow the UCR convention so Chicago's aggravated-assault counts are comparable to SF's and Oakland's. Simple battery and simple assault are excluded for the same reason as in other cities.
  • Domestic-battery rows are excluded. CPD codes domestic battery as a separate IUCR description (“AGG. DOMESTIC BATTERY …”). Per the platform-wide stance on DV reporting, those rows are excluded from the aggravated-assault bucket — the bundle of domestic / family / acquaintance violence belongs in a dedicated DV module rather than mixed into stranger-violence trends.
  • 77 community areas, not block groups. Chicago's neighborhood unit is the community area — 77 long-stable polygons defined by the University of Chicago in the 1920s, still the city government's standard analytical unit. We spatial-join each incident's lat/lng to a community-area polygon at ingest, so neighborhood assignment doesn't depend on CPD's pre-joined community_area number (which is null on a small fraction of recent rows).
  • City demographics use Cook County medians. Cook County (FIPS 17-031) is broader than Chicago city. Per-tract counts (population, households, housing units) are summed only across the ~790 city tracts; medians (rent, home value, household income, age) come from the county because medians don't aggregate across tracts, so they differ slightly from what a Chicago-city-only median would be.
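The claim that medians don't aggregate across tracts is easy to check with a toy example. The household incomes below are made-up illustrative numbers, not Census data; the point is only that combining per-tract medians does not recover the median of the pooled households.

```python
from statistics import median

# Two hypothetical tracts of different sizes (illustrative incomes, not real data).
tract_a = [30_000, 32_000, 34_000]
tract_b = [60_000, 90_000, 95_000, 100_000, 120_000, 150_000, 200_000]

# Naive aggregation: take the median of the two tract medians.
median_of_medians = median([median(tract_a), median(tract_b)])

# True citywide figure: the median over all households pooled together.
pooled_median = median(tract_a + tract_b)

print(median_of_medians, pooled_median)
```

Counts, by contrast, sum cleanly across tracts, which is why population and housing units can be restricted to city tracts while medians cannot.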

Cincinnati

50 neighborhoods · NIBRS-era data from 2020+ · reporting lag 7 days.

Data sources

  • Crime incidents. CPD STARS Category Offenses + PDI Crime Incidents (Cincinnati Open Data) — Socrata resource 7aqy-xrv9. NIBRS-era only (2020+).
  • Neighborhood polygons. 50 polygons. Cincinnati Statistical Neighborhood Approximations (SNAs, 2020 Census-aligned), 50 polygons maintained by CAGIS. The SNAs are the official analytical neighborhood unit, redrawn every 10 years to fit Census geography.
  • Population. US Census ACS 5-year (variable B01003_001E), summed across the city's tract crosswalk.
  • Census county. 39-061 (Hamilton County — broader than the city; only city tracts are summed).

Analysis window

January 2020 to April 2026. Pre-2020 records are excluded. With the 7-day lag, the briefing currently shown reports on April 2026, which is earlier than calendar-current because we wait for the buffer to clear before treating a month as settled.

Forecast backtest — trained through 2024-12

We trained the forecast model on data ending 2024-12 and predicted the following 12 months. Coverage is the share of months whose actual count fell inside the 95% CI; MAPE is mean absolute percentage error; bias is mean of (point − actual): positive means we systematically over-predicted.

Category | Months | Coverage | MAPE | Bias / mo
Aggravated Assault | 12 | 50.0% | 17.3% | +10.2
Burglary | 12 | 75.0% | 15.4% | -23.4
Homicide | 12 | 100.0% | 36.8% | -0.3
Motor Vehicle Theft | 12 | 41.7% | 78.7% | +139.5
Other Larceny | 12 | 83.3% | 7.0% | -28.4
Robbery | 12 | 100.0% | 15.2% | +0.2
Sexual Assault | 12 | 100.0% | 22.5% | +1.8
Theft from Vehicle | 12 | 66.7% | 19.9% | +15.7

Spatial spillover of spike events

For every spike flag we've fired in Cincinnati, we ask: did an adjacent neighborhood spike the same category in the next 3 months? The answer tells us whether shocks are local or regional. Adjacency is computed from the 50 neighborhood polygons (avg 4.6 neighbors per neighborhood). Rates are suppressed below 5 events.

Overall: across 466 historical spike events, an adjacent neighborhood spiked the same category within 3 months 73.4% of the time. Other larceny (81.5%) is the most regionally co-moving category; motor vehicle theft (40%) is the most local.

Cincinnati historical spike-event spillover by crime category (3-month lookahead, adjacent neighborhoods via shared boundary).
Category | Spike events | Same-category spillover
Robbery | 40 | 45%
Aggravated assault | 68 | 67.6%
Burglary | 86 | 80.2%
Theft from vehicle | 85 | 81.2%
Other larceny | 157 | 81.5%
Motor vehicle theft | 30 | 40%

Window is forward-only: we ask whether a spike in neighborhood A was followed by a same-category spike in any of A's adjacent neighborhoods within the next 3 months. Co-occurring same-month spikes count. Categories with too few historical spikes for a stable rate (homicide, arson, and any other low-volume bucket) are listed but their rates are suppressed.

Cincinnati-specific caveats

  • 2024-06-03 STARS migration; two source feeds stitched. Cincinnati Police Department migrated its public crime feed to the STARS schema on 2024-06-03. We read pre-migration data from the PDI Crime Incidents dataset (k59e-2pvf, granular UCR groupings + theft codes) and post-migration data from the new STARS dataset (7aqy-xrv9, rolled-up stars_category + Part 1/Part 2 type). Both feeds publish sna_neighborhood pre-attached on every row, and the bucket mapper has two paths (one per vocabulary) routed by the report_type_code tag set at ingest. Trend direction is comparable across the boundary; absolute coding subtleties may differ, see the next caveat.
  • Vandalism and arson are excluded. STARS rolls all Part 2 misdemeanor offenses into a single 'Part 2' bucket without exposing offense detail, so we cannot recover criminal-damage / vandalism counts post-migration even though pre-migration PDI data carries them. Arson is similarly absent from STARS and was already near-zero on PDI. Both buckets would render asymmetric across the migration boundary, so we suppress them at the city page level rather than show a panel that goes flat after May 2024. The other 8 platform buckets are clean across the boundary.
  • Strangulation folded into aggravated assault. Ohio HB 1 created a stand-alone strangulation felony in 2023, and the new STARS feed splits it out as its own Part 1 Violent category. Pre-migration these incidents were charged as felonious assault and bucketed under PDI's AGGRAVATED ASSAULTS rollup. We fold STARS Strangulation back into the platform's aggravated-assault bucket so trends are continuous across the migration boundary.
  • City demographics use Hamilton County medians. Hamilton County (FIPS 39-061) is broader than Cincinnati city. Per-tract counts (population, households, housing units) are summed only across the ~108 city tracts (those whose polygon is at least 50% inside the city limits); medians (rent, home value, household income, age) come from the county because medians don't aggregate across tracts, so they differ slightly from what a Cincinnati-city-only median would be.
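The two-path bucket mapper and the strangulation fold described in the caveats above might look roughly like the sketch below. Almost everything in it is assumed for illustration: the tag values `"PDI"` and `"STARS"`, the `ucr_group` field name, and the exact category strings are hypothetical, and only a single bucket is shown. The source confirms only that routing uses a `report_type_code` tag, that STARS exposes `stars_category`, and that Strangulation is folded into aggravated assault.

```python
# Hypothetical vocabularies, one per source feed (illustrative strings only).
PDI_TO_BUCKET = {"AGGRAVATED ASSAULTS": "Aggravated Assault"}
STARS_TO_BUCKET = {
    "Aggravated Assault": "Aggravated Assault",
    # STARS reports strangulation separately; fold it back for continuity.
    "Strangulation": "Aggravated Assault",
}


def bucket_for(row):
    """Map one incident row to a platform bucket, or None if unmapped."""
    feed = row["report_type_code"]  # tag set at ingest
    if feed == "PDI":
        return PDI_TO_BUCKET.get(row.get("ucr_group"))
    if feed == "STARS":
        return STARS_TO_BUCKET.get(row.get("stars_category"))
    raise ValueError(f"unknown feed tag: {feed!r}")
```

Routing on a tag written at ingest keeps each vocabulary in its own table, so a change to one feed's coding does not touch the other path.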

Washington DC

41 neighborhoods · NIBRS-era data from 2018+ · reporting lag 7 days.

Data sources

  • Crime incidents. MPD Crime Incidents, per-year layers (DC Open Data, ArcGIS Hub): ArcGIS resource MPD/MapServer (per-year layers). NIBRS-era only (2018+).
  • Neighborhood polygons. 41 polygons. DC's 39 Original Neighborhood Clusters plus 2 of the 7 Additional clusters (Walter Reed and Saint Elizabeths, both with active residential redevelopment) — 41 page-bearing units total. The cluster set is maintained by the DC Office of Planning and aggregates the city's colloquial neighborhoods (Adams Morgan, Petworth, Capitol Hill, etc.) into statistical units. We label each page with the cluster's most colloquial constituent rather than the bureaucratic 'Cluster N' identifier MPD publishes; the constituent membership of each cluster is shown in the page header. The 5 federal-land / park / military clusters (Rock Creek Park, National Mall, Joint Base Anacostia-Bolling, Observatory Circle, Arboretum) are excluded from page identity (negligible residential population) but their geometry is retained so the city map renders without holes.
  • Population. US Census ACS 5-year (variable B01003_001E), summed across the city's tract crosswalk.
  • Census county. 11-001 (District of Columbia, FIPS 11-001 — DC is a single county-equivalent jurisdiction, coterminous with the city).

Analysis window

January 2018 to April 2026. Pre-2018 records are excluded. With the 7-day lag, the briefing currently shown reports on April 2026, which is earlier than calendar-current because we wait for the buffer to clear before treating a month as settled.

Forecast backtest — trained through 2024-12

We trained the forecast model on data ending 2024-12 and predicted the following 12 months. Coverage is the share of months whose actual count fell inside the 95% CI; MAPE is mean absolute percentage error; bias is mean of (point − actual): positive means we systematically over-predicted.

Category | Months | Coverage | MAPE | Bias / mo
Aggravated Assault | 12 | 91.7% | 15.3% | +3.4
Burglary | 12 | 91.7% | 41.1% | +16.8
Homicide | 12 | 25.0% | 135.6% | +9.5
Motor Vehicle Theft | 12 | 33.3% | 81.8% | +209.1
Other Larceny | 12 | 100.0% | 11.9% | +98.9
Robbery | 12 | 33.3% | 116.5% | +116.6
Sexual Assault | 12 | 100.0% | 69.5% | +2.3
Theft from Vehicle | 12 | 75.0% | 31.5% | +27.3

Spatial spillover of spike events

For every spike flag we've fired in Washington DC, we ask: did an adjacent neighborhood spike the same category in the next 3 months? The answer tells us whether shocks are local or regional. Adjacency is computed from the 41 neighborhood polygons (avg 5 neighbors per neighborhood). Rates are suppressed below 5 events.

Overall: across 72 historical spike events, an adjacent neighborhood spiked the same category within 3 months 31.9% of the time. Other larceny (46%) is the most regionally co-moving category; motor vehicle theft (0%) is the most local.

Washington DC historical spike-event spillover by crime category (3-month lookahead, adjacent neighborhoods via shared boundary).
Category | Spike events | Same-category spillover
Theft from vehicle | 15 | 0%
Other larceny | 50 | 46%
Motor vehicle theft | 7 | 0%

Window is forward-only: we ask whether a spike in neighborhood A was followed by a same-category spike in any of A's adjacent neighborhoods within the next 3 months. Co-occurring same-month spikes count. Categories with too few historical spikes for a stable rate (homicide, arson, and any other low-volume bucket) are listed but their rates are suppressed.

Washington DC-specific caveats

  • Part 1 index crimes only; no simple assault, vandalism, arson, or drug offenses. MPD's public ArcGIS feed publishes only the eight Part 1 UCR index categories (homicide, sex abuse, robbery, assault with a dangerous weapon, burglary, theft / other, theft from auto, motor vehicle theft). Simple assault, vandalism, arson, drug offenses, and the rest of UCR Part 2 are not exposed in this feed at all. Vandalism and arson are excluded at the city page level so neither bucket renders as flat-zero; the other 8 platform buckets are intact. Cross-city comparisons against NIBRS-coded peers should account for the missing Part 2 surface.
  • Per-year layers, dispatched at fetch time. Unlike the other ArcGIS-Hub city (Denver, single FeatureServer layer), MPD publishes one separate MapServer layer per calendar year (Layer 41 = 2026, Layer 7 = 2025, ... Layer 0 = 2018). Our ingest module's per-year dispatcher selects the correct layer by month. Pre-2018 layers exist upstream but are not ingested; the analysis window matches SF, Chicago, Seattle, and NYC at 2018-01.
  • Pages use cluster-lead colloquial names, not 'Cluster N'. MPD pre-attaches NEIGHBORHOOD_CLUSTER on each row as a literal 'Cluster 1', 'Cluster 18' string — useful for joining to the polygon set but unhelpful as a page identity. We translate each cluster to its most colloquial constituent neighborhood (Cluster 1 → Adams Morgan, Cluster 18 → Petworth, Cluster 27 → Navy Yard, etc.) per a hand-curated lead-name CSV; the page header shows the wider cluster footprint. Lookup is exact-match against the cluster identifier on every row, so spatial join is bypassed for ~99.9% of incidents.
  • 5 'Additional' clusters excluded from page identity. DC's polygon set has 46 clusters: 39 'Original' planning clusters plus 7 'Additional' clusters that cover federal land, parks, and a military base. We keep all 46 polygons in the city map (so the geographic coverage is complete and there are no holes), but only 41 get pages: the 39 Original clusters plus Walter Reed and Saint Elizabeths (both have active residential redevelopment). Rock Creek Park, National Mall, Joint Base Anacostia-Bolling, Observatory Circle, and the National Arboretum have negligible residential populations, and incidents in those polygons count in citywide rollups only.
  • City and county are coterminous. Washington DC is its own state, county, and city all at once (FIPS 11-001). Per-tract Census counts and county-level medians are both computed on the same boundary as the police data, so rate denominators don't carry the slight bias they do in cities like Oakland or Cincinnati where the county is broader than the city.
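The cluster-to-page-name translation in the caveats above amounts to an exact-match lookup with a spatial fallback. The sketch below uses only the three example mappings given in the text; the function name, the `lat`/`lng` field names, and the `spatial_join` callable are hypothetical stand-ins for the hand-curated CSV and the real join.

```python
# The three lead names quoted in the text; the full table lives in a hand-curated CSV.
LEAD_NAMES = {
    "Cluster 1": "Adams Morgan",
    "Cluster 18": "Petworth",
    "Cluster 27": "Navy Yard",
}


def page_name(row, spatial_join=None):
    """Resolve an incident row to its page name, or None if it has no page."""
    cluster = (row.get("NEIGHBORHOOD_CLUSTER") or "").strip()
    if cluster in LEAD_NAMES:
        return LEAD_NAMES[cluster]  # fast path: exact match, no geometry needed
    if not cluster and spatial_join is not None:
        # Fallback for rows with no pre-attached cluster.
        cluster = spatial_join(row["lat"], row["lng"])
        return LEAD_NAMES.get(cluster)
    return None  # e.g. an excluded cluster: counted in citywide rollups only
```

Because the fast path is a dictionary lookup, the point-in-polygon cost is paid only for the small share of rows that arrive without a cluster string.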

Denver

78 neighborhoods · NIBRS-era data from 2021+ · reporting lag 7 days.

Data sources

  • Crime incidents. DPD Crime Offenses (Denver Open Data, ArcGIS Hub): ArcGIS resource ODC_CRIME_OFFENSES_P/324. NIBRS-era only (2021+).
  • Neighborhood polygons. 78 polygons. Denver's 78 statistical neighborhoods, maintained by the city's Community Planning and Development department and published through Denver's ArcGIS Hub (item 3e57d472afbf4326867c1a4c9d4e7c91, layer 13). DPD pre-attaches the neighborhood ID on every crime row, so spatial join is bypassed for ~99.7% of incidents.
  • Population. US Census ACS 5-year (variable B01003_001E), summed across the city's tract crosswalk.
  • Census county. 08-031 (Denver County, coterminous with the city).

Analysis window

January 2021 to April 2026. Pre-2021 records are excluded. With the 7-day lag, the briefing currently shown reports on April 2026, which is earlier than calendar-current because we wait for the buffer to clear before treating a month as settled.

Forecast backtest — trained through 2024-12

We trained the forecast model on data ending 2024-12 and predicted the following 12 months. Coverage is the share of months whose actual count fell inside the 95% CI; MAPE is mean absolute percentage error; bias is mean of (point − actual): positive means we systematically over-predicted.

Category | Months | Coverage | MAPE | Bias / mo
Aggravated Assault | 12 | 83.3% | 6.4% | -3.4
Arson | 12 | 41.7% | 53.7% | -6.7
Burglary | 12 | 83.3% | 13.7% | +46.5
Homicide | 12 | 90.9% | 71.0% | +1.4
Motor Vehicle Theft | 12 | 100.0% | 12.7% | +49.8
Other Larceny | 12 | 91.7% | 4.0% | +13.3
Robbery | 12 | 58.3% | 24.2% | +16.9
Theft from Vehicle | 12 | 66.7% | 15.6% | -101.1
Vandalism | 12 | 75.0% | 8.2% | -26.1

Spatial spillover of spike events

For every spike flag we've fired in Denver, we ask: did an adjacent neighborhood spike the same category in the next 3 months? The answer tells us whether shocks are local or regional. Adjacency is computed from the 78 neighborhood polygons (avg 5.4 neighbors per neighborhood). Rates are suppressed below 5 events.

Overall: across 309 historical spike events, an adjacent neighborhood spiked the same category within 3 months 73.5% of the time. Other larceny (80.8%) is the most regionally co-moving category; theft from vehicle (0%) is the most local.

Denver historical spike-event spillover by crime category (3-month lookahead, adjacent neighborhoods via shared boundary).
Category | Spike events | Same-category spillover
Robbery | 7 | 0%
Aggravated assault | 25 | 56%
Burglary | 34 | 70.6%
Theft from vehicle | 7 | 0%
Other larceny | 219 | 80.8%
Vandalism | 17 | 70.6%

Window is forward-only: we ask whether a spike in neighborhood A was followed by a same-category spike in any of A's adjacent neighborhoods within the next 3 months. Co-occurring same-month spikes count. Categories with too few historical spikes for a stable rate (homicide, arson, and any other low-volume bucket) are listed but their rates are suppressed.

Denver-specific caveats

  • Sexual-assault incidents are not on the public feed. Denver Police Department redacts victim-bearing sex-offense rows from its public crime feed; only sex-offender-registration violations and similar non-victim categories appear. We exclude the sexual-assault bucket entirely on Denver pages rather than show a counter that's mostly zero. The other 9 platform buckets are intact.
  • 5-year rolling history; pre-2021 data is not retained upstream. DPD's public ArcGIS feed publishes the previous five calendar years plus the current year to date, with older rows dropped on each new-year rollover. Our analysis window is anchored at 2021-01 — that's the upstream floor at the time Denver was onboarded (May 2026). Cross-city comparisons against SF, Chicago, NYC, Seattle (which all anchor at 2018) only run from 2021 onward.
  • ArcGIS Hub feed, not Socrata. Denver publishes its crime data through ArcGIS Hub rather than Socrata — the same protocol shape used by Esri-based open-data programs. The ingest path uses a separate adapter (pipeline/src/ingest/_arcgis.py) but produces the same canonical incident shape every other city does. Update cadence is daily Monday through Friday; same-day incidents do appear in the feed but DPD notes that records become more accurate as investigations progress, so flags on the very latest month should be read with that lag in mind.
  • Neighborhood pre-attached on every row. DPD's feed includes a NEIGHBORHOOD_ID slug on every incident, joined upstream against Denver's 78 statistical neighborhoods. Less than 0.3% of rows arrive with a missing or unmatched neighborhood; for those we fall back to a point-in-polygon join against the same statistical-neighborhood polygon set. Spatial precision is therefore point-shaped, not polygon-shaped, for ~99.7% of incidents.
  • City and county are coterminous. Denver is a consolidated city-county (FIPS 08-031). Per-tract Census counts and county-level medians are both computed on the same boundary as the police data, so the rate denominators don't carry the slight bias they do in cities like Oakland or Cincinnati where the county is broader than the city.
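The fallback join for rows with a missing or unmatched NEIGHBORHOOD_ID can be illustrated with a basic ray-casting test. This is a simplified sketch: each neighborhood is a single ring of `(lng, lat)` vertices with no holes, coordinates are treated as planar, and the `lng`/`lat` field names and function names are hypothetical. A production join would more likely use a geometry library.

```python
def point_in_polygon(x, y, ring):
    """Ray casting: count edge crossings of a ray going in the +x direction."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_neighborhood(row, polygons):
    """Trust the pre-attached ID when it matches; otherwise fall back to geometry."""
    neighborhood_id = row.get("NEIGHBORHOOD_ID")
    if neighborhood_id in polygons:
        return neighborhood_id
    for name, ring in polygons.items():
        if point_in_polygon(row["lng"], row["lat"], ring):
            return name
    return None
```

With two unit squares side by side, a row that carries a valid ID keeps it regardless of its coordinates, while a row with no ID is assigned by where its point falls.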

Los Angeles

114 neighborhoods · NIBRS-era data from 2020+ · reporting lag 14 days.

Data sources

  • Crime incidents. los-angeles — Socrata resource k7nn-b2ep. NIBRS-era only (2020+).
  • Neighborhood polygons. 114 polygons. Hand-curated tract groupings, dissolved from US Census TIGER tract polygons. The crosswalk is reviewed manually as misassignments surface.
  • Population. US Census ACS 5-year (variable B01003_001E), summed across the city's tract crosswalk.
  • Census county. 06-037.
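
Summing tract populations across a crosswalk can be sketched like this. The tract IDs, population figures, function names, and the per-100,000 scaling are all assumptions made for illustration; only the idea of summing the ACS total-population variable over a neighborhood's tracts comes from the text above.

```python
from collections import defaultdict


def neighborhood_population(tract_to_neighborhood, tract_population):
    """Sum tract-level population counts into neighborhood totals."""
    totals = defaultdict(int)
    for tract, hood in tract_to_neighborhood.items():
        totals[hood] += tract_population.get(tract, 0)
    return dict(totals)


def rate_per_100k(count, population):
    """Illustrative denominator use; the per-100,000 scale is an assumption."""
    if population <= 0:
        return None
    return count * 100_000 / population


# Hypothetical crosswalk and ACS counts.
CROSSWALK = {"t1": "Watts", "t2": "Watts", "t3": "Florence"}
ACS_POP = {"t1": 1000, "t2": 1500, "t3": 4000}

POP_BY_HOOD = neighborhood_population(CROSSWALK, ACS_POP)
```

Counts add cleanly across tracts, which is why population can be summed per neighborhood.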

Analysis window

January 2020 to April 2026. Pre-2020 records are excluded. The 14-day lag means the briefing currently shown reports on April 2026, which is earlier than the calendar-current month because we wait for the buffer to clear before treating a month as settled.
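
One way to read the lag rule is sketched below. It assumes a month counts as settled once the lag has fully elapsed after the month's last day; the function name and that exact cutoff are assumptions, not the pipeline's documented behavior.

```python
from datetime import date, timedelta


def latest_settled_month(today: date, lag_days: int) -> tuple[int, int]:
    """Return (year, month) of the newest month whose lag buffer has cleared."""
    year, month = today.year, today.month
    while True:
        # Step back one calendar month; the current month is never complete.
        month -= 1
        if month == 0:
            year, month = year - 1, 12
        # Last day of the month = first day of the next month minus one day.
        if month == 12:
            first_of_next = date(year + 1, 1, 1)
        else:
            first_of_next = date(year, month + 1, 1)
        month_end = first_of_next - timedelta(days=1)
        if month_end + timedelta(days=lag_days) <= today:
            return year, month
```

Under this reading, a 14-day lag evaluated on 20 May 2026 yields April 2026, while a 30-day lag on the same day yields March 2026.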

Forecast backtest — trained through 2025-12

We trained the forecast model on data ending 2025-12 and predicted the following 12 months. Coverage is the share of months whose actual count fell inside the 95% CI; MAPE is mean absolute percentage error; bias is mean of (point − actual): positive means we systematically over-predicted.

Category              Months  Coverage  MAPE     Bias / mo
Aggravated Assault    12      60.0%     897.0%   +307.1
Arson                 12      100.0%    23.9%    +1.5
Burglary              12      80.0%     931.2%   +191.6
Homicide              12      100.0%    38.8%    +3.6
Motor Vehicle Theft   12      80.0%     668.0%   +115.9
Other Larceny         12      80.0%     867.8%   +319.9
Robbery               12      60.0%     656.5%   +97.8
Sexual Assault        12      75.0%     49.6%    +53.1
Theft from Vehicle    12      80.0%     674.8%   +64.1
Vandalism             12      40.0%     1807.7%  +694.4
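
The three backtest columns can be computed as in the sketch below. It follows the definitions given above (coverage inside the 95% CI, mean absolute percentage error, mean of point minus actual); the function name, the argument layout, the choice to skip zero-actual months in MAPE, and the sample numbers are assumptions for illustration.

```python
def backtest_metrics(actual, point, lower, upper):
    """Return (coverage %, MAPE %, bias per month) for one category."""
    n = len(actual)
    # Coverage: share of months whose actual count fell inside the interval.
    inside = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    coverage = 100.0 * inside / n
    # MAPE: undefined when the actual is zero, so those months are skipped here.
    pct_errors = [abs(p - a) / a for a, p in zip(actual, point) if a != 0]
    mape = 100.0 * sum(pct_errors) / len(pct_errors) if pct_errors else None
    # Bias: mean of (point - actual); positive means over-prediction.
    bias = sum(p - a for a, p in zip(actual, point)) / n
    return coverage, mape, bias


# Hypothetical four-month example.
COVERAGE, MAPE, BIAS = backtest_metrics(
    actual=[100, 120, 80, 100],
    point=[110, 114, 88, 100],
    lower=[90, 100, 85, 90],
    upper=[130, 128, 95, 110],
)
```

Because each percentage error divides by the actual count, months with small actual counts weigh heavily on MAPE even when the absolute miss is modest.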

Spatial spillover of spike events

For every spike flag we've fired in Los Angeles, we ask: did an adjacent neighborhood spike the same category in the next 3 months? The answer tells us whether shocks are local or regional. Adjacency is computed from the 114 neighborhood polygons (avg 4.7 neighbors per neighborhood). Rates are suppressed below 5 events.

Overall: across 3446 historical spike events, an adjacent neighborhood spiked the same category within 3 months 70.2% of the time. Other larceny (83.7%) is the most regionally co-moving category; arson (0%) is the most local.

Los Angeles historical spike-event spillover by crime category (3-month lookahead, adjacent neighborhoods via shared boundary).
Category              Spike events  Same-category spillover
Homicide              10            30%
Robbery               216           43.5%
Aggravated assault    320           42.8%
Sexual assault        129           62%
Burglary              386           60.6%
Theft from vehicle    386           70.7%
Other larceny         1044          83.7%
Motor vehicle theft   321           67.9%
Vandalism             625           81%
Arson                 9             0%

Window is forward-only: we ask whether a spike in neighborhood A was followed by a same-category spike in any of A's adjacent neighborhoods within the next 3 months. Co-occurring same-month spikes count. Categories with too few historical spikes for a stable rate (homicide, arson, and any other low-volume bucket) are listed but their rates are suppressed.
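
The spillover question can be sketched as a set lookup over spike events. Everything here is illustrative: the event tuples, the month-index encoding, the function name, and the reading of "next 3 months, same month included" as offsets 0 through 3 are assumptions rather than the engine's confirmed logic.

```python
def spillover_rates(spikes, adjacency, lookahead=3, min_events=5):
    """Per category: (spike events, share followed by an adjacent same-category spike).

    spikes: iterable of (neighborhood, category, month_index) tuples.
    adjacency: dict mapping a neighborhood to the neighborhoods sharing a boundary.
    The rate is None when a category has fewer than min_events spikes.
    """
    spike_set = set(spikes)
    events, hits = {}, {}
    for hood, cat, month in spike_set:
        events[cat] = events.get(cat, 0) + 1
        # Forward-only window; offset 0 lets co-occurring same-month spikes count.
        followed = any(
            (nbr, cat, month + k) in spike_set
            for nbr in adjacency.get(hood, ())
            for k in range(lookahead + 1)
        )
        if followed:
            hits[cat] = hits.get(cat, 0) + 1
    return {
        cat: (n, hits.get(cat, 0) / n if n >= min_events else None)
        for cat, n in events.items()
    }


# Hypothetical three-neighborhood chain: A - B - C.
ADJ = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
SPIKES = [("A", "robbery", 0), ("B", "robbery", 2), ("C", "robbery", 10)]
```

In this toy data only the spike in A is followed by an adjacent spike (B, two months later). The spike in B is not counted as followed by A's earlier spike, because the window never looks backward.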

Los Angeles-specific caveats

  • March 2024 NIBRS cutover; counts run below LAPD's headline totals. LAPD's legacy 2nrs-mtv8 feed (UCR-coded, with lat/lng) froze after a late-2024 cyber incident; from March 2024 onward the city publishes NIBRS-coded incidents through y8y3-fqfu (and k7nn-b2ep for 2026+). The replacement feed has been shipping fewer rows than LAPD's command-staff statistics show, so our citywide totals run roughly 10 to 20 percent below LAPD's headline numbers from March 2024 onward. Pre-2024 totals match LAPD's published figures within 1 to 2 percent. Trend direction (up, down, sustained) is still meaningful inside the post-cutover window, but absolute levels are not directly comparable to LAPD's annual press-release figures.
  • NIBRS feed has no coordinates; location resolves through reporting districts. The NIBRS feeds expose rpt_dist_no (LAPD's ~1,135 reporting districts) but no lat/lng. We resolve neighborhood through a hand-built RD-to-Mapping-LA crosswalk (1,129 of 1,135 RDs mapped). Spatial precision is therefore RD-shaped rather than point-shaped from March 2024 onward, which compresses some incidents into the dominant neighborhood for each RD. The cleanest practical effect: small high-profile polygons like Watts (10 RDs) report less of their adjacent-block activity than they did under the lat/lng-based legacy join, while neighbors with more RDs (Green Meadows, Vermont Vista, Florence) absorb more of it.
  • Pre-2024 records use UCR vocabulary; bucket mapper covers both. Records from 2020-01 through 2024-02 carry the legacy UCR crm_cd_desc field; 2024-03 onward carry the NIBRS code. Our LA bucket mapper has two paths (one per vocabulary) and routes by the report_type_code tag set at ingest. The 10 platform buckets are equivalent across both vocabularies; the source labels are different.
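
The two-vocabulary routing and the reporting-district lookup described above might look roughly like this. The field names crm_cd_desc, report_type_code, and rpt_dist_no come from the text; the tag values "ucr" and "nibrs", the nibrs_code field name, the lookup tables, the RD number, and the function names are hypothetical placeholders.

```python
# Hypothetical lookup tables; a real mapper would cover every source label.
UCR_DESC_TO_BUCKET = {
    "ROBBERY": "Robbery",
    "BURGLARY": "Burglary",
}
NIBRS_CODE_TO_BUCKET = {
    "120": "Robbery",   # NIBRS offense code for robbery
    "220": "Burglary",  # NIBRS offense code for burglary / breaking and entering
}
# Hypothetical RD-to-neighborhood crosswalk entry; the RD number is made up.
RD_TO_NEIGHBORHOOD = {"9999": "Watts"}


def map_bucket(row):
    """Route to the right vocabulary using the tag set at ingest."""
    vocab = row.get("report_type_code")
    if vocab == "ucr":
        label = (row.get("crm_cd_desc") or "").strip().upper()
        return UCR_DESC_TO_BUCKET.get(label)
    if vocab == "nibrs":
        return NIBRS_CODE_TO_BUCKET.get(row.get("nibrs_code"))
    return None  # unknown vocabulary: leave unbucketed


def resolve_rd(row):
    """NIBRS rows carry no coordinates, so the neighborhood comes from the RD."""
    return RD_TO_NEIGHBORHOOD.get(row.get("rpt_dist_no"))  # None if the RD is unmapped
```

Both paths return the same platform bucket names, so downstream code does not need to know which vocabulary a row arrived in.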

New York

59 neighborhoods · NIBRS-era data from 2018+ · reporting lag 14 days.

Data sources

  • Crime incidents. new-york — Socrata resource qgea-i56i. NIBRS-era only (2018+).
  • Neighborhood polygons. 59 polygons. Hand-curated tract groupings, dissolved from US Census TIGER tract polygons. The crosswalk is reviewed manually as misassignments surface.
  • Population. US Census ACS 5-year (variable B01003_001E), summed across the city's tract crosswalk.
  • Census counties. 36-061, 36-047, 36-005, 36-081, 36-085 (the five boroughs).

Analysis window

January 2018 to March 2026. Pre-2018 records are excluded. The 14-day lag means the briefing currently shown reports on March 2026, which is earlier than the calendar-current month because we wait for the buffer to clear before treating a month as settled.

Forecast backtest

We trained the forecast model on data ending the previous year and predicted the following 12 months. Coverage is the share of months whose actual count fell inside the 95% CI; MAPE is mean absolute percentage error; bias is mean of (point − actual): positive means we systematically over-predicted.

Category  Months  Coverage  MAPE  Bias / mo
No backtest results yet for New York. Run pipeline backtest --city new-york.

Spatial spillover of spike events

For every spike flag we've fired in New York, we ask: did an adjacent neighborhood spike the same category in the next 3 months? The answer tells us whether shocks are local or regional. Adjacency is computed from the 59 neighborhood polygons (avg 3.8 neighbors per neighborhood). Rates are suppressed below 5 events.

Overall: across 1282 historical spike events, an adjacent neighborhood spiked the same category within 3 months 61.8% of the time. Motor vehicle theft (75.7%) is the most regionally co-moving category; arson (0%) is the most local.

New York historical spike-event spillover by crime category (3-month lookahead, adjacent neighborhoods via shared boundary).
Category              Spike events  Same-category spillover
Homicide              11            0%
Robbery               117           70.1%
Aggravated assault    442           67.4%
Sexual assault        276           67%
Burglary              45            26.7%
Theft from vehicle    9             0%
Other larceny         133           30.8%
Motor vehicle theft   230           75.7%
Vandalism             10            0%
Arson                 9             0%

Window is forward-only: we ask whether a spike in neighborhood A was followed by a same-category spike in any of A's adjacent neighborhoods within the next 3 months. Co-occurring same-month spikes count. Categories with too few historical spikes for a stable rate (homicide, arson, and any other low-volume bucket) are listed but their rates are suppressed.
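
As a consistency check, the overall New York figure can be recovered from the per-category rows as an event-weighted average. The numbers below are copied from the table above; treating the overall rate as this weighted average is an assumption that the published figures happen to support.

```python
# (spike events, same-category spillover %) per category, from the New York table.
NY_SPILLOVER = {
    "Homicide": (11, 0.0),
    "Robbery": (117, 70.1),
    "Aggravated assault": (442, 67.4),
    "Sexual assault": (276, 67.0),
    "Burglary": (45, 26.7),
    "Theft from vehicle": (9, 0.0),
    "Other larceny": (133, 30.8),
    "Motor vehicle theft": (230, 75.7),
    "Vandalism": (10, 0.0),
    "Arson": (9, 0.0),
}


def overall_rate(table):
    """Return (total events, event-weighted spillover %) across categories."""
    total = sum(n for n, _ in table.values())
    hits = sum(n * pct / 100 for n, pct in table.values())
    return total, 100 * hits / total


NY_TOTAL, NY_OVERALL = overall_rate(NY_SPILLOVER)
```

The event counts sum to the 1282 spikes quoted in the overall line, and the weighted rate lands within rounding of the quoted 61.8%. High-volume categories such as aggravated assault dominate the overall figure.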

Oakland

35 neighborhoods · NIBRS-era data from 2021+ · reporting lag 30 days.

Data sources

  • Crime incidents. OPD CrimeWatch (Oakland Open Data) — Socrata resource ppgh-7dqv. NIBRS-era only (2021+).
  • Neighborhood polygons. 35 polygons. Hand-curated tract groupings, dissolved from US Census TIGER tract polygons. The crosswalk is reviewed manually as misassignments surface.
  • Population. US Census ACS 5-year (variable B01003_001E), summed across the city's tract crosswalk.
  • Census county. 06-001 (Alameda County — broader than the city; only city tracts are summed).

Analysis window

January 2021 to March 2026. Pre-2021 records are excluded. The 30-day lag means the briefing currently shown reports on March 2026, which is earlier than the calendar-current month because we wait for the buffer to clear before treating a month as settled.

Forecast backtest — trained through 2025-12

We trained the forecast model on data ending 2025-12 and predicted the following 12 months. Coverage is the share of months whose actual count fell inside the 95% CI; MAPE is mean absolute percentage error; bias is mean of (point − actual): positive means we systematically over-predicted.

Category              Months  Coverage  MAPE     Bias / mo
Aggravated Assault    12      80.0%     61.5%    +12.6
Arson                 12      60.0%     70.4%    +1.4
Burglary              12      60.0%     112.0%   +23.0
Homicide              12      80.0%     63.3%    +16.0
Motor Vehicle Theft   12      80.0%     60.1%    +8.5
Other Larceny         12      40.0%     157.0%   +87.5
Robbery               12      80.0%     127.6%   +62.4
Sexual Assault        12      40.0%     243.6%   +11.9
Theft from Vehicle    12      80.0%     1128.4%  +299.9
Vandalism             12      80.0%     98.4%    -89.7

Spatial spillover of spike events

For every spike flag we've fired in Oakland, we ask: did an adjacent neighborhood spike the same category in the next 3 months? The answer tells us whether shocks are local or regional. Adjacency is computed from the 35 neighborhood polygons (avg 5.1 neighbors per neighborhood). Rates are suppressed below 5 events.

Overall: across 62 historical spike events, an adjacent neighborhood spiked the same category within 3 months 48.4% of the time. Other larceny (65.2%) is the most regionally co-moving category; homicide (0%) is the most local.

Oakland historical spike-event spillover by crime category (3-month lookahead, adjacent neighborhoods via shared boundary).
Category              Spike events  Same-category spillover
Homicide              11            0%
Burglary              1             too few
Theft from vehicle    4             too few
Other larceny         46            65.2%

Window is forward-only: we ask whether a spike in neighborhood A was followed by a same-category spike in any of A's adjacent neighborhoods within the next 3 months. Co-occurring same-month spikes count. Categories with too few historical spikes for a stable rate (homicide, arson, and any other low-volume bucket) are listed but their rates are suppressed.

Oakland-specific caveats

  • OPD migrated to NIBRS in 2021. Pre-2021 OPD records use the older Summary UCR taxonomy and aren't directly comparable to NIBRS-era counts. The analysis window starts 2021-01 — pre-2021 data is excluded from baselines, anomaly detection, and forecasts.
  • City demographics use Alameda County medians. The city-profile page sums tract-level ACS counts for Oakland city specifically (population, households, housing units). Median values (median rent, home value, household income, age) come from Alameda County because medians don't aggregate across tracts. The county is broader than Oakland city, so those medians lean slightly higher than a true Oakland-city-only median would.
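
A small worked example shows why medians are not aggregated across tracts while counts are. The two tracts and their incomes are invented for illustration; only Python's standard statistics.median is used.

```python
from statistics import median

# Hypothetical household incomes (in thousands) for two tracts of unequal size.
TRACT_A = [30, 40, 50]
TRACT_B = [60, 70, 80, 90, 100, 110, 120]

# Counts aggregate cleanly: the combined household count is just the sum.
HOUSEHOLDS = len(TRACT_A) + len(TRACT_B)

# Medians do not: the median of tract medians ignores how many households each tract has.
MEDIAN_OF_MEDIANS = median([median(TRACT_A), median(TRACT_B)])
POOLED_MEDIAN = median(TRACT_A + TRACT_B)
```

Here the tract medians are 40 and 90, so their median is 65, while the median over all ten households is 75. Recovering the true figure needs the underlying distribution, which tract-level median tables do not provide.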

Seattle

20 neighborhoods · NIBRS-era data from 2018+ · reporting lag 7 days.

Data sources

  • Crime incidents. seattle — Socrata resource tazs-3rd5. NIBRS-era only (2018+).
  • Neighborhood polygons. 20 polygons. Hand-curated tract groupings, dissolved from US Census TIGER tract polygons. The crosswalk is reviewed manually as misassignments surface.
  • Population. US Census ACS 5-year (variable B01003_001E), summed across the city's tract crosswalk.
  • Census county. 53-033.

Analysis window

January 2018 to April 2026. Pre-2018 records are excluded. The 7-day lag means the briefing currently shown reports on April 2026, which is earlier than the calendar-current month because we wait for the buffer to clear before treating a month as settled.

Forecast backtest — trained through 2024-12

We trained the forecast model on data ending 2024-12 and predicted the following 12 months. Coverage is the share of months whose actual count fell inside the 95% CI; MAPE is mean absolute percentage error; bias is mean of (point − actual): positive means we systematically over-predicted.

Category              Months  Coverage  MAPE     Bias / mo
Aggravated Assault    12      83.3%     10.6%    +23.1
Arson                 12      91.7%     47.8%    +2.3
Burglary              12      91.7%     13.1%    +62.4
Homicide              12      58.3%     118.3%   +2.9
Motor Vehicle Theft   12      0.0%      61.6%    +287.7
Other Larceny         12      100.0%    8.1%     -8.4
Robbery               12      58.3%     24.8%    +26.9
Sexual Assault        12      100.0%    13.0%    -5.1
Theft from Vehicle    12      91.7%     10.1%    +87.2
Vandalism             12      58.3%     17.6%    +91.6
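
With 12 backtest months, coverage can only move in steps of one twelfth. The sketch below assumes coverage is hits divided by months, rounded to one decimal; the function name is hypothetical.

```python
MONTHS = 12


def coverage_pct(hits, months=MONTHS):
    """Coverage as a percentage, rounded to one decimal place."""
    return round(100 * hits / months, 1)


# Every coverage value a 12-month backtest can produce under this assumption.
COVERAGE_STEPS = {hits: coverage_pct(hits) for hits in range(MONTHS + 1)}
```

Under that assumption, the values in the Seattle table map to whole-month hit counts: 58.3% is 7 of 12, 83.3% is 10 of 12, 91.7% is 11 of 12, and 0.0% means the actual count fell outside the interval in every month.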

Spatial spillover of spike events

For every spike flag we've fired in Seattle, we ask: did an adjacent neighborhood spike the same category in the next 3 months? The answer tells us whether shocks are local or regional. Adjacency is computed from the 20 neighborhood polygons (avg 3.2 neighbors per neighborhood). Rates are suppressed below 5 events.

Overall: across 205 historical spike events, an adjacent neighborhood spiked the same category within 3 months 54.1% of the time. Motor vehicle theft (83.5%) is the most regionally co-moving category; vandalism (0%) is the most local.

Seattle historical spike-event spillover by crime category (3-month lookahead, adjacent neighborhoods via shared boundary).
Category              Spike events  Same-category spillover
Robbery               41            65.9%
Aggravated assault    4             too few
Sexual assault        3             too few
Burglary              3             too few
Other larceny         35            37.1%
Motor vehicle theft   85            83.5%
Vandalism             34            0%

Window is forward-only: we ask whether a spike in neighborhood A was followed by a same-category spike in any of A's adjacent neighborhoods within the next 3 months. Co-occurring same-month spikes count. Categories with too few historical spikes for a stable rate (homicide, arson, and any other low-volume bucket) are listed but their rates are suppressed.
