Dark · no public log
Lit · Texas (scanned TIFF)
Lit · New Mexico (digital, ≥2002)
Asymmetric definitions. See step 05.
Project 03 · Subsurface · Walk 02

Walking the Permian.

The most capital-dense oil basin in North America. 393,073 drilled wells across 32 counties in Texas and New Mexico. We went looking for what has been logged and what has been forgotten. What we found was a lesson in measurement.

01 · The question

How much of the Permian is dark?

Kansas, the first basin in this series, is 94% dark. The interesting question isn't whether Permian is worse or better. It's whether a basin this economically central, with operators this sophisticated, still hides most of its own subsurface record.

"Dark" here means the same thing it meant for Kansas: no publicly accessible log for that well. The test is what a determined analyst could recover without paying a commercial data vendor, so the lit figure is a floor on what exists.

02 · The basin

Two states. Thirty-two counties.

28 counties in Texas (Andrews through Yoakum) plus 4 in New Mexico: Lea, Eddy, Chaves, Roosevelt. The county union is the defensible Permian boundary, more auditable than any single basin outline.

The state line runs down the middle of the Delaware Basin. It is also a data boundary.

03 · Texas arrives

1.4 million wells statewide. 294,225 in the Permian.

The Texas Railroad Commission publishes a live ArcGIS feed. Layer 1 is every well location in the state. We filter to the Permian bounding box (bbox), deduplicate surface holes, and spatial-join the result to county polygons.
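The three steps above can be sketched in plain Python. This is a minimal illustration, not the project's pipeline: the bbox values, the `api`/`lon`/`lat` keys, deduplication by API number, and the single-ring county polygons are all assumptions made for the example.

```python
# Hypothetical bounding box (lon_min, lat_min, lon_max, lat_max); not the project's actual extent.
PERMIAN_BBOX = (-105.0, 30.0, -100.0, 34.5)


def in_bbox(lon, lat, bbox=PERMIAN_BBOX):
    """True when the point falls inside the rectangular extent."""
    lon_min, lat_min, lon_max, lat_max = bbox
    return lon_min <= lon <= lon_max and lat_min <= lat <= lat_max


def point_in_ring(lon, lat, ring):
    """Ray-casting point-in-polygon test against one ring of (lon, lat) vertices."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Count edges that a ray heading east from the point would cross.
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def assign_counties(wells, counties):
    """Filter to the bbox, keep the first row per API number, attach a county name.

    wells: iterable of dicts with 'api', 'lon', 'lat' (assumed field names).
    counties: dict mapping county name to a single ring of (lon, lat) vertices.
    Wells that fall in no county polygon are dropped.
    """
    seen = set()
    kept = []
    for well in wells:
        if not in_bbox(well["lon"], well["lat"]):
            continue
        if well["api"] in seen:
            continue  # duplicate surface hole (assumed to share an API number)
        seen.add(well["api"])
        county = next(
            (name for name, ring in counties.items()
             if point_in_ring(well["lon"], well["lat"], ring)),
            None,
        )
        if county is not None:
            kept.append({**well, "county": county})
    return kept
```

A real run would more likely lean on a geospatial library for the join; the hand-rolled test is here only to keep the sketch dependency-free.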

What the layer returns: API number, coordinates, symbol code. What it does not return: spud date, completion date, operator, status. For those, the RRC points to a separately maintained ASCII dump behind a JSF portal.

So the live answer to "when was this well drilled?" is: we don't know. We keep going.

04 · New Mexico arrives

98,848 wells. One layer. Everything on it.

OCD's API_Export MapServer returns each well with operator, spud year, status, plug date, total depth, a link to the well-details page, and a link to the scanned-files portal. Forty fields per row.
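Pulling a full layer from an ArcGIS MapServer usually means paging through its `query` endpoint. A minimal sketch of the request URLs, assuming the service supports pagination: the endpoint URL and layer index below are placeholders (the text does not give them), and `outFields="*"` stands in for the forty field names, which are not listed here.

```python
from urllib.parse import urlencode

# Placeholder endpoint and layer index; the real service URL is not given in the text.
LAYER_URL = "https://example.invalid/arcgis/rest/services/API_Export/MapServer/0"


def page_query_url(offset, page_size=1000, bbox=None, layer_url=LAYER_URL):
    """Build one paged ArcGIS REST query URL, optionally clipped to a lon/lat envelope."""
    params = {
        "where": "1=1",            # no attribute filter
        "outFields": "*",          # every field the layer exposes
        "returnGeometry": "true",
        "outSR": "4326",           # return coordinates as WGS84 lon/lat
        "f": "json",
        "resultOffset": str(offset),
        "resultRecordCount": str(page_size),
    }
    if bbox is not None:
        params.update({
            "geometry": ",".join(str(v) for v in bbox),  # xmin,ymin,xmax,ymax
            "geometryType": "esriGeometryEnvelope",
            "inSR": "4326",
            "spatialRel": "esriSpatialRelIntersects",
        })
    return f"{layer_url}/query?{urlencode(params)}"


def page_offsets(total, page_size=1000):
    """Offsets needed to walk `total` records in pages of `page_size`."""
    return list(range(0, total, page_size))
```

At 1,000 records per page, the 98,848 New Mexico wells would take 99 requests. The actual page limit is set by the server, so the page size here is only a guess.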

Different state, different IT shop, different answer to the same question. The asymmetry is the first finding, not an inconvenience to flatten.

05 · What "lit" means

Two definitions, same basin.

TX lit = API appears in RRC's Well Logs layer (a registry of scanned TIFFs in the imaged log system, ~236k wells statewide).

NM lit = spud year ≥ 2002, OCD's digital-filing-era floor. We can't use the per-well "files" URL as a lit proxy: every NM well has one, whether or not the portal has actual logs behind it.

The state boundary is bureaucratic, not geological. The same Wolfcamp rock gets a scanned paper log in Reeves County and a digital log in Eddy County because two different regulators made two different choices a generation ago.
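The two definitions reduce to two small predicates. A sketch under assumptions: the `state`, `api`, and `spud_year` keys are illustrative names, and the placeholder years are the ones named in step 08.

```python
NM_DIGITAL_FLOOR = 2002            # OCD's digital-filing-era floor, per the text
PLACEHOLDER_YEARS = {1900, 9999}   # placeholder spud years named in step 08


def tx_is_lit(api, log_registry_apis):
    """Texas: lit when the API number appears in the Well Logs registry."""
    return api in log_registry_apis


def nm_is_lit(spud_year):
    """New Mexico: lit when the spud year is known and at or after the floor."""
    if spud_year is None or spud_year in PLACEHOLDER_YEARS:
        return False  # unknown or placeholder years count as dark by default
    return spud_year >= NM_DIGITAL_FLOOR


def is_lit(well, log_registry_apis):
    """Dispatch on state; the record keys are assumed names, not the source schema."""
    if well["state"] == "TX":
        return tx_is_lit(well["api"], log_registry_apis)
    if well["state"] == "NM":
        return nm_is_lit(well.get("spud_year"))
    raise ValueError(f"unexpected state: {well['state']!r}")
```

Keeping the two rules as separate functions makes the asymmetry explicit instead of hiding it behind a single flag.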

06 · Counting logs, part 1

108,705 Texas log rows.

That's the row count in the RRC Well Logs layer, clipped to our bbox. On its face, it suggests a lit rate of 28%.

The next step is a lesson in what happens when you let a row count stand in for a well count.

07 · Counting logs, part 2

70,308

Unique wells with at least one log.

The 38,397-row difference is duplicate scans: the same API number with multiple log entries. Some horizontal wells have thousands of per-stage log records in the registry. Deduplicate, and the Texas lit rate drops from 28% to 20.8%.
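The row-versus-well distinction is a one-line difference in code. A small sketch with made-up API numbers, assuming each registry row carries the API number of the well it belongs to:

```python
from collections import Counter


def summarize_log_rows(api_numbers):
    """Compare registry rows with distinct wells for a sequence of API numbers."""
    per_well = Counter(api_numbers)
    rows = sum(per_well.values())
    wells = len(per_well)
    return {
        "rows": rows,
        "unique_wells": wells,
        "duplicate_rows": rows - wells,
        "max_rows_per_well": max(per_well.values(), default=0),
    }


# Toy registry with hypothetical API numbers: six rows, three wells.
toy_rows = ["00-001", "00-001", "00-001", "00-002", "00-003", "00-003"]
toy_summary = summarize_log_rows(toy_rows)

# The published figures obey the same identity: rows minus unique wells.
PUBLISHED_DUPLICATES = 108_705 - 70_308  # 38,397
```

Reporting `max_rows_per_well` alongside the totals is a cheap way to spot the wells that carry thousands of per-stage records.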

"Lit" is never one number. It's whatever definition survives the next level of scrutiny.

08 · New Mexico has no log layer

No registry. A proxy.

TX RRC publishes a separate Well Logs layer. NM OCD does not. The OCD imaging portal exists but has no bulk inventory we can query. Each well has a boilerplate deep-link into it, whether the portal holds a log behind that link or a blank shell.

We use a proxy: any NM well spudded in 2002 or later is counted as lit, based on OCD's own stated digital-filing-era floor. Wells with unknown or placeholder spud years (1900, 9999) are counted as dark by default.

32% of NM wells have a junk or unknown spud year. That's its own finding.

09 · The reveal

22.1%

Permian, lit.

87,027 of 393,073 wells. Texas at 20.8%, New Mexico at 26.0% by our proxy. Most of the lit wells sit on a narrow band through the Midland Basin and the Delaware: the horizontal-drilling campaigns of the last fifteen years.

Kansas was 94% dark. Permian is 78% dark. The gap is 16 points. Both basins hide most of their own record; they just hide different centuries of it.
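The headline figures can be checked from the counts given in the text. A short worked example; the per-state rates are the rounded published values, so the blended figure is only an approximate cross-check.

```python
TX_WELLS = 294_225
NM_WELLS = 98_848
TOTAL_WELLS = TX_WELLS + NM_WELLS   # 393,073
LIT_WELLS = 87_027

lit_pct = 100 * LIT_WELLS / TOTAL_WELLS
dark_pct = 100 - lit_pct

# Well-weighted blend of the rounded state rates (20.8% TX, 26.0% NM).
blended_pct = (20.8 * TX_WELLS + 26.0 * NM_WELLS) / TOTAL_WELLS

# Gap to Kansas, using the rounded dark percentages quoted in the text.
kansas_gap_points = 94 - round(dark_pct)

print(f"lit {lit_pct:.1f}%  dark {dark_pct:.1f}%  "
      f"blended {blended_pct:.1f}%  gap {kansas_gap_points} points")
# prints: lit 22.1%  dark 77.9%  blended 22.1%  gap 16 points
```

The direct ratio and the weighted blend agree to one decimal place, which is what you would expect if the basin rate is just the two state rates weighted by well count.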

10 · The state boundary

A data boundary, drawn at 103°W.

Texas wells are clipped to the Permian 28 and tinted amber when lit. New Mexico wells are tinted teal. Both datasets cross the same rock; the color change at the state line is not geological.

If you were analyzing the Wolfcamp as a single play, you'd pay for IHS or Enverus to flatten this boundary. That's the point: the boundary is not real, but the data cost of erasing it is.

11 · Vintage · New Mexico only

The 2010 inflection, visible.

We can only plot vintage for New Mexico. The Texas side is blank here by admission: no spud dates on the live feed.

pre-1980 · 28,096
1980s · 4,807
1990s · 6,736
2000s · 8,613
2010+ · 18,959
unknown · 31,637

The 1980s drop is the oil-price crash. The 2010+ spike is the horizontal transition. 32% of NM wells have no reliable spud year: another gap inside the gap.
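The vintage buckets are simple to reproduce from a spud year. A sketch assuming the same placeholder years named in step 08 and the bucket edges shown above; the counts are the ones in the table.

```python
PLACEHOLDER_YEARS = {1900, 9999}  # placeholder spud years named in step 08


def vintage_bucket(spud_year):
    """Map a spud year to one of the vintage labels used in the table."""
    if spud_year is None or spud_year in PLACEHOLDER_YEARS:
        return "unknown"
    if spud_year < 1980:
        return "pre-1980"
    if spud_year >= 2010:
        return "2010+"
    return f"{spud_year // 10 * 10}s"  # 1980s, 1990s, 2000s


# Counts as published in the vintage table.
VINTAGE_COUNTS = {
    "pre-1980": 28_096,
    "1980s": 4_807,
    "1990s": 6_736,
    "2000s": 8_613,
    "2010+": 18_959,
    "unknown": 31_637,
}

total_nm_wells = sum(VINTAGE_COUNTS.values())   # 98,848
unknown_pct = 100 * VINTAGE_COUNTS["unknown"] / total_nm_wells
```

The buckets sum to the 98,848 New Mexico wells, and the unknown share works out to 32.0%.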

12 · Where the lit wells live

The horizontal core.

Glasscock, TX · 37.6%
Martin, TX · 33.9%
Eddy, NM · 32.2%
Upton, TX · 31.6%
Midland, TX · 30.4%

The five most-lit counties are the Midland Basin core plus Eddy on the Delaware side: the exact geography of the horizontal campaign. Where operators are drilling today, logs follow. Where they aren't, logs don't.

13 · Where the lit wells don't

The shelf, the basin edge, the old verticals.

Roosevelt, NM · 4.5%
Hockley, TX · 6.3%
Chaves, NM · 8.5%
Winkler, TX · 8.9%
Sterling, TX · 10.3%

The five least-lit counties are shelf and transitional plays. Hockley and Winkler are old vertical country on the Central Basin Platform. Chaves and Roosevelt are NM's northern and eastern margins: less rock, older wells, less filing.
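County-grain rankings like those in steps 12 and 13 come from a group-by over classified wells. A minimal sketch, assuming each well record already carries `county`, `state`, and a boolean `lit` flag (illustrative key names); the toy records are invented and do not reproduce the published rates.

```python
from collections import defaultdict


def lit_rate_by_county(wells):
    """Return (county, state, lit_pct, well_count) tuples, most-lit first."""
    totals = defaultdict(int)
    lit = defaultdict(int)
    for well in wells:
        key = (well["county"], well["state"])
        totals[key] += 1
        lit[key] += 1 if well["lit"] else 0
    rates = [
        (county, state, 100 * lit[(county, state)] / count, count)
        for (county, state), count in totals.items()
    ]
    return sorted(rates, key=lambda row: row[2], reverse=True)


# Invented records for illustration only.
toy_wells = [
    {"county": "Alpha", "state": "TX", "lit": True},
    {"county": "Alpha", "state": "TX", "lit": False},
    {"county": "Beta", "state": "NM", "lit": False},
    {"county": "Beta", "state": "NM", "lit": False},
    {"county": "Beta", "state": "NM", "lit": True},
    {"county": "Beta", "state": "NM", "lit": False},
]
ranking = lit_rate_by_county(toy_wells)
most_lit = ranking[:5]
least_lit = ranking[-5:][::-1]
```

Carrying the well count with each rate matters at this grain: a small county can post an extreme percentage on very few wells.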

14 · The largest NM "operator"

15,601

PRE-ONGARD. Zero logs.

NM's largest operator-of-record isn't a company. It's a placeholder for every well whose operator was never migrated into OCD's ONGARD digital system when it came online. 15,601 wells, 0% lit by our proxy, 16% of the NM Permian.

PRE-ONGARD · 15,601
EOG Resources · 5,861
Devon Energy · 4,358
Mewbourne Oil · 3,468
COG Operating · 2,994

The named majors and independents cluster at 40–50% lit, roughly double the basin average. The horizontal-era operators are running a separate data regime.

15 · The tell

Dark data is what an industry stops caring about.

Kansas was structurally dark: old plays, shallow oil, nobody ever had the incentive. Permian is selectively dark. The horizontal Wolfcamp is well-logged because operators need the logs to model the next pad. The vertical century underneath it is dark because those logs aren't load-bearing for anyone's decisions.

The gap persists not despite massive economic pressure, but because the pressure points somewhere else.

The Permian is the rich state.

Kansas was the easy case: 94% dark, nobody surprised. Permian is the hard case. Well-resourced operators, sophisticated regulators, billions of dollars a month in active drilling, and still 78% of the historical record is unreachable to a public analyst. The series continues through the Appalachian, the Bakken, the SCOOP/STACK, and the D-J. Each basin has its own century it has stopped caring about.

Where this goes

78% dark is a cost, a liability, or a moat, depending on which side of the gap you're on.

  • Operators & modellers. Resource estimation, reservoir typing, and play-attribution models all degrade as the vertical section of the basin stays dark. Horizontal decisions still lean on vertical data.
  • Acquirers & underwriters. Asset diligence and orphan-liability pricing can't be clean without a reachable historical record. 15,601 PRE-ONGARD wells are a real number on a real balance sheet.
  • Regulators & policy teams. Orphan remediation, CCS site screening, produced-water routing, and induced-seismicity review all start by knowing where the holes are, not by assuming the obvious wells are enough.

I build the pipelines, county-grain analyses, and AI-assisted subsurface QC that close this kind of gap, across basins, regulators, and the geological detail that still matters. If you work any side of this, let's talk.

Sources: Texas RRC Public Viewer MapServer (layers 1, 2, 8) · New Mexico OCD API_Export MapServer · US Census 2010 TIGER county boundaries. Coordinates as published. Built with deck.gl and scrollama. Analysis: @salamituns.