Five basins and 1.92 million wells later, this is the basin we cannot show you. The Texas Railroad Commission (TX RRC) ships its bulk well data through a JSF session portal backed by EBCDIC mainframe dumps. The wells exist. The architecture decided we cannot count them.
The first five basins answered "how much": Williston 46% dark, Permian 78%, Anadarko 91%, Kansas 94%, Appalachia 95%. The numbers settled into a pattern. The shapes settled into a verdict. The pattern said horizontal eras don't rescue old basins; the verdict said most US oil and gas data is structurally dark.
Eagle Ford is the basin that answers the deeper question. Not how much is dark, but what dark IS. We chose this basin last on purpose. The architecture itself is the answer.
By the time we publish, the other five regulators have done their job: they shipped the wells in a form we can plot. TX RRC is the regulator that didn't.
The 2018 USGS Eagle Ford Group Assessment ships seven Continuous AU polygons covering the Cenomanian-Turonian Mudstone, the Eagle Ford Marl, and the Submarine Plateau Karnes Trough plays. They span 76 Texas counties from Maverick on the Mexico border up through Karnes and DeWitt and out to Brazos and Madison. The map you are looking at is the basin and the counties.
What is missing from this map: the wells. EIA estimates that more than 30,000 horizontal wells have been permitted in the Eagle Ford since 2008, alongside roughly a half-million conventional wells of every vintage drilled around them. None of them are dots on this page.
Compare to Williston, where we plotted 45,921 wells inside the basin polygon, or Anadarko, where we plotted 482,918. Same series, same code, same intent. Eagle Ford ships empty.
We probed the obvious paths first. TX RRC's ArcGIS REST host (gis2.rrc.texas.gov) is unreachable from this network. The HTML viewer at gis.rrc.texas.gov requires a Silverlight-era control. The HIFLD national oil/gas wells layer requires an ArcGIS Online token. The Texas Open Data Portal has no oil/gas wells dataset. The TX Bureau of Economic Geology does not publish a public well shapefile.
Each of these has a public equivalent in PA, OK, ND, KS, or WV that ships in a single click. The Texas versions do not.
TX RRC's MFT (Managed File Transfer) portal lists more than 200 direct download links to oil-and-gas datasets: Statewide API Data, Drilling Permit Master, Wellbore Query Data, Completion Information in Data Format. We tried them all.
Each link returns a JSF page with a five-minute session timer instead of a file. The page is generated by a PrimeFaces-stack web app. Download requires the session cookie, the JSF ViewState token, and a click on a control that POSTs back to the same JSF state. Vanilla HTTP cannot fetch the file.
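A script can at least tell the two outcomes apart. The sketch below classifies a response from its Content-Type header and body. The function name and the heuristics are illustrative assumptions, not documented TX RRC behavior. The one fixed point is the hidden form field: JSF names it `javax.faces.ViewState` (`jakarta.faces.ViewState` in newer Jakarta Faces releases).

```python
def classify_response(content_type: str, body: bytes) -> str:
    """Return 'jsf_page', 'html', or 'file' for an HTTP response.

    Heuristic only: a JSF page carries a hidden ViewState field that the
    browser must POST back, so its presence means we got a form, not data.
    """
    if b"javax.faces.ViewState" in body or b"jakarta.faces.ViewState" in body:
        return "jsf_page"
    if content_type.lower().startswith("text/html"):
        return "html"
    return "file"


# Made-up sample responses for illustration.
jsf_body = (
    b'<html><form><input type="hidden" '
    b'name="javax.faces.ViewState" value="abc"/></form></html>'
)
print(classify_response("text/html;charset=UTF-8", jsf_body))
print(classify_response("application/zip", b"PK\x03\x04"))
```

A check like this does not get the file. It only confirms, cheaply and without a browser, that the link handed back a session form instead of data.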
We do not run Selenium scrapers on regulator portals at scale. That is not a methodological purity claim; it is a cost claim. Each basin in this series shipped in 3 to 5 hours of pull, parse, and join. Texas would take 2 to 3 days of fragile session-replay code.
Behind the JSF portal sits an even older layer: EBCDIC fixed-width files originally produced for an IBM mainframe. The TX RRC dataset list shows EBCDIC versions of Statewide Oil Production, Statewide Gas Production, Historical Ledger Statewide Oil, Statewide Gas Ledger Districts 1 through 10, and Certificate of Authorization P-4. The ASCII versions are companions, not replacements.
You can decode EBCDIC. You can write a fixed-width parser. The economics of doing it for a research-journal basin walk do not pencil out, especially when North Dakota ships its entire well inventory as a 43 MB shapefile that opens in QGIS in three seconds.
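For scale, here is a minimal sketch of the decode-and-slice step. It assumes the files use IBM code page 037 (Python's built-in `cp037` codec) and a made-up three-field record layout. The real TX RRC layouts, field names, and code page are not given here and would have to come from the commission's own record-layout documentation. Packed-decimal fields, which mainframe dumps often contain, need separate handling and are not covered.

```python
from typing import Iterator

# Hypothetical layout: (field name, start offset, length), all in bytes.
LAYOUT = [("api_number", 0, 8), ("district", 8, 2), ("lease_name", 10, 20)]
RECORD_LEN = 30  # assumed fixed record length with no line terminators


def parse_records(raw: bytes, encoding: str = "cp037") -> Iterator[dict]:
    """Yield one dict per fixed-width record found in an EBCDIC byte string."""
    for offset in range(0, len(raw) - RECORD_LEN + 1, RECORD_LEN):
        # cp037 is a single-byte code page, so character offsets equal byte offsets.
        record = raw[offset:offset + RECORD_LEN].decode(encoding)
        yield {
            name: record[start:start + length].strip()
            for name, start, length in LAYOUT
        }


# Build a fake record by encoding ASCII text into EBCDIC, then parse it back.
sample = ("42123456" + "02" + "EXAMPLE LEASE".ljust(20)).encode("cp037")
rows = list(parse_records(sample))
print(rows[0])
```

The decode itself is one line. The cost sits in everything around it: recovering each file's layout, handling packed fields, and validating the output against something trustworthy.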
"The state did not migrate" is the cleanest way to describe what TX RRC is. Other regulators upgraded their data publishing posture between 1995 and 2015. TX kept the mainframe.
The Permian (basin 02) is the only basin that crossed Texas and survived. It survived because we relied on the New Mexico OCD dataset for the lit population, and because the eastern Delaware basin counties carry enough metadata to make the dark-share estimate without a TX RRC join.
Eight regulators publish bulk well data we can pull. One does not. The cooperative regulators are not all equally polished, but every one of them shipped the wells. The outlier is not on a continuum with the rest.
Score is a working analyst's rating: 100 = direct CMS download, no auth, no session. 0 = inaccessible without browser automation. The full table is in regulatory_comparison.json.
EIA's Drilling Productivity Report aggregates Eagle Ford rig count and production at the region level. We can see the rig count rise from under 50 in 2008 to a peak of 279 in May 2012. Oil production followed, peaking at 1.72 million barrels per day in March 2015. The 2014-2016 oil price collapse cut the rig count to 38 by 2016, but production only fell from 1.72 to 1.05 MMbbl/d. The wells that were already drilled kept producing.
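The gap between the two declines is easy to check with the figures quoted in this piece. The sketch below only restates them as percentages; the helper name is ours. The two series peak at different dates (rigs in May 2012, oil in March 2015), so this is a peak-to-trough comparison, not a same-month one.

```python
def pct_change(start: float, end: float) -> float:
    """Percentage change from start to end."""
    return (end - start) / start * 100.0


rig_change = pct_change(279, 38)      # peak rig count to the 2016 low
oil_change = pct_change(1.72, 1.05)   # peak MMbbl/d to the late-2016 low

print(f"rigs {rig_change:.1f}%, oil {oil_change:.1f}%")
# prints: rigs -86.4%, oil -39.0%
```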
The wells we cannot see produced between roughly 1.0 and 1.7 million barrels of oil a day for ten years. They are real. They are documented somewhere inside TX RRC. We just cannot pull them.
Eagle Ford's rig count went from 13 to 279 in 50 months. That is the fastest rig-count buildup of any basin in the EIA DPR data. The Marcellus took 60 months to add as many; the Williston took 70.
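Put as an average rate, the comparison looks like this. The sketch assumes "as many" means the same net addition of 266 rigs (279 minus 13) in each basin, which is our reading of the sentence above, not a figure taken from the DPR.

```python
RIGS_ADDED = 279 - 13  # net rigs added during the Eagle Ford buildup

# Months each basin took to add that many rigs, as quoted in the text.
months_to_add = {"Eagle Ford": 50, "Marcellus": 60, "Williston": 70}

rates = {basin: RIGS_ADDED / months for basin, months in months_to_add.items()}

for basin, rate in rates.items():
    print(f"{basin}: {rate:.2f} rigs added per month")
```

On these numbers the Eagle Ford added a little over five rigs a month on average, against roughly four for the other two.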
Production peaked in March 2015 at 1.72 MMbbl/d, fell to 1.05 in late 2016, and has held a 1.0 to 1.3 MMbbl/d plateau since 2018 with progressively fewer rigs needed to sustain it. The plateau is the operational success of the play. The plateau is also the record we cannot index.
A successful basin can also be a dark basin. The two are not in tension.
For each of the other five basins we ship a CSV of every basin-clipped well, six columns, 25 to 580 thousand rows. Eagle Ford has no equivalent. We chose this basin last on purpose, with the working assumption that we would have to ship without it. We confirm that here.
What we cannot show: where the wells are. What vintage they are. Which operator drilled which one. Which ones were horizontals targeting the Eagle Ford Marl in Karnes and which ones were 1960s vertical Wilcox wells in Live Oak.
If TX RRC publishes a bulk well shapefile, this basin gets a v2 with a wells.csv. The pipeline is built. The slot is empty.
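For readers unfamiliar with the step, a basin clip is a point-in-polygon filter: keep each well whose coordinates fall inside the basin boundary. The sketch below shows the idea with a ray-casting test in plain Python. It is not the series' actual code. The polygon is a toy rectangle, the well rows are invented, and the column names are assumptions. A real run would load the USGS AU polygons and would more likely use a GIS library.

```python
import csv
import io

# Toy boundary ring as (lon, lat) pairs; a stand-in for a real AU polygon.
TOY_BASIN = [(-99.0, 28.0), (-97.0, 28.0), (-97.0, 30.0), (-99.0, 30.0)]

# Invented rows with hypothetical column names.
WELLS_CSV = """well_id,lon,lat
A,-98.0,28.9
B,-101.0,31.0
"""


def point_in_polygon(lon: float, lat: float, ring: list) -> bool:
    """Ray casting: count boundary crossings of a ray running east from the point."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):  # this edge straddles the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def clip_wells(csv_text: str, ring: list) -> list:
    """Return the CSV rows whose coordinates fall inside the ring."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if point_in_polygon(float(row["lon"]), float(row["lat"]), ring)
    ]


clipped = clip_wells(WELLS_CSV, TOY_BASIN)
print([row["well_id"] for row in clipped])
```

The filter is the cheap part. Without a bulk well file to feed it, there is nothing to clip.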
The Eagle Ford horizontal era started in 2008 with EOG's first wells in La Salle County. The Wilcox sandstone gas plays of Live Oak and Bee counties date to the 1940s. Spraberry-Wilcox conventional oil in McMullen and Atascosa goes back to the 1930s. Frio gas in the southern counties has been drilled since the 1920s.
The 30,000 Eagle Ford horizontals are the lit layer. The 470,000 conventional wells underneath them are the dark floor. We know the floor exists from EIA aggregate counts and from individual operators' SEC filings. We do not know it well-by-well.
If you assume Eagle Ford follows the Anadarko shape (8% modern horizontals as a top layer, 92% conventional vertical legacy underneath), the basin's expected dark share is in the high 80s. We document the assumption, not the verdict.
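As a rough cross-check on the layer sizes, the aggregate counts quoted above can be turned into shares. This is a count-based sketch only: it treats every horizontal as lit and every conventional well as dark, which is a simplifying assumption and not the method behind the high-80s estimate.

```python
horizontals = 30_000     # modern horizontal wells, from the aggregate counts above
conventional = 470_000   # conventional legacy wells, from the same counts
total = horizontals + conventional

horizontal_share = horizontals / total   # compare with the 8% assumed for Anadarko
dark_ceiling = conventional / total      # dark share if no conventional well is lit

print(f"horizontal share: {horizontal_share:.0%}")
print(f"count-based dark ceiling: {dark_ceiling:.0%}")
# prints: horizontal share: 6%
# prints: count-based dark ceiling: 94%
```

The horizontal layer comes out slightly thinner than the Anadarko shape assumes. The 94% figure is a ceiling under the stated simplification. The high-80s estimate sits below it, and the sketch does not try to reproduce that estimate.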
The first five basins demonstrated that dark data is a measurable property of a basin's drilling history. The Williston's lit layer is genuinely big. Anadarko's is genuinely small. Kansas's pre-digital legacy is genuinely overwhelming.
This basin demonstrates the second property. Dark data is also a measurable property of the regulator. The drilling history of the Eagle Ford could be perfectly recorded inside TX RRC's databases (it largely is) and still be dark to the public, the academic community, the underwriting industry, and the policy teams that need it. The wells exist. The architecture is the question.
The dark-data problem is not solved by waiting for technology. It is not solved by waiting for the operators to file better. It is solved by the regulator choosing to ship.
Five measurable basins gave us a falsifiable claim: horizontal eras only rescue young basins where horizontals dominate the well count. Williston (46% dark) confirms it. Permian (78%), Anadarko (91%), Kansas (94%), Appalachia (95%) refute the alternative. The thesis holds across every basin where we can see the wells.
One unmeasurable basin gives us the second claim: the dark-data problem is structural. It does not respond to drilling-era timing. It responds to the regulator's data publishing choice. Texas chose. The choice is the basin.
The pattern is in the gap. The five legible basins land between 46% and 95% dark, set by drilling-era timing. The sixth basin lands at "we cannot see," set by regulatory choice. Both are dark-data verdicts. Both are real. Neither is in tension with the other.
The first five basins showed what the choice looked like under different drilling histories. Williston, where the horizontal era was big enough to overwhelm the legacy. Anadarko, where it was not. The output of the choice changes with the geology and with the calendar.
Eagle Ford shows what the choice looks like when it has been made. North Dakota chose a shapefile. Oklahoma chose a CSV. Pennsylvania chose a flag column. Kansas chose a survey. Texas chose to keep the mainframe.
That last sentence is the thesis of the entire series. We needed six basins to write it.
The US oil and gas industry's relationship to its own data is not a continuum of technical capacity. It is a binary product of regulatory choice, propagated through a calendar of drilling-era timing. Five regulators chose to publish what they had. One did not. The first five let us measure the dark fraction at varying levels (46 to 95 percent) and prove that horizontal eras only rescue the basins where horizontals dominate the well count. The sixth lets us prove that the dark-data problem is not a technology gap or an operator failure. It is a regulator's publishing posture. Drake's well went dark in 1861 because there was no system to remember it. The Eagle Ford's wells are dark in 2026 because the system that does remember them does not let anyone outside the building look in.
Where this series goes
I build the pipelines, basin-clip analyses, and AI-assisted subsurface QC that close this kind of gap, across basins, regulators, and the geological detail that still matters. If you work any side of this, let's talk.
Sources: USGS ScienceBase item 5d1246c2: Eagle Ford Group Assessment Unit boundaries. EIA Drilling Productivity Report Eagle Ford monthly time series, January 2007 onward. US Census TIGER 2024 cartographic boundary files for state and county outlines. Regulatory accessibility scores reflect the experience of pulling each basin in this six-basin series, January 2026 to April 2026. Built with deck.gl and scrollama. Analysis: @salamituns.