On reading construction drawings


A third kind of reading the AI industry has not built yet, and the field that needs it most.


A senior estimator pulled me aside in a preconstruction trailer last month and pointed at his monitor. He had just run a 32-page electrical plan set through an AI takeoff tool. The output looked clean. Symbol counts, fixture schedules, a green dashboard.

Then he scrolled to sheet E-207. A bank of eighteen receptacles sat along the east wall of a conference room, drawn the way half the drawings in this industry are drawn: a few explicit symbols, a repeat notation, an end symbol. The tool had counted zero of them.

"It saw the first one," he said. "Then it saw a pattern it didn't have a label for, and it just kept going."

A real Boon takeoff: dense magenta receptacle symbols and orange home run conduits laid over an electrical floor plan.

Eighteen receptacles became zero. If that mistake ships in a bid, the subcontractor eats the cost of eighteen device rough-ins, the wire, the termination labor, and a change-order argument with the GC about whose job it was to catch it. On a commercial building, one page missed this way can run to six figures.

The tool wasn't broken. The estimator wasn't using it wrong. The model did exactly what it was trained to do: detect a symbol it had seen many times before, and stop when the drawing departed from the textbook. That is the ceiling of an entire approach, and most construction AI is sitting against it right now. This post is about the model that should replace it.

A drawing isn't a photograph, and it isn't a document

Here is the way we have come to think about it after enough conversations with estimators, engineers, and the team building our model.

Language models reason over tokens. You feed them a document, they predict what comes next, they summarize, translate, answer questions. That layer has been won at foundation scale.

Vision models reason over pixels. You feed them an image, they segment objects, classify scenes, describe what they see. That layer has been won, too.

Neither one reasons over a building.

A construction drawing is not a photograph and it is not a document. It is a compressed, symbolic projection of a physical system onto a piece of paper. Every line has a load path behind it. Every duct has a flow, a fitting, a termination. Every panel has feeders coming in and branch circuits going out. The meaning of the drawing lives in the physical system it represents, not in the pixels on the page.

Reading a plan set the way a good estimator reads one is not "detect the object" or "summarize the text." It is reconstructing the building from its compressed representation. That is a different primitive than the two we already have foundation-scale answers for. Fei-Fei Li has called the next frontier spatial intelligence, and she is right about the shape of it. Construction is the field where the version of that work that matters most has to be built. The drawings are specific enough to define the problem, the dollars on the line are specific enough to fund it, and the contractors waiting for it have been waiting longer than most software people realize.

Detection is the demo, reconstruction is the deployment

Most construction AI you see today is object detection with a trade-specific vocabulary. It finds ducts. It counts panels. It labels fire alarm devices. The teams building those systems have done real work, and the detection layer has gotten very good. It is also not the right ceiling.

A takeoff that lists 847 ducts on a mechanical plan set and misses the 1,200 fittings those ducts require is not a partially correct takeoff. It is a takeoff that loses the contractor money. Fittings carry roughly half the labor in a ductwork install. A contractor who bids off detection-only output either loses the job to a sharper competitor or wins the job and loses the margin. Both endings look the same to the estimator, which is why the AI tool gets uninstalled three months later. Nobody writes a blog post about the uninstall.

The model that closes the gap does relational reasoning over the drawing. Not "what is this," but "what does this imply must also be there." Ducts imply fittings. Panels imply feeders, circuits, and devices. A main distribution panel at 208Y/120V implies specific sub-panels and feeder sizes downstream of it. A structural column at a grid intersection implies a beam, a connection, a load path.
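
That implication step can be sketched as a small rule table, with invented element names and rules (none of this is Boon's actual schema): given the element types a detector found, list what each one implies that the takeoff never produced.

```python
# A minimal sketch of "what does this imply must also be there."
# Element names and implication rules are illustrative only.
IMPLIES = {
    "duct_run": ["fitting"],                 # runs need fittings at every direction change
    "panel": ["feeder", "branch_circuit"],   # panels have power in and circuits out
    "column": ["beam", "connection"],        # columns carry a load path
}

def missing_implications(detected: set[str]) -> dict[str, list[str]]:
    """For each detected element type, list the implied types absent from the takeoff."""
    gaps = {}
    for element in detected:
        absent = [t for t in IMPLIES.get(element, []) if t not in detected]
        if absent:
            gaps[element] = absent
    return gaps

# A detection-only output that found ducts and panels but no fittings or feeders:
print(missing_implications({"duct_run", "panel", "branch_circuit"}))
```

A real system would carry quantities and sizing rules rather than bare type names, but the shape of the check is the same: detection produces a set, reconstruction asks what the set is missing.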

A human estimator reads a plan set by reconstructing the system the drawing describes, and using the parts they cannot see directly to check the parts they can. The model has to do the same thing. Detection is the demo. Reconstruction is the deployment. The same drawing gets a different answer depending on which side of that line the model is on.

Three habits the model has to learn

From what our team has seen across deployments, three habits separate the models that ship from the models that demo. None of the three reduce to object detection.

Cross-reference across sheets. An electrical panel schedule lives on one sheet. The devices that panel feeds live on different sheets, sometimes ten apart. A good estimator reads the schedule, reads the floor plan, walks the device topology, and notices when a panel's schedule lists 42 circuits but the plan only shows 37 devices fed from it. Five circuits are either missing from the drawings or wrong in the schedule. Either way, it is an RFI to flag before the bid goes out, not after the rough-in inspection. There is no pixel pattern that says "this panel has missing devices." The gap is inferred from the relationship between two drawings, and a model that cannot hold both sheets in mind at once will not see it.
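
The panel reconciliation above can be sketched as a simple count comparison, assuming the schedule and the per-panel device counts have already been extracted from their respective sheets. Panel names and data shapes here are hypothetical.

```python
# Sketch of the cross-sheet check: reconcile a panel schedule (one sheet)
# against the devices the floor plans actually show fed from that panel.
def reconcile_panel(schedule: dict[str, int], devices_by_panel: dict[str, int]) -> list[str]:
    """Flag panels whose schedule and floor-plan device counts disagree."""
    rfis = []
    for panel, circuits in schedule.items():
        shown = devices_by_panel.get(panel, 0)
        if shown < circuits:
            rfis.append(
                f"RFI: panel {panel} schedules {circuits} circuits "
                f"but the plans show only {shown} devices fed from it"
            )
    return rfis

schedule = {"LP-2A": 42}   # from the panel schedule sheet
devices = {"LP-2A": 37}    # counted across the floor plan sheets
for rfi in reconcile_panel(schedule, devices):
    print(rfi)
```

The hard part is not this comparison; it is holding both sheets in mind at once so the two counts exist to compare.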

Topological completeness. In a mechanical plan set, every duct has a connection at both ends. It connects to an AHU, a VAV box, a diffuser, another duct run, or a fitting. A model that has reconstructed the system can tell when a duct run terminates into open space with nothing at its end. That is almost always a drafting error or a sign the sheet is incomplete. A contractor showed me a project last quarter where a mechanical drawing was missing the last two pages of the sheet index. A detection-only tool counted everything on the sheets it received and produced a confident output. A model with a relational graph would have flagged that the supply air system did not close its loop. Same drawings, different failure mode, and one of those failure modes costs a quarter million dollars in downstream rework.
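
One way to sketch that completeness check, assuming the plan set has already been reconstructed into a connection graph (node names and the terminal list are invented for illustration): a duct endpoint is suspect when it is neither a terminal device nor shared with another run.

```python
from collections import Counter

# Legitimate system endpoints in this toy model: air handlers, VAV boxes, diffusers.
TERMINALS = {"ahu", "vav", "diffuser"}

def dangling_runs(edges: list[tuple[str, str]]) -> list[str]:
    """Return duct runs that terminate into open space: an endpoint that is
    neither a terminal device nor connected to anything else."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    flagged = []
    for a, b in edges:
        for end in (a, b):
            kind = end.split(":")[0]
            if kind not in TERMINALS and degree[end] == 1:
                flagged.append(f"run {a} to {b} dangles at {end}")
    return flagged

edges = [
    ("ahu:1", "duct:main"),
    ("duct:main", "vav:3"),
    ("duct:main", "duct:branch"),   # branch run with nothing at its far end
]
print(dangling_runs(edges))
```

On the missing-sheet-index project described above, a check like this fires not because any one symbol is wrong, but because the reconstructed system does not close its loop.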

Symbol grammar. The repeating-receptacle scene at the top of this post is the canonical case. A bank of devices drawn as "symbol, symbol, symbol, dot dot dot, symbol" is how drafters save time on a CAD sheet. A human estimator counts the explicit symbols and infers the implied ones from the repeat notation. A classifier trained on the canonical receptacle symbol sees the first few, loses confidence on the repeat, and silently drops the rest. The fix is not more training data on that symbol variant. It is a model that reads receptacles at the level of the drawing convention: single symbol, banked run, repeat pattern, schedule reference, all four counted as the same kind of element. Drawings have a grammar. The model has to learn it the way a junior estimator learns it from a senior one.
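
The repeat-grammar reading can be sketched as a tiny parser over an invented token format, where a "typ. of N" notation states the true count for the whole run and the drafted symbols are just its endpoints. The tokens here are illustrative, not a real drawing encoding.

```python
import re

def count_receptacles(tokens: list[str]) -> int:
    """Count a banked run: explicit symbols, unless a repeat notation
    ('xN', standing in for a 'typ. of N' callout) gives the run's total."""
    repeat = None
    explicit = 0
    for tok in tokens:
        m = re.fullmatch(r"x(\d+)", tok)
        if m:
            repeat = int(m.group(1))
        elif tok == "R":
            explicit += 1
    return repeat if repeat is not None else explicit

# Sheet E-207's conference-room wall: three drafted symbols, a typ.-of-18 callout.
print(count_receptacles(["R", "R", "x18", "R"]))  # → 18; symbol-only counting sees 3
```

A classifier scores the three explicit symbols and nothing else; a grammar reading scores the run.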

Where the moat actually lives

The fair question every investor asks is: if all of this is real, why won't the horizontal labs just build it? GPT-5 ships with a vision module, Gemini follows two weeks later, Claude keeps shipping skills and tool use and computer use. A year ago, the popular answer in vertical AI was that institutional process and workflow shape were a moat the horizontal labs would not cross. That answer is aging quickly, and we do not lean on it. The foundation models will keep absorbing application surface area, and anyone betting against that is reading the last decade more carefully than the next one.

The vertical moat in construction lives in three places that are harder to reproduce from a foundation-model seat than capability is.

Access to the drawings. Construction drawings are not on the public internet in any meaningful volume. The drawings that would teach any model about real construction live in GC archives, subcontractor records, and permit offices with rate-limited downloads. Construction is, by McKinsey's measure, the least-digitized major industry in the U.S. after agriculture (McKinsey Global Institute, "The Next Normal in Construction"). There is no Common Crawl for this domain and there will not be one. A capable model with no access to the right data is a model that does not know this work. Earning access is relational and slow, and the team doing it is the team that has spent years in trailers.

The deployment surface. Reading a drawing well in a benchmark and reading a drawing well on a contractor's live bid are not the same task. The failure modes only show up when the work is real. A team with live deployments has a catalog of failure modes no horizontal lab gets to see, and a feedback loop with estimators who care enough to flag them. By the time a horizontal model is capable in this domain, the open question is which application already has the seat at the desk where that capability gets used. Boon is running a long compounding process around that seat. Every drawing read, every RFI surfaced, every bid improved is feedback the horizontal labs do not have, and a relationship the contractor will not casually replace.

The orchestration that turns capability into reliability. Better models make good orchestration more valuable, not less, because reliability lives in the scaffolding around the model rather than in the raw capability. The model does not know when to ask the estimator for a clarification, when to flag an RFI, when to refuse to count, when to fall back on a cross-reference, when to escalate to a human. The application has to know. As the underlying model gets stronger, the weight of that judgment gets heavier. Boon is built for that future. We are not betting on the model staying weak. We are betting that the team that turns a strong model into a deployable estimator wins the desk.

None of this depends on horizontal labs staying out of the way. The work that wins this domain is a long, patient deployment problem with a model at the center of it, not a model problem with deployments hung off it. That is the bet, and it is the right one to be making right now.

What we have shipped, and what we haven't

We want to be specific about this part, because it is the easiest place in any post to overclaim.

What we have shipped today: Boon's Construction Vision Model reads drawings across multiple trades and produces takeoff output customers use on live bids. It already counts the symbol-grammar cases (banked runs, repeat patterns) that detection-only tools miss, which is why the receptacle anecdote at the top of this post is fixable rather than fatal. Customers are starting to ask us about completeness at the system level rather than accuracy on individual symbols, and that is the right question to ask.

What we have in training: relational reasoning over fittings, panel-schedule cross-references, and topological completeness checks. These are not yet shipped in their full form. They are the work, and they are the reason we are building the model the way we are.

What is on the roadmap: a single architecture that learns shared structure across all 30-plus trades, where each trade we ingest sharpens the next through cross-trade priors. The architecture is built for it. The measured cross-trade transfer numbers will come from the next training cycles, not from claims today.

These three buckets are separated on purpose. Anyone in this space telling you they have all of it shipped today is selling a demo, not a deployment. What Boon has is the architecture, the data flywheel, the deployments to learn from, and a commitment to the harder primitive rather than the easier one. That is enough to keep building. It is what the next phase of construction AI is going to be measured on.

The bar

Construction needs a third kind of AI reading. Tokens won't do it. Pixels won't do it. The drawings are physical-system descriptions, and the model has to read them as such. That is the bar Boon is building to, regardless of where the foundation-model frontier ends up next year. If the horizontal labs eventually get capable enough to read drawings the way we are, we plan to already be the team they are working with, because we will already be the team contractors trust.

If you are an estimator, a precon lead, or a GC who has lived through the eighteen-receptacles problem, we would like to hear from you. The deployments that define this work are going to come from the firms who care enough to surface the failure modes that detection-only tools never see. That is the conversation worth having now.


Deepti Yenireddy is the founder and CEO of Boon AI.
