The world of digital creation is being redefined by Generative AI (GAI), a groundbreaking class of tools that has captivated both creators and consumers.
At the forefront of this revolution are Large Language Models (LLMs) like GPT-4, renowned for their versatility in natural language processing, machine translation, and content creation, including novel works and even software code. Since its introduction, GAI has surged in popularity, with products like ChatGPT setting records in user growth. Similarly, image generators such as OpenAI’s DALL-E and Stability AI’s Stable Diffusion are redefining artistic expression, blurring the lines between AI-generated images and human-created art.
As a 2022-2023 urban tech fellow at Cornell Tech’s Jacobs Institute, Greg Lindsay explored the implications of AI and augmented reality at urban scale, and had the opportunity to meet and work with the technology and innovation leads at several large architecture firms.
Through that project, Cornell Tech’s Jacobs Urban Tech Hub was approached by a consortium of AEC firms pooling their time and resources to better understand the pitfalls and opportunities posed by AI to their firms. To that end, Lindsay and a team from the Jacobs Urban Tech Hub assessed leading AI tools such as ChatGPT for Enterprise and Midjourney, interviewed leading practitioners and educators, and drafted a forward-looking “horizon scan” with recommendations.
The research report, “The AEC Industry and Generative AI: Navigating Through Opportunities and Challenges,” written by Lindsay and urbanist-in-residence Anthony Townsend, offers an overview of GAI’s potential impact on the AEC industry. The report is part of the Urban Tech Hub’s mission to generate applied research, foster an expanding tech ecosystem, and cultivate the next generation of urban tech leaders. It also includes in-depth interviews with multiple academics and practitioners (and an AIA executive) on GAI’s uses, abuses, and potential.
Here, we present Greg Lindsay’s interview with Phil Bernstein, an architect and technologist, who is also the author of “Machine Learning: Architecture in the Age of Artificial Intelligence” (2022), and “Architecture | Design | Data – Practice Competency in the Era of Computation” (2018).
The interview was edited for length and clarity.
Viewpoint
Greg Lindsay: Given you’ve literally written the book about the use of AI and machine learning in architecture, what stands out to you about generative AI? Is it a break from prior techniques, or merely an iteration? What features do you find particularly interesting, novel, and/or powerful?
Phil Bernstein: Let me put it this way — as I said to our dean last week, it’s theoretically possible — if not practically — for one of our students to design an entire project and never draw a line. One of the questions for us as educators and the profession at large is: What do we think of that?
It’s clear these technologies provide a whole new avenue for generating ideas, and for doing things architects aren’t good at, like prediction analysis and education. But as usual, there’s an extreme fascination with form generation and image-making that’s interesting but in danger of being a distraction.
And where do you fall on the question about drawing a line or not?
I think the healthiest attitude is the one we take about all tools, which is that it’s just one in your toolbox. I like to say a generative image creator is like a band saw. We teach you how to use a band saw safely. We should teach you how to use a generative tool safely.
And you wouldn’t want to use a band saw for everything, either. What should be off-limits when it comes to using generative tools?
When we don’t understand the capability of the tool, it’s a little early to decide what’s off-limits. It’s like asking what the speed limit should be when we haven’t invented the internal combustion engine yet. I’ve been around long enough to see two “AI winters” in my own career, when everyone felt the world was about to change in a dramatic way before we hit the limits of computation. Nothing should be off-limits, but we need to proceed with caution because there are some big red-and-yellow flashing lights around the use of data, intellectual property, ethics, hallucinations, and so on.
Focusing on the intellectual property issues for a moment, what’s the biggest flashing red light there?
Well, it’s tricky, because the legal theory behind the ownership of architectural intellectual property never anticipated technologies capable of ingesting huge amounts of that data and then drawing their own conclusions. We were experimenting yesterday with taking a rough hand sketch and running it through a generative environment we built in Comfy. “Make this look like a Louis Kahn building,” we said. “Make this look like a Norman Foster. Make this look like a Zaha Hadid.”
Under the law, as long as I can manipulate those images to be unique and not direct copies, there’s nothing stopping me from doing that. Now expand that principle beyond formal strategies of image or composition — let’s say HOK is really, really good at designing orthopedic operating rooms. Will all of that be generally available for ingestion by these algorithms? Right now, these tech companies are operating in an ethics-free zone where there’s no legal principle stopping them from ingesting everything they can find on the Internet.
What should the industry do about that? Lock it down and use their IP crown jewels to train their models, either alone or in tandem? Will laws have to be passed to deal with this?
Look, I don’t teach in the Law School. Well, actually I do teach in the Law School, but I’m not a law professor and I don’t have an informed opinion. The trajectory of legally protecting architectural ideas has evolved away from pure copyright toward protecting unique ideas. Some of that approach will be extrapolated into the use of this kind of data, but if the technology evolves in a way that lets people build their own training sets, then HOK can keep all of their orthopedic ORs off the Internet. The problem right now, of course, is that — at least at this moment — you can take every operating room HOK has designed since the beginning of time, and it’s not nearly enough data to train anything.
Given such bottlenecks, how should firms incorporate such tools into their current workflows?
Because the architectural profession moves at glacial speed when it comes to adopting new technologies, I suspect the initial influence on the profession will be external, not internal, because firms don’t have time, or money, or resources to work this out on their own. The fact that a group of very large firms you work for have banded together to figure this out because none of them have the resources to do it on their own is telling. If we were talking about any other industry, they would be competing against one another to find a market advantage. Instead, they’re working on a certification label.
It’s too early for anything other than wild speculation, but what I think will happen is that forces far larger from outside the profession — real estate development, manufacturing, supply chain management, construction management, finance — will likely create demands on architecture we’ll have to respond to.
Which of those industries is likely to pose threats to the industry?
You used the word “threat.” I didn’t use “threat.”
I did. Threatening incumbents in this case. Where is disruption likely to come from?
Let’s speculate on both a positive and negative note. Positively speaking, those industries might create AI environments sufficiently data-rich that the insights they generate become accessible to the design profession. So, someone builds a fabulous set of generative AIs that make it much easier to generate a curtain wall design. Instead of keeping it to themselves, they make it widely available as a gateway drug to their platform. You visit their website, give them a parameterized building and set some constraints, and they’ll generate some options you can look at. That’s the high road.
The low road follows The Innovator’s Dilemma, in which someone builds a tool that can only generate curtain walls, and that’s not anywhere near a full building, so the industry shouldn’t worry about it. But they get really good at curtain walls before moving on to the floor structures supporting them, and while architects are using these tools to make pretty pictures, innovators outside the industry build up enough capability to supplant them. Terrible unresolved buildings are the result.
So where should AEC firms be focusing their energies and attention right now? On which piece of the tech stack?
I’ll give the same advice I gave to firms during the early days of BIM, which is: We don’t know where this is going. Firms with resources need to have someone on their staffs watching and building platforms for experimentation. I was just having lunch with a young architect — one of my former students — who’s taken on this role for the 30-person firm he works for. He’s watching. He’s trying out things he thinks might be useful in practice, but it’s way too early — certainly too early to be making intergalactic declarations about what's happening, because things are moving way too fast.
If Stable Diffusion is a tool the way a band saw is a tool, what are your classroom guidelines on when to use AI and when to use a band saw? How should firms — and educators — select their tools?
We can teach someone how to use a band saw because the band saw has been around for a hundred years. For this, it’s too early to say. I’m going to give a presentation today where we’re going to show three images from three different image generators with the exact same architectural prompt, straight out of the box. In these very early days, tweaking the carburetor or the fuel mix or the tire treads on these things, so to speak, creates very large changes in their output, so it’s too early to be issuing detailed guidelines.
We’re likely to tell our faculty here to follow the classroom guidelines the Cornell folks came up with, which is you have three options for generative tools. You can declare them off-limits and try to police it. You can tell students they can use them but have to carefully document it. Or you can make it a central theme of your teaching — “Take this prompt, put it in, and take the resulting image and show me how you manipulated it to get the final result” — which I think is clear and makes a lot of sense. It’s a good way to think about how to use it within practices as well — ignore it, play around with it, or pick some part of your process you think would really benefit and apply it.
The last thought I want to leave you with is that these generative tools make increasingly convincing pictures of buildings. They do not make buildings — and they’re nowhere close. They’re barely two-dimensional projections of complex, three-dimensional phenomena. In this classroom project I’ve been working on for the last two weeks, we’ve probably generated 300 or 400 images starting from the same sketch, and the image generators make up a lot of shit that makes no sense, right? They invent background buildings, don’t understand roof lines, know nothing about how buildings meet the ground. The big open question is: what do machines for making evocative images mean in the long term?
Greg Lindsay was a 2022-2023 urban tech fellow at Cornell Tech’s Jacobs Institute, where he explored the implications of AI and augmented reality at urban scale. He is also a senior fellow of MIT’s Future Urban Collectives Lab, a senior advisor to Climate Alpha, and a non-resident senior fellow of the Atlantic Council’s Scowcroft Strategy Initiative. His writing has appeared in The New York Times, The Wall Street Journal, Fast Company, Bloomberg BusinessWeek, Harvard Business Review, and The Financial Times, among many other publications.
Phil Bernstein is an architect and technologist who has taught at the School of Architecture since 1988, where he received his B.A. and M.Arch. He was a Vice President at Autodesk, where he was responsible for setting the company’s future vision and strategy for BIM technology. Prior to Autodesk, Phil was a principal at Pelli Clarke and Partners Architects, where he managed many of the firm’s most complex commissions, including projects for the Mayo Clinic, Goldman Sachs, and Reagan Washington National Airport. He is the author of Machine Learning: Architecture in the Age of Artificial Intelligence (2022) and Architecture | Design | Data – Practice Competency in the Era of Computation (2018), and co-editor of Building (In) The Future: Recasting Labor in Architecture (2010, with Peggy Deamer), and consults, speaks, and writes extensively on technology, practice, and project delivery. He is a Fellow of the American Institute of Architects, a Senior Fellow of the Design Futures Council, and former Chair of the AIA National Contract Documents Committee.
The full report, “The AEC Industry and Generative AI: Navigating Through Opportunities and Challenges,” can be accessed here.