Data Science & Real Estate
While online real estate platforms like immoscout24 in Germany employ large cohorts of data scientists, traditional real estate brokerage firms have been rather conservative in adopting more data-driven approaches. But there is dynamism: We talked to Nathaniel Holland, Chief Research and Data Scientist of Houston-based NAI Partners, about how he is introducing data science into the sector.
NAI Partners is a commercial real estate brokerage firm based in Texas / USA with offices in Houston, Austin and San Antonio. Employing more than 100 real estate professionals, the company focuses on office and industrial products and recently also expanded into the retail commercial market. Despite its Texas footprint, the company has received wide attention through its “Research and Data Analytics” section, which is led by Nathaniel Holland (PhD). Nathaniel has more than 20 years of experience in academia (University of Houston, Rice University, University of Arizona) and holds a PhD from the University of Miami.
idalab: Can you tell us about your role at NAI Partners and how you got involved there?
Nathaniel Holland: A little over 2 years ago I was looking for a new career path, as I was retiring from academia. I was looking for some problems that needed solving in some areas that were highly multivariate – and economics and real estate were right on the agenda. At the time NAI was looking for a new researcher and the managing partner Jon Silberman and I got together and decided that we could do a lot more than just standard commercial real estate analysis, by introducing data science in the real estate domain and focusing on more in-depth and predictive analytics, in addition to the usual quarterly real estate reports.
idalab: When you started to establish the new data science domain, what sectors did you turn to in order to conceptualize your initial approach?
Holland: Indeed, the field is not populated by data scientists – so I didn’t have to adhere to an existing standard when I came into the real estate sector. Rather, I had the opportunity to establish some of the standards. That was a big plus.
Taking data science to real estate, I wasn’t really interested in the typical data science applications of machine learning such as with Amazon in predicting what the next online customer was most likely going to buy. The multivariate problems in real estate are sort of unique; there are different sorts of things that are driving how the market performs. Commercial real estate operates like the free economic market but the variables that go into it and shape it are highly variable and multi-dimensional. But this is also what makes it so exciting.
idalab: Since – as you sketched out – the field is very broad, what first problem or application did you drill down to in order to showcase the value that data science could have for the real estate business?
Holland: The real estate market is dominated by sales – and what drives sales is the marketing. So I focused on two things: First, I had to generate analytic content that potential clients could consume and would be amazed by. Secondly, I wanted to develop some models that help to predict how business in commercial real estate is likely to change with changing economies.
So, for example, how will local employment influence the variables related to supply and demand of commercial real estate? As you know, in the United States we recently had issues with supply and demand in the oil industry, and Houston has a big oil industry, so I have been using some oil industry measures to predict how the demand for office space might change.
idalab: It sounds like you used a lot of external data for that…
Holland: Absolutely! It is not just about analysing commercial real estate variables; I am looking for variables outside the domain of real estate that could still predict how the sector will perform. Commercial real estate is data-rich, but we are at the early stages of really bringing in data scientists to help and guide the brokers in closing their deals.
idalab: How was your work received by those people doing marketing and sales? Was it easy to convince them about the value of the analytics?
Holland: Initially they all recognised that there was something lacking in commercial real estate, and filling that gap also allowed NAI to differentiate itself from other national and local competitors by providing access to information that other firms cannot provide.
At the same time it was incredibly hard, because we’re talking about an industry that has very deep roots. We needed to change the direction in which those roots were growing and help brokers and clients realise that even some simple analytics could aid with their needs and deals.
idalab: Did you meet any resistance at all?
Holland: Whatever the discipline or domain is, whatever the new technique is, there’s always going to be resistance. There has been and still is in commercial real estate. At the same time, I think competitor firms are looking around, and they like reading the reports NAI Partners produces. Still, a certain degree of resistance will persist for a number of years, as commercial real estate is at an early stage of adopting analytics, statistics and data science.
idalab: In real estate, online platforms have access to a lot of data about the market. What role will data ownership play in the mid-term future?
Holland: Data ownership is certainly important and right now there are still a lot of different companies that try to sell data to individual brokerage firms, but commercial real estate companies have to bring analysts on board that are really able to analyse and look in-depth at these numbers.
idalab: When talking about the development of analytical models – how specific is the regional Texas real estate market? Do models that you build for Houston or Texas transfer to other regions in the US?
Holland: The overall methodology of applying science and data analytics would stay the same across markets, whether you are talking about multiple cities in Texas, multiple states within the United States, or different countries like the UK and Germany. But the markets have their own dynamics: they might follow similar patterns, but the time scale on which they operate might be entirely different.
idalab: Speaking of methodology, we usually see two approaches: one starting from the data and then looking for correlation and connections between variables, and the other way going from the bottom up in the attempt to build a model and feed it with external data sources. What approaches do you see in the real estate sector?
Holland: I think this is the problem we have in data science: there is a lot of ‘science’ lacking in data science. A lot of people out there have sophisticated abilities to crunch numbers with algorithms, but most don’t recognise that the science is lacking in data science. So oftentimes we just focus on the data set at hand and try to find information that is valuable to a specific client. It would certainly be great if we could apply the scientific approach more often – starting from a question and competing hypotheses, collecting the appropriate data, writing a statistical model and analyzing whether relationships between variables or features occur. Unfortunately, a lot of the work in the sector is currently still focussing on data visualization and summary.
idalab: What are the most striking insights you have been able to use in the commercial real estate market that you think have the biggest impact on the firm or the clients?
Holland: One of the earliest models I developed already allows us to predict how much the demand for commercial real estate will change based on the local employment rates. I believe that if you get a nice solid result from a simple model it probably has more applicability in reality than a highly complicated model that comes up with a very complex analysis.
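A minimal sketch of what such a simple model could look like, assuming a one-variable ordinary least squares regression of net absorption (a common proxy for demand) on local employment growth. The interview does not describe the actual model, so the function names, the choice of variables and all numbers below are hypothetical and made up purely for illustration.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(intercept, slope, x_new):
    """Predicted response for a new value of the predictor."""
    return intercept + slope * x_new


# Hypothetical, made-up data (NOT from NAI Partners or any real market):
# annual local employment growth in percent ...
employment_growth_pct = [-1.0, 0.0, 1.0, 2.0, 3.0]
# ... and net absorption of office space in million square feet.
net_absorption_msf = [-0.6, 0.1, 0.4, 1.1, 1.5]

intercept, slope = fit_simple_ols(employment_growth_pct, net_absorption_msf)

# Predicted net absorption if employment were to grow by 1.5 percent.
forecast = predict(intercept, slope, 1.5)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, forecast={forecast:.2f}")
```

The appeal of a model this small is the one Holland points to: a single slope is easy to explain to brokers and clients, and it is easy to check against new data before adding further predictors.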
idalab: Where do you see your data science department in five years?
Holland: We’re in a slowdown now, as Houston commercial real estate is at the bottom, and so we’re doing the best we can to predict how long it will take for Houston to come out of this stage of the market cycle. I think the data analytics team is on a growth track, but probably at a slower rate than we initially anticipated, until the economy as a whole gets a little better. The future of data science in real estate is exciting, because it is such a data-rich arena, but there aren’t a lot of people with statistical or analytical skills out there in this domain. It is really an opportunity that is still unexplored.
idalab: Regarding the data science roadmap, an exciting data source is certainly telco data. Could this have any effect on predicting real estate prices?
Holland: This is something we have started to look at, though not as much as we would like. But we’re definitely looking at how commuting times determine where a business might want to open an office. For example, people are starting to think about minimising the commuting time of their employees and raising their happiness and standard of living by reducing the frustration that comes from a long drive.
idalab: Final question: What would you say, looking back at university, which part of your education is the most relevant and important to you today?
Holland: There are two things. The first is a little bit outside of the box: philosophy. I spent a lot of time taking philosophy courses, and those helped to train my mind to think in a logical, analytical way. But the most important thing was my rigorous immersion in scientific practice. Before I entered the field of data science, I published a lot of papers and articles and spent a lot of time on actual scientific projects. This now enables me to embrace scientific thinking that is oriented towards problem solving. And this is definitely something that is extremely valuable to the domain of data science.
idalab: Thanks a lot!
Dr. J. Nathaniel Holland is a research scientist with 20 years of experience in using the scientific method to extract information from complex multi-dimensional data. He joined NAI Partners in 2014 as Chief Research and Data Scientist.