DJ Patil, data science’s new ambassador
When you ask young U.S. graduates about their employers of choice, you’ll hear all about Wall Street and Silicon Valley. One employer that is rarely named is the U.S. government. For data scientists, however, this might change soon. In 2015, Dhanurjay “DJ” Patil, after having worked for tech darlings like LinkedIn, eBay and Skype, took over the newly created role of “Chief Data Scientist” for the Obama administration. His responsibilities include not only the push for better and smarter big data applications, but also the recruitment of the brightest data science minds for public service. Governments sit on huge piles of data, and data scientists will be key to unlocking their full potential for the public good. Why have other governments not yet hired Chief Data Scientists?
Silicon Valley Spirit in Washington D.C.
The appointment of DJ Patil as Chief Data Scientist in February 2015 was a highly significant event, even in the bigger picture. The creation of the role was a unique move, as no similar position had existed in the government before, and it underlined the growing importance of data science for societal improvements. The recruitment of DJ Patil, one of the big names in the community, also sent a strong signal about the Obama administration’s forward-looking seriousness.
Indeed, just like Facebook, LinkedIn and Google, governments are in the unique position of having access to large datasets across all kinds of domains. But unlike the tech companies, governments have made few notable efforts to use that data for improvements in areas of responsibility like health care, security, law and order or environmental protection. The recruitment of DJ Patil is a landmark, as it implants the data drive of Silicon Valley into the Washington bureaucracy. On only his second day, Patil spoke at the Strata+Hadoop World event, outlining his mission, and since then he has used every medium available to communicate his efforts, achievements and roadmap as Chief Data Scientist.
What is DJ Patil even doing?
When asked about his mission, Patil frames it as “responsibly unleashing the power of data for the benefit of the American public and maximize the nation’s return on its investment in data”. While this is a rather broad description, the value of his activities in the government as a data scientist can be broken down into three essential tasks along the data value chain.
Developing a data collection mindset across agencies
Governments and agencies have tremendous access to data, but often lack an understanding of its inherent value. While the collection of data is entrenched in their job requirements, there is little incentive to leverage the data to trigger improvements in their respective domains. The problem starts with the fact that documents are often not even machine-readable and thus essentially useless for analysis. As Chief Data Scientist, DJ Patil draws attention to those flaws at the roots of the data value chain in order to unlock the full potential of his profession downstream.
Driving the development of real-life data science use cases
As Chief Data Scientist in the U.S. government, Patil also pushes specific data science projects in various domains, focusing for example on health care and criminal justice. One of his major projects is the Precision Medicine Initiative, which focuses on combining patient data and external data sources to develop better tools that help doctors tailor treatments to individual patients. Another initiative focuses on using police and law enforcement data to develop smarter prediction and detection mechanisms to prevent crime and injustices. These domain-specific applications help to unlock the potential of big data, but also serve as real-life examples of the power of data science.
Encouraging innovation by enabling open data culture
While data has thousands of internal governmental use cases, external parties are also interested in (non-sensitive) data sets, which could be used to develop new applications and drive innovation. The U.S. government has set up www.data.gov, a massive open data project, which hosts thousands of data sets ranging from business to agriculture, from “Expenditures on Children by Families” to “Fertilizer Use & Price”. This open data culture forms the backbone of future innovation, as it enables the entire community of data scientists to help unlock the full potential of data across domains.
So, why have other governments not yet hired Chief Data Scientists?
Governments across the world employ legions of statisticians in their departments to draw conclusions from available data. However, a coherent and holistic approach like the one the U.S. government is pursuing with DJ Patil as Chief Data Scientist is not prevalent among governmental administrations outside of the U.S. Why is that the case? Have governments elsewhere not realized the efficiency potential for their operations hidden in big data?
Actually, they probably have realized the potential of big data (you’d have to intentionally ignore reality to claim otherwise). But they have also realized the volume of investment necessary to capture the value of data through data science. Data science is interdisciplinary in itself, and when applied in the governmental sector it requires not only the joint work of several agencies, but also large efforts in data collection, standardization and integration to reap all the benefits. These upfront investments seem large, but they are the required groundwork for the long-term success of data scientists. Without adequate data quality, data science is useless. However, bureaucratic inertia favors short-term gains and thus takes shortcuts wherever available, circumventing the requirements for a large-scale data initiative.
With the establishment of the Chief Data Scientist role in the U.S. government, though, other countries are likely to review their own policies and readiness for the age of big data. DJ Patil himself once proclaimed that the role of the data scientist is probably the sexiest job of the 21st century. It is, however, not so sexy if the fuel for the job, data, cannot be fully utilized. Overcoming inertia and slow-moving institutions will be key to effectively driving the data science agenda into governments. The longer governments wait to establish the necessary infrastructure, the further they fall behind. The Obama administration was indeed a first mover, and it will hopefully inspire administrations around the world to ambitiously follow suit.