Data Science

Data Trails #3 – Snapshots from the history of data visualisation

Visualising data on health and mortality has a very contemporary ring to it, as if visual health statistics were so intricate that they could only have developed with the rise of big data and computational tools.

Summer Book Recommendation: “Everybody Lies”

“Everybody Lies” is the harsh title of Seth Stephens-Davidowitz’s new book. While it offers no workable recipe for stopping people from lying, the book helps the reader with one essential task: grasping and conceptualizing the power of data and data science. Its key strength is that it does so in a very engaging and accessible way.

Data Trails #1 – Snapshots from the history of data visualisation

81 years of budget data and various categories in three diagrams – the United States Fiscal Chart from the 1870 US census atlas is a real blockbuster in the history of data visualisation. The atlas as a whole is full of interesting graphics and has a widespread reputation as an early gem of data visualisation.

Data Science for Pharma – A Short Case Study

Open data in biomedicine is a gold mine that can strengthen innovation in pharmaceutical R&D. In combination with the right analytics, public data helps identify therapeutic targets and ligands, enhance clinical development, and boost portfolio management efficiency. The challenge is to purposefully integrate abundant and heterogeneous data scattered across data sources.

Trend Detection – Delineating possibilities and utopia

At a time when seemingly every second “The Economist” Special Report focuses on either Artificial Intelligence (AI) or Big Data, expectations of what current technology can do are higher than ever. Rightly so, given the many notable advances of recent years. What does this mean for trend detection?

Oversaturation in Professional Football: How Many Spectators Come to Football Matches?

At many of the German national football team's friendly matches, most recently against England, a number of seats have remained empty of late. DFB team manager Oliver Bierhoff has already warned of an “oversaturation” of football. Are there generalisable drivers of a football match's popularity? A small experiment, away from football's big stage.

A Question of Data Quality

80% of work in data science projects is dedicated to data quality assessment, data preparation and integration. Applying and tweaking the algorithms, improving the performance of models (basically all the fun stuff) covers only 20%. What’s the reason for this?

3,2,1 – Gone! The Science of Used Car Pricing

Despite the recent Volkswagen scandal, German cars still have a world-class reputation. And Germans still love cars: more than 44 million vehicles are registered, which is about one car for every second citizen. Understandably, the used car market is also large, but how does one arrive at a fair value for a used car?

Cambridge Analytica: Beyond the Hype

The story immediately went viral: Big Data company Cambridge Analytica and its sophisticated psychographic models helped Donald Trump secure victory in the 2016 presidential election. The story played on all the prevalent fears of the age of big data: privacy, microtargeting, behavioural steering. But now, with far less media buzz, the company admits that it was never really involved in the Trump campaign. What can we learn from this ‘scam’?