A Theory Of Prediction

Review of The Signal and the Noise by Nate Silver

The Signal and the Noise is probably the most informative non-technical book about the art of prediction ever written. It outlines what is best described as Nate Silver’s “Theory of Prediction”. Silver, creator of the data journalism site FiveThirtyEight, takes readers on an informative tour of diverse fields including meteorology, baseball, poker, finance, and politics, documenting forecasts that either failed badly or succeeded spectacularly, along with the tactics employed by the forecasters who made them. Both the successful and the failed forecasts have much to teach us, and Silver distills the lessons from these examples into a cohesive message: we are inherently terrible at making predictions, but by adopting a few principles, we can improve our estimates and benefit at the personal and national level. In my view, these principles fall under three rules:

  1. Think like a fox
  2. Think like a Bayesian
  3. Think like a (basketball) shooter

It’s hard to make out much about these rules from their names, so let’s walk through each one in turn.

Data Science: A Practical Application

Charting the Great Weight Challenge of 2017

One frustration I often hear from those learning data science is that it is difficult to make the leap from toy examples in books to real-world problems. Any learning process must necessarily start off with simple problems, but at some point we need to move beyond curated examples and into messy, human-generated data. This graphic pretty well sums up what I’ve gone through in my data science education, and although I’m not yet over the cursing mountain, I have climbed part of the way up through trying (and often failing) at numerous projects with real data:

Technology Learning Curve (Source)

The best way to ascend this curve is to build up your confidence over time, and there is no better place to start than a project directly related to your life. This post will demonstrate a straightforward application of data science to my health and that of my dad, a personal problem with clear benefits if there ever was one!

The good news is that in order to apply data science for your personal benefit, you don’t need the data or resources of a massive tech firm, just a consistent set of measurements and free open-source analysis tools such as R and Python. If you stop to look, you will find data streams all around you waiting to be tracked. You might step onto a scale every morning and, depending on the outcome, congratulate or denigrate yourself, and then forget about it until the next day. However, taking a few seconds and recording a once-daily weight in a spreadsheet can yield a useful and clean dataset in a few months (and increases the chances that you will hit your goal). This data is perfect for allowing you to develop your data science skills on a real problem.
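
As a rough illustration of how little is needed to get started, here is a minimal sketch that reads a once-daily weight log and computes a smoothed trend and the total change. The sample values, the `weight_lb` column name, and the helper functions `load_weights` and `moving_average` are all hypothetical, chosen only for this example; a real log would come from your own spreadsheet export.

```python
import csv
import io
from statistics import mean

# Hypothetical sample data standing in for a spreadsheet exported as CSV.
SAMPLE_CSV = """date,weight_lb
2017-08-01,180.0
2017-08-02,179.0
2017-08-03,181.0
2017-08-04,178.0
2017-08-05,177.0
"""


def load_weights(text):
    """Parse CSV text into a list of (date, weight) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["date"], float(row["weight_lb"])) for row in reader]


def moving_average(values, window):
    """Trailing moving average; early entries average over the days available."""
    return [
        mean(values[max(0, i - window + 1): i + 1])
        for i in range(len(values))
    ]


records = load_weights(SAMPLE_CSV)
weights = [weight for _, weight in records]
smoothed = moving_average(weights, window=3)
total_change = weights[-1] - weights[0]

for (date, weight), avg in zip(records, smoothed):
    print(f"{date}  measured={weight:6.1f}  3-day avg={avg:6.1f}")
print(f"Change over {len(weights)} days: {total_change:+.1f} lb")
```

The trailing average smooths out day-to-day fluctuations, which tends to make the underlying trend easier to see than the raw daily readings.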

Improving Random Forest In Python Part 1

Gathering More Data and Feature Engineering

In a previous post we went through an end-to-end implementation of a simple random forest in Python for a supervised regression problem. Although we covered every step of the machine learning process, we only briefly touched on one of the most critical parts: improving our initial machine learning model. The model we finished with achieved decent performance and beat the baseline, but we should be able to improve it with a couple of different approaches. This article is the first of two that will explore how to improve our random forest machine learning model using Python and the Scikit-Learn library. I would recommend checking out the introductory post before continuing, but the concepts covered here can also stand on their own.

# How to Improve a Machine Learning Model

There are three general approaches for improving an existing machine learning model:

  1. Use more (high-quality) data and feature engineering
  2. Tune the hyperparameters of the algorithm
  3. Try different algorithms
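
The three approaches above can be sketched with Scikit-Learn roughly as follows. This is not the code from the post: the synthetic dataset, the helper names `make_data` and `holdout_mae`, and the small parameter grid are all assumptions made for illustration, and feature engineering is left out to keep the example short.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split


def make_data(n_samples, seed=0):
    """Synthetic regression data (an assumption; not the post's dataset)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0, 10, size=(n_samples, 2))
    y = X[:, 0] * X[:, 1] + rng.normal(0, 1.0, size=n_samples)
    return X, y


def holdout_mae(model, X_train, y_train, X_test, y_test):
    """Fit the model and return its mean absolute error on the test set."""
    model.fit(X_train, y_train)
    return mean_absolute_error(y_test, model.predict(X_test))


X, y = make_data(600)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

results = {}

# Baseline: a small forest trained on only the first 50 training rows.
results["baseline"] = holdout_mae(
    RandomForestRegressor(n_estimators=20, random_state=42),
    X_train[:50], y_train[:50], X_test, y_test,
)

# 1. Use more data: the same model trained on the full training set.
results["more_data"] = holdout_mae(
    RandomForestRegressor(n_estimators=20, random_state=42),
    X_train, y_train, X_test, y_test,
)

# 2. Tune hyperparameters with a small cross-validated grid search.
search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid={"n_estimators": [20, 100], "max_depth": [None, 5]},
    scoring="neg_mean_absolute_error",
    cv=3,
)
search.fit(X_train, y_train)
results["tuned"] = mean_absolute_error(y_test, search.predict(X_test))

# 3. Try a different algorithm on the same data.
results["gradient_boosting"] = holdout_mae(
    GradientBoostingRegressor(random_state=42),
    X_train, y_train, X_test, y_test,
)

for name, mae in results.items():
    print(f"{name:18s} MAE = {mae:.2f}")
```

Comparing every variant on the same held-out test set keeps the comparison fair; which approach helps most will depend on the data at hand.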

Top Books Of 2017

The 6 books that defined my year.

2017 was a struggle. It seemed as if every event in the news was designed to prove that those of us bold enough to believe in the goodness of humanity are ludicrous. To cope with the turmoil of 2017, I abandoned the negativity-driven daily news cycle and shifted my source of information to books that look at the large picture spanning years or decades. Remarkably, what I found was not a downward spiral accelerating in the 21st century, but a story of gradually improving human conditions that shows no signs of stopping. Consider that newspapers could have run the headline “137,000 fewer people living in extreme poverty today than yesterday” every single day for the past 25 years and it would be entirely true. Yet they do not, because the entire media story, on both sides of the aisle, is predicated on the concept that things are getting worse. It is only by extricating ourselves from this doom-and-gloom shouting and looking at the longer term that we can observe the upward trends. If nothing else, 2017 demonstrated that our view of the world is entirely dependent on what we choose to expose ourselves to. At the end of the year I can honestly call myself a rational optimist because of a worldview shaped by the following set of six influential books. Conditions are not guaranteed to improve on their own, but with continuous efforts by doctors, engineers, scientists, philanthropists, and others, we can continue on the road to a more prosperous, sustainable, and equitable world.

1. _The Better Angels of Our Nature_ by Steven Pinker

The world has been steadily getting more peaceful for the past millennium and the 21st century is on track to be the least violent of any in human history. This must be the single most important but least understood fact of modern times. When I tell people this astonishingly positive information, I am almost always greeted not with cheerful acceptance, but with hostile disbelief. People are unwilling to believe, and moreover seem not to want to believe, that the world is actually getting better! The simple reason is that while the prevalence of violence has demonstrably been declining, the media coverage of conflict has soared in recent years. Crime and tragic stories drive headlines, and human nature tends to place greater weight on negative information, leading to a severe misperception of the state of the world among the public. In the midst of this deception, Pinker’s work is a resounding refutation of the media’s worldview. The thesis of this masterpiece is that violence has been declining on both a worldwide and individual scale for at least 1000 years and shows no signs of stopping. In case anyone doubts this hypothesis, the first third of the book is devoted to documenting six trends of declining violence, with evidence for every trend presented in occasionally excruciating detail. Although Pinker gets caught up in a few too many lists and statistics, the overall message is clear: every kind of conflict, from state-sponsored wars and genocides to personal violence such as rape and domestic abuse, has declined to historically low rates.

Complete Book List Of 2017

One year, 50 books, innumerable ideas.

Ranked in order of impact on my worldview.

1. The Better Angels of Our Nature: Why Violence has Declined by Steven Pinker

The world has never been more peaceful than it is now thanks to the civilizing force of strong governments, more trade between countries, increased respect for the role of women in society, the communication ability of mass media, and the ever-greater reliance on rational thought and critical thinking.

2. Sapiens: A Brief History of Humankind by Yuval Harari

Humans distinguished themselves from all other animals on Earth through their communication abilities and their belief in “collective fictions”, constructs such as money, religion, and laws that only exist because many individuals believe in them together.

3. Homo Deus: A Brief History of Tomorrow by Yuval Harari

Having conquered war, plague, and famine in the second half of the 20th century, humanity will spend the 21st century in a quest for superhuman abilities through genetic enhancements and human-machine integration.
