Data Science in the Public Interest (And How It Differs from Academia and Private Industry)

I recently responded to a comment on Stack Overflow that denounced the use of anything but black box algorithms, claiming that other algorithms had “failed” and that opaque models were the only way to go. I completely disagreed, and in working through that disagreement I came to understand one of the defining differences between the academic point of view […]

Growing Pains (or, Beating Your Head Against the API of Life)

I’ve had to take a break from my MTA Subway Fares series to focus on other things like my second semester of graduate school, summer internship opportunities, and applying to a new graduate program. As part of the last item, I’ve been tasked with answering a data challenge using the NYC Open Data Portal. I think […]

Demography and Subway Fares Part 1: MTA Fare Data

Note: This is part of a project I completed for a Data Mining class, which looked at using demographic data to predict fare type usage on the New York City Subway System. I’m publishing the methods I used to analyze the data and the ultimate results in a series of blog posts. […]

Demography and Subway Fares: Introduction

This past fall, I attempted to predict fare-type usage on the New York City Subway system from US Census demographic data, using a number of different machine learning algorithms. The intention was to see if it was possible to predict the fare-type used at a particular subway station based on demographic information for the […]