I’m trying to add analytics to my first Twitter bot, and to do that I need the pandas and matplotlib packages in Python. Trying to install these on my Amazon EC2 instance revealed an unpleasant thing about the Amazon Linux package manager: it gets updated about as often as the decor in my local bodega, which is to say hardly ever.
Running the recommended “sudo yum install python-matplotlib” was a bust: it installs the Python 2.6 version rather than one for the standard Python 2.7. That left “pip install matplotlib”, but that fails with the error “Command “python setup.py egg_info” failed with error code 1”.
After Googling around and parsing the error messages, I finally figured out the sequence to install matplotlib and pandas:
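A minimal sketch of what such a sequence can look like. The yum package names (gcc, gcc-c++, python27-devel, libpng-devel, freetype-devel) are assumptions based on the usual Amazon Linux naming, so check them with “yum search” before relying on them. The RUN_INSTALL switch and the run and install_all helpers are made up for illustration. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: install matplotlib and pandas on Amazon Linux.
# Package names are assumptions; confirm them with `yum search` first.
set -eu

# Compiler toolchain, Python 2.7 headers, and the development packages
# for libpng and freetype that matplotlib builds against.
YUM_PACKAGES="gcc gcc-c++ python27-devel libpng-devel freetype-devel"
PIP_PACKAGES="matplotlib pandas"

# Hypothetical helper: print the command, and execute it only
# when RUN_INSTALL=1 is set in the environment.
run() {
  echo "+ $*"
  if [ "${RUN_INSTALL:-0}" = "1" ]; then
    "$@"
  fi
}

# Hypothetical helper: system build dependencies first, then the
# Python packages through pip.
install_all() {
  # Word splitting of the package lists is intentional here.
  run sudo yum install -y $YUM_PACKAGES
  run sudo python -m pip install $PIP_PACKAGES
}

install_all
```

Run it as is to review the commands, then run it again with RUN_INSTALL=1 to actually install. The yum step comes first because pip compiles matplotlib from source and needs the libpng and freetype headers in place.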
Amazon Linux is a great basic OS for doing simple tasks, but there just doesn’t seem to be a lot of documentation that stays current. There wasn’t any one post that put everything in one place, and most of the examples were borrowed from Ubuntu or Debian. I had to Google the yum package names for the development versions of libpng and freetype. “sudo yum install python-matplotlib” should just work, but until (or if) that happens, hopefully this gist will be helpful to anyone else trying to set up an EC2 instance for data science work.