I wanted to post about a great book I came across, Data Analysis with Open Source Tools by Philipp K. Janert. Two things set this book apart from the more formal analysis texts out there. First, it focuses on the practical aspects of doing data analysis and solving problems. Second, it contains “workshop” sections, which are tutorials aimed at introducing you to particular software tools.
The book is essentially a synthesis of tools and tricks that Janert has picked up in his years of doing data analysis. It covers topics such as graphing and plotting, which may sound elementary, but Janert deals with non-obvious issues and tricks in visualizing data. This section in particular has changed how I think about visualizing data. Other topics tackled in the book include analyzing time series, creating mathematical models, clustering, prediction and business intelligence. The tools the book introduces you to include R (and its great graphing capabilities!), SciPy/NumPy, and clustering software.