Thirty-two Python Tools and Package Libraries to Increase your Machine Learning Productivity

These are tools, packages, and libraries that my colleagues and I use to increase Machine Learning pipeline development and production deployment productivity. What follows is a snapshot of our favorites as of December 24, 2020.

Bruce H. Cottman, Ph.D.

Published in TDS Archive, Jan 1, 2021 (11 min read)

Python

We used Python predominantly (95%) over the last seven years because:

1. Almost all new Machine Learning models, cloud services, GPUs, and many other resources are available through a Python API;
2. The assortment and number of free code and packages are the largest we have seen;
3. Native Python is slower than C by 20+ times, but almost all Python packages run at near C speed because they are thin APIs over compiled C code or use some other speedup technique.

We used C to speed up Python when Numba could not be used. We tried Go, but it did not work out.

Related article: "My journey to speed up Python: Setting Up a GoLang Development Environment and Benchmarking" (towardsdatascience.com)

4. The Python GIL (which keeps threads from running Python code in parallel on multicore machines) is bypassed more and more each day by the cloud, Spark, package implementations (e.g., XGBoost), and strong typing with the introduction of type hinting starting in Python 3.5.

Related article: "Future Proof Your Python code" (medium.com). I discuss why type hints can future-proof your Python code.
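As a small illustration of the type hinting mentioned above, here is a minimal sketch. The function name `mean` and its signature are hypothetical, chosen only for this example. Note that Python does not enforce hints at runtime; they are metadata that a static checker such as mypy, or an IDE, can use.

```python
from typing import List, get_type_hints


def mean(values: List[float]) -> float:
    """Return the arithmetic mean of a non-empty list of floats.

    `mean` is a hypothetical helper used only to show annotation syntax.
    """
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


if __name__ == "__main__":
    print(mean([1.0, 2.0, 3.0]))
    # The annotations are stored on the function and can be inspected,
    # but the interpreter itself does not check them when mean() is called.
    print(get_type_hints(mean))
```

Because the hints are only annotations, a call such as `mean([1, 2])` still runs; catching that kind of mismatch is the job of a separate static analysis step.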

Python’s runtime speed seems to draw most of the criticism. Much of that criticism may disappear if some way is found to compile Python. Meanwhile, Python is the predominant choice for machine learning.
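The gap between interpreted Python and C-backed code described earlier can be observed with the standard library alone. The sketch below times an explicit Python loop against the built-in `sum()`, whose loop runs in C inside the standard interpreter. The function names, list size, and repeat count are assumptions made for this illustration. It is not meant to reproduce the 20+ times figure; the measured ratio depends on the machine and the Python version.

```python
import timeit


def python_loop_sum(values):
    """Sum values with an explicit loop executed by the interpreter."""
    total = 0
    for v in values:
        total += v
    return total


def builtin_sum(values):
    """Sum values with the built-in sum(), which iterates in C."""
    return sum(values)


def compare(n=100_000, repeats=20):
    """Time both approaches on the same data; return (loop_s, builtin_s)."""
    values = list(range(n))
    # Both approaches must agree before their timings are worth comparing.
    assert python_loop_sum(values) == builtin_sum(values)
    loop_s = timeit.timeit(lambda: python_loop_sum(values), number=repeats)
    builtin_s = timeit.timeit(lambda: builtin_sum(values), number=repeats)
    return loop_s, builtin_s


if __name__ == "__main__":
    loop_s, builtin_s = compare()
    print(f"interpreted loop: {loop_s:.4f} s")
    print(f"built-in sum():   {builtin_s:.4f} s")
    print(f"ratio:            {loop_s / builtin_s:.1f}x")
```

The same pattern, moving the inner loop out of the interpreter, is what packages built on compiled code do on a larger scale.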

Python IDEs

We used Emacs for 15 years. We were those people who learned computer science and accidentally absorbed some software engineering along the way while coding in Lisp.