Physicist, Machine Learning Scientist and constantly improving Software Engineer. I extrapolate the future from emerging technologies.

How often have you heard “The Machine Learning Application worked well in the lab, but it failed in the field”? It is not the fault of the Machine Learning Model!


Warning!

This blog is not yet another blog article (YABA) on DataOps, DevOps, MLOps, or CloudOps.

I do not mean to imply xOps is not essential.

For example, MLOps is both strategic and tactical. It promises to transform the “ad-hoc” delivery of Machine Learning applications into software engineering best practices.

What are the Symptoms of the Problems of Deploying Machine Learning Applications?

We know the symptoms: Most machine-learning models trained in the lab perform poorly on real-world data [1, 2, 3, 4].

What is the critical Problem with Machine Learning Success?


I describe and compare equivalent mappings of seventeen cloud services across the three cloud vendors with the largest market share: Azure, AWS, and GCP. The exception is the Machine Learning services, where Google has many more complete Machine Learning SaaS (Software as a Service) offerings [1].


Missing Services

Cloud vendors continually add services. Diagrams is a work in progress; not all services have been added yet (10/21/2020).

1. Cloud Identity and Access Management (IAM)

I do not discuss the category of security — a significant category for the cloud that is still evolving and needs a blog for itself.

I discuss Identity Management, which is secured by the authorized account.

Multiple…


We show Python code and benchmarks for 27 different NLP text pre-processing actions.


Outline

Estimates state that 70%–85% of the world’s data is text (unstructured data) [1]. New deep learning language models (transformers) have caused explosive growth in industry applications [5, 6, 11].

This blog is not an article introducing you to Natural Language Processing. Instead, it assumes you are familiar with noise reduction and normalization of text. It covers text preprocessing up to producing tokens and lemmas from the text.

We stop once we have the sequence of tokens. Feeding that sequence into a Natural Language model to accomplish a specific model task is not covered here.
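As a rough illustration of what "noise reduction and normalization, up to producing tokens" can look like, here is a minimal sketch using only the Python standard library. It is not the code from the article: the function names `normalize` and `tokenize` and the specific cleaning steps (tag removal, URL removal, lowercasing) are assumptions chosen for the example. Lemmatization needs a language library and is not shown.

```python
import re
import unicodedata
from typing import List

# Hypothetical patterns chosen for this sketch, not taken from the article.
URL_RE = re.compile(r"https?://\S+")
TAG_RE = re.compile(r"<[^>]+>")
TOKEN_RE = re.compile(r"[a-z0-9]+(?:'[a-z]+)?")


def normalize(text: str) -> str:
    """Reduce noise and normalize text (illustrative steps only)."""
    text = unicodedata.normalize("NFKC", text)  # fold compatibility characters
    text = TAG_RE.sub(" ", text)                # drop HTML-like tags
    text = URL_RE.sub(" ", text)                # drop URLs
    text = text.lower()                         # case-fold
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace


def tokenize(text: str) -> List[str]:
    """Split normalized text into lowercase word tokens."""
    return TOKEN_RE.findall(normalize(text))


if __name__ == "__main__":
    print(tokenize("<p>Visit https://example.com NOW, it's   great!</p>"))
```

Each step is a separate line so that individual actions can be timed or switched off independently.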


It is code review time. Some of you would rather avoid the code review process. Whether you are new to programming or an experienced programmer, the code review is a shared learning experience for all involved. Rather than talk about “code review process best practices,” I share with you coding techniques I use to change code review from WTFs (What’s That For?) into WOWs (Wonderful! Oh! Wow!).


My Approach to the Code Review Process

The anticipation of a code review process causes us to raise our game because we open up our code for other programmers to see (criticize). It may look, feel, and bark like criticism. And…


Visualize your architecture

clouds seen from above

I show how to render high-quality architecture diagrams of Azure, AWS, and GCP using the Python package Diagrams. Diagrams depends on the Graphviz runtime. This article shows step-by-step how to create a Docker image with Diagrams and Graphviz. All code is included and can be downloaded.

Docker Solution for Graphviz, Diagram, and Cluster

I have posted several articles on how to create development and test Docker images [see references 4, 5, and 6 below]. I assume you know of Docker and have read them.

Docker is used for encapsulating an individual image of your application.

Docker-Compose is used to manage several images at the same time for…


Evaluating the DevOps tools I’ve used for rolling out machine learning applications

Woods viewed through the lens of a pair of glasses.

Estimates vary, but machine learning engineers spend only 5–15% of their working time on the machine learning engine (MLLabOps). The other 85–95% goes to two areas: getting, munging, and pre-processing data for input into the machine, which is the domain of DataOps; and creating and maintaining a stable version of the entire Machine Learning Application (MLA) in production, which is the domain of MLProdOps.

Usually, DevOps labor time is not in the accounting. The development, rollout, and maintenance of MLA code probably increases non-MLLabOps to more than 90% of the labor time. …


Capturing CI/CD in customizable Python code

Code on a laptop

What Is DevOps?

“DevOps is a set of practices that combines software development (Dev) and IT operations (Ops). It aims to shorten the systems development life cycle and provide continuous delivery with high software quality. DevOps is complementary with Agile software development; several DevOps aspects came from the Agile methodology.” — Wikipedia

The definition above misses the mark for me.

  • Waterfall, Agile, or any other methodology may be complementary to DevOps. The methodology can overlap with DevOps tools, but they should be independent of each other. Development methodology should not guide DevOps tools, and DevOps tools should not guide development methodology.
  • DevOps tools…


Use Streamlit as your Web application base when security is not needed. If you need security in your Web application, use the Flask, FastAPI, or Django packages.


I am refactoring Flask-based applications into Streamlit-based applications. For Machine Learning applications, Streamlit has proven cheaper to maintain: it has lowered our technical debt by approximately 80%.

As we continue refactoring Flask-based applications into Streamlit-based applications, technical debt may be reduced by 95%.

Admittedly, by replacing working Flask applications, we violate the “not broke, do not fix” catechism.

I consider the use of Streamlit to “replace what is not broke” to be both…


I perform benchmarking. Our journey begins by installing GoLang and setting up a GoLang Development Environment. I show the benchmarking of the GoLang kmeans implementation and the Python sklearn kmeans.
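To give a feel for the Python side of such a comparison, here is a minimal timing sketch. It is not the benchmark from the article: in place of sklearn it uses a tiny pure-Python k-means (so the example has no dependencies), and the helper names `kmeans` and `benchmark`, the data size, and the repeat count are all assumptions made for illustration.

```python
import random
import statistics
import time


def kmeans(points, k, iterations=10, seed=0):
    """Tiny pure-Python Lloyd's k-means, used here only as a workload to time."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign the point to the centroid with the smallest squared distance.
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


def benchmark(func, *args, repeats=5, **kwargs):
    """Run func several times and return (best, median) wall-clock seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        timings.append(time.perf_counter() - start)
    return min(timings), statistics.median(timings)


rng = random.Random(42)
points = [(rng.random(), rng.random()) for _ in range(2000)]
best, median = benchmark(kmeans, points, 3)
print(f"best {best:.4f}s, median {median:.4f}s")
```

Reporting the best and the median over several repeats reduces the effect of one-off noise from the operating system. The same harness could wrap any other implementation so the timings are collected in a comparable way.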


Introduction

I have a problem, which I think I share with a good part of the Machine Learning (ML) community.

I need a way to speed up my Python Machine Learning solutions to put them in production.

Python is too slow for production Machine Learning applications. I need to switch away from Python.

What I decided to do was: Learn GoLang.

It is almost as fast as C. It is…


Write Pythonic code

anime character

We accomplish most of our Python software development in a local machine’s environment: the software developer’s sandbox. I discuss tools that dev and test adopted for the readability, testing, profiling, logging, quality, security, and version control of code before it’s pushed out to be shared on dev, test, and stage servers.

Introduction

We are continually evaluating software tools for every stage in our Python development lifecycle.

I try to stay away from using the term DevOps and the latest marketing terms DataOps, GitOps, CloudOps, and MLOps.

Python

We have used Python predominantly (90%) over the last seven years because:

  • Almost all new machine…
