How often have you heard “The Machine Learning Application worked well in the lab, but it failed in the field”? It is not the fault of the Machine Learning Model!
This blog is not yet another blog article (YABA) on DataOps, DevOps, MLOps, or CloudOps.
I do not mean to imply xOps is not essential.
For example, MLOps is both strategic and tactical. It promises to transform the “ad-hoc” delivery of Machine Learning applications into software engineering best practices.
We know the symptoms: Most machine-learning models trained in the lab perform poorly on real-world data [1, 2, 3, 4].
Machine Learning created profits in the year 2020 and will continue to increase profits in the future. …
Equivalent mappings of seventeen cloud services of the three top market share cloud vendors: Azure, AWS, and GCP, are described and compared. The exception is the Machine Learning services, where Google has many more complete Machine Learning SaaS (Software as a Service) offerings [1].
Cloud vendors continually add services. Diagrams is a work in progress, as not all services have been added yet (10/21/2020).
I do not discuss the category of security — a significant category for the cloud that is still evolving and needs a blog for itself.
I discuss Identity Management, which is secured by the authorized account.
Multiple accounts have been around since one of the first multi-process operating systems (MULTICS in the 1960s). …
Estimates state that 70%–85% of the world’s data is text (unstructured data) [1]. New deep learning language models (transformers) have caused explosive growth in industry applications [5, 6, 11].
This blog is not an article introducing you to Natural Language Processing. Instead, it assumes you are familiar with noise reduction and normalization of text. It covers text preprocessing up to producing tokens and lemmas from the text.
We stop at feeding the sequence of tokens into a Natural Language model.
The feeding of that sequence of tokens into a Natural Language model to accomplish a specific model task is not covered here.
In production-grade Natural Language Processing (NLP), fast text preprocessing (noise cleaning and normalization) is critical, and that is what this blog covers. …
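As a rough illustration of the noise cleaning and normalization steps mentioned above, here is a minimal sketch using only the Python standard library. The function names (clean_text, normalize_text, tokenize) and the particular cleaning rules are assumptions made for this example, not the article's actual code. Lemmatization is left out because it needs a third-party package such as spaCy or NLTK.

```python
import re
import unicodedata
from typing import List

# Patterns are compiled once at import time so repeated calls stay fast.
_HTML_TAG_RE = re.compile(r"<[^>]+>")
_URL_RE = re.compile(r"https?://\S+|www\.\S+")
_NON_WORD_RE = re.compile(r"[^a-z0-9\s']")
_WHITESPACE_RE = re.compile(r"\s+")


def clean_text(text: str) -> str:
    """Noise cleaning: drop HTML tags, then URLs (illustrative rules)."""
    text = _HTML_TAG_RE.sub(" ", text)
    return _URL_RE.sub(" ", text)


def normalize_text(text: str) -> str:
    """Normalization: Unicode NFKC, lowercase, strip punctuation, collapse spaces."""
    text = unicodedata.normalize("NFKC", text).lower()
    text = _NON_WORD_RE.sub(" ", text)
    return _WHITESPACE_RE.sub(" ", text).strip()


def tokenize(text: str) -> List[str]:
    """Clean, normalize, and split on whitespace to produce tokens."""
    return normalize_text(clean_text(text)).split()
```

A call such as tokenize("<p>Visit https://example.com NOW!</p>") yields the tokens visit and now. Which characters count as noise is a choice that depends on the downstream model, so treat the patterns as placeholders.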
It is code review time. Some of you would rather avoid the code review process. Whether you are new to programming or an experienced programmer, the code review is a shared learning experience for all involved. Rather than talk about “code review process best practices,” I share with you coding techniques I use to change code review from WTFs (What’s That For?) into WOWs (Wonderful! Oh! Wow!).
The anticipation of a code review process causes us to raise our game because we open up our code for other programmers to see (criticize). It may look, feel, and bark like criticism. And just maybe it is. But like a bar fight, it is a chance for you to grow and bond with your teammates. …
High-quality architecture diagrams of Azure, AWS, and GCP are rendered using the Python package Diagrams. Diagrams depends on the Graphviz runtime. This article shows step by step how to create a Docker image with Diagrams and Graphviz. All code is included and can be downloaded.
I have posted several articles on how to create development and test Docker images [see references 4, 5, and 6 below]. I assume you know of Docker and have read them.
Docker is used for encapsulating an individual image of your application.
Docker-Compose is used to manage several images at the same time for the same application. This tool offers the same features as Docker but allows you to have more complex applications. …
I share the Colab (and Jupyter) notebook Python code utilities used by our team.
If you don’t have a Google account, create one.
If you do not have a Colab account, create a Colab account by logging in with your Google account.
Use the same Google account for your Google Drive.
These are some of the Colab (and Jupyter) notebook Python code snippets used by our team.
As we continue to develop Machine Learning Operations (MLOps), we need to think of the Machine Learning (ML) development and deployment flow as something other than a pipeline.
The concept of a computing pipeline was around before the mainstream adoption of Machine Learning.
Software pipelines, which consist of a sequence of computing processes (commands, program runs, tasks, threads, procedures, etc.), conceptually executed in parallel, with the output stream of one process automatically fed as the input stream of the next one. The Unix system call pipe is a classic example of this concept. Source: https://en.wikipedia.org/wiki/Pipeline_(computing)
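The pipeline idea in the quote can be sketched with Python generators, where each stage consumes the output stream of the previous one. This is only an analogy: generators interleave lazily in a single thread rather than running as parallel processes, and the stage names (read_lines, grep, to_upper, run_pipeline) are hypothetical names chosen for this example.

```python
from typing import Iterable, Iterator, List


def read_lines(lines: Iterable[str]) -> Iterator[str]:
    """Stage 1: emit each line without its trailing newline."""
    for line in lines:
        yield line.rstrip("\n")


def grep(pattern: str, lines: Iterable[str]) -> Iterator[str]:
    """Stage 2: pass through only lines containing the pattern."""
    for line in lines:
        if pattern in line:
            yield line


def to_upper(lines: Iterable[str]) -> Iterator[str]:
    """Stage 3: transform each surviving line."""
    for line in lines:
        yield line.upper()


def run_pipeline(raw: Iterable[str]) -> List[str]:
    """Chain the stages, roughly like `cat file | grep error | tr a-z A-Z`."""
    return list(to_upper(grep("error", read_lines(raw))))
```

Each item flows through all three stages before the next item is read, so no stage has to hold the full data set in memory.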
Where did you first learn about pipelines in the context of Machine Learning? …
Estimates state that 70%–85% of the world’s data is text (unstructured data). Additionally, new deep learning language models (transformers) have caused explosive growth in industrial applications.
This blog is not an introduction to Natural Language Processing (NLP). The feeding of a sequence of tokens, created from the raw text, into different Natural Language models is not covered here. Instead, we focus on preprocessing text before it is input as tokens into a Natural Language model.
Raw text degrades NLP model performance unless a noise removal operation first deletes or transforms words in the text on the way to the sequence of tokens. Noise removal is usually NLP model dependent. …
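One way to picture noise removal being model dependent is to make the rules configurable per model. The sketch below is an assumption-laden illustration: NoiseConfig, remove_noise, and the two presets (BAG_OF_WORDS, CASED_TRANSFORMER) are hypothetical names, and the settings chosen for each preset are plausible defaults rather than recommendations from the article.

```python
import re
from dataclasses import dataclass
from typing import List

_PUNCT_RE = re.compile(r"[^\w\s]")
_DIGIT_RE = re.compile(r"\d+")


@dataclass(frozen=True)
class NoiseConfig:
    """Which noise removal steps to apply (hypothetical options)."""
    lowercase: bool = True
    strip_punctuation: bool = True
    strip_digits: bool = False


def remove_noise(text: str, config: NoiseConfig) -> List[str]:
    """Apply the configured steps, then split on whitespace into tokens."""
    if config.lowercase:
        text = text.lower()
    if config.strip_punctuation:
        text = _PUNCT_RE.sub(" ", text)
    if config.strip_digits:
        text = _DIGIT_RE.sub(" ", text)
    return text.split()


# Illustrative presets: an aggressive one and a near pass-through one.
BAG_OF_WORDS = NoiseConfig(lowercase=True, strip_punctuation=True, strip_digits=True)
CASED_TRANSFORMER = NoiseConfig(lowercase=False, strip_punctuation=False)
```

With the same input, "Call me at 5pm, OK?", the first preset produces call, me, at, pm, ok, while the second leaves case, digits, and punctuation attached to the tokens for the model's own tokenizer to handle.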
We used Python predominantly (95%) over the last seven years because:
We used C to speed up Python when Numba could not be used. We tried Go, but it did not work out.
4. Python GIL (lack of concurrency on multicore machines) is bypassed more and more each day by the cloud, Spark, package implementation (e.g., XGBoost), and strong typing with the introduction of type hinting starting in Python 3.5. …
The Coronavirus has existed for a millennium alongside humans, infecting and passing between them. Coronaviruses have also frequently crossed species barriers, and some have emerged as important human pathogens. Those virus variants died off, becoming extinct because they killed the host.
Ancient deadly Coronavirus died off because the virus infection killed the host, and only immune hosts survived. We can assume that COVID-19 is a relatively new mutation of Coronavirus.
Most modern human Coronaviruses originated in bats, where they are non-pathogenic (to humans or bats), resulting in symptoms no more severe than a cold.
So how did the deadly COVID-19 virus mutate from the Coronavirus? …