In our previous post, we reviewed some of the main sub-fields within AI as well as what AI requires from a programming language. In this post, we’ll discuss Python’s critical role in current AI software.
Laying the Foundation with NumPy
At the end of Part 1, we asked whether Python supports the matrix operations, advanced statistics and other high-level math that AI requires. NumPy answers that question in the affirmative: it is a Python module which provides (among other things) “a powerful N-dimensional array object”. Multidimensional arrays may not sound revolutionary, but they are essential to AI software.
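To make the idea of an N-dimensional array concrete, here is a minimal sketch of the kind of matrix and statistics operations NumPy handles. The feature values and weights are made up purely for illustration.

```python
import numpy as np

# Hypothetical data: 2 samples (rows) with 3 features each (columns).
features = np.array([[1.0, 2.0, 3.0],
                     [4.0, 5.0, 6.0]])

# Hypothetical weights, one per feature.
weights = np.array([0.5, -1.0, 2.0])

# Matrix-vector product: one weighted sum per sample.
scores = features @ weights

# Column-wise statistics: the mean and standard deviation of each feature.
col_means = features.mean(axis=0)
col_stds = features.std(axis=0)

print(scores)     # weighted sum for each sample
print(col_means)  # mean of each feature
print(col_stds)   # standard deviation of each feature
```

Operations like these run over whole arrays at once, with no explicit Python loops, which is a large part of why the higher-level libraries build on NumPy.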
Case in point: SciPy, which “is a collection of mathematical algorithms and convenience functions built on the Numpy extension of Python” and provides developers with “high-level commands and classes for manipulating and visualizing data.” SciPy uses NumPy’s array data structure to give Python developers the data-processing power they need to feed, test and implement machine learning algorithms and to run statistical analysis on large, real-world datasets.
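As a small illustration of SciPy working directly on NumPy arrays, the sketch below fits a straight line to synthetic data with `scipy.stats.linregress`. The data is generated on the spot (a line with slope 2 and intercept 1, plus a little noise) and is an assumption of this example, not a real dataset.

```python
import numpy as np
from scipy import stats

# Synthetic data (an assumption for this example): y = 2x + 1 plus small noise.
rng = np.random.default_rng(0)
x = np.arange(50, dtype=float)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)

# Ordinary least-squares fit of a straight line to the NumPy arrays.
result = stats.linregress(x, y)

print("slope:", result.slope)
print("intercept:", result.intercept)
print("correlation coefficient:", result.rvalue)
```

The recovered slope and intercept should land close to the values used to generate the data.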
AI Software Built with Python
That foundation enables developers to create higher-level machine learning modules like scikit-learn, which is built on top of NumPy and SciPy. Scikit-learn provides an API for basic machine learning tasks such as classification, regression, clustering, dimensionality reduction, model selection and feature extraction. Evernote used this NumPy/SciPy/scikit-learn stack to automatically classify recipes in its food app via a supervised machine learning algorithm.
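To give a feel for supervised classification in scikit-learn, here is a minimal sketch that labels short notes as recipes or not. The tiny training set, the labels, and the choice of a bag-of-words model with naive Bayes are all assumptions made for illustration; this is not a description of how Evernote built its classifier.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical training notes and labels, invented for this example.
train_texts = [
    "mix flour sugar and eggs then bake",
    "simmer the tomato sauce and add basil",
    "chop onions and fry in olive oil",
    "meeting notes for the quarterly budget review",
    "flight itinerary and hotel booking confirmation",
    "project deadline moved to next friday",
]
train_labels = ["recipe", "recipe", "recipe", "other", "other", "other"]

# Feature extraction (word counts) followed by a naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

# Classify two unseen notes.
predictions = model.predict([
    "bake the eggs with flour",
    "budget meeting on friday",
])
print(list(predictions))
```

A real classifier would need far more training data and a proper evaluation step, which is where scikit-learn's model selection tools come in.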
Speaking of classification (pun intended), the Natural Language Toolkit (NLTK) is a powerful platform for performing Natural Language Processing (NLP) on text. Like SciPy, it uses NumPy under the hood for its multidimensional arrays and linear algebra, which are “required for certain probability, tagging, clustering, and classification tasks”. NLTK, like NumPy, SciPy and scikit-learn, is open source and cross-platform.
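As a quick taste of NLTK, the sketch below tokenizes a sentence and counts word and word-pair frequencies. The sample text is made up, and we deliberately stick to `wordpunct_tokenize`, a regex-based tokenizer that should not need any extra NLTK data downloads.

```python
from nltk import FreqDist, bigrams
from nltk.tokenize import wordpunct_tokenize

# Sample text invented for this example.
text = "Python is popular for AI. Python is also popular for data analysis."

# Split into tokens, lowercase them, and drop punctuation.
tokens = [t.lower() for t in wordpunct_tokenize(text) if t.isalpha()]

# Frequency of each word and of each pair of adjacent words.
word_freq = FreqDist(tokens)
pair_freq = FreqDist(bigrams(tokens))

print(word_freq["python"])
print(pair_freq[("python", "is")])
```

Frequency counts like these are a common first step before the tagging and classification tasks mentioned above.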
Building Bridges
Python isn’t just used to build AI software; it’s also a popular choice for APIs and bindings to other AI-related software libraries. Google’s TensorFlow, a large-scale machine learning system, is a great example. In fact, the Python API for TensorFlow is the only one with guaranteed stability, largely because Google’s own internal devs are so familiar with Python.
Python bindings are also available for the CMU Sphinx speech recognition toolkit. Speech recognition is the conversion of spoken natural language into text. CMU Sphinx is cross-platform and open source (under a BSD license), putting the power of human-to-machine voice interaction into the hands of anyone with a Raspberry Pi (according to one enterprising hacker, the Pi 3 Model B gives the best results).
The open-source computer vision library OpenCV features a Python interface (in addition to Java, C/C++ and MATLAB) and is also released under a BSD license. The project provides excellent tutorials on using this Python interface for feature detection, object tracking and even 3D reconstruction from 2D images.
Whether it’s providing powerful and flexible data structures, powering machine learning algorithms or providing an easy-to-use interface, Python is pushing the AI revolution forward.
Copyright © Python People