zagandesign - Fotolia
With the release of SQL Server 2016, Microsoft introduced R Services, an add-on to the database engine that integrated...
R scripting into the SQL Server environment. Microsoft has further expanded analytics services in SQL Server 2017 by adding support for the Python programming language, making it possible to create and run Python scripts in a SQL Server database.
As part of the Python integration update, Microsoft has renamed R Services to Machine Learning Services. But SQL Server 2017 Python was built on the original R Services foundation to take advantage of the extensibility framework already in place to open up the database to the large Python user community.
Let's look at how to implement this new feature, which is available starting with the SQL Server 2017 Community Technology Preview 2.0 release, and which will be supported in all editions of the database.
Known for its readability and straightforward syntax, Python is used for a variety of purposes, ranging from back-end web development to advanced data analytics. By incorporating data from SQL Server into Python scripts, you can bypass many of the complexities and risks that come with moving data from SQL Server to a third-party environment, while also taking advantage of SQL Server's advanced performance and security capabilities.
You can also incorporate any open source Python package into your SQL Server installation, enabling you to integrate the latest advanced analytics and machine learning logic into your applications. In addition, Microsoft offers the revoscalepy module, a Python library that supports distributed computing applications involving high-performance machine learning algorithms, plus data transformation and visualization capabilities.
Python installation in SQL Server 2017
To install a SQL Server 2017 Python instance, you must select the Database Engine Services option during setup and, as part of that, the Machine Learning Services (In-Database) option. When enabling Machine Learning Services, you must also specify the Python language. You can select the R language as well, but Microsoft recommends that you install only one language per instance.
After you've selected the necessary options, SQL Server setup installs several components specific to Python and Machine Learning Services, including the Python 3.5 runtime and interpreter, which runs independently from other SQL Server components.
SQL Server setup also installs several Python packages common to data science applications, as well as the Anaconda distribution of Python, using the Anaconda 3 installers in the Anaconda 4.3 branch. Anaconda is a free data science platform that provides tools for managing Python packages and the overall Python environment.
Another component that gets installed is SQL Server Trusted Launchpad, a Windows service that manages communication between the Python runtime and the SQL Server instance. It was introduced in SQL Server 2016 to support R Services, but has been expanded in SQL Server 2017 to include Python.
SQL Server setup also installs the BxlServer executable for managing data transfers, as well as the SQL Satellite extensibility API, which works in conjunction with BxlServer to carry out such tasks as reading input data, writing output data and handling errors.
Using the SQL Server Python architecture
With the new and updated components installed, you can run Python scripts in SQL Server by calling the sp_execute_external_script system stored procedure, the same one used to run R scripts. The main difference is that you must specify the Python language when calling the stored procedure, rather than specifying R. Application developers need only call the stored procedure, or a parent procedure that contains it, to run a Python script and return the processed data.
The Python runtime is executed in a process separate from the database engine to maximize performance and avoid any impact on database operations. When you run a script, SQL Server contacts the Trusted Launchpad service, which then launches the Python35 process. BxlServer works with the Python runtime to control how data is exchanged, using SQL Satellite to communicate the status and results to SQL Server.
To carry out communications between components, SQL Server uses named pipes for internal data transport between BxlServer and the database engine, plus TCP/IP to connect to SQL Satellite. External data science clients use Open Database Connectivity to connect to SQL Server.
To be able to run a Python script in SQL Server, an individual or application must have a SQL Server login or Windows user account, either of which must be associated with a principal entity that has the security permissions necessary to access the database where the required data resides. When a user tries to run a script, the database engine retrieves the security context of that user's login or account and, from there, manages access to the securable objects.
Database administrators can control how the Python runtime behaves on a SQL Server instance and which users can run Python scripts.
Adding the ability to run Python scripts represents a significant milestone for Microsoft and demonstrates just how far the company has come in embracing outside technologies. Python has been extensively implemented within a wide range of applications. Microsoft, and its users, could reap big benefits from SQL Server 2017 Python, assuming the rest of the world is willing to play along.
The most important updates to SQL Server 2017
Which version of SQL Server is right for you?
A guide to SQL Server on Linux
Python and R highlight SQL Server revision