Brief Recap:
In the previous article, we have
discussed how we can execute Python scripts from SSMS directly. We have
discussed what installation steps we need to follow, what configuration changes
we need to make, and how to run a simple Python script in SSMS. We mainly discussed how to use the procedure ‘sp_execute_external_script’. In
this article, we will see how we can run Python scripts on data residing in SQL Server through a client.
Setup of Client:
A client in simple terms is a computer
that uses the resources of a remote computer. One good example is ATMs, ATMs we
use are clients, and once we request any action like to check account balance this client (that is ATM) will
interact with Bank’s servers and process our request and give us the
information requested. Same way here we will see how we can execute a python
script from a client on SQL Server data. So first we need to set up a
client-server environment.
Installation of Client:
When we select the Machine Learning
Service option (as described in the previous article) while installing SQL
Server, all the required Python libraries for SQL Server gets installed
automatically. Now to set up the client, we need to install “revoscalepy” package.
“revoscalepy” is one of the modules of Python which contains several functions,
these functions are useful in performing actions like importing, transforming, and analyzing normalized data. A lot more also can be done using this module
but we will stick with querying SQL Server data from python client. There is
another module named ‘microsoftml’ as well for python which we can use and the functions of this module are mostly used in machine learning algorithms.
We can download the installation
script from here: "https://aka.ms/mls-py" (This location is
provided by Microsoft in this article.)
After downloading the script, open
PowerShell as an Administrator, in PowerShell window, go to the folder where the
above script has been downloaded. I have downloaded it to ‘C:\Temp’ folder, so
in PowerShell, I went to that location and ran below command which installs the
package.
If your PowerShell execution policy is
set to Allsigned or Remotesigned then you
will receive below error once you run above command. What it means is your
PowerShell script is not signed by any trusted publisher. To overcome this
error we need to either turn off the execution policy permanently or temporarily.
You can check your current execution
policy by using the below command:
Get-ExecutionPolicy
Below is my current policy:
As this script is provided by
Microsoft we can go ahead and execute it, but at the same time, we don’t want
to be at risk by disabling the policy permanently. So to temporarily disable we
can bypass the policy just for this script execution using below command.
Set-ExecutionPolicy
-Scope Process -ExecutionPolicy Bypass
After that let's execute the above
installation script again.
Now the installation will start and it
will download files required and install. My installation took around 20 mins to
complete.
Opening the installed IDE :
After installation, we need to
write the python scripts somewhere and for that, we will use a IDE jupyter-notebook. IDE(Integrated Development Environment), in simple terms it’s
an application for writing and testing code. One advantage of it is we don’t
need to install the application and we can write code directly. Example “https://rextester.com/l/sql_server_online_compiler” this
is one of the SQL Server IDE, in this, we can directly write and test SQL
queries.
Similarly, Jupyter is one of the
Python IDE which will be useful in writing and testing python code. There are
many other Python IDE’s as well. When we select Microsoft learning Services during SQL Server installation, the jupyter
notebook also will get installed automatically. We will use this IDE to run the
python scripts. To open it go to the ‘C:\Program
Files\Microsoft\PyForMLS\Scripts’ location (this is the default installation
folder of the above package) and double click on ‘jupyter-notebook.exe’ or open
the Jupyter-notebook using below command
This opens a new browser window like
this:
Now let’s open a new notebook like below to write our python scripts:
Script Execution:
I already created a database named
“TestScript” in my instance and it has a table named “PythonTestData” with
below 3 rows in it.
The output from SSMS:
Now let’s write our first python script. Our goal is to print the “PythonTestData” SQL table rows through a python script.
import
pyodbc #importing python ODBC module, which is required for accessing ODBC
databases.
#Below
we are creating a connection string with driver type "SQL Server",
"Server" our instance name.
con_str
= "Driver=SQL
Server;Server=<Instance Name>;Database={0};Trusted_Connection=Yes;"
con_str
= pyodbc.connect(con_str.format("TestScript"), autocommit=True)
#Creation
of cursor
cursor
= con_str.cursor()
cursor.execute("SELECT
* FROM PythonTestData")
rows
= cursor.fetchall()
#Using
for loop to retrieve all the rows from table
for
row in rows:
print(row.col1)
Now copy this script in the
jupyter-notebook we opened earlier and execute the script by clicking on the symbol.
Script output in jupyter notebook:
Script output in jupyter notebook:
Once the script will execute we can
see output which displays all the 3 rows of the “PythonTestData” table. We have
successfully fetched data from the SQL Server table using a python script.
This is a very basic and first step in
running python scripts on SQL Server tables (i.e., on relational data). These
Python packages and modules are used in machine learning algorithms for
performing image analysis, text analysis, also used in visualizations, data
transformations, manipulations, and so on.
Let me know in the comments section
below which machine learning modules\functions you already used and how they
are helpful.
Thanks VV!!
No comments:
Post a Comment