1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
|
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/plpython.sgml,v 1.6 2001/11/20 21:09:53 momjian Exp $ -->
<chapter id="plpython">
<title>PL/Python - Python Procedural Language</title>
<indexterm zone="plpython"><primary>PL/Python</></>
<indexterm zone="plpython"><primary>Python</></>
<note>
<para>
This chapter is not fully developed yet.
</para>
<para>
The current version of PL/Python functions as a trusted language only;
access to the filesystem and other local resources are disabled.
Specifically, PL/Python uses the Python restricted execution environment,
further restricts it to prevent the use of the file open call, and
allows only modules from a specific list to be imported. Presently,
that list includes: array, bisect, binascii, calendar, cmath, codecs,
errno, marshal, math, md5, mpz, operator, pcre, pickle, random, re,
regex, sre, sha, string, StringIO, struct, time, whrandom, and zlib.
</para>
<para>
There is discussion on creating an untrusted language variant for a
future release.
</para>
<para>
In the current version, any postgresql error encountered while running
a PL/Python function will result in the immediate termination of that
function by the backend. It is not possible to trap error conditions
using Python try ... catch constructs. For example, a syntax error in
an SQL statement passed to the plpy.execute() call will terminate the
function. This behavior may be changed in a future release.
</para>
</note>
<sect1 id="plpython-install">
<title>Installation</title>
<para>
... needs to be worked out.
</para>
</sect1>
<sect1 id="plpython-using">
<title>Using</title>
<para>
There are sample functions in
<filename>plpython_function.sql</filename>. The Python code you
write gets transformed into a function. E.g.,
<programlisting>
CREATE FUNCTION myfunc(text) RETURNS text AS
'return args[0]'
LANGUAGE 'plpython';
</programlisting>
gets transformed into
<programlisting>
def __plpython_procedure_myfunc_23456():
return args[0]
</programlisting>
where 23456 is the Oid of the function.
</para>
<para>
If you do not provide a return value, Python returns the default
<symbol>None</symbol> which may or may not be what you want. The
language module translates Python's None into SQL NULL.
</para>
<para>
PostgreSQL function variables are available in the global
<varname>args</varname> list. In the <function>myfunc</function>
example, <varname>args[0]</> contains whatever was passed in as the text
argument. For <literal>myfunc2(text, integer)</literal>, <varname>args[0]</>
would contain the <type>text</type> variable and <varname>args[1]</varname> the <type>integer</type> variable.
</para>
<para>
The global dictionary SD is available to store data between
function calls. This variable is private static data. The global
dictionary GD is public data, available to all python functions
within a backend. Use with care.
</para>
<para>
Each function gets its own restricted execution object in the
Python interpreter, so that global data and function arguments from
<function>myfunc</function> are not available to
<function>myfunc2</function>. The exception is the data in the GD
dictionary, as mentioned above.
</para>
<para>
When a function is used in a trigger, the dictionary TD contains
transaction related values. The trigger tuples are in <literal>TD["new"]</>
and/or <literal>TD["old"]</> depending on the trigger event. <literal>TD["event"]</>
contains the event as a string (<literal>INSERT</>, <literal>UPDATE</>, <literal>DELETE</>, or
<literal>UNKNOWN</>). TD["when"] contains one of (<literal>BEFORE</>, <literal>AFTER</>, or
<literal>UNKNOWN</>). <literal>TD["level"]</> contains one of <literal>ROW</>, <literal>STATEMENT</>, or
<literal>UNKNOWN</>. <literal>TD["name"]</> contains the trigger name, and <literal>TD["relid"]</>
contains the relation id of the table on which the trigger occurred.
If the trigger was called with arguments they are available
in <literal>TD["args"][0]</> to <literal>TD["args"][(n -1)]</>.
</para>
<para>
If the trigger <quote>when</quote> is <literal>BEFORE</>, you may return <literal>None</literal> or <literal>"OK"</literal>
from the Python function to indicate the tuple is unmodified,
<literal>"SKIP"</> to abort the event, or <literal>"MODIFIED"</> to indicate you've
modified the tuple.
</para>
<para>
The PL/Python language module automatically imports a Python module
called <literal>plpy</literal>. The functions and constants in
this module are available to you in the Python code as
<literal>plpy.<replaceable>foo</replaceable></literal>. At present
<literal>plpy</literal> implements the functions
<literal>plpy.error("msg")</literal>,
<literal>plpy.fatal("msg")</literal>,
<literal>plpy.debug("msg")</literal>, and
<literal>plpy.notice("msg")</literal>. They are mostly equivalent
to calling <literal>elog(<replaceable>LEVEL</>, "msg")</literal>,
where <replaceable>LEVEL</> is DEBUG, ERROR, FATAL or NOTICE.
<function>plpy.error</function> and <function>plpy.fatal</function>
actually raise a Python exception which, if uncaught, causes the
PL/Python module to call <literal>elog(ERROR, msg)</literal> when
the function handler returns from the Python interpreter. Long
jumping out of the Python interpreter is probably not good.
<literal>raise plpy.ERROR("msg")</literal> and <literal>raise
plpy.FATAL("msg")</literal> are equivalent to calling
<function>plpy.error</function> or <function>plpy.fatal</function>.
</para>
<para>
Additionally, the <literal>plpy</literal> module provides two functions called
<function>execute</function> and <function>prepare</function>.
Calling <function>plpy.execute</function> with a query string, and
an optional limit argument, causes that query to be run, and the
result returned in a result object. The result object emulates a
list or dictionary object. The result object can be accessed by
row number, and field name. It has these additional methods:
<function>nrows()</function> which returns the number of rows
returned by the query, and <function>status</function> which is the
<function>SPI_exec</function> return variable. The result object
can be modified.
<programlisting>
rv = plpy.execute("SELECT * FROM my_table", 5)
</programlisting>
returns up to 5 rows from my_table. Ff my_table has a column
my_field it would be accessed as
<programlisting>
foo = rv[i]["my_field"]
</programlisting>
The second function <function>plpy.prepare</function> is called
with a query string, and a list of argument types if you have bind
variables in the query.
<programlisting>
plan = plpy.prepare("SELECT last_name FROM my_users WHERE first_name = $1", [ "text" ])
</programlisting>
text is the type of the variable you will be passing as $1. After
preparing you use the function <function>plpy.execute</function> to
run it.
<programlisting>
rv = plpy.execute(plan, [ "name" ], 5)
</programlisting>
The limit argument is optional in the call to
<function>plpy.execute</function>.
</para>
<para>
When you prepare a plan using the PL/Python module it is
automatically saved. Read the SPI documentation (<xref
linkend="spi">) for a description of what this means. The take
home message is if you do
<programlisting>
plan = plpy.prepare("SOME QUERY")
plan = plpy.prepare("SOME OTHER QUERY")
</programlisting>
you are leaking memory, as I know of no way to free a saved plan.
The alternative of using unsaved plans it even more painful (for
me).
</para>
</sect1>
</chapter>
|