Archive for April 14, 2010

Child processes in Python: subprocess

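Like my post on Python built-in functions, everything here comes from the official Python documentation, filtered through my own understanding rather than translated word for word. If I have misunderstood something, corrections are welcome. Thanks.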

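Starting with Python 2.4, you can use the subprocess module to spawn child processes, connect to their standard input/output/error, and obtain their return codes. subprocess is intended to replace several older modules and functions, such as: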

os.system
os.spawn*
os.popen*
popen2.*
commands.*

The sections below show how to replace each of these functions and modules with subprocess.

Using the subprocess module

The module defines one class: Popen

class subprocess.Popen(args, bufsize=0, executable=None, stdin=None, stdout=None, stderr=None, preexec_fn=None, close_fds=False, shell=False, cwd=None, env=None, universal_newlines=False, startupinfo=None, creationflags=0)

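The parameters have the following meanings:

args should be a string, or a list of program arguments. The program to execute is normally the first item of the list, or the string itself, but it can also be named explicitly with the executable parameter. When executable is given, the first item of args is still treated as the program's "command name". This can differ from the actual file name of the executable; it is a display name, for example the name shown by the ps command on *nix.

On *nix, with shell=False (the default), Popen uses os.execvp() to run the child program. args should normally be a list. If args is a string, it is taken as the name or path of the program to execute, so no arguments can be passed that way.

Note:
shlex.split() can be used to split a complex command line into an argument list, for example: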

>>> import shlex, subprocess
>>> command_line = raw_input()
/bin/vikings -input eggs.txt -output "spam spam.txt" -cmd "echo '$MONEY'"
>>> args = shlex.split(command_line)
>>> print args
['/bin/vikings', '-input', 'eggs.txt', '-output', 'spam spam.txt', '-cmd', "echo '$MONEY'"]
>>> p = subprocess.Popen(args) # Success!

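As you can see, space-separated options (such as -input) and arguments (such as eggs.txt) become separate list items, while text containing quoted or escaped spaces (such as "spam spam.txt") stays together as a single item. This is close to how most shells behave.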
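The session above cannot be replayed as-is, because /bin/vikings does not exist on most machines. The following is a minimal sketch of the same idea, written for Python 3 and assuming the current interpreter (sys.executable) as a stand-in child program that simply prints the arguments it received; the CHILD script is made up for illustration.

```python
import shlex
import subprocess
import sys

# Stand-in child program (an assumption for illustration):
# print every argument it receives, separated by "|".
CHILD = "import sys; print('|'.join(sys.argv[1:]))"

command_line = '-input eggs.txt -output "spam spam.txt"'
args = [sys.executable, "-c", CHILD] + shlex.split(command_line)

p = subprocess.Popen(args, stdout=subprocess.PIPE, universal_newlines=True)
received = p.communicate()[0].strip()
print(received)
```

The quoted file name reaches the child as one argument, spaces included.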

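On *nix, with shell=True: if args is a string, the shell interprets and executes that string. If args is a list, the first item is taken as the command string and the remaining items are treated as additional arguments to the shell itself. In other words, it is equivalent to: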

Popen(['/bin/sh', '-c', args[0], args[1], ...])
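A small sketch of what "arguments to the shell itself" means in practice, written for Python 3 and POSIX only: the extra list items show up inside the command string as the shell's positional parameters $0, $1 and so on. The helper name shell_list_demo is made up for this example.

```python
import os
import subprocess


def shell_list_demo():
    # POSIX only: the first item is the script given to /bin/sh -c,
    # the remaining items become $0, $1, ... inside that script.
    p = subprocess.Popen(["echo $0 $1", "first", "second"],
                         shell=True, stdout=subprocess.PIPE)
    return p.communicate()[0]


if os.name == "posix":
    print(shell_list_demo())
```

This is rarely what you want, which is a good reason to pass a plain string whenever shell=True is used.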


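On Windows, Popen runs the child program with CreateProcess(), which takes a string. If args is a list, it is first converted to a string with list2cmdline().

bufsize, if given, has the same meaning as the corresponding argument of the built-in open(): 0 means unbuffered, 1 means line buffered, any other positive value means a buffer of approximately that many bytes, and a negative value means the system default. The default is 0.

The executable parameter specifies the program to execute. It is rarely needed, because the program can normally be given through args. With shell=True, executable can be used to choose which shell to run (bash, csh, zsh and so on). On *nix the default shell is /bin/sh; on Windows it is the value of the COMSPEC environment variable. On Windows you only need shell=True when the command you want to run is really a shell built-in (such as dir or copy); you do not need it to run a command-line batch script.

stdin, stdout and stderr specify the child program's standard input, standard output and standard error. Valid values are PIPE (described below), an existing file descriptor (a positive integer), an existing file object, and None. PIPE means a new pipe should be created. With None, no redirection takes place and the child inherits the parent's file descriptors. In addition, stderr can be STDOUT (see below), which sends the child's standard error to the same place as its standard output.

If preexec_fn is set to a callable object (a function, for example), it is called in the child process just before the child program is executed. (*nix only)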
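As a hedged illustration of preexec_fn (Python 3, POSIX only), the sketch below passes os.setsid so that the child becomes the leader of a new session before the program starts. The child script and the helper name run_in_new_session are assumptions made for this example.

```python
import os
import subprocess
import sys

# Child script (illustrative): report whether this process leads its own session.
CHILD = "import os; print(os.getsid(0) == os.getpid())"


def run_in_new_session():
    p = subprocess.Popen([sys.executable, "-c", CHILD],
                         preexec_fn=os.setsid,  # called in the child before the program runs
                         stdout=subprocess.PIPE, universal_newlines=True)
    return p.communicate()[0].strip()


if os.name == "posix":
    print(run_in_new_session())
```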

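If close_fds is True, on *nix all file descriptors except 0, 1 and 2 are closed before the child process is executed. On Windows, the child likewise inherits no other handles.

If shell is True, the given command is interpreted and executed by the shell, as described in detail above.

If cwd is not None, it becomes the child's current directory. Note that this directory is not searched for the executable, so you cannot give the program's path relative to cwd.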
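A minimal sketch of cwd, written for Python 3: the child is started in a temporary directory and reports its own working directory. The temporary directory and the one-line child script exist only for this illustration.

```python
import os
import subprocess
import sys
import tempfile

workdir = tempfile.mkdtemp()   # throwaway directory, just for this example

p = subprocess.Popen([sys.executable, "-c", "import os; print(os.getcwd())"],
                     cwd=workdir, stdout=subprocess.PIPE, universal_newlines=True)
child_cwd = p.communicate()[0].strip()

# Compare resolved paths, since the temp directory may sit behind a symlink.
same_dir = os.path.realpath(child_cwd) == os.path.realpath(workdir)
os.rmdir(workdir)
print(same_dir)
```

The program itself is still located through sys.executable, an absolute path, not through cwd.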

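If env is not None, it defines the child's environment variables, replacing the default behaviour of inheriting the parent's environment. Note that even if env defines only a single variable, the child still gets none of the parent's other variables (that is, if env has one entry, the child has exactly one environment variable). For example: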

>>> subprocess.Popen('env', env={'xxx':'123', 'yyy':'zzz'})
<subprocess.Popen object at 0xb694112c>
>>> xxx=123
yyy=zzz
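When the goal is to add a variable rather than replace the whole environment, a common approach is to pass a copy of os.environ with the extra entries. The sketch below (Python 3, POSIX only) contrasts the two; PARENT_ONLY and child_view are hypothetical names used only here.

```python
import os
import subprocess

os.environ["PARENT_ONLY"] = "1"   # hypothetical variable that only the parent sets


def child_view(env):
    # The shell prints $xxx and either $PARENT_ONLY or the word "unset".
    p = subprocess.Popen('echo "$xxx:${PARENT_ONLY:-unset}"', shell=True, env=env,
                         stdout=subprocess.PIPE, universal_newlines=True)
    return p.communicate()[0].strip()


if os.name == "posix":
    print(child_view({"xxx": "123"}))               # parent's variables are gone
    print(child_view(dict(os.environ, xxx="123")))  # parent's variables plus one more
```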

If universal_newlines is True, the child's stdout and stderr are treated as text, and all line endings are seen as '\n', whether they are the *nix convention ('\n'), the old Mac convention ('\r') or the Windows convention ('\r\n').
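The newline translation can be observed with a child that deliberately writes all three conventions. This is a sketch for Python 3 (the child uses sys.stdout.buffer to emit raw bytes); the child script is made up for the demonstration.

```python
import subprocess
import sys

# The child writes raw bytes using three different line endings.
CHILD = r"import sys; sys.stdout.buffer.write(b'a\r\nb\rc\n')"

p = subprocess.Popen([sys.executable, "-c", CHILD],
                     stdout=subprocess.PIPE, universal_newlines=True)
text = p.communicate()[0]
print(repr(text))
```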

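startupinfo and creationflags, if given, are passed to the underlying CreateProcess() function. They can specify various other properties of the child program, such as the appearance of its main window or its priority. (Windows only)

With the Popen parameters covered, here are two small constants: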

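subprocess.PIPE
A special value that can be used as the stdin, stdout or stderr argument of Popen, meaning that a new pipe should be created.

subprocess.STDOUT
A special value that can be used as the stderr argument of Popen, meaning that the child's standard error should be merged into its standard output.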
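Both constants can be seen together in a short sketch (Python 3). The child script, an assumption for this example, writes one line to each stream; the parent reads both from a single pipe.

```python
import subprocess
import sys

# Child script (illustrative): one line to stdout, then one line to stderr.
CHILD = ("import sys; sys.stdout.write('to stdout\\n'); sys.stdout.flush(); "
         "sys.stderr.write('to stderr\\n')")

p = subprocess.Popen([sys.executable, "-c", CHILD],
                     stdout=subprocess.PIPE,    # create a new pipe for stdout
                     stderr=subprocess.STDOUT,  # send stderr into that same pipe
                     universal_newlines=True)
merged, err = p.communicate()
print(merged.splitlines(), err)
```

Because stderr was redirected rather than piped separately, the second element returned by communicate() is None.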

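Convenience functions

subprocess.call(*popenargs, **kwargs)
Runs the command, waits for it to finish, then returns the child's return code. The arguments are the same as for Popen: open /usr/lib/python2.6/subprocess.py and you will see that, with the docstring removed, it is simply: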

def call(*popenargs, **kwargs):
    return Popen(*popenargs, **kwargs).wait()
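A quick usage sketch for call(), written for Python 3, with the current interpreter standing in as a child that exits with a chosen status:

```python
import subprocess
import sys

# The child exits with status 3; call() waits for it and returns that status.
retcode = subprocess.call([sys.executable, "-c", "import sys; sys.exit(3)"])
print(retcode)
```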


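subprocess.check_call(*popenargs, **kwargs)
Runs the command just like call() and checks the return code. If the child returns non-zero, CalledProcessError is raised; the exception has a returncode attribute holding the child's return code.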

>>> subprocess.check_call('false')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/subprocess.py", line 498, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'false' returned non-zero exit status 1

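Exceptions

Exceptions raised in the child process before the new program has started to execute are re-raised in the parent. The exception also carries an extra attribute named child_traceback, a string containing the traceback from the child's side.

The most common exception is OSError, raised for example when you try to execute a program that does not exist.

In addition, calling Popen with invalid arguments raises ValueError.

check_call() also raises CalledProcessError when the child program returns non-zero.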
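The two failure modes most likely to show up in practice can be told apart as in the sketch below (Python 3). The helper try_run and the path of the missing program are hypothetical.

```python
import subprocess
import sys


def try_run(args):
    """Return a short description of how running args ended."""
    try:
        subprocess.check_call(args)
        return "ok"
    except OSError:
        # The program could not be started at all.
        return "OSError"
    except subprocess.CalledProcessError as e:
        # The program ran but returned a non-zero status.
        return "exit status %d" % e.returncode


missing = try_run(["/no/such/program-hypothetical"])
failed = try_run([sys.executable, "-c", "import sys; sys.exit(2)"])
print(missing, "/", failed)
```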

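Security

Unlike some other popen functions, this module never calls /bin/sh behind your back to interpret the command. Every character of the command, shell metacharacters included, is therefore passed safely to the child process.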
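To make this concrete, here is a sketch (Python 3) in which a string full of shell metacharacters is handed to a child as one list item. The "hostile" string is a made-up example of untrusted input, and the child simply prints its first argument.

```python
import subprocess
import sys

hostile = "spam; echo INJECTED $HOME"   # hypothetical untrusted input

p = subprocess.Popen([sys.executable, "-c", "import sys; print(sys.argv[1])", hostile],
                     stdout=subprocess.PIPE, universal_newlines=True)
echoed = p.communicate()[0].rstrip("\n")
print(echoed == hostile)
```

No shell ever sees the string, so the ";" and "$HOME" arrive in the child unchanged. The same string passed with shell=True would be interpreted by the shell.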

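Popen objects

Popen objects have the following methods:

Popen.poll()
Checks whether the child process has terminated; sets and returns the returncode attribute.

Popen.wait()
Waits for the child process to terminate; sets and returns the returncode attribute.

Note: if the child writes so much data to a stdout or stderr pipe that the OS pipe buffer fills up, the child blocks until the parent reads from the pipe. If the parent is sitting in wait() at that moment, the result is a deadlock: each side waits for the other forever. Use communicate() to avoid this.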
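The difference between poll() and wait() shows up in a small sketch (Python 3). No pipes are involved here, so the deadlock described above cannot occur; the one-second sleep is an arbitrary choice for the example.

```python
import subprocess
import sys

p = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(1)"])

first_poll = p.poll()   # returns at once; None while the child is still running
final = p.wait()        # blocks until the child exits, then returns its status
print(first_poll, final, p.returncode)
```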

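Popen.communicate(input=None)
Interacts with the child process: sends data to stdin and reads data from stdout and stderr until end-of-file is reached, then waits for the child to terminate. The optional input argument, if given, must be a string.
The function returns a tuple: (stdoutdata, stderrdata).
Note that to send data to the child's stdin, the Popen object must be created with stdin=PIPE. Similarly, to receive data, stdout or stderr must be PIPE as well.

Note: the data read is buffered in memory, so be careful when the amount of data is very large.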
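A sketch of communicate() in both directions, written for Python 3 (universal_newlines=True is used so that input and output are text strings). The child script is an assumption for the example: it upper-cases whatever arrives on stdin. The payload is chosen to be larger than a typical pipe buffer, which is the situation where write-then-wait() by hand would risk a deadlock.

```python
import subprocess
import sys

# Child script (illustrative): copy stdin to stdout in upper case.
CHILD = "import sys; sys.stdout.write(sys.stdin.read().upper())"

p = subprocess.Popen([sys.executable, "-c", CHILD],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                     universal_newlines=True)

payload = "spam and eggs\n" * 20000   # roughly 280 KB of text
stdoutdata, stderrdata = p.communicate(payload)
print(len(stdoutdata), stderrdata, p.returncode)
```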

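Popen.send_signal(signal)
Sends the signal signal to the child process.

Note: on Windows only SIGTERM is currently supported; it is equivalent to terminate() below.

Popen.terminate()
Stops the child process. On POSIX this sends SIGTERM; on Windows it calls the TerminateProcess() API.

Popen.kill()
Kills the child process. On POSIX this sends SIGKILL; on Windows it is the same as terminate().

Popen.stdin
If the stdin argument was PIPE, this attribute is a file object; otherwise it is None.

Popen.stdout
If the stdout argument was PIPE, this attribute is a file object; otherwise it is None.

Popen.stderr
If the stderr argument was PIPE, this attribute is a file object; otherwise it is None.

Popen.pid
The process ID of the child process. Note that if the shell argument is True, this is the process ID of the spawned shell.

Popen.returncode
The child's return code, set by poll() and wait(), and indirectly by communicate().
None means the child process has not terminated yet.
A negative value -N means the child was terminated by signal N. (*nix only)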
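terminate() and the negative returncode convention fit together as in the following sketch (Python 3). The 60-second sleep is just a stand-in for a long-running child, and the negative value is only checked on POSIX.

```python
import os
import signal
import subprocess
import sys

p = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
p.terminate()        # SIGTERM on POSIX, TerminateProcess() on Windows
status = p.wait()    # reap the child and fetch its return code

if os.name == "posix":
    # Killed by signal N is reported as -N.
    print(status == -signal.SIGTERM)
```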

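Replacing other functions with subprocess

This section gives some common examples, all of which can be done with subprocess. We assume the module was imported with "from subprocess import *":

Replacing shell command substitution (backquotes):

output=`mycmd myarg`
is equivalent to
output = Popen(["mycmd", "myarg"], stdout=PIPE).communicate()[0]

Replacing a shell pipeline:

output=`dmesg | grep hda`
is equivalent to
p1 = Popen(["dmesg"], stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
output = p2.communicate()[0]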
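dmesg and grep are not available everywhere, so here is a portable sketch of the same pipeline shape for Python 3, with two small Python one-liners (PRODUCER and FILTER, both made up for this example) standing in for the two commands. It also closes the parent's copy of p1.stdout after starting p2, so that p1 can notice if p2 exits early.

```python
import subprocess
import sys

# Stand-ins for "dmesg" and "grep hda" (illustrative only).
PRODUCER = "print('hda1'); print('sdb1'); print('hda2')"
FILTER = "import sys; sys.stdout.writelines(l for l in sys.stdin if 'hda' in l)"

p1 = subprocess.Popen([sys.executable, "-c", PRODUCER], stdout=subprocess.PIPE)
p2 = subprocess.Popen([sys.executable, "-c", FILTER],
                      stdin=p1.stdout, stdout=subprocess.PIPE,
                      universal_newlines=True)
p1.stdout.close()   # the parent no longer needs its end of the first pipe
output = p2.communicate()[0]
p1.wait()
print(output.splitlines())
```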

Replacing os.system()

sts = os.system("mycmd" + " myarg")
is equivalent to
p = Popen("mycmd" + " myarg", shell=True)
sts = os.waitpid(p.pid, 0)[1]

Notes:

Calling the program through the shell is usually not required.
With subprocess it is easier to get at the child's return code.

A more realistic replacement is:

try:
    retcode = call("mycmd" + " myarg", shell=True)
    if retcode < 0:
        print >>sys.stderr, "Child was terminated by signal", -retcode
    else:
        print >>sys.stderr, "Child returned", retcode
except OSError, e:
    print >>sys.stderr, "Execution failed:", e

Replacing the os.spawn family
P_NOWAIT example

pid = os.spawnlp(os.P_NOWAIT, "/bin/mycmd", "mycmd", "myarg")
is equivalent to
pid = Popen(["/bin/mycmd", "myarg"]).pid

P_WAIT example

retcode = os.spawnlp(os.P_WAIT, "/bin/mycmd", "mycmd", "myarg")
is equivalent to
retcode = call(["/bin/mycmd", "myarg"])

Vector example

os.spawnvp(os.P_NOWAIT, path, args)
is equivalent to
Popen([path] + args[1:])

Environment example

os.spawnlpe(os.P_NOWAIT, "/bin/mycmd", "mycmd", "myarg", env)
is equivalent to
Popen(["/bin/mycmd", "myarg"], env={"PATH": "/usr/bin"})

Replacing os.popen(), os.popen2(), os.popen3():

pipe = os.popen("cmd", 'r', bufsize)
is equivalent to
pipe = Popen("cmd", shell=True, bufsize=bufsize, stdout=PIPE).stdout

pipe = os.popen("cmd", 'w', bufsize)
is equivalent to
pipe = Popen("cmd", shell=True, bufsize=bufsize, stdin=PIPE).stdin

(child_stdin, child_stdout) = os.popen2("cmd", mode, bufsize)
is equivalent to
p = Popen("cmd", shell=True, bufsize=bufsize, stdin=PIPE, stdout=PIPE, close_fds=True)
(child_stdin, child_stdout) = (p.stdin, p.stdout)

(child_stdin, child_stdout, child_stderr) = os.popen3("cmd", mode, bufsize)
is equivalent to
p = Popen("cmd", shell=True, bufsize=bufsize, stdin=PIPE, stdout=PIPE, stderr=PIPE, close_fds=True)
(child_stdin, child_stdout, child_stderr) = (p.stdin, p.stdout, p.stderr)

(child_stdin, child_stdout_and_stderr) = os.popen4("cmd", mode, bufsize)
is equivalent to
p = Popen("cmd", shell=True, bufsize=bufsize, stdin=PIPE, stdout=PIPE, stderr=STDOUT, close_fds=True)
(child_stdin, child_stdout_and_stderr) = (p.stdin, p.stdout)

On *nix, os.popen2, os.popen3 and os.popen4 also accept a list as the command to execute. In that case the arguments are passed directly to the program without being interpreted by a shell. For example:

(child_stdin, child_stdout) = os.popen2(["/bin/ls", "-l"], mode, bufsize)
is equivalent to
p = Popen(["/bin/ls", "-l"], bufsize=bufsize, stdin=PIPE, stdout=PIPE)
(child_stdin, child_stdout) = (p.stdin, p.stdout)

Handling the return code:

pipe = os.popen("cmd", 'w')
...
rc = pipe.close()
if rc is not None and rc % 256:
    print "There were some errors"
is equivalent to
process = Popen("cmd", shell=True, stdin=PIPE)
...
process.stdin.close()
if process.wait() != 0:
    print "There were some errors"
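The same write, close, wait sequence as a runnable sketch for Python 3. The child script is hypothetical: it succeeds only if it reads the word "ping" from stdin, which stands in for whatever "cmd" would do with its input.

```python
import subprocess
import sys

# Child script (illustrative): exit 0 if stdin contained exactly "ping", else 1.
CHILD = "import sys; sys.exit(0 if sys.stdin.read() == 'ping' else 1)"

process = subprocess.Popen([sys.executable, "-c", CHILD],
                           stdin=subprocess.PIPE, universal_newlines=True)
process.stdin.write("ping")
process.stdin.close()            # flushes the data and signals end-of-file to the child
had_errors = process.wait() != 0
if had_errors:
    print("There were some errors")
```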

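Replacing functions from the popen2 module:

(child_stdout, child_stdin) = popen2.popen2("somestring", bufsize, mode)
is equivalent to
p = Popen(["somestring"], shell=True, bufsize=bufsize, stdin=PIPE, stdout=PIPE, close_fds=True)
(child_stdout, child_stdin) = (p.stdout, p.stdin)

On *nix, popen2 also accepts a list as the command to execute. In that case the arguments are passed directly to the program without being interpreted by a shell. For example:

(child_stdout, child_stdin) = popen2.popen2(["mycmd", "myarg"], bufsize, mode)
is equivalent to
p = Popen(["mycmd", "myarg"], bufsize=bufsize, stdin=PIPE, stdout=PIPE, close_fds=True)
(child_stdout, child_stdin) = (p.stdout, p.stdin)

popen2.Popen3 and popen2.Popen4 can mostly be replaced by subprocess.Popen as well, with the following differences to keep in mind:

Popen raises an exception if execution fails
the capturestderr argument is replaced by the stderr argument
stdin=PIPE and stdout=PIPE must be specified
popen2 closes all file descriptors by default, whereas with Popen you have to specify close_fds=True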

Posted: April 14, 2010, 17:19 | Category: Programming | 7 comments »
