Original title: Docx to pdf using openoffice headless way too slow

I've been using PHPWord to generate docx files, and it's been working great. But now I also need to make some of those files available as PDF.

After some research I found PyODConverter, which uses OOo. It seemed like a good option since I don't want to depend on third-party web services. I tried it out on my machine and it works fine, so I've set it up on my server as well. It took a little longer, but I've managed to get it working there too.

There is, however, a (bad) issue. On the server the conversion takes about 21 seconds, while on my machine it takes no more than 2. :( This is far too slow for my needs, so I've been trying to spot what might be causing the delay. Starting OpenOffice in headless mode with socket creation is okay. So I've been going through the Python script to find out which instruction is slow, and I've narrowed it down to this line:

context = resolver.resolve("uno:socket,host=,port=8100;urp;StarOffice.ComponentContext")

This is the call that's taking about 20 seconds to execute. The code around it:

localContext = uno.getComponentContext()
resolver = localContext.ServiceManager.createInstanceWithContext("com.sun.star.bridge.UnoUrlResolver", localContext)
try:
    # This resolve() call is the step that takes about 20 seconds on the server.
    context = resolver.resolve("uno:socket,host=,port=8100;urp;StarOffice.ComponentContext")
except NoConnectException:
    raise DocumentConversionException("failed to connect to OpenOffice.org on port %s" % port)
self.desktop = context.ServiceManager.createInstanceWithContext("com.sun.star.frame.Desktop", context)

Any clues on what might be causing this delay? I've ruled out the document I'm trying to convert, since these operations happen before it is touched. Could it be a problem with uno? Or maybe a missing library that causes useless probing during the resolve() operation?


Best answer: none



context = resolver.resolve("uno:pipe,name=myuser_OOffice;urp;StarOffice.ComponentContext")

I still have one problem though... the user executing the Python script must be the same one that started OOo for everything to work. Usually that would not be much of an issue, but I'm trying to execute Python from my web application and I still haven't managed to get it working. I'm trying something like this:

exec('sudo -u#1000 -s python path/to/DocumentConverter.py filename.docx filename.pdf');

I haven't gotten this to work. Is the web server user (www-data) allowed to execute su?


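Maybe the name resolver on the server has trouble looking up localhost. (That would be very strange, but 20 seconds looks like a lookup timeout.) You could try replacing it with 127.0.0.1.

Alternatively, it may be resolving localhost to both an IPv6 and an IPv4 address, trying to connect over IPv6 and failing (OOo may not support IPv6, or may not bind to that interface by default), and only then falling back to IPv4. In that case the remedy is the same: change localhost to 127.0.0.1.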
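One way to check both theories on the server is to time the name lookup on its own, outside of uno. The following is only a minimal sketch: the helper name time_lookup is made up for illustration, and port 8100 is taken from the connection string in the question. It measures how long the system resolver takes and shows which address families come back, and in what order.

```python
import socket
import time


def time_lookup(host, port=8100):
    """Return (elapsed_seconds, addresses) for resolving host:port.

    time_lookup is a hypothetical helper, not part of PyODConverter.
    """
    start = time.perf_counter()
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The name could not be resolved at all.
        infos = []
    elapsed = time.perf_counter() - start
    # Keep (family name, address) pairs in the order the resolver returned them.
    addresses = [(getattr(family, "name", str(family)), sockaddr[0])
                 for family, _type, _proto, _canon, sockaddr in infos]
    return elapsed, addresses


if __name__ == "__main__":
    for host in ("localhost", "127.0.0.1"):
        elapsed, addresses = time_lookup(host)
        print("%-10s %.3fs %s" % (host, elapsed, addresses))
```

If the localhost line is slow, or lists an AF_INET6 entry before the AF_INET one, that would be consistent with the explanations above. A fast lookup would point elsewhere, for example at the connection attempt itself.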

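OpenOffice is just so heavy. I was considering it too, but then I found a more lightweight solution.

I had to generate a preview of the first 4 pages of uploaded files. This is what I did: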

abiword document.doc --to=ps --exp-props="pages:1-4"
gs -q -dNOPAUSE -dBATCH -dTextAlphaBits=4  -dGraphicsAlphaBits=4 -r72 -sDEVICE=pnggray -sOutputFile=preview%d.png document.ps
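If the preview has to be produced from a script rather than a shell, the same two steps could be wrapped roughly as below. This is a sketch under assumptions: the function names preview_commands and make_preview are invented here, and it assumes abiword writes the .ps file next to the source with the same base name, as the two commands above imply.

```python
import subprocess


def preview_commands(source, pages="1-4", dpi=72, pattern="preview%d.png"):
    """Build the abiword and gs argument lists for a grayscale PNG preview."""
    # Assumption: abiword names its output after the source file (document.doc -> document.ps).
    ps_file = source.rsplit(".", 1)[0] + ".ps"
    to_ps = ["abiword", source, "--to=ps", "--exp-props=pages:" + pages]
    to_png = ["gs", "-q", "-dNOPAUSE", "-dBATCH",
              "-dTextAlphaBits=4", "-dGraphicsAlphaBits=4",
              "-r%d" % dpi, "-sDEVICE=pnggray",
              "-sOutputFile=" + pattern, ps_file]
    return [to_ps, to_png]


def make_preview(source, **options):
    """Run both steps in order; raises CalledProcessError if either one fails."""
    for command in preview_commands(source, **options):
        subprocess.check_call(command)
```

Passing the arguments as lists avoids shell quoting problems with uploaded file names; make_preview("document.doc") would run the same two commands shown above.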


abiword document.docx --to=pdf

