unoconv converts between many document formats. Supported formats include the “Open Document Format” (.odt), “MS Word” (.doc), “MS Office Open/MS OOXML” (.xml), “Portable Document Format” (.pdf), “HTML”, “XHTML”, “RTF”, “Docbook” (.xml)…
Features:
- converts every format that OpenOffice supports
- OpenOffice supports up to 100 document formats :-)
- can be used to automate tasks (scripts, e.g. shell or PHP)
- works with additional tools such as “asciidoc” and “docbook2odf/xhtml2odt”
- can apply style templates during conversion (corporate identity)
- can act as both a server and a client
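To make the automation point concrete, here is a minimal shell sketch that converts every .odt file in a directory. The function name `convert_dir` and the `UNOCONV` override variable are assumptions made for this example and are not part of unoconv; only the `-f` option is taken from the examples in this post.

```shell
#!/bin/sh
# Sketch: convert every .odt file in a directory to the given format.
# convert_dir and the UNOCONV variable are hypothetical names for this example.
convert_dir() {
    dir=$1
    fmt=${2:-pdf}
    for doc in "$dir"/*.odt; do
        # skip the unexpanded pattern when the directory holds no .odt files
        [ -e "$doc" ] || continue
        "${UNOCONV:-unoconv}" -f "$fmt" "$doc" || return 1
    done
}

# usage: convert_dir ./documents pdf
```

Setting `UNOCONV` to another command makes it possible to try the loop without a running OpenOffice installation.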
Formats:
Below is a list of the output formats of OpenOffice (and therefore of unoconv). The supported input formats may differ from these, so import and export have to be looked at separately.
Export:
- bib – BibTeX [.bib]
- doc – Microsoft Word 97/2000/XP [.doc]
- doc6 – Microsoft Word 6.0 [.doc]
- doc95 – Microsoft Word 95 [.doc]
- docbook – DocBook [.xml]
- html – HTML Document (OpenOffice.org Writer) [.html]
- odt – Open Document Text [.odt]
- ott – Open Document Text Template [.ott]
- ooxml – Microsoft Office Open XML [.xml]
- pdb – AportisDoc (Palm) [.pdb]
- pdf – Portable Document Format [.pdf]
- psw – Pocket Word [.psw]
- rtf – Rich Text Format [.rtf]
- latex – LaTeX 2e [.ltx]
- sdw – StarWriter 5.0 [.sdw]
- sdw4 – StarWriter 4.0 [.sdw]
- sdw3 – StarWriter 3.0 [.sdw]
- stw – OpenOffice.org 1.0 Text Document Template [.stw]
- sxw – OpenOffice.org 1.0 Text Document [.sxw]
- text – Text Encoded [.txt]
- txt – Plain Text [.txt]
- vor – StarWriter 5.0 Template [.vor]
- vor4 – StarWriter 4.0 Template [.vor]
- vor3 – StarWriter 3.0 Template [.vor]
- xhtml – XHTML Document [.html]
- […]
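Because several format names share one file extension, a script that builds output file names needs a small lookup. The sketch below is one way to do that; `format_extension` is a hypothetical helper, and its table only covers the formats listed above.

```shell
#!/bin/sh
# Sketch: map a format name from the list above to its file extension.
# format_extension is a hypothetical helper, not an unoconv feature.
format_extension() {
    case $1 in
        bib|doc|odt|ott|pdb|pdf|psw|rtf|stw|sxw|txt) echo "$1" ;;
        doc6|doc95)    echo doc ;;
        docbook|ooxml) echo xml ;;
        html|xhtml)    echo html ;;
        latex)         echo ltx ;;
        sdw|sdw4|sdw3) echo sdw ;;
        text)          echo txt ;;
        vor|vor4|vor3) echo vor ;;
        *) echo "unknown format: $1" >&2; return 1 ;;
    esac
}

# usage: ext=$(format_extension latex)
```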
Installation:
aptitude install unoconv asciidoc docbook2odf
Example 1: Standard
First a simple example that just converts an “odt” file into a “pdf”. It is also worth taking a look at the available options.
# unoconv - start the listener service
unoconv --listener &
# odt -> pdf
unoconv -f pdf some-document.odt
# default server and port written out explicitly, output sent to stdout
(unoconv --server localhost --port 2002 --stdout -f pdf some-document.odt)
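When the output is sent to stdout, a failed conversion can leave an empty or broken file behind. The following sketch redirects the output into a file and keeps it only if it starts with the PDF signature. `to_pdf_checked` and the `UNOCONV` variable are hypothetical names for this example; the `--stdout` and `-f` options are the ones used above.

```shell
#!/bin/sh
# Sketch: capture the --stdout output in a file and keep it only if it looks like a PDF.
# to_pdf_checked and the UNOCONV variable are hypothetical names for this example.
to_pdf_checked() {
    src=$1
    dest=$2
    if ! "${UNOCONV:-unoconv}" --stdout -f pdf "$src" > "$dest"; then
        rm -f "$dest"
        return 1
    fi
    # a PDF file starts with the four bytes "%PDF"
    if [ "$(head -c 4 "$dest")" != "%PDF" ]; then
        rm -f "$dest"
        return 1
    fi
}

# usage: to_pdf_checked some-document.odt some-document.pdf
```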
Example 2: Template
As the developer's site already points out, a screenshot does not really help here, so a second example using templates follows.
# download the example files (text and template are saved under the names the commands below expect)
wget http://dag.wieers.com/cv/Makefile
wget -O resume.txt http://dag.wieers.com/cv/curriculum-vitae-dag-wieers.txt
wget -O template.ott http://dag.wieers.com/cv/curriculum-vitae-docbook.ott
# unoconv - start the listener service
unoconv --listener &
# resume.txt -> resume.xml
asciidoc -b docbook -d article -o resume.xml resume.txt
# resume.xml -> resume.tmp.odt
docbook2odf -f --params generate.meta=0 -o resume.tmp.odt resume.xml
# resume.tmp.odt -> resume.odt + template
unoconv -f odt -t template.ott -o resume.odt resume.tmp.odt
# resume.odt -> resume.pdf + template
unoconv -f pdf -t template.ott -o resume.pdf resume.odt
# resume.odt -> resume.html + template
unoconv -f html -t template.ott -o resume.html resume.odt
# resume.odt -> resume.doc + template
unoconv -f doc -t template.ott -o resume.doc resume.odt
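The chain above can be wrapped in one function so that a failing step stops the rest. This is only a sketch: `build_resume`, `run` and the `DRY_RUN` variable are hypothetical names introduced here, while the individual commands and options are the ones from the example.

```shell
#!/bin/sh
# Sketch: the txt -> docbook -> odt -> pdf/html/doc chain as one function.
# build_resume, run and DRY_RUN are hypothetical names for this example.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        # only print the command instead of executing it
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

build_resume() {
    name=$1
    template=$2
    run asciidoc -b docbook -d article -o "$name.xml" "$name.txt" || return 1
    run docbook2odf -f --params generate.meta=0 -o "$name.tmp.odt" "$name.xml" || return 1
    run unoconv -f odt -t "$template" -o "$name.odt" "$name.tmp.odt" || return 1
    for fmt in pdf html doc; do
        run unoconv -f "$fmt" -t "$template" -o "$name.$fmt" "$name.odt" || return 1
    done
}

# usage: DRY_RUN=1 build_resume resume template.ott
```

With `DRY_RUN` set, the function only prints the six commands, which is a cheap way to check file names before running the real conversion.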
Example 3: Server <-> Client
As mentioned above, the service can also be started as a server and accessed from other machines.
# unoconv - start the server service
unoconv --listener --server 1.2.3.4 --port 4567
# client -> server
unoconv --server 1.2.3.4 --port 4567 -f pdf some-document.odt
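A client that connects right after the server was started may find the port still closed. One possible workaround, sketched below, is to poll the port first. `wait_for_port` is a hypothetical helper, and the `/dev/tcp` path is a bash feature, so this sketch assumes bash rather than a plain POSIX shell.

```shell
#!/bin/bash
# Sketch: wait until the listener accepts TCP connections before sending work to it.
# wait_for_port is a hypothetical helper; /dev/tcp is handled by bash itself.
wait_for_port() {
    local host=$1 port=$2 tries=${3:-10}
    local i=1
    while [ "$i" -le "$tries" ]; do
        # the subshell opens a TCP connection and closes it again on exit
        if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
            return 0
        fi
        # pause between attempts, but not after the last one
        if [ "$i" -lt "$tries" ]; then
            sleep 1
        fi
        i=$((i + 1))
    done
    return 1
}

# usage: wait_for_port 1.2.3.4 4567 && unoconv --server 1.2.3.4 --port 4567 -f pdf some-document.odt
```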
Example 4: PHP
All of this can now be used in shell scripts or, as in this example, integrated into PHP.
$this->Filegenerator = new FilegeneratorComponent ($this->params["form"]['uploaddocfile']);
// if the filegenerator did all its magic ok then process
if ($this->Filegenerator) {
    // returns the text version of the doc
    $text = $this->Filegenerator->convertDocToTxt();
    // returns the html version of the doc
    $html = $this->Filegenerator->convertDocToHtml();
    // returns the name of the generated pdf file
    $pdf = $this->Filegenerator->convertDocToPdf($doc_id);
}
<?php
/**
 * Class used to convert files.
 * @author jamiescott.net
 */
class FilegeneratorComponent extends Object {
    // allowed input MIME types and their file extensions
    private $allowable_files = array ('application/msword' => 'doc' );
    // set to true if the constructor ran successfully
    private $pass = false;
    // file info passed to the constructor
    private $fileinfo;
    /**
     * Set up paths and the environment, then validate the uploaded file.
     *
     * @param array $fileinfo
     * Expected :
     * (
     *   [name] => test.doc
     *   [type] => application/msword
     *   [tmp_name] => /Applications/MAMP/tmp/php/php09PYNO
     *   [error] => 0
     *   [size] => 79360
     * )
     */
    function __construct($fileinfo) {
        // folder to process all the files etc
        define ( 'TMP_FOLDER', TMP . 'filegenerator/' . $this->generatefoldername () . '/' );
        // where unoconv is installed
        define ( 'UNOCONV_PATH', '/usr/bin/unoconv' );
        // where to store pdf files
        define ( 'PDFSTORE', ROOT . '/uploads/generatedpdfs/' );
        // where to store doc files
        define ( 'DOCSTORE', ROOT . '/uploads/docfiles/' );
        // apache home dir
        define ( 'APACHEHOME', '/home/apache' );
        // set some shell environment vars
        putenv ( "HOME=".APACHEHOME );
        putenv ( "PWD=".APACHEHOME );
        // check that the file info was passed, the tmp file exists, the file type is allowed
        // and the tmp folder could be created
        if (is_array ( $fileinfo ) && file_exists ( $fileinfo ['tmp_name'] ) && in_array ( $fileinfo ['type'], array_keys ( $this->allowable_files ) ) && $this->createtmp ()) {
            // keep the file info for the convert methods
            $this->fileinfo = $fileinfo;
            // the constructor ran ok
            $this->pass = true;
        }
        // note: `new` ignores a constructor's return value, so success is
        // recorded in $this->pass and checked by each convert method
    }
    /**
     * Takes the file set in the constructor and turns it into a pdf,
     * stores it under PDFSTORE and returns the filename.
     *
     * @param string|false $foldername subfolder of PDFSTORE to save into
     * @return string filename if the pdf was generated
     */
    function convertDocToPdf($foldername=false) {
        if ($this->pass) {
            // generate a random name
            $output_pdf_name = $this->generatefoldername () . '.pdf';
            // copy it to the tmp folder for processing
            if (! copy ( $this->fileinfo ['tmp_name'], TMP_FOLDER . 'input.doc' ))
                die ( 'Error copying the doc file' );
            $command = UNOCONV_PATH;
            $args = ' --server localhost --port 2002 --stdout -f pdf ' . TMP_FOLDER . 'input.doc';
            $run = $command . $args;
            //echo $run; die;
            $pdf = (string) shell_exec ( $run );
            // a valid pdf carries "%PDF" on its first line
            $end_of_line = strpos ( $pdf, "\n" );
            $start_of_file = substr ( $pdf, 0, $end_of_line );
            // eregi() was removed in PHP 7, stripos() does the same case-insensitive search
            if (stripos ( $start_of_file, '%PDF' ) === false)
                die ( 'Error Generating the PDF file' );
            if(!file_exists(PDFSTORE.$foldername)){
                mkdir(PDFSTORE.$foldername);
            }
            // save the file
            if(!$this->_createandsave($pdf, PDFSTORE.$foldername.'/', $output_pdf_name)){
                die('Error Saving The PDF');
            }
            return $output_pdf_name;
        }
    }
    /**
     * Return a text version of the doc
     *
     * @return string plain text of the document
     */
    function convertDocToTxt() {
        if ($this->pass) {
            // copy it to the tmp folder for processing
            if (! copy ( $this->fileinfo ['tmp_name'], TMP_FOLDER . 'input.doc' ))
                die ( 'Error copying the doc file' );
            $command = UNOCONV_PATH;
            $args = ' --server localhost --port 2002 --stdout -f txt ' . TMP_FOLDER . 'input.doc';
            $run = $command . $args;
            //echo $run; die;
            $txt = (string) shell_exec ( $run );
            // fewer than 10 characters of output most likely means an error
            if (strlen($txt) < 10)
                die ( 'Error Generating the TXT' );
            // return the text of the doc
            return $txt;
        }
    }
    /**
     * Convert the doc to html and return the html
     *
     * @return string html of the document
     */
    function convertDocToHtml() {
        if ($this->pass) {
            // copy it to the tmp folder for processing
            if (! copy ( $this->fileinfo ['tmp_name'], TMP_FOLDER . 'input.doc' ))
                die ( 'Error copying the doc file' );
            $command = UNOCONV_PATH;
            $args = ' --server localhost --port 2002 --stdout -f html ' . TMP_FOLDER . 'input.doc';
            $run = $command . $args;
            //echo $run; die;
            $html = (string) shell_exec ( $run );
            // the first line of the output has to mention "HTML"
            $end_of_line = strpos ( $html, "\n" );
            $start_of_file = substr ( $html, 0, $end_of_line );
            // eregi() was removed in PHP 7, stripos() does the same case-insensitive search
            if (stripos ( $start_of_file, 'HTML' ) === false)
                die ( 'Error Generating the HTML' );
            // return the html of the doc
            return $html;
        }
    }
    /**
     * Create file and store data
     *
     * @param string $data content to write
     * @param string $location target folder, with trailing slash
     * @param string $file target file name
     * @return bool true on success
     */
    function _createandsave($data, $location, $file) {
        if (is_writable ( $location )) {
            // open the target file in write mode ('w'), which creates it
            // or truncates any existing content
            if (! $handle = fopen ( $location.$file, 'w' )) {
                trigger_error("Cannot open file ($location$file)");
                return false;
            }
            // write $data to the opened file
            if (fwrite ( $handle, $data ) === FALSE) {
                trigger_error("Cannot write to file ($location$file)");
                return false;
            }
            fclose ( $handle );
            return true;
        } else {
            trigger_error("The folder $location is not writable");
            return false;
        }
    }
    function __destruct() {
        // remove the tmp folder
        if (file_exists ( TMP_FOLDER ) && strlen ( TMP_FOLDER ) > 4)
            $this->removetmp ();
    }
    /**
     * Create the tmp directory to hold and process the files
     *
     * @return bool true if the directory was created
     */
    function createtmp() {
        if (is_writable ( TMP )) {
            // recursive, so the parent 'filegenerator/' folder is created if it is missing
            if (mkdir ( TMP_FOLDER, 0777, true ))
                return true;
        } else {
            return false;
        }
        return false;
    }
    /**
     * Delete the tmp dir
     *
     * @return bool true if the directory was removed
     */
    function removetmp() {
        if (strlen ( TMP_FOLDER ) > 3 && file_exists ( TMP_FOLDER )) {
            if ($this->recursive_remove_directory ( TMP_FOLDER ))
                return true;
        }
        return false;
    }
    /**
     * Return a random string for the folder name
     *
     * @return string md5 hash of the current microtime
     */
    function generatefoldername() {
        return md5 ( microtime () );
    }
    /**
     * Recursively delete a directory or empty it
     *
     * @param string $directory path of the directory
     * @param bool $empty if true, only empty the directory and keep it
     * @return bool true on success
     */
    function recursive_remove_directory($directory, $empty = FALSE) {
        // if the path has a slash at the end we remove it here
        if (substr ( $directory, - 1 ) == '/') {
            $directory = substr ( $directory, 0, - 1 );
        }
        // if the path is not valid or is not a directory ...
        if (! file_exists ( $directory ) || ! is_dir ( $directory )) {
            // ... we return false and exit the function
            return FALSE;
            // ... if the path is not readable
        } elseif (! is_readable ( $directory )) {
            // ... we return false and exit the function
            return FALSE;
            // ... else if the path is readable
        } else {
            // we open the directory
            $handle = opendir ( $directory );
            // and scan through the items inside
            while ( FALSE !== ($item = readdir ( $handle )) ) {
                // if the item is not the current directory
                // or the parent directory
                if ($item != '.' && $item != '..') {
                    // we build the new path to delete
                    $path = $directory . '/' . $item;
                    // if the new path is a directory
                    if (is_dir ( $path )) {
                        // we call this method with the new path
                        // ($this-> is required, a bare function call would fail)
                        $this->recursive_remove_directory ( $path );
                        // if the new path is a file
                    } else {
                        // we remove the file
                        unlink ( $path );
                    }
                }
            }
            // close the directory
            closedir ( $handle );
            // if the option to empty is not set to true
            if ($empty == FALSE) {
                // try to delete the now empty directory
                if (! rmdir ( $directory )) {
                    // return false if not possible
                    return FALSE;
                }
            }
            // return success
            return TRUE;
        }
    }
}
?>
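For comparison, the copy, convert and clean-up flow of the class can also be sketched as a shell function. This is not a port of the component, only an illustration of the same steps: `convert_upload` and the `UNOCONV` variable are hypothetical names, while the unoconv options are the ones the PHP code passes.

```shell
#!/bin/sh
# Sketch: the copy -> convert -> clean up flow of the PHP class as a shell function.
# convert_upload and the UNOCONV variable are hypothetical names for this example.
convert_upload() {
    src=$1
    fmt=$2
    # private working directory, like TMP_FOLDER in the class above
    work=$(mktemp -d) || return 1
    if cp "$src" "$work/input.doc" &&
        "${UNOCONV:-unoconv}" --server localhost --port 2002 --stdout -f "$fmt" "$work/input.doc"
    then
        status=0
    else
        status=1
    fi
    # always remove the working directory, as __destruct() does
    rm -rf "$work"
    return "$status"
}

# usage: convert_upload /path/to/upload.doc txt > upload.txt
```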