Chapter 2. Starting and Stopping

Chapter 1. LibreOffice API Concepts introduced some of the core ideas of Office. Now it’s time to show how these data structures and relationships (e.g. services, interfaces, FCM, inheritance) are programmed with the OOO Development Tools (OooDev) API.

This chapter will focus on the most fundamental tasks: starting Office, loading (or creating) a document, saving and closing the document, and shutting down Office.

Some of the examples come from the LibreOffice Python UNO Examples project.

The aim of OooDev is to hide some of the verbiage of Office. When (if?) a programmer feels ready for more detail, my code is documented. Here, I’ll only explain functions that illustrate Office ideas, such as service managers and components.

This is the first chapter with code, and so the first where programs could crash! Section 2.8 gives a few tips on bug detection and reporting.

2.1 Starting Office

Every program must load Office before working with a document (unless run in a macro), and shut it down before exiting. These tasks are handled by Lo.load_office() and Lo.close_office() from the Lo class. A typical program will look like the following:

from ooodev.loader.lo import Lo

def main() -> None:
    loader = Lo.load_office(Lo.ConnectSocket(headless=True)) # XComponentLoader

    # load, manipulate and close a document

    Lo.close_office()

if __name__ == "__main__":
    main()

Lo.load_office(Lo.ConnectSocket(headless=True)) invokes Office and sets up a UNO bridge using a socket with a headless connection. If you are not using the Graphical User Interface (GUI) of LibreOffice, then headless=True is recommended. See example extract_graphics.py.

There is also Lo.load_office(Lo.ConnectPipe(headless=True)), which uses named pipes instead of sockets. See example extract_text.py.

For convenience Lo.ConnectPipe is an alias of ConnectPipe and Lo.ConnectSocket is an alias of ConnectSocket.

In both cases, a remote component context is created (see Chapter 1, Fig. 2) and then a service manager, Desktop object, and component loader are initialized. Below is a simplified version of Lo.load_office(), which shows the principle of connecting to LibreOffice.

# in Lo class (simplified)
@classmethod
def load_office(
    cls, connector: ConnectPipe | ConnectSocket | None = None, cache_obj: Cache | None = None
) -> XComponentLoader:

    Lo.print("Loading Office...")
    if connector is None:
        try:
            cls._lo_inst = LoDirectStart()
            cls._lo_inst.connect()
        except Exception as e:
            Lo.print((
                "Office context could not be created."
                " A connector must be supplied if not running as a macro"
            ))
            Lo.print(f"    {e}")
            raise SystemExit(1)
    elif isinstance(connector, ConnectPipe):
        try:
            cls._lo_inst = LoPipeStart(connector=connector, cache_obj=cache_obj)
            cls._lo_inst.connect()
        except Exception as e:
            Lo.print("Office context could not be created")
            Lo.print(f"    {e}")
            raise SystemExit(1)
    elif isinstance(connector, ConnectSocket):
        try:
            cls._lo_inst = LoSocketStart(connector=connector, cache_obj=cache_obj)
            cls._lo_inst.connect()
        except Exception as e:
            Lo.print("Office context could not be created")
            Lo.print(f"    {e}")
            raise SystemExit(1)
    else:
        Lo.print("Invalid Connector type. Fatal Error.")
        raise SystemExit(1)

    cls._xcc = cls._lo_inst.ctx
    cls._mc_factory = cls._xcc.getServiceManager()
    if cls._mc_factory is None:
        Lo.print("Office Service Manager is unavailable")
        raise SystemExit(1)
    cls._xdesktop = cls.create_instance_mcf(XDesktop, "com.sun.star.frame.Desktop")
    if cls._xdesktop is None:
        Lo.print("Could not create a desktop service")
        raise SystemExit(1)
    loader = cls.qi(XComponentLoader, cls._xdesktop)
    if loader is None:
        Lo.print("Unable to access XComponentLoader")
        raise SystemExit(1)
    return loader

There is also the Lo.Loader context manager, which closes Office automatically when its block ends. See Write Convert Document Format for an example.
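
The pattern behind a loader context manager can be tried without Office at all. The following is only a minimal sketch: FakeOffice and office_session() are hypothetical names invented for illustration, not part of OooDev, and this is not how Lo.Loader is implemented. It shows the one property that matters, namely that the shutdown step runs even when the body of the with block raises.

```python
from contextlib import contextmanager
from typing import Iterator, List


class FakeOffice:
    """Hypothetical stand-in for an Office connection (not part of OooDev)."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def load(self) -> "FakeOffice":
        self.events.append("load")
        return self

    def close(self) -> None:
        self.events.append("close")


@contextmanager
def office_session(office: FakeOffice) -> Iterator[FakeOffice]:
    # Start the session, hand it to the with-block, and always close it.
    office.load()
    try:
        yield office
    finally:
        office.close()


office = FakeOffice()
try:
    with office_session(office) as session:
        session.events.append("work")
        raise RuntimeError("simulated failure while using the document")
except RuntimeError:
    pass

print(office.events)  # ['load', 'work', 'close']
```

Because the close step sits in a finally clause, a crash inside the block does not leave the (simulated) Office session open.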

It is also simple to start LibreOffice from the command line, automate tasks, and then leave it open for user input. See Calc Add Range of Data Automation for an example.

Office can, of course, also be scripted by running the code as a macro.

Running a project with several modules as a macro can be a bit daunting. For this reason oooscript was created, which can pack several scripts into one script and embed the result into a LibreOffice document.

The easiest way to run a project with several modules or classes in LibreOffice is to pack it into a single script first. Many examples can be found in LibreOffice Python UNO Examples, such as Calc Add Range of Data Example, LibreOffice Calc Sudoku Example, and TAB Control Dialog Box Example.

Macros only need to use Lo.ThisComponent, as shown below.

from ooodev.loader.lo import Lo
from ooodev.office.calc import Calc

def main():
    # get access to current Calc Document
    doc = Calc.get_ss_doc(Lo.ThisComponent)

    # get access to current spreadsheet
    sheet = Calc.get_active_sheet(doc=doc)

Lo.load_office() illustrates a significant coding decision: the use of global static variables inside the Lo class. In particular, the XComponentContext, XDesktop, and XMultiComponentFactory objects created by load_office() are stored globally for later use. This approach was chosen because it allows other support functions to be called with simpler arguments, since the objects can be accessed without the user having to explicitly pass around references to them. The main drawback is that if load_office() is called more than once, all previous Lo class globals are overwritten. See 2.9 Opening Multiple Documents for a solution to this problem.
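
The drawback of class-level state is easy to reproduce in miniature. The sketch below uses a hypothetical StaticLoader class (invented for illustration, it is not the Lo class) that stores its "connection" in a class attribute, so a second call to load() silently replaces the first.

```python
from typing import Optional


class StaticLoader:
    """Hypothetical miniature of a class that keeps its connection globally."""

    _context: Optional[str] = None  # shared by every caller

    @classmethod
    def load(cls, name: str) -> str:
        # Each call overwrites whatever an earlier call stored.
        cls._context = name
        return name

    @classmethod
    def current(cls) -> str:
        if cls._context is None:
            raise RuntimeError("load() has not been called")
        return cls._context


first = StaticLoader.load("connection-1")
second = StaticLoader.load("connection-2")
print(StaticLoader.current())  # connection-2
```

Helper functions can call StaticLoader.current() without being handed a reference, which is the convenience, but nothing remembers "connection-1" any more, which is the cost.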

The creation of the XDesktop interface object uses Lo.create_instance_mcf():

# in Lo class
@classmethod
def create_instance_mcf(
    cls,
    atype: Type[T], service_name: str,
    args: Tuple[Any, ...] | None = None,
    raise_err: bool = False
) -> T | None:

    if cls._xcc is None or cls._mc_factory is None:
        raise Exception("No office connection found")
    try:
        if args is not None:
            obj = cls._mc_factory.createInstanceWithArgumentsAndContext(
                service_name, args, cls._xcc
            )
        else:
            obj = cls._mc_factory.createInstanceWithContext(service_name, cls._xcc)
        if raise_err is True and obj is None:
            raise CreateInstanceMcfError(atype, service_name)
        interface_obj = cls.qi(atype=atype, obj=obj)
        if raise_err is True and interface_obj is None:
            raise MissingInterfaceError(atype)
        return interface_obj
    except CreateInstanceMcfError:
        raise
    except MissingInterfaceError:
        raise
    except Exception as e:
        raise Exception(f"Couldn't create interface for '{service_name}'") from e

If you ignore the error-checking, Lo.create_instance_mcf() does two things. The call to XMultiComponentFactory.createInstanceWithContext() asks the service manager (cls._mc_factory) to create a service object inside the remote component context (cls._xcc). Then the call to uno_obj.queryInterface() via Lo.qi() looks inside the service instance for the specified interface (atype), returning an instance of the interface as its result.

The Lo.qi() function reduces programmer typing, since calls to uno_obj.queryInterface() are very common in this framework. Querying for the interface also has the huge advantage of providing typing support (autocomplete, static type checking) thanks to types-unopy, as shown in Fig. 13.

Fig. 13 : Lo.qi() autocomplete demo

The use of generics makes Lo.create_instance_mcf() useful for creating any type of interface object. Unfortunately, generics aren’t utilized in the Office API, which relies instead on Object, Office’s Any class, or the XInterface class which is inherited by all interfaces.
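
How a generic query function gives an editor something to work with can be sketched in plain Python. This is an illustration only: the classes are hypothetical stand-ins for UNO interfaces, and isinstance() is used in place of the real queryInterface() call, which is an assumption made so that the example runs without Office.

```python
from typing import Any, Optional, Type, TypeVar

T = TypeVar("T")


class XStorableLike:
    """Hypothetical stand-in for an interface with a store operation."""

    def store(self) -> str:
        return "stored"


class XCloseableLike:
    """Hypothetical stand-in for an interface with a close operation."""

    def close(self) -> str:
        return "closed"


class FakeDocument(XStorableLike, XCloseableLike):
    """A fake document that offers both interfaces."""


def query(atype: Type[T], obj: Any) -> Optional[T]:
    # The return type follows the first argument, so a type checker knows
    # that query(XStorableLike, doc) is an XStorableLike (or None).
    return obj if isinstance(obj, atype) else None


doc = FakeDocument()
storable = query(XStorableLike, doc)
missing = query(XStorableLike, object())

if storable is not None:
    print(storable.store())  # stored
print(missing)  # None
```

The TypeVar ties the result type to the requested type, which is what lets autocomplete offer store() on the result.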

2.2 Closing Down/Killing Office

Lo.close_office() shuts down Office by calling terminate() on the XDesktop instance created inside Lo.load_office():

is_dead = xdesktop.terminate()

This is usually sufficient, but occasionally it is necessary to delay the terminate() call for a few milliseconds in order to give Office components time to finish. As a consequence, Lo.close_office() may actually call terminate() a few times, until it returns True.
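
The retry idea can be sketched as follows. This is not the code inside Lo.close_office(); the function name, the number of attempts, and the delay are assumptions chosen for illustration, and BusyDesktop is a hypothetical fake that refuses to terminate for its first few calls.

```python
import time
from typing import Callable


def terminate_with_retry(
    terminate: Callable[[], bool], attempts: int = 4, delay: float = 0.01
) -> bool:
    """Call terminate() until it reports success or the attempts run out."""
    for _ in range(attempts):
        if terminate():
            return True
        time.sleep(delay)  # give pending work a moment before asking again
    return False


class BusyDesktop:
    """Hypothetical desktop that refuses to terminate for the first few calls."""

    def __init__(self, busy_calls: int) -> None:
        self.busy_calls = busy_calls
        self.calls = 0

    def terminate(self) -> bool:
        self.calls += 1
        return self.calls > self.busy_calls


desktop = BusyDesktop(busy_calls=2)
print(terminate_with_retry(desktop.terminate), desktop.calls)  # True 3
```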

While developing/debugging code, it’s quite easy to inadvertently trigger a runtime exception in the Office API. In the worst case, this can cause your program to exit without calling Lo.close_office(). This will leave an extraneous Office process running in the OS, which should be killed. The easiest way is with the loproc --kill command from LibreOffice Developer Search.

2.3 Opening a Document

The general format of a program that opens a document, manipulates it in some way, and then saves it, is:

def main() -> None:
    fnm = sys.argv[1]  # get file from first arg

    loader = Lo.load_office(Lo.ConnectSocket(headless=True))
    doc = Lo.open_doc(fnm=fnm, loader=loader)

    # use the Office API to manipulate doc...
    Lo.save_doc(doc, "foo.docx") # save as a Word file
    Lo.close_doc(doc)
    Lo.close_office()

See Write Convert Document Format for an example.

The new methods are Lo.open_doc(), Lo.save_doc(), and Lo.close_doc().

Lo.open_doc() calls XComponentLoader.loadComponentFromURL(), which requires a document URL, the type of Office frame used to display the document, optional search flags, and an array of document properties.

For example:

file_url = FileIO.fnm_to_url(fnm)
props = Props.make_props(Hidden=True)
doc = loader.loadComponentFromURL(file_url, "_blank", 0, props)

The frame type is almost always “_blank” which indicates that a new window will be created for the newly loaded document. (Other possibilities are listed in the XComponentLoader documentation which you can access with lodoc XComponentLoader.) The search flags are usually set to 0, and document properties are stored in the PropertyValue tuple, props.

loadComponentFromURL()’s return type is XComponent, which refers to the document.

FileIO.fnm_to_url() converts an ordinary filename (e.g. “foo.doc”) into a URL (a full path prefixed with file:///).
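
The same kind of conversion can be tried with the standard library alone. The helper below is a sketch with a made-up name, filename_to_url(); FileIO.fnm_to_url() may differ in its details, but the result has the same shape: an absolute path behind a file:/// prefix.

```python
from pathlib import Path


def filename_to_url(fnm: str) -> str:
    # absolute() anchors a relative name to the current directory;
    # as_uri() adds the file:// scheme and percent-encodes characters
    # such as spaces.
    return Path(fnm).absolute().as_uri()


url = filename_to_url("my report.doc")
print(url)
```

The printed value depends on the current working directory, so no fixed output is shown here; the space in the name appears as %20.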

Props.make_props() takes property names and values as keyword arguments and returns a PropertyValue tuple, so any number of name-value pairs can be supplied in one call.
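
The keyword-arguments-to-tuple idea can be sketched without UNO. PropertyPair and make_pairs() below are hypothetical names used only for illustration; the real PropertyValue struct has more fields than the two modeled here.

```python
from typing import Any, NamedTuple, Tuple


class PropertyPair(NamedTuple):
    """Hypothetical stand-in for a PropertyValue (only Name and Value)."""

    Name: str
    Value: Any


def make_pairs(**kwargs: Any) -> Tuple[PropertyPair, ...]:
    # Keyword arguments keep their call order, so the tuple does too.
    return tuple(PropertyPair(name, value) for name, value in kwargs.items())


props = make_pairs(Hidden=True, ReadOnly=False)
for prop in props:
    print(prop.Name, prop.Value)
```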

A complete list of document properties can be found in the MediaDescriptor documentation (accessed with lodoc MediaDescriptor service), but some of the important ones are listed in Table 1.

Table 1 Some Document Properties.

Property Name      Use
-----------------  --------------------------------------------------------------------------
AsTemplate         Creates a new document using a specified template
Hidden             Determines if the document is invisible after being loaded
ReadOnly           Opens the document read-only
StartPresentation  Starts showing a slide presentation immediately after loading the document

2.4 Creating a Document

The general format of a program that creates a new document, manipulates it in some way, and then saves it, is:

def main() -> None:
    loader = Lo.load_office(Lo.ConnectSocket(headless=True))
    doc = Lo.create_doc(doc_type=Lo.DocTypeStr.WRITER, loader=loader)

    # use the Office API to manipulate doc...

    Lo.save_doc(doc, "foo.docx") # save as a Word file
    Lo.close_doc(doc)
    Lo.close_office()

A new document is created by calling XComponentLoader.loadComponentFromURL() with a special URL string for the document type. The possible strings are listed in Table 2.

Table 2 URLs for Creating New Documents.

URL String                              Document Type
--------------------------------------  ---------------------------
private:factory/swriter                 Writer
private:factory/sdraw                   Draw
private:factory/simpress                Impress
private:factory/scalc                   Calc
private:factory/sdatabase               Base
private:factory/swriter/web             HTML document in Writer
private:factory/swriter/GlobalDocument  A Master document in Writer
private:factory/schart                  Chart
private:factory/smath                   Math Formulae
.component:Bibliography/View1           Bibliography Entries
.component:DB/QueryDesign               Database User Interfaces
.component:DB/TableDesign               Database User Interfaces
.component:DB/RelationDesign            Database User Interfaces
.component:DB/DataSourceBrowser         Database User Interfaces
.component:DB/FormGridView              Database User Interfaces

For instance, a Writer document is created by:

doc = loader.loadComponentFromURL("private:factory/swriter", "_blank", 0, props)
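
A lookup from a short document type name to its factory URL might be sketched like this. The dictionary keys and the factory_url() helper are hypothetical and chosen for illustration; only the URL strings come from Table 2.

```python
# Short names (hypothetical keys) mapped to URL strings from Table 2.
FACTORY_URLS = {
    "writer": "private:factory/swriter",
    "draw": "private:factory/sdraw",
    "impress": "private:factory/simpress",
    "calc": "private:factory/scalc",
    "math": "private:factory/smath",
}


def factory_url(doc_type: str) -> str:
    """Return the special URL that asks Office for a new, empty document."""
    try:
        return FACTORY_URLS[doc_type.lower()]
    except KeyError:
        raise ValueError(f"Unknown document type: {doc_type!r}") from None


print(factory_url("Writer"))  # private:factory/swriter
```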

The office classes include code for simplifying the creation of Writer, Draw, Impress, Calc, and Base documents, which I’ll be looking at in later chapters.

A Second Service Manager

Lo.open_doc() and Lo.create_doc() do a bit of additional work after document loading/creation – they instantiate an XMultiServiceFactory service manager which is stored in the Lo class. This is done by applying Lo.qi() to the document:

# _ms_factory global in Lo
doc = loader.loadComponentFromURL("private:factory/swriter", "_blank", 0, props)
Lo._ms_factory =  Lo.qi(XMultiServiceFactory, doc)

Earlier, Lo.qi() was employed in create_instance_mcf() to access an interface inside a service. This time qi() is casting one interface (XComponent) to another (XMultiServiceFactory).

The XMultiServiceFactory object is the second service manager we’ve encountered; the first was an XMultiComponentFactory instance, created during Office’s loading.

The reasons for Office having two service managers are historical: the XMultiServiceFactory manager is older, and creates a service object without the need for an explicit reference to the remote component context.

As Office developed, it was decided that service object creation should always be relative to an explicit component context, and so the newer XMultiComponentFactory service manager came into being. A lot of older code still uses the XMultiServiceFactory service manager, so both are supported in the Lo class.

Another difference between the managers is that the XMultiComponentFactory manager is available as soon as Office is loaded, while the XMultiServiceFactory manager is initialized only when a document is loaded or created.

2.5 Saving a Document

The general format of a program that creates a new document, manipulates it in some way, and then saves it, is:

def main() -> None:
    loader = Lo.load_office(Lo.ConnectSocket(headless=True))
    doc = Lo.create_doc(doc_type=Lo.DocTypeStr.WRITER, loader=loader)

    # use the Office API to manipulate doc...

    Lo.save_doc(doc, "foo.docx") # save as a Word file
    Lo.close_doc(doc)
    Lo.close_office()

One of the great strengths of Office is that it can export a document in a vast number of formats, but the programmer must specify the output format (which is called a filter in the Office documentation).

XStorable.storeToURL() takes the name of the output file (in URL format), and an array of properties, one of which should be “FilterName”. Two other useful output properties are “Overwrite” and “Password”. Input and output document properties are listed in the MediaDescriptor service documentation (lodoc MediaDescriptor service).

If “Overwrite” is set to true then the file will be saved without prompting the user if the file already exists. The “Password” property contains a string which must be entered into an Office dialog by the user before the file can be opened again.

The steps in saving a file are:

save_file_url = FileIO.fnm_to_url(fnm)
store_props = Props.make_props(Overwrite=True, FilterName=format, Password=password)

store = Lo.qi(XStorable, doc)
store.storeToURL(save_file_url, store_props)

If you don’t want a password, then the third property should be left out. Lo.qi() is used again to cast an interface, this time from XComponent to XStorable.

Fig. 5 in Chapter 1 shows that XStorable is part of the OfficeDocument service, which means that it’s inherited by all Office document types.

What’s a Filter Name?

XStorable.storeToURL() needs a “FilterName” property value, but what should the string be to export the document in Word format for example?

Info.get_filter_names() returns an array of all the filter names supported by Office.

Rather than force a programmer to search through this list for the correct name, Lo.save_doc() only needs the name and extension of the output file. For example, in 2.3 Opening a Document, Lo.save_doc() was called like so:

Lo.save_doc(doc, "foo.docx") # save as a Word file

save_doc() extracts the file extension (i.e. “docx”) and maps it to a corresponding filter name in Office (in this case, “Office Open XML Text”). One concern is that it’s not always clear which extension-to-filter mapping should be utilized. For instance, another suitable filter name for “docx” is “MS Word 2007 XML”. This problem is essentially ignored, by hard wiring a fixed selection into save_doc().

Another issue is that the choice of filter sometimes depends on the extension and the document type. For example, a Writer document saved as a PDF file should use the filter writer_pdf_Export, but if the document is a spreadsheet then calc_pdf_Export is the correct choice.

save_doc() gets the document type from Info.report_doc_type(), which calls Info.is_doc_type() to examine the document’s service name, accessed via the XServiceInfo interface:

xinfo = Lo.qi(XServiceInfo, doc)
is_writer = xinfo.supportsService("com.sun.star.text.TextDocument")

Then save_doc() utilizes ext_to_format() to map the file extension to a filter name.
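
The two-part lookup (extension first, document type where it matters) can be sketched as below. This is not the ext_to_format() implementation; pick_filter() and the two dictionaries are hypothetical, and they contain only the filter names and service names mentioned in this chapter.

```python
from pathlib import Path

WRITER = "com.sun.star.text.TextDocument"
CALC = "com.sun.star.sheet.SpreadsheetDocument"

# Extensions whose filter does not depend on the document type.
FIXED_FILTERS = {"docx": "Office Open XML Text"}

# Extensions whose filter depends on the document's service name.
TYPED_FILTERS = {
    "pdf": {WRITER: "writer_pdf_Export", CALC: "calc_pdf_Export"},
}


def pick_filter(fnm: str, service_name: str) -> str:
    """Map a file name plus document service name to a filter name."""
    ext = Path(fnm).suffix.lstrip(".").lower()
    if ext in TYPED_FILTERS:
        by_type = TYPED_FILTERS[ext]
        if service_name in by_type:
            return by_type[service_name]
        raise ValueError(f"No '{ext}' filter known for {service_name}")
    if ext in FIXED_FILTERS:
        return FIXED_FILTERS[ext]
    raise ValueError(f"No filter known for extension '{ext}'")


print(pick_filter("foo.docx", WRITER))  # Office Open XML Text
print(pick_filter("report.pdf", CALC))  # calc_pdf_Export
```

An unknown extension raises ValueError here, which mirrors the point that a hard-wired table will eventually meet an extension it does not cover.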

The main document service names are listed in Table 3. For quick access in your scripts use Lo.Service where applicable.

Table 3 Document Service Names.

Document Type  Service Name
-------------  ----------------------------------------------
Writer         com.sun.star.text.TextDocument
Draw           com.sun.star.drawing.DrawingDocument
Impress        com.sun.star.presentation.PresentationDocument
Calc           com.sun.star.sheet.SpreadsheetDocument
Base           com.sun.star.sdb.OfficeDatabaseDocument

We encountered these service names back in Chapter 1, Fig. 9 – they’re sub-classes of the OfficeDocument service.
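
Detecting a document type from its service names can be simulated without Office. In this sketch FakeServiceInfo is a hypothetical object that only imitates the supportsService() method of XServiceInfo, and doc_type_of() is an illustrative helper, not Info.report_doc_type().

```python
from typing import Iterable, Optional

# Service names from Table 3, mapped to a readable document type.
SERVICE_TO_DOC_TYPE = {
    "com.sun.star.text.TextDocument": "Writer",
    "com.sun.star.drawing.DrawingDocument": "Draw",
    "com.sun.star.presentation.PresentationDocument": "Impress",
    "com.sun.star.sheet.SpreadsheetDocument": "Calc",
    "com.sun.star.sdb.OfficeDatabaseDocument": "Base",
}


class FakeServiceInfo:
    """Hypothetical object imitating XServiceInfo.supportsService()."""

    def __init__(self, services: Iterable[str]) -> None:
        self._services = frozenset(services)

    def supportsService(self, name: str) -> bool:
        return name in self._services


def doc_type_of(info: FakeServiceInfo) -> Optional[str]:
    # Ask about each known service in turn and report the first match.
    for service, doc_type in SERVICE_TO_DOC_TYPE.items():
        if info.supportsService(service):
            return doc_type
    return None


sheet_doc = FakeServiceInfo(["com.sun.star.sheet.SpreadsheetDocument"])
print(doc_type_of(sheet_doc))  # Calc
```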

A third problem is incompleteness; the ext_to_format() mappings used by save_doc() only cover a small subset of Office’s 250+ filter names, so if you try to save a file with an exotic extension then the code will most likely break. save_doc() has an overload that takes a format option, which is a filter name. This overload can be used if a filter is not implemented by ext_to_format().

If you want to study the details, start with save_doc(), and burrow down; the trickiest part is ext_to_format().

2.6 Closing a Document

Closing a document is a pain if you want to check with the user beforehand: should a modified file be saved, thereby overwriting the old version? OooDev’s solution is not to bother the user, so the file is closed without saving, irrespective of any modifications. In other words, it’s essential to explicitly save a changed document with Lo.save_doc() before calling Lo.close_doc().

The code for closing employs Lo.qi() to cast the document’s XComponent interface to XCloseable:

closeable = Lo.qi(XCloseable, doc)
closeable.close(False)  # doc closed without saving

2.7 A General Purpose Converter

The Write Convert Document Format example takes two command line arguments: the name of an input file and the extension that should be used when saving the loaded document. For instance:

python -m doc_convertor --ext 'odp' --file 'points.ppt'

will save slides in MS PowerPoint format as an Impress presentation.

The following converts a JPEG image into PNG:

python -m doc_convertor --ext 'png' --file 'skinner.jpg'

Write Convert Document Format Source is relatively short. Here is the main section.

with Lo.Loader(Lo.ConnectSocket(headless=True)) as loader:
    # get the absolute path of input file
    p_fnm = FileIO.get_absolute_path(args.file_path)

    name = Info.get_name(p_fnm)  # get name part of file without ext
    if not ext.startswith("."):
        # just in case user did not include . in --ext value
        ext = "." + ext

    p_save = Path(p_fnm.parent, f"{name}{ext}")  # new file, same as old file but different ext

    doc = Lo.open_doc(fnm=p_fnm, loader=loader)
    Lo.save_doc(doc=doc, fnm=p_save)
    Lo.close_doc(doc)

print(f"All done! converted file: {p_save}")
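
The output-name step of the converter can be isolated and tried on its own. The sketch below is an assumption-laden variant: converted_path() is a made-up helper that uses pathlib's with_suffix() instead of Info.get_name(), and it does not resolve the path to an absolute one.

```python
from pathlib import Path


def converted_path(input_file: str, ext: str) -> Path:
    """Return the input path with its extension swapped for ext."""
    if not ext.startswith("."):
        # accept both "odp" and ".odp"
        ext = "." + ext
    # with_suffix() replaces only the final extension of the file name.
    return Path(input_file).with_suffix(ext)


print(converted_path("points.ppt", "odp").name)    # points.odp
print(converted_path("skinner.jpg", ".png").name)  # skinner.png
```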

2.8 Bug Detection and Reporting

This chapter began our coding with the Office API, and so the possibility of bugs also becomes an issue. If you find a problem with the OooDev classes then please submit an issue, supplying as much detail as possible.

Another source of bugs is the LibreOffice API itself, which is hardly a surprise considering its complexity and age. If you find a problem, then you should first search LibreOffice’s Bugzilla site to see if the problem has been reported previously (it probably has). Various types of search are explained in the Bugzilla documentation. If you want to report a new bug, then you’ll need to set up an account, which is quite simple, and also explained by the documentation.

Often when people report bugs they don’t include enough information, perhaps because the error window displayed by Windows is somewhat lacking. For example, a typical crash report window is shown in Fig. 14.

Fig. 14 : The LibreOffice Crash Reported by Windows 7.

If you’re going to make an official report, you should first read the article How to Report Bugs in LibreOffice.

Expert forum members and Bugzilla maintainers sometimes point people towards WinDbg for Windows as a tool for producing good debugging details. The wiki has a detailed explanation of how to install and use it, which is a bit scary in its complexity.

A much easier alternative is the WinCrashReport application from NirSoft.

It presents the Windows Error Reporting (WER) data generated by a crash in a readable form.

When a crash window appears (like the one in Fig. 14), start WinCrashReport to examine the automatically-generated error report, as in Fig. 15.

Fig. 15 : Win Crash Report GUI

Fig. 15 indicates that the problem is an access violation (exception code 0xC0000005) at a memory address inside mergedlo.dll.

mergedlo.dll is part of LibreOffice which probably means that you can find the DLL in /program. Most Office DLLs are located in that directory.

WinCrashReport generates two alternative call stacks, with slightly more information in the second in this case. mergedlo.dll is called by the uno_getCurrentEnvironment() function in cppu3.dll, as indicated in Fig. 16.

Fig. 16 : The Second Call Stack in WinCrashReport.

This narrows the problem to a specific function and two DLLs, which is very helpful.

If you want to better understand the DLLs, they can be examined using DLL Export Viewer, another NirSoft tool, which lists a DLL’s exported functions. Running it on mergedlo.dll turns up nothing, but the details for cppu3.dll are shown in Fig. 17.

Fig. 17 : DLL Export Viewer’s view of cppu3.dll

mergedlo.dll appears to be empty inside DLL Export Viewer because it exports no functions. That probably means it’s being used as a store for resources, such as icons, cursors, and images. There’s another NirSoft tool for looking at DLL resources, called ResourcesExtract. The LibreOffice source itself can be searched with OpenGrok, a tool for searching the gigantic code base. Fig. 18 shows the results for an uno_getCurrentEnvironment search.

Fig. 18 : OpenGrok Results for uno_getCurrentEnvironment

The function’s code is in EnvStack.cxx, which can be examined by clicking on the linked function name shown at the bottom of Fig. 18.

2.9 Opening Multiple Documents

As of OooDev 0.9.8, you can open multiple documents from a single LibreOffice bridge connection by using the new LoInst class.

This is accomplished by creating a new instance of LoInst and then passing the bridge connection to the LoInst.load_office() method.

The LoInst class mirrors the Lo class in its methods and properties; see the Lo class for any undocumented methods and properties of LoInst.

The following code example demonstrates how to use the Class LoInst to open multiple documents from a single LibreOffice bridge connection.

from ooodev.loader.lo import Lo
from ooodev.loader.inst.lo_inst import LoInst
from ooodev.office.calc import Calc
from ooodev.utils.inst.lo.doc_type import DocTypeStr
from ooodev.gui import GUI

def main() -> None:
    # Start LibreOffice using a Socket bridge.
    _ = Lo.load_office(Lo.ConnectSocket())
    # create a new Calc document.
    # Calc.create_doc() uses static bridge connection created by Lo.load_office above by default
    primary_doc = Calc.create_doc()
    # get the first sheet in the primary document
    primary_sheet = Calc.get_sheet(primary_doc, 0)
    # show the primary document
    GUI.set_visible(visible=True, doc=primary_doc)
    # set a value in the primary document sheet
    Calc.set_val(value="LO TEST", sheet=primary_sheet, cell_name="A1")


    # Create a new instance of LoInst and pass the bridge connection from the static Lo class
    # that was created when Lo.load_office was called above.
    inst = LoInst()
    inst.load_office(Lo.bridge_connector)

    # Create a new document from the new instance of LoInst
    secondary_doc = inst.create_doc(DocTypeStr.CALC)
    # show the secondary document
    GUI.set_visible(visible=True, doc=secondary_doc)
    secondary_sheet = Calc.get_sheet(secondary_doc, 0)
    # set a value in the secondary document sheet
    Calc.set_val(value="LO INST", sheet=secondary_sheet, cell_name="A1")
    # ... other code

2.10 Loading for Macro Execution

As of OooDev 0.11.11, macros are loaded using the MacroLoader context manager. This allows the document context to be managed automatically.

In the example below the with MacroLoader() context manager is used. This automatically sets the context for OooDev to the active document, so the macro is executed in the context of the active document.

Note that only the methods that are actually called as macros, show_hello() and write_hello(), require the MacroLoader context manager. The write_hello_msg() method is called from a macro that has already set the context to the active document and therefore does not require the MacroLoader context manager.

from __future__ import annotations
from ooodev.office.write import Write
from ooodev.utils.color import StandardColor
from ooodev.format.writer.direct.char.font import Font
from ooodev.dialog.msgbox import MsgBox, MessageBoxButtonsEnum, MessageBoxType
from ooodev.format.writer.direct.para.alignment import Alignment
from ooodev.macro import MacroLoader


def show_hello(*args) -> None:
    with MacroLoader():
        _ = MsgBox.msgbox(
            "Hello World!",
            "HI",
            boxtype=MessageBoxType.INFOBOX,
            buttons=MessageBoxButtonsEnum.BUTTONS_OK
        )

def write_hello_msg(msg: str) -> None:
    try:
        cursor = Write.get_cursor(Write.active_doc)
        cursor.gotoEnd(False)
        al = Alignment().align_center
        ft = Font(size=36, u=True, b=True, color=StandardColor.GREEN_DARK2)
        Write.append_para(cursor=cursor, text=msg, styles=[ft, al])
    except Exception as e:
        _ = MsgBox.msgbox(f"This method requires a Writer document.\n{e}")

def write_hello(*args) -> None:
    with MacroLoader():
        write_hello_msg("Hello World!")


g_exportedScripts = (show_hello, write_hello)