Chapter 26. Search and Replace

The _increase_garlic_cost() method in garlic_secrets.py (23.3 Searching for the Cost of Garlic, and Increasing it) illustrates how to use loops and if - tests to search for and replace data. Another approach is to employ the XSearchable and XReplaceable interfaces, as in the replace_all.py example shown below:

# in replace_all.py

from ooodev.format import Styler
from ooodev.format.calc.direct.cell.background import Color as BgColor
from ooodev.format.calc.direct.cell.borders import Borders, BorderLineKind, Side, LineSize
from ooodev.format.calc.direct.cell.font import Font
from ooodev.calc import Calc, CalcDoc, CalcCellRange, ZoomKind
# ... other imports

class ReplaceAll:
    # ReplaceAll Globals
    ANIMALS = (
        "ass", "cat", "cow", "cub", "doe", "dog", "elk",
        "ewe", "fox", "gnu", "hog", "kid", "kit", "man",
        "orc", "pig", "pup", "ram", "rat", "roe", "sow", "yak"
    )
    TOTAL_ROWS = 15
    TOTAL_COLS = 6

    def main(self) -> None:
        loader = Lo.load_office(Lo.ConnectSocket())

        try:
            doc = CalcDoc(Calc.create_doc(loader))

            doc.set_visible()
            Lo.delay(300)
            doc.zoom(ZoomKind.ZOOM_150_PERCENT)

            sheet = doc.get_sheet(0)

            def cb(row: int, col: int, prev) -> str:
                # call back function for make_2d_array, sets the value for the cell
                # return animals repeating until all cells are filled
                v = (row * ReplaceAll.TOTAL_COLS) + col

                a_len = len(ReplaceAll.ANIMALS)
                if v > a_len - 1:
                    i = v % a_len
                else:
                    i = v
                return ReplaceAll.ANIMALS[i]

            tbl = TableHelper.make_2d_array(
                num_rows=ReplaceAll.TOTAL_ROWS, num_cols=ReplaceAll.TOTAL_COLS, val=cb
            )

            # create styles that can be applied to the cells via Calc.set_array_range().
            inner_side = Side()
            outer_side = Side(width=LineSize.THICK)
            bdr = Borders(
                border_side=outer_side, vertical=inner_side, horizontal=inner_side
            )
            bg_color = BgColor(StandardColor.BLUE)
            ft = Font(color=StandardColor.WHITE)

            sheet.set_array_range(
                range_name="A1:F15", values=tbl, styles=[bdr, bg_color, ft]
            )

            # A1:F15
            cell_rng = sheet.get_range(col_start=0, row_start=0, col_end=5, row_end=15)

            for s in self._srch_strs:
                if self._is_search_all:
                    self._search_all(cell_rng=cell_rng, srch_str=s)
                else:
                    self._search_iter(cell_rng=cell_rng, srch_str=s)

            if self._repl_str is not None:
                for s in self._srch_strs:
                    self._replace_all(
                        cell_rng=cell_rng, srch_str=s, repl_str=self._repl_str
                    )

            if self._out_fnm:
                doc.save_doc(fnm=self._out_fnm)

            msg_result = MsgBox.msgbox(
                "Do you wish to close document?",
                "All done",
                boxtype=MessageBoxType.QUERYBOX,
                buttons=MessageBoxButtonsEnum.BUTTONS_YES_NO,
            )
            if msg_result == MessageBoxResultsEnum.YES:
                doc.close_doc()
                Lo.close_office()
            else:
                print("Keeping document open")

        except Exception:
            Lo.close_office()
            raise

A blank sheet is filled with a 15 x 6 grid of animal names, such as the one shown in Fig. 211.

A Grid Of Animals for Searching and Replacing.

Fig. 211 :A Grid Of Animals for Searching and Replacing.

The SheetCellRange supports the XReplaceable interface, which is a subclass of XSearchable, as in Fig. 212.

The XReplaceable and XSearchable Interfaces.

Fig. 212 :The XReplaceable and XSearchable Interfaces.

A cell range’s XSearchable interface is accessed through casting:

# in replace_all.py
srch = cell_rng.qi(XSearchable, True)

The XReplaceable interface for the range is obtained in the same way:

# in replace_all.py
repl = cell_rng.qi(XReplaceable, True)

XSearchable offers iterative searching using its findFirst() and findNext() methods, which is demonstrated shortly in the _search_iter() method in replace_all.py. XSearchable can also search for all matches at once with Calc.find_all(), which is employ in the replace_all.py _search_all(). Only one of these methods is needed by the program, so the other is commented out in the main() function shown above.

XReplaceable only offers replaceAll() which searches for and replaces all of its matches in a single call. It’s utilized by the replace_all.py _replace_all() method.

Before a search can begin, it’s usually necessary to tweak the search properties, i.e. to employ regular expressions, be case sensitive, or use search similarity. Similarity allows a text match to be a certain number of characters different from the search text. These search properties are stored in the SearchDescriptor service, which is accessed by calling XSearchable.createSearchDescriptor(). For example:

# in ReplaceAll._search_iter() of replace_all.py
# ...
srch = cell_rng.qi(XSearchable, True)
sd = srch.createSearchDescriptor()

sd.setSearchString(srch_str)
sd.setPropertyValue("SearchWords", True)
# sd.setPropertyValue("SearchRegularExpression", True)

XSearchDescriptor is the interface for the SearchDescriptor service, as shown in Fig. 213.

The ReplaceDescriptor and SearchDescriptor Services.

Fig. 213 :The ReplaceDescriptor and SearchDescriptor Services.

Aside from being used to set search properties, XSearchDescriptor is also where the search string is stored:

sd.setSearchString("dog")  # search for "dog"

If regular expressions have been enabled, then the search string can utilize them:

# search for a non-empty series of lower-case letters
sd.setSearchString("[a-z]+")

The regular expression syntax is standard, and documented online at List of Regular Expressions.

26.1 Searching Iteratively

The _search_iter() method in replace_all.py is passed the cell range for the 15 x 6 grid of animals, and creates a search based on finding complete words. It uses XSearchable.findFirst() and XSearchable.findNext() to incrementally move through the grid:

# in ReplaceAll._search_iter() of replace_all.py
def _search_iter(self, cell_rng: CalcCellRange, srch_str: str) -> None:
    print(f'Searching (iterating) for all occurrences of "{srch_str}"')
    try:
        srch = cell_rng.qi(XSearchable, True)
        sd = srch.createSearchDescriptor()

        sd.setSearchString(srch_str)
        # only complete words will be found
        sd.setPropertyValue("SearchWords", True)
        # sd.setPropertyValue("SearchRegularExpression", True)

        o_first = srch.findFirst(sd)
        # Info.show_services("Find First", o_first)

        cr = Lo.qi(XCellRange, o_first)
        if cr is None:
            print(f'  No match found for "{srch_str}"')
            return
        count = 0
        while cr is not None:
            self._highlight(cr)
            print(f"  Match {count + 1} : {Calc.get_range_str(cr)}")
            cr = Lo.qi(XCellRange, srch.findNext(cr, sd))
            count += 1

    except Exception as e:
        print(e)

_highlight() method is as follows:

from ooodev.format import Styler
from ooodev.format.calc.direct.cell.background import Color as BgColor
from ooodev.format.calc.direct.cell.borders import Borders, BorderLineKind, Side, LineSize
from ooodev.format.calc.direct.cell.font import Font
from ooodev.format.calc.direct.cell.standard_color import StandardColor
# ... other imports

# in ReplaceAll._highlight() of replace_all.py
def _highlight(self, cr: XCellRange) -> None:
    # highlight by make cell bold, with text color of Light purple and a background color of light blue.
    ft = Font(b=True, color=StandardColor.PURPLE_LIGHT1)
    bg_color = BgColor(StandardColor.DEFAULT_BLUE)
    borders = Borders(
        border_side=Side(line=BorderLineKind.SOLID, color=StandardColor.RED_DARK3)
    )
    Styler.apply(cr, ft, bg_color, borders)

XSearchable.findNext() requires a reference to the previous match as its first input argument, so it can resume the search after that match.

srch = Lo.qi(XSearchable, cell_rng, True)
# ...
o_first = srch.findFirst(sd)
Info.show_services("Find First", o_first)

When the services are listed for the references returned by XSearchable.findFirst() and XSearchable.findNext() by calling Info.show_services() the following is show.

Find First Supported Services (7)
'com.sun.star.sheet.SheetCell'
'com.sun.star.sheet.SheetCellRange'
'com.sun.star.style.CharacterProperties'
'com.sun.star.style.ParagraphProperties'
'com.sun.star.table.Cell'
'com.sun.star.table.CellProperties'
'com.sun.star.table.CellRange'

The main service supported by the findFirst() result is SheetCell. This makes sense since the search is looking for a cell containing the search string. As a consequence, the o_first reference can be converted to XCell:

cr = Lo.qi(XCell, srch.findFirst(sd))

However, checking out XSearchable.findNext() in the same way showed an occasional problem:

o_next  = srch.findNext(cr, sd)
Info.show_services("Find Next", o_next)

The reference returned by findNext() usually supports the SheetCell service, but sometimes represents SheetCellRange instead! When that occurs, code that attempts to convert o_next to XCell will return None:

cell = Lo.qi(XCell, srch.findNext(o_first, sd))

The solution is shown in the _search_iter() listing above - instead of converting the XSearchable.findFirst() and XSearchable.findNext() results to XCell, they’re changed into XCellRange references, which always succeeds.

_search_iter() calls _highlight() on each match so the user can see the results more clearly, as in Fig. 214.

The Results of _search_iter when Looking for dog

Fig. 214 :The Results of _search_iter() when Looking for “dog”.

26.2 Searching For All Matches

The _search_all() method in replace_all.py utilizes XSearchable.findAll() to return all the search matches at once, in the form of an indexed container. Calc.find_all() adds an extra conversion step, creating list of XCellRange objects from the values in the container:

# in Calc class
@staticmethod
def find_all(srch: XSearchable, sd: XSearchDescriptor) -> List[XCellRange] | None:
    con = srch.findAll(sd)
    if con is None:
        Lo.print("Match result is null")
        return None
    c_count = con.getCount()
    if c_count == 0:
        Lo.print("No matches found")
        return None

    crs = []
    for i in range(c_count):
        try:
            cr = Lo.qi(XCellRange, con.getByIndex(i))
            if cr is None:
                continue
            crs.append(cr)
        except Exception:
            Lo.print(f"Could not access match index {i}")
    if len(crs) == 0:
        Lo.print(f"Found {c_count} matches but unable to access any match")
        return None
    return crs

_search_all() iterates through the XCellRange list returned by Calc.find_all(), highlighting each match in the same way as the _search_iter() method:

# in ReplaceAll._search_all() of replace_all.py
def _search_all(self, cell_rng: CalcCellRange, srch_str: str) -> None:
    print(f'Searching (find all) for all occurrences of "{srch_str}"')
    try:
        srch = cell_rng.qi(XSearchable, True)
        sd = srch.createSearchDescriptor()

        sd.setSearchString(srch_str)
        sd.setPropertyValue("SearchWords", True)

        match_crs = Calc.find_all(srch=srch, sd=sd)
        if not match_crs:
            print(f'  No match found for "{srch_str}"')
            return
        for i, cr in enumerate(match_crs):
            self._highlight(cr)
            print(f"  Index {i} : {Calc.get_range_str(cr)}")

    except Exception as e:
        print(e)

26.3 Replacing All Matches

The XReplaceable interface only contains a replaceAll() method (see Fig. 213), so there’s no way to implement an iterative replace function. In addition, XReplaceable.replaceAll() returns a count of the number of changes, not a container of the matched cells like XSearchable.findAll(). This means that its not possible to code a replace-like version of the _search_all() method which highlights all the changed cells.

The best that can be done is to execute two searches over the grid of animal names. The first looks only for the search string so it can highlight the matching cells. The second search calls XReplaceable.replaceAll() to make the changes.

The _replace_all() method is:

# in ReplaceAll._replace_all() of replace_all.py
def _replace_all(
    self, cell_rng: CalcCellRange, srch_str: str, repl_str: str
) -> None:
    print(f'Replacing "{srch_str}" with "{repl_str}"')
    Lo.delay(2000)  # wait a bit before search & replace
    try:
        repl = cell_rng.qi(XReplaceable, True)
        rd = repl.createReplaceDescriptor()

        rd.setSearchString(srch_str)
        rd.setReplaceString(repl_str)
        rd.setPropertyValue("SearchWords", True)
        # rd.setPropertyValue("SearchRegularExpression", True)

        count = repl.replaceAll(rd)
        print(f"Search text replaced {count} times")
        print()

    except Exception as e:
        print(e)

The coding style is similar to my _search_all() method from above. One difference is that XReplaceDescriptor is used to setup the search and replacement strings.

If rd.setPropertyValue("SearchRegularExpression", True) is uncommented then _replace_all() could be called using regular expressions in the function:

self._replace_all(cell_rng=cell_rng, srch_str="[a-z]+", repl_str="ram")

The search string ([a-z]+) will match every cell’s text, and change all the animal names to ram. Typical output is shown in Fig. 215.

All Animals Become One

Fig. 215 :All Animals Become One.