#282 ✓resolved
Walter McGinnis

Add faster Zoom search record rebuild option for Kete sites hosted on the same host as their Zebra server instance

Reported by Walter McGinnis | June 17th, 2010 @ 01:45 PM | in 1.3 (closed)

When a site admin chooses "rebuild search databases" and selects "All Records", and the Zebra server is on the same host (usually the case, but check the host attribute of the ZoomDb instances), write the OAI Zoom records for items to an appropriate directory structure without adding them to Zebra at the time of writing. Then use the zebraidx tool to add them in one batch.
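
The check described above could look roughly like the following. This is a Python sketch for illustration only, not Kete code: the function names, the `scope` value "all", and the set of names treated as local are all assumptions.

```python
import socket

# Names treated as "this machine"; an assumption for illustration.
LOCAL_NAMES = {"localhost", "127.0.0.1", "::1"}


def is_local(host):
    """Return True if the host string appears to refer to this machine."""
    host = host.strip().lower()
    return host in LOCAL_NAMES or host == socket.gethostname().lower()


def use_batch_rebuild(zoom_db_hosts, scope):
    """Take the write-files-then-zebraidx path only for a full rebuild
    where every Zebra database is hosted locally."""
    return (
        scope == "all"
        and bool(zoom_db_hosts)
        and all(is_local(h) for h in zoom_db_hosts)
    )
```

Any other combination (a remote Zebra host, or a partial rebuild) would fall back to the existing record-by-record path.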

Hopefully use of zebraidx will make the process significantly faster.

If it proves worthwhile for total rebuilds, check if partial rebuilds are easily possible.

We may even use the directory structure of OAI Zoom records as a form of caching for future improvements to the OAI Repository.

The directory where the OAI Zoom records are written should have a subdirectory structure with a parent node for the record type and then subdirectories based on the record ID (à la what we do for attachment_fu uploaded files).

There will be a parallel directory structure for the Zebra public and Zebra private databases.
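
One possible layout, sketched in Python for illustration. The partitioning scheme (ID zero-padded to eight digits and split into four-digit segments) is an assumption modelled on how attachment_fu is understood to partition upload paths; the root directory, the `public`/`private` folder names, and the `record.xml` file name are hypothetical.

```python
import os


def record_path(root, visibility, record_type, record_id):
    """Build the file path for one OAI Zoom record.

    visibility is "public" or "private", giving the two parallel trees.
    """
    digits = "%08d" % record_id
    # Split the padded ID into four-digit chunks, e.g. 12345 -> 0001/2345.
    parts = [digits[i:i + 4] for i in range(0, len(digits), 4)]
    return os.path.join(root, visibility, record_type, *parts, "record.xml")
```

Partitioning by ID keeps any single directory from accumulating tens of thousands of entries, which matters for a site with many records.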

The clearing of existing Zebra databases should be delayed until just before the zebraidx command is run.
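
The ordering can be sketched as a list of commands to run only after every record file has been written. This Python sketch assumes each Zebra database (public and private) has its own configuration file, and that `init`, `update` and `commit` behave as described in the zebraidx documentation; check them against the installed Zebra version. The function name and arguments are hypothetical.

```python
def rebuild_commands(config_path, records_dir):
    """Return the zebraidx invocations for one database, in order."""
    base = ["zebraidx", "-c", config_path]
    return [
        base + ["init"],                 # clear the existing register, as late as possible
        base + ["update", records_dir],  # index the record files under the directory
        base + ["commit"],               # only has an effect when shadow registers are enabled
    ]
```

Each command could then be passed to `subprocess.run(cmd, check=True)`, so that a failure stops the sequence before later steps run.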

Comments and changes to this ticket

  • Walter McGinnis

    Walter McGinnis June 19th, 2010 @ 05:35 PM

    • Milestone set to 1.3
    • Tag set to zoom rebuild, zebra, zoom
    • State changed from “open” to “resolved”

    Implemented working through the records in batches of 500 to keep memory use more manageable and constant (previously seemed to grow with time).

    Put zebraidx processing at the end of each batch when the Zebra instances are on localhost. The OAI records for the batch are wiped after processing, though, rather than kept around for caching.

    Seems to have sped up processing in my tests by around 300-400% with the batch of 64k records. We still have to generate the OAI record and write the file for each record, and that is probably the biggest bottleneck now.
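
The batching approach described in this comment can be illustrated with a small Python sketch. It is not the code that was committed to Kete; the function name is hypothetical and only the batch size of 500 comes from the comment.

```python
from itertools import islice


def in_batches(records, size=500):
    """Yield lists of at most `size` records, so only one batch is held in memory."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

A caller would write the OAI files for each batch, run zebraidx over them, and delete the files before moving on to the next batch.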


Kete was developed by Horowhenua Library Trust and Katipo Communications Ltd. to build a digital library of Horowhenua material.
