SMATDB disk requirements to idx requirements


  • #55002
    Bob Schmid
    Participant

    Trying to get a feel for data retention needs. From my initial look, storage needs appear to grow about 5 to 1 for every record in a smatdb compared to the old smat/idx/msg pair.

    Example:

    If I’m retaining 15 days of SMATs and that requires 2.8 GB (before compression):

    should I plan on ~14 GB for storing smatdb? (Compression gains very little on the smatdb files.)

    Does that sound accurate for those that have migrated to smatdb?

    Bob
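The estimate in the question is a straight multiplication, sketched below in Python. The 2.8 GB, 15 days, and 5-to-1 figures come from the post; the function name and the per-day breakdown are illustrative additions, and the ratio itself is an observation to be checked against your own data, not a fixed property of SMATDB.

```python
def estimate_smatdb_gb(flat_gb: float, ratio: float) -> float:
    """Scale an uncompressed flat-file SMAT footprint by an observed growth ratio."""
    return flat_gb * ratio


flat_15_days_gb = 2.8  # from the post: 15 days of SMAT files, before compression
retention_days = 15    # from the post
assumed_ratio = 5.0    # the roughly 5-to-1 growth from the initial look (an observation)

estimate_gb = estimate_smatdb_gb(flat_15_days_gb, assumed_ratio)
per_day_gb = estimate_gb / retention_days

print(f"Estimated SMATDB size: {estimate_gb:.1f} GB")
print(f"Per retained day:      {per_day_gb:.2f} GB")
```

Swapping in a different measured ratio changes only `assumed_ratio`.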

Viewing 7 reply threads
    • #83761
      aaron kaufman-moore
      Participant

      I didn’t see quite that much growth. I recently installed 6.1.1 on a test server and saw a 3-to-1 ratio:

      Size comparison: I took /scrhist/1wkago (1/12/16-1/19/16), where the compressed flat files took 9.39 GB. When converted to SMATDB (all uncompressed), the same data took 28.10 GB.

      The source files were compressed using AIX’s built-in compress command (not sure which algorithm it uses).
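A quick check of the ratio from the two sizes quoted above, as a small Python sketch. The variable names are illustrative. Note that this compares compressed flat files against an uncompressed SMATDB, so it is not a like-for-like measure against the uncompressed baseline used in the original question.

```python
compressed_flat_gb = 9.39      # from the post: compressed flat SMAT files
uncompressed_smatdb_gb = 28.10 # from the post: same data as SMATDB, uncompressed

ratio = uncompressed_smatdb_gb / compressed_flat_gb
print(f"SMATDB (uncompressed) to flat files (compressed): {ratio:.2f} to 1")
```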

    • #83762
      Rob Lindsey
      Participant

      We are planning on moving from SMAT files to SMAT DB.  So far we have done so only in DEV and in one site on one of our QA systems.  I still have to figure out why cycle_save is not working on most of the threads that have SMAT DB turned on, when it does work on a few.

      Just as an FYI, if you have Global Monitor with Searching turned on with SMAT files, the indexing will eat up a TON of file system (disk) space.  On our system it is using almost 48GB, and we only keep 7 days available uncompressed.

    • #83763
      Steve Pringle
      Participant

      We also noticed Global Monitor eating up disk space with searching turned on. We eventually turned it off because it was taking up too much space.

      Does anyone know if Global Monitor 6.1 still eats up space with smatdb files?

      thanks,

      Steve

    • #83764
      bill bearden
      Participant

      In our tests, GM 6.1.1 SMAT search does not use disk space on the CIS server when “indexing” SMATDB.

      You still go through the step of selecting directories to “index”. But it didn’t seem to create the smatsearchindex folder in the site directory.

    • #83765
      Rob Abbott
      Keymaster

      GM needs the Hostserver process to index SMAT files in order for the search to function.  This is an expensive operation in terms of both CPU and disk space (which is why the application allows you to decide which directories to index).

      This indexing does not happen if you are using SMAT DB as the indexes are built into the database.

      Rob Abbott
      Cloverleaf Emeritus

    • #83766
      Todd Gruden
      Participant

      I am an Engineer on Bob Schmid’s team and I ran through a couple of scenarios to compare disk usage between SMAT files and SMATdb.  

      1. I took a SMATdb and ran hcismatconvert to create .msg, .idx and .ecd files.  The SMATdb size is 206,908,416 bytes and the combined size of the resulting .msg, .idx and .ecd files is 206,158,596 bytes (uncompressed)

      2. I loaded flat SMAT files into the SMAT tool, where the combined .msg, .idx and .ecd file size is 15,713,925 bytes.  I resent all messages to an outbound thread with protocol:file dev/null and SMATdb enabled (no encryption).  The resulting SMATdb on the outbound thread is 10,676,259 bytes.

      Does it seem reasonable that the SMATdb would be about the same size or even smaller than uncompressed flat SMAT files?

      Thanks,

      Todd
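The two scenarios above reduce to simple size ratios, worked out in the Python sketch below. The byte counts are taken from the post; the dictionary keys and variable names are only labels for this example.

```python
# (SMATDB bytes, combined .msg + .idx + .ecd bytes), both from the post
scenarios = {
    "1: SMATdb converted to flat files": (206_908_416, 206_158_596),
    "2: flat files resent into SMATdb": (10_676_259, 15_713_925),
}

ratios = {}
for label, (smatdb_bytes, flat_bytes) in scenarios.items():
    ratios[label] = smatdb_bytes / flat_bytes
    print(f"Scenario {label}: SMATdb is {ratios[label]:.3f}x the flat-file size")
```

In the first scenario the two formats come out nearly identical in size, and in the second the SMATdb is roughly two thirds of the flat-file total, which is what prompts the question.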

    • #83767
      Mark Thompson
      Participant

      Todd,

      If you look at the contents of the .idx file you will see there is a certain amount of “fixed” overhead for each message that does not depend on message size.  The .msg file size corresponds exactly to the sum of the message sizes (plus 10 bytes per message for length encoding).

      I have not done much work with SMATdb, but I assume the ratio between flat file sizes and SMATdb sizes depends largely on your average message size.  If SMATdb is more efficient (storage-wise) at indexing the messages than the .idx file, your DB size COULD actually be smaller, especially if you add in the overhead of the .ecd file.

      - Mark Thompson
      HealthPartners
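A rough Python model of the flat-file sizing described above, to show why average message size drives the overhead share. The 10 bytes per message in the .msg file comes from the post. The per-message .idx cost is NOT given anywhere in this thread, so `ASSUMED_IDX_BYTES_PER_MESSAGE` is a placeholder value chosen only for illustration, and the .ecd file is left out entirely.

```python
MSG_LENGTH_PREFIX_BYTES = 10         # per-message length encoding in .msg (from the post)
ASSUMED_IDX_BYTES_PER_MESSAGE = 300  # hypothetical placeholder, not a documented value


def flat_smat_bytes(message_sizes, idx_bytes_per_message):
    """Return (.msg bytes, .idx bytes) for a list of message sizes."""
    count = len(message_sizes)
    msg_file = sum(message_sizes) + MSG_LENGTH_PREFIX_BYTES * count
    idx_file = idx_bytes_per_message * count  # fixed cost per message
    return msg_file, idx_file


def overhead_fraction(message_sizes, idx_bytes_per_message):
    """Share of the combined .msg + .idx size that is not message payload."""
    msg_file, idx_file = flat_smat_bytes(message_sizes, idx_bytes_per_message)
    total = msg_file + idx_file
    return (total - sum(message_sizes)) / total


small = overhead_fraction([500] * 1000, ASSUMED_IDX_BYTES_PER_MESSAGE)
large = overhead_fraction([5000] * 1000, ASSUMED_IDX_BYTES_PER_MESSAGE)
print(f"Overhead share with 500-byte messages:  {small:.1%}")
print(f"Overhead share with 5000-byte messages: {large:.1%}")
```

Whatever the real per-message index cost is, the fixed part weighs more heavily on small messages, so sites with different message mixes can reasonably see different flat-file to SMATdb ratios.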

    • #83768
      Matthew Rasmussen
      Participant

      Good info on SmatDB. That appears to be the best solution, and I think we are working toward it.  But in the meantime, does anyone know of a good way to manage the size of these index files without breaking the indexes?  I was thinking of just deleting the smatsearchindex directories nightly, but then the indexes are broken in GM.  Does anyone know of a way to automatically trigger a “re-index” on the GM side, to occur after the index directories are purged on the engine?
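This does not answer the re-index question, but before purging anything it may help to see how much space each index directory actually uses. The Python sketch below walks a directory tree and totals every folder named smatsearchindex (the folder name mentioned earlier in the thread). The function names are illustrative, and the search root is whatever site or install directory applies on your system; nothing here deletes files or talks to Global Monitor.

```python
import os


def dir_size_bytes(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def index_dir_sizes(search_root, index_dir_name="smatsearchindex"):
    """Map each index directory found under search_root to its size in bytes."""
    sizes = {}
    for root, dirs, _files in os.walk(search_root):
        if index_dir_name in dirs:
            path = os.path.join(root, index_dir_name)
            sizes[path] = dir_size_bytes(path)
            dirs.remove(index_dir_name)  # do not descend into the index itself
    return sizes
```

Calling `index_dir_sizes(root)` on a site root returns a dictionary that can be sorted by size to see which sites account for most of the index space.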

  • The forum ‘Cloverleaf’ is closed to new topics and replies.