SMATDB disk requirements to idx requirements


  • Creator
    Topic
  • #55002
    Bob Schmid
    Participant

      Trying to get a feel for data retention needs. From my initial look, storage needs appear to grow about 5 to 1 for every record kept in a smatdb compared to the old SMAT .idx/.msg pair.

      Example:

      If I’m retaining 15 days of SMAT files and they require 2.8 GB (before compression):

      Should I plan on ~14 GB for storing smatdb? (Compression gains very little on the smatdb files.)

      Does that sound accurate for those that have migrated to smatdb?
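The arithmetic behind that estimate can be written down as a small sketch. The function name and the idea of passing the growth ratio as a parameter are illustrative assumptions, not anything provided by Cloverleaf; the 2.8 GB and 5-to-1 figures are the ones quoted in the question.

```python
def estimate_smatdb_gb(flat_gb: float, growth_ratio: float) -> float:
    """Project SMATDB storage from uncompressed flat SMAT size.

    growth_ratio is an observed or assumed SMATDB-to-flat-file ratio;
    it is site-specific and should be measured, not taken as a constant.
    """
    if flat_gb < 0 or growth_ratio <= 0:
        raise ValueError("flat_gb must be >= 0 and growth_ratio must be > 0")
    return flat_gb * growth_ratio


# 15 days of flat SMAT files at 2.8 GB with the 5-to-1 ratio from the question.
estimate = estimate_smatdb_gb(2.8, 5.0)
print(f"Planned SMATDB storage: {estimate:.1f} GB")
```

Swapping in a different measured ratio changes the plan proportionally, so the ratio is the number worth validating on a test system first.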

      Bob

    Viewing 7 reply threads
    • Author
      Replies
      • #83761
        aaron kaufman-moore
        Participant

          I didn’t see quite that much growth. I recently installed 6.1.1 on a test server and saw a 3-to-1 ratio:

          Size comparison: I took /scrhist/1wkago (1/12/16-1/19/16), where the compressed flat files took 9.39 GB. Converted to SMATDB (all uncompressed), the same data took 28.10 GB.

          The source files were compressed using AIX’s built-in compress command (not sure which algorithm it uses)

        • #83762
          Rob Lindsey
          Participant

            We are planning on moving from SMAT files to SMAT DB.  We have done so only in DEV and in one site on one of our QA systems.  I still have to figure out why cycle_save is not working on most of the threads that have SMAT DB turned on, even though it works on a few.

            Just as an FYI, if you have Global Monitor searching turned on with SMAT files, the indexing will eat up a TON of file system (disk) space.  On our system it is using almost 48 GB, and we only have 7 days of uncompressed files available.

          • #83763
            Steve Pringle
            Participant

              We also noticed Global Monitor eating up disk space with searching turned on. We eventually turned it off because it was taking up too much space.

              Does anyone know if Global Monitor 6.1 still eats up space with smatdb files?

              thanks,

              Steve

            • #83764
              bill bearden
              Participant

                In our tests, GM 6.1.1 SMAT search does not use disk space on the CIS server when “indexing” SMATDB.

                You still go through the step of selecting directories to “index”. But it didn’t seem to create the smatsearchindex folder in the site directory.

              • #83765
                Rob Abbott
                Keymaster

                  GM needs the Hostserver process to index SMAT files in order for the search to function.  This is an expensive operation in terms of both CPU and disk space (which is why the application allows you to decide which directories to index).

                  This indexing does not happen if you are using SMAT DB as the indexes are built into the database.

                  Rob Abbott
                  Cloverleaf Emeritus

                • #83766
                  Todd Gruden
                  Participant

                    I am an Engineer on Bob Schmid’s team and I ran through a couple of scenarios to compare disk usage between SMAT files and SMATdb.  

                    1. I took a SMATdb and ran hcismatconvert to create .msg, .idx and .ecd files.  The SMATdb size is 206,908,416 bytes and the combined size of the resulting .msg, .idx and .ecd files is 206,158,596 bytes (uncompressed)

                    2. I loaded flat SMAT files into the SMAT tool, where the combined .msg, .idx and .ecd file size is 15,713,925 bytes.  I resent all messages to an outbound thread with protocol:file dev/null and SMATdb enabled (no encryption).  The resulting SMATdb on the outbound thread is 10,676,259 bytes.

                    Does it seem reasonable that the SMATdb would be about the same size or even smaller than uncompressed flat SMAT files?
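For reference, the two scenarios above work out to the following SMATDB-to-flat-file ratios. This is only a sketch of the arithmetic on the byte counts quoted in this post; the helper name is made up for illustration.

```python
def size_ratio(smatdb_bytes: int, flat_bytes: int) -> float:
    """Return SMATDB size divided by combined .msg/.idx/.ecd size."""
    if flat_bytes <= 0:
        raise ValueError("flat_bytes must be positive")
    return smatdb_bytes / flat_bytes


# Scenario 1: SMATdb converted to flat files with hcismatconvert.
scenario_1 = size_ratio(206_908_416, 206_158_596)

# Scenario 2: flat files resent to an outbound thread with SMATdb enabled.
scenario_2 = size_ratio(10_676_259, 15_713_925)

print(f"Scenario 1 ratio: {scenario_1:.3f}")
print(f"Scenario 2 ratio: {scenario_2:.3f}")
```

A ratio near 1 means the two formats are about the same size, and a ratio below 1 means the SMATdb is smaller than the uncompressed flat files.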

                    Thanks,

                    Todd

                  • #83767
                    Mark Thompson
                    Participant

                      Todd,

                      If you look at the contents of the .idx file you will see there is a certain amount of “fixed” overhead for each message that does not depend on message size.  The .msg file size corresponds exactly to the sum of the message sizes (plus 10 bytes per message for length encoding).

                      I have not done much work with SMATdb, but I assume the ratio between flat file sizes and SMATdb sizes depends largely on your average message size.  If SMATdb is more efficient (storage-wise) at indexing the messages than the .idx file, your DB size COULD actually be smaller, especially if you add in the overhead of the .ecd file.
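A rough sketch of that reasoning follows. The 10-byte length prefix comes from the description above; the per-message .idx overhead is NOT a documented figure, so it is passed in as a parameter, and the value of 500 bytes used below is a made-up placeholder. The .ecd file is left out of the model.

```python
MSG_LENGTH_PREFIX_BYTES = 10  # per-message length encoding in the .msg file


def flat_file_bytes(message_count: int, avg_message_bytes: int,
                    idx_bytes_per_message: int) -> int:
    """Estimate combined .msg + .idx size (the .ecd file is ignored)."""
    msg_file = message_count * (avg_message_bytes + MSG_LENGTH_PREFIX_BYTES)
    idx_file = message_count * idx_bytes_per_message
    return msg_file + idx_file


def overhead_fraction(avg_message_bytes: int, idx_bytes_per_message: int) -> float:
    """Share of flat-file bytes that is per-message overhead, not payload."""
    overhead = MSG_LENGTH_PREFIX_BYTES + idx_bytes_per_message
    return overhead / (avg_message_bytes + overhead)


HYPOTHETICAL_IDX_BYTES = 500  # placeholder, measure your own .idx files

for avg_size in (500, 5_000, 50_000):
    share = overhead_fraction(avg_size, HYPOTHETICAL_IDX_BYTES)
    print(f"avg message {avg_size:>6} bytes: overhead share {share:.1%}")
```

Because the overhead is fixed per message, its share shrinks as the average message grows, which is why the flat-file to SMATdb ratio would be expected to vary with message size.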

                      - Mark Thompson
                      HealthPartners

                    • #83768
                      Matthew Rasmussen
                      Participant

                        Good info on SmatDB. That appears to be the best solution, and I think we are working toward that.  But in the meantime, does anyone know of a good way to manage the size of these index files without breaking the indexes?  I was thinking of just deleting the smatsearchindex directories nightly, but then the indexes are broken in GM.  Does anyone know of a way to automatically trigger a “re-index” on the GM side after the index directories are purged on the engine?
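This does not answer the re-index question, but before scheduling any purge it may help to measure how much space the index directories actually use. The sketch below walks a directory tree and totals every folder named smatsearchindex (the folder name mentioned earlier in this thread). The function names are illustrative, the root path is whatever site directory you choose to scan, and nothing is deleted.

```python
import os


def dir_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def index_dir_sizes(root_dir: str, index_name: str = "smatsearchindex") -> dict:
    """Map each index directory found under root_dir to its size in bytes."""
    sizes = {}
    for current, dirs, _files in os.walk(root_dir):
        if index_name in dirs:
            index_path = os.path.join(current, index_name)
            sizes[index_path] = dir_size_bytes(index_path)
            dirs.remove(index_name)  # do not descend into the index itself
    return sizes


def report(root_dir: str) -> None:
    """Print index directories under root_dir, largest first."""
    found = index_dir_sizes(root_dir)
    for path, size in sorted(found.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{size / 1024**3:8.2f} GB  {path}")
```

Calling `report()` with a site directory path from a nightly job would show which sites' indexes are growing fastest, which can inform which directories to leave out of GM indexing.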
