Sequential vs. Simultaneous processing


  • Creator
    Topic
  • #48158
    Goran Djuranovic
    Participant

      For those of you who have some experience in single- and multi- threading application processing, here is the situation:

      1. Multiple HL7 files come into CloverLeaf from different sources/systems

      2. HL7 files are translated into XML files

      3. XML files are copied to a file server

      4. XML files picked up by application and sent to SQL server for processing

      Now, potentially, two (or more) XML files can contain different information for the same record to be updated in the database.

      For example, both files need to update the address of the person whose PersonID="5". The address in the first file is "Apple St. #1", and the address in the second file is "Orange St. #2".

      So, in single-threaded processing, the XML file that was detected first is processed first. The second file is processed ONLY after processing of the first file has finished. This eliminates the conflict/error of trying to update the same record twice at the same time.

      In multi-threaded processing, there is a huge possibility this error would occur, no question about it.

      Single-threaded/Sequential processing = slow, but safe.

      Multi-threaded/Simultaneous processing = fast, risky.

      Now, if I had to develop an application that processes these XML files, which type of processing would you suggest and why?

      Thanks in advance

      Goran

    Viewing 5 reply threads
    • Author
      Replies
      • #57864
        Anonymous
        Participant

          Goran,

          I’m going to assume that you’re going to develop this application for this special purpose.  Here’s an idea that comes off the top of my head; write a program that is allowed to have many instances running at the same time.  However, when a message is being processed, the application must lock the person id so that other instances can’t work on that person at the same time.  As long as all instances of the application can access the lock mechanism you can prevent the contention problem you describe.
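
A minimal sketch of this locking idea, assuming the "instances" are threads inside one Python process. Separate processes would need a lock they can all reach, such as a database lock. The names `PersonLocks`, `apply_update`, and the in-memory `addresses` dict are hypothetical stand-ins, not part of Cloverleaf or the application described here.

```python
import threading
from collections import defaultdict


class PersonLocks:
    """Hands out one lock per person ID (hypothetical helper)."""

    def __init__(self):
        self._guard = threading.Lock()            # protects the lock table itself
        self._locks = defaultdict(threading.Lock)

    def for_person(self, person_id):
        # Always return the same lock object for the same person ID.
        with self._guard:
            return self._locks[person_id]


locks = PersonLocks()
addresses = {}  # stand-in for the database table


def apply_update(person_id, address):
    # Only one worker at a time may touch a given person;
    # updates for different persons can proceed in parallel.
    with locks.for_person(person_id):
        addresses[person_id] = address


def run_demo():
    updates = [("5", "Apple St. #1"), ("7", "Pear St. #3"), ("5", "Orange St. #2")]
    threads = [threading.Thread(target=apply_update, args=u) for u in updates]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return addresses
```

Locking keeps two updates for the same person from overlapping, but by itself it does not decide which of them lands last.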

          What would be the possibility of the application listening on a socket?  Then perhaps Cloverleaf could send the XML files directly to the application.  Or what about using Cloverleaf to execute the SQL?  Just random thoughts.

          Cheers,

        • #57865
          Jim Kosloskey
          Participant

            My .02 worth…

            I vote for slow but safe.

            However, I guess it would be possible for system A in File 1 to set an address, then system B in File 2 to set the address for the same person to something else, and then system A in File 3 to set the address for the same person again.

            In that case, it seems no matter how you process the data, it is possible to have the wrong result. Ideally you would want one system to be the system of ‘truth’ for certain data (say the person’s address) and ignore transactions for that data from other systems.
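
A rough sketch of the "system of truth" idea, assuming each update arrives tagged with its source system. The `SOURCE_OF_TRUTH` mapping, the system names "A" and "B", and both function names are made up for illustration.

```python
# Hypothetical mapping: which source system is authoritative for each field.
SOURCE_OF_TRUTH = {"address": "A", "phone": "B"}


def filter_update(source_system, fields):
    """Keep only the fields this source system is allowed to change.

    Fields with no owner in SOURCE_OF_TRUTH are dropped as well.
    """
    return {
        name: value
        for name, value in fields.items()
        if SOURCE_OF_TRUTH.get(name) == source_system
    }


def apply_updates(record, updates):
    """Apply (source_system, fields) updates in arrival order."""
    for source_system, fields in updates:
        record.update(filter_update(source_system, fields))
    return record
```

With this filter in place, an address sent by system B is ignored, so the address always reflects the latest value from system A regardless of how B's files interleave with A's.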

            Jim Kosloskey

            email: jim.kosloskey@jim-kosloskey.com 29+ years Cloverleaf, 59 years IT - old fart.

          • #57866
            Goran Djuranovic
            Participant

              Greg,

              Before I comment on your reply, let me first expand on what I said before. The application is actually going to be a service. So, roughly explained, the service is envisioned to be only a handler of XML files (plus error, log, and message handler). It simply watches the folder for new files, and as soon as the file is found (exported from CloverLeaf), it is picked up, validated, and sent to SQL server’s stored procedures. Then, the stored procedures process (update, insert, delete, etc.) records (one by one) from the file, and therefore, are responsible for locking the record being processed.
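
The file-handling side of such a service could be sketched as below, assuming a simple polling pass rather than OS-level file notifications. The function names, the `done_dir` archive folder, and the `handle` callback (standing in for validation plus the stored-procedure call) are hypothetical.

```python
from pathlib import Path


def pending_files(inbox):
    """Return XML files in the inbox, oldest first (by modification time, then name)."""
    files = [p for p in Path(inbox).glob("*.xml") if p.is_file()]
    return sorted(files, key=lambda p: (p.stat().st_mtime_ns, p.name))


def process_inbox(inbox, done_dir, handle):
    """Process each pending file one at a time, then move it out of the inbox."""
    done = Path(done_dir)
    done.mkdir(parents=True, exist_ok=True)
    processed = []
    for path in pending_files(inbox):
        handle(path)                     # e.g. validate and call a stored procedure
        path.replace(done / path.name)   # move it so it is not picked up again
        processed.append(path.name)
    return processed
```

Calling `process_inbox` in a loop with a short sleep gives the sequential behavior described in the original question: a file is only started after the previous one has been handled.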

              From the above, you can see that the service is not the problem. Basically, it can pick up and send a 50,000-record XML file in a couple of seconds, but the stored procs will take a lot of time to process it.

              As for the locking, I was afraid you were going to mention that

            • #57867
              Goran Djuranovic
              Participant

                Jim Kosloskey wrote:

                My .02 worth…

                Ideally you would want one system to be the system of ‘truth’ for certain data (say the person’s address) and ignore transactions for that data from other systems. Jim Kosloskey


                Not a bad idea. At least some of the data processing could be set up like that, if not all.

                Thanks

                Goran

              • #57868
                Dennis Pfeifer
                Participant

                  I’ve had a similar problem in which I wanted to increase throughput, but had a serialization problem.

                  The solution: multiple queues, with multiple processes, with each queue segmented by a unique key ..

                  ok ..so what does that mean ..

                  In medicine, our primary key is generally the MRN..

                  (for some it may be accession number .. others order number…)

                  so .. let’s create 5 queues .. mod the MRN by 5 .. so that you have a number from 0 to 4 ..

                  create 5 directories… each with a unique sequencing ..

                  store the XML in the associated directory ..

                  Now .. you have created a sequencing by patient, but allow 5 separate processes to work on the data.. .
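
The routing step above can be sketched as follows, assuming numeric MRNs and one directory per queue. The directory naming (`queue0` to `queue4`) and the function names are hypothetical.

```python
from pathlib import Path
import shutil

NUM_QUEUES = 5  # five queues, as in the description above


def queue_for(mrn):
    """Map an MRN to a queue number 0..4. Assumes the MRN is numeric."""
    return int(mrn) % NUM_QUEUES


def route_file(xml_path, mrn, base_dir):
    """Move an XML file into the directory that belongs to its MRN's queue."""
    target_dir = Path(base_dir) / f"queue{queue_for(mrn)}"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / Path(xml_path).name
    shutil.move(str(xml_path), str(target))
    return target
```

Every file for a given MRN lands in the same directory, so a single sequential worker per directory preserves per-patient order while the five workers run side by side. Non-numeric keys would need a stable hash (for example `zlib.crc32` of the encoded key) in place of `int(mrn)`.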

                  Just how I’d approach it ..

                  I’m doing this today with some Lab and ADT interfaces ..

                  … I prefer fast and safe

                  That’s my $1.98 (plus shipping and handling) .. inflation hits everything!

                  Dennis

                • #57869
                  Anonymous
                  Participant

                    All,

                    I could be wrong, but it is my understanding that the multiserver protocol will not process more than one message at a time. After all, it is configured in a thread that is running under only one process. What this protocol does is allow more than one connection at the same time (producing the multiprocess sensation), but once the messages are in the engine, they get a number and their outbound order is established… So this should be fast and safe at the same time.

                    Am I missing something?
