To split out files I would use pdftk.
The difficulty will be determining which pages belong together. If every page can stand alone, pdftk’s burst operation, which writes each page to its own file, is all you need.
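As a rough sketch of the burst step, here is how it could be driven from Python. It assumes pdftk is installed and on the PATH; the function names and the `pages` output directory are made up for illustration.

```python
import subprocess
from pathlib import Path


def burst_command(src, out_dir):
    """Build the pdftk call that writes one PDF per page of `src`."""
    # pdftk fills %04d with the page number: pg_0001.pdf, pg_0002.pdf, ...
    pattern = str(Path(out_dir) / "pg_%04d.pdf")
    return ["pdftk", str(src), "burst", "output", pattern]


def burst(src, out_dir):
    """Run the split and return the page files in page order.

    Raises subprocess.CalledProcessError if pdftk exits with an error.
    """
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(burst_command(src, out_dir), check=True)
    return sorted(Path(out_dir).glob("pg_*.pdf"))
```

Calling `burst("scan.pdf", "pages")` should leave one file per page in `pages/`. The zero-padded names mean a plain sort keeps them in page order.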
pdftk can also uncompress a PDF’s page streams, which may let you run a lexical analyzer over the content to determine which of the single-page PDFs belong together.
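To give an idea of what that analysis might look like, here is a very crude sketch. It uncompresses a file with pdftk and pulls out the literal strings passed to the `Tj` (show text) operator. The function names are hypothetical, and the regex is an assumption about how the text is stored. Many PDFs use hex strings, `TJ` arrays, or custom font encodings that this will miss, and scanned pages contain no text at all.

```python
import re
import subprocess

# A literal string operand followed by the Tj operator, e.g. "(Invoice 42) Tj".
# Escaped characters such as \( and \) are allowed inside the string.
TJ_STRING = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj")


def uncompress_command(src, dst):
    """Build the pdftk call that writes an uncompressed copy of `src`."""
    return ["pdftk", str(src), "output", str(dst), "uncompress"]


def shown_strings(pdf_bytes):
    """Return the raw literal strings shown with Tj, in file order."""
    return [m.group(1).decode("latin-1") for m in TJ_STRING.finditer(pdf_bytes)]


def page_strings(src, dst):
    """Uncompress `src` into `dst`, then extract its Tj strings."""
    subprocess.run(uncompress_command(src, dst), check=True)
    with open(dst, "rb") as fh:
        return shown_strings(fh.read())
```

If the strings that come back include something like an invoice number or a "Page 1 of 3" footer, that is probably enough to classify on. If they come back empty or as gibberish, this route is likely a dead end for that document.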
My first approach would be:
1) split out all the pages into separate files
2) figure out a way to classify each of the separate PDFs. My first lead here is pdftk’s uncompress feature. Failing that, I would do a lot of Google searches and, as a last resort, look at borrowing the libraries behind a PDF viewer like evince or okular (both use poppler for PDF).
3) join up the groups of pages into their own PDFs.
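Step 3 could be sketched like this, assuming step 2 has already produced a label for every page. The labels, file names, and function names here are placeholders, and grouping only consecutive pages with the same label is my assumption about how the documents are laid out.

```python
from itertools import groupby


def group_consecutive(labelled_pages):
    """Group runs of adjacent pages that share a label.

    `labelled_pages` is a list of (page_file, label) tuples in page order.
    Returns a list of (label, [page_file, ...]) tuples, also in page order.
    """
    groups = []
    for label, run in groupby(labelled_pages, key=lambda item: item[1]):
        groups.append((label, [page for page, _ in run]))
    return groups


def cat_command(pages, out_file):
    """Build the pdftk call that joins `pages` into `out_file`."""
    return ["pdftk", *pages, "cat", "output", out_file]
```

Each command from `cat_command` can be handed to `subprocess.run(..., check=True)`. If the same label can show up again later in the file, the groups would need to be merged by label rather than by adjacency.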
This may not be the best approach, but it’s what I would take if trying to do this sort of thing.