Below is what I get with Notepad:
I wouldn't know where to begin telling you how to parse this unless each line started with a known sequence or each line was some sort of flat record layout.
I don’t know how you would find all of the 03 records without some sort of rule-based parsing. How would you differentiate between 03 as a record start and 03 in the date field?
Maybe it is something this web site does to your data?
01 2867840 “last,first m” 4061976 1 999-99-9999 1 2062012 2323 4 02 2 2
“last,first” 1010 OAK 03 7241976 1042 1901203VC42 BPZ810914413 2 “last,first” 1
306 HEATHER LANE 2 1012011 555-555-5555 01 2980946 “last,first” 10281949 2
9999-99-9999 6 1 3272012 771 4 02 1 999-99-9999 230766 5 1979 MILKY WAY
608-271-9000 555-555-5555 2 “last,first” 2405 BROOKFIELD AVE 03 10281949 2211
999999999A 2 “last,first” 5 2405 BROOKFIELD AVE 01 12012011 800-800-8000 03
10281949 69521 xxx123 R59723788 2 “last,first” 5 2405 BROOKFIELD AVE 1 1012009
555-555-5555 01 2999240 “last,first” 1311989 1 1 2062012 15376 4 02 1 475678 1
555-555-5555 1 “last,first” 4102 GLADDEN AVE 03 1311989 1193 44702569500 1
“last,first” 1 4102 GLADDEN AVE 01 555-555-5550
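
To make the rule-based idea concrete, here is a minimal sketch of one rule that could be tried on the sample above. It is only a guess at the layout: it assumes that a record starts wherever a standalone token is exactly 01, 02 or 03, and that quoted names count as one token. The names split_records, RECORD_MARKERS and TOKEN_RE are made up for this illustration.

```python
import re

# Assumption (not confirmed by the data): a record starts at any
# standalone token that is exactly 01, 02 or 03.
RECORD_MARKERS = {"01", "02", "03"}

# A token is a quoted string (curly or straight quotes) or a run of
# non-whitespace characters, so "last,first m" stays in one piece.
TOKEN_RE = re.compile(r'“[^”]*”|"[^"]*"|\S+')


def split_records(text):
    """Group tokens into (marker, fields) pairs, ignoring line breaks."""
    records = []
    for token in TOKEN_RE.findall(text):
        if token in RECORD_MARKERS:
            # Start a new record at every standalone marker token.
            records.append((token, []))
        elif records:
            # Anything else belongs to the most recent record.
            records[-1][1].append(token)
        # Tokens before the first marker are dropped.
    return records


sample = '01 2867840 “last,first m” 4061976 1 02 2 2\n“last,first” 1010 OAK 03 7241976 1042'
for marker, fields in split_records(sample):
    print(marker, fields)
```

Because it only matches whole tokens, an 03 buried inside a longer date value does not trigger a split. It still cannot tell a record marker from a field whose entire value happens to be 01, 02 or 03, which is exactly the ambiguity asked about above, so the real record layout would be needed to do better.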