US20090048877A1 - Insurance claim forecasting system - Google Patents
- Publication number
- US20090048877A1 (application US 12/145,281)
- Authority
- US
- United States
- Prior art keywords
- cost
- period
- data
- level
- group
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/08—Insurance
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06Q10/10—Office automation; Time management
Definitions
- This invention pertains to health, disability and life insurance systems, particularly including processing data (in the business of health insurance) for estimating future costs or liability and setting optimal pricing. For convenience, we call one embodiment of our invention More Accurate Predictions for Health Insurance Premiums or MAP4HIP.
- Group health insurance is typically priced through a series of steps. Historical claims costs are calculated by summing the costs of insured individuals. Actuaries estimate what the general cost inflation trend will be next period. If an insured group is large enough to have credible experience (historical costs), the inflation trend may be applied to the historical claims experience to produce an estimate of the expected claims for next period. A profit margin and administrative costs are added to the expected group claims costs to produce the so-called “experience rate”. An underwriter reviews the group's experience and adjusts the cost and profit margin-based price depending on special circumstances and competitive pressure. The standard practice is to use group-level data for estimating costs and setting prices except for very small groups, individual policies or specific medical stop loss insurance. Information on the insured's (i.e., individual's) medical conditions is typically not used when group-level data are used for underwriting and pricing the group's aggregate cost forecast.
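The pricing steps described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the parameter names and the convention that the profit margin is a fraction of expected claims while administrative cost is a flat amount are hypothetical and are not taken from the source. The underwriter's later adjustment is not modeled.

```python
def experience_rate(member_claims, trend, admin_cost, profit_margin):
    """Illustrative group price built from historical claims.

    member_claims: historical claim cost of each insured individual
    trend: forecast cost inflation for next period (0.10 means 10%)
    admin_cost: administrative cost as a flat amount (assumed convention)
    profit_margin: profit loading as a fraction of expected claims (assumed)
    """
    # Historical claims cost is the sum over insured individuals.
    historical = sum(member_claims)
    # Apply the inflation trend to get expected claims for next period.
    expected_claims = historical * (1.0 + trend)
    # Add the profit margin and administrative costs.
    return expected_claims * (1.0 + profit_margin) + admin_cost


if __name__ == "__main__":
    rate = experience_rate([1200.0, 300.0, 4500.0], trend=0.10,
                           admin_cost=500.0, profit_margin=0.05)
    print(round(rate, 2))
```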
- The current standard practice for estimating future health care costs for groups of 50 or more employees plus their dependents uses one of two methods or a combination of them. If the group is large enough to have credible, stable experience, the historical costs are assumed to be the best estimate of next period's costs once a cost trend factor for inflation has been applied. If the group is too small to have credible historical costs, many groups are combined and averaged so that a stable demographic look-up table of historical average costs by age group, gender and family size can be developed and used as a weighting mechanism for estimating the expected future costs of non-credible groups. Cost trend factors for inflation are then applied. If a group's experience is only partially credible, a blended average of its experience and a demographic look-up table forecast is used. These standard actuarial methods account for neither person-level trends in historical costs nor medical information about the person.
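A minimal sketch of the blended estimate follows, assuming a single credibility weight between 0 and 1. The source does not say how the weight is chosen, so `credibility`, the table keys and the function names here are hypothetical.

```python
def lookup_cost(members, table):
    """Sum look-up table costs for members keyed by
    (age_band, gender, family_size). The key layout is an assumption."""
    return sum(table[(m["age_band"], m["gender"], m["family_size"])]
               for m in members)


def blended_forecast(experience_cost, table_cost, credibility, trend):
    """Blend group experience with the demographic table estimate.

    credibility = 1.0 reproduces the fully credible case (experience only),
    credibility = 0.0 the non-credible case (table only).
    """
    if not 0.0 <= credibility <= 1.0:
        raise ValueError("credibility must lie in [0, 1]")
    blended = credibility * experience_cost + (1.0 - credibility) * table_cost
    # The cost trend factor for inflation is applied after blending.
    return blended * (1.0 + trend)


if __name__ == "__main__":
    table = {("30-39", "F", 1): 3000.0, ("40-49", "M", 3): 9000.0}
    members = [
        {"age_band": "30-39", "gender": "F", "family_size": 1},
        {"age_band": "40-49", "gender": "M", "family_size": 3},
    ]
    table_cost = lookup_cost(members, table)
    print(blended_forecast(15000.0, table_cost, credibility=0.5, trend=0.08))
```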
- Small groups (i.e., 50 or fewer employees plus their dependents) and individual medical policies may use medical questionnaires from initial enrollment applications as input to an underwriter for estimating next period's group-level costs.
- Manual underwriting is expensive due to the labor intensity and is prone to variability among underwriters as their experience varies.
- Some state Medicaid HMO programs (e.g., Colorado and Maryland) and federal Medicare HMO programs are using statistical algorithms that make person-level cost forecasts based on diagnoses from the computerized medical bills and demographic factors.
- These “risk adjustment” methods do not use procedures or historical person-level costs, because the governments do not want to create incentives for increased utilization of services and higher spending.
- The governments' intent for HMO payments or managed care is to make payments proportional to the insured population's need for care based on their health conditions but not on prior care.
- Historical cost is the single best predictor of future medical cost for credible groups. Not using it as part of the forecasting method decreases the accuracy of the forecast.
- Some medical insurance companies may be using “risk adjustment” algorithms such as those used by Medicare, Medicaid and others, which are intended for managed care cost forecasting or payment allocation.
- Stop loss health (or medical) insurance is typically purchased by self-insured employers that wish to limit their medical expense exposure.
- The most common form of medical stop loss insurance is known as “specific stop loss” insurance, which is a high deductible (usually $25,000 to $100,000) insurance policy per insured person.
- Specific stop loss medical insurance is designed to protect the employer or other payer from large catastrophic medical expenses such as those incurred for liver transplants or care for neonates with major repairable congenital anomalies.
- The standard method for underwriting specific stop loss medical insurance uses a demographic look-up table to estimate costs for individuals whose medical expenses were under 50% of the deductible in the previous year.
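The per-person deductible and the 50% screening rule can be illustrated as below. This is a sketch, not the insurers' actual procedure: the function names are hypothetical, and the source does not say how individuals at or above the 50% threshold are handled, so the second list is simply labeled as needing separate review.

```python
def specific_payout(person_cost, deductible):
    """Amount a specific stop loss policy would pay for one insured person:
    only the part of that person's cost above the deductible."""
    return max(0.0, person_cost - deductible)


def split_by_prior_cost(prior_costs, deductible):
    """Separate persons whose prior-year cost was under 50% of the deductible
    (rated from the demographic look-up table) from everyone else."""
    threshold = 0.5 * deductible
    table_rated = [pid for pid, cost in prior_costs.items() if cost < threshold]
    needs_review = [pid for pid, cost in prior_costs.items() if cost >= threshold]
    return table_rated, needs_review


if __name__ == "__main__":
    prior = {"a": 1000.0, "b": 30000.0, "c": 25000.0}
    print(split_by_prior_cost(prior, deductible=50000.0))
    print(specific_payout(120000.0, deductible=50000.0))
```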
- Aggregate stop loss medical insurance coverage is also purchased by the employer.
- Aggregate coverage (exclusive of specific payments) means that the insurer will pay the employer's or other payer's medical cost obligations for a covered group if those costs exceed an agreed upon amount (i.e., an “attachment point”).
- The attachment point is typically defined as 125% of the group's expected cost in the insured period.
- The industry standard for calculating the expected cost is substantially the same method as used for fully insured plans. In other words, if the group is large enough to have completely credible experience, the last year's experience is modified by forecast inflation and increased by 25% to produce the 125% attachment point.
- For a group with partially credible experience, a weighted combination of experience and the demographic look-up table model is used with an inflation forecast and increased by 25% to calculate the 125% attachment point.
- For a group without credible experience, the demographic look-up table model is used as the starting point; its estimate is trended for inflation and then increased by 25% to calculate the 125% attachment point.
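As a worked illustration of the arithmetic, the attachment point is the trended base cost multiplied by 1.25. The function and parameter names are hypothetical; `base_cost` stands for whichever starting estimate applies (experience, look-up table or a weighted combination).

```python
def attachment_point(base_cost, trend, corridor=1.25):
    """Aggregate attachment point from a base cost estimate.

    base_cost: starting cost estimate for the group
    trend: forecast inflation for the insured period (0.08 means 8%)
    corridor: multiple of expected cost, 1.25 for the typical 125% point
    """
    expected_cost = base_cost * (1.0 + trend)
    return corridor * expected_cost


if __name__ == "__main__":
    # 1,000,000 of prior cost trended at 8%, then increased by 25%.
    print(round(attachment_point(1_000_000.0, trend=0.08), 2))
```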
- Aggregate only medical stop loss insurance has been recently offered by one company (Cairnstone) to credible groups, and we believe that it uses group-level experience plus trended inflation to estimate future costs. Price is usually determined by competitive pressure but the inventors are not familiar with proprietary techniques used by the insurers.
- Aggregate only stop loss health insurance: A health insurance product for self-funded employers that want to cap their maximum liability.
- The aggregate only policy will pay off costs above an agreed upon limit (i.e., the attachment point).
- The attachment point is typically 125% of expected costs, but it could be 110% or some other amount.
- The expected costs are estimated using an embodiment of this invention or using standard actuarial methods.
- Aggregate only stop loss does not include specific stop loss. However, specifics can be combined with aggregate stop loss. In that case the specific payments are not included in the costs counted against the aggregate attachment point.
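The interaction between aggregate and specific coverage can be sketched as follows. This is an illustration under assumptions: the function name and arguments are hypothetical, and the rule that specific payments are excluded is modeled by capping each person's counted cost at the specific deductible.

```python
def aggregate_payout(person_costs, attachment, specific_deductible=None):
    """Amount an aggregate stop loss policy would pay for a covered group.

    person_costs: cost per insured person in the insured period
    attachment: agreed attachment point for the group
    specific_deductible: if specific coverage is combined with the aggregate
        policy, costs above this per-person deductible are paid by the
        specific policy and are not counted against the attachment point
    """
    if specific_deductible is None:
        counted = sum(person_costs)
    else:
        counted = sum(min(cost, specific_deductible) for cost in person_costs)
    return max(0.0, counted - attachment)


if __name__ == "__main__":
    costs = [10000.0, 200000.0, 40000.0]
    print(aggregate_payout(costs, attachment=150000.0))
    print(aggregate_payout(costs, attachment=150000.0,
                           specific_deductible=50000.0))
```

In the second call the 150,000 of one person's cost above the 50,000 deductible is left to the specific policy, so the group's counted cost falls below the attachment point.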
- Base Period A period of typically 12 consecutive months prior to the lag period during which services were provided to some enrollees and reflected by claims entered in a computer file. In practice, it may be more or less than 12 months. Risk factors are coded on data from the base period. These data are used to forecast the next period costs. In other words, these data are used to calculate the predictors for the development model and are not used for underwriting actual health insurance policies.
- Book of Business The insurance of a given type (e.g., small group, individual, large group) for all persons covered by an insurer at a point in time or during a specified period.
- An insurer may have multiple books of business.
- Bias Test A comparison of observed to predicted values from a model. The totals of both these values are equal to the total population which served as the standard in the preparation of the model. Bias tests determine whether or not there is any meaningful systematic disparity between observed and predicted cost when persons are sorted by predicted values, age or family composition or other characteristics. Disparities are considered as bias which better models eliminate or reduce. Another related measure sorts by the actual rather than the predicted values and is a measure of the accuracy of the forecasts.
- Candidate Predictor Variable An array of variables derived from the CI (client insurer) database and available to the statistical software which selects those which are most predictive of the dependent variable (e.g., by stepwise OLS, CART regression trees).
- Claim amount This is the total cost or payments made by the insurer.
- Claim codes include ICD-9-CM diagnosis and procedures, CPT codes, National Drug Codes and other standardized coding system values such as SNOMED codes.
- Claim-based risk factors are risk factors derived from the claim code, claim amount and transformations of the claim amount, type and place of services, provider type, units of service and other information contained on a health care claim. These risk factors are present in either the base or underwriting period.
- Clinical risk factors Risk factors derived from the claim codes, type and place of service and provider type but not solely from the claim amount.
- Client Insurer (CI)—The insurance entity for which the invention is to be applied.
- Payments The amounts actually paid by the insurer. Payments are always less than the claims due to deductibles, benefits and non-covered services.
- Cost Inflation Used interchangeably with cost trend. The secular trend in costs per person for health care due to changes in practice patterns and price per service. Does not usually consider changes in a population's health care needs which are usually minimal in the short run. Differs from pure price inflation such as that measured in the consumer price index (CPI).
- CPI consumer price index
- Demographic look-up table This is a method used by actuaries to estimate group-level costs when the group is too small to have credible experience. Average costs are calculated across a large pool of groups and averages are calculated by cell in a table of age by sex by family composition or other similar demographics. The appropriate cell amounts are applied to each person or employee in a non credible group and summed to calculate its expected cost.
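The look-up table method described above can be illustrated with a short sketch. The age bands, field names, and sample figures below are hypothetical; an actuary's real table would use its own cells (and typically family composition as well).

```python
from collections import defaultdict


def age_band(age):
    """Hypothetical age bands used as one dimension of the table."""
    if age < 18:
        return "0-17"
    if age < 45:
        return "18-44"
    return "45+"


def build_lookup_table(pool):
    """Average cost per (age band, sex) cell across a large pool of persons."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for person in pool:
        cell = (age_band(person["age"]), person["sex"])
        totals[cell] += person["cost"]
        counts[cell] += 1
    return {cell: totals[cell] / counts[cell] for cell in totals}


def expected_group_cost(group, table):
    """Apply the matching cell average to each person and sum over the group."""
    return sum(table[(age_band(p["age"]), p["sex"])] for p in group)


pool = [
    {"age": 30, "sex": "F", "cost": 2000.0},
    {"age": 40, "sex": "F", "cost": 3000.0},
    {"age": 50, "sex": "M", "cost": 6000.0},
    {"age": 60, "sex": "M", "cost": 8000.0},
]
table = build_lookup_table(pool)
small_group = [{"age": 35, "sex": "F"}, {"age": 55, "sex": "M"}]
print(expected_group_cost(small_group, table))  # 9500.0
```

Because every person in a cell receives the same average, this method ignores clinical information from claims.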
- the dependent measure is the forecast of the model through application of the interaction capturing technique. A transformation may be applied to the dependent measure to calculate the claim amount (e.g., multiplying a probability by an average cost).
- the dependent measure is the future cost of health care for the population which comprises the CI book of business at the time the rates are to be quoted.
- the dependent measure is disability days.
- the dependent variable is the probability of the event.
- Enrollment-based risk factors are risk factors that are derived from the enrollment information only such as age, sex, relationship to the enrollee, length of enrollment, geographic locale and type of coverage and does not include claim information or claim amount.
- the employee's salary, disability coverage terms and term life insurance coverage terms may be included in the enrollment file also.
- a traditional group is a collection of employees and their dependents that work for an employer at a location.
- a group can be an individual or a family by purchasing an “individual” health insurance policy where the remaining immediate family may also be covered by the policy.
- Health Insurance Insurance for the array of benefits covered by the health insurance policies of the client insurance company or a self-insured company including hospital, surgical and medical care plus drug benefits for some plans. Medical insurance is used as a synonym.
- Hybrid Tree Analysis The use of regression trees (or other analytic method output) as input to other regression models such as OLS, median and logistic regression or neural networks. Additionally, a model's output (e.g., regression or neural network) may be used as input into the regression or probability tree.
- Interaction Capturing Technique A mathematical and logical transformation of independent variables that predicts a response or dependent variable.
- the interaction capturing technique includes main effects, interaction effects and possibly time series effects.
- Statistical techniques that are examples of interaction capturing techniques include, but are not limited to, ANOVA, regression methods (e.g., linear, logistic, shrinkage, robust, ridge), regression trees, moving averages and autoregressive moving averages, look-up tables, means, probability models, clustering algorithms and many other methods.
- Data mining techniques that are examples of interaction capturing techniques include, but are not limited to, decision trees, rule induction, genetic algorithms, neural networks, nearest neighbor and other data mining methods.
- Lag Period A period between the base period and the next period or the underwriting and policy period which is required because of delays in filing claims, preparing or revising model weights, calculating premium rates and submitting them to insured groups in a timely way.
- MAP 4 HIP This is an acronym of More Accurate Predictions for Health Insurance Premiums which in turn is a brief title for our invention for its application to health insurance.
- Next Period typically a 12 consecutive month period subsequent to the base period and the lag period that contains the data that comprise the dependent variable used in the development model. Actual insurance policies are not written for this period but are underwritten for the policy period.
- Specific stop loss health insurance A health insurance coverage for self-funded employers or other payors that has a very high deductible per person. Usually the deductible is at least $10,000 and may be as high as $500,000 per person. Typically the deductible is between $25,000 and $100,000 per person and is meant to pay for catastrophic care.
- Subscriber unit The family unit by which the health insurance premium is charged. For example, the simplest are two units: 1) a single person and 2) two or more people. Single person, married couple and three or more people is a common classification but more detailed versions are also used. The subscriber is the employee.
- TPA Third Party Administration
- Underwriting Period A period of typically 12 consecutive months prior to the lag period during which services were provided to some enrollees and reflected by claims entered in a computer file. In practice, it may be more or less than 12 months. Risk factors are coded on data from the underwriting period. These data are used to forecast the policy period costs. In other words, these data are used to calculate the predictors for the model that is used for underwriting actual health insurance policies.
- Winsorize Data are Winsorized if the most extreme observations on one or both ends of the ordered samples are replaced by the nearest retained observation. Our cost distributions have no low cost outliers and hence Winsorization is applied only to the high end of the ordered sample.
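A minimal sketch of high-end-only Winsorization as defined above. The function name and the choice of specifying the number of extreme observations (rather than a percentile) are assumptions for illustration.

```python
def winsorize_high(costs, n_extreme):
    """Replace the n_extreme largest values with the nearest retained observation.

    Only the high end is modified, since the cost distribution is assumed
    to have no low-cost outliers. If n_extreme is not between 1 and
    len(costs) - 1, the data are returned unchanged.
    """
    if n_extreme <= 0 or n_extreme >= len(costs):
        return list(costs)
    ordered = sorted(costs)
    cap = ordered[-n_extreme - 1]  # largest observation that is retained
    return [min(c, cap) for c in costs]


costs = [120.0, 0.0, 950.0, 40_000.0, 310.0, 250_000.0]
print(winsorize_high(costs, 2))  # [120.0, 0.0, 950.0, 950.0, 310.0, 950.0]
```

The original ordering of persons is preserved, so the Winsorized values can still be matched back to person-level records.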
- One aspect of the invention contemplates a computer-implemented process of developing a person-level cost model for forecasting future costs attributable to claims from members of a book of business, where person-level data regarding actual base period health care claims are available for a substantial portion of the members of the book of business for an actual underwriting period, and the forecast of interest (i.e., future claim amount) is for an actual policy period which can be, but is not necessarily contiguous with the actual underwriting period, having the steps of:
- a further aspect of the invention contemplates a computer-implemented process wherein the interaction capturing technique is selected from the group consisting of median regression tree techniques, least square regression tree techniques, rule induction techniques, ordinary least squares regression techniques, median regression techniques, robust regression techniques, genetic algorithms, rule induction, clustering techniques and neural network techniques.
- Yet another aspect of the invention is a computer implemented process wherein the person-level next period cost forecasts are adjusted by modifying the extant cost forecast by the expected cost trend.
- a yet further aspect of the invention is a computer implemented process wherein the data from the claims used as predictors consist essentially of the claim- and enrollment-based risk factors and the claim amount is a standardized cost of services provided and the model is used to allocate prospective payments to health care providers.
- a still yet further aspect of the invention is a computer implemented process wherein the data used from the claims data consist essentially of the claim code and selected mandatory procedures and the claim amount is a standardized cost of services provided during the same time period as the base period and the model is used to evaluate the efficiency of health care providers.
- Another aspect of the invention is a computer implemented process of forecasting future claim amounts attributable to claims from members of a book of business for an actual policy period, wherein the model development universe comprises data from the members of a book of business to be insured, further comprising:
- Yet another aspect of the invention is a computer implemented process comprising the step of: setting insurance reserves based on group-level forecast for the actual policy period, wherein the policy period is a reserving period for claims that have not occurred or that have occurred but not been reported.
- Yet still another further aspect of the invention is a computer implemented process, wherein claim amounts are a mix of fee for service payments and capitation payments so that the base and underwriting periods risk factors are appended to include dummy variables for the presence of capitation payments by provider type and the cost estimate in the next and policy periods is the fee for service cost that must be supplemented with the expected capitation payments.
- Still another aspect of the invention is a computer-implemented process of developing a hybrid person-level health care claim cost forecasting model for forecasting future medical costs attributable to health care claims from members of a book of business, where person-level data are available for a substantial portion of the members of the book of business, comprising the steps of:
- development universe data comprising person-level data for a statistically meaningful number of individuals, the person-level data comprising continuous variable data and categorical variable data;
- processing first the continuous variable data for each individual with a continuous processing technique that captures the predictive ability of main effects and interactions of continuous variables to generate a person-level continuous variable model;
- categorical variable data for each individual including the output from the continuous processing technique with a categorical processing technique that captures the predictive ability of main effects and interactions of categorical variables to generate a person-level categorical variable model;
- person-level continuous variable model and person-level categorical variable model together comprise a hybrid person-level health care claim amount forecasting model.
- Yet another aspect of the invention is a computer-implemented process of developing a claim amount forecasting model for use in forecasting the future claim amount for members of a book of business, where person-level data are available for a substantial portion of the members of the book of business for an actual base period, and the claim amount of interest for forecasting purposes is an actual next period which can be, but is not necessarily contiguous with the actual base period, comprising the steps of:
- having-claims cost forecasting model and the without-claims forecasting model comprise a claim amount forecasting model.
- Yet another aspect of the invention is a computer-implemented process of developing a health care claim amount forecasting model for use in forecasting the future medical claim amount for members of a book of business, where person-level data are available for a substantial portion of the members of the book of business for an actual base period, and the claim amount of interest for forecasting purposes is an actual next period which can be, but is not necessarily contiguous with the actual base period, comprising the steps of:
- Another aspect of the invention is a computer-implemented process comprising:
- inlier-having-claims cost forecasting model and the inlier-without-claims forecasting model comprise an inlier claim amount forecasting model.
- a still further aspect of the invention is a computer-implemented process of forecasting a claim amount attributable to claims from members of a book of business during an actual policy period, comprising the steps of:
- a still further aspect of the invention is a computer-implemented process of forecasting costs attributable to claims from members of a book of business during an actual policy period, comprising the steps of:
- person-level data comprising enrollment data and actual underwriting period health care claims data, for members of a book of business, where the person-level data on a health care claim comprises at least a claim amount and a claim code and the actual underwriting period can be, but is not necessarily, contiguous with the actual policy period;
- a model development universe of person-level data comprising enrollment data, historical base period health care claims data and historical next period claim amount data for a statistically meaningful number of individuals, where the person-level data on a base period health care claim includes at least a claim amount and a claim code;
- Yet a further aspect of the invention is an automated system for forecasting future costs attributable to claims from members of a book of business during an actual policy period comprising:
- an insured person database accessible by the processor, wherein the database comprises person-level enrollment data and actual underwriting period health care claims data, for members of a book of business to be insured, where the person-level data on a health care claim comprises at least a claim amount and a claim code;
- model development universe database accessible by the processor, wherein the second database comprises model development universe of person-level data, comprising enrollment data, historical base period health care claims data and historical next period claim amount data for a statistically meaningful number of individuals, where the person-level data on the base period health care claim includes at least a claim amount and a claim code;
- risk factor encoder accessible by the processor, wherein the risk factor encoder encodes claim-based risk factors for each historical base period based on the claim code associated with the health care claim and the risk factor encoder encodes at least one enrollment risk factor based on the enrollment data;
- a model generator accessible by the processor, that generates a cost-forecasting model by capturing the predictive capacity of the main effects and the interaction of the risk factors assigned by the risk factor encoder to forecast the historical next period of the model development universe data using the historical base period data;
- a person-level cost generator that applies the cost-forecasting model to the person-level actual underwriting period health care claims data of each of the members of the book of business to generate a person-level actual policy period claim amount forecast for each member of the book of business;
- an actual policy period group-level cost forecast generator that totals the person-level actual next period forecasts for each member of the group to generate an actual policy period group-level cost forecast.
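The last two components above (the person-level cost generator and the group-level cost forecast generator) can be sketched as follows. The toy model, its coefficients, and the record layout are purely hypothetical stand-ins; the specification's actual model is produced by the model generator from the development universe.

```python
from collections import defaultdict


def toy_cost_model(risk_factors):
    """Hypothetical stand-in for a fitted cost-forecasting model."""
    return 1000.0 + 2500.0 * len(risk_factors)


def forecast_group_costs(model, members):
    """Apply the model to each member, then total the forecasts by group."""
    person_forecasts = {}
    group_totals = defaultdict(float)
    for member in members:
        cost = model(member["risk_factors"])
        person_forecasts[member["person_id"]] = cost
        group_totals[member["group_id"]] += cost
    return person_forecasts, dict(group_totals)


members = [
    {"person_id": "p1", "group_id": "G1", "risk_factors": []},
    {"person_id": "p2", "group_id": "G1", "risk_factors": ["diabetes"]},
    {"person_id": "p3", "group_id": "G2", "risk_factors": ["diabetes", "chf"]},
]
person_level, group_level = forecast_group_costs(toy_cost_model, members)
print(group_level)  # {'G1': 4500.0, 'G2': 6000.0}
```

Any callable that maps a person's encoded risk factors to a cost could be passed in place of `toy_cost_model`.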
- Still another aspect of the invention is a computer-implemented process of forecasting costs attributable to claims from members of a book of business during an actual policy period, comprising the steps of:
- means for providing person-level data comprising enrollment data and actual underwriting period health care claims data, for members of a book of business, where the person-level data on a health care claim comprises at least a claim amount and a claim code and the actual underwriting period can be, but is not necessarily, contiguous with the actual policy period;
- means for providing a model development universe of person-level data comprising enrollment data, historical base period health care claims data and historical next period claim amount data for a statistically meaningful number of individuals, where the person-level data on a base period health care claim includes at least a claim amount and a claim code;
- a still further aspect of the invention is a group insurance product comprising:
- a stated monetary insurance premium including a forecast of said benefits, estimated costs of administering the insurance product, and optionally, an estimated profit
- Yet another aspect of the invention is a method of pricing group insurance including a cost of future benefits according to the computer-implemented process of forecasting future medical costs attributable to claims from members of a group during an actual underwriting period, comprising the steps of:
- each possible price also having an expected profit that is the amount of the price over the group level cost forecast plus the expected amount of administrative costs;
- Still another aspect of the invention is a method of underwriting an insurance product comprising the steps of:
- each of diagnosis and CPT based risk factor is independent of the sequence in time of other diagnosis and CPT based risk factors.
- a further aspect of the invention is a method of underwriting an insurance, for insuring short term disability costs wherein the interaction capturing technique uses a dependent measure from the next period and policy period comprising the number of STD days in the policy period and weights the dependent measure by the expected cost per day for the STD to produce the person-level expected STD costs and summed across the group to produce the group's expected STD cost.
- a still further aspect of the invention is insuring long term disability (LTD) claims wherein a dependent measure for generating the cost forecasting model is the probability of a LTD claim in the policy period where the probability is weighted by the net present value of the LTD and applying the cost forecasting model to the person-level data produces person-level expected LTD costs wherein summing the person-level expected LTD costs across the group to produce a group's expected LTD cost for an actual policy period.
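The weighting described in the two disability aspects above amounts to simple expected-value arithmetic, sketched below. The field names and sample figures are hypothetical; in practice the expected STD days and LTD claim probabilities would come from the person-level forecasting models.

```python
def expected_std_cost(expected_days, cost_per_day):
    """Person-level STD cost: forecast STD days weighted by expected cost per day."""
    return expected_days * cost_per_day


def expected_ltd_cost(claim_probability, net_present_value):
    """Person-level LTD cost: claim probability weighted by the NPV of the LTD."""
    return claim_probability * net_present_value


def group_expected_costs(people):
    """Sum the person-level expected costs across the group."""
    std_total = sum(expected_std_cost(p["std_days"], p["cost_per_day"]) for p in people)
    ltd_total = sum(expected_ltd_cost(p["ltd_prob"], p["ltd_npv"]) for p in people)
    return std_total, ltd_total


group = [
    {"std_days": 2.0, "cost_per_day": 150.0, "ltd_prob": 0.01, "ltd_npv": 200_000.0},
    {"std_days": 0.5, "cost_per_day": 200.0, "ltd_prob": 0.002, "ltd_npv": 500_000.0},
]
std_total, ltd_total = group_expected_costs(group)
print(f"STD: {std_total:.2f}  LTD: {ltd_total:.2f}")
```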
- LTD long term disability
- a still yet further aspect of the invention is a cost forecast produced for first-dollar health insurance.
- Another aspect of the invention is a cost forecast produced for stop loss health insurance.
- a still further aspect of the invention is a cost forecast produced for aggregate-only stop loss health insurance.
- Still another aspect of the invention is a cost forecast produced for specific stop loss health insurance.
- Yet another aspect of the invention is a cost forecast for insuring group term life insurance costs wherein a dependent measure for generating the cost forecasting model is the expected probability of death weighted by the amount of life insurance to produce the person-level expected term life insurance cost.
- model development universe comprises data from the members of a group in the book of business to be insured.
- a still yet further aspect of the invention comprises the step of: setting insurance reserves based on the renewal group-level forecast for the actual underwriting period, wherein the next period is a reserving period for claims that have not occurred or that have occurred but not been reported.
- FIG. 1 is a flowchart of an embodiment of an overview of a method for estimating future cost and optimizing pricing.
- FIG. 2 is a flowchart of an embodiment of a method like that of FIG. 1 which is particularly adapted for service bureau processing.
- FIG. 3 is a flowchart of an embodiment of a method like that of FIG. 1 which is particularly adapted for use as a software product, which may be functionally distributed locally or over the Internet.
- FIG. 4 is a more detailed flowchart of a process for data processing of steps 102 , 202 or 302 of FIGS. 1 , 2 and 3 .
- FIG. 5 is a more detailed flowchart illustrating a process for standardizing time periods, for use in the methods of FIGS. 1-3 , and in particular steps 102 , 202 and 302 .
- FIG. 6 is a flowchart illustrating data validation and standardization procedures for steps 102 , 202 and 302 of the methods of FIGS. 1-3 .
- FIG. 7 is a flowchart illustrating the matching and merging (integration) of data in the process steps 102 , 202 or 302 of FIGS. 1-3 .
- FIG. 8 is a flowchart illustrating the aggregation and risk factor coding for the steps 102 , 202 or 302 of the processes of FIGS. 1-3 .
- FIG. 9 is a flowchart of processing steps for developing cost forecasting models based on “inlier” data in steps 106 , 204 , 210 , 304 or 310 of the methods of FIGS. 1-3 .
- FIG. 10 is a detailed flowchart of process steps for developing cost forecasting models based on “outlier” data of the Winsorized data for the steps 106 , 204 , 210 , 304 or 310 , of the methods of FIGS. 1-3 .
- FIG. 11 is a detailed flowchart for scoring, testing and integrating the data, and adjusting for cost trends for use in steps 106 , 204 , 210 , 304 or 310 as well as 108 , 208 and 306 of the methods of FIGS. 1-3 .
- FIG. 12 is a detailed flowchart illustrating processing steps for developing group-level models and making adjustments to the summary of the person-level data of steps 106 and 108 of FIGS. 1 , 204 , 208 and 210 of FIG. 2 , or 304 , 306 and 310 of FIG. 3 .
- FIG. 13 is a detailed flowchart of an embodiment of a price optimization procedure which may be used to carry out steps 110 , 212 , or 308 of FIGS. 1-3 .
- the present invention is directed to insurance systems, particularly including methods for processing health insurance data to estimate future costs, and for optimizing pricing of health insurance products, including both first-dollar and stop loss insurance products.
- it involves processing historical data, developing algorithms, applying those algorithms, updating those algorithms and setting prices.
- the insurance systems that can benefit from the methods and systems disclosed herein also include, but are not limited to, health insurance, disability insurance, both short term and long term, as well as term life insurance systems.
- This invention comprises a series of related products that provide more accurate group-level claim amount forecasts (and person-level forecasts for individual or family health insurance) and more optimal group-level renewal prices for insurers at full risk for the health insurance (e.g., indemnity, PPO, HMO, POS) or aggregate only stop loss health insurance for self insured employers.
- These forecasting models for renewal price setting are not intended to be used for paying managed care providers but alternate related models are developed for that purpose (see B in Table 1 below).
- the products provide more accurate future cost estimates by forecasting person-level costs using models that include clinical information from historical health insurance claims as well as person-level demographic and historical cost data.
- effective models may be based on data from relatively large groups of at least 50,000 people, such as typically covering an entire book of business for an insurer (or a large subclass of the insurer's book of business such as all HMO groups of the insurer) or in the case of a TPA, the TPA's entire book of business.
- the most recent year of person-level medical claim data for the individuals of a particular book of business for which an accurate cost forecast is desired may be processed by this model, to produce an accurate projected cost for policy pricing, as will be described. Future cost trend estimates (inflation) are adjusted for each individual's characteristics and applied to the person-level estimates.
- Person-level cost forecasts are summarized to the family-level or group-level and family or group-level characteristics are used to adjust the summarized cost to produce the adjusted family or group-level cost forecast.
- the price is optimized using a system that estimates the probability of the group accepting the insurance at the price offered, given the group's historical insurance cost, historical claims history, and local competitive market conditions.
- the probability is weighted by a function of the expected future profit, which equals the anticipated price less expected medical and administrative costs.
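One way to read the price optimization described above is as a search over candidate prices for the one that maximizes acceptance probability times expected profit. The sketch below assumes a logistic acceptance curve in the ratio of offered price to the group's current price, and takes the weighting function to be the profit itself; both are illustrative assumptions, as are all names and figures, and the specification does not state the functional form.

```python
import math


def acceptance_probability(price, current_price, sensitivity=10.0):
    """Hypothetical acceptance curve: 0.5 at the current price, falling as price rises."""
    return 1.0 / (1.0 + math.exp(sensitivity * (price / current_price - 1.0)))


def expected_profit(price, medical_cost, admin_cost):
    """Anticipated price less expected medical and administrative costs."""
    return price - medical_cost - admin_cost


def optimal_price(candidates, current_price, medical_cost, admin_cost):
    """Pick the candidate price with the highest probability-weighted profit."""
    def objective(price):
        return (acceptance_probability(price, current_price)
                * expected_profit(price, medical_cost, admin_cost))
    return max(candidates, key=objective)


candidates = [900_000.0 + 10_000.0 * i for i in range(31)]  # 900k to 1.2M
best = optimal_price(candidates, current_price=1_000_000.0,
                     medical_cost=850_000.0, admin_cost=50_000.0)
print(best)
```

A real system would estimate the acceptance curve from the group's history and local market conditions rather than fixing its shape in advance.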
- the method and models, with slight adjustments, can be applied to self-insured employers' aggregate only, specific only or specific plus aggregate medical stop loss data.
- the products also include the use of the method applied to a client's book of business for estimating future claim amounts for purposes of setting a reserve by group and for cost forecasting and pricing for new groups or individuals for fully insured health insurance.
- Another alternative application would be the use of the method to develop and deliver products that allow HMOs to prospectively allocate health care payments to providers.
- Another product is the measurement of the efficiency of health care providers.
- These methods can be applied to medical claims linked to future short or long-term disability payments or indicators of disability and used to rate the relative risk of disability of groups or forecast their future costs by using the group's medical claims, enrollment data and summarized group-level or person-level disability payments.
- Another application is to group term life insurance.
- the dependent measure is the probability of death next period which is linked to medical claims in the base period and the potential risk factors are the same potential risk factors as used with the other models.
- the modeling strategy employed for the cost forecasting models contains several novel components. We have used a combination of specialized data collection and cleaning, regression trees and regression (ordinary least squares or OLS, logistic and median) models tailored to a client's book of business, and the application of these models to the client's book of business for improved decision making. While there are many published examples of OLS being used for purposes similar to this application, there are a few using trees. We are not aware of any reports using a combination of regression trees and other regression models to forecast health care costs. The use of the output of a tree model as an input to other regression algorithms is known as “hybrid” tree models. (See D. Steinberg and N.
- a typical group health insurance product in accordance with the present invention (such as the various types of Blue Cross™ and Blue Shield™ brand group health insurance policies, which are incorporated herein by reference) comprises an identification of the types of medical expenses which are agreed to be covered, paid or reimbursed by an insurer to or on behalf of members of the group (including their covered dependents) which are incurred by members of the group during a future time period, typically one year, in exchange for a stated monetary insurance premium which includes a forecast of said medical expenses in accordance with the methods described herein, estimated costs of administering the health insurance product, and an estimated profit.
- Table 1 summarizes the alternate uses of our method as applied to health care enrollment and claims data linked with claim amounts for first dollar and stop loss coverage, disability coverage, reserves and term life coverage. These alternate model developments produce products that are customized for specialized applications. Row A-1 is the application of our invention which is presented in most detail in this application. The methods used in A-1 are clearly related to those in each of the other rows.
- Optimal pricing for a fully insured group requires an accurate forecast of the group's mean cost per person in the policy period.
- Optimal pricing for an aggregate only medical stop loss insurance for a self-insured employer also requires an accurate forecast of that group's mean cost per person in the policy period. Therefore, the exact same methodology can be used for the cost forecast for fully insured groups or for self-insured group's aggregate only stop loss insurance if the same data are available. There is a difference in the methods used to set prices since the employer will pay for the majority of the medical expenses when it is self-insured and thereby paying a premium that is far smaller than with full health insurance when the insurer pays all of the medical costs.
- CapCost™ is an aggregate only medical stop loss product that includes a system for making more accurate cost forecasts (for groups with 51 to 3000 employees mainly).
- the attachment point for CapCost™ can be the standard 125% of expected costs (called CapCost 125™) but we will offer an attachment point at 110% of expected costs (called CapCost 110™) and possibly other attachment points.
- CapCost™ are similar to those of traditional medical stop loss insurance, but there is cash flow protection, medical costs are cumulated on an incurred basis rather than a paid basis, and there is no specific stop loss coverage.
- CapCost™ is useful for employers since many will receive prices that are below the price of traditional specific plus aggregate medical stop loss insurance while the maximum aggregate medical liability for the group may be lower with CapCost™ than with traditional specific plus aggregate medical stop loss insurance. From the insurer's perspective, the expected medical claims it must pay with CapCost™ are frequently below those of traditional medical stop loss products since specific stop loss coverage is not provided. Generally, CapCost™ is a better value for the employer than traditional stop loss coverage when the employer is larger than the average employer purchasing stop loss coverage or if the group has experienced some unusually high annual medical expenses due to a few high cost individuals that are unlikely to have high costs recurring in the near future.
- CapCost™ is novel in the way expected future medical costs are estimated. Historical medical claims, enrollment, benefit plan and employer files in electronic format are collected from the Third Party Administrator (TPA) or insurance company that is paying the employer's medical bills. The electronic files containing the medical claims and enrollment data are collected for all people with medical coverage rather than from only those that had large claims.
- This invention's cost forecasting models are applied to the insured people covered by the employer. The inflation trend and optimized pricing are then applied to the cost estimates.
- the CapCost™ product is a system for data collection, cost estimates, and price optimization and is part of this invention. Separate products are designed for pricing new or renewal coverage for fully insured medical plans and for allocating reserves for such medical plans. Each contains a system for data collection and cost estimation. Price optimization is an additional part of this invention for fully insured medical plan renewals and stop loss coverage.
- the MAR (mean absolute residual) is the mean of the absolute value of the difference between the actual and predicted cost of a group. A lower MAR is desirable since it means the predicted cost is closer to the actual cost.
- the results are presented as a percentage of the mean of the groups' costs or the predicted divided by the actual times 100.
- the MAR was 11.6% for the invention's prediction, 14.2% for the experience model, and 25.8% for the demographic model for the 116 actual groups in our database.
- the invention forecast was substantially better than either of the two conventional forecast methods.
- a measure of model accuracy addresses whether and by how much the model systematically over- or under-predicts the actual costs for various characteristics of the insured population.
- the actual cost is divided by the forecast cost to make an index.
- the index should be close to 1.0 if the model is accurate.
- the invention's forecast is always closer to 1.0 for every decile indicating that it is a superior model to the experience model.
- the invention's ratio of predicted to actual was about 0.91 for the lowest decile and about 1.32 for the highest decile while the experience model's ratios were about 0.85 and about 1.55, respectively.
- the other deciles were closer to 1.0 but the invention forecast was always closer to 1.0 than the experience forecast.
- the invention includes a general process for developing models for forecasting health care costs.
- the invention also includes processes for products that incorporate the process and provide information for improving specific business decisions made by health insurers, including, but not limited to, aggregate only, specific only and specific plus aggregate stop loss health insurance products.
- the models may be developed for specific insurers and their book of business, and may be different for each insurer.
- health data on members of the book of business are collected, cleaned, integrated and aggregated, as shown in step 102. If the data are missing or miscoded, the cost forecasts may also be inaccurate. Most of the programming cost and analysis involves these phases of the process.
- the client's data may typically be in many different computer systems or databases, and the data may need to be combined to build person-level files that are complete for a specified time period.
- a twelve month “base period” is typically used as the period from which we collect this data to describe each person's history of claims, diagnoses and other factors.
- the base period could be a longer period or shorter period and will depend on how long the groups have been enrolled and the time for which adequate computer or other records are kept.
- the base period may have different time periods for people and groups that do not have the same enrollment renewal dates.
- the “next period” is typically the period of twelve months of insurance coverage immediately following the lag period.
- the claim amount forecast period is the next or policy period that is priced for the group.
- the “next period” is the relevant time period for the dependent variable in the cost forecast models.
- a new cost forecasting model may need to be developed for them, for example, as shown in step 104 of FIG. 1 .
- An alternative is to use existing forecasting models and recalibrate those models to the new or updated data.
- Our methods include a systematic process to develop new models or recalibrate old models. A new model is developed when the old database upon which the old model was developed is not representative of the new database.
- the new database is substantially different in size, covers a different geographic region, contains different types of insurees (e.g., predominantly elderly in Medicare; pregnancy and children are characteristic of Medicaid) or different types of payments (e.g., capitation payments plus fee for service payments).
- the selection of the population to be modeled is of key importance since the predictor variables and their weights will reflect not only the specific needs of the population, but also the practice patterns of those providing care and the prices charged for its health care services.
- the ideal population to use as a standard is the CI's book of business for which the forecasts are needed, provided it is of sufficient size. We have found that an insured population (i.e., book of business) as small as 50,000 persons can produce robust cost forecasts.
- In step 106, if it is determined that a new cost forecasting model should be developed, there is a specified process for developing the model.
- the method for developing the new cost forecasting model is part of our product and it can be applied to any medical insurance database that includes the necessary information.
- the cost forecasting model is calibrated on the historical data to model the dynamics of medical care, practice patterns, and pricing in the geographic markets and provider networks used by the customer.
- the groups of insured people used as a standard in our models must be enrolled for at least the last day of the “base period”, for the entire lag period and the next period. Multiple sets of base period, lag period and next period can be used to increase the amount of data used to create the cost forecasting model. More data produces more robust models, but must be adjusted for secular cost trends when there are multiple calendar years for the “base period”.
- Scoring the data for pricing insurance for the policy period involves applying the forecasting model to the data for the underwriting period that will be used to forecast cost for the policy period—the renewal year that needs pricing, as shown in processing block 108 .
- the first step in the scoring 108 is applying the data steps to the new underwriting period that have not been previously applied (e.g., coding of risk factors).
- the cost forecasting model is applied to the person-level data. External health care inflation forecasts from the CI or consulting organization are then used to adjust the prior year's trend inherent in the person-level forecasts. The person-level inflation adjusted cost forecasts are then aggregated to the group-level.
- group-level adjustments to the forecasts are applied for benefit plan design, SIC code, and other factors influencing group costs.
- the price to be charged for the medical insurance for the group for that period may be determined, as shown in block 110 of FIG. 1 .
- the insurer generally desires to obtain a fair, or even maximum profit, without causing the group to leave for another insurer.
- the competitiveness of the market, historical prices, and historical costs are all factors that will influence the likelihood of the group being retained at any given price.
- the policy premium, that is, the price to be charged to the customer for the medical insurance coverage for the specific group, comprises the forecast medical cost, the insurer's overhead and other business expenses, and a projected profit.
- the client's underwriter(s) are asked to provide explicit probabilities of retaining a group at various price increases.
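- As a minimal sketch of how underwriter-supplied retention probabilities might feed a pricing decision, the Python below picks the candidate price increase with the highest expected profit. The expected-profit objective, the function names, and all dollar figures and probabilities are assumptions made for illustration; the text only states that the premium comprises forecast cost, overhead and profit, and that retention probabilities are elicited at various price increases.

```python
def expected_profit(premium, forecast_cost, overhead, p_retain):
    """Margin at this premium, weighted by the probability the group renews."""
    return p_retain * (premium - forecast_cost - overhead)


def best_price(current_premium, forecast_cost, overhead, retention_by_increase):
    """Return (increase, premium, expected profit) for the best candidate.

    retention_by_increase maps a fractional price increase (0.05 means 5%)
    to the underwriter's stated probability of retaining the group.
    """
    best = None
    for increase, p_retain in sorted(retention_by_increase.items()):
        premium = current_premium * (1.0 + increase)
        profit = expected_profit(premium, forecast_cost, overhead, p_retain)
        if best is None or profit > best[2]:
            best = (increase, premium, profit)
    return best


# Hypothetical inputs, expressed per member per year.
retention = {0.00: 0.95, 0.05: 0.90, 0.10: 0.75, 0.15: 0.50}
increase, premium, profit = best_price(4000.0, 3600.0, 300.0, retention)
```

- With these made-up numbers the 10% increase is chosen: a larger increase raises the margin but the drop in retention probability more than offsets it.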
- FIGS. 2 and 3 similarly provide an overview of the information flows for two different embodiments.
- the embodiment of FIG. 2 involves substantially only the transfer of data.
- the embodiment of FIG. 3 involves installing software at the client or an Internet connection with the client's software.
- Shown in FIG. 2 is a “service bureau” embodiment in which all of the data preparation, cost forecasting, model development, scoring the data, and pricing for specific individual groups is carried out at a service bureau location.
- medical history and claims data for members of the group are sent to the service bureau location, and a cost forecast or per group price or both are sent back to the client (see 212 ).
- An alternative is for software to be installed in the client's (insurance company's or third party administrator's) operations with model updates being periodically provided to the client.
- This historical data (typically provided by an insurance company or TPA) is used to develop a model that is calibrated to the book of business (see the sample data requested of the client, and/or for specific policy types of insurance companies). A base period, lag period, and next period are required as a minimum. The data are fully validated prior to the model development.
- cost forecasting models are developed which include person-level inlier models based on the Winsorized data (see FIG. 9 ) and outlier cost components (see FIG. 10 ), inflation adjustments (see FIG. 11 ), group-level attribute models (see FIG. 12 ), and pricing models (see FIG. 13 ).
- the data are stored and combined with the previous data submission until three to six months of new data are available, as shown in block 210 .
- the new data are combined with the most recent data from the previous data submission so that the most recent 12 months of data are available and are used as the updated next period for recalibration of the models to be used for scoring other groups.
- the old models are refit with the new data and updated cost trends are included also. Every one to two years the models may be revised with updated predictor variables and weights. Redoing the models will help capture changes in practice patterns and relative pricing.
- the summarized cost forecast and pricing information are sent to the client for use by underwriters or in an automated quotation system.
- the insurance company or other underwriter client may also use its own pricing algorithm using the cost forecast produced by the method of FIG. 2 .
- FIG. 3 similarly illustrates an overview of an embodiment of the present invention which may be directly utilized by a health insurer or medical underwriter.
- the various parts of operational software and work flows of the client database may be adapted to automatically extract data, validate it, score the data with the forecasting models, and price the groups.
- the medical history, cost and other data elements used, and timing of the data extracts are normalized or standardized for utilization in the method and automating the recalibration of the models as shown in block 310 .
- An alternative to installing the software on the client's computers is to perform that task using the Internet (as an Internet Service Provider or ISP) to extract the data and return cost forecast and group prices to the client.
- processing software modules for carrying out the present method may be installed on client computers, to utilize the standardized data for the software.
- the prices are offered to that group for renewed medical insurance, whether it be first-dollar, stop-loss or other coverage. This can be done using a human underwriter or as part of an automated quotation system.
- the software will capture the updated data and combine it, as shown in block 310 . Those data will be used to recalibrate the models after about three to six months of data accumulation.
- the updating may be performed offline, or may include automatic database updating and model recalibration. Completely new models may be developed about every one to two years offline.
- the first step in the data portion of the process is the data request.
- This process is flexible so that it can be modified to work around alternative formats and data sets used to formulate the candidate predictor variables.
- the dollar value of claims made in the base period and claims paid (or disability or life indicator ratios) in the next period are essential. Enough time for run out of claims is necessary so that incurred but not reported (IBNR) claims are included in the data.
- this data may preferably be in the form of five different data files that are linked by an encrypted identifier.
- the identifier should include unique characters for the company, family, and person.
- the data files should include group-level information, person-level information, detailed medical claims information (e.g., hospital, physician, durable medical equipment, home health, etc.), detailed pharmacy claims and capitation information, if germane.
- data for a relatively large number of people (e.g., 500,000), covering 27 consecutive months (12 month base, 3 month lag, and 12 month test periods).
- Presence of other health insurance (e.g., spouse coverage, Medicare)
- Method for payment (e.g., per member per month)
- the models can be built without pharmacy data if that is not covered by the insurance. Enrollment and medical claims data are required. Many of the group-level variables are desirable, but optional.
- the data format would specify the dates for the beginning of the base period and the end of the next period or new base period to be used for the cost forecast for pricing. Because the data may originate from a variety of different databases and sources, control totals (e.g., number of records, sums of fields) are also included, to assure that the data is excerpted and formatted properly.
- the customer or TPA sends a layout and a sample database, so that tests can be run prior to extracting all of the data.
- Valid ranges of variables are checked as shown in block 406 . Control totals are matched, and encrypted IDs may be tested.
- the data need not be aggregated and tested since it is a small subset of the data universe, but the conformity of the sample data to the layout is checked.
- the data extraction program or layout are fixed and another sample data set or layout is tested.
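- The control-total and valid-range checks described above could be sketched as follows in Python. The record layout, field names, tolerance and age range are hypothetical; this is an illustration of the idea, not the validation program referred to in the text.

```python
def check_control_totals(records, expected_count, expected_sums, tolerance=0.005):
    """Compare an extract against the control totals sent with it.

    records        -- list of dicts, one per claim record
    expected_count -- number of records the sender reported
    expected_sums  -- dict of field name -> sum the sender reported
    Returns a list of discrepancy messages (empty when everything matches).
    """
    problems = []
    if len(records) != expected_count:
        problems.append(f"record count {len(records)} != {expected_count}")
    for field, expected in expected_sums.items():
        actual = sum(r[field] for r in records)
        if abs(actual - expected) > tolerance:
            problems.append(f"sum of {field} {actual:.2f} != {expected:.2f}")
    return problems


def out_of_range(records, valid_ranges):
    """Return (record index, field) pairs whose value is outside its valid range."""
    bad = []
    for i, r in enumerate(records):
        for field, (low, high) in valid_ranges.items():
            if not (low <= r[field] <= high):
                bad.append((i, field))
    return bad


# Hypothetical sample extract.
claims = [
    {"paid": 120.50, "age": 34},
    {"paid": 75.25, "age": 61},
    {"paid": 310.00, "age": 130},  # implausible age, should be flagged
]
```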
- the dates for the model development overall, and the base period for actual cost forecasting and pricing are established and defined, and the respective dates for each respective group have been set prior to the data request. Now the dates for each group must be determined for its inclusion in the universe of the model development.
- a list is developed for the renewal dates for the first year of coverage that would have prospective prices set using this method, as shown in block 502 .
- Table 2 lists an example of time sequencing for developing models and implementing cost predictor models.
- the groups need to get a price in advance of the coverage date for new customers, or the renewal date for existing customers, so that they can accept or reject it prior to the renewal coverage. Additionally, time for receiving data from the client or a TPA and analyzing it must be added to the lag period. A three month lag period has been used and may be used in processing block 504, but it could be longer or shorter depending on database and business needs.
- the beginning of the lag period is the last date that bills can be paid for the base period of the model development period. Otherwise, the cost forecasting model would include information that would not be available in the future.
- the lag period information (claims paid or made) need not be used to provide an accurate cost forecast for a future time period for a particular group.
- the claims incurred during the next period are the dependent variable for the model of the illustrated embodiment. An estimate of claims incurred but not reported may be added on if there is insufficient time for a proper run-out period (i.e., if only one base period and next period are used for model development).
- the lag period precedes the next period and the base period is typically the year preceding the beginning of the lag period in the universe of model development.
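- To make the sequencing of the base, lag and next periods concrete, here is a small Python sketch that derives the three date ranges from the first day of the next period. The function names are hypothetical, the default lengths (12, 3 and 12 months) follow the typical values given in the text, and the start date is assumed to fall on the first of a month.

```python
from datetime import date, timedelta


def add_months(d, months):
    """Shift a first-of-month date by a number of months (may be negative)."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def model_periods(next_start, base_months=12, lag_months=3, next_months=12):
    """Return the (start, end) dates of the base, lag and next periods."""
    one_day = timedelta(days=1)
    lag_start = add_months(next_start, -lag_months)
    base_start = add_months(lag_start, -base_months)
    next_end = add_months(next_start, next_months) - one_day
    return {
        "base": (base_start, lag_start - one_day),
        "lag": (lag_start, next_start - one_day),
        "next": (next_start, next_end),
    }


# Example: a next period that begins on 1 January 2008.
periods = model_periods(date(2008, 1, 1))
```

- For this example the base period runs from 1 October 2006 to 30 September 2007, the lag period covers the last three months of 2007, and the next period is calendar year 2008, which together span the 27 consecutive months mentioned in the data request.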
- Table 2 illustrates one example of timing for the processing of block 508 .
- Column A represents the model development period and Column B represents timing for the application of cost forecasting and prospective pricing.
- the model development time period precedes the actual pricing period but there is overlap since the next period of the model development period is used as part of the underwriting period for the application of cost forecasting and the pricing model.
- the timeline will be modified when longer lag periods are required.
- Column B pertains to groups with the same renewal date. Alternate flowcharts may be used to represent each renewal date.
- Illustrated in FIG. 6 is a flowchart illustrating data validation and standardization procedures for steps 102 , 202 and 302 of the methods of FIGS. 1-3 .
- Preliminary data validation checks, and initial data preparation as a second set of data checks, are performed as shown in block 602 . Utilizing a file structure that will allow for standards to be compared to the data prior to the data aggregation is a facilitating procedure.
- medical claims include diagnoses that are typically coded in ICD-9-CM codes, procedures that are coded in CPT codes, prescriptions that are coded using NDC codes, hospitalizations coded using DRGs, ICD-9-CM and other codes, that may appear on claims. Tables are developed that contain the values for all of these codes. These tables are standards for comparison with the customer's data and the values in the data must correspond to valid values for these coding systems.
- tables are made for each client, because the place of service, type of provider, dates, and other fields on the claims and enrollment data will frequently have values that are idiosyncratic to a particular database or customer.
- the values should preferably be put in a table format that will allow checking and standardizing the data for accuracy, as shown in block 606 .
- the time periods at the group-level may be used to screen if claims and insureds should be in the universe.
- a table is used for comparison.
- Prior experience permits the development of norms that can be used to check the data for reasonableness. Examples include the charge and payment per claim, the number of claims per person, and other norms. These values are put into a table for comparison, and processing in block 610 .
- Preparation (see block 612 ) of the raw data involves the same data process steps used in FIG. 4 , utilizing specified read programs.
- the data (see block 614 ) are provided in the agreed upon medium, the data are read and control totals are checked, see block 616 . If errors are noted, the cause is determined and corrected.
- the raw data are reformatted, see 618 , into a SAS database in the illustrated embodiment.
- Other database software (e.g., SPSS, Oracle, etc.) could be used.
- the fields are reformatted (see 620 ) so that the values correspond to the standard tables, the group-level time period (see TABLE 2) tables are used to extract, see 622 , the universe of relevant claims and insured people, and claims for people that are not in the model development universe are put into a separate file (see 624 ).
- Data following the model development universe time period may fit into the underwriting period data that will be used for the application of cost forecasting and pricing.
- Data that do not match the standards are put, see 630 , into a separate file.
- the cause of the mismatches is evaluated, and the data is deleted or corrected where appropriate. Records may need to be sent back to the customer for replacement or fixing. If there is a large number of mismatches, they must be fixed prior to aggregation.
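- The comparison of claim fields against standard tables, with mismatches routed to a separate file, might be sketched like this in Python. The code tables shown are tiny illustrative subsets, and the client-specific recode table and all names are hypothetical.

```python
# Tiny illustrative subsets; real standard tables would hold every valid value.
VALID_CODES = {
    "diagnosis": {"250.00", "401.9", "486"},
    "place_of_service": {"11", "21", "23"},
}

# Hypothetical client-specific table mapping idiosyncratic values to standard ones.
CLIENT_RECODES = {"place_of_service": {"OFFICE": "11", "INPT": "21"}}


def standardize_and_split(claims):
    """Recode client-specific values, then separate claims that fail validation."""
    valid, mismatched = [], []
    for claim in claims:
        cleaned = dict(claim)
        for field, recodes in CLIENT_RECODES.items():
            cleaned[field] = recodes.get(cleaned[field], cleaned[field])
        ok = all(cleaned[field] in codes for field, codes in VALID_CODES.items())
        (valid if ok else mismatched).append(cleaned)
    return valid, mismatched


sample = [
    {"diagnosis": "401.9", "place_of_service": "OFFICE"},
    {"diagnosis": "XYZ", "place_of_service": "21"},  # invalid diagnosis code
]
valid, mismatched = standardize_and_split(sample)
```

- Keeping the mismatched records rather than discarding them leaves room for the evaluation step described above, in which records are corrected, deleted, or returned to the customer.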
- FIG. 7 is a flowchart illustrating the matching and merging (integration) of data in the process steps 102 , 202 or 302 of FIGS. 1-3 .
- the social security number or other identifier is encrypted so that actual people cannot be identified and group numbers are used instead of company names. Street addresses are not used so the people cannot be personally identified.
- records need to be linked for accurate models and pricing.
- One linking system that is effective uses the group ID as a prefix, the encrypted social security number of the enrollee as the family ID, and the enrollee or dependent number as the person ID. Birth dates and sex are useful as checks on the ID.
- the claims data are prepared separately, and a look-up table is generated that lists the group, family, person ID for all claims with the respective birth date and sex.
- the enrollment data are used to develop a separate enrollment look-up table which contains the same information as the claims look-up table. There will be more records in the enrollment table, since not every person in the group has a claim but every person should be in the enrollment file.
- the tables are merged and compared.
- the claims table should be a subset of the enrollment table. Claim IDs that do not match enrollment IDs indicate an error. These claims are put into a separate file and manually analyzed.
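- A minimal Python sketch of the look-up table comparison follows. The ID format, the placeholder encrypted family IDs, and the function names are assumptions; encryption itself is taken to have happened upstream.

```python
def person_key(group_id, family_id, person_no):
    """Compose a linking ID: group prefix, encrypted family ID, person number."""
    return f"{group_id}-{family_id}-{person_no:02d}"


def compare_lookups(claims_lookup, enrollment_lookup):
    """Compare claims and enrollment look-up tables keyed by linking ID.

    Each table maps linking ID -> (birth_date, sex).
    Returns (unmatched, conflicting, no_claims):
      unmatched   -- claim IDs missing from enrollment (an error to review)
      conflicting -- IDs whose birth date or sex disagree between the tables
      no_claims   -- enrolled people with no claims (expected, not an error)
    """
    unmatched = sorted(k for k in claims_lookup if k not in enrollment_lookup)
    conflicting = sorted(
        k for k, v in claims_lookup.items()
        if k in enrollment_lookup and enrollment_lookup[k] != v
    )
    no_claims = sorted(k for k in enrollment_lookup if k not in claims_lookup)
    return unmatched, conflicting, no_claims


# Hypothetical tables; "a1f3" and "zz99" stand in for encrypted family IDs.
claims_lookup = {
    person_key("G100", "a1f3", 1): ("1970-02-11", "F"),
    person_key("G100", "zz99", 1): ("1988-07-30", "M"),
}
enrollment_lookup = {
    person_key("G100", "a1f3", 1): ("1970-02-11", "F"),
    person_key("G100", "a1f3", 2): ("1999-05-02", "M"),
}
unmatched, conflicting, no_claims = compare_lookups(claims_lookup, enrollment_lookup)
```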
- the person-level merged file contains the enrollment information and claim information, but the record is not aggregated.
- Additional data validation checks occur such as the number of insureds per group and the percentage of people within each group that have no claims.
- the data are valid and ready to transform into the analytic database.
- FIG. 8 is a flowchart illustrating the aggregation and risk factor coding for the steps 102 , 202 or 302 of the processes of FIGS. 1-3 .
- the respective processing blocks of FIG. 8 are described as follows:
- the claims data are sorted by person ID and then by incurred date of the claim.
- This sort allows for a final screening on the chronological eligibility.
- a person in the group typically needs to have at least one day of eligibility in the base period and next period and continuous eligibility between those dates. Otherwise, they are dropped from the modeling database. If a person loses eligibility prior to next period, he or she is dropped from the entire analytic database. If the person enrolls in the lag period, that person is kept in a separate analytic database. This last category of people will have their next period payments compared to those of similar demographics. If a person is enrolled in the base period and disenrolls during the next period, those people are put into a separate file in the analytic database. Their next period payments will be compared to people with the same characteristics that did not leave in the next period. People in other time sequences may be dropped from the analytic database.
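- One possible reading of these eligibility rules is sketched below in Python. It assumes each person has a single continuous enrollment span, and the category labels and the order in which the rules are tested are choices made for this illustration rather than details given in the text.

```python
from datetime import date


def classify(enroll_start, enroll_end, base_end, next_start, next_end):
    """Assign a person with one continuous enrollment span to an analytic file."""
    if enroll_end < next_start:
        return "dropped"              # lost eligibility before the next period
    if enroll_start >= next_start:
        return "dropped"              # other time sequence: no base or lag history
    if enroll_start > base_end:
        return "lag_enrollee"         # enrolled during the lag period
    if enroll_end < next_end:
        return "left_in_next_period"  # enrolled in base, disenrolled in next period
    return "modeling"                 # eligible from base through end of next period


# Period boundaries for a hypothetical group.
BASE_END = date(2007, 9, 30)
NEXT_START = date(2008, 1, 1)
NEXT_END = date(2008, 12, 31)

status = classify(date(2006, 10, 1), date(2008, 12, 31), BASE_END, NEXT_START, NEXT_END)
```

- In this sketch a person who enrolls in the lag period and also leaves during the next period is labeled a lag enrollee; the text does not say which file such a person belongs in.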
- a new record is produced for each person. It includes the enrollment data and information extracted from the claims, when available.
- the risk factors use ICD-9-CM codes, CPT codes, place of service, provider type, demographic data, and other variables (see risk factor listing in Appendix G).
- the new record is a vector of variables that are initialized to zero and then incremented by one when that variable is read in the claims. These variables are coded from claims from the base period only. Payments and charges are summed for the base period, lag period, and next period. It is important to compare the expected cost from the forecasting model with the actual cost next period of those that were not in the modeling universe. If there are large discrepancies, the model may need adjustment.
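- The aggregation into one record per person could look roughly like the following Python sketch. The claim layout, the risk factor labels, and the use of ISO date strings (which sort chronologically as plain text) are assumptions for illustration.

```python
from collections import Counter, defaultdict


def aggregate_person(claims, periods):
    """Build one person-level record from that person's claims.

    claims  -- list of dicts with 'incurred' (ISO date string), 'codes',
               'charge' and 'paid'
    periods -- dict of period name -> (start, end) ISO date strings
    """
    counts = Counter()            # risk factor counters, implicitly zero at start
    totals = defaultdict(float)   # charges and payments summed per period
    for claim in sorted(claims, key=lambda c: c["incurred"]):
        for name, (start, end) in periods.items():
            if start <= claim["incurred"] <= end:
                totals[f"{name}_charge"] += claim["charge"]
                totals[f"{name}_paid"] += claim["paid"]
                if name == "base":    # risk factors come from base period only
                    counts.update(claim["codes"])
    return {"counts": dict(counts), "totals": dict(totals)}


# Hypothetical periods and claims for one person.
periods = {
    "base": ("2006-10-01", "2007-09-30"),
    "lag": ("2007-10-01", "2007-12-31"),
    "next": ("2008-01-01", "2008-12-31"),
}
claims = [
    {"incurred": "2007-01-10", "codes": ["dx_401.9", "office_visit"], "charge": 150.0, "paid": 100.0},
    {"incurred": "2007-03-05", "codes": ["office_visit"], "charge": 90.0, "paid": 60.0},
    {"incurred": "2008-02-01", "codes": ["dx_486"], "charge": 500.0, "paid": 400.0},
]
record = aggregate_person(claims, periods)
```

- Note that the next period diagnosis in the example contributes to the payment totals but not to the risk factor counts, which keeps future information out of the predictors.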
- the risk factors are then coded by processing the information on each person's aggregated record (See Appendix G).
- Risk factors were developed using a combination of expert medical opinion, statistical analyses, and knowledge of the medical insurance market. Diagnoses are divided into diseases and conditions and by inherent risk. Procedures are divided by body system, type of test, type of procedure, and type and site of care. Other risk factors are designed based on the relationship to the enrollee, family composition and demographics. There is a trade-off between a very specific risk factor that has very few but very homogeneous people in it and broad risk factors that have heterogeneous people in them. Correlations with the next period's payments and regression models are two ways to determine if a risk factor is worthwhile empirically.
- the base period charges and payments plus the shape of relative amounts of those payments by month, day, or other amount of time are some of the strongest risk factors (See TABLE 4).
- the amount of time enrolled in the base period is another risk factor.
- the key is developing robust risk factors that are not too heterogeneous. A priori logic plus trial and error are useful approaches.
- Our candidate risk factor codes are listed in Appendix G. TABLE 5 illustrates two family composition risk factors. A detailed listing of risk factors is contained in Appendix G: Risk Factors.
- FIG. 9 is a flowchart of processing steps for developing cost forecasting models based on “inlier” data in steps 106 , 204 , 210 , 304 or 310 of the methods of FIGS. 1-3 . Processing blocks of FIG. 9 are described as follows:
- the modeling universe database is separated into Winsorized data (i.e., inliers) and the outlier data.
- the independent variables are similar for the inliers and outliers. It has been found that models are more accurate when average payments per day is used as the dependent variable and average charges per day as predictor variables (and components of it such as the lowest ten months average charge per day). Cost per day adjusts for persons not enrolled for a complete year.
- the Winsorization point is typically selected as the payment per day at or above which the top 5% of people fall. If that value is $55 per day, then the inlier model uses a value of $55 per day as the dependent variable for people with greater than or equal to $55 per day in payments. People with under $55 per day in payments do not have their dependent variable changed.
- the database for the outlier models flags people with next period payments greater than or equal to the Winsorization value (e.g., $55 per day). If they are at or over the Winsorization amount, the flag equals one and zero otherwise. Also, the actual payments per day next period less the Winsorization amount is calculated. If it is negative, the outlier payment is set to zero.
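- The split of each person's payment into a Winsorized inlier amount, an outlier flag and an outlier excess can be written compactly. The Python below is a sketch with hypothetical function names; the percentile rule used to pick the Winsorization point is one simple convention among several.

```python
def winsorization_point(payments_per_day, top_percent=5):
    """Payment per day at or above which roughly the top top_percent of people fall."""
    ranked = sorted(payments_per_day)
    cut = len(ranked) * (100 - top_percent) // 100   # integer arithmetic, no rounding drift
    return ranked[min(cut, len(ranked) - 1)]


def split_payment(payment_per_day, point):
    """Return (inlier amount, outlier flag, outlier excess) for one person."""
    inlier = min(payment_per_day, point)             # capped dependent variable
    flag = 1 if payment_per_day >= point else 0      # at or over the point
    excess = max(payment_per_day - point, 0.0)       # negative excess set to zero
    return inlier, flag, excess


# Using the $55 per day example from the text.
high = split_payment(80.0, 55.0)   # (55.0, 1, 25.0)
low = split_payment(10.0, 55.0)    # (10.0, 0, 0.0)
```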
- The Winsorized modeling universe database is separated into two components: those individuals with claims in the base period and those individuals without claims in the base period. Those without claims have only demographic risk factors whereas those people with claims have a payment history and clinical information as additional risk factors. Those without claims are on average lower in risk than those with claims.
- the no claims database includes demographic variables, such as age and the family relationship to the enrollee plus risk factors from the enrollment file.
- the initial person-level model for people with claims uses the continuous independent variables only. Examples include the age, number of days enrolled in the base period, charges in the peak spending month, and average charge per day in the lowest ten months.
- the dependent variable is the Winsorized payment per day (or a transformation of it such as the fifth root) in the next period.
- An ordinary least squares (OLS) model has been used.
- Other forms of regression models (e.g., median or robust) or neural networks could be used.
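- For illustration only, the Python sketch below fits an ordinary least squares model to the fifth root of the Winsorized payment per day, using two continuous predictors. The solver (normal equations with Gauss-Jordan elimination), the choice of predictors, and every data value are assumptions; a statistical package would normally be used instead.

```python
def ols(rows, y):
    """Fit y = b0 + b1*x1 + ... by solving the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]               # prepend intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * t for x, t in zip(X, y))] for i in range(k)]
    for col in range(k):                              # Gauss-Jordan elimination
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]           # partial pivoting
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical people: (age, average charge per day in the lowest ten months).
predictors = [(34, 1.2), (51, 4.0), (45, 0.5), (62, 9.5), (29, 0.0)]
winsorized_paid_per_day = [1.0, 6.5, 0.8, 55.0, 0.1]
target = [p ** 0.2 for p in winsorized_paid_per_day]  # fifth-root transformation
coefficients = ols(predictors, target)                # [intercept, b_age, b_charge]
```

- Fitted values from such a model are on the fifth-root scale; raising them to the fifth power returns dollars per day, although that simple back-transformation is generally not unbiased for the mean.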
- the example given in the software in the CD-ROM Appendix does not include this step, but the program above does provide an example. This step can be important when there are several numerical candidate predictor variables.
- a CART median regression tree or other data mining technique is used to model the “no claims” Winsorized database.
- the first model (i.e., the one for continuous variables used in 908 )
- This model uses the same statistical techniques as 910 but its independent variables are limited to those that can be derived from the enrollment file.
- the output from the regression tree (terminal nodes) identifies groupings of people that have homogeneous next period payments.
- each regression tree terminal node groups people with similar median payments next period.
- a set of dummy variables is developed that identify people in each terminal node.
- These dummy variables, the variables that were used to form the dummy variables, and the significant variables from 908 are entered into a final prediction model.
- the result of those models is an expected payment per person per day in the next period. This only includes the Winsorized portion of the payments for people with claims in the base period.
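- Turning terminal node membership into dummy variables is straightforward; a short Python sketch is given below. The function name is hypothetical, and omitting one reference node (so the dummies are not collinear with an intercept) is a common convention assumed here rather than something the text specifies.

```python
def node_dummies(node_ids, reference=None):
    """Build one 0/1 column per terminal node, omitting a reference node.

    node_ids -- terminal node assigned to each person by the regression tree
    Returns (kept_nodes, rows), where rows[i][j] is 1 if person i is in kept_nodes[j].
    """
    nodes = sorted(set(node_ids))
    if reference is None:
        reference = nodes[0]
    kept = [n for n in nodes if n != reference]
    rows = [[1 if nid == n else 0 for n in kept] for nid in node_ids]
    return kept, rows


# Four hypothetical people falling in terminal nodes 3, 7, 3 and 9.
kept, rows = node_dummies([3, 7, 3, 9])
```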
- An example of a program to run OLS regression using terminal nodes from regression tree and other important risk factors from the tree is found in Appendix B.
- Model testing can be done at this point or after each step in the modeling process (i.e., after 908 , 910 , and 914 for models for people with claims). It is probably more efficient to do it after the final step.
- the mean absolute residual, r², accuracy measure (previously defined), bias, and cross validation.
- Mean absolute residual, accuracy measure (previously defined) and r² are related to the accuracy of the forecast. Bias refers to systematic over or under prediction when cases are sorted by their expected value. Regression models can be biased but regression trees are not biased.
- Cross validation refers to the accuracy of the models when they are applied to different sets of data. The tree software tests for cross validation. Hold-out samples can be used for testing the entire hybrid models.
- An example of a program to run bias test, mean absolute residual, and r² analyses (examples of model testing) is found in Appendix C.
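- Independently of the Appendix C program, which is not reproduced here, the three tests can be sketched in a few lines of Python. The function names and sample values are hypothetical; the bias check sorts cases by predicted value and reports the ratio of actual to predicted cost per bucket, so values near 1.0 indicate little systematic over or under prediction. Predictions are assumed to be positive.

```python
def mean_absolute_residual(actual, predicted):
    """Mean of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """1 minus the ratio of residual to total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def bias_by_bucket(actual, predicted, buckets=10):
    """Actual-to-predicted cost ratio within buckets of cases sorted by prediction."""
    pairs = sorted(zip(predicted, actual))
    n = len(pairs)
    ratios = []
    for b in range(buckets):
        chunk = pairs[b * n // buckets:(b + 1) * n // buckets]
        if chunk:
            ratios.append(sum(a for _, a in chunk) / sum(p for p, _ in chunk))
    return ratios


# Hypothetical actual and predicted costs for four cases.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
mar = mean_absolute_residual(actual, predicted)
r2 = r_squared(actual, predicted)
ratios = bias_by_bucket(actual, predicted, buckets=2)
```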
- FIG. 10 is a detailed flowchart of process steps for developing cost forecasting models based on “outlier” data of the Winsorized data for the steps 106 , 204 , 210 , 304 or 310 , of the methods of FIGS. 1-3 .
- the illustrated processing blocks of FIG. 10 are described as follows:
- the outlier database has next period's payments of zero for everybody whose payments were below the Winsorization point and the amount above the Winsorization point for everybody else.
- the outliers can have very high cost per day so the variability is very large. Therefore, we have chosen to model the outlier portion separately. This two step approach leads to more accurate and stable results since the extreme outliers are almost impossible to predict accurately.
- the same continuous risk factors available for 908 are used to model the probability of these people having payments above the Winsorization point.
- the dependent variable is 1 if the total amount of next period's payment is above the Winsorization point or zero otherwise.
- a logistic regression is used to estimate the probability of each person's next period's payments exceeding the Winsorization point.
- Other types of regression models can be used instead of logistic regressions.
- the model is tested for accuracy using the criteria described in 918 . Note that the probability of each person being an outlier is being modeled rather than classifying each person as an outlier or not an outlier. All of the techniques from processing block 918 of FIG. 9 are applicable.
- a regression tree is used to refine the estimated probability of being an outlier.
- the dependent variable is the same as 1008 .
- the expected value from the logistic regression plus all of the categorical risk factors from the claims data and enrollment file are used as candidate independent variables (See 910 ).
- the output are terminal nodes of a least squares regression tree that have homogeneous probabilities of being an outlier. The probability of each person is determined by their terminal nodes. Note that this is not a classification tree.
- the same methods are applied to the people with no claims data (See 1012 ).
- the output are groupings of people with homogeneous probabilities of being an outlier.
- terminal nodes and risk factors defining those terminal nodes are used as input into another logistic regression or other forecasting technique (see 914 and 916 ).
- the examples in Appendix E are for 1017 since it includes data from claims.
- the median payment above the Winsorization point next period is calculated.
- the terminal nodes mean above the Winsorization point
- the terminal nodes are combined for additional stability. Note that the probabilities are not combined.
- the means are calculated arithmetically for the people in the combined terminal nodes and for those kept in separate nodes due to their distinctive median dollar costs.
- the means are then multiplied by the respective probabilities for each person giving the expected outlier payments for each person.
- the probability from the logistic regression (see 1017 and 1019 ) is used rather than from the regression tree. People are “tagged” with their respective terminal nodes (see 1012 and 1014 ) so that the correct mean is multiplied by the probability.
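- As an illustration of the final outlier steps, the following Python sketch multiplies each person's estimated probability of exceeding the Winsorization point by the mean excess payment of that person's terminal node. The node labels, payment amounts and function names are hypothetical, and the sketch assumes the node mean is taken over the people in the node who actually exceeded the Winsorization point.

```python
from statistics import mean

# Hypothetical development data: next period payments above the
# Winsorization point for the people tagged with each terminal node
# (zero for people who stayed below the point).
EXCESS_BY_NODE = {
    "node_a": [0.0, 0.0, 12000.0, 30000.0],
    "node_b": [0.0, 5000.0, 0.0, 0.0],
}


def node_mean_excess(excess_by_node):
    """Arithmetic mean excess payment per node, over people with an excess.

    Assumption: the mean is conditional on being an outlier, so that
    probability * mean gives an expected payment.
    """
    means = {}
    for node, amounts in excess_by_node.items():
        positive = [a for a in amounts if a > 0]
        means[node] = mean(positive) if positive else 0.0
    return means


def expected_outlier_payment(probability, node, node_means):
    """Expected outlier payment for one person tagged with `node`."""
    return probability * node_means[node]


node_means = node_mean_excess(EXCESS_BY_NODE)
# A person in node_a with a 10% estimated probability of being an outlier.
person_expected = expected_outlier_payment(0.10, "node_a", node_means)
```

The probability passed in would come from the logistic regression, while the node tag only selects which mean to use.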
- the process of scoring the data refers to applying the model to a set of data.
- the data need not be the same data on which the model was developed. However, it is best if the weights are derived from that client's book of business.
- the data need to have the same risk factors coded on it that were included in the models of the probability of being an outlier and those used for the expected inlier payment calculations.
- the models must be applied to the universe of people that were defined using the same criteria that were used to define the model universe.
- the model gives a set of weights applied to individual risk factors or combinations of risk factors yielding the expected payments or probability.
- Most statistical packages or data mining software have automated methods for scoring data once the risk factors are properly coded.
- Illustrated in FIG. 11 is a detailed flowchart for scoring, testing and integrating the data, and adjusting for cost trends for use in steps 106, 204, 210, 304 or 310 as well as 108, 208 and 306 of the methods of FIGS. 1-3.
- the description is written as steps in developing the model so the data are referred to as the base and next periods.
- the application of the model to the actual underwriting data is essentially the same and it produces the policy period expected cost.
- the respective processing blocks of FIG. 11 are described as follows:
- the expected next period inlier (less than or equal to the Winsorization point) payments are added to the expected next period outlier payments to produce the total expected payments in the next period for people with no claims (from 920 ) and for people with claims in the base period (from 918 ).
- the following program is an example of scoring inlier data with claims.
- This database includes everybody that was included in the modeling universe (i.e., the standard population). However, there are people that were enrolled next period but not included in the modeling universe.
- the adjustment factor will be (excluded category mean next period cost/day) divided by (included category mean next period cost/day).
- the number of persons in category 1 can be determined for those who actually enrolled in the lag period while the number in category 2 can be estimated from underwriting period data.
- the final adjustment factor will be the product of the per person adjustment factor (as above) and the proportion of all next period person days estimated to be comprised by those in category 1.
- the proportion of next period person days comprised by those in the model will have an adjustment factor of 1.00.
- any subsets with actual values differing significantly from expected values can be the basis of adjustment.
- the proportions of person days in category 3b can be estimated from the available data.
- the database of all people covered next period is compiled next.
- a flag is set to one if the person has an expected payment next period that was calculated from the risk adjustment models. Only the new joiners in the lag period or next period cannot have an expectation calculated from the risk adjustment model.
- The risk adjustment models include the historical cost trend since it was present in the data. In other words, no additional adjustment was required for the modeling since the model uses the base period to forecast next period's payments, so the cost trend inherent in the data is built into the model. Note that with a 3 month lag period, this is a 15 month cost trend. If the future annual cost trend is expected to be identical to the cost trend between the base period and the next period, then no further adjustment is needed since it is already incorporated in the data and model. If the future cost trend is different from the cost trend implicit in the data used for model development, the ratio of the future cost trend divided by the model period cost trend should be used as an adjustment.
- the simplest group-level cost forecast for a credible group is last year's cost multiplied by cost trend producing the “experience” forecast.
- the CI will provide a cost trend forecast for use in this invention.
- the development model has an implicit cost trend built into it since it was present in the model development data. Therefore, the development model must be detrended and then the CI's cost trend forecast can be applied to the person-level cost forecast when the model is applied to the underwriting period data. In order to detrend the development model, we calculate the cost for a standardized population for the book of business in the base and next periods.
- the standardized population assumes a specific mix of demographics in the CI's book of business for the base and next periods.
- A particular embodiment would calculate the proportion of cost in each of the following categories: male employee; female employee; male spouse; female spouse and other dependent cross-classified by 5-10 age categories (e.g., <5, 5-17, 18-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75+).
- This particular classification would produce up to 40 demographic cells. Other classifications could be used. Too many cells will cause a loss of robustness in the estimates.
- the mean cost per person per cell in the next period divided by the associated mean cost in the same demographic cell in the base period calculates the cost trend per cell during the model development period.
- One method to standardize the population in order to produce a single cost trend for the entire book of business is to weight each cell by the proportion of cost it accounts for in the base period.
- the weighted average of the cells' cost trend is a summary cost trend for the book of business for that standard population for the time period between the base and next periods. If those periods are contiguous and one year each, the annual development model cost trend has been calculated. Otherwise, an adjustment must be made for the time periods to calculate an annual trend. If the base, lag and next period are each one year, the square root of the cost trend will calculate the annual cost trend since the trend compounds. If the lag period is three months and the base and next period are one year, the fifth root of the cost trend is the three month cost trend.
- the three month cost trend is taken to the fourth power to calculate the annual cost trend.
- the reciprocal of the annual development model cost trend is multiplied by the CI's annual cost trend to calculate the cost trend that should be applied to the underwriting period data after application of the development model. This method works for first dollar medical insurance, aggregate only medical stop loss and reserving for those insurance products.
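- The detrending and retrending arithmetic described above can be sketched in Python as follows. The cell values, the 8% CI trend and all function names are hypothetical; the sketch assumes trends are expressed as multipliers (e.g., 1.10 for a 10% increase) and that the trend compounds evenly over the months separating the base and next periods.

```python
def development_trend(cells):
    """Cost-weighted average of per-cell trends (next mean / base mean).

    Each cell is weighted by its share of total base period cost.
    """
    total_base = sum(c["base_total"] for c in cells)
    return sum(
        (c["next_mean"] / c["base_mean"]) * (c["base_total"] / total_base)
        for c in cells
    )


def annualize(trend, months_between_periods):
    """Convert a compounded trend spanning the given months to 12 months."""
    return trend ** (12.0 / months_between_periods)


def retrend_factor(annual_development_trend, ci_annual_trend):
    """Reciprocal of the development trend times the CI's forecast trend."""
    return ci_annual_trend / annual_development_trend


# Hypothetical demographic cells (only two shown for brevity).
cells = [
    {"base_mean": 1000.0, "next_mean": 1100.0, "base_total": 500000.0},
    {"base_mean": 2000.0, "next_mean": 2400.0, "base_total": 500000.0},
]
# One year base period, 3 month lag, one year next period: 15 months.
trend_15_months = development_trend(cells)
annual_trend = annualize(trend_15_months, 15)
factor = retrend_factor(annual_trend, ci_annual_trend=1.08)
```

Raising the 15 month trend to the power 12/15 is the same as taking the fifth root and then the fourth power, and a 24 month span reduces to the square root case.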
- the development model next period data need to be detrended and then retrended with the CI's cost trend forecast prior to calibrating the development model for specific stop loss coverage or aggregate stop loss in combination with specific stop loss coverage. Once those adjustments are made, additional cost trend adjustments do not need to be made before applying the specific or aggregate in combination with specific stop loss models to the underwriting period data to forecast the policy period costs.
- the CI may have cost trend calculated separately by geographic locale or by provider type (e.g., drugs, physician, inpatient hospital). If the CI's cost trend is specific to each geographic locale, the same method of demographic cell adjustments can be employed as previously described but a separate table is calculated for each geographic locale. The CI's locale specific cost trend is applied to the cost trend estimated for the model development period using the standardized population adjustments for each locale. Each locale's detrending and retrending is applied to the underwriting data for that locale to calculate the policy period cost for that locale.
- If the CI's cost trend forecast is by provider type, we need to estimate the development model trend by provider type so that the policy period forecast will be appropriately detrended and retrended. This can be done by cross-classifying the demographic cells by provider type costs for the base and next periods and calculating the provider type trend for each demographic cell separately by provider type.
- The provider type cost trends by demographic cell are combined, for each provider type separately, by weighting each cell by its proportion of the total base year cost for that provider type. This calculates a provider type cost trend for the base to next period for the entire book of business.
- the CI's forecast cost trend by provider type is multiplied by the reciprocal of the model development cost trend for the same provider type.
- This adjusted cost trend by provider type is multiplied by the cost forecast for each terminal node by the associated cost by provider type in the policy period and then summed across provider type by person to calculate the policy period forecast cost per person.
- the associated cost in the policy period by provider type is calculated by multiplying the proportion of cost by provider type in the next period by terminal node by the total forecast cost for the policy period for that terminal node.
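- A minimal Python sketch of the provider type adjustment follows. The provider types, trend values and shares are hypothetical, and the function names are introduced only for illustration. Each person's terminal node forecast is split by the node's provider type cost proportions, each piece is multiplied by the adjusted trend for that provider type, and the pieces are summed.

```python
def adjusted_trend_by_type(ci_trend, development_trend):
    """CI forecast trend times the reciprocal of the development trend."""
    return {t: ci_trend[t] / development_trend[t] for t in ci_trend}


def person_policy_forecast(node_forecast, node_shares, adjusted_trend):
    """Sum over provider types of (share of node cost) * (adjusted trend).

    node_shares holds the proportion of next period cost by provider type
    for the person's terminal node; the proportions should sum to 1.
    """
    return sum(
        adjusted_trend[t] * node_shares[t] * node_forecast
        for t in node_shares
    )


# Hypothetical inputs.
ci_trend = {"drugs": 1.10, "physician": 1.05}
development_trend = {"drugs": 1.00, "physician": 1.05}
node_shares = {"drugs": 0.4, "physician": 0.6}

adjusted = adjusted_trend_by_type(ci_trend, development_trend)
forecast = person_policy_forecast(1000.0, node_shares, adjusted)
```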
- the person-level inflation adjusted forecasts are summed by group and actual is compared to forecast.
- the group-level models make adjustments when the actual is different from forecast.
- the underwriting period data are scored using the model developed on the base and next periods. Risk factors need to be calculated for the underwriting period data in order to apply the model.
- FIG. 12 is a detailed flowchart illustrating processing steps for developing group-level models and making adjustments to the summary of the person-level data of steps 106 and 108 of FIGS. 1 , 204 , 208 and 210 of FIG. 2 , or 304 , 306 and 310 of FIG. 3 .
- the steps are similar to the person-level modeling steps.
- First the development model is calculated using the base and next period data.
- the model is then applied to the underwriting period data (i.e., scoring the data) to forecast the policy period costs.
- the processing block descriptions for FIG. 12 are:
- the group-level development models have the following characteristics:
- a least square regression tree including selected interaction terms as predictors is developed on the group-level data. This second level of modeling makes adjustments for information not included at the person-level.
- the candidate predictor variables include the terminal nodes as dummy variables and the main effects used to define the terminal nodes.
- the predicted values from the model in 1208 are the average per person per day error (i.e., residual) in the estimate of next period's payments for everybody in the group. This residual is added to each person's next period expected payments from the person-level models (subtracted if it is a negative value).
- the model is developed on historical data that have no need for a cost trend adjustment except to be annualized since the cost trend is in the data.
- the inflation adjusted person-level next period payment estimates are used as input and the groups are scored using the group-level models. Risk factors are coded for the group using the underwriting period data and the groups are scored with the group-level model to produce the policy period expected group-level costs.
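- The group-level adjustment can be illustrated with a short Python sketch. The residual value, costs and function names are hypothetical. The sketch assumes the group-level model has already produced a predicted residual in cost per person per day, which is added to every member's person-level forecast before the group total is computed.

```python
def apply_group_residual(person_costs_per_day, group_residual_per_day):
    """Add the group's predicted residual to each person's daily forecast.

    A negative residual lowers every member's forecast.
    """
    return [c + group_residual_per_day for c in person_costs_per_day]


def group_expected_cost(adjusted_costs_per_day, days_covered):
    """Total expected group cost: daily cost times covered days, summed."""
    return sum(c * d for c, d in zip(adjusted_costs_per_day, days_covered))


# Hypothetical two-person group with a predicted residual of -1.50 per day.
adjusted = apply_group_residual([10.0, 20.0], -1.5)
total = group_expected_cost(adjusted, [365, 365])
```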
- the MAP4HIP method can be used to forecast person-level cost for individual (or family) renewal health insurance. The same methods apply but there is no “group” other than the family. The cost for the individual family members are summed to produce the family-level forecast. A family-level model can be used for final cost adjustments.
- the family-level risk factors are family composition, benefit plan, geographic locale and other factors germane to the family rather than an employment “group”.
- FIG. 13 is a detailed flowchart of an embodiment of a price optimization procedure which may be used to carry out steps 110 , 212 , or 308 of FIGS. 1-3 .
- the processing block procedures of FIG. 13 are:
- the group cost estimate is the final output from the cost estimation system (i.e., expected medical costs in the policy period). It is at the group-level and includes the inflation trend estimate.
- the CI provides three sets of inputs that are used in the price optimization.
- the first set of input is their expected probability of retaining the group if the group's price is increased a specified amount. Rate increases will not be negative, generally, unless there is medical price deflation. Many probability estimates are gathered with small changes in the price increase around the client's target profit and fewer more sparse estimates further from the targeted profit margin.
- the client needs to consider the group's historical costs, inflation, local competitive pricing, and other factors that influence the group's likelihood of accepting the various price increases.
- Another necessary input from the client is the administrative costs allocable to that group. This cost may be expressed as a percentage of the expected medical costs or in dollars per year.
- the final input required is a minimum expected profit or profit margin that is acceptable.
- Table 3 is an example of price forecasting using probability of retention and other related input data for steps 1304 , 1306 , 1308 and 1310 :
- the optimal price is $1740 per person or a 16% increase. Costs are expected to be $1375/person and there is a 58% chance of retaining the group. This yields $211.70 expected profit per person.
- the maximum expected profit is the largest amount (or the closest to zero if they are negative) calculated in the preceding step. The largest expected profit is compared to the client's minimum acceptable expected profit.
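- The expected profit comparison can be sketched in Python. Only the $1740 price, $1375 cost and 58% retention figures come from the example above; the other candidate prices, their retention probabilities and the minimum acceptable profit are hypothetical. The sketch assumes expected profit is the retention probability multiplied by price minus cost minus allocable administrative cost, which reproduces the $211.70 figure when the administrative cost is taken as zero.

```python
def expected_profit(price, expected_cost, p_retain, admin_cost=0.0):
    """Probability of retention times the margin per person."""
    return p_retain * (price - expected_cost - admin_cost)


def optimal_price(candidates, expected_cost, admin_cost=0.0):
    """Return the (price, p_retain) pair with the largest expected profit."""
    return max(
        candidates,
        key=lambda c: expected_profit(c[0], expected_cost, c[1], admin_cost),
    )


EXPECTED_COST = 1375.0
candidates = [
    (1600.0, 0.80),  # hypothetical
    (1740.0, 0.58),  # from the example in the text
    (1900.0, 0.35),  # hypothetical
]
best_price, best_p = optimal_price(candidates, EXPECTED_COST)
best_profit = expected_profit(best_price, EXPECTED_COST, best_p)

MIN_ACCEPTABLE_PROFIT = 150.0  # hypothetical client input
acceptable = best_profit >= MIN_ACCEPTABLE_PROFIT
```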
- Another consideration when pricing the product is the variability of the forecast cost for the policy year. Greater variability should carry an additional risk premium. Therefore, the standard error of the group's expected medical cost is also calculated and printed. SAS or S Plus regressions will calculate the variability of the mean or the standard error of the estimate of the policy year cost by combining the standard errors of the person-level forecasts. The price that provides a 90% (or some other high probability) chance of break-even is calculated using the standard error and printed. An underwriter can use the high probability break-even price and the relative standard error in negotiating price. If there is a large relative standard error (e.g., standard error of group/average standard error), the underwriter would be less inclined to discount the price in a competitive market since the likelihood of a loss is increased. Code for a program to run a pricing example is found in Appendix F.
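- One way the high probability break-even price might be computed is sketched below in Python. The normal approximation for the group's cost, the $100 standard error and the function names are assumptions made for illustration; the text does not specify the distribution used.

```python
from statistics import NormalDist


def break_even_price(expected_cost, standard_error, probability=0.90):
    """Price at or above the actual cost with the given probability.

    Assumes the group's policy period cost is approximately normal
    around the expected cost with the given standard error.
    """
    z = NormalDist().inv_cdf(probability)
    return expected_cost + z * standard_error


def relative_standard_error(group_se, average_se):
    """Group standard error relative to the average standard error."""
    return group_se / average_se


# Hypothetical standard error of 100 around the $1375 expected cost.
price_90 = break_even_price(1375.0, 100.0, 0.90)
rel_se = relative_standard_error(100.0, 80.0)
```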
- the underwriter offers the group the price that produces the minimally acceptable profit for the client even if the group is expected to reject the offer.
- the final step in pricing involves translating the average price per person per day into a monthly price per subscriber unit (e.g., single person, enrollee with spouse, enrollee with two or more additional dependents—other subscriber unit constellations are also possible). Costs are traditionally presented in cost per member per month or pmpm. However, subscriber units are used for pricing and it is important that costs are rationally allocated to the subscriber units. The price is multiplied by 365/12 to calculate the monthly price (or rescaled for another time period). One alternative for pricing the subscriber units is to calculate the mean cost forecast per subscriber unit for the group and then inflate each mean subscriber cost by the average profit margin for the group (i.e., recommended optimal price/expected cost).
- the mean cost forecast per subscriber unit is calculated by summing the forecast cost per person for each person that is a member of that type of subscriber unit in the underwriting period and then dividing that sum by the number of subscribers of that type (not people) in the underwriting period. This gives the group's mean daily cost per subscriber for each different type of subscriber unit.
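- The subscriber unit calculation can be sketched in Python as follows. The unit names, daily costs and function names are hypothetical. Person-level daily forecasts are summed by subscriber unit type, divided by the number of subscribers (not people) of that type, inflated by the group's ratio of recommended price to expected cost, and converted to a monthly price with 365/12.

```python
from collections import defaultdict


def mean_daily_cost_by_unit(people, subscribers_by_unit):
    """Mean daily cost per subscriber for each subscriber unit type.

    people: list of (unit_type, forecast cost per day) for every covered
    person, including dependents.
    subscribers_by_unit: number of subscribers of each unit type.
    """
    totals = defaultdict(float)
    for unit_type, cost_per_day in people:
        totals[unit_type] += cost_per_day
    return {u: totals[u] / subscribers_by_unit[u] for u in totals}


def monthly_price(mean_daily_cost, margin_ratio):
    """Inflate by price/cost ratio, then convert daily to monthly."""
    return mean_daily_cost * margin_ratio * 365 / 12


# Hypothetical group: two single subscribers and one enrollee with spouse.
people = [
    ("single", 4.0),
    ("single", 6.0),
    ("with_spouse", 5.0),
    ("with_spouse", 7.0),
]
unit_costs = mean_daily_cost_by_unit(people, {"single": 2, "with_spouse": 1})
spouse_price = monthly_price(unit_costs["with_spouse"], 1740.0 / 1375.0)
```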
- Another pricing alternative is to set the price for the subscriber units that are considered to be very price sensitive just below the market price. The remaining subscriber units must then be priced so that the overall expected profit is maintained. That can be calculated by estimating the expected profit for the market priced subscriber units and subtracting it from the total expected profit for the group. The other subscriber units must account for the remaining profit requirement.
- Estimating costs that need to be considered for reserves for first dollar health insurance and for stop loss coverage are alternative uses for the cost forecasting process. Rather than predicting payments that will occur over the entire policy period, reserving requires predicting costs that will occur in the upcoming financial reporting period (e.g., fiscal year or quarter).
- the same cost forecasting process using data collection and validation, risk factors, data mining and statistical techniques at the person and group-levels, testing and reporting can be applied to produce cost estimates to be used in setting reserves.
- the dependent variable needs to be changed so that the reserving model is calibrated to the appropriate time period.
- The model for reserving forecasts costs that have been incurred but not reported (IBNR), and this may include some costs of claims that have not occurred yet but are in the financial reporting period.
- The reserving period will run through the end of the current fiscal quarter or year. Inflation needs to be accounted for, and although the time period is far shorter than for the renewal cost forecast product, the same techniques apply over the shortened time period.
- a development period model is calibrated using the risk factors from the claims and enrollment data in a base period to forecast total incurred claims for the financial reporting period.
- the underwriting period for reserving can be the previous 12 months of claims (if available) preceding the reserving date or some other time period such as this policy period to the reserving date.
- the base period for the developmental model must have approximately the same number of days as the underwriting period so the forecast will not be biased.
- the policy period for IBNR claims begins at the first date of the financial reporting period and ends at the last day of the reporting period. The next period for the model development cost for IBNR or claims that have not occurred yet must be of the same length as the actual reserving period during the policy period for correct model calibration.
- the total forecast claims are summed to provide a total claim amount forecast.
- This is used as an independent variable and is supplemented by additional independent variables that include the reported claims, historical completion rates by time into the reserving period, claims backlogs and seasonality.
- the total of the IBNR claims from the reserving period is the dependent variable. Note that this model is at the book of business level. A quarter will yield only one data point for the book of business. If there are too few quarters for developing a stable model, an alternative approach is recommended.
- the alternative approach defines reserves as the difference between the total claim forecast for the reserving period and the incurred and reported claims during that period. In other words, the sum of the incurred and reported claims is subtracted from the total forecast claims and this equals the reserve forecast.
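- The alternative reserving approach reduces to a subtraction, sketched here in Python with hypothetical amounts and a hypothetical function name.

```python
def reserve_forecast(person_forecasts, incurred_and_reported):
    """Total forecast claims minus incurred and reported claims."""
    total_forecast = sum(person_forecasts)
    return total_forecast - sum(incurred_and_reported)


# Hypothetical person-level forecasts and reported claims for one period.
reserve = reserve_forecast([1000.0, 2500.0, 500.0], [800.0, 1200.0])
```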
- the reserving product can be delivered as a service bureau product or as software, either stand alone or an ISP model, using the same data flows as used with the cost forecasting models for fully insured or stop loss coverage.
- the pricing module is not relevant for reserving.
- the fully insured medical product uses claims information as a critical component of the cost forecasting model. Claims are available if the group is renewing first dollar health insurance but not for a new group. Enrollment data may be available for new groups (possibly only for employees) or individual health insurance. The same process can be applied to new groups or individual (or family but called by convention individual) policies by using the method for the people with no claims and only enrollment data. The base period enrollment data must contain the same potential risk factors as are available for the new groups. Note that there is only one model since there are no claims data so people cannot be separated into claims and no claims people in the base or underwriting periods. The cost forecasting model should be developed on the client's current book of business. The dependent variable is next period's payments.
- the independent variables are the same as the risk factors used in the no claims model (i.e., detailed enrollment data only).
- the modeling universe includes everybody rather than only those with no claims.
- claims data are available for high cost cases in the new group and also may include the demographics and diagnoses associated with those high cost cases. This information can be included as person-level risk factors but the same information will need to be included as potential person-level risk factors in the base period for the development model.
- a group-level model can be applied to the summarized group-level data as with renewal business. Frequently the total cost for the new group last year is available and may be used as a risk factor for the group-level model. The total group cost would then need to be included in the base period as a potential risk factor also.
- the fully insured new business cost forecasting and pricing product can be delivered as a service bureau product or as software, either stand alone or an ISP model, using the same data flows as used with the cost forecasting models for fully insured or stop loss coverage.
- Aggregate only medical stop loss insurance such as CapCost
- CapCost can have different data sources than fully insured insurance (where the data is held and owned by the insurance company), as a TPA pays the claims and holds the data for the self-insured employer. It is our intent to get the data for all of the TPA's groups so that our client, the stop loss insurer, can bid on all of the groups serviced by the TPA. Therefore, any renewal business for the TPA can use the full cost forecasting models. New business for the TPA will not have claims data available.
- the enrollment data only new business model cost forecasting technique is applicable for new business for the TPA.
- the enrollment data are needed for the new group. Future refinements will include combining the historical payments, summarized by month or quarter, with the enrollment information since person-level claims will not be available.
- CapCost 110™ medical claims payments for groups of 50 employees are about 80% of the claims paid out for traditional $50,000 specific plus 125% aggregate stop loss. Once there are 250 or more employees the CapCost 110™ claims pay out is less than 50% of the traditional stop loss coverage. Similar results were seen for $25,000 specific and $75,000 specific, both plus 125% aggregate coverage. The pay out for CapCost 110™ is much lower than the pay out for $25,000 specific plus 125% aggregate and closer to that for $75,000 specific plus 125% aggregate.
- The mean and standard deviation are presented in TABLE 6 for three different size groups. 125% aggregate is included with each of the specific coverages. The mean claims paid out are less with CapCost 110™ and the standard deviation is smaller than with traditional stop loss coverage.
- The claims paid out for CapCost 110™ and traditional stop loss are highly correlated:
- the MAP4HIP cost forecasting method is recommended as the preferred embodiment since the predicted mean cost is more accurate than the predicted mean cost derived using the standard approach with group-level experience as predictor.
- the same steps are taken in developing the models for CapCost as are used with the general MAP4HIP process.
- the only difference is the variety of TPAs as multiple data sources versus one CI with fully insured medical.
- Person-level and group-level models are developed for cost per person per day. The risk factors, statistical methods and dependent variables are the same.
- the attachment point needs to be set to the appropriate amount (e.g., a 110% attachment point is calculated by multiplying the cost trend adjusted forecast cost by 1.1).
- the aggregate only cost forecasting product can be delivered as a service bureau product or as software, either stand alone or an ISP model, using the same data flows as used with the cost forecasting models for fully insured coverage.
- the MAP4HIP method can be used for cost forecasting for specific stop loss coverage.
- Specific stop loss pays for claims above a specified threshold (i.e., the deductible).
- Those claims costs can be forecast using the same techniques that MAP4HIP uses for forecasting outlier amounts.
- The forecast inflation or cost trend adjustment for the policy period must be applied to the model development data. This is a different order of steps from the standard MAP4HIP sequence but it is necessary due to the specific deductible. For example, if there is a $50,000 deductible and a 10% cost trend, then a $50,000 claim in the next period would yield a $0 specific claim if left untrended, whereas the same claim trended to $55,000 would yield a $5,000 specific claim.
- the person-level forecasts are summed to make the group-level forecast.
- Group-level models with the same risk factors as MAP4HIP are developed using the residual of the actual specific payments per person per day minus the forecast specific costs. After development period models are complete, they can be applied to data from an underwriting period to develop cost forecasts for a policy period.
- Aggregate stop loss is frequently added to specific coverage.
- the aggregate coverage with specific coverage is paid exclusive of specific claims and specific claims are not used in defining the attachment point. Therefore, aggregate stop loss (with specific coverage also) claim amount can be modeled using the inlier methods in the MAP4HIP method.
- the Winsorization point is the specific deductible.
- the cost trend forecast for the policy period must be applied to the next period data prior to the inlier calculations. Only inliers are modeled since the specific costs will be borne by the specific coverage. Both the specific and aggregate with specific should be modeled and priced separately. Note that this is different from aggregate only stop loss coverage since all costs contribute to the attachment point and aggregate claim amount for aggregate only stop loss coverage.
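- The effect of trending before applying the specific deductible can be sketched in Python. The function name is hypothetical; the amounts follow the $50,000 deductible and 10% cost trend example. The trended payment is split into the portion above the deductible (borne by specific coverage) and the inlier portion capped at the deductible (used for aggregate with specific modeling).

```python
def split_at_deductible(next_period_payment, cost_trend, deductible):
    """Apply the cost trend, then split at the specific deductible.

    Returns (specific portion, inlier portion). The deductible acts as
    the Winsorization point.
    """
    trended = next_period_payment * cost_trend
    specific = max(0.0, trended - deductible)
    inlier = min(trended, deductible)
    return specific, inlier


# Without trending, a $50,000 payment yields no specific claim.
untrended = split_at_deductible(50000.0, 1.0, 50000.0)
# With a 10% cost trend the same payment exceeds the deductible.
trended = split_at_deductible(50000.0, 1.10, 50000.0)
```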
- the specific cost forecasting and specific plus aggregate cost forecasting products can be delivered as a service bureau product or as software, either stand alone or an ISP model, using the same data flows as used with the cost forecasting models for fully insured coverage.
- Group short term disability insurance is insurance that pays a portion of an employee's wages (typically 50-100%), a flat amount or the lesser of the portion or the flat amount when an employee is disabled due to a non-work related accident, sickness or pregnancy.
- the duration of the salary replacement is typically 13, 26 or 52 weeks.
- the MAP4HIP method can be applied to forecast STD payments with a few modifications.
- the potential risk factors are the same as the risk factors used with medical insurance and described in section 806 with the additional risk factors of number of STD days and payments in the base and underwriting periods and job classification when these data are available. Otherwise, the exact same potential risk factors as used with MAP4HIP can be linked to the STD days next year and modeled using the MAP4HIP modeling techniques and processes.
- the dependent variable in the model development database is the number of STD days in the next period.
- the medical claims and STD days in the base period are linked in the database to STD days in the next period for the same person and a STD day forecasting model for the next period is developed.
- the interaction capturing techniques and other modeling methods are the same as for medical claims but it is unlikely that the data need to be Winsorized and outliers modeled separately since STD is capped at a short period.
- the development model is applied to score the actual underwriting period data to calculate the expected number of STD days during the policy period to calculate the forecast claim amount.
- the expected number of STD days needs to be weighted by the expected cost per STD day.
- each person's salary or flat rate benefit is linked to the database and the forecast STD days are multiplied by the STD per day benefit amount (i.e., portion of salary covered by STD) and increased by the salary inflation history.
- the STD cost per person is summed to produce the group's expected cost. Confidence bounds can be calculated for the number of expected STD days to provide a range of high to low cost for the group.
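- The STD cost weighting can be sketched in Python. The forecast days, benefit amounts, the 3% salary inflation and the function names are hypothetical. Each person's forecast STD days are multiplied by the per day benefit and a salary inflation multiplier, and the results are summed for the group.

```python
def std_person_cost(forecast_days, benefit_per_day, salary_inflation=1.0):
    """Forecast STD days times the per day benefit, inflated for salary."""
    return forecast_days * benefit_per_day * salary_inflation


def std_group_cost(people, salary_inflation=1.0):
    """Sum of person-level STD costs.

    people: list of (forecast STD days, benefit per STD day).
    """
    return sum(
        std_person_cost(days, benefit, salary_inflation)
        for days, benefit in people
    )


# Hypothetical two-person group.
group = [(2.0, 100.0), (0.5, 200.0)]
cost_flat = std_group_cost(group)
cost_inflated = std_group_cost(group, salary_inflation=1.03)
```

Confidence bounds on the forecast days could be passed through the same functions to produce the high and low cost range.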
- a group-level model is built using the same group characteristics as with MAP4HIP and possibly supplemented with characteristics of the benefit plan.
- the group-level dependent variable is residual STD days per person weighted by the mean cost per person per day to calculate the forecast claim amount.
- the STD cost forecasting product can be delivered as a service bureau product or as software, either stand alone or an ISP model, using the same data flows (with STD days and salary information added) as used with the cost forecasting models for fully insured coverage.
- Long term disability insurance (LTD)
- the insurer does not bear the cost of continuing disability liability from previous periods unless it was the insurer for that period also.
- the insurer will bear the cost for new long term disabilities that occur during the policy period and will continue to be responsible for that cost until the coverage expires (e.g., the beneficiary dies or turns 65 years old) or the beneficiary can go back to work.
- the dependent measure is the probability of a LTD claim occurring during the policy period.
- the base period risk factors are the same as the STD model, including medical claims, and STD claims with the addition of LTD claims linked, recoded and used as supplemental risk factors when available.
- the forecasting model can be built using only medical claims and enrollment information.
- Logistic regression, regression tree or hybrid tree with terminal nodes feeding into a logistic regression are the statistical techniques for modeling the incidence rate of LTD claims during the next period (typically one year). Other interaction capturing techniques can be used to predict the incidence rate but must be appropriate for modeling a variable that is bounded by 0 and 1.
- the development model is applied to underwriting period data to calculate the expected probability of a LTD claim during the policy period.
- the probabilities need to be weighted by the expected net present value of the disability to estimate the total cost of the disability (i.e., the claim amount).
- the net present value of the disability cost is obtained from actuarial tables.
- the expected costs are summed across the group members to produce the expected group cost.
- the net present value needs to be derived from other databases and should be conditionalized on the cause of the disability since the cost will vary depending on the cause.
- the cause of the disability can be estimated by the clinical conditions defining the terminal node of the person. A more accurate total cost of the disability will be calculated if the weights are conditionalized on the cause of the disability.
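A minimal sketch of the weighting step described above: each person's LTD probability is multiplied by a net present value that depends on the likely cause of disability, and the results are summed over the group. The cause labels, the dollar values and the fallback value are hypothetical placeholders, not figures from any actuarial table.

```python
# Hypothetical net present values of an LTD claim, conditional on cause.
NPV_BY_CAUSE = {"musculoskeletal": 150_000.0, "cancer": 90_000.0}
DEFAULT_NPV = 120_000.0  # assumed fallback when no cause can be inferred


def ltd_expected_cost(p_claim, cause=None):
    """Probability of an LTD claim weighted by the cause-specific NPV."""
    return p_claim * NPV_BY_CAUSE.get(cause, DEFAULT_NPV)


def ltd_group_cost(members):
    """members: iterable of (probability, inferred cause or None)."""
    return sum(ltd_expected_cost(p, cause) for p, cause in members)


members = [(0.01, "musculoskeletal"), (0.02, "cancer"), (0.005, None)]
group_cost = ltd_group_cost(members)
```

In this sketch the cause would come from the clinical conditions defining the person's terminal node; how that mapping is built is left open here.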
- an index can be calculated. This index is the expected number of new disabilities for the group during the policy period divided by the “average” number of disabilities calculated using standard actuarial techniques for new business for LTD.
- a confidence interval can be calculated for the expected number of disabilities using the expected probability of disability per person and computing the upper and lower bounds for the group by using a Lexian distribution that calculates the exact probabilities. A binomial distribution can be used but the confidence interval will not be exact since it assumes that everybody has the same average probability within the group.
- Groups that have a confidence interval that does not cover the “average” calculated from standard actuarial techniques are significantly higher or lower in risk and should be priced differently than the average group.
- the group's standard deviation from the mean expected number of LTD cases can be calculated using one of the distributions above.
- the number of standard deviations from the mean is a scale that can be used for pricing. The end points of the scale can be anchored by market prices for the lowest and highest risk market prices or by actual historical LTD experience, conditionalized on group size.
- the LTD cost forecasting product can be delivered as a service bureau product or as software, either stand alone or an ISP model, using the same data flows (with the addition of STD and LTD claims and salary information) as used with the cost forecasting models for fully insured coverage.
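The index, confidence interval and standard-deviation scale described above can be illustrated with a short sketch. It assumes that the "Lexian" calculation means the exact distribution of a sum of independent yes/no events with person-specific probabilities (often called a Poisson binomial distribution); that reading, the function names and the example numbers are assumptions, not taken from the appendix.

```python
import math


def claim_count_pmf(probs):
    """Exact pmf of the number of claims when person i claims with probs[i]."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # person does not claim
            nxt[k + 1] += mass * p       # person claims
        pmf = nxt
    return pmf


def central_interval(pmf, level=0.95):
    """Lower and upper counts cutting off (1 - level) / 2 in each tail."""
    tail = (1.0 - level) / 2.0
    cdf, lo, hi = 0.0, None, None
    for k, mass in enumerate(pmf):
        cdf += mass
        if lo is None and cdf >= tail:
            lo = k
        if hi is None and cdf >= 1.0 - tail:
            hi = k
    return lo, (len(pmf) - 1 if hi is None else hi)


def risk_summary(probs, actuarial_average):
    """Index, interval and z-score of a group versus the actuarial average."""
    expected = sum(probs)
    sd = math.sqrt(sum(p * (1.0 - p) for p in probs))
    lo, hi = central_interval(claim_count_pmf(probs))
    return {
        "index": expected / actuarial_average,
        "interval": (lo, hi),
        "covers_average": lo <= actuarial_average <= hi,
        "z": (expected - actuarial_average) / sd,
    }


# Hypothetical group: 50 low-risk and 5 high-risk members.
summary = risk_summary([0.01] * 50 + [0.10] * 5, actuarial_average=0.6)
```

A binomial approximation would replace every entry of `probs` with the group mean, which is why its interval is not exact when probabilities differ across members.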
- Group term life insurance is very similar to group disability: it is for a policy period (usually one year) and the coverage and rates are typically not guaranteed beyond that period.
- the death benefit is a one-time payment for a known amount (the amount is usually a multiple of salary up to a limit) so there is no uncertainty over the size of the benefit. Therefore, knowing the expected number of deaths (weighted by the amount of the life insurance) will provide an accurate estimate of the cost of that group.
- a relative risk index can be calculated in the same manner as with LTD.
- the numerator is the expected number of deaths (possibly weighted by the death benefit) and the denominator is the “average” number of deaths (possibly weighted by the death benefit) where the average is calculated using the age by sex distribution and standard life tables calculated by actuaries.
- the significance of the index can be calculated using the Lexian (preferably) or binomial distributions for the person-level probabilities and testing if the average is covered by the confidence bounds for the group. Groups with expected numbers of deaths outside the average should have higher or lower rates than average. Groups with large confidence intervals should be charged more than groups with small confidence intervals, all other factors being equal.
- the same approach for developing the person-level probability models is used for life insurance as is used for LTD. Medical claims from a base period are linked with deaths occurring in the next period for a very large block of business. The risk factors are the same as or developed using a similar technique as used with the medical cost forecasting models. The dependent variable is the probability of death.
- the same interaction capturing techniques used for the LTD probability model are used for the life insurance model (i.e., the preferred embodiment is the hybrid probability tree).
- the developmental model is applied to medical claims during an underwriting period and death forecasts are calculated for the policy period. The probability of death is weighted by the death benefit to calculate the forecast claim amount per person. The claim amounts are summed across people in the group.
- a group-level model can be developed that uses the sum of the probabilities (i.e., the number of expected deaths), actual number of deaths in the base period and the number and amount of STD and LTD claims to supplement the risk factors used in a standard MAP4HIP group-level model, when available. Otherwise, the same medical claims and enrollment information used with MAP4HIP will suffice.
- the dependent measure is the forecast number of deaths and is weighted by the expected death benefit per person to calculate the forecast claim amount.
- the group term life insurance death rate and claim amount forecasting products can be delivered as a service bureau product or as software, either stand alone or an ISP model, using the same data flows (preferably supplemented with the addition of death and salary information) as used with the cost forecasting models for fully insured medical coverage.
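The life insurance calculations above reduce to two sums: the benefit-weighted expected deaths from the person-level model, and the benefit-weighted "average" from standard life tables. A minimal sketch follows; the probabilities, benefits and life-table rates are invented for illustration, and the function names are hypothetical.

```python
def expected_death_claims(members):
    """members: list of (modeled probability of death, death benefit)."""
    return sum(p * benefit for p, benefit in members)


def relative_risk_index(members, life_table_rates):
    """Benefit-weighted modeled deaths divided by the life-table average.

    life_table_rates[i] is the standard mortality rate for member i's
    age and sex (assumed to be looked up elsewhere).
    """
    average = sum(q * benefit for q, (_, benefit) in zip(life_table_rates, members))
    return expected_death_claims(members) / average


members = [(0.002, 100_000.0), (0.010, 50_000.0)]
forecast_claims = expected_death_claims(members)
index = relative_risk_index(members, life_table_rates=[0.001, 0.004])
```

An index above 1 in this sketch indicates a group expected to cost more than its age and sex mix alone would suggest.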
Abstract
A computer-implemented process of developing a person-level cost model for forecasting future costs attributable to claims from members of a book of business, where person-level data are available for a substantial portion of the members of the book of business for an actual underwriting period, and the forecast of interest is for a policy period is disclosed. The process uses development universe data comprising person-level enrollment data, historical base period health care claims data and historical next period claim amount data for a statistically meaningful number of individuals. The process also provides at least one claim-based risk factor for each historical base period claim based on the claim code associated with the health care claim and provides at least one enrollment-based risk factor based on the enrollment data. The process also develops a cost forecasting model by capturing the predictive ability of the main effects and interactions of claim based risk factors and enrollment-based risk factors, with the development universe data through the application of an interaction capturing technique to the development universe data.
Description
- This application is based on provisional applications 60/249,060, filed Nov. 15, 2000, and 60/267,131 filed Feb. 7, 2001, which are incorporated by reference herein.
- A computer program listing appendix has been submitted on compact disc for this disclosure. The material on that compact disc is incorporated by reference herein. The compact disc was filed with 2 copies, and contains the following files with:
NAME OF FILE | DATE OF CREATION | SIZE IN BYTES
APPENDIX.TXT | May 14, 2001 | 281,991
The names above are the names of the files on the compact disc, the dates are the dates the files were created on the compact disk, and the size in bytes is the size of the file. Please note that there is a glossary of terms included at the end of the Background section.
- This invention pertains to health, disability and life insurance systems, particularly including processing data (in the business of health insurance) for estimating future costs or liability and setting optimal pricing. For convenience, we call one embodiment of our invention More Accurate Predictions for Health Insurance Premiums or MAP4HIP.
- Group health insurance is typically priced through a series of steps. Historical claims costs are calculated by summing the costs of insured individuals. Actuaries estimate what the general cost inflation trend will be next period. If an insured group is large enough to have credible experience (historical costs), the inflation trend may be applied to the historical claims experience to produce an estimate of the expected claims for next period. A profit margin and administrative costs are added to the expected group claims costs to produce the so-called “experience rate”. An underwriter reviews the group's experience and adjusts the cost and profit margin-based price depending on special circumstances and competitive pressure. The standard practice is to use group-level data for estimating costs and setting prices except for very small groups, individual policies or specific medical stop loss insurance. Information on the insured's (i.e., individual's) medical conditions is typically not used when group-level data are used for underwriting and pricing the group's aggregate cost forecast.
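The conventional pricing steps described above can be summarized in a few lines. This sketch assumes the profit margin is applied as a percentage loading on claims plus administrative cost; the text does not specify how the margin is applied, so that choice and all figures are hypothetical.

```python
def experience_rate(member_costs, trend, admin_cost, profit_margin):
    """Conventional group-level experience rate.

    member_costs: historical claim costs of the insured individuals
    trend: forecast cost inflation for the next period (e.g. 0.10)
    admin_cost: administrative cost in dollars
    profit_margin: assumed percentage loading (e.g. 0.05)
    """
    historical = sum(member_costs)
    expected_claims = historical * (1.0 + trend)
    return (expected_claims + admin_cost) * (1.0 + profit_margin)


rate = experience_rate([1000.0, 2000.0, 3000.0], trend=0.10,
                       admin_cost=400.0, profit_margin=0.05)
```

The underwriter's adjustment for special circumstances and competitive pressure is a judgment step and is not modeled here.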
- The current standard practice for estimating future health care costs for groups of 50 or more employees plus their dependents uses one of two methods or is a combination of those methods. If the group is large enough to have credible, stable experience, the historical costs are assumed to be the best estimate of next period's costs after a cost trend factor for inflation has been included. If the group is too small to have credible historical costs, many groups are combined together and averaged so that a stable demographic look-up table of historical average costs by age group by gender by family size can be developed and used as a weighting mechanism for estimating the expected future costs for non-credible groups. Cost trend factors for inflation are then applied. If a group does not have completely credible or non-credible experience, a blended average of its experience and a demographic look-up table forecast is used. These standard actuarial methods do not account for person-level trends in historical costs nor medical information about the person.
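The two standard methods and their blend can be sketched as below. The look-up table here has only age band and sex cells (family size is omitted for brevity), and the cell values, credibility weight and names are illustrative assumptions.

```python
# Hypothetical average annual cost per person by (age band, sex).
LOOKUP = {
    ("18-39", "F"): 2400.0,
    ("18-39", "M"): 1800.0,
    ("40-64", "F"): 4200.0,
    ("40-64", "M"): 4000.0,
}


def demographic_forecast(members):
    """Sum the look-up table cell amounts over the group's members."""
    return sum(LOOKUP[(age_band, sex)] for age_band, sex in members)


def blended_forecast(experience, demographic, credibility, trend):
    """Credibility-weighted blend of experience and look-up table cost.

    credibility = 1.0 reproduces the pure experience method and
    credibility = 0.0 the pure demographic method.
    """
    base = credibility * experience + (1.0 - credibility) * demographic
    return base * (1.0 + trend)


members = [("18-39", "F"), ("40-64", "M")]
demo = demographic_forecast(members)
blend = blended_forecast(experience=8000.0, demographic=demo,
                         credibility=0.25, trend=0.0)
```

Neither input to the blend uses person-level cost trends or medical information, which is the limitation the paragraph above points out.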
- Small groups (i.e., 50 or fewer employees plus their dependents) or individual medical policies may use medical questionnaires from initial enrollment applications as input to an underwriter for estimating next period's group-level costs. Manual underwriting is expensive due to the labor intensity and is prone to variability among underwriters as their experience varies.
- Some state Medicaid HMO programs (e.g., Colorado and Maryland) and federal Medicare HMO programs are using statistical algorithms that make person-level cost forecasts based on diagnoses from the computerized medical bills and demographic factors. These “risk adjustment” methods do not use procedures or historical person-level costs as the governments do not want incentives for increased utilization of services and spending more money. The governments' intent for HMO payments or managed care is to make payments proportional to the insured population's need for care based on their health conditions but not on prior care. However, historical cost is the single best predictor of future medical cost for credible groups. Not using it as part of the forecasting method decreases the accuracy of the forecast.
- Some medical insurance companies may be using such “risk adjustment” algorithms used by Medicare, Medicaid and others intended for managed care cost forecasting or payment allocation. However, the prospective use of historical costs, types of services and procedures as well as diagnoses and demographics, as well as combinations of these variables, to produce more accurate cost forecasts than “risk adjustment” algorithms using only diagnoses and demographic factors, would be desirable.
- There are person-level diagnosis and procedure models that measure the efficiency of medical practices (i.e., costs of care given the patient's conditions). These models are typically concurrent or retrospective in nature and not prospective. Symmetry's ETGs are a good example of this class of models. They lack cost experience as a predictor since that is intended as the dependent variable. They also may limit use of demographic variables. Forecasting models would be desirable which are prospective and not designed for concurrent or retrospective analysis. The methods of the present invention can be applied to concurrent data to develop models for efficiency analysis, as will be described.
- Stop loss health (or medical) insurance is typically purchased by self-insured employers that wish to limit their medical expense exposure. The most common form of medical stop loss insurance is known as “specific stop loss” insurance which is a high deductible (usually $25,000 to $100,000) insurance policy per insured person. Specific stop loss medical insurance is designed to protect the employer or other payer from large catastrophic medical expenses such as those incurred for liver transplants or care for neonates with major repairable congenital anomalies. The standard method for underwriting specific stop loss medical insurance uses a demographic look-up table to estimate costs for individuals whose medical expenses were under 50% of the deductible in the previous year. If an insured's medical expenses were over a predetermined amount, such as over 50% of the specific deductible, the insured's medical records are reviewed manually by an underwriter, and next year's costs are estimated by the underwriter or a doctor or nurse using their experience and expert opinion. Manual medical underwriting for specific stop loss has the same problems as manual underwriting for small group medical insurance; it is expensive and prone to underwriter variability.
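The standard specific stop loss workflow described above amounts to a threshold rule plus a per-person deductible. The sketch below is illustrative only; the handling of a person whose cost is exactly 50% of the deductible is an assumption, since the text does not address that case.

```python
def underwriting_route(prior_year_cost, deductible, review_fraction=0.5):
    """Decide how a person's next-year cost is estimated.

    Costs above review_fraction * deductible go to manual review;
    everything else (including the exact threshold, by assumption)
    uses the demographic look-up table.
    """
    if prior_year_cost > review_fraction * deductible:
        return "manual_review"
    return "lookup_table"


def specific_payout(person_cost, deductible):
    """Insurer's payment for one person under specific stop loss."""
    return max(0.0, person_cost - deductible)


route_low = underwriting_route(10_000.0, deductible=50_000.0)
route_high = underwriting_route(30_000.0, deductible=50_000.0)
payout = specific_payout(120_000.0, deductible=50_000.0)
```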
- Frequently, “aggregate stop loss medical insurance” coverage is also purchased by the employer. Aggregate coverage (exclusive of specific payments) means that the insurer will pay the employer's or other payer's medical cost obligations for a covered group if those costs exceed an agreed upon amount (i.e., an “attachment point”). The attachment point is typically defined as 125% of the group's expected cost in the insured period. The industry standard for calculating the expected cost is substantially the same method as used for fully insured plans. In other words, if the group is large enough to have completely credible experience, the last year's experience is modified by forecast inflation and increased by 25% to produce the 125% attachment point. If the group's experience is partially credible, then a weighted combination of experience and demographic look-up table model is used with an inflation forecast and increased 25% to calculate the 125% attachment point. When the group is too small to have credible experience, the demographic look-up table model is used as the starting point then trended inflation increased by 25% is used to calculate the 125% attachment point. Aggregate only medical stop loss insurance has been recently offered by one company (Cairnstone) to credible groups, and we believe that it uses group-level experience plus trended inflation to estimate future costs. Price is usually determined by competitive pressure but the inventors are not familiar with proprietary techniques used by the insurers.
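As a small worked example of the attachment point arithmetic above, the following sketch computes the 125% attachment point from an expected cost and the insurer's resulting payment. The expected cost is taken as given (from whichever experience, blended or look-up method applies); the function names and dollar amounts are hypothetical.

```python
def attachment_point(expected_cost, corridor=1.25):
    """Attachment point as a multiple of the group's expected cost."""
    return expected_cost * corridor


def aggregate_payout(actual_cost, expected_cost, corridor=1.25):
    """Insurer pays group costs in excess of the attachment point.

    actual_cost is assumed to already exclude any specific stop loss
    payments, as described in the text.
    """
    return max(0.0, actual_cost - attachment_point(expected_cost, corridor))


point = attachment_point(1_000_000.0)
payout = aggregate_payout(actual_cost=1_400_000.0, expected_cost=1_000_000.0)
```

Because the attachment point scales directly with the expected cost, any error in the cost forecast shifts the insurer's exposure by 1.25 times that error.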
- We are including a glossary of terms that are used in describing the invention so that we are precise in our description. Additionally, SAS computer code and CART modeling language will be included to provide concrete examples of the implementation of the process or products. The software Appendix found on the compact disc filed with the present disclosure contains computer code (minus copyrighted formats) of a simpler embodiment of the invention. That code is in SAS and S Plus and the regression tree used is RPART. Details are provided for the fully insured renewal product. The aggregate only stop loss product uses the same steps for cost estimation. The short term disability, long term disability and life insurance products use the same techniques for forecasting but the dependent variables are changed to reflect the insurance type.
- 1. Aggregate only stop loss health insurance—A health insurance product for self funded employers that want to cap their maximum liability. The aggregate only policy will pay off costs above an agreed upon limit (i.e., the attachment point). Usually, the attachment point is 125% of expected costs but it could be 110% or some other amount. The expected costs are estimated using an embodiment of this invention or using standard actuarial methods. Aggregate only stop loss does not include specific stop loss. However, specifics can be combined with aggregate stop loss. In that case the specific payments are not included in the costs counted against the aggregate attachment point.
- 2 Base Period—A period of typically 12 consecutive months prior to the lag period during which services were provided to some enrollees and reflected by claims entered in a computer file. In practice, it may be more or less than 12 months. Risk factors are coded on data from the base period. These data are used to forecast the next period costs. In other words, these data are used to calculate the predictors for the development model and are not used for underwriting actual health insurance policies.
- 3 Book of Business—The insurance of a given type (e.g., small group, individual, large group) for all persons covered by an insurer at a point in time or during a specified period. An insurer may have multiple books of business.
- 4 Bias Test—A comparison of observed to predicted values from a model. The totals of both these values are equal to the total population which served as the standard in the preparation of the model. Bias tests determine whether or not there is any meaningful systematic disparity between observed and predicted cost when persons are sorted by predicted values, age or family composition or other characteristics. Disparities are considered as bias which better models eliminate or reduce. Another related measure sorts by the actual rather than the predicted values and is a measure of the accuracy of the forecasts.
- 5 Candidate Predictor Variable—An array of variables derived from the CI (client insurer) database and available to the statistical software which selects those which are most predictive of the dependent variable (e.g., by stepwise OLS, CART regression trees).
- 6 Claim amount: This is the total cost or payments made by the insurer.
- 7 Claim codes: These include ICD-9-CM diagnosis and procedure codes, CPT codes, National Drug Codes and other standardized coding system values such as SNOMED codes.
- 8 Claim-based risk factors: These are risk factors derived from the claim code, claim amount and transformations of the claim amount, type and place of services, provider type, units of service and other information contained on a health care claim. These risk factors are present in either the base or underwriting period.
- 9 Clinical risk factors: Risk factors derived from the claim codes, type and place of service and provider type but not solely from the claim amount.
- 10 Client Insurer (CI)—The insurance entity for which the invention is to be applied.
- 11 Concurrent Cost Models—Used synonymously with Retrospective Cost Models and defined elsewhere.
- 12 Costs of health care—Measured in dollars (usually per person per day in this application). May be defined as either of the following:
- a. Claims—total bills for care submitted to the insurer for reimbursement
- b. Payments—The amounts actually paid by the insurer. Payments are always less than the claims due to deductibles, benefits and non-covered services.
- 13 Cost Inflation—Used synonymously with cost trend. The secular trend in costs per person for health care due to changes in practice patterns and price per service. Does not usually consider changes in a population's health care needs which are usually minimal in the short run. Differs from pure price inflation such as that measured in the consumer price index (CPI).
- 14 Credibility—The degree to which a group's historical experience may confidently be used as the basis for future rates.
- 15. Demographic look-up table—This is a method used by actuaries to estimate group-level costs when the group is too small to have credible experience. Average costs are calculated across a large pool of groups and averages are calculated by cell in a table of age by sex by family composition or other similar demographics. The appropriate cell amounts are applied to each person or employee in a non credible group and summed to calculate its expected cost.
- 16 Dependent Measure—The dependent measure is the forecast of the model through application of the interaction capturing technique. A transformation may be applied to the dependent measure to calculate the claim amount (e.g., multiplying a probability by an average cost). For health insurance and medical stop loss insurance the dependent measure is the future cost of health care for the population which comprises the CI book of business at the time the rates are to be quoted. For short-term disability the dependent measure is disability days. For long term disability and life insurance the dependent variable is the probability of the event.
- 17 Enrollment-based risk factors—These are risk factors that are derived from the enrollment information only, such as age, sex, relationship to the enrollee, length of enrollment, geographic locale and type of coverage, and do not include claim information or claim amount. The employee's salary, disability coverage terms and term life insurance coverage terms may also be included in the enrollment file.
- 18 Experience model—This is a method used by actuaries for estimating cost next year at the group-level. If the group is deemed credible, the last year's cost (or experience) is considered to be the best estimate of next year's cost. A cost trend is added to account for medical inflation for next year's cost.
- 19 Group—A group is a collection of one or more people that are covered by one insurance policy. A traditional group is a collection of employees and their dependents that work for an employer at a location. A group can be an individual or a family by purchasing an “individual” health insurance policy where the remaining immediate family may also be covered by the policy.
- 20 Health Insurance—Insurance for the array of benefits covered by the health insurance policies of the client insurance company or a self-insured company including hospital, surgical and medical care plus drug benefits for some plans. Medical insurance is used as a synonym.
- 21 Hybrid Tree Analysis—The use of regression trees (or other analytic method output) as input to other regression models such as OLS, median and logistic regression or neural networks. Additionally, a model's output (e.g., regression or neural network) may be used as input into the regression or probability tree.
- 22 Interaction Capturing Technique—A mathematical and logical transformation of independent variables that predicts a response or dependent variable. The interaction capturing technique includes main effects, interaction effects and possibly time series effects. Statistical techniques that are examples of interaction capturing techniques include, but are not limited to, ANOVA, regression methods (e.g., linear, logistic, shrinkage, robust, ridge), regression trees, moving averages and autoregressive moving averages, look-up tables, means, probability models, clustering algorithms and many other methods. Data mining techniques that are examples of interaction capturing techniques include, but are not limited to, decision trees, rule induction, genetic algorithms, neural networks, nearest neighbor and other data mining methods.
- 23 Lag Period—A period between the base period and the next period or the underwriting and policy period which is required because of delays in filing claims, preparing or revising model weights, calculating premium rates and submitting them to insured groups in a timely way.
- 24 MAP 4 HIP—This is an acronym of More Accurate Predictions for Health Insurance Premiums which in turn is a brief title for our invention for its application to health insurance.
- 25 Next Period—Typically a 12 consecutive month period subsequent to the base period and the lag period that contains the data that comprise the dependent variable used in the development model. Actual insurance policies are not written for this period but are underwritten for the policy period.
- 26 Policy Period—Typically a 12 consecutive month period subsequent to the underwriting period and the lag period that contains the data that comprise the actual cost borne by the insurer. These costs are forecast using the application of the development model to the data from the underwriting period with appropriate adjustments made for assumptions about inflation.
- 27 Prospective Cost Models—The candidate predictor variables relate to a time period which precedes the dependent variable.
- 28 Retrospective Cost Models—The candidate predictor variables relate to the same time period as the dependent variable.
- 29. Specific stop loss health insurance—Health insurance coverage for self-funded employers or other payors that has a very high deductible per person. Usually the deductible is at least $10,000 and may be as high as $500,000 per person. Typically the deductible is between $25,000 and $100,000 per person, and the coverage is meant to pay for catastrophic care.
- 30 Standard population—The cases in the data set which are used to select predictor variables and to weight them by their relation to the dependent variable. For this invention, the cases are an insured population.
- 31. Subscriber unit—The family unit by which the health insurance premium is charged. For example, the simplest classification has two units: 1) a single person and 2) two or more people. Single person, married couple and three or more people is a common classification but more detailed versions are also used. The subscriber is the employee.
- 32 Third Party Administration or TPA—A company that processes the health insurance claims for a self funded employer. The TPA may be part of an insurance company or not.
- 33 Underwriting Period—A period of typically 12 consecutive months prior to the lag period during which services were provided to some enrollees and reflected by claims entered in a computer file. In practice, it may be more or less than 12 months. Risk factors are coded on data from the underwriting period. These data are used to forecast the policy period costs. In other words, these data are used to calculate the predictors for the model that is used for underwriting actual health insurance policies.
- 34 Winsorize—Data are Winsorized if the most extreme observations on one or both ends of the ordered samples are replaced by the nearest retained observation. Our cost distributions have no low cost outliers and hence Winsorization is applied only to the high end of the ordered sample.
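High-end Winsorization as defined in the last glossary entry can be sketched as follows. How many extreme observations to replace is a modeling choice the glossary does not fix; the parameter `n_high` and the example costs are hypothetical.

```python
def winsorize_high(costs, n_high):
    """Replace the n_high largest costs with the nearest retained value.

    Only the high end is treated, matching the statement that the cost
    distributions have no low-cost outliers.
    """
    if not 0 <= n_high < len(costs):
        raise ValueError("n_high must be between 0 and len(costs) - 1")
    if n_high == 0:
        return list(costs)
    cap = sorted(costs)[-(n_high + 1)]  # largest observation that is kept
    return [min(c, cap) for c in costs]


costs = [100.0, 50_000.0, 250.0, 900.0, 400.0]
capped = winsorize_high(costs, n_high=1)
```

The original order of the observations is preserved, so the capped values can be written back to the same person-level records.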
- One aspect of the invention contemplates a computer-implemented process of developing a person-level cost model for forecasting future costs attributable to claims from members of a book of business, where person-level data regarding actual base period health care claims are available for a substantial portion of the members of the book of business for an actual underwriting period, and the forecast of interest (i.e., future claim amount) is for an actual policy period which can be, but is not necessarily contiguous with the actual underwriting period, having the steps of:
- providing development universe data comprising person-level enrollment data, historical base period health care claims data and historical next period claim amount data for a statistically meaningful number of individuals, where the person-level data on a health care claim comprises at least a claim code and a claim amount;
- providing at least one claim-based risk factor for each historical base period claim based on the claim code associated with the health care claim and providing at least one enrollment-based risk factor based on the enrollment data; and
- developing a cost forecasting model by capturing the predictive ability of the main effects and interactions of claim based risk factors and enrollment-based risk factors, with the development universe data through the application of an interaction capturing technique to the development universe data.
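To make the idea of an interaction capturing technique concrete, the sketch below fits a tiny least-squares regression tree to development data in which base period risk factors are linked to next period cost. It is a toy stand-in written for illustration, not the RPART or CART models referenced in the disclosure; the two risk factors and all costs are invented.

```python
def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared errors around the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, ys):
    """Find the (feature, threshold) split that minimizes total SSE."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, ys) if r[f] <= t]
            right = [y for r, y in zip(rows, ys) if r[f] > t]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def fit_tree(rows, ys, depth=2):
    """Return a leaf (mean cost) or a tuple (feature, threshold, left, right)."""
    split = best_split(rows, ys) if depth > 0 else None
    if split is None or split[0] >= sse(ys):
        return mean(ys)
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, ys) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, ys) if r[f] > t]
    return (f, t,
            fit_tree([r for r, _ in left], [y for _, y in left], depth - 1),
            fit_tree([r for r, _ in right], [y for _, y in right], depth - 1))


def predict(tree, row):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Base period risk factors: (has_diabetes_claim, age_over_50), both 0/1.
rows = [(0, 0), (1, 0), (0, 1), (1, 1)]
# Next period cost: far higher when both factors are present (an interaction).
next_costs = [1000.0, 2000.0, 1500.0, 9000.0]
tree = fit_tree(rows, next_costs, depth=2)
forecast = predict(tree, (1, 1))
```

A purely additive model of the two factors could not reproduce the jump for people with both conditions, whereas the nested splits of the tree isolate that cell directly.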
- A further aspect of the invention contemplates a computer-implemented process wherein the interaction capturing technique is selected from the group consisting of median regression tree techniques, least square regression tree techniques, rule induction techniques, ordinary least squares regression techniques, median regression techniques, robust regression techniques, genetic algorithms, clustering techniques and neural network techniques.
- Yet another aspect of the invention is a computer implemented process wherein the person-level next period cost forecasts are adjusted by modifying the extant cost forecast by the expected cost trend.
- A yet further aspect of the invention is a computer implemented process wherein the data from the claims used as predictors consist essentially of the claim- and enrollment-based risk factors and the claim amount is a standardized cost of services provided and the model is used to allocate prospective payments to health care providers.
- A still yet further aspect of the invention is a computer implemented process wherein the data used from the claims data consist essentially of the claim code and selected mandatory procedures and the claim amount is a standardized cost of services provided during the same time period as the base period and the model is used to evaluate the efficiency of health care providers.
- Another aspect of the invention is a computer implemented process of forecasting future claim amounts attributable to claims from members of a book of business for an actual policy period, wherein the model development universe comprises data from the members of a book of business to be insured, further comprising:
- applying the cost-forecasting model to the actual underwriting period person-level data of each of the members of the book of business to generate a person-level actual policy period cost forecast for each member of the book of business; and
- producing a group-level forecast for the actual underwriting period from the person-level forecasts of each member of the group by totaling the person-level actual policy period cost forecasts for the group for the policy period.
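The totaling step above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation; the function name, the tuple layout and the sample figures are hypothetical.

```python
from collections import defaultdict

def group_level_forecasts(person_forecasts):
    """Total person-level policy period forecasts by group.

    person_forecasts: iterable of (group_id, member_id, forecast_cost).
    The tuple layout is an assumption made for this sketch.
    """
    totals = defaultdict(float)
    for group_id, _member_id, cost in person_forecasts:
        totals[group_id] += cost
    return dict(totals)

# Hypothetical person-level forecasts for two groups.
forecasts = [
    ("G1", "m1", 1200.0),
    ("G1", "m2", 300.0),
    ("G2", "m3", 4500.0),
]
group_totals = group_level_forecasts(forecasts)
```

Any group-level adjustment would be applied to `group_totals` afterwards.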
- Yet another aspect of the invention is a computer implemented process comprising the step of: setting insurance reserves based on group-level forecast for the actual policy period, wherein the policy period is a reserving period for claims that have not occurred or that have occurred but not been reported.
- Yet still another further aspect of the invention is a computer implemented process, wherein claim amounts are a mix of fee for service payments and capitation payments so that the base and underwriting periods risk factors are appended to include dummy variables for the presence of capitation payments by provider type and the cost estimate in the next and policy periods is the fee for service cost that must be supplemented with the expected capitation payments.
- Still another aspect of the invention is a computer-implemented process of developing a hybrid person-level health care claim cost forecasting model for forecasting future medical costs attributable to health care claims from members of a book of business, where person-level data are available for a substantial portion of the members of the book of business, comprising the steps of:
- providing development universe data comprising person-level data for a statistically meaningful number of individuals, the person-level data comprising continuous variable data and categorical variable data;
- processing first the continuous variable data for each individual with a continuous processing technique that captures the predictive ability of main effects and interactions of continuous variables to generate a person-level continuous variable model; and
- processing the categorical variable data for each individual including the output from the continuous processing technique with a categorical processing technique that captures the predictive ability of main effects and interactions of categorical variables to generate a person-level categorical variable model;
- wherein the person-level continuous variable model and person-level categorical variable model together comprise a hybrid person-level health care claim amount forecasting model.
- Yet another aspect of the invention is a computer-implemented process of developing a claim amount forecasting model for use in forecasting the future claim amount for members of a book of business, where person-level data are available for a substantial portion of the members of the book of business for an actual base period, and the claim amount of interest for forecasting purposes is an actual next period which can be, but is not necessarily contiguous with the actual base period, comprising the steps of:
- processing the base period data having claims to generate a having-claims claim amount forecasting model; and
- processing the base period data without claims to generate a without-claims claim amount forecasting model,
- wherein the having-claims cost forecasting model and the without-claims forecasting model comprise a claim amount forecasting model.
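The split into a having-claims model and a without-claims model might look like the sketch below. The segment mean used as each "model" is only a stand-in for whatever interaction capturing technique is actually fitted, and all names and figures are hypothetical.

```python
from statistics import mean

def fit_segment_models(records):
    """Fit one stand-in model per segment.

    records: list of (base_period_claim_amount, next_period_cost).
    Each "model" here is just the segment's mean next-period cost,
    which is an assumption made to keep the sketch short.
    """
    with_claims = [nxt for base, nxt in records if base > 0]
    without_claims = [nxt for base, nxt in records if base == 0]
    return {
        "having_claims": mean(with_claims),
        "without_claims": mean(without_claims),
    }

def predict(model, base_period_claim_amount):
    """Route a person to the segment model that matches the base period."""
    key = "having_claims" if base_period_claim_amount > 0 else "without_claims"
    return model[key]

# Hypothetical development data: two people without claims, two with.
history = [(0.0, 100.0), (0.0, 300.0), (500.0, 900.0), (2000.0, 1500.0)]
model = fit_segment_models(history)
```

Together, the two segment models and the routing rule act as a single claim amount forecasting model.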
- Yet another aspect of the invention is a computer-implemented process of developing a health care claim amount forecasting model for use in forecasting the future medical claim amount for members of a book of business, where person-level data are available for a substantial portion of the members of the book of business for an actual base period, and the claim amount of interest for forecasting purposes is an actual next period which can be, but is not necessarily contiguous with the actual base period, comprising the steps of:
- providing development universe data comprising person-level data for a statistically meaningful plurality of individuals, wherein the person-level data for an individual comprises health care claims data for the individual and the data on a health care claim comprises at least a claim amount and a claim code;
- Winsorizing the person-level data to yield inlier data and outlier data;
- processing the inlier data to generate an inlier cost forecasting model; and
- processing the outlier data to generate an outlier cost forecasting model;
- wherein the combination of the results of the inlier and outlier cost forecasting models together produce a person-level claim amount forecast model.
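One way to read the Winsorizing step is sketched below: each person's cost is capped at a threshold, the capped part feeds the inlier model, and the excess over the cap feeds the outlier model. The cap value, the additive recombination and the function name are assumptions of this sketch and are not stated in the text above.

```python
def winsorize_split(costs, cap):
    """Split each cost into a capped (inlier) part and an excess (outlier) part."""
    inlier = [min(c, cap) for c in costs]
    outlier = [max(c - cap, 0.0) for c in costs]
    return inlier, outlier

# Hypothetical annual costs and a hypothetical cap.
costs = [0.0, 800.0, 12000.0, 95000.0]
inlier, outlier = winsorize_split(costs, cap=25000.0)

# Under the additive assumption, the two parts recombine to the original cost.
recombined = [i + o for i, o in zip(inlier, outlier)]
```

In this reading, a person-level forecast would be the sum of the inlier model's forecast and the outlier model's forecast.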
- Another aspect of the invention is a computer-implemented process comprising:
- Winsorizing the inlier data to yield inlier data having claims and inlier data without claims;
- processing the inlier data having claims to generate an inlier-having-claims claim amount forecasting model; and
- processing the inlier data without claims to generate an inlier-without-claims claim amount forecasting model,
- wherein the inlier-having-claims cost forecasting model and the inlier-without-claims forecasting model comprise an inlier claim amount forecasting model.
- A still further aspect of the invention is a computer-implemented process of forecasting a claim amount attributable to claims from members of a book of business during an actual policy period, comprising the steps of:
- providing person-level data, comprising enrollment data for members of a book of business to be insured for an actual underwriting period that can be, but is not necessarily, contiguous with the actual policy period;
- providing a model development universe of person-level data, comprising enrollment data from the historical base period and historical next period health care claims data for a statistically meaningful number of individuals;
- providing enrollment-based risk factors for each historical base period and providing next period claim amounts;
- developing a health care cost-forecasting model for the enrollment data by capturing the predictive ability of main effects and interactions of enrollment-based risk factors through the application of an interaction capturing technique to the model development universe;
- applying the health care cost-forecasting model to the person-level underwriting period enrollment data of each of the members of the book of business to generate a person-level expected cost forecast for the policy period for each member of the book of business; and
- producing a group-level forecast for the expected cost of the policy period from the person-level forecasts of each person of the group by totaling the person-level expected cost forecasts for the actual policy period.
- A still further aspect of the invention is a computer-implemented process of forecasting costs attributable to claims from members of a book of business during an actual policy period, comprising the steps of:
- providing person-level data, comprising enrollment data and actual underwriting period health care claims data, for members of a book of business, where the person-level data on a health care claim comprises at least a claim amount and a claim code and the actual underwriting period can be, but is not necessarily, contiguous with the actual policy period;
- providing a model development universe of person-level data, comprising enrollment data, historical base period health care claims data and historical next period claim amount data for a statistically meaningful number of individuals, where the person-level data on a base period health care claim includes at least a claim amount and a claim code;
- providing claim-based risk factors for each historical base period based on the claim code associated with the health care claim and providing at least one enrollment risk factor based on the enrollment data;
- developing a cost-forecasting model by capturing the predictive ability of main effects and interactions of risk factors through the application of an interaction capturing technique to the model development universe;
- applying the cost-forecasting model to the person-level data of each of the individuals or members of a group to generate a person-level actual policy period expected cost forecast for each member of the group; and
- producing a group-level forecast for the actual policy period from the person-level forecasts of each individual or member of the group by totaling the person-level cost forecasts for the actual policy period.
- Yet a further aspect of the invention is an automated system for forecasting future costs attributable to claims from members of a book of business during an actual policy period comprising:
- a central processing unit;
- an insured person database, accessible by the processor, wherein the database comprises person-level enrollment data and actual underwriting period health care claims data, for members of a book of business to be insured, where the person-level data on a health care claim comprises at least a claim amount and a claim code;
- a model development universe database, accessible by the processor, wherein the second database comprises model development universe of person-level data, comprising enrollment data, historical base period health care claims data and historical next period claim amount data for a statistically meaningful number of individuals, where the person-level data on the base period health care claim includes at least a claim amount and a claim code;
- a risk factor encoder, accessible by the processor, wherein the risk factor encoder encodes claim-based risk factors for each historical base period based on the claim code associated with the health care claim and the risk factor encoder encodes at least one enrollment risk factor based on the enrollment data;
- a model generator, accessible by the processor, that generates a cost-forecasting model by capturing the predictive capacity of the main effects and the interaction of the risk factors assigned by the risk factor encoder to forecast the historical next period of the model development universe data using the historical base period data;
- a person-level cost generator that applies the cost-forecasting model to the person-level actual underwriting period health care claims data of each of the members of the book of business to generate a person-level actual policy period claim amount forecast for each member of the book of business; and
- an actual policy period group-level cost forecast generator that totals the person-level actual next period forecasts for each member of the group to generate an actual policy period group-level cost forecast.
- Still another aspect of the invention is a computer-implemented process of forecasting costs attributable to claims from members of a book of business during an actual policy period, comprising the steps of:
- means for providing person-level data, comprising enrollment data and actual underwriting period health care claims data, for members of a book of business, where the person-level data on a health care claim comprises at least a claim amount and a claim code and the actual underwriting period can be, but is not necessarily, contiguous with the actual policy period;
- means for providing a model development universe of person-level data, comprising enrollment data, historical base period health care claims data and historical next period claim amount data for a statistically meaningful number of individuals, where the person-level data on a base period health care claim includes at least a claim amount and a claim code;
- means for providing claim-based risk factors for each historical base period based on the claim code associated with the health care claim and providing at least one enrollment risk factor based on the enrollment data;
- means for developing a cost-forecasting model by capturing the predictive ability of main effects and interactions of risk factors through the application of an interaction capturing technique to the model development universe;
- means for applying the cost-forecasting model to the person-level data of each of the individuals or members of a group to generate a person-level actual policy period expected cost forecast for each member of the group; and
- means for producing a group-level forecast for the actual policy period from the person-level forecasts of each individual or member of the group by totaling the person-level cost forecasts for the actual policy period.
- A still further aspect of the invention is a group insurance product comprising:
- an identification of the types of benefits which are agreed to be provided by an insurer to or on behalf of members of a group, which will be incurred by members of said group during a future time period; and
- a stated monetary insurance premium including a forecast of said benefits, estimated costs of administering the insurance product, and optionally, an estimated profit,
- whereby an insurer agrees to cover the identified benefits in exchange for the payment of the stated monetary insurance premium.
- Yet another aspect of the invention is a method of pricing group insurance including a cost of future benefits according to the computer-implemented process of forecasting future medical costs attributable to claims from members of a group during an actual underwriting period, comprising the steps of:
- providing an expected amount of administrative costs allocable to providing health insurance coverage to the group;
- providing a minimum acceptable expected profit;
- totaling the group level cost forecast, expected amount of administrative costs, and minimum acceptable expected profit to yield a total minimum price, and
- providing a plurality of expected probabilities of retention for the group corresponding to a plurality of possible prices greater than or equal to the total minimum price, each possible price also having an expected profit that is the amount of the price over the group level cost forecast plus the expected amount of administrative costs; and
- calculating a plurality of possible maximum profits by multiplying each of the plurality of possible profits by the corresponding expected probability of retention, wherein the largest possible maximum profit is used to price the group insurance.
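The pricing steps above amount to a small search over candidate prices. The sketch below follows those steps directly; the function name and all dollar figures and retention probabilities are hypothetical.

```python
def optimize_price(cost_forecast, admin_cost, min_profit, candidates):
    """Pick the candidate price with the largest retention-weighted profit.

    candidates: list of (price, expected_retention_probability).
    Prices below the total minimum price are skipped.
    Returns (best_price, retention_weighted_profit), or None if no candidate qualifies.
    """
    minimum_price = cost_forecast + admin_cost + min_profit
    best = None
    for price, p_retain in candidates:
        if price < minimum_price:
            continue
        profit = price - (cost_forecast + admin_cost)
        weighted = profit * p_retain
        if best is None or weighted > best[1]:
            best = (price, weighted)
    return best

# Hypothetical group: total minimum price is 900,000 + 80,000 + 20,000 = 1,000,000.
best = optimize_price(
    cost_forecast=900_000.0,
    admin_cost=80_000.0,
    min_profit=20_000.0,
    candidates=[
        (990_000.0, 0.95),    # below the minimum price, ignored
        (1_000_000.0, 0.75),  # 20,000 * 0.75 = 15,000
        (1_050_000.0, 0.50),  # 70,000 * 0.50 = 35,000
        (1_100_000.0, 0.25),  # 120,000 * 0.25 = 30,000
    ],
)
```

Here the middle price wins: raising the price further adds profit per retained group but loses too much retention probability.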
- Still another aspect of the invention is a method of underwriting an insurance product comprising the steps of:
- providing an identification of the coverage of the insurance product which identifies the conditions of payment under the product during a policy period;
- providing person-level health care claim information comprising enrollment data, and base period and underwriting period claim data, the claim data comprising claim codes having associated claim costs;
- capturing the predictive ability of the person-level health care claim information through the application of an interaction capturing technique; and
- forecasting a predicted cost of the insurance product during the policy period based on the identification of the coverage of the insurance product and the captured predictive ability of the person-level health care claim information;
- wherein each of diagnosis and CPT based risk factor is independent of the sequence in time of other diagnosis and CPT based risk factors.
- A further aspect of the invention is a method of underwriting insurance for short term disability (STD) costs, wherein the interaction capturing technique uses a dependent measure from the next period and policy period comprising the number of STD days in the policy period and weights the dependent measure by the expected cost per day for the STD to produce the person-level expected STD costs, which are summed across the group to produce the group's expected STD cost.
- A still further aspect of the invention is insuring long term disability (LTD) claims, wherein a dependent measure for generating the cost forecasting model is the probability of an LTD claim in the policy period, where the probability is weighted by the net present value of the LTD, and applying the cost forecasting model to the person-level data produces person-level expected LTD costs, which are summed across the group to produce the group's expected LTD cost for an actual policy period.
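The two disability aspects above both weight a forecast quantity by a cost and then sum across the group. A minimal sketch, assuming the forecast STD days and the LTD claim probability have already been produced by a model; the field names and figures are hypothetical.

```python
def expected_std_cost(expected_std_days, cost_per_day):
    """Person-level expected STD cost: forecast days times expected cost per day."""
    return expected_std_days * cost_per_day

def expected_ltd_cost(ltd_claim_probability, ltd_net_present_value):
    """Person-level expected LTD cost: claim probability times NPV of the LTD."""
    return ltd_claim_probability * ltd_net_present_value

# Hypothetical model outputs for a two-person group.
members = [
    {"std_days": 2.0, "std_cost_per_day": 150.0, "ltd_prob": 0.01, "ltd_npv": 200_000.0},
    {"std_days": 0.5, "std_cost_per_day": 200.0, "ltd_prob": 0.002, "ltd_npv": 500_000.0},
]

group_std_cost = sum(
    expected_std_cost(m["std_days"], m["std_cost_per_day"]) for m in members
)
group_ltd_cost = sum(
    expected_ltd_cost(m["ltd_prob"], m["ltd_npv"]) for m in members
)
```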
- A still yet further aspect of the invention is a cost forecast produced for first-dollar health insurance.
- Another aspect of the invention is a cost forecast produced for stop loss health insurance.
- A still further aspect of the invention is a cost forecast produced for aggregate-only stop loss health insurance.
- Still another aspect of the invention is a cost forecast produced for specific stop loss health insurance.
- Yet another aspect of the invention is a cost forecast for insuring group term life insurance costs wherein a dependent measure for generating the cost forecasting model is the expected probability of death weighted by the amount of life insurance to produce the person-level expected term life insurance cost.
- In still another aspect, the model development universe comprises data from the members of a group in the book of business to be insured.
- A still yet further aspect of the invention comprises the step of: setting insurance reserves based on the renewal group-level forecast for the actual underwriting period, wherein the next period is a reserving period for claims that have not occurred or that have occurred but not been reported.
- FIG. 1 is a flowchart of an embodiment of an overview of a method for estimating future cost and optimizing pricing.
- FIG. 2 is a flowchart of an embodiment of a method like that of FIG. 1 which is particularly adapted for service bureau processing.
- FIG. 3 is a flowchart of an embodiment of a method like that of FIG. 1 which is particularly adapted for use as a software product, which may be functionally distributed locally or over the Internet.
- FIG. 4 is a more detailed flowchart of a process for data processing of steps FIGS. 1, 2 and 3.
- FIG. 5 is a more detailed flowchart illustrating a process for standardizing time periods, for use in the methods of FIGS. 1-3, and in particular steps
- FIG. 6 is a flowchart illustrating data validation and standardization procedures for steps FIGS. 1-3.
- FIG. 7 is a flowchart illustrating the matching and merging (integration) of data in the process steps 102, 202 or 302 of FIGS. 1-3.
- FIG. 8 is a flowchart illustrating the aggregation and risk factor coding for the steps FIGS. 1-3.
- FIG. 9 is a flowchart of processing steps for developing cost forecasting models based on “inlier” data in steps FIGS. 1-3.
- FIG. 10 is a detailed flowchart of process steps for developing cost forecasting models based on “outlier” data of the Winsorized data for the steps FIGS. 1-3.
- FIG. 11 is a detailed flowchart for scoring, testing and integrating the data, and adjusting for cost trends for use in steps FIGS. 1-3.
- FIG. 12 is a detailed flowchart illustrating processing steps for developing group-level models and making adjustments to the summary of the person-level data of steps FIGS. 1, 204, 208 and 210 of FIG. 2, or 304, 306 and 310 of FIG. 3.
- FIG. 13 is a detailed flowchart of an embodiment of a price optimization procedure which may be used to carry out steps FIGS. 1-3.
- The present invention is directed to insurance systems, particularly including methods for processing health insurance data to estimate future costs, and for optimizing pricing of health insurance products, including both first-dollar and stop loss insurance products. In various aspects, it involves processing historical data, developing algorithms, applying those algorithms, updating those algorithms and setting prices. However, the insurance systems that can benefit from the methods and systems disclosed herein also include, but are not limited to, health insurance, disability insurance, both short term and long term, as well as term life insurance systems.
- This invention comprises a series of related products that provide more accurate group-level claim amount forecasts (and person-level forecasts for individual or family health insurance) and more optimal group-level renewal prices for insurers at full risk for the health insurance (e.g., indemnity, PPO, HMO, POS) or aggregate only stop loss health insurance for self insured employers. These forecasting models for renewal price setting are not intended to be used for paying managed care providers but alternate related models are developed for that purpose (see B in Table 1 below). The products provide more accurate future cost estimates by forecasting person-level costs using models that include clinical information from historical health insurance claims as well as person-level demographic and historical cost data. In this regard, effective models may be based on data from relatively large groups of at least 50,000 people, such as typically covering an entire book of business for an insurer (or a large subclass of the insurer's book of business such as all HMO groups of the insurer) or in the case of a TPA, the TPA's entire book of business. The most recent year of person-level medical claim data for the individuals of a particular book of business for which an accurate cost forecast is desired may be processed by this model, to produce an accurate projected cost for policy pricing, as will be described. Future cost trend estimates (inflation) are adjusted for each individual's characteristics and applied to the person-level estimates. Person-level cost forecasts are summarized to the family-level or group-level and family or group-level characteristics are used to adjust the summarized cost to produce the adjusted family or group-level cost forecast. 
The price is optimized using a system that estimates the probability of the group accepting the insurance at the price offered, given the group's historical insurance cost, claims history, and local competitive market conditions. The probability is weighted by a function of the expected future profit, which equals the anticipated price less expected medical and administrative costs. The method and models, with slight adjustments, can be applied to self-insured employers' aggregate only, specific only or specific plus aggregate medical stop loss data. The products also include the use of the method applied to a client's book of business for estimating future claim amounts for purposes of setting a reserve by group and for cost forecasting and pricing for new groups or individuals for fully insured health insurance. Another alternative application would be the use of the method to develop and deliver products that allow HMOs to prospectively allocate health care payments to providers. Another product is the measurement of the efficiency of health care providers. These methods can be applied to medical claims linked to future short or long-term disability payments or indicators of disability and used to rate the relative risk of disability of groups or forecast their future costs by using the group's medical claims, enrollment data and summarized group-level or person-level disability payments. Another application is to group term life insurance. The dependent measure is the probability of death next period, which is linked to medical claims in the base period, and the potential risk factors are the same potential risk factors as used with the other models.
- The modeling strategy employed for the cost forecasting models contains several novel components. We have used a combination of specialized data collection and cleaning, regression trees and regression (ordinary least squares or OLS, logistic and median) models tailored to a client's book of business, and the application of these models to the client's book of business for improved decision making. While there are many published examples of OLS being used for purposes similar to this application, there are few using trees. We are not aware of any reports using a combination of regression trees and other regression models to forecast health care costs. The use of the output of a tree model as an input to other regression algorithms is known as “hybrid” tree models. (See D. Steinberg and N. Scott Cardell, Improving Data Mining with New Hybrid Methods, Salford Systems, May 27, 1998, Powerpoint@ http://www.Salford-systems.com). They give examples of models with a binary (yes-no) dependent variable for which they used the regression tree output as predictors in a regression model. They demonstrated that this hybrid combination was superior to either method used alone. When our dependent variable is cost, we used OLS regression with the output of regression trees, and when the dependent variable is a probability, we used logistic regression. This allowed us to have continuous valued predictions rather than the step-like predictions characteristic of trees and contingency table forecasts. Our use of the terminal nodes of a regression tree as predictors in an OLS or logistic regression model provides an effective way to have both the main effects and complex interactions of candidate predictors properly weighted in our final model.
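The hybrid idea described above, with tree terminal nodes used as predictors in an OLS model, can be illustrated with a deliberately small sketch. A depth-1 tree (a single split) stands in for a full regression tree, and the OLS fit is done with plain normal equations; the variables (age, prior cost) and all data are hypothetical, and this is not the software the text refers to.

```python
def sse(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def fit_stump(x, y):
    """Depth-1 regression tree: the split on x with the lowest total squared error."""
    best_sse, best_t = float("inf"), None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        total = sse(left) + sse(right)
        if total < best_sse:
            best_sse, best_t = total, t
    return best_t

def solve(a, b):
    """Solve a @ w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) w = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical data: next-period cost depends on an age step plus prior cost.
age = [25, 30, 35, 55, 60, 65]
prior_cost = [100.0, 400.0, 200.0, 300.0, 100.0, 500.0]
next_cost = [700.0, 1300.0, 900.0, 2600.0, 2200.0, 3000.0]

split = fit_stump(age, next_cost)

def features(a, z):
    """Terminal-node indicators from the stump plus the continuous predictor."""
    return [1.0 if a <= split else 0.0, 1.0 if a > split else 0.0, z]

weights = ols([features(a, z) for a, z in zip(age, prior_cost)], next_cost)

def predict(a, z):
    return sum(w * f for w, f in zip(weights, features(a, z)))
```

Because the continuous predictor enters the OLS stage, `predict` varies smoothly with prior cost within each terminal node instead of returning one step value per node.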
- A typical group health insurance product in accordance with the present invention (such as the various types of Blue Cross™ and Blue Shield™ brand group health insurance policies, which are incorporated herein by reference) comprises an identification of the types of medical expenses which are agreed to be covered, paid or reimbursed by an insurer to or on behalf of members of the group (including their covered dependents) which are incurred by members of the group during a future time period, typically one year, in exchange for a stated monetary insurance premium which includes a forecast of said medical expenses in accordance with the methods described herein, estimated costs of administering the health insurance product, and an estimated profit.
- Table 1 summarizes the alternate uses of our method as applied to health care enrollment and claims data linked with claim amounts for first dollar and stop loss coverage, disability coverage, reserves and term life coverage. These alternate model developments produce products that are customized for specialized applications. Row A-1 is the application of our invention which is presented in most detail in this application. The methods used in A-1 are clearly related to those in each of the other rows.
- TABLE 1. Applications of the Invention's Modeling Methods

| Application | Allowable sources of candidate risk predictors: Enrollment Data | Allowable sources of candidate risk predictors: Claims Data | Dependent Variable for Services Provided During Dep. Variable Ref. Time | Reference Times for Dependent & Predictor Variables | Model Type |
| --- | --- | --- | --- | --- | --- |
| A. Predict Future Costs of Health Insurance | | | | | |
| 1. Renewal Groups or Individuals | All | All | Cost of Claims | Predictor Variable Precedes | Prospective |
| 2. Stop Loss: Specific Only, Aggregate Only or Specific Plus Aggregate | All | All | Cost of Claims over Deductible, over Attachment Point or Both | Predictor Variable Precedes | Prospective |
| 3. Required Reserves | All | All | Reserve Period IBNR | Predictor Variable Precedes | Prospective |
| 4. New Groups or Individuals | All | None | Cost of Claims | Predictor Variable Precedes | Prospective |
| B. Allocate payments to health care providers | All | Diagnosis | Standardized Costs of services provided | Predictor Variable | Prospective |
| C. Measure “Efficiency” of care providers | All | Diagnosis & selected mandatory procedures | Standardized Costs of services provided* | Predictor Variable concurrent with Dependent Variable | Retrospective |
| D. Short Term Disability | All | All + STD Claims Payments | STD days, Cost or Index | Predictor Precedes | Prospective |
| E. Long Term Disability | All | All + STD + LTD | Probability LTD, Cost or Index | Predictor Precedes | Prospective |
| F. Group Term Life | All | All + Death | Probability Death, Cost | Predictor Precedes | Prospective |

*Costs per service can be standardized by use of relative values for CPT codes and DRG weights for hospital care or average actual costs for each service.

- Optimal pricing for a fully insured group requires an accurate forecast of the group's mean cost per person in the policy period. Optimal pricing for an aggregate only medical stop loss insurance for a self-insured employer also requires an accurate forecast of that group's mean cost per person in the policy period.
Therefore, the exact same methodology can be used for the cost forecast for fully insured groups or for a self-insured group's aggregate only stop loss insurance if the same data are available. The methods used to set prices differ, however, because a self-insured employer pays the majority of the medical expenses itself and therefore pays a premium far smaller than that for full health insurance, where the insurer pays all of the medical costs.
- CapCost™ is an aggregate only medical stop loss product that includes a system for making more accurate cost forecasts (mainly for groups with 51 to 3000 employees). The attachment point for CapCost™ can be the standard 125% of expected costs (called CapCost 125™) but we will offer an attachment point at 110% of expected costs (called CapCost 110™) and possibly other attachment points. The terms of CapCost™ are similar to those of traditional medical stop loss insurance, but there is cash flow protection, medical costs are cumulated on an incurred basis rather than a paid basis, and there is no specific stop loss coverage. CapCost™ is useful for employers since many will receive prices that are below the price of traditional specific plus aggregate medical stop loss insurance while the maximum aggregate medical liability for the group may be lower with CapCost™ than with traditional specific plus aggregate medical stop loss insurance. From the insurer's perspective, the expected medical claims it must pay with CapCost™ are frequently below those of traditional medical stop loss products since specific stop loss coverage is not provided. Generally, CapCost™ is a better value for the employer than traditional stop loss coverage when the employer is larger than the average employer purchasing stop loss coverage or if the group has experienced some unusually high annual medical expenses due to a few high cost individuals that are unlikely to have high costs recurring in the near future.
- CapCost™ is novel in the way expected future medical costs are estimated. Historical medical claims, enrollment, benefit plan and employer files in electronic format are collected from the Third Party Administrator (TPA) or insurance company that is paying the employer's medical bills. The electronic files containing the medical claims and enrollment data are collected for all people with medical coverage rather than from only those that had large claims. This invention's cost forecasting models are applied to the insured people covered by the employer. The inflation trend and optimized pricing are then applied to the cost estimates. The CapCost™ product is a system for data collection, cost estimates, and price optimization and is part of this invention. Separate products are designed for pricing new or renewal coverage for fully insured medical plans and for allocating reserves for such medical plans. Each contains a system for data collection and cost estimation. Price optimization is an additional part of this invention for fully insured medical plan renewals and stop loss coverage.
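The attachment points described for CapCost™ above (125% or 110% of expected costs) determine when an aggregate only stop loss policy begins to pay. A minimal sketch of that arithmetic follows; it ignores policy maximums and other contract terms, and the function name and figures are hypothetical.

```python
def aggregate_stop_loss_payout(actual_cost, expected_cost, attachment_pct):
    """Insurer's payout: group cost in excess of the attachment point.

    attachment_pct is a percentage of expected cost, e.g. 125 or 110.
    Policy maximums and other terms are deliberately left out of this sketch.
    """
    attachment_point = expected_cost * attachment_pct / 100
    return max(actual_cost - attachment_point, 0.0)

# Hypothetical group with expected cost 1,000,000 and actual cost 1,400,000.
payout_125 = aggregate_stop_loss_payout(1_400_000.0, 1_000_000.0, 125)
payout_110 = aggregate_stop_loss_payout(1_400_000.0, 1_000_000.0, 110)
payout_none = aggregate_stop_loss_payout(1_000_000.0, 1_000_000.0, 125)
```

A lower attachment point raises the insurer's expected payout, which is why an accurate expected cost forecast matters for pricing it.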
- One of the important measures of the quality of a model is the mean absolute residual (MAR). The MAR is the mean of the absolute value of the difference between the actual and predicted cost of a group. A lower MAR is desirable since the predicted cost is closer to the actual cost. We compared the MAR for this invention's predicted cost with the MAR calculated using an experience model and the MAR calculated using a demographic look-up table model. The results are presented as a percentage of the mean of the groups' costs or the predicted divided by the actual times 100. The MAR was 11.6% for the invention's prediction, 14.2% for the experience model, and 25.8% for the demographic model for the 116 actual groups in our database. The invention forecast was substantially better than either of the two conventional forecast methods.
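The MAR as defined above, expressed as a percentage of the mean actual group cost, can be computed as follows. The function name and the three-group example are hypothetical and are not the 116-group data cited in the text.

```python
def mar_percent(actual, predicted):
    """Mean absolute residual as a percentage of the mean actual group cost."""
    residuals = [abs(a - p) for a, p in zip(actual, predicted)]
    mar = sum(residuals) / len(residuals)
    mean_actual = sum(actual) / len(actual)
    return 100 * mar / mean_actual

# Hypothetical per-group mean costs.
actual = [1000.0, 2000.0, 3000.0]
predicted = [1100.0, 1800.0, 3300.0]
mar_pct = mar_percent(actual, predicted)
```

In this example the absolute residuals average 200 against a mean actual cost of 2000.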
- We conducted a Monte Carlo simulation for groups with various numbers of employees since our database is too small to analyze by group size. We randomly selected 1500 enrollees and their dependents and made 500 synthetic groups. The MAR as a percentage of the groups' actual cost was about 7% for the invention's forecast and just under 10% for the experience forecast. A demographic forecast was not compared since groups with over 1500 employees and their dependents are deemed completely credible.
- A measure of model accuracy addresses whether and by how much the model systematically over- or under-predicts the actual costs for various characteristics of the insured population. In order to compare this accuracy measure for two models, we sort the (actual) cost of groups into deciles from the lowest 10% to the highest 10%. We calculate the predicted (forecast) cost for the groups in each decile (or finer gradation). The actual cost is divided by the forecast cost to make an index. The index should be close to 1.0 if the model is accurate. In our simulation tests (500 groups of 1500 employees), the invention's forecast is closer to 1.0 for every decile, indicating that it is a superior model to the experience model. The invention's ratio of predicted to actual was about 0.91 for the lowest decile and about 1.32 for the highest decile, while the experience model's ratios were about 0.85 and about 1.55, respectively. The other deciles were closer to 1.0, but the invention forecast was always closer to 1.0 than the experience forecast.
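The decile comparison above can be sketched as follows. This is a hedged illustration only: the function name, the equal-count binning, and the sample data are assumptions, and the index is computed as actual divided by forecast, following the definition sentence in the text.

```python
def decile_indices(actual, forecast, bins=10):
    """Sort groups by actual cost, split into equal-count bins, and return
    sum(actual) / sum(forecast) for each bin, lowest-cost bin first."""
    pairs = sorted(zip(actual, forecast))  # ascending by actual cost
    n = len(pairs)
    if n < bins:
        raise ValueError("need at least one group per bin")
    indices = []
    for b in range(bins):
        chunk = pairs[b * n // bins:(b + 1) * n // bins]
        total_actual = sum(a for a, _ in chunk)
        total_forecast = sum(f for _, f in chunk)
        indices.append(total_actual / total_forecast)
    return indices

# Hypothetical data: 20 groups whose forecast is shrunk toward 1000.
actual = [500.0 + 50.0 * i for i in range(20)]
forecast = [0.5 * a + 500.0 for a in actual]
for decile, index in enumerate(decile_indices(actual, forecast), start=1):
    print(decile, round(index, 3))
```

In this made-up data the index is below 1.0 in the lowest decile and above 1.0 in the highest, which is the pattern of systematic over- and under-prediction that the measure is meant to expose.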
- The invention includes a general process for developing models for forecasting health care costs. The invention also includes processes for products that incorporate the process and provide information for improving specific business decisions made by health insurers, including, but not limited to, aggregate only, specific only and specific plus aggregate stop loss health insurance products. The models may be developed for specific insurers and their book of business, and may be different for each insurer. A software listing of an embodiment of a program for carrying out a forecasting process in accordance with the present invention is present on the above-cited CD-ROMs. Illustrated in
FIG. 1 is a flowchart which represents an overview of an embodiment of a method in accordance with the present invention as applied to cost forecasting and pricing of renewals for health insurance for fully insured groups.
- In accordance with the method of
FIG. 1, health data on members of the book of business is collected, cleaned, integrated and aggregated, as shown in step 102. If the data are missing or miscoded, the cost forecasts may also be inaccurate. Most of the programming cost and analysis involves these phases of the process. The client's data may typically be in many different computer systems or databases, and the data may need to be combined to build person-level files that are complete for a specified time period.
- A twelve month “base period” is typically used as the period from which we collect this data to describe each person's history of claims, diagnoses and other factors. The base period could be a longer period or shorter period and will depend on how long the groups have been enrolled and the time for which adequate computer or other records are kept. The base period may have different time periods for people and groups that do not have the same enrollment renewal dates.
- There is typically a period between the “base period” (or underwriting period) and the “next period” (or policy period) during which medical claims data are not available, since the claims were incurred but not reported or they fall between the time of the price quote for the policy period's renewal and the renewal date. We call this the “lag period”. The examples here use a lag period of three months, but that could be a longer or shorter time period depending on the needs and constraints of the available data, the insurer or others.
- The “next period” is typically the period of twelve months of insurance coverage immediately following the lag period. The claim amount forecast period is the next or policy period that is priced for the group. The “next period” is the relevant time period for the dependent variable in the cost forecast models.
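The relationship between the three periods can be illustrated with a short Python sketch. The helper names are hypothetical, and the sketch assumes periods that start on the first day of a month, with the default lengths of 12, 3 and 12 months used in the examples here.

```python
from datetime import date, timedelta

def add_months(d, months):
    """Shift a first-of-month date by a number of months (may be negative)."""
    if d.day != 1:
        raise ValueError("this sketch only handles first-of-month dates")
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)

def period_windows(next_start, base_months=12, lag_months=3, next_months=12):
    """Return inclusive (start, end) dates for the base, lag and next periods."""
    one_day = timedelta(days=1)
    lag_start = add_months(next_start, -lag_months)
    base_start = add_months(lag_start, -base_months)
    next_end = add_months(next_start, next_months) - one_day
    return {
        "base": (base_start, lag_start - one_day),
        "lag": (lag_start, next_start - one_day),
        "next": (next_start, next_end),
    }

# Example: a group whose policy (next) period begins January 1, 2008.
windows = period_windows(date(2008, 1, 1))
for name, (start, end) in windows.items():
    print(name, start, end)
```

Groups with different renewal dates would simply be given different `next_start` values, so each group carries its own set of windows.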
- If the insurer for which future health costs are to be forecast (e.g., a business entity which desires to provide health insurance) is a new client, (e.g., has not had models previously built on their book of business) then a new cost forecasting model may need to be developed for them, for example, as shown in
step 104 of FIG. 1. An alternative is to use existing forecasting models and recalibrate those models to the new or updated data. Our methods include a systematic process to develop new models or recalibrate old models. A new model is developed when the old database upon which the old model was developed is not representative of the new database. This might occur if the new database is substantially different in size, covers a different geographic region, contains different types of insurees (e.g., predominantly elderly in Medicare; pregnancy and children are characteristic of Medicaid) or different types of payments (e.g., capitation payments plus fee for service payments).
- The selection of the population to be modeled is of key importance since the predictor variables and their weights will reflect not only the specific needs of the population, but also the practice patterns of those providing care and the prices charged for its health care services. The ideal population to use as a standard is the CI's book of business for which the forecasts are needed, provided it is of sufficient size. We have found that an insured population (i.e., book of business) as small as 50,000 persons can produce robust cost forecasts.
- Use of another, smaller or less representative population as a standard can cause problems in the selection of risk factors, because there is no reason to believe that needs per person (even after adjustment for demographic factors), practice patterns of providers, or prices per service will be similar enough between the index population and what amounts to a convenience sample, no matter how large the latter may be. These three cost component factors are known to vary by geographic locale, by socioeconomic status of the insured, and by the characteristics of the providers and the features of their health insurance.
- As shown in
step 106, if it is determined that a new cost forecasting model should be developed, there is a specified process for developing the model. The method for developing the new cost forecasting model is part of our product and it can be applied to any medical insurance database that includes the necessary information.
- To develop a new cost forecasting model for a specific customer, we need data from groups that were in its historical “base period” and “next period”. Claims data from the “lag period” are not necessary since they need not be used in the model, but they are generally collected. The cost forecasting model is calibrated on the historical data to model the dynamics of medical care, practice patterns, and pricing in the geographic markets and provider networks used by the customer. The groups of insured people used as a standard in our models must be enrolled for at least the last day of the “base period”, for the entire lag period and the next period. Multiple sets of base period, lag period and next period can be used to increase the amount of data used to create the cost forecasting model. More data produces more robust models, but the data must be adjusted for secular cost trends when there are multiple calendar years for the “base period”.
- Scoring the data for pricing insurance for the policy period involves applying the forecasting model to the data for the underwriting period that will be used to forecast cost for the policy period—the renewal year that needs pricing, as shown in
processing block 108. Generally, the most recent nine months of the previous next period will be in the new underwriting period offset by the three month lag period. This helps in processing the data needed for predicting future costs. The first step in the scoring 108 is applying the data steps to the new underwriting period that have not been previously applied (e.g., coding of risk factors). Second, the cost forecasting model is applied to the person-level data. External health care inflation forecasts from the CI or consulting organization are then used to adjust the prior year's trend inherent in the person-level forecasts. The person-level inflation adjusted cost forecasts are then aggregated to the group-level. Third, group-level adjustments to the forecasts are applied for benefit plan design, SIC code, and other factors influencing group costs. - Having forecast the group's future medical expenses, over the selected (e.g., 1 year) period, the price to be charged for the medical insurance for the group for that period may be determined, as shown in
block 110 of FIG. 1. The insurer generally desires to obtain a fair, or even maximum profit, without causing the group to leave for another insurer. The competitiveness of the market, historical prices, and historical costs are all factors that will influence the likelihood of the group being retained at any given price. The policy premium, the price to be charged to the customer for the medical insurance coverage for the specific group, comprises the forecast medical cost, the insurer's overhead and other business expenses, and a projected profit. The client's underwriter(s) are asked to provide explicit probabilities of retaining a group at various price increases. These probabilities are multiplied by the expected profit if the group is retained, resulting in the expected profit for that group at each price increase. The information is presented to the underwriter with the premium price that optimizes profit highlighted and recommended. These recommended prices may be more or less than prior prices, but will typically reflect the future medical costs of the specific group more accurately.
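The expected-profit calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the dollar amounts, and the retention probabilities are hypothetical, and the sketch assumes zero profit when the group is not retained.

```python
def expected_profits(current_premium, forecast_cost, overhead, retention_by_increase):
    """Map each candidate price increase to (premium, expected profit)."""
    results = {}
    for increase, p_retain in retention_by_increase.items():
        premium = current_premium * (1.0 + increase)
        profit_if_retained = premium - forecast_cost - overhead
        # Assumption: a group that leaves contributes no profit.
        results[increase] = (premium, p_retain * profit_if_retained)
    return results

def recommended_increase(results):
    """Return the price increase with the highest expected profit."""
    return max(results, key=lambda inc: results[inc][1])

# Hypothetical inputs: dollar amounts and underwriter-supplied probabilities.
retention = {0.00: 0.95, 0.05: 0.90, 0.10: 0.75, 0.15: 0.50}
results = expected_profits(1_000_000.0, 900_000.0, 80_000.0, retention)
best = recommended_increase(results)
for increase, (premium, profit) in results.items():
    marker = " <- recommended" if increase == best else ""
    print(f"{increase:.0%}: premium {premium:,.0f}, expected profit {profit:,.0f}{marker}")
```

With these made-up inputs the 10% increase is highlighted, because the larger margin at 15% is outweighed by the lower probability of retaining the group.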
FIGS. 2 and 3 similarly provide an overview of the information flows for two different embodiments. The embodiment of FIG. 2 involves substantially only the transfer of data. The embodiment of FIG. 3 involves installing software at the client or an Internet connection with the client's software.
- Shown in
FIG. 2 is a “service bureau” embodiment in which all of the data preparation, cost forecasting, model development, scoring the data, and pricing for specific individual groups is carried out at a service bureau location. As shown in block 202, medical history and claims data for members of the group are sent to the service bureau location, and a cost forecast or per group price or both are sent back to the client (see 212). An alternative is for software to be installed in the client's (insurance company's or third party administrator's) operations with model updates being periodically provided to the client.
- This historical data (typically provided by an insurance company or TPA) is used to develop a model that is calibrated to the book of business (see the sample data requested of the client, and/or for specific policy types of insurance companies). A base period, lag period, and next period are required as a minimum. The data are fully validated prior to the model development.
- As shown in
block 204, cost forecasting models are developed which include person-level inlier models based on the Winsorized data (see FIG. 9) and outlier cost components (see FIG. 10), inflation adjustments (see FIG. 11), group-level attribute models (see FIG. 12), and pricing models (see FIG. 13).
- As shown in
block 206, once those models are developed and preferably fully tested, we are ready to work with the most recent data available to score the data as shown in block 208 and establish cost forecasts and set prices for upcoming medical insurance coverage. The most recent data are sent to us for validation, scoring, future cost estimation, cost trend adjustments and pricing (blocks 206 and 208). The data submission is done approximately on a monthly or quarterly basis. There is a trade-off between getting the most recent claims data available for pricing and the effort required to validate the data submitted at a higher frequency and shorter intervals.
- The data are stored and combined with the previous data submission until three to six months of new data are available, as shown in
block 210. The new data are combined with the most recent data from the previous data submission so that the most recent 12 months of data are available and are used as the updated next period for recalibration of the models to be used for scoring other groups. In other words, the old models are refit with the new data and updated cost trends are included also. Every one to two years the models may be revised with updated predictor variables and weights. Redoing the models will help capture changes in practice patterns and relative pricing. - As shown in
block 212, the summarized cost forecast and pricing information are sent to the client for use by underwriters or in an automated quotation system. The insurance company or other underwriter client may also use its own pricing algorithm using the cost forecast produced by the method of FIG. 2.
- As indicated,
FIG. 3 similarly illustrates an overview of an embodiment of the present invention which may be directly utilized by a health insurer or medical underwriter. - As shown in
blocks block 310. An alternative to installing the software on the client's computers is to perform that task using the Internet (as an Internet Service Provider or ISP) to extract the data and return cost forecast and group prices to the client. - As shown in
block 306, processing software modules for carrying out the present method may be installed on client computers, to utilize the standardized data for the software. - As shown in
block 308, after determining the medical cost forecast for a specific group, the prices are offered to that group for renewed medical insurance, whether it be first-dollar, stop-loss or other coverage. This can be done using a human underwriter or as part of an automated quotation system. - The software will capture the updated data and combine it, as shown in
block 310. Those data will be used to recalibrate the models after about three to six months of data accumulation. The updating may be performed offline, or may include automatic database updating and model recalibration. Completely new models may be developed about every one to two years offline. - Having described an overview of several embodiments as illustrated in
FIGS. 1-3, various processing steps of the illustrated methods will now be described in more detail.
- 402 The first step in the data portion of the process is the data request. We do not need to have data in a predetermined layout or format. Some variables may not be available for a given CI, TPA or other data provider. This process is flexible so that it can be modified to work around alternative formats and data sets used to formulate the candidate predictor variables. However, the dollar value of claims made in the base period and claims paid (or disability or life indicator ratios) in the next period are essential. Enough time for run out of claims is necessary so that incurred but not reported (IBNR) claims are included in the data. The following is an example of a data format, which may be used as a request for health and medical cost data to be used in the forecasting of medical costs:
- In a preferred embodiment, this data may preferably be in the form of five different data files that are linked by an encrypted identifier. The identifier should include unique characters for the company, family, and person. The data files should include group-level information, person-level information, detailed medical claims information (e.g., hospital, physician, durable medical equipment, home health, etc.), detailed pharmacy claims and capitation information, if germane.
- Preferably, data are provided for a relatively large number of people, e.g., 500,000, covering 27 consecutive months (12 month base, 3 month lag, and 12 month test periods).
- Descriptions of preferred data are as follows. Some of these variables may not be readily available, especially some of the group-level variables, and accordingly would not be used in the model building and medical cost forecasting. Other data which may define useful variables may also be included.
- 1. Group-level data (for any group covered during the test period)
- a. Company identifier
- b. Group location (zip code or state and county codes)
- c. Benefit plan description (format and content TBD)
- d. SIC code or other industry classification
- e. Original group effective date
- f. Employer and Employee premium contribution %
- g. Total number of covered employees on date last renewed or date lapsed
- h. Next scheduled renewal date
- i. % employee participation
- j. Capitation payments by provider type by geographic locale
- 2. Enrollment data (person-level for each person covered above)
- a. Company identifier
- b. Person identifier
- c. Age and birth date
- d. Sex
- e. Relationship to employee
- f. Status of employee (e.g., COBRA, pensioner)
- g. Employee type (e.g., hourly)
- h. Zip code of residence
- i. Date of enrollment
- j. Date of termination during study period, if any
- k. Presence of other health insurance (e.g., spouse coverage, Medicare)
- l. Salary or wage
- m. Amount of term life coverage
- n. Amount and terms of disability coverage
- 3. Medical claims (claim-level)
- a. Person/company identifier
- b. Service line-level information:
- i. Billed charges, covered charges, payments, amounts applied to deductibles, coinsurance, co-pays, and out-of-network penalties, amounts of COB, pre-existing, capitation payments and other cutbacks
- ii. Dates: incurred, entered, and paid
- iii. Array of ICD-9 diagnoses (5+) for each claim
- iv. CPT code for each claim
- v. Provider type (e.g., physical therapist, clinical psychologist, cardiologist)
- vi. For confinement in any sort of inpatient facility, include partial bills, DRG for inpatient hospital, admission and discharge dates, partial/final bill indicator
- vii. Service type/location (e.g., ER, surgicenter, home)
- viii. Amount of subrogation
- ix. Type of payment (e.g., fee for service or capitation)
- 4. Pharmacy data (claim-level)
- a. Person/company identifier
- b. National Drug Code or other classification
- c. Date of prescription
- d. Number of units, dose of units, and number of units/day (if available)
- e. Billed charges, discounted charges, and payments
- 5. Capitation payments, if germane
- a. Geographic locale or market
- b. Provider type
- c. Amount and dates
- d. Method for payment (e.g., per member per month)
- The models can be built without pharmacy data if that is not covered by the insurance. Enrollment and medical claims data are required. Many of the group-level variables are desirable, but optional. The data format would specify the dates for the beginning of the base period and the end of the next period or new base period to be used for the cost forecast for pricing. Because the data may originate from a variety of different databases and sources, control totals (e.g., number of records, sums of fields) are also included, to assure that the data is excerpted and formatted properly. The customer or TPA may provide a layout or format for the data, because a specific format is not required. The layout or other documentation should, however, describe all of the legitimate values for the variables and the meaning of those values (e.g., provider type=3=physician).
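The control-total check mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the field names, the record layout, and the declared totals are hypothetical, and a production version would likely use exact decimal arithmetic for money.

```python
def control_totals(records, sum_fields):
    """Count records and sum the named numeric fields (rounded to cents)."""
    totals = {"record_count": len(records)}
    for field in sum_fields:
        totals[field] = round(sum(r[field] for r in records), 2)
    return totals

def mismatched_totals(received, declared):
    """Return {name: (declared, received)} for every total that differs."""
    return {
        name: (value, received.get(name))
        for name, value in declared.items()
        if received.get(name) != value
    }

# Hypothetical claim extract and the totals the data provider declared.
claims = [
    {"person_id": "G1-F1-01", "billed": 120.50, "paid": 96.40},
    {"person_id": "G1-F1-02", "billed": 80.00, "paid": 64.00},
]
declared = {"record_count": 2, "billed": 200.50, "paid": 160.40}
received = control_totals(claims, ["billed", "paid"])
print(mismatched_totals(received, declared))  # empty dict when the extract is complete
```

Any non-empty result points to records lost or altered during extraction and formatting, which is the condition the control totals are meant to catch.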
- As shown in
block 404 of FIG. 4, the customer or TPA sends a layout and a sample database, so that tests can be run prior to extracting all of the data. Valid ranges of variables are checked as shown in block 406. Control totals are matched, and encrypted IDs may be tested. The data need not be aggregated and tested since it is a small subset of the data universe, but the conformity of the sample data to the layout is checked.
- If the database is accurate, the entire universe of data is processed, as shown in
block 408. - If the database and layout do not correspond or there are data values outside of the range of legitimate values, the data extraction program or layout are fixed and another sample data set or layout is tested.
- The dates for the model development overall, and the base period for actual cost forecasting and pricing are established and defined, and the respective dates for each respective group have been set prior to the data request. Now the dates for each group must be determined for its inclusion in the universe of the model development.
- As shown in
FIG. 5, the process is perhaps easiest to understand by working it backwards. A list is developed for the renewal dates for the first year of coverage that would have prospective prices set using this method, as shown in block 502.
- The following Table 2 lists an example of time sequencing for developing models and implementing cost predictor models.
-
TABLE 2. Time Sequences for Preparing and Implementing Cost Prediction Models
Column A, Model Development (number of consecutive calendar months):
  12: Base Period Data
  3(b): Lag in Data
  12: Next Period
  3(b): Model Weight (Re)calibration
Column B(a), Model Implementation for Predicting Costs and Setting Prospective Prices (in sequence):
  Underwriting Period Data
  Forecast Cost, Incorporate Inflation Forecast and Set Premium
  Policy Period (12 months)
(a) Column B pertains to groups which have the same renewal date (e.g., January 1).
(b) Periods greater than 3 months may be required for these phases, depending on the client's needs.
- The groups need to get a price in advance of the coverage date for new customers, or the renewal date for existing customers, to accept or reject it prior to the renewal coverage. Additionally, time for receiving data from the client or a TPA and analyzing it must be added to the lag period. A three month lag period, which we have used, may be used in
processing block 504, but it could be longer or shorter depending on database and business needs. - As shown in
block 506, the beginning of the lag period is the last date that bills can be paid for the base period of the model development period. Otherwise, the cost forecasting model would include information that would not be available in the future. The lag period information (claims paid or made) need not be used to provide an accurate cost forecast for a future time period for a particular group. The claims incurred during the next period is the dependent variable for the model of the illustrated embodiment. An estimate of claims incurred but not reported may be added on if there is insufficient time for a proper run-out period (i.e., if only one base period and next period are used for model development). The lag period precedes the next period and the base period is typically the year preceding the beginning of the lag period in the universe of model development. - Table 2 illustrates one example of timing for the processing of
block 508. Column A represents the model development period and Column B represents timing for the application of cost forecasting and prospective pricing. The model development time period precedes the actual pricing period but there is overlap since the next period of the model development period is used as part of the underwriting period for the application of cost forecasting and the pricing model. The timeline will be modified when longer lag periods are required. Column B pertains to groups with the same renewal date. Alternate flowcharts may be used to represent each renewal date. - Illustrated in
FIG. 6 is a flowchart illustrating data validation and standardization procedures for the steps of FIGS. 1-3.
- Preliminary data validation checks are performed, together with initial data preparation as a second set of data checks, as shown in
block 602. Utilizing a file structure that allows standards to be compared to the data prior to data aggregation facilitates this procedure.
- As shown for processing by
block 604, medical claims include diagnoses that are typically coded in ICD-9-CM codes, procedures that are coded in CPT codes, prescriptions that are coded using NDC codes, hospitalizations coded using DRGs, ICD-9-CM and other codes, that may appear on claims. Tables are developed that contain the values for all of these codes. These tables are standards for comparison with the customer's data and the values in the data must correspond to valid values for these coding systems. - As shown in
block 606, tables are made for each client, because the place of service, type of provider, dates, and other fields on the claims and enrollment data will frequently have values that are idiosyncratic to a particular database or customer. - The values should preferably be put in a table format that will allow checking and standardizing the data for accuracy, as shown in
block 606. - As shown in 608, the time periods at the group-level (see TABLE 2) may be used to screen if claims and insureds should be in the universe. A table is used for comparison. Prior experience permits the development of norms that can be used to check the data for reasonableness. Examples include the charge and payment per claim, the number of claims per person, and other norms. These values are put into a table for comparison, and processing in
block 610. - Preparation (see block 612) of the raw data involves the same data process steps used in
FIG. 4 , utilizing specified read programs. - The data (see block 614) are provided in the agreed upon medium, the data are read and control totals are checked, see
block 616. If errors are noted, the cause is determined and corrected.
- The raw data are reformatted, see 618, into a SAS database in the illustrated embodiment. Other database software (e.g., SPSS, Oracle, etc.) that is also capable of handling large scale databases could be used.
- In subsequent process steps as shown in
FIG. 6 , the fields are reformatted (see 620) so that the values correspond to the standard tables, the group-level time period (see TABLE 2) tables are used to extract, see 622, the universe of relevant claims and insured people, and claims for people that are not in the model development universe are put into a separate file (see 624). Data following the model development universe time period may fit into the underwriting period data that will be used for the application of cost forecasting and pricing. - The claims and enrollment data from the model development universe are compared, (see 626) to the standards. A decision is made, see 628, whether the data are in compliance with the standards.
- Data that do not match the standards are put, see 630, into a separate file. The cause of the mismatches is evaluated, and the data are deleted or corrected where appropriate. Records may need to be sent back to the customer for replacement or fixing. If there is a large number of mismatches, they must be fixed prior to aggregation.
- The records that match the standards need to be matched and merged, see 632, into person-level summaries. Incomplete data should not be aggregated as it will be misleading.
-
FIG. 7 is a flowchart illustrating the matching and merging (integration) of data in the process steps 102, 202 or 302 of FIGS. 1-3.
- In order to match and merge the enrollment and claims data, there needs to be a unique identifier for the group, the family within the group, and the enrollee or dependent within the family, as indicated in the processing of
block 702. The social security number or other identifier is encrypted so that actual people cannot be identified and group numbers are used instead of company names. Street addresses are not used so the people cannot be personally identified. However, records need to be linked for accurate models and pricing. One linking system that is effective uses the group ID as a prefix, encrypted social security number of the enrollee as the family ID, and enrollee or dependent number as the person ID. Birth dates and sex are useful as checks on the ID. - As shown in processing blocks 704, 706, the claims data are prepared separately, and a look-up table is generated that lists the group, family, person ID for all claims with the respective birth date and sex.
- In accordance with
processing blocks - The processing for the respective blocks of
FIG. 7 are described as follows: - 712 The tables are merged and compared. The claims table should be a subset of the enrollment table. Claim IDs that do not match enrollment IDs indicate an error. These claims are put into a separate file and manually analyzed.
- 714 The claims records that match enrollment records are merged together into one long variable length record.
- 716 The person-level merged file contains the enrollment information and claim information, but the record is not aggregated.
- 718 A flag is assigned to people that have claims and enrollment information since these records will require aggregation.
- 720 A flag is assigned to people that do not have any claims since their record does not require aggregation.
- 722 Additional data validation checks occur such as the number of insureds per group and the percentage of people within each group that have no claims.
- 724 If there are aberrations in the data, there is a manual review. If that does not fix the problem, the errors are reviewed with the customer.
- 726 The data are valid and ready to transform into the analytic database.
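The matching and merging steps 712 through 720 can be illustrated with the following Python sketch. It is a simplified illustration only: the record layout, field names and sample identifiers are hypothetical, and the three-part key follows the group, family, person linking scheme described above.

```python
def person_key(record):
    """Composite key: group ID prefix, encrypted family ID, person number."""
    return (record["group"], record["family"], record["person"])

def match_and_merge(enrollment, claims):
    """Attach claims to enrollees; set aside claims that match no enrollee."""
    people = {person_key(e): {**e, "claims": []} for e in enrollment}
    unmatched = []
    for claim in claims:
        person = people.get(person_key(claim))
        # Birth date and sex act as a secondary check on the ID match.
        if person and (person["birth"], person["sex"]) == (claim["birth"], claim["sex"]):
            person["claims"].append(claim)
        else:
            unmatched.append(claim)  # held in a separate file for manual review
    for person in people.values():
        # Flag records with claims, since only those require aggregation.
        person["needs_aggregation"] = bool(person["claims"])
    return people, unmatched

# Hypothetical records; the family values stand in for encrypted identifiers.
enrollment = [
    {"group": "G1", "family": "x9f2", "person": "01", "birth": "1970-03-02", "sex": "F"},
    {"group": "G1", "family": "x9f2", "person": "02", "birth": "2001-07-15", "sex": "M"},
]
claims = [
    {"group": "G1", "family": "x9f2", "person": "01", "birth": "1970-03-02", "sex": "F", "paid": 250.0},
    {"group": "G1", "family": "zz00", "person": "01", "birth": "1980-01-01", "sex": "M", "paid": 90.0},
]
people, unmatched = match_and_merge(enrollment, claims)
print(len(unmatched), "claim(s) need manual review")
```

Group-level checks such as the share of people with no claims (step 722) can then be computed from the `needs_aggregation` flags.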
-
FIG. 8 is a flowchart illustrating the aggregation and risk factor coding for the steps of FIGS. 1-3. The respective processing blocks of FIG. 8 are described as follows:
- 802 The claims data are sorted by person ID by incurred date of the claim.
- 803 This sort allows for a final screening on the chronological eligibility. A person in the group typically needs to have at least one day of eligibility in the base period and next period and continuous eligibility between those dates. Otherwise, they are dropped from the modeling database. If a person loses eligibility prior to next period, he or she is dropped from the entire analytic database. If the person enrolls in the lag period, that person is kept in a separate analytic database. This last category of people will have their next period payments compared to those of similar demographics. If a person is enrolled in the base period and disenrolls during the next period, those people are put into a separate file in the analytic database. Their next period payments will be compared to people with the same characteristics that did not leave in the next period. People in other time sequences may be dropped from the analytic database.
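The chronological eligibility screen of block 803 might be sketched as below. This is one possible reading, not the embodiment's code: the category labels and function name are hypothetical, and the sketch assumes each person has a single continuous eligibility span.

```python
from datetime import date

def classify_enrollee(enrolled, terminated, base, lag, nxt):
    """Assign a person to a file based on one continuous eligibility span.

    `terminated` is None for people still enrolled; each period is an
    inclusive (start, end) pair of dates.
    """
    last_day = terminated or date.max
    if last_day < nxt[0]:
        return "dropped"            # lost eligibility before the next period
    if lag[0] <= enrolled <= lag[1]:
        return "joined_in_lag"      # kept in a separate analytic database
    if enrolled <= base[1]:
        if last_day < nxt[1]:
            return "left_in_next"   # separate file, compared with those who stayed
        return "model"              # eligible for the modeling database
    return "dropped"                # other time sequences

base = (date(2006, 10, 1), date(2007, 9, 30))
lag = (date(2007, 10, 1), date(2007, 12, 31))
nxt = (date(2008, 1, 1), date(2008, 12, 31))
print(classify_enrollee(date(2005, 1, 1), None, base, lag, nxt))
print(classify_enrollee(date(2007, 11, 1), None, base, lag, nxt))
print(classify_enrollee(date(2006, 1, 1), date(2008, 6, 30), base, lag, nxt))
```

People with gaps in coverage would need one span per enrollment episode and a continuity check across spans, which this sketch omits.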
- 804 A new record is produced for each person. It includes the enrollment data and information extracted from the claims, when available. The risk factors use ICD-9-CM codes, CPT codes, place of service, provider type, demographic data, and other variables (see risk factor listing in Appendix G). As the records for a person are read, the ICD-9-CM diagnosis codes, CPT codes and other variables that are used to define the risk factors are extracted from the claim records. The new record is a vector of variables that are initialized to zero and then incremented by one when that variable is read in the claims. These variables are coded from claims from the base period only. Payments and charges are summed for the base period, lag period, and next period. It is important to compare the expected cost from the forecasting model with the actual cost next period of those that were not in the modeling universe. If there are large discrepancies, the model may need adjustment.
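The person-level aggregation of block 804 can be sketched as a vector of counters plus payment sums by period. The mapping from ICD-9-CM code prefixes to risk factor names below is a hypothetical two-entry stand-in for the definitions in Appendix G, and the claim layout is likewise an assumption.

```python
from datetime import date

# Hypothetical mapping from ICD-9-CM category (first three characters) to a
# risk factor name; the actual definitions are those of Appendix G.
RISK_FACTOR_PREFIXES = {"250": "diabetes", "410": "acute_mi"}

def period_of(day, periods):
    """Return the name of the period containing `day`, or None."""
    for name, (start, end) in periods.items():
        if start <= day <= end:
            return name
    return None

def aggregate_person(claims, periods):
    """Build one record: risk factor counts (base period only) and paid sums."""
    record = {name: 0 for name in RISK_FACTOR_PREFIXES.values()}
    record.update({f"paid_{name}": 0.0 for name in periods})
    for claim in sorted(claims, key=lambda c: c["incurred"]):
        period = period_of(claim["incurred"], periods)
        if period is None:
            continue
        record[f"paid_{period}"] += claim["paid"]
        if period == "base":  # risk factors are coded from base period claims only
            for code in claim["diagnoses"]:
                factor = RISK_FACTOR_PREFIXES.get(code[:3])
                if factor:
                    record[factor] += 1
    return record

periods = {
    "base": (date(2006, 10, 1), date(2007, 9, 30)),
    "lag": (date(2007, 10, 1), date(2007, 12, 31)),
    "next": (date(2008, 1, 1), date(2008, 12, 31)),
}
claims = [
    {"incurred": date(2007, 3, 5), "paid": 100.0, "diagnoses": ["250.00", "401.9"]},
    {"incurred": date(2007, 11, 2), "paid": 50.0, "diagnoses": ["410.9"]},
    {"incurred": date(2008, 4, 9), "paid": 400.0, "diagnoses": ["250.00"]},
]
record = aggregate_person(claims, periods)
print(record)
```

Note that the lag period claim contributes to the payment sum but not to the risk factor counts, matching the base-period-only coding described above.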
- 806 The risk factors are then coded by processing the information on each person's aggregated record (See Appendix G). Risk factors were developed using a combination of expert medical opinion, statistical analyses, and knowledge of the medical insurance market. Diagnoses are divided into diseases and conditions and by inherent risk. Procedures are divided by body system, type of test, type of procedure, and type and site of care. Other risk factors are designed based on the relationship to the enrollee, family composition and demographics. There is a trade-off between a very specific risk factor that has very few but very homogeneous people in it and broad risk factors that have heterogeneous people in them. Correlations with the next period's payments and regression models are two ways to determine if a risk factor is worthwhile empirically. The base period charges and payments plus the shape of relative amounts of those payments by month, day, or other amount of time are some of the strongest risk factors (See TABLE 4). The amount of time enrolled in the base period is another risk factor. The key is developing robust risk factors that are not too heterogeneous. A priori logic plus trial and error are useful approaches. TABLE 5 illustrates two family composition risk factors. Our candidate risk factor codes are listed in detail in Appendix G: Risk Factors.
-
TABLE 4 Risk Factors for person level experience
Hibymos1: The maximum cost per day for any month for the base period
Hibymos2: The 2nd highest cost per day for any month for the base period
Hibych2a: (1, 0) 1 = The second highest month cost per day is adjacent to the highest month
Hibych2b: (1, 0) 1 = The second highest month cost per day is not adjacent to the highest month
Hi1dvby: The index of highest cost per day divided by average cost per day per month
Hi2dvby: The index of 2nd highest cost per day divided by average cost per day per month
Tenmoch: Average cost per day over all months in the base period excluding the 2 highest months
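- The TABLE 4 risk factors can be derived from a person's monthly cost per day. The sketch below is one possible reading of the table, under stated assumptions: the input is a list of cost-per-day values in calendar order, "adjacent" means the two peak months are consecutive in that list, and the average used in the two indexes is the plain mean of the monthly values. The function name is hypothetical.

```python
def experience_risk_factors(cost_per_day_by_month):
    """Compute TABLE 4 style factors from monthly cost-per-day values."""
    months = list(cost_per_day_by_month)
    if len(months) < 3:
        raise ValueError("need at least three months of data")
    # Month positions ordered from highest to lowest cost per day.
    order = sorted(range(len(months)), key=lambda i: months[i], reverse=True)
    top, second = order[0], order[1]
    average = sum(months) / len(months)
    adjacent = abs(top - second) == 1
    remaining = [months[i] for i in order[2:]]
    return {
        "hibymos1": months[top],
        "hibymos2": months[second],
        "hibych2a": 1 if adjacent else 0,
        "hibych2b": 0 if adjacent else 1,
        # Assumption: the indexes are set to zero when there is no cost at all.
        "hi1dvby": months[top] / average if average else 0.0,
        "hi2dvby": months[second] / average if average else 0.0,
        "tenmoch": sum(remaining) / len(remaining),
    }


monthly = [10.0, 12.0, 50.0, 40.0, 8.0, 9.0, 11.0, 10.0, 10.0, 10.0, 10.0, 10.0]
factors = experience_risk_factors(monthly)
```

In this example the two peak months (50 and 40 dollars per day) are consecutive, so `hibych2a` is 1, and the remaining ten months average 10 dollars per day.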
TABLE 5 Risk Factors - Family Composition
Ensxkd: Combines Employee Relationship and Gender of Enrollee
  Employee Relationship: '1' = 'A Enrollee', '2' = 'B Spouse', '3' = 'C Son', '4' = 'D Daughter', '5' = 'E Stepson', '6' = 'F Stepdaughter', '7' = 'G Other Male', '8' = 'H Other Female', '9' = 'I Surv Spouse'
  Gender of Enrollee: 'M' = 'male', 'F' = 'female'
  Values for ensxkd: 1 = Enrollee, Male; 2 = Enrollee, Female; 3 = Spouse, Male; 4 = Spouse, Female; 5 = Son, Daughter, Stepson or Stepdaughter; 6 = Other Female or Surviving Spouse
kid1_3: Count of the number of children in a family: 0 = no children; 1; 2; or 3 or more children
- Some insurance plans are paid on the basis of a combination of fee for service (FFS) payments and capitation payments. The previous discussion has assumed a FFS payment system. If the combination or hybrid payment system is used, then adjustments for capitation payments must be made at the person and group levels. We recommend developing risk factors as dummy variables when there are capitation payments for particular provider types (e.g., primary care, obgyn). This is especially important when the capitation coverage is not consistent across groups or geographic regions.
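- The two family composition factors of TABLE 5 can be coded along the following lines. This is a sketch under assumptions: the function names are hypothetical, relationship and gender arrive as the single-character codes shown in the table, and any combination the table does not list (for example relationship '7', Other Male) is returned as `None` rather than guessed.

```python
def code_ensxkd(relationship, gender):
    """Map relationship and gender codes to the ensxkd values in TABLE 5."""
    if relationship == "1":  # Enrollee
        return {"M": 1, "F": 2}.get(gender)
    if relationship == "2":  # Spouse
        return {"M": 3, "F": 4}.get(gender)
    if relationship in ("3", "4", "5", "6"):  # Son, Daughter, Stepson, Stepdaughter
        return 5
    if relationship in ("8", "9"):  # Other Female, Surviving Spouse
        return 6
    return None  # not listed in TABLE 5


def code_kid1_3(number_of_children):
    """Cap the child count at 3, which stands for '3 or more'."""
    return min(max(number_of_children, 0), 3)
```

For example, `code_ensxkd("2", "M")` gives 3 (Spouse, Male) and `code_kid1_3(5)` gives 3.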
- 808 Validation checks can now be made on person-level data. Frequency counts for dichotomous or categorical variables are prepared and compared among groups, geographic areas, and time periods, as well as against norms. Missing value percentages are calculated by group, time period and geographic area for each risk factor. The mean number of claims per day and mean dollars per claim (this can be Winsorized) are calculated by group, time period and geographic region. Large discrepancies in the number of claims or the average claim size are reviewed and analyzed to uncover data errors. The ratio of charges to payments is calculated by group, time period, and geographic region and compared with norms.
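- Two of the validation checks above, the missing value percentage and the ratio of charges to payments by group, can be sketched as follows. The record layout (keys `group`, `charges`, `payments`, with `None` marking a missing risk factor) and the function name are assumptions for illustration only. The same tabulation could be repeated by time period or geographic area.

```python
def validate_by_group(people, risk_factor):
    """Summarize missing values and the charge-to-payment ratio per group."""
    totals = {}
    for person in people:
        t = totals.setdefault(
            person["group"],
            {"n": 0, "missing": 0, "charges": 0.0, "payments": 0.0},
        )
        t["n"] += 1
        if person.get(risk_factor) is None:
            t["missing"] += 1
        t["charges"] += person["charges"]
        t["payments"] += person["payments"]
    summary = {}
    for group, t in totals.items():
        summary[group] = {
            "missing_pct": 100.0 * t["missing"] / t["n"],
            # The ratio is left undefined when a group has no payments.
            "charge_to_payment": (
                t["charges"] / t["payments"] if t["payments"] else None
            ),
        }
    return summary


people = [
    {"group": "A", "age": 34, "charges": 300.0, "payments": 150.0},
    {"group": "A", "age": None, "charges": 100.0, "payments": 50.0},
    {"group": "B", "age": 51, "charges": 0.0, "payments": 0.0},
]
summary = validate_by_group(people, "age")
```

Each group's figures would then be compared with the other groups and with norms to spot aberrant results.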
- 810 Aberrant results are evaluated to determine if there is an error. If data cannot be corrected or replaced, those people are dropped from the model universe.
- 812 The remaining records form the model universe, which is ready for final preparation for analysis.
-
FIG. 9 is a flowchart of processing steps for developing cost forecasting models based on “inlier” data in steps of FIGS. 1-3. Processing blocks of FIG. 9 are described as follows:
- 901 A clean analytic database is required as the modeling universe. Otherwise, spurious results will lead to idiosyncratic, unreliable models or, at best, weakly predictive models.
- 902 The modeling universe database is separated into Winsorized data (i.e., inliers) and the outlier data. There is an “inlier” model with the dependent variable Winsorized and an “outlier” model that uses the difference between the actual claims next period and their Winsorized values. The independent variables are similar for the inliers and outliers. It has been found that models are more accurate when average payments per day are used as the dependent variable and average charges per day (and components of them, such as the average charge per day in the lowest ten months) are used as predictor variables. Cost per day adjusts for persons not enrolled for a complete year.
- The Winsorization point is typically set at the value that marks off the top 5% of payments per day (i.e., the 95th percentile). If that value is $55 per day, then the inlier model uses a value of $55 per day as the dependent variable for people with greater than or equal to $55 per day in payments. People with under $55 per day in payments do not have their dependent variable changed.
- The database for the outlier models flags people with next period payments greater than or equal to the Winsorization value (e.g., $55 per day). If they are at or over the Winsorization amount, the flag equals one and zero otherwise. Also, the actual payments per day next period less the Winsorization amount is calculated. If it is negative, the outlier payment is set to zero.
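- The split of next period payments into an inlier value, an outlier flag, and an outlier amount can be sketched as below. The function name is hypothetical, and the $55 per day Winsorization point is simply the example value used in the text.

```python
def split_at_winsorization_point(payment_per_day, point):
    """Return (inlier value, outlier flag, outlier amount) for one person."""
    # Inlier dependent variable: payments capped at the Winsorization point.
    inlier = min(payment_per_day, point)
    # Flag is one when payments are at or over the Winsorization point.
    outlier_flag = 1 if payment_per_day >= point else 0
    # Amount above the point; negative differences are set to zero.
    outlier_amount = max(payment_per_day - point, 0.0)
    return inlier, outlier_flag, outlier_amount


high = split_at_winsorization_point(80.0, 55.0)
at_point = split_at_winsorization_point(55.0, 55.0)
low = split_at_winsorization_point(20.0, 55.0)
```

For every person the inlier value and the outlier amount add back up to the actual payment per day, which is what allows the two model outputs to be summed later.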
- 903 The Winsorized modeling universe database is separated into two separate components: those individuals with claims in the base period and those individuals without claims in the base period. Those without claims have only demographic risk factors whereas those people with claims have a payment history and clinical information as additional risk factors. Those without claims are on average lower in risk than those with claims.
- 904 The no claims database includes demographic variables, such as age and the family relationship to the enrollee plus risk factors from the enrollment file.
- 906 People with claims in the base period also have the enrollment file risk factors plus those risk factors derived from the claims file.
- An example of a program segment to run OLS regression model on inlier with claims data is as follows:
-
*** '5th root of winsorized cost is DEP measure';
***OLS MODEL;
proc reg data='DATA WITH CLAIMS' outest='OLS 1st MODEL FOR LAD CART';
  exp9olsd : model w5_6850= ensagen sq5chg1 sq5chg2a sq5chg2b sq5oth agesq
      h5bchg1 h5bchg2a h5bchg2b ten5moch zeroa zerob zerooth enrldayb
      hibymos1 hibymos2 hi1dvby hi2dvby
      / selection=stepwise selection=backward details;
run;
proc score data='DATA WITH CLAIMS' score='OLS 1st MODEL FOR LAD CART'
    out='DATA WITH CLAIMS' type=PARMS predict;
  var ensagen sq5chg1 sq5chg2a sq5chg2b sq5oth agesq
      h5bchg1 h5bchg2a h5bchg2b ten5moch zeroa zerob zerooth enrldayb
      hibymos1 hibymos2 hi1dvby hi2dvby;
run;
***CHECK RESULTS;
proc means data='DATA WITH CLAIMS';
  class modeled;
  var w5_6850 exp9olsd;
proc corr data='DATA WITH CLAIMS';
  var w5_6850 exp9olsd;
  where modeled eq 'YES';
- 908 The initial person-level model for people with claims uses the continuous independent variables only. Examples include the age, number of days enrolled in the base period, charges in the peak spending month, and average charge per day in the lowest ten months. The dependent variable is the Winsorized payment per day (or a transformation of it such as the fifth root) in the next period. An ordinary least squares (OLS) model has been used. Other forms of regression models (e.g., median or robust) or neural networks could be used. The example given in the software in the CD-ROM Appendix does not include this step, but the program above does provide an example. This step can be important when there are several numerical candidate predictor variables.
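- The idea of step 908, regressing a fifth-root transform of the Winsorized payment per day on continuous predictors, can be illustrated with a single predictor. This sketch is a simplification and not the SAS program above: it uses one hypothetical predictor instead of the full variable list, fits by closed-form least squares, and back-transforms by raising the fitted value to the fifth power, which is a naive choice made only for illustration.

```python
def fit_simple_ols(x, y):
    """Closed-form least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: one continuous predictor and Winsorized next period payments.
predictor = [0.0, 1.0, 2.0]
winsorized_next_payment = [0.0, 1.0, 32.0]

# Dependent variable: fifth root of the Winsorized payment per day.
fifth_root = [p ** 0.2 for p in winsorized_next_payment]
intercept, slope = fit_simple_ols(predictor, fifth_root)

# Expected value on the fifth-root scale for a new predictor value of 3.0,
# then a naive back-transformation to dollars per day.
expected_root = intercept + slope * 3.0
expected_payment = expected_root ** 5
```

The fitted value on the transformed scale plays the role of the expectation that is passed on to the regression tree in the next step.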
- 910 The expected payments per day from the previous step are used as an input to the next model along with the categorical variables (e.g., sex, site of care, diagnosis, etc.). We have found that a regression tree is a very effective method for capturing the interactions between the clinical variables and the amount charged in the base period. The CART software with the median regression tree option has produced the best results to date. Other forms of data mining (e.g., rule induction, clustering, genetic algorithms, neural networks) could also be used. The key is to capture the interactions between base period charges and both clinical and demographic risk factors. An example of a program to run a CART median regression tree using expectations created from the OLS regression (see 908) and other risk factors is found in Appendix A.
- 912 A CART median regression tree or other data mining technique is used to model the “no claims” Winsorized database. The first model (i.e., the one for continuous variables used in 908) is omitted since none of the continuous variables derived from claims are available for this universe other than age or length of enrollment. This model uses the same statistical techniques as 910 but its independent variables are limited to those that can be derived from the enrollment file. The output from the regression tree (terminal nodes) identifies groupings of people that have homogeneous next period payments.
- 914 Each regression tree terminal node groups people with similar median payments next period. A set of dummy variables is developed that identify people in each terminal node. These dummy variables, the variables that were used to form the dummy variables, and the significant variables from 908 are entered into a final prediction model. We have used OLS, but other techniques, such as median or robust regression, neural networks or other modeling methods could be used instead. The result of those models is an expected payment per person per day in the next period. This only includes the Winsorized portion of the payments for people with claims in the base period. An example of a program to run OLS regression using terminal nodes from the regression tree and other important risk factors from the tree (see 910 and Appendix A) is found in Appendix B.
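- Turning terminal node assignments into dummy variables for the final prediction model can be sketched as follows. The function name and the `td` prefix are hypothetical. Dropping the lowest-numbered node as the reference category is an assumption made here so that the dummies are not collinear with a regression intercept.

```python
def terminal_node_dummies(node_ids, prefix="td", drop_reference=True):
    """Build one dict of 0/1 dummy variables per person from node ids."""
    levels = sorted(set(node_ids))
    if drop_reference:
        levels = levels[1:]  # lowest node id serves as the reference category
    return [
        {f"{prefix}{level}": 1 if node == level else 0 for level in levels}
        for node in node_ids
    ]


node_ids = [1, 2, 2, 3]  # terminal node assigned to each of four people
rows = terminal_node_dummies(node_ids)
```

A person in the reference node has all dummies equal to zero, and every other person has exactly one dummy equal to one.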
- 916 The same technique as 914 is applied to the model output from 912. The result of this model is the expected payments per day for next period for people that do not have claims in the base period.
- 918 Model testing can be done at this point or after each step in the modeling process (i.e., after 908, 910, and 914 for models for people with claims). It is probably more efficient to test after the final step. There are five criteria that are used in model evaluation in the illustrated embodiment: the mean absolute residual, r2, accuracy measure (previously defined), bias, and cross validation. Mean absolute residual, accuracy measure (previously defined) and r2 are related to the accuracy of the forecast. Bias refers to systematic over or under prediction when cases are sorted by their expected value. Regression models can be biased but regression trees are not biased. Cross validation refers to the accuracy of the models when they are applied to different sets of data. The tree software tests for cross validation. Hold-out samples can be used for testing the entire hybrid models. An example of a program to run bias test, mean absolute residual, and r2 analyses (examples of model testing) is found in Appendix C.
- 920 The same tests of the quality of the models are applied to the models developed on people without claims in the base period. The model tests are probably most efficiently applied after the final model is developed (i.e., 920). These models will have far less predictive accuracy than the models covering people with base period claims since there are fewer risk factors and the variability in next period's payments is not very predictable.
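- Three of the tests named in 918 can be sketched in a few lines. The definitions used here are assumptions: r2 is computed as one minus the ratio of residual to total sum of squares, and bias is reported as the mean of actual minus expected within equal-sized bins of cases sorted by expected value. The accuracy measure defined earlier in the document and cross validation are not covered.

```python
def mean_absolute_residual(actual, expected):
    return sum(abs(a - e) for a, e in zip(actual, expected)) / len(actual)


def r_squared(actual, expected):
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - e) ** 2 for a, e in zip(actual, expected))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def bias_by_bin(actual, expected, bins):
    """Mean (actual - expected) per bin, with cases sorted by expected value."""
    pairs = sorted(zip(expected, actual))
    n = len(pairs)
    result = []
    for b in range(bins):
        chunk = pairs[b * n // bins:(b + 1) * n // bins]
        if chunk:
            result.append(sum(a - e for e, a in chunk) / len(chunk))
    return result


actual = [10.0, 20.0, 30.0, 40.0]
expected = [12.0, 18.0, 33.0, 37.0]
mar = mean_absolute_residual(actual, expected)
r2 = r_squared(actual, expected)
bias = bias_by_bin(actual, expected, bins=2)
```

A bin mean that is consistently positive or negative at one end of the expected-value range would indicate systematic under or over prediction there.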
-
FIG. 10 is a detailed flowchart of process steps for developing cost forecasting models based on “outlier” data of the Winsorized data for the steps of FIGS. 1-3. The illustrated processing blocks of FIG. 10 are described as follows:
- 1002 The outlier database has next period's payments of zero for everybody whose payments were below the Winsorization point and the amount above the Winsorization point for everybody else. The outliers can have very high cost per day so the variability is very large. Therefore, we have chosen to model the outlier portion separately. This two-step approach leads to more accurate and stable results since the extreme outliers are almost impossible to predict accurately.
- 1004 People with base period claims are modeled separately as they have risk factors not available with people without base period claims (e.g., diagnosis and amount charged).
- 1006 People with no base period claims are modeled separately since they only have risk factors available from the enrollment file.
- 1008 The same continuous risk factors available for 908 are used to model the probability of these people having payments above the Winsorization point. The dependent variable is 1 if the total amount of next period's payment is above the Winsorization point or zero otherwise. A logistic regression is used to estimate the probability of each person's next period's payments exceeding the Winsorization point. Other types of regression models (median or robust), neural networks, or other predictive modeling can be used instead of logistic regressions.
- A program to run logistic regression probability model on outliers with claims follows.
-
**HILO is the 1=Outlier, 0=Inlier;
proc logistic data='DATA WITH CLAIMS' outest='LOGISTIC WEIGHTS';
  exphilo : model HILO=ensagen sq5chg1 sq5chg2a sq5chg2b sq5oth agesq
      h5bchg1 h5bchg2a h5bchg2b ten5moch zeroa zerob zerooth enrldayb
      hibymos1 hibymos2 hi1dvby hi2dvby;
proc score data='DATA WITH CLAIMS' score='LOGISTIC WEIGHTS'
    out='DATA WITH CLAIMS' type=PARMS predict;
  var ensagen sq5chg1 sq5chg2a sq5chg2b sq5oth agesq
      h5bchg1 h5bchg2a h5bchg2b ten5moch zeroa zerob zerooth enrldayb
      hibymos1 hibymos2 hi1dvby hi2dvby;
run;
data 'DATA WITH CLAIMS';
  set 'DATA WITH CLAIMS';
  exphilo=exphilo*'mean of outliers';
- 1010 The model is tested for accuracy using the criteria described in 918. Note that the probability of each person being an outlier is being modeled rather than classifying each person as an outlier or not an outlier. All of the techniques from processing block 918 of FIG. 9 are applicable.
- 1012 A regression tree is used to refine the estimated probability of being an outlier. The dependent variable is the same as in 1008. We recommend a least squares regression tree, but other types of predictive models that capture interactions could be used (e.g., neural network, rule induction or genetic algorithms). The expected value from the logistic regression plus all of the categorical risk factors from the claims data and enrollment file are used as candidate independent variables (see 910). The output is the set of terminal nodes of a least squares regression tree, each of which has a homogeneous probability of being an outlier. The probability for each person is determined by that person's terminal node. Note that this is not a classification tree.
- A program to run a CART least squares probability tree on outlier with claims data using expectations from the logistic regression (see 1008) and other risk factors is found in Appendix D.
- 1014 The same methods are applied to the people with no claims data (see 1012). The output is a set of groupings of people with homogeneous probabilities of being an outlier.
- 1016 and 1018 The models are tested for accuracy, bias and cross validation as the models were tested in 918.
- 1017 and 1019 The terminal nodes and risk factors defining those terminal nodes are used as input into another logistic regression or other forecasting technique (see 914 and 916). The examples in Appendix E are for 1017 since it includes data from claims.
- 1020 and 1022 For each terminal node, the median payment above the Winsorization point next period is calculated. When the medians are not significantly different, the terminal nodes (mean above the Winsorization point) are combined for additional stability. Note that the probabilities are not combined. The means are calculated arithmetically for the people in the combined terminal nodes and for those kept in separate nodes due to their distinctive median dollar costs. The means are then multiplied by the respective probabilities for each person giving the expected outlier payments for each person. The probability from the logistic regression (see 1017 and 1019) is used rather than from the regression tree. People are “tagged” with their respective terminal nodes (see 1012 and 1014) so that the correct mean is multiplied by the probability.
- 1024 The inlier Winsorized cost forecast and the expected cost of the outlier portion are summed to give the total expected cost for next period.
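- Blocks 1020 through 1024 reduce to a short calculation per person: the outlier probability is multiplied by the mean amount above the Winsorization point for the person's terminal node (or combined group of nodes), and the result is added to the inlier forecast. The sketch below uses hypothetical names and values.

```python
def expected_total_per_day(expected_inlier, outlier_probability, node,
                           mean_excess_by_node):
    """Expected next period payment per day for one person."""
    # Expected outlier dollars: probability times the mean amount above
    # the Winsorization point for the person's (possibly combined) node.
    expected_outlier = outlier_probability * mean_excess_by_node[node]
    return expected_inlier + expected_outlier


# Hypothetical means above the Winsorization point, keyed by node tag.
mean_excess_by_node = {"n1": 20.0, "n3": 40.0}
total = expected_total_per_day(12.0, 0.25, "n3", mean_excess_by_node)
```

Here a person with an inlier forecast of $12 per day, an outlier probability of 0.25 and a node mean of $40 per day has an expected total of $22 per day.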
- The process of scoring the data refers to applying the model to a set of data. The data need not be the same data on which the model was developed. However, it is best if the weights are derived from that client's book of business. The data need to have the same risk factors coded on it that were included in the models of the probability of being an outlier and those used for the expected inlier payment calculations. Also, the models must be applied to the universe of people that were defined using the same criteria that were used to define the model universe. The model gives a set of weights applied to individual risk factors or combinations of risk factors yielding the expected payments or probability. Most statistical packages or data mining software have automated methods for scoring data once the risk factors are properly coded.
- Illustrated in FIG. 11 is a detailed flowchart for scoring, testing and integrating the data, and adjusting for cost trends for use in steps of FIGS. 1-3. The description is written as steps in developing the model so the data are referred to as the base and next periods. The application of the model to the actual underwriting data is essentially the same and it produces the policy period expected cost. The respective processing blocks of FIG. 11 are described as follows:
- 1102 The probability of a person being an outlier (i.e., with policy period payments greater than the Winsorization point) is calculated for all people without claims. Their probabilities will be lower than those with base period claims.
- 1104 The mean for each terminal node or group of terminal nodes (block 1022 of FIG. 10) is multiplied by the associated probability. This calculates the amount over the Winsorization point that each person is expected to cost in the next period. This gives the expected outlier dollars per day for each person. The mean expected dollars per day for each person is well below the Winsorization point.
- 1106 and 1108 The exact same process is applied to the outlier probability model and mean policy period payments for people that have base period claims. The expected value is calculated by multiplying the probability by the mean.
- An example of a Program to score the outlier with claims data (see 1017) is as follows:
-
proc score data='data from cart' score='logistic output'
    out='data with claims' type=PARMS predict;
  var ensagen agesq exp9olsd exp9sqd sq5oth ten5moch dxresp othdiges
      hi2dvby dxdigest dxcircul tnde5ls1 tnde5ls3-tnde5ls5
      ensxkd1a ensxkd2b ensxkd3c ensxkd4d ensxkd6f;
run;
data 'DATA WITH CLAIMS';
  set 'DATA WITH CLAIMS';
  expprob=hilols*'mean of outliers';
- 1110 and 1112 The expected next period inlier (less than or equal to the Winsorization point) payments are added to the expected next period outlier payments to produce the total expected payments in the next period for people with no claims (from 920) and for people with claims in the base period (from 918). The following program is an example of scoring inlier data with claims.
-
Program to run scoring of inlier with claims data (output from OLS regression see 914)
***score ALL data;
PROC score data='DATA WITH CLAIMS' score='OLS regression scores'
    out='DATA WITH CLAIMS' type=PARMS predict;
  var ensagen agesq exp9olsd exp9sqd sq5oth ten5moch dxresp othdiges
      hi2dvby dxdigest dxcircul td5lad2-td5lad13
      ensxkd1a ensxkd2b ensxkd3c ensxkd4d ensxkd6f;
run;
title2 'REPORT TO REVIEW SCORED DATA With model universe';
PROC means data='DATA WITH CLAIMS';
  var wins6850 expolsls exp5rLAD exp5rtLs ensagen agesq exp9olsd exp9sqd
      sq5oth ten5moch dxresp othdiges hi2dvby dxdigest dxcircul
      td5lad2-td5lad13 ensxkd1a ensxkd2b ensxkd3c ensxkd4d ensxkd6f;
  where exp9olsd ge 1.15;
- 1114 This database includes everybody that was included in the modeling universe (i.e., the standard population). However, there are people that were enrolled next period but not included in the modeling universe.
- 1116 When everybody included in the modeling database is combined, the sum of the expected payments per day next period should equal the actual payments. Additional model testing is performed at this point. The same methods (see 918 and 920) that were used to test the models developed on subsets of the modeling universe are reapplied now. This summary testing is even more important than testing the components of the complete model.
- 1118 There are three categories of persons for whom insurers will be at risk during the next period but who are excluded from the modeling database (i.e., the standard population).
- 1. Persons enrolling during the lag period
- 2. Persons enrolling during the next period
- 3. Persons terminating during next period
-
- a. in 1 or 2 above
- b. other categories
- For those in categories 1 or 2, no base period claims data are available when the rates must be developed and offered. Consequently no model predictions can be made for them. However, we know their actual payment costs during next period. The following tabulations will show if any adjustment in expected next period costs is needed for them.
- Compare the next period actual costs per person per day for those in categories 1 and 2 with both the expected next period cost per person per day and the actual next period cost per person per day for those in the following categories (note that these are detailed examples of subscriber units that could be used for pricing also):
- Subscriber only
- Subscriber and spouse
- Subscriber, spouse and 1 dependent
- Subscriber, spouse and 2+ dependents
- Subscriber and 1 dependent, no spouse
- Subscriber and 2+ dependents, no spouse
- Because outlier next period costs may distort these findings, the following quantiles of costs per person per day should also be compared to reduce the effects of outliers.
- Median
- 75th percentile
- 90th percentile
- If there are no significant differences between the excluded and included categories of persons, no adjustment is needed. For those categories for which there are significant differences, the adjustment factor will be (excluded category mean next period cost/day) divided by (included category mean next period cost/day).
- The number of persons in category 1 can be determined for those who actually enrolled in the lag period while the number in category 2 can be estimated from underwriting period data. The final adjustment factor will be the product of the per person adjustment factor (as above) and the proportion of all next period person days estimated to be comprised by those in category 1. The proportion of next period person days comprised by those in the model will have an adjustment factor of 1.00.
- The use of these adjustment factors can be further refined by applying them separately for sets of insured groups which have similar adjustment factors, instead of applying one adjustment factor to all groups.
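- One reading of the adjustment described above is a person-day weighted blend: the excluded category carries the ratio of mean costs as its factor, and the modeled population carries a factor of 1.00. The sketch below follows that reading, which is an interpretation rather than a formula stated in the text. The function and parameter names are hypothetical.

```python
def per_person_adjustment(excluded_mean_cost_per_day, included_mean_cost_per_day):
    """Ratio of excluded to included mean next period cost per day."""
    return excluded_mean_cost_per_day / included_mean_cost_per_day


def blended_adjustment_factor(per_person_factor, excluded_share_of_person_days):
    """Weight the excluded category's factor by its share of person days.

    Person days belonging to the modeled population keep a factor of 1.00.
    """
    modeled_share = 1.0 - excluded_share_of_person_days
    return excluded_share_of_person_days * per_person_factor + modeled_share * 1.0


ratio = per_person_adjustment(6.0, 5.0)            # excluded people cost 20% more
factor = blended_adjustment_factor(ratio, 0.25)    # they make up 25% of person days
```

With these illustrative inputs the overall factor is about 1.05, and it could be computed separately for sets of insured groups that have similar adjustment factors.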
- Additional adjustment for those in category 3a above is not required since these persons' experience will be included in the adjustment for those in categories 1 and 2. Those persons in category 3b will be included in the population used as the standard for our overall risk models. They can thus be scored by their base period attributes, and their next period expected costs can be estimated from the described models. These can then be compared to the actual next period costs per person per day, in total and by the subscriber family categories listed above.
- After checking for the influence of outliers, any subsets with actual values differing significantly from expected values can be the basis of adjustment. The proportions of person days in category 3b can be estimated from the available data.
- As noted above, separate adjustments can be made to expected next period costs for groups which have similar adjustment component factors:
- 1. actual to expected costs
- 2. proportion of next period person days attributable to those in category 3b
- There may well be an interaction in these two factors.
- 1120 The database of all people covered next period is compiled next. A flag is set to one if the person has an expected payment next period that was calculated from the risk adjustment models. Only the new joiners in the lag period or next period cannot have an expectation calculated from the risk adjustment model.
- 1122 When this product is used for an application of prospective pricing for insurance coverage, the future cost of health care needs to be included. The risk adjustment models include the historical cost trend since it was present in the data. In other words, no additional adjustment was required for the modeling since the model uses the base period to forecast next period's payments so the cost trend inherent in the data is built into the model. Note that with a 3 month lag period, this is a 15 month cost trend. If the future annual cost trend is expected to be identical to the cost trend between the base period and the next period, then no further adjustment is needed since it is already incorporated in the data and model. If the future cost trend is different from the cost trend implicit in the data used for model development, the ratio of the future cost trend divided by the model period cost trend should be used as an adjustment.
- All health insurance companies use an estimate of the future medical cost trend to increase future expected claim costs to what they expect them to be in the policy period. The simplest group-level cost forecast for a credible group is last year's cost multiplied by cost trend producing the “experience” forecast. The CI will provide a cost trend forecast for use in this invention. The development model has an implicit cost trend built into it since it was present in the model development data. Therefore, the development model must be detrended and then the CI's cost trend forecast can be applied to the person-level cost forecast when the model is applied to the underwriting period data. In order to detrend the development model, we calculate the cost for a standardized population for the book of business in the base and next periods. The standardized population assumes a specific mix of demographics in the CI's book of business for the base and next periods. A particular embodiment would calculate the proportion of cost in each of the following categories: male employee; female employee; male spouse; female spouse and other dependent cross-classified by 5-10 age categories (e.g., <5, 5-17, 18-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75+). This particular classification would produce up to 40 demographic cells. Other classifications could be used. Too many cells will cause a loss of robustness in the estimates. The mean cost per person per cell in the next period divided by the associated mean cost in the same demographic cell in the base period calculates the cost trend per cell during the model development period. One method to standardize the population in order to produce a single cost trend for the entire book of business is to weight each cell by the proportion of cost it accounts for in the base period. 
The weighted average of the cells' cost trend is a summary cost trend for the book of business for that standard population for the time period between the base and next periods. If those periods are contiguous and one year each, the annual development model cost trend has been calculated. Otherwise, an adjustment must be made for the time periods to calculate an annual trend. If the base, lag and next period are each one year, the square root of the cost trend will calculate the annual cost trend since the trend compounds. If the lag period is three months and the base and next period are one year, the fifth root of the cost trend is the three month cost trend. The three month cost trend is taken to the fourth power to calculate the annual cost trend. To apply the CI's single number cost trend (which will be an annual trend), the reciprocal of the annual development model cost trend is multiplied by the CI's annual cost trend to calculate the cost trend that should be applied to the underwriting period data after application of the development model. This method works for first dollar medical insurance, aggregate only medical stop loss and reserving for those insurance products.
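- The detrending and retrending arithmetic described above can be sketched as follows. The function names are hypothetical. The annualization exponent is written as 12 divided by the number of months between the midpoints of the base and next periods, which is an assumption that reproduces the two cases given in the text: 24 months gives the square root, and 15 months gives the fifth root raised to the fourth power.

```python
def development_trend(cells):
    """Cost-weighted average of per-cell trends.

    cells: list of (base_mean_cost, next_mean_cost, base_cost_share) tuples,
    one per demographic cell.
    """
    total_share = sum(share for _, _, share in cells)
    weighted = sum(share * next_mean / base_mean
                   for base_mean, next_mean, share in cells)
    return weighted / total_share


def annualize(trend, months_between_midpoints):
    """Convert a compounded trend over the given span to an annual trend."""
    return trend ** (12.0 / months_between_midpoints)


def retrend_factor(ci_annual_trend, development_annual_trend):
    """Reciprocal of the development trend times the CI's annual trend."""
    return ci_annual_trend / development_annual_trend


# Two illustrative demographic cells with equal shares of base period cost.
cells = [(100.0, 110.0, 0.5), (200.0, 240.0, 0.5)]
trend_15_months = development_trend(cells)          # one-year periods, 3 month lag
annual_trend = annualize(trend_15_months, 15)
factor = retrend_factor(1.08, annual_trend)         # CI forecasts 8% annual trend
```

The resulting factor would be applied to the person-level forecast after the development model has been applied to the underwriting period data.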
- The development model next period data need to be detrended and then retrended with the CI's cost trend forecast prior to calibrating the development model for specific stop loss coverage or aggregate stop loss in combination with specific stop loss coverage. Once those adjustments are made, additional cost trend adjustments do not need to be made before applying the specific or aggregate in combination with specific stop loss models to the underwriting period data to forecast the policy period costs.
- Alternatively, the CI may have cost trend calculated separately by geographic locale or by provider type (e.g., drugs, physician, inpatient hospital). If the CI's cost trend is specific to each geographic locale, the same method of demographic cell adjustments can be employed as previously described but a separate table is calculated for each geographic locale. The CI's locale specific cost trend is applied to the cost trend estimated for the model development period using the standardized population adjustments for each locale. Each locale's detrending and retrending is applied to the underwriting data for that locale to calculate the policy period cost for that locale.
- If the CI's cost trend forecast is by provider type, we need to estimate the development model trend by provider type so that the policy period forecast will be appropriately detrended and retrended. This can be done by cross-classifying the demographic cells by provider type costs for the base and next periods and calculating the trend for each demographic cell separately by provider type. For each provider type separately, the cell-level cost trends are combined by weighting each demographic cell by its proportion of the base period cost for that provider type. This calculates a provider type cost trend for the base to next period for the entire book of business. The CI's forecast cost trend by provider type is multiplied by the reciprocal of the model development cost trend for the same provider type. For each terminal node, this adjusted cost trend by provider type is multiplied by the associated cost by provider type in the policy period, and the results are summed across provider types by person to calculate the policy period forecast cost per person. The associated cost in the policy period by provider type is calculated by multiplying the proportion of cost by provider type in the next period by terminal node by the total forecast cost for the policy period for that terminal node.
- 1124 The person-level inflation adjusted forecasts are summed by group and actual is compared to forecast. The group-level models make adjustments when the actual is different from forecast.
- The underwriting period data are scored using the model developed on the base and next periods. Risk factors need to be calculated for the underwriting period data in order to apply the model. The summed scored data, with appropriate cost trend assumptions, produce the expected policy period costs or actual expected cost for the policy period using the person-level models.
-
FIG. 12 is a detailed flowchart illustrating processing steps for developing group-level models and making adjustments to the summary of the person-level data of the steps of FIG. 1, 204, 208 and 210 of FIG. 2, or 304, 306 and 310 of FIG. 3. The steps are similar to the person-level modeling steps. First the development model is calculated using the base and next period data. The model is then applied to the underwriting period data (i.e., scoring the data) to forecast the policy period costs. With the group-level model there is the model development using the base and next period and then the risk factor coding and scoring of the underwriting period to produce the estimated policy period costs for pricing the policy. The processing block descriptions for FIG. 12 are:
- 1202 There are likely to be characteristics of insured groups which can influence the group's costs of care over and above that based on the characteristics of the persons in the insured groups. For this reason we develop a model to identify such intergroup differences and a way of applying the model's results to adjust each group's expected payments from the models based on individuals. First, the person-level expected payments are summed by group.
- 1204 The group-level development models have the following characteristics:
-
- Unit of observation—the “group”
- Dependent variable—Next period residual dollars per person per day in the group (i.e., group total next period actual payments less Group total next period forecast payments divided by the number of people in the group divided by 365 days)
- Candidate predictor variables are coded and include the following
- Benefit attributes
- alternative insurance plan
- deductible
- co payment
- exclusions
- dependent coverage
- Benefit plan type: indemnity, PPO, POS, lock-in HMO
- Payment type: fee for service or capitation
- Demographic cells: proportion in age range by relationship by sex
- COB in Base period
- Capitation payments by provider type
- Number of subscribers
- Average family size and proportion in each family composition class
- SIC code
- Geographic locale
- Actual mean payments in base (underwriting) period per person per day
- Expected mean payments in next (policy) period per person per day
- Percent of enrollees joining during base period or leaving during base period
- Payment carve outs for capitation—if specific types of services are paid by capitation (e.g., primary care, OB/GYN), then risk factors need to be developed that will allow the group-level model to reduce the payments since the services are covered by the capitation payments. Dummy risk factors for the presence or absence of capitated payments by provider type will need to be included when not all services are covered by fee for service payments.
- 1206 A least squares regression tree including selected interaction terms as predictors (other data mining techniques that develop and test numerous interactions such as neural networks, rule induction, genetic algorithms, clustering techniques or other methods could be used instead of regression trees) is developed on the group-level data. This second level of modeling makes adjustments for information not included at the person-level.
- 1208 An ordinary least squares model (other types of regressions, neural networks, or other types of predictive models could be used instead of the OLS regression) is applied to the predictor variables that were important in the model preceding this step. The candidate predictor variables include the terminal nodes as dummy variables and the main effects used to define the terminal nodes.
- 1210 The predicted values from the model in 1208 are the average per person per day error (i.e., residual) in the estimate of next period's payments for everybody in the group. This residual is added to each person's next period expected payments from the person-level models (subtracted if it is a negative value). The model is developed on historical data that have no need for a cost trend adjustment except to be annualized since the cost trend is in the data. When the models will be used for setting prices for the policy period, the inflation adjusted person-level next period payment estimates are used as input and the groups are scored using the group-level models. Risk factors are coded for the group using the underwriting period data and the groups are scored with the group-level model to produce the policy period expected group-level costs.
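The group-level dependent variable and the adjustment in block 1210 can be illustrated with a short sketch. The function names and figures are hypothetical. In the method itself, the residual applied to each person is the value predicted by the group-level model, not the actual residual computed here for the development data.

```python
def group_residual_per_person_day(actual_total, forecast_total, n_people, days=365):
    """Dependent variable for the group-level development model:
    (actual - forecast) next period payments, per person, per day."""
    return (actual_total - forecast_total) / n_people / days


def adjust_person_forecasts(person_daily_forecasts, predicted_residual):
    """Add the group's predicted per person per day residual to each person's
    expected payment (a negative residual lowers the forecast)."""
    return [f + predicted_residual for f in person_daily_forecasts]


# Hypothetical group of 100 people observed for 365 days.
residual = group_residual_per_person_day(1_168_000.0, 1_095_000.0, 100)
print(residual)  # 2.0 dollars per person per day

# Hypothetical predicted residual of -1.50 applied to three members.
adjusted = adjust_person_forecasts([10.0, 25.5, 40.0], -1.5)
print(adjusted)  # [8.5, 24.0, 38.5]
```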
- Alternatively, the MAP4HIP method can be used to forecast person-level cost for individual (or family) renewal health insurance. The same methods apply but there is no “group” other than the family. The cost for the individual family members are summed to produce the family-level forecast. A family-level model can be used for final cost adjustments. The family-level risk factors are family composition, benefit plan, geographic locale and other factors germane to the family rather than an employment “group”.
-
FIG. 13 is a detailed flowchart of an embodiment of a price optimization procedure which may be used to carry out steps of FIGS. 1-3. The processing block procedures of FIG. 13 are:
- 1302—The group cost estimate is the final output from the cost estimation system (i.e., expected medical costs in the policy period). It is at the group-level and includes the inflation trend estimate.
- 1304—The CI provides three sets of inputs that are used in the price optimization. The first set of input is their expected probability of retaining the group if the group's price is increased a specified amount. Rate increases will not be negative, generally, unless there is medical price deflation. Many probability estimates are gathered with small changes in the price increase around the client's target profit and fewer more sparse estimates further from the targeted profit margin. The client needs to consider the group's historical costs, inflation, local competitive pricing, and other factors that influence the group's likelihood of accepting the various price increases. Another necessary input from the client is the administrative costs allocable to that group. This cost may be expressed as a percentage of the expected medical costs or in dollars per year. The final input required is a minimum expected profit or profit margin that is acceptable.
- The following Table 3 is an example of price forecasting using probability of retention and other related input data for the pricing steps.

TABLE 3: Price Forecast Example

| Price Increase | Probability of Retention | Ratio Admin to Cost | Next Year Price | Next Year Total Cost | Expected Profit |
| --- | --- | --- | --- | --- | --- |
| 0.00 | 0.95 | 0.25 | 1500 | 1375 | 118.75 |
| 0.02 | 0.92 | 0.25 | 1530 | 1375 | 142.60 |
| 0.04 | 0.90 | 0.25 | 1560 | 1375 | 166.50 |
| 0.06 | 0.85 | 0.25 | 1590 | 1375 | 182.75 |
| 0.08 | 0.80 | 0.25 | 1620 | 1375 | 196.00 |
| 0.10 | 0.73 | 0.25 | 1650 | 1375 | 200.75 |
| 0.12 | 0.68 | 0.25 | 1680 | 1375 | 207.40 |
| 0.14 | 0.63 | 0.25 | 1710 | 1375 | 211.05 |
| 0.16 | 0.58 | 0.25 | 1740 | 1375 | 211.70 |
| 0.18 | 0.53 | 0.25 | 1770 | 1375 | 209.35 |
| 0.20 | 0.45 | 0.25 | 1800 | 1375 | 191.25 |
| 0.25 | 0.35 | 0.25 | 1875 | 1375 | 175.00 |
| 0.30 | 0.25 | 0.25 | 1950 | 1375 | 143.75 |
| 0.35 | 0.15 | 0.25 | 2025 | 1375 | 97.50 |
| 0.40 | 0.05 | 0.25 | 2100 | 1375 | 36.25 |
| 0.45 | 0.01 | 0.25 | 2175 | 1375 | 8.00 |
| 0.50 | 0.00 | 0.25 | 2250 | 1375 | 0.00 |

- The optimal price is $1740 per person or a 16% increase. Costs are expected to be $1375/person and there is a 58% chance of retaining the group. This yields $211.70 expected profit per person.
- 1306—The expected profit (or profit margin) is calculated by the following formula: expected profit=(probability of accepting price offered)×[((1+proportion price increase)×(price in previous period))−(expected policy year medical costs)−(administrative costs)].
- This is the expected profit (margin is calculated by dividing by the group's price) and it is calculated for each rate increase and probability of retention or acceptance. The maximum expected profit is the largest amount (or the closest to zero if they are negative) calculated in the preceding step. The largest expected profit is compared to the client's minimum acceptable expected profit.
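The expected profit formula of block 1306 can be checked against the Table 3 figures with a short sketch. The retention curve and the $1500 prior price come from Table 3. The split of the $1375 total cost into $1100 of medical cost plus 25% administrative cost is an assumption inferred from the table's admin-to-cost ratio, and the function and variable names are illustrative only.

```python
def expected_profit(p_accept, increase, prev_price, medical_cost, admin_cost):
    """expected profit = P(accept) * [(1 + increase) * previous price
                                      - medical cost - administrative cost]"""
    offered_price = (1.0 + increase) * prev_price
    return p_accept * (offered_price - medical_cost - admin_cost)


# (proportion price increase, probability of retention) pairs from Table 3.
RETENTION_CURVE = [
    (0.00, 0.95), (0.02, 0.92), (0.04, 0.90), (0.06, 0.85), (0.08, 0.80),
    (0.10, 0.73), (0.12, 0.68), (0.14, 0.63), (0.16, 0.58), (0.18, 0.53),
    (0.20, 0.45), (0.25, 0.35), (0.30, 0.25), (0.35, 0.15), (0.40, 0.05),
    (0.45, 0.01), (0.50, 0.00),
]
PREV_PRICE = 1500.0
MEDICAL_COST = 1100.0              # assumption: 1100 * 1.25 = 1375 total cost
ADMIN_COST = 0.25 * MEDICAL_COST   # admin expressed as a ratio of medical cost

profits = [
    (inc, expected_profit(p, inc, PREV_PRICE, MEDICAL_COST, ADMIN_COST))
    for inc, p in RETENTION_CURVE
]
best_increase, best_profit = max(profits, key=lambda row: row[1])
best_price = (1.0 + best_increase) * PREV_PRICE
print(best_increase, round(best_price), round(best_profit, 2))
```

The maximum over the curve is then compared to the client's minimum acceptable expected profit, as described in blocks 1308 and 1310.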
- 1308—If the expected profit is below the minimally acceptable, then the expected profit calculations are printed out and the underwriter may run additional analyses to test the sensitivity of the assumptions. Also, the price at which the expected profit equals the minimally acceptable profit is printed out. If the underwriter wants to modify the probabilities in the retention curve, those are changed and 1304 is repeated.
- 1310 If the maximum expected profit is greater than the minimum acceptable profit, then the profit-maximizing price, its percentage increase, expected costs and profits are printed out for the underwriter along with the same output for non-optimal prices. The underwriter would offer the price that maximizes their profits.
- Another consideration when pricing the product is the variability of the forecast cost for the policy year. Greater variability should carry an additional risk premium. Therefore, the standard error of the group's expected medical cost is also calculated and printed. SAS or S Plus regressions will calculate the variability of the mean or the standard error of the estimate of the policy year cost by combining the standard errors of the person-level forecasts. The price that provides a 90% (or some other high probability) chance of break-even is calculated using the standard error and printed. An underwriter can use the high-probability break-even price and the relative standard error in negotiating price. If there is a large relative standard error (e.g., standard error of group/average standard error), the underwriter would be less inclined to discount the price in a competitive market since the likelihood of a loss is increased. Code for a program to run a pricing example is found in Appendix F.
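One way to compute a high-probability break-even price from the standard error is sketched below. It assumes the forecast error is approximately normally distributed, which the specification does not state, and the $50 standard error and function names are hypothetical.

```python
from statistics import NormalDist


def break_even_price(expected_cost, std_error, probability=0.90):
    """Price at which actual cost falls at or below the price with the given
    probability, under an assumed normal distribution of forecast error."""
    z = NormalDist().inv_cdf(probability)
    return expected_cost + z * std_error


def relative_standard_error(group_se, average_se):
    """Standard error of the group divided by the average standard error."""
    return group_se / average_se


# Hypothetical group: $1375 expected cost per person, $50 standard error.
price_90 = break_even_price(1375.0, 50.0)
print(round(price_90, 2))
print(relative_standard_error(50.0, 40.0))  # 1.25
```

A relative standard error above 1 indicates a group whose forecast is more variable than average, which argues against discounting.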
- 1312—If the underwriter does not want to modify the retention curve, the underwriter offers the group the price that produces the minimally acceptable profit for the client even if the group is expected to reject the offer.
- 1314 The final step in pricing involves translating the average price per person per day into a monthly price per subscriber unit (e.g., single person, enrollee with spouse, enrollee with two or more additional dependents—other subscriber unit constellations are also possible). Costs are traditionally presented in cost per member per month or pmpm. However, subscriber units are used for pricing and it is important that costs are rationally allocated to the subscriber units. The price is multiplied by 365/12 to calculate the monthly price (or rescaled for another time period). One alternative for pricing the subscriber units is to calculate the mean cost forecast per subscriber unit for the group and then inflate each mean subscriber cost by the average profit margin for the group (i.e., recommended optimal price/expected cost). The mean cost forecast per subscriber unit is calculated by summing the forecast cost per person for each person that is a member of that type of subscriber unit in the underwriting period and then dividing that sum by the number of subscribers of that type (not people) in the underwriting period. This gives the group's mean daily cost per subscriber for each different type of subscriber unit. Another pricing alternative is to set the price for the subscriber units that are considered to be very price sensitive just below the market price. The remaining subscriber units must then be priced so that the overall expected profit is maintained. That can be calculated by estimating the expected profit for the market priced subscriber units and subtracting it from the total expected profit for the group. The other subscriber units must account for the remaining profit requirement. 
Their price can be set so that the profit margin equals the remaining profit requirement by solving the following equation for price per subscriber unit: (total expected profit − market priced subscriber profit) = remaining profit = (number remaining subscriber units)×((price/remaining subscriber unit)−(mean expected cost/remaining subscriber unit)). Solving the equation provides an average price/remaining subscriber unit or (price/remaining subscriber unit)=((remaining profit)/(number remaining subscriber units))+(mean expected cost/remaining subscriber unit). If there are two or more remaining subscriber units, the price can be prorated based on the average forecast cost/remaining subscriber unit. This approach can be used for pricing stop loss medical insurance also. Alternative allocations of profits to subscriber groups are possible. Those of ordinary skill will appreciate that the relation of expected cost to the terms of the medical insurance will vary among insurance types. For example, first dollar products will have higher expected costs than stop-loss products.
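The subscriber unit arithmetic above can be expressed as a small sketch. The figures and function names are hypothetical; only the 365/12 scaling and the solved price equation come from the text.

```python
def monthly_price(price_per_day):
    """Convert a per-day price into a monthly price (365/12 days per month)."""
    return price_per_day * 365.0 / 12.0


def remaining_unit_price(total_expected_profit, market_priced_profit,
                         n_remaining_units, mean_cost_per_remaining_unit):
    """Solve remaining profit = n * (price - mean cost) for price."""
    remaining_profit = total_expected_profit - market_priced_profit
    return remaining_profit / n_remaining_units + mean_cost_per_remaining_unit


# Hypothetical group: $20,000 total expected profit, of which $5,000 comes from
# market-priced units; 100 remaining units with a mean expected cost of $400.
price = remaining_unit_price(20_000.0, 5_000.0, 100, 400.0)
print(price)                # 550.0
print(monthly_price(12.0))  # 365.0
```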
- Estimating costs that need to be considered for reserves for first dollar health insurance and for stop loss coverage are alternative uses for the cost forecasting process. Rather than predicting payments that will occur over the entire policy period, reserving requires predicting costs that will occur in the upcoming financial reporting period (e.g., fiscal year or quarter). The same cost forecasting process using data collection and validation, risk factors, data mining and statistical techniques at the person and group-levels, testing and reporting can be applied to produce cost estimates to be used in setting reserves. The dependent variable needs to be changed so that the reserving model is calibrated to the appropriate time period.
- The reserving model forecasts costs that have been incurred but not reported (IBNR), and this may include some costs of claims that have not occurred yet but fall within the financial reporting period. Typically, the reserving period will run through the end of the current fiscal quarter or year. Inflation needs to be accounted for; the time period is far shorter than for the renewal cost forecast product, but the same techniques apply over the shortened time period.
- A development period model is calibrated using the risk factors from the claims and enrollment data in a base period to forecast total incurred claims for the financial reporting period. The underwriting period for reserving can be the previous 12 months of claims (if available) preceding the reserving date or some other time period such as this policy period to the reserving date. The base period for the developmental model must have approximately the same number of days as the underwriting period so the forecast will not be biased. The policy period for IBNR claims begins at the first date of the financial reporting period and ends at the last day of the reporting period. The next period for the model development cost for IBNR or claims that have not occurred yet must be of the same length as the actual reserving period during the policy period for correct model calibration. This is a standard person-level model for MAP4HIP with a shorter next period (e.g., quarter) possibly. The total forecast claims are summed to provide a total claim amount forecast. This is used as an independent variable and is supplemented by additional independent variables that include the reported claims, historical completion rates by time into the reserving period, claims backlogs and seasonality. The total of the IBNR claims from the reserving period is the dependent variable. Note that this model is at the book of business level. A quarter will yield only one data point for the book of business. If there are too few quarters for developing a stable model, an alternative approach is recommended.
- The alternative approach defines reserves as the difference between the total claim forecast for the reserving period and the incurred and reported claims during that period. In other words, the sum of the incurred and reported claims is subtracted from the total forecast claims and this equals the reserve forecast.
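The alternative reserve calculation reduces to a single subtraction, sketched here with hypothetical figures and an illustrative function name.

```python
def reserve_forecast(person_level_forecasts, incurred_and_reported_claims):
    """Reserve = total forecast claims for the reserving period
                 minus claims already incurred and reported in that period."""
    return sum(person_level_forecasts) - sum(incurred_and_reported_claims)


# Hypothetical book of three people with two claims reported so far.
reserve = reserve_forecast([1200.0, 800.0, 500.0], [900.0, 300.0])
print(reserve)  # 1300.0
```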
- The reserving product can be delivered as a service bureau product or as software, either stand alone or an ISP model, using the same data flows as used with the cost forecasting models for fully insured or stop loss coverage. The pricing module is not relevant for reserving.
- The fully insured medical product uses claims information as a critical component of the cost forecasting model. Claims are available if the group is renewing first dollar health insurance but not for a new group. Enrollment data may be available for new groups (possibly only for employees) or individual health insurance. The same process can be applied to new groups or individual (or family but called by convention individual) policies by using the method for the people with no claims and only enrollment data. The base period enrollment data must contain the same potential risk factors as are available for the new groups. Note that there is only one model since there are no claims data so people cannot be separated into claims and no claims people in the base or underwriting periods. The cost forecasting model should be developed on the client's current book of business. The dependent variable is next period's payments. The independent variables are the same as the risk factors used in the no claims model (i.e., detailed enrollment data only). The modeling universe includes everybody rather than only those with no claims. Sometimes claims data are available for high cost cases in the new group and also may include the demographics and diagnoses associated with those high cost cases. This information can be included as person-level risk factors but the same information will need to be included as potential person-level risk factors in the base period for the development model. A group-level model can be applied to the summarized group-level data as with renewal business. Frequently the total cost for the new group last year is available and may be used as a risk factor for the group-level model. The total group cost would then need to be included in the base period as a potential risk factor also.
- The fully insured new business cost forecasting and pricing product can be delivered as a service bureau product or as software, either stand alone or an ISP model, using the same data flows as used with the cost forecasting models for fully insured or stop loss coverage.
- Aggregate only medical stop loss insurance, such as CapCost, can have different data sources than fully insured insurance (where the data is held and owned by the insurance company), as a TPA pays the claims and holds the data for the self-insured employer. It is our intent to get the data for all of the TPA's groups so that our client, the stop loss insurer, can bid on all of the groups serviced by the TPA. Therefore, any renewal business for the TPA can use the full cost forecasting models. New business for the TPA will not have claims data available. The enrollment data only new business model cost forecasting technique is applicable for new business for the TPA. The enrollment data are needed for the new group. Future refinements will include combining the historical payments, summarized by month or quarter, with the enrollment information since person-level claims will not be available.
- In order to understand the performance of CapCost versus the traditional specific plus aggregate stop loss insurance, we had to create synthetic groups since our database only contained 116 actual groups of very different sizes. Monte Carlo random samples were developed for synthetic groups of 50, 100, 250, 500, 750, 1000, and 1500 employees plus their dependents. A group of 50 employees is smaller than the smallest employer in the target market and 1500 employees is toward the upper end of the target market for stop loss health insurance. Five hundred random groups were selected with replacement. All family members of the employees were included in the group. The claims payments were calculated for traditional $50,000 specific with 125% aggregate exclusive of specific and for CapCost 110™. CapCost 110™ is aggregate only at 110% of the attachment point. TruRisk models were applied to forecast next year's claim payments. CapCost 110™ medical claims payments for groups of 50 employees are about 80% of the claims paid out for traditional $50,000 specific plus 125% aggregate stop loss. Once there are 250 or more employees, the CapCost 110™ claims payout is less than 50% of the traditional stop loss coverage. Similar results were seen for $25,000 specific and $75,000 specific, both plus 125% aggregate coverage. The payout for CapCost 110™ is much lower relative to $25,000 specific plus 125% aggregate and closer to the $75,000 specific plus 125% aggregate. The mean and standard deviation are presented in TABLE 6 for three different size groups. 125% aggregate is included with each of the specific coverages. The mean claims paid out are less with CapCost 110™ and the standard deviation is smaller than with traditional stop loss coverage. The main factor causing this is the far lower frequency of claims with CapCost 110™ (18-26% of groups) as compared to traditional specific plus aggregate coverage (87-98% of groups). When a claim was made with CapCost 110™ coverage, it was greater and the standard deviation was also greater than for claims with traditional stop loss coverage.
- The claims paid out for CapCost 110™ and traditional stop loss are highly correlated:
- R=0.95 for 250 employees with $25,000 specific and 125% aggregate
- R=0.91 for 500 employees with $50,000 specific and 125% aggregate
- R=0.87 for 750 employees with $75,000 specific and 125% aggregate
- The risks or claims paid out are correlated but lower for CapCost 110™ since the claim frequency is far lower with that coverage.
- An aggregate only policy can be underwritten using the group-level experience for credible groups. However, it is very important to accurately estimate the group's costs for next year since that determines the 110% attachment point. Therefore, the MAP4HIP cost forecasting method is recommended as the preferred embodiment since the predicted mean cost is more accurate than the predicted mean cost derived using the standard approach with group-level experience as predictor. The same steps are taken in developing the models for CapCost as are used with the general MAP4HIP process. The only difference is the variety of TPAs as multiple data sources versus one CI with fully insured medical. Person-level and group-level models are developed for cost per person per day. The risk factors, statistical methods and dependent variables are the same. The attachment point needs to be set to the appropriate amount (e.g., a 110% attachment point is calculated by multiplying the cost trend adjusted forecast cost by 1.1).
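The comparison between aggregate only coverage and traditional specific plus aggregate coverage can be sketched as follows. This is a simplified illustration, not the simulation actually run: contract terms are reduced to a deductible and an attachment multiple, the expected cost inputs are supplied by the caller, and the sampling helper, function names and claim amounts are all hypothetical.

```python
import random


def synthetic_group(employee_families, n_employees, rng):
    """Sample employees with replacement; each sampled employee brings the
    claims of all family members into the synthetic group."""
    sampled = rng.choices(employee_families, k=n_employees)
    return [claim for family in sampled for claim in family]


def aggregate_only_payout(claims, expected_total, attachment=1.10):
    """Aggregate only: pays total claims above attachment * expected total."""
    return max(0.0, sum(claims) - attachment * expected_total)


def specific_plus_aggregate_payout(claims, deductible,
                                   expected_capped_total, attachment=1.25):
    """Traditional coverage: the specific layer pays each person's cost above
    the deductible; the aggregate layer, exclusive of specific claims, pays
    capped costs above attachment * expected capped total."""
    specific = sum(max(0.0, c - deductible) for c in claims)
    capped = sum(min(c, deductible) for c in claims)
    aggregate = max(0.0, capped - attachment * expected_capped_total)
    return specific + aggregate


# Hypothetical four-person group with one large claimant.
claims = [2_000.0, 5_000.0, 120_000.0, 8_000.0]
capcost = aggregate_only_payout(claims, expected_total=100_000.0)
traditional = specific_plus_aggregate_payout(
    claims, deductible=50_000.0, expected_capped_total=60_000.0)
print(round(capcost, 2), round(traditional, 2))

# Hypothetical sampling step: families given as lists of per-person claims.
rng = random.Random(0)
group = synthetic_group([[1_000.0], [2_000.0, 500.0]], 5, rng)
```

In this example the single large claimant triggers the specific layer, while the aggregate only policy pays only the amount by which the group total exceeds 110% of its expected cost.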
- The aggregate only cost forecasting product can be delivered as a service bureau product or as software, either stand alone or an ISP model, using the same data flows as used with the cost forecasting models for fully insured coverage.
-
TABLE 6

| | 250 employees, CapCost | 250 employees, $25,000 spec | 500 employees, CapCost | 500 employees, $50,000 spec | 750 employees, CapCost | 750 employees, $75,000 spec |
| --- | --- | --- | --- | --- | --- | --- |
| Total 500 groups | | | | | | |
| $/employee | 229 | 582 | 104 | 278 | 87 | 212 |
| std. dev. | 681 | 789 | 301 | 385 | 229 | 303 |
| Group claims > 0 | | | | | | |
| % groups > 0 | 26.40% | 98.20% | 18.20% | 89.20% | 20.80% | 87.00% |
| # groups > 0 | 132 | 491 | 91 | 446 | 104 | 435 |
| $/employee | 867 | 592 | 569 | 311 | 419 | 243 |
| std. dev. | 1099 | 791 | 483 | 394 | 336 | 312 |
| minimum | 2.08 | 0.3 | 8.04 | 6.34 | 5.32 | 2.63 |
| maximum | 6479 | 7066 | 1823 | 1921 | 2027 | 2026 |

- The MAP4HIP method can be used for cost forecasting for specific stop loss coverage. Specific stop loss pays for claims above a specified threshold (i.e., the deductible). Those claims costs can be forecast using the same techniques that MAP4HIP uses for forecasting outlier amounts. First, the forecast inflation or cost trend adjustment for the policy period must be applied to the model development data. This is a different order of steps from the standard MAP4HIP sequence but it is necessary due to the specific deductible. For example, if there is a $50,000 deductible and a 10% cost trend then a $50,000 claim in the next period would yield a $0 specific claim. If that claim occurred in the policy period after 10% inflation it would produce a $5,000 specific claim ($50,000×1.1=$55,000; subtracting the $50,000 deductible yields a $5,000 specific claim). Inflation during the lag period must be added also and inflation built into the development model must be divided out to provide accurate future cost estimates for modeling specific claims. After the inflation adjustment for the next period data, costs are then recalculated so that they are zero if the person's claims are below the deductible in the next year (similar to Winsorization). If costs total more than the deductible, then the specific cost is set to the amount above the deductible. Probability models are developed for claims and no claims people in the base period. The probabilities are weighted by the average cost in the terminal node (above the deductible) to produce the expected cost.
The person-level forecasts are summed to make the group-level forecast. Group-level models with the same risk factors as MAP4HIP are developed using the residual of the actual specific payments per person per day minus the forecast specific costs. After development period models are complete, they can be applied to data from an underwriting period to develop cost forecasts for a policy period.
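The effect of applying the cost trend before the deductible can be reproduced with the $50,000 example from the text. The sketch below uses illustrative function names, and the 2% probability and $30,000 mean excess are hypothetical.

```python
def specific_claim(next_period_cost, cost_trend, deductible):
    """Trend the next period cost to the policy period, then keep only the
    amount above the specific deductible (zero if below it)."""
    trended_cost = next_period_cost * cost_trend
    return max(0.0, trended_cost - deductible)


def expected_specific_cost(p_exceed, mean_excess_in_node):
    """Probability of a specific claim weighted by the terminal node's
    average cost above the deductible."""
    return p_exceed * mean_excess_in_node


# $50,000 claim against a $50,000 deductible, without and with a 10% trend.
claim_untrended = specific_claim(50_000.0, 1.0, 50_000.0)
claim_trended = specific_claim(50_000.0, 1.1, 50_000.0)
print(claim_untrended, round(claim_trended, 2))

# Hypothetical terminal node: 2% chance of a claim, $30,000 mean excess.
print(round(expected_specific_cost(0.02, 30_000.0), 2))
```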
- Aggregate stop loss is frequently added to specific coverage. The aggregate coverage with specific coverage is paid exclusive of specific claims and specific claims are not used in defining the attachment point. Therefore, aggregate stop loss (with specific coverage also) claim amount can be modeled using the inlier methods in the MAP4HIP method. The Winsorization point is the specific deductible. As with specific, the cost trend forecast for the policy period must be applied to the next period data prior to the inlier calculations. Only inliers are modeled since the specific costs will be borne by the specific coverage. Both the specific and aggregate with specific should be modeled and priced separately. Note that this is different from aggregate only stop loss coverage since all costs contribute to the attachment point and aggregate claim amount for aggregate only stop loss coverage.
- The specific cost forecasting and specific plus aggregate cost forecasting products can be delivered as a service bureau product or as software, either stand alone or an ISP model, using the same data flows as used with the cost forecasting models for fully insured coverage.
- Group short term disability insurance (STD) is insurance that pays a portion of an employee's wages (typically 50-100%), a flat amount or the lesser of the portion or the flat amount when an employee is disabled due to a non-work related accident, sickness or pregnancy. The duration of the salary replacement is typically 13, 26 or 52 weeks. The MAP4HIP method can be applied to forecast STD payments with a few modifications. The potential risk factors are the same as the risk factors used with medical insurance and described in section 806 with the additional risk factors of number of STD days and payments in the base and underwriting periods and job classification when these data are available. Otherwise, the exact same potential risk factors as used with MAP4HIP can be linked to the STD days next year and modeled using the MAP4HIP modeling techniques and processes. The dependent variable in the model development database is the number of STD days in the next period. In other words, the medical claims and STD days in the base period are linked in the database to STD days in the next period for the same person and an STD day forecasting model for the next period is developed. The interaction capturing techniques and other modeling methods are the same as for medical claims but it is unlikely that the data need to be Winsorized and outliers modeled separately since STD is capped at a short period. The development model is applied to score the actual underwriting period data to calculate the expected number of STD days during the policy period to calculate the forecast claim amount. The expected number of STD days needs to be weighted by the expected cost per STD day. This can be calculated by averaging the STD cost per day in the underwriting period and increasing it by wage inflation and multiplying it by the expected number of STD days. Alternatively and preferably, each person's salary or flat rate benefit is linked to the database and the forecast STD days are multiplied by the STD per day benefit amount (i.e., portion of salary covered by STD) and increased by the salary inflation history. The STD cost per person is summed to produce the group's expected cost. Confidence bounds can be calculated for the number of expected STD days to provide a range of high to low cost for the group. A group-level model is built using the same group characteristics as with MAP4HIP and possibly supplemented with characteristics of the benefit plan.
The group-level dependent variable is residual STD days per person weighted by the mean cost per person per day to calculate the forecast claim amount.
- The STD cost forecasting product can be delivered as a service bureau product or as software, either stand alone or an ISP model, using the same data flows (with STD days and salary information added) as used with the cost forecasting models for fully insured coverage.
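The preferred STD calculation (forecast days multiplied by each person's daily benefit and increased by wage inflation) can be sketched as follows. The forecast days, benefit amounts, 3% inflation figure and function name are hypothetical.

```python
def std_group_cost(people, wage_inflation):
    """Sum of forecast STD days * daily STD benefit, increased by wage inflation.

    people: list of (forecast_std_days, daily_benefit) pairs, one per person.
    wage_inflation: proportion, e.g. 0.03 for 3%.
    """
    return sum(days * benefit * (1.0 + wage_inflation)
               for days, benefit in people)


# Hypothetical two-person group with 3% wage inflation.
group_cost = std_group_cost([(1.5, 120.0), (0.4, 200.0)], 0.03)
print(round(group_cost, 2))
```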
- Long term disability insurance (LTD) is wage replacement insurance for disabilities that run longer than STD coverage and may continue until the insured is 65 years old. Group LTD coverage is for a policy period that is typically one year. The insurer does not bear the cost of continuing disability liability from previous periods unless it was the insurer for that period also. The insurer will bear the cost for new long term disabilities that occur during the policy period and will continue to be responsible for that cost until the coverage expires (e.g., the beneficiary dies or turns 65 years old) or the beneficiary can go back to work. The probability of a LTD claim occurring during the policy period (i.e., the dependent measure) can be modeled and forecast using linked medical and LTD claims at the person-level. The base period risk factors are the same as the STD model, including medical claims, and STD claims with the addition of LTD claims linked, recoded and used as supplemental risk factors when available. The forecasting model can be built using only medical claims and enrollment information. Logistic regression, regression tree or hybrid tree with terminal nodes feeding into a logistic regression (the hybrid tree being the preferred embodiment) are the statistical techniques for modeling the incidence rate of LTD claims during the next period (typically one year). Other interaction capturing techniques can be used to predict the incidence rate but must be appropriate for modeling a variable that is bounded by 0 and 1. The development model is applied to underwriting period data to calculate the expected probability of a LTD claim during the policy period. The probabilities need to be weighted by the expected net present value of the disability to estimate the total cost of the disability (i.e., the claim amount). The net present value of the disability cost is obtained from actuarial tables. 
The expected costs are summed across the group members to produce the expected group cost. The net present value needs to be derived from other databases and should be conditionalized on the cause of the disability since the cost will vary depending on the cause. The cause of the disability can be estimated by the clinical conditions defining the terminal node of the person. A more accurate total cost of the disability will be calculated if the weights are conditionalized on the cause of the disability.
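Weighting the LTD probabilities by a net present value conditioned on the cause of disability can be sketched as below. The causes, probabilities and net present values are hypothetical placeholders. In practice the values would come from actuarial tables or other databases as described above.

```python
def ltd_group_cost(people, npv_by_cause):
    """Sum over people of P(LTD claim) * net present value of the disability,
    where the NPV depends on the likely cause implied by the terminal node.

    people: list of (probability_of_ltd_claim, likely_cause) pairs.
    npv_by_cause: mapping from cause to net present value of the disability.
    """
    return sum(p * npv_by_cause[cause] for p, cause in people)


# Hypothetical net present values by cause of disability.
npv_by_cause = {"musculoskeletal": 150_000.0, "cancer": 300_000.0}
ltd_cost = ltd_group_cost([(0.002, "musculoskeletal"), (0.01, "cancer")],
                          npv_by_cause)
print(round(ltd_cost, 2))
```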
- If a good estimate of the net present value of the future cost or length of the disability is not available for the various terminal nodes, then an index can be calculated. This index is the expected number of new disabilities for the group during the policy period divided by the “average” number of disabilities calculated using standard actuarial techniques for new business for LTD. A confidence interval can be calculated for the expected number of disabilities using the expected probability of disability per person and computing the upper and lower bounds for the group by using a Lexian distribution that calculates the exact probabilities. A binomial distribution can be used but the confidence interval will not be exact since it assumes that everybody has the same average probability within the group. Groups that have a confidence interval that does not cover the “average” calculated from standard actuarial techniques are significantly higher or lower in risk and should be priced differently than the average group. Alternatively and preferably, the group's standard deviation from the mean expected number of LTD cases can be calculated using one of the distributions above. The number of standard deviations from the mean is a scale that can be used for pricing. The end points of the scale can be anchored by market prices for the lowest and highest risk market prices or by actual historical LTD experience, conditionalized on group size.
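The exact probabilities for the number of disabilities in a group can be computed by convolving the person-level probabilities. The sketch below treats the count as a sum of independent yes/no outcomes with person-specific probabilities (often called a Poisson binomial distribution). Whether this corresponds exactly to the Lexian distribution intended above is an assumption, as are the function names and the central interval convention.

```python
import math


def exact_count_pmf(probs):
    """Exact distribution of the number of events among independent people
    whose individual event probabilities differ."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this person has no event
            nxt[k + 1] += mass * p       # this person has an event
        pmf = nxt
    return pmf


def central_interval(pmf, level=0.90):
    """Smallest counts whose cumulative probability reaches each tail bound."""
    tail = (1.0 - level) / 2.0
    cumulative, lower, upper = 0.0, None, None
    for k, mass in enumerate(pmf):
        cumulative += mass
        if lower is None and cumulative >= tail:
            lower = k
        if upper is None and cumulative >= 1.0 - tail:
            upper = k
    if upper is None:                    # guard against rounding in the sum
        upper = len(pmf) - 1
    return lower, upper


def std_devs_from_average(probs, actuarial_average):
    """Distance of the group's expected count from the actuarial average,
    in units of the group's own standard deviation."""
    expected = sum(probs)
    sd = math.sqrt(sum(p * (1.0 - p) for p in probs))
    return (expected - actuarial_average) / sd


pmf = exact_count_pmf([0.1, 0.2])
print([round(x, 2) for x in pmf])  # [0.72, 0.26, 0.02]
```

A group whose interval excludes the actuarial average would be flagged as significantly higher or lower in risk. The relative risk index itself is simply `sum(probs)` divided by that average.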
- The LTD cost forecasting product can be delivered as a service bureau product or as software, either stand-alone or under an ISP model, using the same data flows (with the addition of STD and LTD claims and salary information) as used with the cost forecasting models for fully insured coverage.
- Group term life insurance is very similar to group disability: it is written for a policy period (usually one year), and the coverage and rates are typically not guaranteed beyond that period. Unlike LTD, the death benefit is a one-time payment for a known amount (usually a multiple of salary up to a limit), so there is no uncertainty over the size of the benefit. Therefore, knowing the expected number of deaths (weighted by the amount of the life insurance) will provide an accurate estimate of the cost of that group. Alternatively, a relative risk index can be calculated in the same manner as with LTD. The numerator is the expected number of deaths (possibly weighted by the death benefit) and the denominator is the “average” number of deaths (possibly weighted by the death benefit), where the average is calculated using the age by sex distribution and standard life tables calculated by actuaries. The significance of the index can be calculated using the Lexian (preferably) or binomial distribution for the person-level probabilities and testing whether the average is covered by the confidence bounds for the group. Groups whose confidence bounds do not cover the average should have higher or lower rates than average. Groups with large confidence intervals should be charged more than groups with small confidence intervals, all other factors being equal.
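The benefit-weighted cost and relative risk index described above can be illustrated with a short sketch. The salary multiple, the benefit limit, the age bands and the death rates in `LIFE_TABLE` are hypothetical stand-ins for a carrier's plan design and a standard actuarial life table. The class and function names are likewise illustrative.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Member:
    age_band: str
    sex: str
    salary: float
    death_prob: float  # model-based probability of death in the policy period


# Hypothetical annual death rates by (age band, sex); a real application
# would use a standard life table.
LIFE_TABLE: Dict[Tuple[str, str], float] = {
    ("25-44", "F"): 0.001,
    ("25-44", "M"): 0.002,
    ("45-64", "F"): 0.004,
    ("45-64", "M"): 0.006,
}


def death_benefit(salary: float, multiple: float = 2.0,
                  limit: float = 250_000.0) -> float:
    """Benefit is a multiple of salary, capped at a limit (assumed values)."""
    return min(multiple * salary, limit)


def expected_claims(members: List[Member]) -> float:
    """Expected deaths weighted by each member's death benefit."""
    return sum(m.death_prob * death_benefit(m.salary) for m in members)


def benefit_weighted_index(members: List[Member]) -> float:
    """Model-based expected claims divided by life-table expected claims."""
    baseline = sum(LIFE_TABLE[(m.age_band, m.sex)] * death_benefit(m.salary)
                   for m in members)
    return expected_claims(members) / baseline


group = [
    Member("25-44", "F", 50_000.0, 0.002),
    Member("45-64", "M", 150_000.0, 0.006),
]
print(round(expected_claims(group), 2), round(benefit_weighted_index(group), 4))
```

An index above 1 would indicate a group expected to cost more than its age by sex mix alone suggests; whether the difference is significant depends on the confidence bounds discussed above.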
- The same approach for developing the person-level probability models is used for life insurance as is used for LTD. Medical claims from a base period are linked with deaths occurring in the next period for a very large block of business. The risk factors are the same as, or developed using a technique similar to, those used with the medical cost forecasting models. The dependent variable is the probability of death. The same interaction-capturing techniques used for the LTD probability model are used for the life insurance model (i.e., the preferred embodiment is the hybrid probability tree). The development model is applied to medical claims during an underwriting period and death forecasts are calculated for the policy period. The probability of death is weighted by the death benefit to calculate the forecast claim amount per person. The claim amounts are summed across people in the group. A group-level model can be developed that uses the sum of the probabilities (i.e., the number of expected deaths), the actual number of deaths in the base period, and the number and amount of STD and LTD claims, when available, to supplement the risk factors used in a standard MAP4HIP group-level model. Otherwise, the same medical claims and enrollment information used with MAP4HIP will suffice. The dependent measure is the forecast number of deaths and is weighted by the expected death benefit per person to calculate the forecast claim amount.
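The hybrid arrangement referred to above, in which terminal nodes of a tree feed a logistic regression, can be sketched in miniature. The splits in `terminal_node`, the training records and the plain gradient-ascent fit are all hypothetical simplifications chosen so that the example runs without external libraries; they are not the disclosed development model. With node indicators as the only inputs, the fit simply reproduces each node's observed death rate, and a fuller model would presumably add further risk factors.

```python
import math
from typing import Dict, List, Tuple

N_NODES = 3


def terminal_node(person: Dict[str, float]) -> int:
    """Hand-written stand-in for a fitted probability tree (made-up splits)."""
    if person["chf"] == 1:
        return 0  # congestive heart failure flag set
    if person["age"] >= 55:
        return 1  # older, no CHF
    return 2      # younger, no CHF


def fit_node_logistic(nodes: List[int], died: List[int],
                      steps: int = 2000, rate: float = 0.5) -> List[float]:
    """Logistic regression on one-hot terminal-node indicators,
    fitted by gradient ascent on the log-likelihood."""
    coef = [0.0] * N_NODES
    n = len(nodes)
    for _ in range(steps):
        grad = [0.0] * N_NODES
        for node, y in zip(nodes, died):
            p = 1.0 / (1.0 + math.exp(-coef[node]))
            grad[node] += y - p
        coef = [c + rate * g / n for c, g in zip(coef, grad)]
    return coef


def death_probability(person: Dict[str, float], coef: List[float]) -> float:
    """Score a person from the underwriting period; always within (0, 1)."""
    return 1.0 / (1.0 + math.exp(-coef[terminal_node(person)]))


# Made-up development data: (base-period risk factors, died in next period).
training: List[Tuple[Dict[str, float], int]] = (
    [({"chf": 1, "age": 60}, 1)] * 4 + [({"chf": 1, "age": 60}, 0)] * 6
    + [({"chf": 0, "age": 58}, 1)] * 4 + [({"chf": 0, "age": 58}, 0)] * 16
    + [({"chf": 0, "age": 40}, 1)] * 5 + [({"chf": 0, "age": 40}, 0)] * 45
)
nodes = [terminal_node(person) for person, _ in training]
died = [outcome for _, outcome in training]
coef = fit_node_logistic(nodes, died)
print([round(1.0 / (1.0 + math.exp(-c)), 3) for c in coef])
```

The logistic link keeps every forecast between 0 and 1, which is the property the description requires of any alternative interaction-capturing technique.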
- The group term life insurance death rate and claim amount forecasting products can be delivered as a service bureau product or as software, either stand-alone or under an ISP model, using the same data flows (preferably supplemented with death and salary information) as used with the cost forecasting models for fully insured medical coverage.
- While the present invention has been described with respect to specific embodiments, it will be appreciated that various alternatives and modifications will be apparent based on the present disclosure, and are intended to be within the spirit and scope of the following claims.
APPENDIX G Data Elements & Descriptions For Software Of CD-ROM Appendix Field Names Descriptions Legal Values abdpain Abdominal pain or dxvar = ‘7890’ 1, 0 abheart Abnormal heart sounds or ‘7850’ <= dxvar <= ‘7853’ 1, 0 acne Acne or ‘706’ <= dxvar <= ‘7061’ 1, 0 actinseb Actinic and seborrheic keratosis or ‘702’ <= dxvar <= ‘70219’ 1, 0 acubronc Acute bronchitis and brochiolitis-dx = 466 1, 0 acuphary Acute pharyngitis-dxvar = ‘462’ 1, 0 acusinu Acute sinusitis-dxvar = :‘461’ 1, 0 acutonsl Acute tonsillitis-dxvar = ‘463’ 1, 0 add Attention deficit disorder-dxvar = : ‘3140’ 1, 0 agebrk35 age 35+ 1 (35+), 0 (35 under) agegp 0-0.9 then agegp = ‘a’; 1-4.9 then agegp = ‘b’; agegroups values = a-k 5.0-17.9 then agegp = ‘c’; 18-24.9 then agegp = ‘d’; 25-34.9 then agegp = ‘e’; 35-44.9 then agegp = ‘f’; 45-54.9 then agegp = ‘g’; 55-64.9 then agegp = ‘h’; 65-74.9 then agegp = ‘i’; 75-84.9 then agegp = ‘j’; ge 85 then agegp = ‘k’; agesq Age Squared ahypothy Acquired hypothyroidism-dxvar =: ‘244’ 1, 0 aidstest AIDS-cpt testing codes if cpts{i} in 0, 1, 2 . . . number of (‘86687’, ‘86701’, ‘86702’, ‘86703’, ‘86688’, ‘86689’) then tests aidstest = sum (aidstest, 1); alcohdep Alcohol dependence syndrome-dxvar = :‘303’ 1, 0 alerhin Allergic rhinitis or dxvar = :‘477’ 1, 0 amt generic-test purposes 1, 0 anemia Anemia-‘280’ <= dxvar <= ‘2859’ 1, 0 anginap Angina pectoris or dxvar = :‘413’ 1, 0 antitemp temporary to assist in coding prenatal cpts{i} in (‘59425’, ‘59426’) 1, 0 anxiety Anxiety states-dxvar = :‘3000’ 1, 0 artery Dis of the arteries, arterioles, and capillaries-‘440’ <= dxvar <= ‘4489’ 1, 0 arthero Coronary atherosclerosis or dxvar = :‘4140’ 1, 0 artipost Artificial opening status and oth postsurgical states or ‘V44’ <= dxvar <= 1, 0 ‘V4589’ assault Assault or ‘E960’ <= dxvar <= ‘E969’ 1, 0 asthma Asthma-dxvar = : ‘493’ 1, 0 attsurgd Attention to surgical dressing and sutures or dxvar = ‘V583’ 1, 0 bargain BARGAIN STATUS- ? H, S basecat a thru v-see basecata-basecatv a . 
. . v basecata .0001 <= chgd <= .33999 1, 0 basecatb .34 <= chgd <= .48999 1, 0 basecatc .49 <= chgd <= .70999 1, 0 basecatd .71 <= chgd <= 1.03999 1, 0 basecate 1.04 <= chgd <= 1.4999 1, 0 basecatf 1.5 <= chgd <= 1.99999 1, 0 basecatg 2 <= chgd <= 2.59999 1, 0 basecath 2.6 <= chgd <= 3.44999 1, 0 basecati 3.45 <= chgd <= 4.54999 1, 0 basecatj 4.55 <= chgd <= 5.9999 1, 0 basecatk 6 <= chgd <= 7.89999 1, 0 basecatl 7.9 <= chgd <= 10.44999 1, 0 basecatm 10.45 <= chgd <= 13.7999 1, 0 basecatn 13.8 <= chgd <= 18.19999 1, 0 basecato 18.2 <= chgd <= 23.99999 1, 0 basecatp 24 <= chgd <= 35.99999 1, 0 basecatq 36 <= chgd <= 53.99999 1, 0 basecatr 54 <= chgd <= 80.99999 1, 0 basecats 81 <= chgd <= 121.49999 1, 0 basecatt 121.5000 <= chgd <= 181.99999 1, 0 basecatu 182 <= chgd <= 272.99999 1, 0 basecatv chgd ge 273 1, 0 baseclmn # of claims in base pd baseclms presence of base claims - yes/no 1, 0 basemos # OF MONTHS in base period 1-12 baseyr Year associated with base 2 digit yr - 95, 96, 97, 98 bdate birth date VALID DATE benfitcd BENEFIT CODE 3 = medical claim bn_oth oth benign neoplasm- (‘210’ <= dxvar <= ‘2159’) or (‘217’ <= dxvar <= ‘2299’) 1, 0 bn_skin Benign neoplasm of skin-dxvar=:‘216’ 1, 0 bychggp .0001 <= chgd <= .33999 bychggp = 1; .34 <= chgd <= .48999 bychggp = 2; 1 thru 22 .49 <= chgd <= .70999 bychggp = 3; .71 <= chgd <= 1.03999 bychggp = 4; 1.04 <= chgd <= 1.4999 bychggp = 5; 1.5 <= chgd <= 1.99999 bychggp = 6; 2 <= chgd <= 2.59999 bychggp = 7; 2.6 <= chgd <= 3.44999 bychggp = 8; 3.45 <= chgd <= 4.54999 bychggp = 9; 4.55 <= chgd <= 5.9999 bychggp = 10; 6 <= chgd <= 7.89999 bychggp = 11; 7.9 <= chgd <= 10.44999 bychggp = 12; 10.45 <= chgd <= 13.7999 bychggp = 13; 13.8 <= chgd <= 18.819999 bychggp = 14; 18.2 <= chgd <= 23.99999 bychggp = 15; 24 <= chgd <= 36.99999 bychggp = 16; 36 <= chgd <= 53.99999 bychggp = 17; if 54 <= chgd <= 80.99999 bychggp = 18; if 81 <= chgd <= 121.49999 bychggp = 19; 121.5000 <= chgd <= 181.99999bychggp = 20; 182 <= chgd <= 
272.99999 bychggp = 21; chgd ge 273 bychggp = 22; calckidy Calculus of kidney and ureter 1, 0 cancldte enrl cancel date candidia Candidiasis-dxvar =: ‘112’ 1, 0 carddysr Cardiac dysrhythmias-dxvar = : ‘427’ 1, 0 carpltun Carpal tunnel syndrome-dxvar = ‘3540’ 1, 0 CAT4BASE see cat4base1 - 4-code as 1 thru 4 1, 2, 3, 4 cat4base1 group 1 of 4 set-base yr claim .0001 <= chgd <= 1.49999 1, 0 cat4base2 1.5 <= chgd <= 5.99999 1, 0 cat4base3 6.0 <= chgd <= 23.999999 1, 0 cat4base4 chgd ge 24 1, 0 cataract Cataract-dxvar = : ‘366’ 1, 0 CATBMOS groupings of months in base year (with or without chgs) for an ordered A thru F candidate predictor variable WITH BASE CLAIMS 1 <= basemos <= 3 > catbmos = ‘A: 1-3’; basemos = (4, 5) > catbmos = ‘B: 4-5’; basemos = (6, 7) > catbmos = ‘C: 6-7’; basemos = (8, 9) > catbmos = ‘D: 8-9’; basemos = (10, 11) > catbmos = ‘E: 10-11’; basemos = (12) then catbmos = ‘F: 12’; WITHOUT BASE CLAIMS basemos = 1 > catbmos = ‘A: 1’; basemos = (2, 3, 4, 5) > catbmos = ‘B: 2-5’; basemos = (6, 7, 8) > catbmos = ‘C: 6-8’; basemos = (9) > catbmos = ‘D: 9’; basemos = (10, 11) > catbmos = ‘E: 10-11’; basemos = (12) > catbmos = ‘F: 12’; celluabs Cellulitis and abscess 1, 0 cerebrov Cerebrovascular disease-‘430’ <= dxvar <= ‘4389’ 1, 0 charge CHARGE CHEMO chegroup codeapy combines (lspxchem, mcptchem) 1, 0 chestpn Chest pain or dxvar = :‘7865’ 1, 0 chf Congestive heart failure-dxvar = ‘4280’ 1, 0 chg Charge in Base Year chgage Base yr chg * Age CHGC2NZ base pd 2n chg category chgcata Base charge category A 1, 0 chgd Base Pd Chg per Enrolled Day chgdiff Pred Pd Chg-Base Pd Chg chgl log10 base pd charge chgp Charge in Prediction Year CHGPC2NZ Nxy yr pd 2n chg category chgpcata Next Year charge category A 1, 0 chgpd Pred Pd Chg per Enrolled Day chgpl log10 pred pd charge chgps Spec Pred Pd chg chgpw Pred Pd Chg Winsor $400k, if chgp > 400000 then chgpw = 400000 + (.5 * (chgp − 400000)); chgpwc Pred Charges w/ Claims chgpwd Pred Pd Chg per Enrolled Day Winsor $400k 
chgpwl log10 pred pd winsorized charge chgs if chg >50000 then chgs = chg − 50000; else chgs = 0; chgsq Base yr chg squared chgt1 Charge in Base Year Trimester 1 chgt2 Charge in Base Year Trimester 2 chgt3 Charge in Base Year Trimester 3 chgw Base Pd Chg Winsor $400k chgwc Base Charges w/ Claims chgwd96 Next Pd Chg per Enrolled Day Winsor $96pd DOLLAR SPECIFIC TO SOURCE chgwl log10 base pd winsorized charge chlamyd Unspecified viral and chlamydial infections-dxvar in (‘0799’, ‘07998’, ‘07988’ 1, 0 chronbro Chronic and unspecified bronchiolitis or ‘490’ <= dxvar <= ‘491’ 1, 0 chrsinu Chronic sinusitis or dxvar = : ‘473’ 1, 0 ckasrcl Check Age/Sex/Relation cell, occurs when all sex/relationship variables 1, 0 f/m###en, kd or cl are exhausted claimcde CLAIM CODE REQUIRE SOURCE INPUT claimno CLAIM NBR REQUIRE SOURCE INPUT clmim1 # Claims October 1995 clmim10 # Claims July 1996 clmim16 # Claims January 1997 clmim22 # Claims July 1997 clmim28 # Claims January 1998 clmim34 # Claims July 1998 clmim39 # Claims December 1998 clmim4 # Claims January 1996 cmpms Comps of surg and med care, not elsewhere classified-‘996’ <= dxvar <= 1, 0 ‘9999’ cobgnpd Month # Company Enrollment Starts 1-39 coendpd Month # Company Enrollment Ends 1-39 coins COINSURANCE amount commdsr Potential health hazards related to communicable Dis 1, 0 compcde enrl company REQUIRE SOURCE INPUT compcode COMPANY CODE REQUIRE SOURCE INPUT complic combines (cmpms, dxcomp, gadvmed) 1, 0 compname COMPANY NAME REQUIRE SOURCE INPUT conderm Contact dermatitis and oth eczema 1, 0 conjunct Conjunctivitis-‘3720’ <= dxvar <= ‘3729’ 1, 0 constip Constipation or dxvar = ‘5640’ 1, 0 contus Contusions with intact skin surfaces or ‘920’ <= dxvar <= ‘9249’ 1, 0 convuls Convulsions or dxvar = : ‘7803’ 1, 0 corncal Corns, callosities, and oth hypertrophic and atrophic skin or ‘700’ <= dxvar <= 1, 0 ‘7019’ cough Cough or dxvar = ‘7862’ 1, 0 cpt CPT CODE cutobjs Cutting or piercing instruments or objects or dxvar = ‘E920’ 1, 0 
cycle Pedal cycle, nontraffic and oth or dxvar IN 1, 0 (‘E8003’, ‘E8013’, ‘E8023’, ‘E8043’, ‘E8053’, ‘E8063’, ‘E8073’, ‘E8206’, ‘E8216’, ‘E8226’, ‘E8236’, ‘E8246’, ‘E8256’, ‘E8261’, ‘E8269’) cystbldd Cystitis and oth dsrs of the bladder or ‘595’ <= dxvar <= ‘5969’ 1, 0 cysturin combines(cystbldd, othurin) 1, 0 datechk CHECK DATE REQUIRE SOURCE INPUT datefrom FROM DATE REQUIRE SOURCE INPUT dateproc PROCESS DATE REQUIRE SOURCE INPUT daterpt REPORTED DATE REQUIRE SOURCE INPUT datethru THRU DATE REQUIRE SOURCE INPUT deduct DEDUCTIBLE REQUIRE SOURCE INPUT deltemp cpts{i} in (‘59100’, ‘59830’, ‘59430’) or ‘59120’ <= cpts{i} <= ‘59160’ or ‘59812’ <= 1, 0 cpts{i} <= ‘59821’ or ‘59840’ <= cpts{i} <= ‘59857’ or ‘59400’ <= cpts{i} <= ‘59414’ or ‘59510’ <= cpts{i} <= ‘59525’ or dxs starting with (‘V22’, ‘V23’) depnbr DEP NBR 01 = enrollee depress Major depressive disorder- (‘2962’ <= dxvar <= ‘2963’) 1, 0 dermtosi Dermatophytosis-dxvar =: ‘110’ 1, 0 diab combines (diabmell, dxdiabet) 1, 0 diabmell Diabetes mellitus-dxvar = :‘250’ 1, 0 dial combines (lspxdial, mcptdial) 1, 0 discdsr Intervertebral disc dsrs or dxvar = :‘722’ 1, 0 disstat DISCHARGE STATUS diverint Diverticula of intestine or dxvar = :‘562’ 1, 0 dizzi Dizziness and giddiness or dxvar = ‘7804’ 1, 0 dob date of birth VALID DATE dobpatn PATIENT BIRTH DATE VALID DATE docspec DOCTOR SPECIALITY ABBR REQUIRE SOURCE INPUT doctype DOCTOR TYPE REQUIRE SOURCE INPUT drg DRG specify version drgaltst drug, alcohol, methodone usage tsts (cpt) if (‘80100’ <= cpts{i} <= ‘80103’) or number of tests (cpts{i} eq ‘82055’) or (‘80150’ <= cpts{i} <= ‘80299’) then drgaltst = sum(drgaltst, 1); drugdep Drug dependence and nondependent use of drugs-‘304’ <= dxvar <= ‘3059’ 1, 0 dsranal Anal and rectal Dis or ‘569’ <= dxvar <= ‘56949’ 1, 0 dsrbone dsrs of bone and cartilage or ‘730’ <= dxvar <= ‘73399’ 1, 0 dsrbrst dsrs of breast-‘610’ <= dxvar <= ‘6119’ 1, 0 dsrear dsrs of external ear-dxvar = : ‘380’ 1, 0 dsreyeld dsrs of eyelids-‘373’ <= 
dxvar <= ‘3749’ 1, 0 dsrgallb dsrs of the gallbladder and biliary tract-‘574’ <= dxvar <= ‘5769’ 1, 0 dsrlipid dsrs of lipid metabolism-dxvar = : ‘272’ 1, 0 dsrmens dsrs of menstruation and abnormal bleeding-dxvar = : ‘626’ 1, 0 dsrrefra dsrs of refraction and accommodation-dxvar = : ‘367’ 1, 0 dx1/proc1 ICD-9-CM CODE specify version dx2/proc2 ICD-9-CM CODE 2 specify version dx3/proc3 ICD-9-CM CODE 3 specify version dx4 ICD-9-CM CODE 4 specify version dx5 ICD-9-CM CODE 5 specify version dx6-40 ICD-9-CM Diagnosis (after aggregate) specify version dxabort DX Abortion-630” <= substr (dxvar, 1, 3) <= “639 1, 0 dxblood DX Blood-“280” <= substr (dxvar, 1, 3) <= “289” 1, 0 dxcircul DX Circul System-390” <= substr (dxvar, 1, 3) <= “459 1, 0 dxcomp DX Complications of Care-“996” <= substr (dxvar, 1, 3) <= “999” 1, 0 dxcondtn DX Condn Influence Health Status-V40” <= substr (dxvar,1,3) <= “V49 1, 0 dxcongen DX Congenital Anomaly-740” <= substr (dxvar,1,3) <= “759 1, 0 dxdiabet DX Diabetes-“250” = substr (dxvar,1,3) 1, 0 dxdigest DX Digestive System-520” <= substr (dxvar,1,3) <= “579 1, 0 dxdonor V59” = substr (dxvar,1,3 1, 0 dxecode DX E-Code-“E01” <= substr (dxvar,1,3) <= “E99” 1, 0 dxendocr DX Endocrine, Nutrition, Metabolic-“240” <= substr (dxvar,1,3) <= “249” or 1, 0 “251” <= substr (dxvar,1,3) <= “279” dxgu DX GU System-580” <= substr (dxvar,1,3) <= “629 1, 0 dxinfec DX Infec & Parasite-“001” <= substr (dxvar,1,3) <= “139” 1, 0 dxinjury DX Injury-“800” <= substr (dxvar,1,3) <= “959” or 1, 0 “980” <= substr (dxvar,1,3) <= “959” dxlvebrn DX Liveborn-V30” <= substr (dxvar,1,3) <= “V39 1, 0 dxmental DX Mental-“290” <= substr (dxvar,1,3) <= “319” 1, 0 dxmgest DX Multiple Gestation-“651” = substr (dxvar,1,3) 1, 0 dxmskel DX Musculoskel & connect tiss-710” <= substr (dxvar,1,3) <= “739 1, 0 dxneoben DX Neoplasm Benign-210” <= substr (dxvar,1,3) <= “229 1, 0 dxneomal DX Neoplasm Malig-“140” <= substr (dxvar,1,3) <= “209” 1, 0 dxnerves DX Nervous System-“320” <= substr 
(dxvar,1,3) <= “359” 1, 0 dxob DX Preg, Childbirth, Puerp-630” <= substr (dxvar,1,3) <= “677 1, 0 dxperhis DX Personal History-dxvar: V10-V19 1, 0 dxperntl DX Perinatal-760” <= substr (dxvar,1,3) <= “779 1, 0 dxpoison DX Poisoning 1, 0 dxpreg DX Pregnancy-640” <= substr (dxvar,1,3) <= “649” 1, 0 or“652” <= substr (dxvar,1,3) <= “667” dxpregv DX Pregnancy V-Code-V20” <= substr (dxvar,1,3) <= “V29 1, 0 dxresp DX Resp System-460” <= substr (dxvar,1,3) <= “519 1, 0 dxsense 360” <= substr (dxvar,1,3) <= “389 dxskin DX Skin & Subcut-680” <= substr (dxvar,1,3) <= “709 1, 0 dxspecpx DX Spec Procs & Aftercare-V50” <= substr (dxvar,1,3) <= “V58 1, 0 dxsymptm DX Symptoms, Signs, & III Defined-“780” <= substr (dxvar,1,3) <= “799” 1, 0 dxvaccin DX Disease Contact or Vaccine 1, 0 dxvgnldl DX Normal Delivery-“650” = substr (dxvar,1,3) 1, 0 dysp_pul combines (cyspnea, othopd) 1, 0 dyspnea Dyspnea and respiratory abnormalities-dxvar= :‘7860’ 1, 0 effdte enrl eff date VALID DATE encoconr Encounter for contraceptive management-dxvar= :‘V25’ 1, 0 ENRLADDR1 address 1 CONFIDENTIAL enRLADDR2 Address 2 CONFIDENTIAL enrlarea AREA CODE CONFIDENTIAL enrlcity city CONFIDENTIAL enrlm1 Enrolled October 1995 1, 0 enrlm10 Enrolled July 1996 1, 0 enrlm16 Enrolled January 1997 1, 0 enrlm22 Enrolled July 1997 1, 0 enrlm28 Enrolled January 1998 1, 0 enrlm34 Enrolled July 1998 1, 0 enrlm39 Enrolled December 1998 1, 0 enrlm4 Enrolled January 1996 1, 0 enrlphne phone number CONFIDENTIAL enrlst state REQUIRE SOURCE INPUT enrollee Person is Enrollee 1, 0 enrrelfm enrollee relationship enrrells ensagenc Age at end of year 0 Code, .<age < 1 then ensagenc = “<1”; SEE DESCRIPTION 1 <= age < 5 then ensagenc = “01-05”; 5 <= age < 18 then ensagenc = “05-18”; 18 <= age < 25 then ensagenc = “18-25”; 25 <= age < 45 then ensagenc = “25-45”; 45 <= age < 65 then ensagenc = “45-65”; 65 <= age < 80 then ensagenc = “65-80”; 80 <= age then ensagenc = “80+”; ensxkd enrrells = 1 & ensex = M > ensxkd = A; enrrells = 1 & 
ensex = F > ensxkd = B; A thru F enrrells = 2 & ensex = M > ensxkd = C; enrrells = 2 & ensex = F > ensxkd = D; enrrells = (3, 4, 5, 6) > ensxkd = E; else ensxkd = F; entrost comines(artipost, lspxentr, lspxgast) 1, 0 epistax Epistaxis dxvar = 7847 1, 0 esopha Esophagitis dxvar = 5301 1, 0 esshyp Essential hypertension-dxvar= :‘401’ 1, 0 excamt1 EXCLUSION AMT 1 excamt2 EXCLUSION AMT 2 excamt3 EXCLUSION AMT 3 excamt4 EXCLUSION AMT 4 exccatg1 EXCLUSION CATG 1-CATEGORY DEF - 1-coverage inelig, 2-medical 18-Jan necessity, 3-n/a, 4-deductibles, 5-coins, 6-cob, 7-medicare, 8- contract max, 9-dupicate, 10-n/a, 11-non-cov, 12-copay, 13- flexplan, 14-n/a, 15-exceeds sched, 16-alt proc, 17-panel contract fee, 18-n/a exccatg2 EXCLUSION CATG 2 - see description of catg 1 see exccatg1 desc exccatg3 EXCLUSION CATG 3 - see description of catg 1 see exccatg1 desc exccatg4 EXCLUSION CATG 4 - see description of catg 1 see exccatg1 desc exchg2a 2nd Highest month chg ADJacent to 1st’ exchg2b 2nd Highest month chg NOT ADJacent to 1st’ exclh1 Base Year Highest Monthly Pymt Per Day’ exclh1ch Base Year Highest Monthly chg Per Day’ exclh2a Baseyr ‘2nd Highest Monthly Pymt ADJacent to 1st’ exclh2b Baseyr ‘2nd Highest Monthly Pymt NOT ADJacent to 1st’ eyemix combines(cataract, lensrepl, retinldt, scpteye) 1, 0 f0105kd Female 01-05 Child 1, 0 f0518kd Female 05-18 Child 1, 0 f1825en Female 18-25 Enrollee 1, 0 f1825sp Female 18-25 Spouse 1, 0 f1865kd Female 18-65 Child 1, 0 f2545en Female 25-45 Enrollee 1, 0 f2545sp Female 25-45 Spouse 1, 0 f4565en Female 45-65 Enrollee 1, 0 f4565sp Female 45-65 Spouse 1, 0 f4580ss Female 45-80 Widow 1, 0 f6580en Female 65-80 Enrollee 1, 0 f6580sp Female 65-80 Spouse 1, 0 f80pen Female 80+ Enrollee 1, 0 f80psp Female 80+ Spouse, Widow 1, 0 fall Falls 1, 0 fam1p1c Family is 1 Par 1 Child 1, 0 fam1p2cp Family is 1 Parent 2+ Children 1, 0 fam2p1cp Family is 2 Parents 1+ Children 1, 0 famcoup Family is Couple 1, 0 famdau Daughter in Family 1, 0 famempo Family 
Employee Only 1, 0 famenr Enrollee in Family 1, 0 famlst trimn(famlst)||enrrells; 1, 0 famnkid # of Kids per Enrollee 1, 0 famofem Oth Female in Family 1, 0 famomal Oth Male in Family 1, 0 famsdau Step Daughter in Family 1, 0 famsize # Covered Lives Per Enrollee COUNT famson Son in Family 1, 0 famspse Spouse in Family 1, 0 famsson Step Son in Family 1, 0 famsurv Surviving Spouse in Family; 1, 0 firearm Firearm missile 1, 0 firestem Fire, flames, hot sub, object, caustic, corrosive, steam 1, 0 flt1kd Female < 1 Child 1, 0 followup Follow-up examination dxvar =: V67 1, 0 frachand Fracture of hand and fingers dxvar =: (814-8171) 1, 0 fracllim Fracture of lower limb dxvar =: (820-8291) 1, 0 fracoth oth fractures dxvar = 800-81259 or 818-8191 1, 0 fracrad Fracture of radius and ulna dxvar =: 813 1, 0 fracskul Intracranial injury, excluding those with skull fracture-‘850’ <= dxvar <= ‘8541’ 1, 0 gadvmed Adverse effects of medical treatment dxvar = E870-E879, E930-E9499 1, 0 gasthemm Gastrointestinal hemorrhage dxvar =:578 1, 0 gastri Gastritis and duodenitis dxvar = 535 1, 0 gblood Dis of the blood and blood-forming organ-‘280’ <= dxvar <= ‘2899’-group code 1, 0 of anemia, othblood gcircul Dis of the circulatory system-‘390’ <= dxvar <= ‘4599’-group code of anginap, 1, 0 arthero, othische, carddysr, chf, othheart, esshyp, cerebrov, artery, hermorrh, othcirc gconanom Congenital anomalies dxvar = 740-7599 1, 0 gdigest Dis of the digestive system dxvar = 520-5799 1, 0 gendo Endocrine, nutril and metab Dis, and immunity dsrs-‘240’ <= dxvar <= ‘2799’- 1, 0 group code code-ahypothy, othhyr, diabmell, dsrlipid, obesity, othendo genmedex General medical examination dxvar =: V70 1, 0 ggenito Dis of the genitourinary system - GROUP OF 1, 0 OTHURIN, CALCIDY, CYSTBLDD, HYPROS, INFLFEML, OTHNOLE, DSRBRST, NINFFEM DSRMENS gibluc combines(gasthemm, stomulcr) 1, 0 gihs Suppl classif of factors influ hlth stat & contact w hlth se dxvar = V01-V829 1, 0 ginfect Infectious and parasitic 
Dis-‘001’ <= dxvar <= ‘1398’ group code of 1, 0 strep, hivinfec, virlwart, chlamyd, dermtosi, candidia & ohtinfs ginjpoi Injury and poisoning group code fracrad frachand fracllim fracoth sprnwrst 1, 0 sprnkne sprnankl sprnneck sprnobk sprnostr fracskul owndhd owndhnd othopnwd suprcorn othspin contus oinjury poison unspex cmpms ginjudet Injuries of undetermined intent no sub dxvar = E980-E989 1, 0 gintinj Intentional injuries - group code assault, selfinfl, voilenc, 1, 0 glacoma Glaucoma-dxvar =: ‘365’ 1, 0 gmentl Mental dsrs-‘290’ <= dxvar <= ‘319’ - group code of schizo, depress, othpsycy, 1, 0 anxiety, neurotic, alcohdep, drugdep, stress, othdepr, add & othmentl gmuscu Dis of the musculoskeletal system and connective tissue dxvar = 710-7399 1, 0 gneoplsm Neoplasm-‘140’ <= dxvar <= ‘2399’ - group code of mn_coln, mn_skin, mn_brst, 1, 0 mn_pros, mn_lymp, mn_oth, & secondary neo's bn_skin, bn_oth, neounsp gnervous Dis of the nervous system and sense organs-‘320’ <= dxvar <= ‘3899’ group 1, 0 code of migraine, othcentr, carpltun, othnerv, retinldt, glacoma, cataract, dsrrefra, conjunct, dsreyeld, otheye, dsrear, otitismd, othear gperi Certain cond originating in the perinatal period NO SUB CATS-760-7799 1, 0 gpregn Comps of pregnancy, childbirth, and the puerperium NO SUB CATS 1, 0 dxvar = 630-677 gpsorias Psoriasis and similar dsrs group code of oinfskin, corncal, actinseb, acne, 1, 0 sepacyst, urticari, osksub gresp Dis of the respiratory system-‘460’ <= dxvar <= ‘5199’ - group code of 1, 0 acusinu, acuphary, acutonsl, acubronc, othacres, chrsinu, alerhin, chronbro, asthmas, othopd, othresp gskin Dis of the skin and subcutaneous tissue group code - celluabs, oiskin, 1, 0 conderm gsymsig Symptons, Signs, and III-defined cond group code - syncope, convuls, dizzi, 1, 0 pyrexi, suminteg, headach epistax, abheart, dyspnea, cough, chestpn, sympurin, abdpain, othssil gunint Unintentional injuries group code - fall, mototraf, struck, overext, cutobjs, 1, 0 natenvr, 
poisdrg, firestem, machinr, cycle, mototra, othtran, firearm, othclas, mechunsp gynexam Gynecological examination-dxvar = ‘V723’ 1, 0 hchc hcpcs CODES 1, 0 headach Headache-dxvar = ‘7840’ 1, 0 hemat combines (anemia, dxblood, acutonsl, gblood, othblood) 1, 0 hermorrh Hemorrhoids-dxvar =: ‘455’ 1, 0 herniabd Hernia of abdominal cavity-‘550’ <= dxvar <= ‘5539’ 1, 0 Hi1dvby The index of Highest cost per day divided by Average cost per day per month Hi2dvby The index of 2nd Highest cost per day divide by Average cost per day per month Hibych2a (1, 0) 1 = The second highest month cost per day is adjacent to the first month Hibymos1 The maximum cost per day for any month cost for the base year Hibych2b (1, 0) 1 = The second highest month cost per day is not adjacent to the first month Hibymos2 The 2nd Highest cost per day for any month for the base year hilo classify high cost nxt yr cases based on charges 0-low <96, 1-High ge 96 hilopay classify high cost nxt yr cases based on payments 0-low <68.5, 1-High ge 68.5 hivinfec HIV infection-dx starting w/042 1, 0 hspatri1 Hosp Admit in Trimes 1 COUNT hspatri2 Hosp Admit in Trimes 2 COUNT hspatri3 Hasp Admit in Trimes 3 COUNT hsptlos Total Hospital LOS DAYS hsptlosc Total Hospital LOS Category hyprpros Hyperplasia of prostate-dxvar = ‘600’ 1, 0 icu_etc combines(lspxvein, lspxvent, mcptccth, mcptintr, pcptcrit, pulart) 1, 0 infertil Any mention of infertility male or female (cpt) or dxvar in: (‘628’, ‘606’) 1, 0 inflfeml Inflammatory dsrs of female pelvic organs-‘614’ <= dxvar <= ‘6169’ 1, 0 irratcol Irritable colon-dxvar = ‘5641’ 1.0 itemno ITEM NBR jntdsrs Derangements and oth and unspecified joint dsrs-‘717’ <= dxvar <= ‘7199’ 1, 0 kid1_3 Count of the Number of Children in a family. 
0 = no children, 1, 2 or 3 or more 0-3 children lensrepl Lens replaced by pseudophakos-dxvar = ‘V431’ 1, 0 locatnme LOCATION NAME CONFIDENTIAL locatno LOCATION CONFIDENTIAL logi combines (constip, diverint, othdiges) 1, 0 lspxampu Life PX Amputation cpts{i} in 1, 0 (‘23900’, ‘23920’, ‘24900’, ‘25900’, ‘25927’, ‘27295’, ‘27590’, ‘27591’, ‘27592’, ‘27596’, 1, 0 ‘27598’, ‘27880’, ‘27881’, ‘27882’, ‘27886’, ‘27888’, ‘27889’, ‘28880’, ‘28805’) lspxchem Life PX Chegroup codeapy-cpts{i} in 1, 0 (’96400’, ‘96408’, ‘96410’, ‘96412’, ‘96414’, ‘96420’, ‘96422’, ‘96423’, ‘96425’, ‘96445’, ‘96450’, ‘96520’) lspxdial Life PX Dialysis cpts in ‘90935’, ‘90937’, ‘90945’, ‘90947’ 1, 0 lspxentr Life PX Enterostomy-cpts{i} in 1, 0 (‘44300’, ‘44310’, ‘44312’, ‘44314’, ‘44316’, ‘44320’, ‘44322’, ‘44340’, ‘44345’, ‘44346’) lspxgast Life PX Gastrostomy cpts in ‘3750’, ‘43760’, ‘43830’, ‘43832’ 1, 0 lspxorgn Life PX Major Organ Transplants-cpts{i} in (‘33935’, ‘33945’, ‘47135’, ‘40260’) 1, 0 lspxradt Life PX Radiation Therapy-if cpts{i} 1, 0 in ‘77261 ‘77263’, ‘77280’, ‘77285’, ‘77290’, ‘77295’, ‘77299’, ‘77300’, ‘77305’, ‘77310’, ‘77315’, ‘77321’, ‘77326’, ‘77327-8, ‘77331’, ‘77336’, ‘77370’, ‘77399’, ‘77401- 4’, ‘77406-9’, ‘77411-4’, ‘77416-20’, ‘77425’, ‘77430- 2’, ‘77470’, ‘77499’, ‘77000’, ‘77605’, ‘77610’, ‘77615’, ‘77620’, ‘77750’, ‘77761’- 3’, ‘77776-8’, ‘77781-4’, ‘77789’, ‘77790’, ‘77799’ lspxtrch Life PX Tracheostomy-cpts{i} in (‘31600’, ‘31603’, ‘31610’) 1, 0 lspxvein Life PX Venous Access Port-cpts{i} in (‘36495’, ‘36496’) 1, 0 lspxvent Life PX Intubation/Ventilation-cpts in (‘31500’, ‘94650’, ‘94651’, ‘94656’, ‘94657’ 1, 0 lumbago Lumbago dxvar = ‘7242’ 1, 0 m0105kd Male 01-15 Child 1, 0 m0518kd Male 05-18 Child 1, 0 m1825en Male 18-25 Enrollee 1, 0 m1845sp Male 18-45 Spouse 1, 0 m1865kd Male 18-65 Child 1, 0 m2545en Male 25-45 Enrollee 1, 0 m4565en Male 45-65 Enrollee 1, 0 m4565sp Male 45-65 Spouse 1, 0 m6580en Male 65-80 Enrollee 1, 0 m6580sp Male 65-80 Spouse 1, 0 m80pen 
Male 80+ Enrollee 1, 0 m80psp Male 80+ Spouse, 65+ Widower 1, 0 machinr Machinery-dxvar =: ‘E919’ 1, 0 male 1, 0 mcptallr Med CPT Allergy, “95004” <= cpts{i} <= “95199” 1, 0 mcptcard Med CPT Cardiogr, “93000” <= cpts{i} <= “93350” 1, 0 mcptcarv Med CPT CardVascThor, “92950” <= cpts{i} <= “92996” 1, 0 mcptccth Med CPT CardCath, “93501” <=cpts{i} <= “93572” 1, 0 mcptchem Med CPT Chegroup code, “96400” <= cpts{i} <= “96549” 1, 0 mcptcns Med CPT CNS, “96100” <= cpts{i} <= “96117” 1, 0 mcptderm Med CPT Dermatology, “96900” <= cpts{i} <= “96999” 1, 0 mcptdial Med CPT Dialysis, “90918” <= cpts{i} <= “90999” 1, 0 mcptent Med CPT ENT, “92502” <= cpts{i} <= “92599” 1, 0 mcptintr Med CPT IntraCard, “93600” <= cpts{i} <= “93660” 1, 0 mcptneur Med CPT Neurology, “95805” <= cpts{i} <= “95975” 1, 0 mcptopth Med CPT Opthalm, “92002” <= cpts{i} <= “92499” 1, 0 mcptoste Med CPT OsteoPath, “98926” <= cpts{i} <= “98929” 1, 0 mcptphys Med CPT PhysTher, “97010” <= cpts{i} <= “97999” 1, 0 mcptpsy Med CPT Psych, “90801” <= cpts{i} <= “90899” 1, 0 mcptpulm Med CPT Pulmon, “94010” <= cpts{i} <= “94799” 1, 0 mcptvasc Med CPT VascStudy, “93875” <= cpts{i} <= “93980” 1, 0 mechunsp Mechanism unspecified 1, 0 menopa Menopausal and postmenopausal dsrs dxvar=:627 1, 0 migraine Migraine-dxvar = : ‘346’ 1, 0 misc_hrt combines (arthero, carddysr, mcptccth, mcptintr, othische) 1, 0 mlt1kd Male < 1 Child 1, 0 mn_brst Malignant neoplasm of breast-‘174’ <= dxvar <= ‘1759’) or (dxvar = ‘19881’) 1, 0 mn_coln Malignant neoplasm of colon and rectum-(‘153’ <= dxvar <= ‘1548’) or 1, 0 (dxvar = ‘1975’) mn_lymp Malignant neoplasm of lymphatic and hematopoietic tissue-dxvar in (‘1765’, 1, 0 ‘1969’)) or (‘200’ <= dxvar <= ‘20891’) mn_oth oth malignant neoplasm-(‘140’<= dxvar <= ‘1529’) or (‘155’-‘1719’) or(‘1761’-‘1764’) 1, 0 or (‘1766’-‘1849’) or (‘186’-‘1958’) or (‘197’-‘1974’) or (‘1976’-‘1981’) or (‘1983’-‘1987’) or (‘19882’-‘1991’) or (‘230’-‘2349’) or dxvar = ‘1988’ mn_pros Malignant neoplasm of 
prostate-dxvar = ‘185’ 1, 0 mn_skin Malignant neoplasm of skin-(‘172’ <= dxvar <= ‘1739’) or 1, 0 (dxvar in (‘1760’, ‘1982’)) MOSA thru F dummies for CATBMOS 1, 0 motontra Motor vehicle, nontraffic- 1, 0 dx(‘E8200’, ‘E8210’, ‘E8220’, ‘E8230’, ‘E8240’, ‘E8250’, ‘E8205’, ‘E8215’, ‘E8225’, ‘E8235’, ‘E8245’, ‘E8255’, ‘E8207’, ‘E8217’, ‘E8227’, ‘E8237’, ‘E8247’, ‘E8257’, ‘E8209’, ‘E8219’, ‘E8229’, ‘E8239’, ‘E8249’, ‘E8259’) mototraf Motor vehicle, traffic-‘E810’ <= dxvar <= ‘E8199’ 1, 0 mrh1drg 1st Most Recent Hosp DRG DRG mrh1los 1st Most Recent Hosp LOS DAYS mrh1mdc 1st Most Recent Hosp MDC MDC mrh1ms 1st Most Recent Hosp Medsurg Medical surgical indicator mrh2drg 2nd Most Recent Hosp DRG DRG mrh2los 2nd Most Recent Hosp LOS DAYS mrh2mdc 2nd Most Recent Hosp MDC MDC mrh2ms 2nd Most Recent Hosp Medsurg Medical surgical indicator mrh3drg 3rd Most Recent Hosp DRG DRG mrh3los 3rd Most Recent Hosp LOS DAYS mrh3mdc 3rd Most Recent Hosp MDC MDC mrh3ms 3rd Most Recent Hosp Medsurg Medical surgical indicator mxchgtri replaces chgt1-t3 and uses index 1, 2, 3 mylagi Myalgia and myositis, unspecified-dxvar = ‘7291’ 1, 0 namefir FIRST NAME confidential namelast LAST NAME confidential namemidl MIDDLE INITIAL confidential natenvr Natural and environmental factors-(‘E900’ <= dxvar <= ‘E9099’) or (‘E9280’ <= 1, 0 dxvar <= ‘E9282’) ncpt9xc # of 9xxxx cpts in year category count ncpt9xxx # of 9xxxx cpts in year count neounsp Neop of uncertain behavior and unspec nature-‘235’ <= dxvar <= ‘2399’ 1, 0 nervsys combines (gneoplsm, othcentr) 1, 0 netwkcd NETWORK CODE confidential netwknme NETWORK NAME confidential neurotic Neurotic depression-dxvar = ‘3004’ 1, 0 newchg10 Mean of BaseChg months minus 2 highest months' newpay10 Mean of BasePay months minus 2 highest months' nhosps # of hosp visits count nhospsc # of hosp visits Category ninenter Noninfectious enteritis and colitis ‘555’ <= dxvar <= ‘5589’ 1, 0 ninffem Noninflammatory dsrs of female genital organs dxvar = 622-6249 1, 0 nobasepy 
‘basechg without payment’ 1, 0 noclaims No Claims in Base or Study Period 1, 0 normpreg Normal pregnancy 1, 0 numagegp 0 <= ensagen <= 0.9 numagegp = 1; 1 <= ensagen <= 4.9 numagegp = 2; 1 thru 11 5.0 <= ensagen <= 17.9 numagegp = 3; 18 <= ensagen <= 24.9 numagegp = 4; 25 <= ensagen <= 34.9 numagegp = 5; 35 <= ensagen <= 44.9 numagegp = 6; 45 <= ensagen <= 54.9 numagegp = 7; 55 <= ensagen <= 64.9 numagegp = 8; 65 <= ensagen <= 74.9 numagegp = 9; 75 <= ensagen <= 84.9 numagegp = 10; ensagen ge 85 numagegp = 11; obesity Obesity-dxvar = : ‘2780’ 1, 0 obseval Observation and evaluation for suspected cond not found dxvar = : v71 1, 0 oinfskn other inflammatory condition of skin and subcutaneous tissue dxvar = 690-6918, 1, 0 693-6959, 697-6989 oinjury oth injuries 1, 0 oiskin oth infection of the skin and subcutaneous tissue 1, 0 omusccn oth Dis of the muscutoskeletal system and connective tissue dxvar-734-7399 1, 0 osksub oth dsrs of the skin and subcutaneous tissue-dxvar: 7028, 709, 703-7059, 1, 0 7063-7079 ostealld Osteoarthrosis and allied dsrs-dxvar: 715 1, 0 othacres oth acute respiratory infections-(dxvar = ‘460’) or (‘464’ <= dxvar <= 1, 0 ‘4659’) otharth oth arthropathies and related dsrs-dxvar 710-7138, 7141-7149, :716 1, 0 othblood oth Dis of the blood and blood-forming organs-‘286’ <= dxvar <= ‘2899’ 1, 0 othcentr oth dsrs of the central nervous system-(‘320’ <= dxvar <= ‘326’) or (‘330’ <= 1, 0 dxvar <= ‘3379’) or (‘340’ <= dxvar <=‘3459’) or (‘347’ <= dxvar <= ‘3499’) othcirc oth Dis of the circulatory system-(dxvar IN (‘390’, ‘3929’, ‘403’, ‘405’, ‘417’)) 1, 0 or (‘451’-‘4549’) or (‘456’-‘4599’) othclas oth and not elsewhere classified-dxvar E925-E9269, E988, E9290-E929, E925-E9269, 1, 0 E9288, E9290-E929 othdepr Depressive reaction, not elsewhere classified-dxvar = ‘311’ 1, 0 othdiges oth Dis of the digestive system-DXVAR 526-5300, 5302-5309, 536-5439, 5642-5649, 1, 0 567-5689, 5695-5739, :(560, 577, 579) othdorso oth dorsopathies-DXVAR 720-72191, 
723-7241, 7243-7249 1, 0 othear oth Dis of the ear and mastoid process-‘383’ <= dxvar <= ‘3899’ 1, 0 othendo oth endocrine, nutrit and metabolic Dis, and immunity dsrs- 1, 0 (‘251’ <= dxvar <= ‘2719’) or (‘273’ <= dxvar <= ‘2779’) or (‘2781’<= dxvar <= ‘27903’) otheye oth dsrs of the eye and adnexa-(dxvar = :‘360’) or (‘363’-‘3649’)or (‘368’-‘3699’) 1, 0 or (‘370’-‘3719’) or (‘3724’-‘3729’) or (‘375’-‘3799’) otheye (dxvar = :‘360’) or (‘363’ <= dxvar <= ‘3649’) or (‘368’ <= dxvar <= 1, 0 ‘3699’) or (‘370’ <= dxvar <= ‘3719’) (‘3724’ <= dxvar <= ‘3729’) or (‘375’ <= dxvar <= ‘3799’) othfeml oth dsrs of the female genital tract-DXVAR 617-6199, 621, 625 628, 629 1, 0 othhealt oth factors influencing hlth stat and contact with hlth serv-DXVAR V200-201, 1, 0 :V21, V290-V430, V432-V389, V46-V669, V68-V699, V720-V722, V724-V829 othheart oth heart disease-(‘391’ <= dxvar <= ‘3920’) or (‘393’-‘39899’) or (dxvar IN 1, 0 :(‘402’, ‘404’)) or (‘415’-‘4169’)or (‘420’-‘4269’) or (‘4281’-‘4299’) othinfs oth infectious and parasitic disease-(‘001’ <= dxvar <= ‘0339’) or (‘0341’-‘0419’) 1, 0 or (‘045’-‘0780’) or (‘0782’-‘07981’) or (-‘07999’) or (‘080’-‘1049’) or (dxvar= :‘111’) or (‘114’-‘1398’) othische oth ischemic heart disease-DXVAR 410-412, 4141-4149 1, 0 othmale oth dsrs of male genital organs-DXVAR 601-6089 1, 0 othmentl oth mental dsrs-(‘312’ <= dxvar <= ‘3139’) or (‘3141’-‘319’) or (‘3001’-‘3003’) 1, 0 or (‘3005’-‘3009’) or (‘301’-‘3026’) or (‘306’-‘3079’) or (dxvar =: ‘310’) othnerv oth dsrs of the nervous system-(‘350’ <= dxvar <= ‘3539’) or (‘3541’ <= 1, 0 dxvar <= ‘3599’) othopd oth COPD and allied cond-DXVAR 492, 494-496 1, 0 othopnwd oth open wound-DXVAR 874-8812, 884-8977 1, 0 othpsych oth psychoses-(‘290’ <= dxvar <= ‘2949’) or (‘2960’ <= dxvar <= ‘2961’) 1, 0 or (‘2964’ <= dxvar <= ‘2999’) othrepro oth encounter related to reproduction-V23-V242, V26-V289 1, 0 othresp oth Dis of the respiratory system-470-4722, 474-4761, 478, 4780, 4781, :487, 1, 0 500-5199 
othrhexb oth rheumatism, excluding back-DXVAR 725, 7271-7279, :728, :7290, 7292-7299 1, 0 othspin oth superficial injury-DXVAR 910-9180, 9182-9199 1, 0 othssil oth symptoms, signs, and ill-defined cond-DXVAR 7800-7801, 1, 0 :(7805, 781, 783, 7861), 7807-7809, 7841-78469, 7848-7849, 7854-7859, 7863-7864, 7866-78799, 7891-7999 oththyr oth dsrs of the thyroid gland-(‘240’ <= dxvar <= ‘243’) or 1, 0 (‘245’ <= dxvar <= ‘2469’) othtran oth transportation dxvar FOR E800X-E807X WHEN X EQ 0, 2, 8 OR 9 1, 0 othtype Other genetic typing tsts for transplants cpts{i} in 1, 0 (‘86805’, ‘86806’, ‘86807’, ‘86808’, ‘86821’, ‘86822’, ‘86849’) othurin oth Dis of the urinary system-580-5899, 590-591, 593-5949, 597-5989, 5991-5999 1, 0 otitismd Otitis media and Eustachian tube dsrs-‘381’ <= dxvar <= ‘3829’ 1, 0 ounspex oth and unspecified effects of external causes DXVAR 990-99589 1, 0 overext Overexertion and strenuous movements DXVAR E927 1, 0 owndhd Open wound of head-DXVAR 870-8739 1, 0 owndhnd Open wound of hand and fingers DXVAR 882-8832 1, 0 pay Payment in Base Year payment PAYMENT AMT payp Payment in Prediction Year pbynoby ‘prebase, base 1, 0 pbyothr ‘prebase, base >0’ 1, 0 pcptborn CPT Place Newborn-99431” <= cpts{i} <= “99490 1, 0 pcptcons CPT Place Consult-99241” <= cpts{i} <= “99275 1, 0 pcptcrit CPT Place Critical Care-99291” <= cpts{i} <= “99292 1, 0 pcpter CPT Place ER 99281” <= cpts{i} <= “99288 1, 0 pcpthome CPT Place Home-99341” <= cpts{i} <= “99353 1, 0 pcpthosp CPT Place Hosp-99217” <= cpts{i} <= “99238 1, 0 pcptnicu CPT Place Neon ICU-99295” <= cpts{i} <= “99298 1, 0 pcptnurs CPT Place Nurs Facil 99301” <= cpts{i} <= “99313 1, 0 pcptoff CPT Place Office“99201” <= cpts{i} <= “99215” 1, 0 pcptoltf CPT Place Oth LTCF-99321” <= cpts{i} <= “99333 1, 0 pcptpmed CPT Place Prev Med-99381” <= cpts{i} <= “99429 1, 0 penvasc combines (artery, mcptvasc, othcirc) 1, 0 periph Peripheral enthesopathies and allied dsrs-DXVAR: 726 1, 0 pershyst Potential health hazards related to 
personal and family hist-DXVAR V10-V198 1, 0 pharclms Pharmacy Claims count planname PLAN NAME confidential planno SERIAL-PLAN NBR confidential pmtchg base & (pmt/basechg ge .2) as 1 1, 0 pneumon Pneumonia-DXVAR 480-486 1, 0 poisdrg Psning drugs, med subst, biolog, oth solid, liqd, gases, vapor 1, 0 poison Poisonings-DXVAR 960-9899 1, 0 postpart Postpartum care and examination-DXVAR: V24 1, 0 prenatal Undelivered Pregnancy-Prenatal care 1, 0 prgage age It35 (1), 0 1, 0 provlocn PROVIDER LOCATION confidential provname PROVIDER NAME confidential provnetw PROVIDER NETWORK confidential provno PROVIDER NBR confidential provst PROVIDER STATE confidential provtype PROVIDER TYPE confidential pulart pulmonary artery cath placement cpts{i} eq ‘93503’ 1, 0 pyrexi Pyrexia of unknown origin: 7806 1, 0 rad combines (Ispxradt, radther, radnuc) 1, 0 radnuc Nuclear Medicine cpts{i} starting with (‘78’, ‘79’) 1, 0 radther Any radiation therapy cpts{i} starting with ‘77’ 1, 0 relation RELATIONSHIP 1-9 ‘1’ = ‘A Enrollee’ ‘2’ = ‘B Spouse’ ‘3’ = ‘C Son’ ‘4’ = ‘D Daughter’ ‘5’ = ‘E Stepson’ ‘6’ = ‘F Stepdaughter’ ‘7’ = ‘G Other Male’ ‘8’ = ‘H Other Female’ ‘9’ = ‘I Surv Spouse’ retinldt Retinal detachment and oth retinal dsrs-‘361’ <= dxvar <= ‘3629’ 1, 0 rheuarth Rheumatoid arthritis-DXVAR 7140 1, 0 routchk Routine infant or child health checks-DXVAR V202 1, 0 schizo Schizophrenic dsrs-dxvar = ‘295’ 1, 0 scptaudi Surg CPT Auditory, “69” = substr(cpts{i}, 1, 2) 1, 0 scptbaby Surg CPT Matern, “59” = substr(cpts{i}, 1, 2) 1, 0 scptcard Surg CPT Card Vasc, “33” <= substr(cpts{i}, 1, 2) <= “37” 1, 0 scptdgst Surg CPT Digest, “40” <= substr(cpts{i}, 1, 2) <= “49” 1, 0 scptdiap Surg CPT MED & Diaphr, “39” = substr(cpts{i}, 1, 2) 1, 0 scptendo Surg CPT Endocrine, “60” = substr(cpts{i}, 1, 2) 1, 0 scpteye Surg CPT Eye, “65” <= substr(cpts{i}, 1, 2) <= “68” 1, 0 scptfem Surg CPT Lap/Perit/Hyst Female Genital, “56” <= substr(cpts{i}, 1, 2) <= “58” 1, 0 scpthern Surg CPT Hernia & Lymph, “38” = 
substr(cpts{i}, 1, 2) 1, 0 scptmale Surg CPT Male Genital, “54” <= substr(cpts{i}, 1, 2) <= “55” 1, 0 scptmskl Surg CPT Muscular-Skeleton, “20” <= substr(cpts{i}, 1, 2) <= “29” 1, 0 scptnrve Surg CPT Nerve, “61” <= substr(cpts{i}, 1, 2) <= “64” 1, 0 scotresp Surg CPT Respiratory, “30” <= substr(cpts{i}, 1, 2) <= “32” 1, 0 scptskin Surg CPT Integument, “10” <= substr(cpts{i}, 1, 2) <= “19” 1, 0 scpturin Surg CPT Urinary, “50” <= substr(cpts{i}, 1, 2) <= “53” 1, 0 selfinfl Self-inflicted-dxvar e950-e959 1, 0 selmalig combines(mn_coln, mn_lymph, mn_oth, mn_pros) 1, 0 sepacyst Sebaceous cyst-dxvar 7062 1, 0 servamb Serv Locn Ambulance-servlocn = “11” 1, 0 servasrg Serv Locn Ambul Surg-servlocn = “16” 1, 0 servecc Serv Locn EM Care Ctr-servlocn = “09” 1, 0 servehsp Serv Locn Emerg Hosp-servlocn = “07” 1, 0 servhmhl Serv Locn Home Hlth servlocn = “12” 1, 0 servhome Serv Locn Home-servlocn = “04” 1, 0 serviane Serv Locn Inpat Anes-servlocn = “15” 1, 0 servih Serv Locn Inpat Hosp-servlocn = “01” 1, 0 servilab Serv Locn Indep Lab-servlocn = “08” 1, 0 servlocn SERVICE LOCATION 00-16 servnurs Serv Locn Nurs Home-servlocn = “05” 1, 0 servoane Serv Locn Outpat Anes-servlocn = “14” 1, 0 servoff Serv Locn Office-servlocn = “03” 1, 0 servoh Serv Locn Outpat Hosp-servlocn = “02” 1, 0 servothl Serv Locn Other locn-servlocn = “10” 1, 0 servphar Serv Locn Pharmacy-servlocn = “13” 1, 0 servsnf Serv Locn SNF-servlocn = “06” 1, 0 servtype SERV TYPE list provided by source sex sex 1, 2, 9 sexpatn PATIENT SEX 1, 2, 9 sprnankl Sprains and strains of ankle dxvar-: 8450 1, 0 sprnkne Sprains and strains of knee and leg dxvar-: 844 1, 0 sprnneck Sprains and strains of neck-dxvar 8470 1, 0 sprnobk oth sprains and strains of back dxvar: 846, 8471-8479 1, 0 sprnostr oth sprains and strains nos-840-8419, :(843, 8451, 848, :842) 1, 0 sprnwrst Sprains and strains of wrist and hand 1, 0 sqech1 ‘square high chg’ 1, 0 sqech2a ‘Square adj chg’ 1, 0 sqech2b ‘Square not ADJacent chg’ 1, 0 sqexc2a ‘Square 
adj pay’ 1, 0 sqexc2b Square not ADJacent pay’ 1, 0 sqexch1 ‘square high pay’ 1, 0 sqnewchg ‘Square 10mos Bchg’ 1, 0 sqnewpy ‘Square 10mos Bpay’ 1, 0 ssn Enrollee SS number 1, 0 statact Enrollment Type Active, status = ‘00’ 1, 0 statcobr Enrollment Type Cobra, status = 15, 16, 17, 18, 19 1, 0 statlife Enrollment Type Life Only 1, 0 statltd Enrollment Type LTD, status = 50 1, 0 statmult Enrollment Type Multiple - if 1, 0 sum(statact,statss,statpens,statltd,statcobr,statlife)>1 then statmult = 1; statpens Enrollment Type Pensioner-status = 10 1, 0 statss Enrollment Type Surv Spouse-status = 01 1, 0 status STATUS, ‘00 Active’, ‘01 Surv Spouse’, ‘10 Pensioner’, ‘12 LTD’, ‘15 Cobra’, ‘16 Cobra’, ‘17 Cobra’, ‘18 Cobra’, ‘19 Cobra’, ‘50 Life Only’ stomulcr Ulcer of stomach and small intestine-531-5349 1, 0 strep Streptococcal sore throat-dxo340 1, 0 stress Acute reaction to stress and adjustment reaction-‘308’ <= dxvar <= ‘3099’ 1, 0 struck Striking against or struck accidentally by objects or person 1, 0 suprcorn Superficial injury of cornea-dxvar 9181 surgpath surgical path levels 4, 5, 6 cpt in (‘88305’, ‘88307’, ‘88309’ 1, 0 syminteg Symptoms involving skin and oth integumentary tissue dxvar: 782 1, 0 sympurin Symptoms involving urinary systemdxvar: 788 1, 0 syncope Syncope and collapse dxvar 7802 1, 0 synovit Synovitis and tenosynovitis dxvar: 7270 1, 0 teeth Dis of the teeth and supporting structures-dxvar 520-5259 1, 0 temppace Temporary pacer placement cpts{i} in (‘33210’, ‘33211’) 1, 0 tenmoch Average from the sum of all months in the base year excluding the 2 highest months per day transfus Transfusion Medical ‘86850’ <= cpts{i} <= ‘86999’ 1, 0 trantype Transplant donor and genetic typing ‘86812’ <= cpts{i} <= ‘86817’ 1, 0 units UNITS COUNT urticari Urticaria dxvar: 708 1, 0 uti_unsp Urinary tract infection, site not specified dxvar 5990 1, 0 violenc oth causes of violence dxvar E970-E978, E990-E999 1, 0 virlwart Viral warts-dxvar =: ‘0781’ 1, 0 wbasechg 
‘basechg present’ 1, 0 wbasepy ‘basepmt present’ 1, 0 zipenrl ENROLLEE ZIP CODE 5 digit zipprov PROVIDER ZIP CODE 5 digit
Confidential information may be encrypted to protect the identity and privacy of individuals
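As an illustration of how one derived variable in the table above could be computed, the following is a minimal sketch of the numagegp age-band assignment. The band boundaries are taken from the numagegp row; the function layout, the use of a sorted lookup, and the treatment of ages that fall between listed bands (for example 0.95, which is placed in the lower band) are assumptions made for this example.

```python
from bisect import bisect_right

# Lower bounds (in years) of age groups 2 through 11, as listed for numagegp.
_LOWER_BOUNDS = [1, 5, 18, 25, 35, 45, 55, 65, 75, 85]


def numagegp(ensagen: float) -> int:
    """Return the age-group index (1 thru 11) for an age in years.

    Ages below 1 map to group 1 and ages of 85 or more map to group 11.
    Ages that fall between two listed bands stay in the lower band
    (an assumption; the table does not say how such gaps are handled).
    """
    if ensagen < 0:
        raise ValueError("age must be non-negative")
    # bisect_right counts how many lower bounds are <= the age.
    return bisect_right(_LOWER_BOUNDS, ensagen) + 1
```

For example, under these assumptions an enrollee aged 42.3 falls in group 6 (35 to 44.9).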
Claims (80)
1. A computer-implemented process of developing a person-level cost model for forecasting future costs attributable to claims from members of a book of business, where person-level data regarding actual base period health care claims are available for a substantial portion of the members of the book of business for an actual underwriting period, and the forecast of interest (i.e., future claim amount) is for an actual policy period which can be, but is not necessarily contiguous with the actual underwriting period, comprising the steps of:
providing development universe data comprising person-level enrollment data, historical base period health care claims data and historical next period claim amount data for a statistically meaningful number of individuals, where the person-level data on a health care claim comprises at least a claim code and a claim amount;
providing at least one claim-based risk factor for each historical base period claim based on the claim code associated with the health care claim and providing at least one enrollment-based risk factor based on the enrollment data; and
developing a cost forecasting model by capturing the predictive ability of the main effects and interactions of claim based risk factors and enrollment-based risk factors, with the development universe data through the application of an interaction capturing technique to the development universe data.
2. The computer-implemented process of claim 1 , wherein the interaction capturing technique is selected from the group consisting of median regression tree techniques, least square regression tree techniques, rule induction techniques, ordinary least squares regression techniques, median regression techniques, robust regression techniques, genetic algorithms, clustering techniques and neural network techniques.
3. The computer implemented process of claim 1 wherein the person-level next period cost forecasts are adjusted by modifying the extant cost forecast by the expected cost trend.
4. The computer implemented process of claim 1 wherein the data from the claims used as predictors consist essentially of the claim- and enrollment-based risk factors and the claim amount is a standardized cost of services provided and the model is used to allocate prospective payments to health care providers.
5. The computer implemented process of claim 1 wherein the data used from the claims data consist essentially of the claim code and selected mandatory procedures and the claim amount is a standardized cost of services provided during the same time period as the base period and the model is used to evaluate the efficiency of health care providers.
6. The computer implemented process of claim 1 , further comprising a computer implemented process of forecasting future claim amounts attributable to claims from members of a book of business for an actual policy period, wherein the model development universe comprises data from the members of a book of business to be insured, further comprising:
applying the cost-forecasting model to the actual underwriting period person-level data of each of the members of the book of business to generate a person-level actual policy period cost forecast for each member of the book of business; and
producing a group-level forecast for the actual underwriting period from the person-level forecasts of each member of the group by totaling the person-level actual policy period cost forecasts for the group for the policy period.
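The two steps recited in claim 6 (apply the model to each member, then total by group) can be sketched as follows. This is only an illustration: the record fields `group_id` and `base_cost`, the function `forecast_groups`, and the toy model are hypothetical names that do not appear in the claims, and any fitted cost-forecasting model could be passed in place of the toy one.

```python
from collections import defaultdict


def forecast_groups(model, members):
    """Total person-level forecasts into one forecast per group.

    model   -- any callable mapping a member record to a forecast cost
    members -- iterable of dicts, each with a 'group_id' key (hypothetical layout)
    """
    totals = defaultdict(float)
    for member in members:
        # Person-level forecast for the policy period.
        person_forecast = model(member)
        # Group-level forecast is the sum of its members' forecasts.
        totals[member["group_id"]] += person_forecast
    return dict(totals)


def toy_model(member):
    """Stand-in model for illustration only: flat amount plus half of base-period cost."""
    return 1000.0 + 0.5 * member["base_cost"]


members = [
    {"group_id": "A", "base_cost": 2000.0},
    {"group_id": "A", "base_cost": 0.0},
    {"group_id": "B", "base_cost": 400.0},
]
group_totals = forecast_groups(toy_model, members)
```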
7. The computer implemented process of claim 6 , comprising in addition the step of: setting insurance reserves based on group-level forecast for the actual policy period, wherein the policy period is a reserving period for claims that have not occurred or that have occurred but not been reported.
8. The computer implemented process of claim 6 , wherein claim amounts are a mix of fee for service payments and capitation payments so that the base and underwriting periods risk factors are appended to include dummy variables for the presence of capitation payments by provider type and the cost estimate in the next and policy periods is the fee for service cost that must be supplemented with the expected capitation payments.
9. The computer implemented process of claim 6 , wherein the cost forecast is produced for first-dollar health insurance.
10. The computer implemented process of claim 6 , wherein the cost forecast is produced for specific plus aggregate stop loss health insurance.
11. The computer implemented process of claim 10 , wherein the cost forecast produced is for aggregate-only stop loss health insurance.
12. The computer implemented process of claim 10 , wherein the cost forecast produced is for specific stop loss health insurance.
13. The computer implemented process of claim 1 , wherein each of the diagnosis and CPT based risk factors is independent of the sequence in time of the other diagnosis and CPT based risk factors.
14. The computer implemented process of claim 1 , wherein the providing of risk factors for the health care claim data is substantially free of human expert interaction.
15. The computer implemented process of claim 1 , wherein capturing the predictive ability of the main effects and interactions of claim based risk factors and enrollment-based risk factors is substantially free of human expert interaction.
16. The computer implemented process of claim 1 , comprising in addition the step of: setting medical insurance reserves through application of the health care cost forecasting model, wherein the next period is a reserving period for claim amounts that have not occurred or that have occurred but not been reported.
17. The computer implemented process of claim 1 for forecasting short term disability (STD) costs wherein a dependent measure for generating the cost forecasting model is the number of STD days in the policy period and is weighted by the expected cost per day for the STD to produce the person-level forecast STD costs and summed across the group to produce the group's forecast STD cost.
18. The computer implemented process of claim 1 , for forecasting a probability of long term disability (LTD) claims wherein a dependent measure for generating the cost forecasting model is the probability of a LTD claim in the policy period where the probability is weighted by the net present value of the LTD claim amount and comprises in addition producing person-level expected LTD costs and summing person-level expected LTD costs across the group to produce a group's expected LTD cost.
19. The computer implemented process of claim 1 for forecasting group term life insurance costs wherein a dependent measure for generating the forecasting model is the expected probability of death weighted by the amount of life insurance to produce the person-level expected term life insurance cost which is summed across the group to produce the group's expected term life insurance cost.
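Claims 17 to 19 each weight a forecast dependent measure by a monetary amount and then sum across the group. A minimal sketch of that arithmetic follows; the function names and the example figures are invented for illustration, and the inputs (forecast days, probabilities, net present values) are assumed to come from a model developed as in claim 1.

```python
def expected_std_cost(forecast_std_days, cost_per_day):
    """Claim 17: forecast STD days weighted by the expected cost per day."""
    return forecast_std_days * cost_per_day


def expected_ltd_cost(p_ltd_claim, ltd_net_present_value):
    """Claim 18: probability of an LTD claim weighted by the claim's net present value."""
    return p_ltd_claim * ltd_net_present_value


def expected_life_cost(p_death, insured_amount):
    """Claim 19: expected probability of death weighted by the amount of life insurance."""
    return p_death * insured_amount


def group_expected_cost(person_level_costs):
    """Sum person-level expected costs to obtain the group's expected cost."""
    return sum(person_level_costs)


# Illustrative figures only (not taken from the patent).
life_costs = [
    expected_life_cost(0.002, 50_000.0),
    expected_life_cost(0.010, 100_000.0),
]
group_life_cost = group_expected_cost(life_costs)
```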
20. The computer implemented process of claim 1 , wherein claim amounts are a mix of fee for service payments and capitation payments so that the base and underwriting periods risk factors are appended to include dummy variables for the presence of capitation payments by provider type.
21. A computer-implemented process of developing a hybrid person-level health care claim cost forecasting model for forecasting future medical costs attributable to health care claims from members of a book of business, where person-level data are available for a substantial portion of the members of the book of business, comprising the steps of:
providing development universe data comprising person-level data for a statistically meaningful number of individuals, the person-level data comprising continuous variable data and categorical variable data;
processing first the continuous variable data for each individual with a continuous processing technique that captures the predictive ability of main effects and interactions of continuous variables to generate a person-level continuous variable model; and
processing the categorical variable data for each individual including the output from the continuous processing technique with a categorical processing technique that captures the predictive ability of main effects and interactions of categorical variables to generate a person-level categorical variable model;
wherein the person-level continuous variable model and person-level categorical variable model together comprise a hybrid person-level health care claim amount forecasting model.
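The two-stage structure of claim 21 can be sketched with deliberately simple stand-ins. Here the continuous processing technique is a one-variable least-squares line, and the categorical processing technique is a per-category mean of the residuals left by that line (a crude substitute for the regression-tree or rule-induction techniques named in claim 25). Both choices, and all names in the code, are assumptions made for this example rather than the method of the patent.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_hybrid(rows):
    """rows: list of (continuous_value, category, next_period_cost)."""
    # Stage 1: process the continuous variable first.
    slope, intercept = fit_line([r[0] for r in rows], [r[2] for r in rows])

    # Stage 2: process the categorical variable together with the
    # output of stage 1, here by averaging stage-1 residuals per category.
    sums, counts = defaultdict(float), defaultdict(int)
    for x, category, y in rows:
        sums[category] += y - (slope * x + intercept)
        counts[category] += 1
    adjustment = {c: sums[c] / counts[c] for c in sums}

    def predict(x, category):
        # Unseen categories receive no adjustment.
        return slope * x + intercept + adjustment.get(category, 0.0)

    return predict


rows = [(x, c, 2 * x + 10 + (5 if c == "a" else -5))
        for x in (1, 2, 3, 4) for c in ("a", "b")]
predict = fit_hybrid(rows)
```

In this toy data set the continuous stage recovers the underlying line and the categorical stage recovers the offset for each category, so the combined forecast reproduces the generating rule.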
22. The computer-implemented process of claim 21 , wherein the continuous variable data comprises data selected from the group consisting of age, length of prior enrollment, historical claim amounts and transformations and trends in the person level claim amounts.
23. The computer-implemented process of claim 21 , wherein the categorical variable data comprises data selected from the group consisting of clinical risk factors, provider type and site of care.
24. The computer-implemented process of claim 21 , wherein the continuous processing technique is selected from the group consisting of regression techniques and neural network techniques.
25. The computer-implemented process of claim 21 , wherein the categorical processing technique is selected from the group consisting of median regression tree techniques, least square regression tree techniques, rule induction techniques, and neural network techniques.
26. The computer-implemented process of claim 21 , wherein the person-level data is available for a substantial portion of the members of the book of business for an actual underwriting period, and the claim amount of interest for forecasting purposes are during an actual policy period which can be, but is not necessarily contiguous with the actual underwriting period, and the development universe data comprises person-level data for each individual for a historical base period and a historical next period.
27. The computer-implemented process of claim 21 , wherein the hybrid person-level health care claim cost forecasting model is used as an input into an interaction capturing technique that uses all of the risk factors that were meaningful in the hybrid person-level health care claim cost forecasting model to forecast future medical claim amounts.
28. A computer-implemented process of developing a claim amount forecasting model for use in forecasting the future claim amount for members of a book of business, where person-level data are available for a substantial portion of the members of the book of business for an actual base period, and the claim amount of interest for forecasting purposes is an actual next period which can be, but is not necessarily contiguous with the actual base period, comprising the steps of:
processing the base period data having claims to generate a having-claims claim amount forecasting model; and
processing the base period data without claims to generate a without-claims claim amount forecasting model,
wherein the having-claims cost forecasting model and the without-claims forecasting model comprise a claim amount forecasting model.
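A minimal sketch of the split in claim 28 follows: members are divided by whether they have base period claims, a separate model is fitted to each segment, and a forecast is routed to the matching model. The two segment models used here (a cost ratio for members having claims, a flat mean for members without) are placeholders chosen for brevity, and the names are hypothetical.

```python
from statistics import mean


def fit_two_part(rows):
    """rows: list of (base_period_cost, next_period_cost) per person."""
    having = [(base, nxt) for base, nxt in rows if base > 0]
    without = [nxt for base, nxt in rows if base == 0]

    # Having-claims model (placeholder): next-period cost per unit of base cost.
    ratio = sum(nxt for _, nxt in having) / sum(base for base, _ in having)
    # Without-claims model (placeholder): mean next-period cost.
    flat = mean(without)

    def predict(base_cost):
        # Route each person to the model for their segment.
        return ratio * base_cost if base_cost > 0 else flat

    return predict


predict_two_part = fit_two_part(
    [(100.0, 150.0), (300.0, 450.0), (0.0, 20.0), (0.0, 40.0)]
)
```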
29. A computer-implemented process of developing a health care claim amount forecasting model for use in forecasting the future medical claim amount for members of a book of business, where person-level data are available for a substantial portion of the members of the book of business for an actual base period, and the claim amount of interest for forecasting purposes is an actual next period which can be, but is not necessarily contiguous with the actual base period, comprising the steps of:
providing development universe data comprising person-level data for a statistically meaningful plurality of individuals, wherein the person-level data for an individual comprises health care claims data for the individual and the data on a health care claim comprises at least a claim amount and a claim code;
Winsorizing the person-level data to yield inlier data and outlier data;
processing the inlier data to generate an inlier cost forecasting model; and
processing the outlier data to generate an outlier cost forecasting model;
wherein the combination of the results of the inlier and outlier cost forecasting models together produce a person-level claim amount forecast model.
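One way to read the Winsorizing step of claim 29 is as a cap on each person's cost: the capped amount is the inlier portion and any excess over the cap is the outlier portion. The sketch below follows that reading, which is an assumption; the claim does not specify the cap or how the two results are combined. Simple means stand in for the two forecasting models, and the combination shown is their sum.

```python
from statistics import mean


def winsorize_split(costs, cap):
    """Split each cost into a capped (inlier) part and an excess (outlier) part."""
    inlier = [min(c, cap) for c in costs]
    outlier = [max(c - cap, 0.0) for c in costs]
    return inlier, outlier


def fit_split_forecast(next_period_costs, cap):
    """Fit placeholder models to each part and combine them by addition."""
    inlier, outlier = winsorize_split(next_period_costs, cap)
    inlier_forecast = mean(inlier)      # stand-in for the inlier model
    outlier_forecast = mean(outlier)    # stand-in for the outlier model
    return inlier_forecast, outlier_forecast, inlier_forecast + outlier_forecast


costs = [100.0, 200.0, 5000.0]
parts = winsorize_split(costs, 1000.0)
_, _, combined = fit_split_forecast(costs, 1000.0)
```

Because the two parts add back to the original cost for every person, the combined mean forecast in this toy case equals the mean of the uncapped costs; the benefit of the split in practice would come from fitting each part with a model suited to its distribution.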
30. The computer-implemented process of claim 29 further comprising:
Winsorizing the inlier data to yield inlier data having claims and inlier data without claims;
processing the inlier data having claims to generate an inlier-having-claims claim amount forecasting model; and
processing the inlier data without claims to generate an inlier-without-claims claim amount forecasting model,
wherein the inlier-having-claims cost forecasting model and the inlier-without-claims forecasting model comprise an inlier claim amount forecasting model.
31. A computer-implemented process of forecasting a claim amount attributable to claims from members of a book of business during an actual policy period, comprising the steps of:
providing person-level data, comprising enrollment data for members of a book of business to be insured for an actual underwriting period that can be, but is not necessarily, contiguous with the actual policy period;
providing a model development universe of person-level data, comprising enrollment data from the historical base period and historical next period health care claims data for a statistically meaningful number of individuals;
providing enrollment-based risk factors for each historical base period and providing next period claim amounts;
developing a health care cost-forecasting model for the enrollment data by capturing the predictive ability of main effects and interactions of enrollment-based risk factors through the application of an interaction capturing technique to the model development universe;
applying the health care cost-forecasting model to the person-level underwriting period enrollment data of each of the members of the book of business to generate a person-level expected cost forecast for the policy period for each member of the book of business; and
producing a group-level forecast for the expected cost of the policy period from the person-level forecasts of each person of the group by totaling the person-level expected cost forecasts for the actual policy period.
32. A computer-implemented process of forecasting costs attributable to claims from members of a book of business during an actual policy period, comprising the steps of:
providing person-level data, comprising enrollment data and actual underwriting period health care claims data, for members of a book of business, where the person-level data on a health care claim comprises at least a claim amount and a claim code and the actual underwriting period can be, but is not necessarily, contiguous with the actual policy period;
providing a model development universe of person-level data, comprising enrollment data, historical base period health care claims data and historical next period claim amount data for a statistically meaningful number of individuals, where the person-level data on a base period health care claim includes at least a claim amount and a claim code;
providing claim-based risk factors for each historical base period based on the claim code associated with the health care claim and providing at least one enrollment risk factor based on the enrollment data;
developing a cost-forecasting model by capturing the predictive ability of main effects and interactions of risk factors through the application of an interaction capturing technique to the model development universe;
applying the cost-forecasting model to the person-level data of each of the individuals or members of a group to generate a person-level actual policy period expected cost forecast for each member of the group; and
producing a group-level forecast for the actual policy period from the person-level forecasts of each individual or member of the group by totaling the person-level cost forecasts for the actual policy period.
33. The computer implemented process of claim 32 , comprising in addition the step of: setting claim amount reserves based on the individual or group-level forecast, wherein the next period is a reserving period for claims that have not occurred or that have occurred but not been reported.
34. The computer implemented process of claim 32 for forecasting short term disability costs wherein the interaction capturing technique uses a dependent measure from the next period and policy period comprising the number of STD days in the policy period and weights the dependent measure by the expected cost per day for the STD to produce the person-level expected STD costs and summed across the group to produce the group's expected STD cost.
35. The computer implemented process of claim 32 , for forecasting a probability of long term disability (LTD) claims wherein a dependent measure for generating the cost forecasting model is the probability of a LTD claim in the policy period where the probability is weighted by the net present value of the LTD and applying the cost forecasting model to the person-level data produces person-level expected LTD costs, wherein summing the person-level expected LTD costs across the group produces a group's expected LTD cost for an actual policy period.
36. The computer implemented process of claim 32 , wherein the cost forecast is produced for first-dollar health insurance.
37. The computer implemented process of claim 32 , wherein the cost forecast is produced for specific plus aggregate stop loss health insurance.
38. The computer implemented process of claim 32 , wherein the cost forecast produced is for aggregate-only stop loss health insurance.
39. The computer implemented process of claim 32 , wherein the cost forecast produced is for specific stop loss health insurance.
40. The computer implemented process of claim 32 for forecasting group term life insurance costs wherein a dependent measure for generating the cost forecasting model is the expected probability of death weighted by the amount of life insurance to produce the person-level expected term life insurance cost which is summed across the group to produce the group's expected term life insurance cost.
41. The computer implemented process of claim 32 , wherein claim amounts are a mix of fee for service payments and capitation payments so that the base and underwriting periods risk factors are appended to include dummy variables for the presence of capitation payments by provider type and the cost estimate in the next and policy periods is the fee for service cost that must be supplemented with the expected capitation payments.
42. The process of claim 32 further comprising developing group-level cost-forecasting model for groups in the book of business by capturing the predictive ability of main effects and interactions of group-level risk factors which include but are not limited to groups historical claim amounts, group-level sum of the person-level forecasts, SIC code or industry type, characteristics of the benefit plan design, geographic locale, and number of people and length of time covered by the insurance through the application of an interaction capturing technique to the model development universe of groups.
43. The computer implemented process of claim 42 , comprising in addition the step of: setting medical insurance reserves based on the group-level forecast, wherein the next period is a reserving period for claims that have not occurred or that have occurred but not been reported.
44. The computer implemented process of claim 42 for forecasting short term disability costs wherein the interaction capturing technique uses a group-level dependent measure of residual STD days at the group level to calculate forecast STD costs by weighting by the group's expected STD cost per day.
45. The computer implemented process of claim 42 , wherein medical claim amounts are a mix of fee for service payments and capitation payments so that the base and underwriting periods group-level risk factors are appended to include dummy variables for the presence of capitation payments by provider type and the cost estimate in the next and policy periods is the fee for service cost that must be supplemented with the expected capitation payments.
46. The process of claim 32 comprising in addition the steps of:
providing a provider type cost trend forecast adjustment to be utilized by at least one member of the group to be insured;
adjusting the person-level next period cost forecast for each member using the health care provider type with the provider type cost trend forecast adjustment.
47. An automated system for forecasting future costs attributable to claims from members of a book of business during an actual policy period comprising:
a central processing unit;
an insured person database, accessible by the processor, wherein the database comprises person-level enrollment data and actual underwriting period health care claims data, for members of a book of business to be insured, where the person-level data on a health care claim comprises at least a claim amount and a claim code;
a model development universe database, accessible by the processor, wherein the second database comprises model development universe of person-level data, comprising enrollment data, historical base period health care claims data and historical next period claim amount data for a statistically meaningful number of individuals, where the person-level data on the base period health care claim includes at least a claim amount and a claim code;
a risk factor encoder, accessible by the processor, wherein the risk factor encoder encodes claim-based risk factors for each historical base period based on the claim code associated with the health care claim and the risk factor encoder encodes at least one enrollment risk factor based on the enrollment data;
a model generator, accessible by the processor, that generates a cost-forecasting model by capturing the predictive capacity of the main effects and the interaction of the risk factors assigned by the risk factor encoder to forecast the historical next period of the model development universe data using the historical base period data;
a person-level cost generator that applies the cost-forecasting model to the person-level actual underwriting period health care claims data of each of the members of the book of business to generate a person-level actual policy period claim amount forecast for each member of the book of business; and
an actual policy period group-level cost forecast generator that totals the person-level actual next period forecasts for each member of the group to generate an actual policy period group-level cost forecast.
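The data flow recited in claim 47 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes ordinary least squares as the interaction capturing technique, pairwise products as the interaction terms, and a toy universe with two binary risk factors. The function names (`design_matrix`, `fit_cost_model`, `forecast_group_cost`) and all dollar figures are hypothetical.

```python
import numpy as np


def design_matrix(risk_factors: np.ndarray) -> np.ndarray:
    """Build columns for an intercept, each main effect, and each pairwise interaction."""
    n, k = risk_factors.shape
    cols = [np.ones(n)]
    cols.extend(risk_factors[:, j] for j in range(k))
    for i in range(k):
        for j in range(i + 1, k):
            cols.append(risk_factors[:, i] * risk_factors[:, j])
    return np.column_stack(cols)


def fit_cost_model(base_factors: np.ndarray, next_period_cost: np.ndarray) -> np.ndarray:
    """Model generator: regress historical next period cost on base period risk factors."""
    coef, *_ = np.linalg.lstsq(design_matrix(base_factors), next_period_cost, rcond=None)
    return coef


def forecast_group_cost(coef: np.ndarray, member_factors: np.ndarray):
    """Person-level cost generator followed by the group-level total."""
    person_level = design_matrix(member_factors) @ coef
    return person_level, float(person_level.sum())


# Toy model development universe (assumed data): two binary risk factors,
# with cost = 1000 + 2000*a + 500*b + 3000*a*b.
universe = np.array([[0, 0], [1, 0], [0, 1], [1, 1]] * 2, dtype=float)
cost = (1000 + 2000 * universe[:, 0] + 500 * universe[:, 1]
        + 3000 * universe[:, 0] * universe[:, 1])

coef = fit_cost_model(universe, cost)

# Underwriting period risk factors for three members of the group to be insured.
members = np.array([[1, 1], [0, 0], [1, 0]], dtype=float)
person_forecasts, group_forecast = forecast_group_cost(coef, members)
print(person_forecasts, group_forecast)
```

In this sketch the interaction column lets the model assign the member with both risk factors a cost above the sum of the two main effects, which is the behavior the claim attributes to capturing interactions.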
48. The system of claim 47 wherein the model generator captures the predictive ability of main effects and interactions of group-level risk factors which include but are not limited to the group's historical claim amounts, group-level sum of the person-level forecasts, SIC code or industry type, characteristics of the benefit plan design, geographic locale, and the number of people and length of time covered by the insurance through the application of an interaction capturing technique to the model development universe of groups.
49. A computer-implemented process of forecasting costs attributable to claims from members of a book of business during an actual policy period, comprising the steps of:
means for providing person-level data, comprising enrollment data and actual underwriting period health care claims data, for members of a book of business, where the person-level data on a health care claim comprises at least a claim amount and a claim code and the actual underwriting period can be, but is not necessarily, contiguous with the actual policy period;
means for providing a model development universe of person-level data, comprising enrollment data, historical base period health care claims data and historical next period claim amount data for a statistically meaningful number of individuals, where the person-level data on a base period health care claim includes at least a claim amount and a claim code;
means for providing claim-based risk factors for each historical base period based on the claim code associated with the health care claim and providing at least one enrollment risk factor based on the enrollment data;
means for developing a cost-forecasting model by capturing the predictive ability of main effects and interactions of risk factors through the application of an interaction capturing technique to the model development universe;
means for applying the cost-forecasting model to the person-level data of each of the individuals or members of a group to generate a person-level actual policy period expected cost forecast for each member of the group; and
means for producing a group-level forecast for the actual policy period from the person-level forecasts of each individual or member of the group by totaling the person-level cost forecasts for the actual policy period.
50. The system recited in claim 49 wherein the system further is automated such that when actual underwriting period data is provided the system automatically provides an actual policy period claim amount forecast.
51. The system recited in claim 49 for use by a client having data and an Internet client application, further comprising an Internet server application such that when the client provides actual underwriting period data to the Internet server application, the Internet server application automatically provides an actual policy period claim amount forecast.
52. A group insurance product comprising:
an identification of the types of benefits which are agreed to be provided by an insurer to or on behalf of members of a group, which will be incurred by members of said group during a future time period; and
a stated monetary insurance premium including a forecast of said benefits made in accordance with the process of claim 32, estimated costs of administering the insurance product, and optionally, an estimated profit,
whereby an insurer agrees to cover the identified benefits in exchange for the payment of the stated monetary insurance premium.
53. The group health insurance product of claim 52 for insuring short term disability costs wherein the interaction capturing technique uses a dependent measure from the next period and policy period comprising the number of STD days in the policy period and weights the dependent measure by the expected cost per day for the STD to produce the person-level expected STD costs and summed across the group to produce the group's expected STD cost.
54. The group health insurance product of claim 52 for insuring long term disability (LTD) claims wherein a dependent measure for generating the claim amount forecasting model is the probability of a LTD claim in the policy period where the probability is weighted by the net present value of the LTD and applying the cost forecasting model to the person-level data produces person-level expected LTD costs wherein summing the person-level expected LTD costs across the group to produce a group's expected LTD cost for an actual policy period.
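The weighting described in claim 54 amounts to multiplying each member's forecast probability of an LTD claim by the net present value of that claim and summing across the group. The sketch below illustrates that arithmetic only. The net present value helper, its monthly discounting convention, and every number are assumptions made for illustration; the claim does not specify how the net present value is computed.

```python
def net_present_value(monthly_benefit: float, months: int, annual_discount_rate: float) -> float:
    """Discount a level monthly benefit stream back to the start of the policy period.

    Assumption: benefits are paid at the end of each month and discounted at a
    constant rate. The claim itself does not prescribe this convention.
    """
    monthly_rate = (1.0 + annual_discount_rate) ** (1.0 / 12.0) - 1.0
    return sum(monthly_benefit / (1.0 + monthly_rate) ** t for t in range(1, months + 1))


def group_expected_ltd_cost(members) -> float:
    """Sum person-level expected LTD costs: P(LTD claim) weighted by the claim's NPV."""
    return sum(probability * npv for probability, npv in members)


# Hypothetical members: (forecast probability of an LTD claim, NPV of that claim).
group = [
    (0.01, net_present_value(1000.0, 12, 0.0)),
    (0.05, 20000.0),
]
print(group_expected_ltd_cost(group))
```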
55. The group health insurance product of claim 52, wherein the cost forecast is produced for first-dollar health insurance.
56. The group health insurance product of claim 52, wherein the cost forecast is produced for specific plus aggregate stop loss health insurance.
57. The group health insurance product of claim 52, wherein the cost forecast produced is for aggregate-only stop loss health insurance.
58. The group health insurance product of claim 52, wherein the cost forecast produced is for specific stop loss health insurance.
59. The group health insurance product of claim 52 for insuring group term life insurance costs wherein a dependent measure for generating the cost forecasting model is the expected probability of death weighted by the amount of life insurance to produce the person-level expected term life insurance cost.
60. The group health insurance product of claim 52, comprising a renewal product, wherein the model development universe comprises data from the members of a group in the book of business to be insured.
61. A method of reserving for the group health insurance product of claim 48, comprising in addition the step of: setting insurance reserves based on the renewal group-level forecast for the actual underwriting period, wherein the next period is a reserving period for claims that have not occurred or that have occurred but not been reported.
62. A method of pricing group insurance including a cost of future benefits according to the computer-implemented process of forecasting future medical costs attributable to claims from members of a group during an actual underwriting period of claim 32, comprising the additional steps of:
providing an expected amount of administrative costs allocable to providing health insurance coverage to the group;
providing a minimum acceptable expected profit;
totaling the group level cost forecast, the expected amount of administrative costs, and the minimum acceptable expected profit to yield a total minimum price;
providing a plurality of expected probabilities of retention for the group corresponding to a plurality of possible prices greater than or equal to the total minimum price, each possible price also having an expected profit that is the amount of the price over the group level cost forecast plus the expected amount of administrative costs; and
calculating a plurality of possible maximum profits by multiplying each of the plurality of possible profits by the corresponding expected probability of retention,
wherein the largest possible maximum profit is used to price the group insurance.
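The pricing steps of claim 62 can be sketched as a small search over candidate prices. This is an illustrative reading of the claim, not a definitive implementation: the function name `price_group`, the dictionary of retention probabilities, and all figures are hypothetical.

```python
def price_group(cost_forecast: float, admin_cost: float, min_profit: float,
                retention_by_price: dict) -> tuple:
    """Return (price, retention-weighted profit) for the best admissible price.

    retention_by_price maps each candidate price to the expected probability
    that the group is retained at that price (assumed to be supplied by the user).
    """
    # Total minimum price: forecast + administrative costs + minimum acceptable profit.
    min_price = cost_forecast + admin_cost + min_profit
    best = None
    for price, p_retain in retention_by_price.items():
        if price < min_price:
            continue  # only prices at or above the total minimum price are considered
        # Expected profit at this price is the amount over forecast plus admin costs,
        # weighted by the probability of retaining the group.
        weighted_profit = (price - (cost_forecast + admin_cost)) * p_retain
        if best is None or weighted_profit > best[1]:
            best = (price, weighted_profit)
    if best is None:
        raise ValueError("no candidate price meets the total minimum price")
    return best


# Hypothetical inputs: a 100,000 group-level forecast, 10,000 administrative
# costs, and a 5,000 minimum profit give a total minimum price of 115,000.
candidates = {110_000: 0.99, 115_000: 0.95, 120_000: 0.80, 130_000: 0.35}
best_price, best_profit = price_group(100_000, 10_000, 5_000, candidates)
print(best_price, best_profit)
```

With these assumed inputs the 110,000 candidate is excluded for falling below the minimum price, and the remaining candidates trade a higher margin against a lower chance of retaining the group.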
63. A method of pricing group insurance of claim 62 for insuring short term disability costs wherein the interaction capturing technique uses a dependent measure from the next period and policy period comprising the number of STD days in the policy period and weights the dependent measure by the expected cost per day for the STD to produce the person-level expected STD costs and summed across the group to produce the group's expected STD cost.
64. A method of pricing group insurance of claim 62 for insuring long term disability (LTD) claims wherein a dependent measure for generating the cost forecasting model is the probability of a LTD claim in the policy period where the probability is weighted by the net present value of the LTD and applying the cost forecasting model to the person-level data produces person-level expected LTD costs wherein summing the person-level expected LTD costs across the group to produce a group's expected LTD cost for an actual policy period.
65. A method of pricing group insurance of claim 62, wherein the pricing is produced for first-dollar health insurance.
66. A method of pricing group insurance of claim 62, wherein the pricing is produced for stop loss health insurance.
67. A method of pricing group insurance of claim 62, wherein the pricing produced is for aggregate-only stop loss health insurance.
68. A method of pricing group insurance of claim 62, wherein the pricing produced is for specific stop loss health insurance.
69. A method of pricing group insurance of claim 62 for insuring group term life insurance costs wherein a dependent measure for generating the cost forecasting model is the expected probability of death weighted by the amount of life insurance to produce the person-level expected term life insurance cost.
70. A method of pricing group insurance of claim 62, comprising a renewal product, wherein the model development universe comprises data from the members of a group in the book of business to be insured.
71. A method of underwriting an insurance product comprising the steps of:
providing an identification of the coverage of the insurance product which identifies the conditions of payment under the product during a policy period;
providing person-level health care claim information comprising enrollment data, and base period and underwriting period claim data, the claim data comprising claim codes having associated claim costs;
capturing the predictive ability of the person-level health care claim information through the application of an interaction capturing technique; and
forecasting a predicted cost of the insurance product during the policy period based on the identification of the coverage of the insurance product and the captured predictive ability of the person-level health care claim information;
wherein each diagnosis and CPT based risk factor is independent of the sequence in time of other diagnosis and CPT based risk factors.
72. The method of underwriting an insurance product of claim 71, for insuring short term disability costs wherein the interaction capturing technique uses a dependent measure from the next period and policy period comprising the number of STD days in the policy period and weights the dependent measure by the expected cost per day for the STD to produce the person-level expected STD costs and summed across the group to produce the group's expected STD cost.
73. The method of underwriting an insurance product of claim 71, for insuring long term disability (LTD) claims wherein a dependent measure for generating the cost forecasting model is the probability of a LTD claim in the policy period where the probability is weighted by the net present value of the LTD and applying the cost forecasting model to the person-level data produces person-level expected LTD costs wherein summing the person-level expected LTD costs across the group to produce a group's expected LTD cost for an actual policy period.
74. The method of underwriting an insurance product of claim 71, wherein the cost forecast is produced for first-dollar health insurance.
75. The method of underwriting an insurance product of claim 71, wherein the cost forecast is produced for stop loss health insurance.
76. The method of underwriting an insurance product of claim 71, wherein the cost forecast produced is for aggregate-only stop loss health insurance.
77. The method of underwriting an insurance product of claim 71, wherein the cost forecast produced is for specific stop loss health insurance.
78. The method of underwriting an insurance product of claim 71, for insuring group term life insurance costs wherein a dependent measure for generating the cost forecasting model is the expected probability of death weighted by the amount of life insurance to produce the person-level expected term life insurance cost.
79. The method of underwriting an insurance product of claim 71, comprising renewal underwriting, wherein the model development universe comprises data from the members of a group in the book of business to be insured.
80. The method of underwriting an insurance product of claim 71, comprising in addition the step of: setting insurance reserves based on the renewal group-level forecast for the actual underwriting period, wherein the next period is a reserving period for claims that have not occurred or that have occurred but not been reported.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/145,281 US20090048877A1 (en) | 2000-11-15 | 2008-06-24 | Insurance claim forecasting system |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US24906000P | 2000-11-15 | 2000-11-15 | |
US26713101P | 2001-02-07 | 2001-02-07 | |
US09/861,379 US7392201B1 (en) | 2000-11-15 | 2001-05-18 | Insurance claim forecasting system |
US12/145,281 US20090048877A1 (en) | 2000-11-15 | 2008-06-24 | Insurance claim forecasting system |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/861,379 Division US7392201B1 (en) | 2000-11-15 | 2001-05-18 | Insurance claim forecasting system |
Publications (1)
Publication Number | Publication Date |
---|---|
US20090048877A1 (en) | 2009-02-19 |
Family
ID=39530091
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/861,379 Active 2025-06-24 US7392201B1 (en) | 2000-11-15 | 2001-05-18 | Insurance claim forecasting system |
US12/145,281 Abandoned US20090048877A1 (en) | 2000-11-15 | 2008-06-24 | Insurance claim forecasting system |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/861,379 Active 2025-06-24 US7392201B1 (en) | 2000-11-15 | 2001-05-18 | Insurance claim forecasting system |
Country Status (1)
Country | Link |
---|---|
US (2) | US7392201B1 (en) |
Cited By (89)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090222290A1 (en) * | 2008-02-29 | 2009-09-03 | Crowe Michael K | Methods and Systems for Automated, Predictive Modeling of the Outcome of Benefits Claims |
US20100185466A1 (en) * | 2009-01-20 | 2010-07-22 | Kenneth Paradis | Systems and methods for tracking health-related spending for validation of disability benefits claims |
US20110082712A1 (en) * | 2009-10-01 | 2011-04-07 | DecisionQ Corporation | Application of bayesian networks to patient screening and treatment |
US20110112853A1 (en) * | 2009-11-06 | 2011-05-12 | Ingenix, Inc. | System and Method for Condition, Cost, and Duration Analysis |
US20110112873A1 (en) * | 2009-11-11 | 2011-05-12 | Medical Present Value, Inc. | System and Method for Electronically Monitoring, Alerting, and Evaluating Changes in a Health Care Payor Policy |
US20110131072A1 (en) * | 2002-11-26 | 2011-06-02 | Dominion Ventures, Llc | Method for health plan management |
US7966200B1 (en) | 2008-03-18 | 2011-06-21 | United Services Automobile Association | Systems and methods for modeling insurance coverage |
US7966201B1 (en) | 2008-03-18 | 2011-06-21 | United Services Automobile Association | Systems and methods for modeling insurance coverage |
US7966202B1 (en) | 2008-03-18 | 2011-06-21 | United Services Automobile Association | Systems and methods for modeling insurance coverage |
US7974859B1 (en) | 2008-03-18 | 2011-07-05 | United Services Automobile Association | Systems and methods for modeling insurance coverage |
US20110166883A1 (en) * | 2009-09-01 | 2011-07-07 | Palmer Robert D | Systems and Methods for Modeling Healthcare Costs, Predicting Same, and Targeting Improved Healthcare Quality and Profitability |
US7983938B1 (en) * | 2008-03-18 | 2011-07-19 | United Services Automobile Association | Systems and methods for modeling recommended insurance coverage |
US7983937B1 (en) * | 2008-03-18 | 2011-07-19 | United Services Automobile Association | Systems and methods for modeling recommended insurance coverage |
US20110202372A1 (en) * | 2010-02-12 | 2011-08-18 | Assets Quest, Inc. | Method and system for estimating unpaid claims |
US8060385B1 (en) | 2006-12-08 | 2011-11-15 | Safeco Insurance Company Of America | System, program product and method for segmenting and underwriting using voting status |
US20110320225A1 (en) * | 2010-06-18 | 2011-12-29 | Strategic Healthplan Services, Llc | Method and apparatus for automatic healthplan data retrieval and reconciliation using a processing device |
US20120221349A1 (en) * | 2011-02-25 | 2012-08-30 | Eric Mora | Systems and methods for the prediction of health care costs |
US8271378B2 (en) | 2007-04-12 | 2012-09-18 | Experian Marketing Solutions, Inc. | Systems and methods for determining thin-file records and determining thin-file risk levels |
US8312033B1 (en) | 2008-06-26 | 2012-11-13 | Experian Marketing Solutions, Inc. | Systems and methods for providing an integrated identifier |
US8321952B2 (en) | 2000-06-30 | 2012-11-27 | Hitwise Pty. Ltd. | Method and system for monitoring online computer network behavior and creating online behavior profiles |
US20130035947A1 (en) * | 2011-08-01 | 2013-02-07 | Infosys Limited | Claims payout simulator |
US8452611B1 (en) | 2004-09-01 | 2013-05-28 | Search America, Inc. | Method and apparatus for assessing credit for healthcare patients |
US20130173329A1 (en) * | 2012-01-04 | 2013-07-04 | Honeywell International Inc. | Systems and methods for the solution to the joint problem of parts order scheduling and maintenance plan generation for field maintenance |
US20130197936A1 (en) * | 2012-02-01 | 2013-08-01 | Richard R. Willich | Predictive Healthcare Diagnosis Animation |
US20130253949A1 (en) * | 2010-09-01 | 2013-09-26 | Vishnuvyas Sethumadhavan | Systems and methods for extraction of clinical knowledge with reimbursement potential |
US8583593B1 (en) | 2005-04-11 | 2013-11-12 | Experian Information Solutions, Inc. | Systems and methods for optimizing database queries |
US8589190B1 (en) | 2006-10-06 | 2013-11-19 | Liberty Mutual Insurance Company | System and method for underwriting a prepackaged business owners insurance policy |
US8626646B2 (en) | 2006-10-05 | 2014-01-07 | Experian Information Solutions, Inc. | System and method for generating a finance attribute from tradeline data |
US8639616B1 (en) | 2010-10-01 | 2014-01-28 | Experian Information Solutions, Inc. | Business to contact linkage system |
US20140108057A1 (en) * | 2012-10-17 | 2014-04-17 | Kip Robert Daniels | Insurance instrument, insurance coverage, and method of use thereof |
US20140122098A1 (en) * | 2012-10-31 | 2014-05-01 | DaVincian Technologies, Inc. | Statistical financial system and method to value patient visits to healthcare provider organizations for follow up prioritization |
JP2014081757A (en) * | 2012-10-16 | 2014-05-08 | Hst-Labo Co Ltd | Method of processing electronic receipt data for aggregate analysis of disease conditions, healthcare cost, and others |
US20140129237A1 (en) * | 2012-11-02 | 2014-05-08 | QMedtrix Systems, Inc. | Estimating market-driven medical facility rates and/or charges |
US8725613B1 (en) | 2010-04-27 | 2014-05-13 | Experian Information Solutions, Inc. | Systems and methods for early account score and notification |
US20140156518A1 (en) * | 2012-12-04 | 2014-06-05 | Xerox Corporation | Method and systems for sub-allocating computational resources |
US8782217B1 (en) | 2010-11-10 | 2014-07-15 | Safetyweb, Inc. | Online identity management |
US20150052053A1 (en) * | 2013-08-15 | 2015-02-19 | Mastercard International Incorporated | Internet site authentication with payments authorization data |
US9147042B1 (en) | 2010-11-22 | 2015-09-29 | Experian Information Solutions, Inc. | Systems and methods for data verification |
US9286035B2 (en) | 2011-06-30 | 2016-03-15 | Infosys Limited | Code remediation |
US9342783B1 (en) | 2007-03-30 | 2016-05-17 | Consumerinfo.Com, Inc. | Systems and methods for data verification |
US9529851B1 (en) | 2013-12-02 | 2016-12-27 | Experian Information Solutions, Inc. | Server architecture for electronic data quality processing |
US9576030B1 (en) | 2014-05-07 | 2017-02-21 | Consumerinfo.Com, Inc. | Keeping up with the joneses |
US9690820B1 (en) | 2007-09-27 | 2017-06-27 | Experian Information Solutions, Inc. | Database system for triggering event notifications based on updates to database records |
US20170186093A1 (en) * | 2015-12-23 | 2017-06-29 | Aetna Inc. | Resource allocation |
US9697263B1 (en) | 2013-03-04 | 2017-07-04 | Experian Information Solutions, Inc. | Consumer data request fulfillment system |
US9734290B2 (en) | 2011-12-16 | 2017-08-15 | Neela SRINIVAS | System and method for evidence based differential analysis and incentives based healthcare policy |
WO2017148161A1 (en) * | 2016-03-04 | 2017-09-08 | 深圳市前海安测信息技术有限公司 | Underwriting and actuarial database system for assessing risks of subject matter of insurance |
US9805422B1 (en) | 2012-05-24 | 2017-10-31 | Allstate Insurance Company | Systems and methods for calculating seasonal insurance premiums |
CN107515819A (en) * | 2016-06-16 | 2017-12-26 | 平安科技(深圳)有限公司 | Medicare system method of testing and device |
US9853959B1 (en) | 2012-05-07 | 2017-12-26 | Consumerinfo.Com, Inc. | Storage and maintenance of personal data |
JPWO2017013712A1 (en) * | 2015-07-17 | 2018-03-08 | 株式会社日立製作所 | Insurance information providing system and insurance information providing method |
CN108154444A (en) * | 2018-01-17 | 2018-06-12 | 众安信息技术服务有限公司 | For delivering the method, apparatus and computer-readable medium of shift classification |
US10102536B1 (en) | 2013-11-15 | 2018-10-16 | Experian Information Solutions, Inc. | Micro-geographic aggregation system |
US10242019B1 (en) | 2014-12-19 | 2019-03-26 | Experian Information Solutions, Inc. | User behavior segmentation using latent topic detection |
US10262362B1 (en) | 2014-02-14 | 2019-04-16 | Experian Information Solutions, Inc. | Automatic generation of code for attributes |
US10489860B1 (en) * | 2013-12-23 | 2019-11-26 | Massachusetts Mutual Life Insurance Company | Systems and methods for developing convertible term products |
US10650928B1 (en) | 2017-12-18 | 2020-05-12 | Clarify Health Solutions, Inc. | Computer network architecture for a pipeline of models for healthcare outcomes with machine learning and artificial intelligence |
US10678894B2 (en) | 2016-08-24 | 2020-06-09 | Experian Information Solutions, Inc. | Disambiguation and authentication of device users |
US10726359B1 (en) | 2019-08-06 | 2020-07-28 | Clarify Health Solutions, Inc. | Computer network architecture with machine learning and artificial intelligence and automated scalable regularization |
US10910113B1 (en) | 2019-09-26 | 2021-02-02 | Clarify Health Solutions, Inc. | Computer network architecture with benchmark automation, machine learning and artificial intelligence for measurement factors |
US10922652B2 (en) * | 2019-04-16 | 2021-02-16 | Advanced New Technologies Co., Ltd. | Blockchain-based program review system, method, computing device and storage medium |
US10963434B1 (en) | 2018-09-07 | 2021-03-30 | Experian Information Solutions, Inc. | Data architecture for supporting multiple search models |
US10998104B1 (en) | 2019-09-30 | 2021-05-04 | Clarify Health Solutions, Inc. | Computer network architecture with machine learning and artificial intelligence and automated insight generation |
US11030562B1 (en) | 2011-10-31 | 2021-06-08 | Consumerinfo.Com, Inc. | Pre-data breach monitoring |
US20210200896A1 (en) * | 2019-12-30 | 2021-07-01 | Unitedhealth Group Incorporated | Programmatic determinations using decision trees generated from relational database entries |
US11195213B2 (en) | 2010-09-01 | 2021-12-07 | Apixio, Inc. | Method of optimizing patient-related outcomes |
US11194784B2 (en) * | 2018-10-19 | 2021-12-07 | International Business Machines Corporation | Extracting structured information from unstructured data using domain problem application validation |
US11227001B2 (en) | 2017-01-31 | 2022-01-18 | Experian Information Solutions, Inc. | Massive scale heterogeneous data ingestion and user resolution |
US20220164698A1 (en) * | 2020-11-25 | 2022-05-26 | International Business Machines Corporation | Automated data quality inspection and improvement for automated machine learning |
US11481411B2 (en) | 2010-09-01 | 2022-10-25 | Apixio, Inc. | Systems and methods for automated generation classifiers |
US20220351842A1 (en) * | 2021-05-03 | 2022-11-03 | Evernorth Strategic Development, Inc. | Automated bias correction for database systems |
US11527313B1 (en) | 2019-11-27 | 2022-12-13 | Clarify Health Solutions, Inc. | Computer network architecture with machine learning and artificial intelligence and care groupings |
US11544652B2 (en) | 2010-09-01 | 2023-01-03 | Apixio, Inc. | Systems and methods for enhancing workflow efficiency in a healthcare management system |
US11581097B2 (en) | 2010-09-01 | 2023-02-14 | Apixio, Inc. | Systems and methods for patient retention in network through referral analytics |
WO2023023342A1 (en) * | 2021-08-19 | 2023-02-23 | Allstate Insurance Company | Automated iterative predictive modeling computing platform |
US11605465B1 (en) | 2018-08-16 | 2023-03-14 | Clarify Health Solutions, Inc. | Computer network architecture with machine learning and artificial intelligence and patient risk scoring |
US20230077820A1 (en) * | 2018-03-27 | 2023-03-16 | Healthplan Data Solutions Llc | Method and system for monitoring prescription drug data and determining claim data accuracy |
US11610653B2 (en) | 2010-09-01 | 2023-03-21 | Apixio, Inc. | Systems and methods for improved optical character recognition of health records |
US11621085B1 (en) | 2019-04-18 | 2023-04-04 | Clarify Health Solutions, Inc. | Computer network architecture with machine learning and artificial intelligence and active updates of outcomes |
US11625789B1 (en) * | 2019-04-02 | 2023-04-11 | Clarify Health Solutions, Inc. | Computer network architecture with automated claims completion, machine learning and artificial intelligence |
US11636497B1 (en) | 2019-05-06 | 2023-04-25 | Clarify Health Solutions, Inc. | Computer network architecture with machine learning and artificial intelligence and risk adjusted performance ranking of healthcare providers |
US11645344B2 (en) | 2019-08-26 | 2023-05-09 | Experian Health, Inc. | Entity mapping based on incongruent entity data |
US11694239B2 (en) | 2010-09-01 | 2023-07-04 | Apixio, Inc. | Method of optimizing patient-related outcomes |
US11880377B1 (en) | 2021-03-26 | 2024-01-23 | Experian Information Solutions, Inc. | Systems and methods for entity resolution |
US11941065B1 (en) | 2019-09-13 | 2024-03-26 | Experian Information Solutions, Inc. | Single identifier platform for storing entity data |
US12079230B1 (en) | 2024-01-31 | 2024-09-03 | Clarify Health Solutions, Inc. | Computer network architecture and method for predictive analysis using lookup tables as prediction models |
US12165754B2 (en) | 2010-09-01 | 2024-12-10 | Apixio, Llc | Systems and methods for improved optical character recognition of health records |
US12198820B2 (en) | 2010-09-01 | 2025-01-14 | Apixio, Llc | Systems and methods for patient retention in network through referral analytics |
US12266019B2 (en) * | 2023-09-28 | 2025-04-01 | Healthplan Data Solutions, Inc. | Method and system for monitoring prescription drug data and determining claim data accuracy |
Families Citing this family (101)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7343308B1 (en) * | 2000-05-26 | 2008-03-11 | Hartford Fire Insurance Company | Method and system for identifying subrogation potential and valuing a subrogation file |
WO2002049260A2 (en) * | 2000-10-23 | 2002-06-20 | Deloitte & Touche Llp | Commercial insurance scoring system and method |
JP4029593B2 (en) * | 2001-09-11 | 2008-01-09 | 株式会社日立製作所 | Process analysis method and information system |
US20030093304A1 (en) * | 2001-11-02 | 2003-05-15 | Keller James B. | System and method for managing short term risk |
US8200511B2 (en) * | 2001-11-28 | 2012-06-12 | Deloitte Development Llc | Method and system for determining the importance of individual variables in a statistical model |
WO2003048891A2 (en) * | 2001-11-29 | 2003-06-12 | Swiss Reinsurance Company | System and method for developing loss assumptions |
US7818186B2 (en) | 2001-12-31 | 2010-10-19 | Genworth Financial, Inc. | System for determining a confidence factor for insurance underwriting suitable for use by an automated system |
US7899688B2 (en) * | 2001-12-31 | 2011-03-01 | Genworth Financial, Inc. | Process for optimization of insurance underwriting suitable for use by an automated system |
US7844476B2 (en) | 2001-12-31 | 2010-11-30 | Genworth Financial, Inc. | Process for case-based insurance underwriting suitable for use by an automated system |
US8793146B2 (en) * | 2001-12-31 | 2014-07-29 | Genworth Holdings, Inc. | System for rule-based insurance underwriting suitable for use by an automated system |
US8005693B2 (en) | 2001-12-31 | 2011-08-23 | Genworth Financial, Inc. | Process for determining a confidence factor for insurance underwriting suitable for use by an automated system |
US7895062B2 (en) * | 2001-12-31 | 2011-02-22 | Genworth Financial, Inc. | System for optimization of insurance underwriting suitable for use by an automated system |
US7844477B2 (en) | 2001-12-31 | 2010-11-30 | Genworth Financial, Inc. | Process for rule-based insurance underwriting suitable for use by an automated system |
US6963870B2 (en) * | 2002-05-14 | 2005-11-08 | Microsoft Corporation | System and method for processing a large data set using a prediction model having a feature selection capability |
US8036919B2 (en) | 2002-07-10 | 2011-10-11 | Deloitte & Touche Llp | Licensed professional scoring system and method |
US20110161094A1 (en) * | 2002-08-23 | 2011-06-30 | Dxcg, Inc. | System and method for health care costs and outcomes modeling using dosage and routing pharmacy information |
US7813945B2 (en) | 2003-04-30 | 2010-10-12 | Genworth Financial, Inc. | System and process for multivariate adaptive regression splines classification for insurance underwriting suitable for use by an automated system |
US7383239B2 (en) * | 2003-04-30 | 2008-06-03 | Genworth Financial, Inc. | System and process for a fusion classification for insurance underwriting suitable for use by an automated system |
US7801748B2 (en) | 2003-04-30 | 2010-09-21 | Genworth Financial, Inc. | System and process for detecting outliers for insurance underwriting suitable for use by an automated system |
US7831451B1 (en) * | 2003-06-27 | 2010-11-09 | Quantitative Data Solutions, Inc. | Systems and methods for insurance underwriting |
US8398406B2 (en) * | 2003-08-07 | 2013-03-19 | Swiss Reinsurance Company Ltd. | Systems and methods for auditing auditable instruments |
US6999935B2 (en) * | 2003-09-30 | 2006-02-14 | Kiritharan Parankirinathan | Method of calculating premium payment to cover the risk attributable to insureds surviving a specified period |
US7685008B2 (en) * | 2004-02-20 | 2010-03-23 | Accenture Global Services Gmbh | Account level participation for underwriting components |
US20050222922A1 (en) * | 2004-03-18 | 2005-10-06 | Lynch Robert G | Method for calculating IBNP health reserves with low variance |
US7693728B2 (en) * | 2004-03-31 | 2010-04-06 | Aetna Inc. | System and method for administering health care cost reduction |
EP1792276A4 (en) * | 2004-09-10 | 2009-12-23 | Deloitte Dev Llc | METHOD AND SYSTEM FOR ESTIMATING INSURANCE LOSSES AND CONFIDENCE INTERVALS USING PREDICTIVE MODELING OF INSURANCE CONTRACT AND CLAIM LEVEL DETAILS |
US7860812B2 (en) * | 2005-03-02 | 2010-12-28 | Accenture Global Services Limited | Advanced insurance record audit and payment integrity |
US7664690B2 (en) * | 2005-07-29 | 2010-02-16 | Accenture Global Services Gmbh | Insurance claim management |
US8069067B2 (en) * | 2006-01-30 | 2011-11-29 | Swiss Reinsurance Company | Computer-based system and method for estimating costs of a line of business included in a multi-line treaty |
US20070219824A1 (en) * | 2006-03-17 | 2007-09-20 | Jean Rawlings | System and method for identifying and analyzing patterns or aberrations in healthcare claims |
US7739129B2 (en) * | 2006-04-10 | 2010-06-15 | Accenture Global Services Gmbh | Benefit plan intermediary |
US20070239492A1 (en) * | 2006-04-10 | 2007-10-11 | Sweetland Christopher L | Estimating benefit plan costs |
US20080010086A1 (en) * | 2006-07-05 | 2008-01-10 | Aetna Inc. | Health financial needs calculator |
US20080140456A1 (en) * | 2006-09-11 | 2008-06-12 | Glick Gregg W | Evaluating susceptibility to a claim occurring infrequently |
US8032396B2 (en) * | 2006-09-25 | 2011-10-04 | Aetna Inc. | System and method for offering and guaranteeing renewal of suspendable healthcare benefits |
US20080077449A1 (en) * | 2006-09-25 | 2008-03-27 | Aetna Inc. | Providing and Financing Post-Employment Health Care Benefits |
US8359209B2 (en) * | 2006-12-19 | 2013-01-22 | Hartford Fire Insurance Company | System and method for predicting and responding to likelihood of volatility |
WO2008127627A1 (en) * | 2007-04-12 | 2008-10-23 | Warren Pamela A | Psychological disability evaluation software, methods and systems |
WO2008151042A1 (en) * | 2007-06-01 | 2008-12-11 | American International Group, Inc. | Method and system for projecting catastrophe exposure |
US8606604B1 (en) * | 2007-06-12 | 2013-12-10 | David L. Huber | Systems and methods for remote electronic transaction processing |
US20090043615A1 (en) * | 2007-08-07 | 2009-02-12 | Hartford Fire Insurance Company | Systems and methods for predictive data analysis |
US20090094261A1 (en) * | 2007-10-04 | 2009-04-09 | Jung Edward K Y | Systems and methods for correlating epigenetic information with disability data |
US20100027780A1 (en) * | 2007-10-04 | 2010-02-04 | Searete Llc, A Limited Liability Corporation Of The State Of Delaware | Systems and methods for anonymizing personally identifiable information associated with epigenetic information |
US20090100095A1 (en) * | 2007-10-04 | 2009-04-16 | Jung Edward K Y | Systems and methods for reinsurance utilizing epigenetic information |
US20090099877A1 (en) * | 2007-10-11 | 2009-04-16 | Hyde Roderick A | Systems and methods for underwriting risks utilizing epigenetic information |
US20090094282A1 (en) * | 2007-10-04 | 2009-04-09 | Searete Llc, A Limited Liability Corporation Of The State Of Delaware | Systems and methods for correlating past epigenetic information with past disability data |
US20090094065A1 (en) * | 2007-10-04 | 2009-04-09 | Hyde Roderick A | Systems and methods for underwriting risks utilizing epigenetic information |
US8244654B1 (en) * | 2007-10-22 | 2012-08-14 | Healthways, Inc. | End of life predictive model |
US7707087B1 (en) * | 2008-01-31 | 2010-04-27 | Intuit Inc. | Presenting data-driven health care cost-saving opportunities to health care consumers |
US20090319297A1 (en) * | 2008-06-18 | 2009-12-24 | Upmc | Workplace Absenteeism Risk Model |
US20100063907A1 (en) * | 2008-09-11 | 2010-03-11 | American Management Group, LLC | Insurance Billing System |
US8676598B2 (en) * | 2009-03-31 | 2014-03-18 | Jacob George Kuriyan | Chronic population based cost model to compare effectiveness of preventive care programs |
US20100299161A1 (en) * | 2009-05-22 | 2010-11-25 | Hartford Fire Insurance Company | System and method for administering subrogation related transactions |
US20110071363A1 (en) * | 2009-09-22 | 2011-03-24 | Healthways, Inc. | System and method for using predictive models to determine levels of healthcare interventions |
US8131571B2 (en) * | 2009-09-23 | 2012-03-06 | Watson Wyatt & Company | Method and system for evaluating insurance liabilities using stochastic modeling and sampling techniques |
US20110119109A1 (en) * | 2009-11-13 | 2011-05-19 | Bank Of America Corporation | Headcount forecasting system |
US8355934B2 (en) * | 2010-01-25 | 2013-01-15 | Hartford Fire Insurance Company | Systems and methods for prospecting business insurance customers |
US20110313794A1 (en) * | 2010-06-22 | 2011-12-22 | Feeney Rosa M | Insurance Coverage Validation |
US20120232936A1 (en) * | 2011-03-11 | 2012-09-13 | Castlight Health, Inc. | Reference Pricing of Health Care Deliverables |
US9947050B1 (en) * | 2011-03-21 | 2018-04-17 | Allstate Insurance Company | Claims adjuster allocation |
US10531251B2 (en) | 2012-10-22 | 2020-01-07 | United States Cellular Corporation | Detecting and processing anomalous parameter data points by a mobile wireless data network forecasting system |
US10445697B2 (en) | 2012-11-26 | 2019-10-15 | Hartford Fire Insurance Company | System for selection of data records containing structured and unstructured data |
US20140180949A1 (en) * | 2012-12-24 | 2014-06-26 | Cognizant Technology Solutions India Pvt. Ltd. | System and method for automated coding and testing of benefits |
US10825096B1 (en) * | 2013-05-23 | 2020-11-03 | United Services Automobile Association (Usaa) | Systems and methods for water loss mitigation messaging |
US10380696B1 (en) | 2014-03-18 | 2019-08-13 | Ccc Information Services Inc. | Image processing system for vehicle damage |
US10373262B1 (en) | 2014-03-18 | 2019-08-06 | Ccc Information Services Inc. | Image processing system for vehicle damage |
US10373260B1 (en) | 2014-03-18 | 2019-08-06 | Ccc Information Services Inc. | Imaging processing system for identifying parts for repairing a vehicle |
US10776799B2 (en) * | 2015-03-17 | 2020-09-15 | Mp Cloud Technologies, Inc. | Software for emergency medical services |
US20160321748A1 (en) * | 2015-04-29 | 2016-11-03 | International Business Machines Corporation | Method for market risk assessment for healthcare applications |
US20170091401A1 (en) * | 2015-09-24 | 2017-03-30 | Innodata Synodex, Llc | System and method for determining a heathcare utilization rate score |
US10650927B2 (en) * | 2015-11-13 | 2020-05-12 | Cerner Innovation, Inc. | Machine learning clinical decision support system for risk categorization |
US11017058B1 (en) * | 2015-11-20 | 2021-05-25 | Kwesi McDavid-Arno | Expert medical system and methods therefor |
US20170212997A1 (en) * | 2015-12-01 | 2017-07-27 | James BUONFIGLIO | Automated modeling and insurance recommendation method and system |
US10394871B2 (en) | 2016-10-18 | 2019-08-27 | Hartford Fire Insurance Company | System to predict future performance characteristic for an electronic record |
US20180130135A1 (en) * | 2016-11-09 | 2018-05-10 | Melissa Norwicke | System and method for obtaining information about a deceased person's life insurance policy and submitting a claim thereunder |
CN106875030B (en) * | 2016-12-14 | 2020-11-24 | 武汉默联股份有限公司 | Intelligent online direct claim recommendation system and method for business health insurance |
US11790454B1 (en) | 2017-01-16 | 2023-10-17 | Bind Benefits, Inc. | Use determination risk coverage datastructure for on-demand and increased efficiency coverage detection and rebalancing apparatuses, methods and systems |
US11663670B1 (en) * | 2017-01-16 | 2023-05-30 | Bind Benefits, Inc. | Use determination risk coverage datastructure for on-demand and increased efficiency coverage detection and rebalancing apparatuses, methods and systems |
US20210256616A1 (en) | 2017-09-27 | 2021-08-19 | State Farm Mutual Automobile Insurance Company | Automobile Monitoring Systems and Methods for Risk Determination |
US20190172564A1 (en) * | 2017-12-05 | 2019-06-06 | International Business Machines Corporation | Early cost prediction and risk identification |
CN109064343B (en) * | 2018-08-13 | 2023-09-26 | 中国平安人寿保险股份有限公司 | Risk model building method, risk matching device, risk model building equipment and risk matching medium |
US11341546B2 (en) * | 2018-12-18 | 2022-05-24 | Clover Health | Bid tool optimization |
CN109658270A (en) * | 2018-12-19 | 2019-04-19 | 前海企保科技(深圳)有限公司 | Claim verification system and method based on reading comprehension of insurance products |
US11410243B2 (en) * | 2019-01-08 | 2022-08-09 | Clover Health | Segmented actuarial modeling |
CN109886819B (en) * | 2019-01-16 | 2023-10-24 | 平安科技(深圳)有限公司 | Method for predicting insurance payment expenditure, electronic device and storage medium |
CN109902856A (en) * | 2019-01-17 | 2019-06-18 | 深圳壹账通智能科技有限公司 | Outstanding loss reserve prediction technique, device, computer equipment and storage medium |
EP3942488A1 (en) * | 2019-03-22 | 2022-01-26 | Swiss Reinsurance Company Ltd. | Structured liability risks parametrizing and forecasting system providing composite measures based on a reduced-to-the-max optimization approach and quantitative yield pattern linkage and corresponding method |
US12067151B2 (en) | 2019-04-30 | 2024-08-20 | Enya Inc. | Resource-efficient privacy-preserving transactions |
US10635837B1 (en) | 2019-04-30 | 2020-04-28 | HealthBlock, Inc. | Dynamic data protection |
US20200349652A1 (en) * | 2019-05-03 | 2020-11-05 | Koninklijke Philips N.V. | System to simulate outcomes of a new contract with a financier of care |
US11816584B2 (en) * | 2019-11-05 | 2023-11-14 | Optum Services (Ireland) Limited | Method, apparatus and computer program products for hierarchical model feature analysis and decision support |
CN111105316B (en) * | 2019-11-13 | 2023-06-09 | 泰康保险集团股份有限公司 | Data processing method and device for long-term care insurance, medium and electronic equipment |
CN111652614B (en) * | 2020-06-01 | 2023-08-22 | 泰康保险集团股份有限公司 | Data processing system, data processing method and device |
CN111986808B (en) * | 2020-07-30 | 2023-12-12 | 珠海中科先进技术研究院有限公司 | Health insurance risk assessment and control method, device and medium |
JP7422651B2 (en) * | 2020-12-16 | 2024-01-26 | 株式会社日立製作所 | Information processing system and selection support method |
CN114757785A (en) * | 2020-12-29 | 2022-07-15 | 天津幸福生命科技有限公司 | Data prediction method, apparatus, electronic device and computer readable medium |
GB2606452A (en) * | 2021-03-24 | 2022-11-09 | Frontline Insurance Managers Inc | System and Method of Determining and Providing Bindable Insurance Quotes |
US12229690B2 (en) * | 2021-06-24 | 2025-02-18 | The Toronto-Dominion Bank | System and method for determining expected loss using a machine learning framework |
US20230056462A1 (en) * | 2021-08-19 | 2023-02-23 | Marc R. Deschenaux | Cascading initial public offerings or special purpose acquisitions companies for corporate capitalization |
CN117011074A (en) * | 2023-07-25 | 2023-11-07 | 明亚保险经纪股份有限公司 | Risk warning methods and platforms |
CN118521274B (en) * | 2024-07-22 | 2024-12-31 | 支付宝(杭州)信息技术有限公司 | Project processing method and device based on strategy tree |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5613072A (en) * | 1991-02-06 | 1997-03-18 | Risk Data Corporation | System for funding future workers compensation losses |
Family Cites Families (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4667292A (en) | 1984-02-16 | 1987-05-19 | Iameter Incorporated | Medical reimbursement computer system |
US5018067A (en) | 1987-01-12 | 1991-05-21 | Iameter Incorporated | Apparatus and method for improved estimation of health resource consumption through use of diagnostic and/or procedure grouping and severity of illness indicators |
US4975840A (en) * | 1988-06-17 | 1990-12-04 | Lincoln National Risk Management, Inc. | Method and apparatus for evaluating a potentially insurable risk |
WO1991017510A1 (en) | 1990-05-01 | 1991-11-14 | Healthchex, Inc. | Health care services comparison processing |
US5544044A (en) | 1991-08-02 | 1996-08-06 | United Healthcare Corporation | Method for evaluation of health care quality |
US5652842A (en) | 1994-03-01 | 1997-07-29 | Healthshare Technology, Inc. | Analysis and reporting of performance of service providers |
US5557514A (en) * | 1994-06-23 | 1996-09-17 | Medicode, Inc. | Method and system for generating statistically-based medical provider utilization profiles |
US5918208A (en) | 1995-04-13 | 1999-06-29 | Ingenix, Inc. | System for providing medical information |
US5835897C1 (en) | 1995-06-22 | 2002-02-19 | Symmetry Health Data Systems | Computer-implemented method for profiling medical claims |
US5809478A (en) * | 1995-12-08 | 1998-09-15 | Allstate Insurance Company | Method for accessing and evaluating information for processing an application for insurance |
US5976082A (en) * | 1996-06-17 | 1999-11-02 | Smithkline Beecham Corporation | Method for identifying at risk patients diagnosed with congestive heart failure |
US5893072A (en) | 1996-06-20 | 1999-04-06 | Aetna Life & Casualty Company | Insurance classification plan loss control system |
US5873066A (en) | 1997-02-10 | 1999-02-16 | Insurance Company Of North America | System for electronically managing and documenting the underwriting of an excess casualty insurance policy |
US5970464A (en) | 1997-09-10 | 1999-10-19 | International Business Machines Corporation | Data mining based underwriting profitability analysis |
US6061657A (en) | 1998-02-18 | 2000-05-09 | Iameter, Incorporated | Techniques for estimating charges of delivering healthcare services that take complicating factors into account |
US6078890A (en) | 1998-06-01 | 2000-06-20 | Ford Global Technologies, Inc. | Method and system for automated health care rate renewal and quality assessment |
- 2001-05-18: US application US09/861,379 (US7392201B1), status: Active
- 2008-06-24: US application US12/145,281 (US20090048877A1), status: Abandoned
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5613072A (en) * | 1991-02-06 | 1997-03-18 | Risk Data Corporation | System for funding future workers compensation losses |
Cited By (152)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8321952B2 (en) | 2000-06-30 | 2012-11-27 | Hitwise Pty. Ltd. | Method and system for monitoring online computer network behavior and creating online behavior profiles |
US20110131072A1 (en) * | 2002-11-26 | 2011-06-02 | Dominion Ventures, Llc | Method for health plan management |
US8452611B1 (en) | 2004-09-01 | 2013-05-28 | Search America, Inc. | Method and apparatus for assessing credit for healthcare patients |
US8930216B1 (en) | 2004-09-01 | 2015-01-06 | Search America, Inc. | Method and apparatus for assessing credit for healthcare patients |
US8583593B1 (en) | 2005-04-11 | 2013-11-12 | Experian Information Solutions, Inc. | Systems and methods for optimizing database queries |
US11954731B2 (en) | 2006-10-05 | 2024-04-09 | Experian Information Solutions, Inc. | System and method for generating a finance attribute from tradeline data |
US10963961B1 (en) | 2006-10-05 | 2021-03-30 | Experian Information Solutions, Inc. | System and method for generating a finance attribute from tradeline data |
US11631129B1 (en) | 2006-10-05 | 2023-04-18 | Experian Information Solutions, Inc | System and method for generating a finance attribute from tradeline data |
US10121194B1 (en) | 2006-10-05 | 2018-11-06 | Experian Information Solutions, Inc. | System and method for generating a finance attribute from tradeline data |
US8626646B2 (en) | 2006-10-05 | 2014-01-07 | Experian Information Solutions, Inc. | System and method for generating a finance attribute from tradeline data |
US9563916B1 (en) | 2006-10-05 | 2017-02-07 | Experian Information Solutions, Inc. | System and method for generating a finance attribute from tradeline data |
US8589190B1 (en) | 2006-10-06 | 2013-11-19 | Liberty Mutual Insurance Company | System and method for underwriting a prepackaged business owners insurance policy |
US8285618B1 (en) | 2006-12-08 | 2012-10-09 | Safeco Insurance Company Of America | System, program product and method for segmenting and underwriting using voting status |
US8060385B1 (en) | 2006-12-08 | 2011-11-15 | Safeco Insurance Company Of America | System, program product and method for segmenting and underwriting using voting status |
US9342783B1 (en) | 2007-03-30 | 2016-05-17 | Consumerinfo.Com, Inc. | Systems and methods for data verification |
US11308170B2 (en) | 2007-03-30 | 2022-04-19 | Consumerinfo.Com, Inc. | Systems and methods for data verification |
US10437895B2 (en) | 2007-03-30 | 2019-10-08 | Consumerinfo.Com, Inc. | Systems and methods for data verification |
US8738515B2 (en) | 2007-04-12 | 2014-05-27 | Experian Marketing Solutions, Inc. | Systems and methods for determining thin-file records and determining thin-file risk levels |
US8271378B2 (en) | 2007-04-12 | 2012-09-18 | Experian Marketing Solutions, Inc. | Systems and methods for determining thin-file records and determining thin-file risk levels |
US11347715B2 (en) | 2007-09-27 | 2022-05-31 | Experian Information Solutions, Inc. | Database system for triggering event notifications based on updates to database records |
US10528545B1 (en) | 2007-09-27 | 2020-01-07 | Experian Information Solutions, Inc. | Database system for triggering event notifications based on updates to database records |
US9690820B1 (en) | 2007-09-27 | 2017-06-27 | Experian Information Solutions, Inc. | Database system for triggering event notifications based on updates to database records |
US11954089B2 (en) | 2007-09-27 | 2024-04-09 | Experian Information Solutions, Inc. | Database system for triggering event notifications based on updates to database records |
US20090222290A1 (en) * | 2008-02-29 | 2009-09-03 | Crowe Michael K | Methods and Systems for Automated, Predictive Modeling of the Outcome of Benefits Claims |
US20120059677A1 (en) * | 2008-02-29 | 2012-03-08 | The Advocator Group, Llc | Methods and systems for automated, predictive modeling of the outcome of benefits claims |
US7983938B1 (en) * | 2008-03-18 | 2011-07-19 | United Services Automobile Association | Systems and methods for modeling recommended insurance coverage |
US7974859B1 (en) | 2008-03-18 | 2011-07-05 | United Services Automobile Association | Systems and methods for modeling insurance coverage |
US7966200B1 (en) | 2008-03-18 | 2011-06-21 | United Services Automobile Association | Systems and methods for modeling insurance coverage |
US8494882B1 (en) | 2008-03-18 | 2013-07-23 | United Services Automobile Association (Usaa) | Modeling recommended insurance coverage |
US7966201B1 (en) | 2008-03-18 | 2011-06-21 | United Services Automobile Association | Systems and methods for modeling insurance coverage |
US7966202B1 (en) | 2008-03-18 | 2011-06-21 | United Services Automobile Association | Systems and methods for modeling insurance coverage |
US8719060B1 (en) | 2008-03-18 | 2014-05-06 | United Services Automobile Association | Systems and methods for modeling insurance coverage |
US7983937B1 (en) * | 2008-03-18 | 2011-07-19 | United Services Automobile Association | Systems and methods for modeling recommended insurance coverage |
US10075446B2 (en) | 2008-06-26 | 2018-09-11 | Experian Marketing Solutions, Inc. | Systems and methods for providing an integrated identifier |
US8954459B1 (en) | 2008-06-26 | 2015-02-10 | Experian Marketing Solutions, Inc. | Systems and methods for providing an integrated identifier |
US12205076B2 (en) | 2008-06-26 | 2025-01-21 | Experian Marketing Solutions, Llc | Systems and methods for providing an integrated identifier |
US11157872B2 (en) | 2008-06-26 | 2021-10-26 | Experian Marketing Solutions, Llc | Systems and methods for providing an integrated identifier |
US8312033B1 (en) | 2008-06-26 | 2012-11-13 | Experian Marketing Solutions, Inc. | Systems and methods for providing an integrated identifier |
US11769112B2 (en) | 2008-06-26 | 2023-09-26 | Experian Marketing Solutions, Llc | Systems and methods for providing an integrated identifier |
US20100185466A1 (en) * | 2009-01-20 | 2010-07-22 | Kenneth Paradis | Systems and methods for tracking health-related spending for validation of disability benefits claims |
US8224678B2 (en) | 2009-01-20 | 2012-07-17 | Ametros Financial Corporation | Systems and methods for tracking health-related spending for validation of disability benefits claims |
US20110166883A1 (en) * | 2009-09-01 | 2011-07-07 | Palmer Robert D | Systems and Methods for Modeling Healthcare Costs, Predicting Same, and Targeting Improved Healthcare Quality and Profitability |
US11562323B2 (en) * | 2009-10-01 | 2023-01-24 | DecisionQ Corporation | Application of bayesian networks to patient screening and treatment |
US20110082712A1 (en) * | 2009-10-01 | 2011-04-07 | DecisionQ Corporation | Application of bayesian networks to patient screening and treatment |
WO2011056984A1 (en) * | 2009-11-06 | 2011-05-12 | Ingenix, Inc. | System and method for condition, cost and duration analysis |
US20110112853A1 (en) * | 2009-11-06 | 2011-05-12 | Ingenix, Inc. | System and Method for Condition, Cost, and Duration Analysis |
US20110112873A1 (en) * | 2009-11-11 | 2011-05-12 | Medical Present Value, Inc. | System and Method for Electronically Monitoring, Alerting, and Evaluating Changes in a Health Care Payor Policy |
US8315888B2 (en) * | 2010-02-12 | 2012-11-20 | Assets Quest, Inc. | Method and system for estimating unpaid claims |
US20110202372A1 (en) * | 2010-02-12 | 2011-08-18 | Assets Quest, Inc. | Method and system for estimating unpaid claims |
US8725613B1 (en) | 2010-04-27 | 2014-05-13 | Experian Information Solutions, Inc. | Systems and methods for early account score and notification |
US20110320225A1 (en) * | 2010-06-18 | 2011-12-29 | Strategic Healthplan Services, Llc | Method and apparatus for automatic healthplan data retrieval and reconciliation using a processing device |
US20130253949A1 (en) * | 2010-09-01 | 2013-09-26 | Vishnuvyas Sethumadhavan | Systems and methods for extraction of clinical knowledge with reimbursement potential |
US11995592B2 (en) | 2010-09-01 | 2024-05-28 | Apixio, Llc | Systems and methods for enhancing workflow efficiency in a healthcare management system |
US12008613B2 (en) | 2010-09-01 | 2024-06-11 | Apixio, Inc. | Method of optimizing patient-related outcomes |
US20210280316A1 (en) * | 2010-09-01 | 2021-09-09 | Apixio, Inc. | Systems and methods for extraction of clinical knowledge with reimbursement potential |
US11195213B2 (en) | 2010-09-01 | 2021-12-07 | Apixio, Inc. | Method of optimizing patient-related outcomes |
US12198820B2 (en) | 2010-09-01 | 2025-01-14 | Apixio, Llc | Systems and methods for patient retention in network through referral analytics |
US11694239B2 (en) | 2010-09-01 | 2023-07-04 | Apixio, Inc. | Method of optimizing patient-related outcomes |
US11544652B2 (en) | 2010-09-01 | 2023-01-03 | Apixio, Inc. | Systems and methods for enhancing workflow efficiency in a healthcare management system |
US10964434B2 (en) | 2010-09-01 | 2021-03-30 | Apixio, Inc. | Systems and methods for extraction of clinical knowledge with reimbursement potential |
US11481411B2 (en) | 2010-09-01 | 2022-10-25 | Apixio, Inc. | Systems and methods for automated generation classifiers |
US11610653B2 (en) | 2010-09-01 | 2023-03-21 | Apixio, Inc. | Systems and methods for improved optical character recognition of health records |
US12165754B2 (en) | 2010-09-01 | 2024-12-10 | Apixio, Llc | Systems and methods for improved optical character recognition of health records |
US11581097B2 (en) | 2010-09-01 | 2023-02-14 | Apixio, Inc. | Systems and methods for patient retention in network through referral analytics |
US8639616B1 (en) | 2010-10-01 | 2014-01-28 | Experian Information Solutions, Inc. | Business to contact linkage system |
US8782217B1 (en) | 2010-11-10 | 2014-07-15 | Safetyweb, Inc. | Online identity management |
US9147042B1 (en) | 2010-11-22 | 2015-09-29 | Experian Information Solutions, Inc. | Systems and methods for data verification |
US9684905B1 (en) | 2010-11-22 | 2017-06-20 | Experian Information Solutions, Inc. | Systems and methods for data verification |
US20120221349A1 (en) * | 2011-02-25 | 2012-08-30 | Eric Mora | Systems and methods for the prediction of health care costs |
US9286035B2 (en) | 2011-06-30 | 2016-03-15 | Infosys Limited | Code remediation |
US20130035947A1 (en) * | 2011-08-01 | 2013-02-07 | Infosys Limited | Claims payout simulator |
US12045755B1 (en) | 2011-10-31 | 2024-07-23 | Consumerinfo.Com, Inc. | Pre-data breach monitoring |
US11568348B1 (en) | 2011-10-31 | 2023-01-31 | Consumerinfo.Com, Inc. | Pre-data breach monitoring |
US11030562B1 (en) | 2011-10-31 | 2021-06-08 | Consumerinfo.Com, Inc. | Pre-data breach monitoring |
US9734290B2 (en) | 2011-12-16 | 2017-08-15 | Neela SRINIVAS | System and method for evidence based differential analysis and incentives based healthcare policy |
US20130173329A1 (en) * | 2012-01-04 | 2013-07-04 | Honeywell International Inc. | Systems and methods for the solution to the joint problem of parts order scheduling and maintenance plan generation for field maintenance |
US20130197936A1 (en) * | 2012-02-01 | 2013-08-01 | Richard R. Willich | Predictive Healthcare Diagnosis Animation |
US11356430B1 (en) | 2012-05-07 | 2022-06-07 | Consumerinfo.Com, Inc. | Storage and maintenance of personal data |
US9853959B1 (en) | 2012-05-07 | 2017-12-26 | Consumerinfo.Com, Inc. | Storage and maintenance of personal data |
US9805422B1 (en) | 2012-05-24 | 2017-10-31 | Allstate Insurance Company | Systems and methods for calculating seasonal insurance premiums |
US10672082B1 (en) | 2012-05-24 | 2020-06-02 | Allstate Insurance Company | Systems and methods for calculating seasonal insurance premiums |
JP2014081757A (en) * | 2012-10-16 | 2014-05-08 | Hst-Labo Co Ltd | Method of processing electronic receipt data for aggregate analysis of disease conditions, healthcare cost, and others |
US20140108057A1 (en) * | 2012-10-17 | 2014-04-17 | Kip Robert Daniels | Insurance instrument, insurance coverage, and method of use thereof |
US10255622B2 (en) * | 2012-10-31 | 2019-04-09 | Continuum Health Technologies Corp. | Statistical financial system and method to value patient visits to healthcare provider organizations for follow up prioritization |
US20140122098A1 (en) * | 2012-10-31 | 2014-05-01 | DaVincian Technologies, Inc. | Statistical financial system and method to value patient visits to healthcare provider organizations for follow up prioritization |
US11568452B2 (en) * | 2012-10-31 | 2023-01-31 | Continuum Health Technologies Corp. | Statistical financial system and method to value patient visits to healthcare provider organizations for follow up prioritization |
US20140129237A1 (en) * | 2012-11-02 | 2014-05-08 | QMedtrix Systems, Inc. | Estimating market-driven medical facility rates and/or charges |
US9507642B2 (en) * | 2012-12-04 | 2016-11-29 | Xerox Corporation | Method and systems for sub-allocating computational resources |
US20140156518A1 (en) * | 2012-12-04 | 2014-06-05 | Xerox Corporation | Method and systems for sub-allocating computational resources |
US9697263B1 (en) | 2013-03-04 | 2017-07-04 | Experian Information Solutions, Inc. | Consumer data request fulfillment system |
US9972013B2 (en) * | 2013-08-15 | 2018-05-15 | Mastercard International Incorporated | Internet site authentication with payments authorization data |
US20150052053A1 (en) * | 2013-08-15 | 2015-02-19 | Mastercard International Incorporated | Internet site authentication with payments authorization data |
US10102536B1 (en) | 2013-11-15 | 2018-10-16 | Experian Information Solutions, Inc. | Micro-geographic aggregation system |
US10580025B2 (en) | 2013-11-15 | 2020-03-03 | Experian Information Solutions, Inc. | Micro-geographic aggregation system |
US9529851B1 (en) | 2013-12-02 | 2016-12-27 | Experian Information Solutions, Inc. | Server architecture for electronic data quality processing |
US12002099B1 (en) * | 2013-12-23 | 2024-06-04 | Massachusetts Mutual Life Insurance Company | Systems and methods for developing convertible term products |
US10489860B1 (en) * | 2013-12-23 | 2019-11-26 | Massachusetts Mutual Life Insurance Company | Systems and methods for developing convertible term products |
US10262362B1 (en) | 2014-02-14 | 2019-04-16 | Experian Information Solutions, Inc. | Automatic generation of code for attributes |
US11107158B1 (en) | 2014-02-14 | 2021-08-31 | Experian Information Solutions, Inc. | Automatic generation of code for attributes |
US11847693B1 (en) | 2014-02-14 | 2023-12-19 | Experian Information Solutions, Inc. | Automatic generation of code for attributes |
US9576030B1 (en) | 2014-05-07 | 2017-02-21 | Consumerinfo.Com, Inc. | Keeping up with the joneses |
US10936629B2 (en) | 2014-05-07 | 2021-03-02 | Consumerinfo.Com, Inc. | Keeping up with the joneses |
US11620314B1 (en) | 2014-05-07 | 2023-04-04 | Consumerinfo.Com, Inc. | User rating based on comparing groups |
US10019508B1 (en) | 2014-05-07 | 2018-07-10 | Consumerinfo.Com, Inc. | Keeping up with the joneses |
US10242019B1 (en) | 2014-12-19 | 2019-03-26 | Experian Information Solutions, Inc. | User behavior segmentation using latent topic detection |
US11010345B1 (en) | 2014-12-19 | 2021-05-18 | Experian Information Solutions, Inc. | User behavior segmentation using latent topic detection |
US10445152B1 (en) | 2014-12-19 | 2019-10-15 | Experian Information Solutions, Inc. | Systems and methods for dynamic report generation based on automatic modeling of complex data structures |
JPWO2017013712A1 (en) * | 2015-07-17 | 2018-03-08 | 株式会社日立製作所 | Insurance information providing system and insurance information providing method |
US11823276B2 (en) * | 2015-12-23 | 2023-11-21 | Aetna Inc. | Resource allocation |
US10937102B2 (en) * | 2015-12-23 | 2021-03-02 | Aetna Inc. | Resource allocation |
US20170186093A1 (en) * | 2015-12-23 | 2017-06-29 | Aetna Inc. | Resource allocation |
WO2017148161A1 (en) * | 2016-03-04 | 2017-09-08 | 深圳市前海安测信息技术有限公司 | Underwriting and actuarial database system for assessing risks of subject matter of insurance |
CN107515819A (en) * | 2016-06-16 | 2017-12-26 | 平安科技(深圳)有限公司 | Medicare system testing method and device |
US10678894B2 (en) | 2016-08-24 | 2020-06-09 | Experian Information Solutions, Inc. | Disambiguation and authentication of device users |
US11550886B2 (en) | 2016-08-24 | 2023-01-10 | Experian Information Solutions, Inc. | Disambiguation and authentication of device users |
US11227001B2 (en) | 2017-01-31 | 2022-01-18 | Experian Information Solutions, Inc. | Massive scale heterogeneous data ingestion and user resolution |
US11681733B2 (en) | 2017-01-31 | 2023-06-20 | Experian Information Solutions, Inc. | Massive scale heterogeneous data ingestion and user resolution |
US10650928B1 (en) | 2017-12-18 | 2020-05-12 | Clarify Health Solutions, Inc. | Computer network architecture for a pipeline of models for healthcare outcomes with machine learning and artificial intelligence |
US10910107B1 (en) | 2017-12-18 | 2021-02-02 | Clarify Health Solutions, Inc. | Computer network architecture for a pipeline of models for healthcare outcomes with machine learning and artificial intelligence |
CN108154444A (en) * | 2018-01-17 | 2018-06-12 | 众安信息技术服务有限公司 | For delivering the method, apparatus and computer-readable medium of shift classification |
US11810201B2 (en) * | 2018-03-27 | 2023-11-07 | Healthplan Data Solutions, Inc. | Method and system for monitoring prescription drug data and determining claim data accuracy |
US20230077820A1 (en) * | 2018-03-27 | 2023-03-16 | Healthplan Data Solutions Llc | Method and system for monitoring prescription drug data and determining claim data accuracy |
US20240104666A1 (en) * | 2018-03-27 | 2024-03-28 | Healthplan Data Solutions, Inc. | Method and system for monitoring prescription drug data and determining claim data accuracy |
US11763950B1 (en) | 2018-08-16 | 2023-09-19 | Clarify Health Solutions, Inc. | Computer network architecture with machine learning and artificial intelligence and patient risk scoring |
US11605465B1 (en) | 2018-08-16 | 2023-03-14 | Clarify Health Solutions, Inc. | Computer network architecture with machine learning and artificial intelligence and patient risk scoring |
US10963434B1 (en) | 2018-09-07 | 2021-03-30 | Experian Information Solutions, Inc. | Data architecture for supporting multiple search models |
US12066990B1 (en) | 2018-09-07 | 2024-08-20 | Experian Information Solutions, Inc. | Data architecture for supporting multiple search models |
US11734234B1 (en) | 2018-09-07 | 2023-08-22 | Experian Information Solutions, Inc. | Data architecture for supporting multiple search models |
US11194784B2 (en) * | 2018-10-19 | 2021-12-07 | International Business Machines Corporation | Extracting structured information from unstructured data using domain problem application validation |
US11748820B1 (en) | 2019-04-02 | 2023-09-05 | Clarify Health Solutions, Inc. | Computer network architecture with automated claims completion, machine learning and artificial intelligence |
US11625789B1 (en) * | 2019-04-02 | 2023-04-11 | Clarify Health Solutions, Inc. | Computer network architecture with automated claims completion, machine learning and artificial intelligence |
US11157873B2 (en) * | 2019-04-16 | 2021-10-26 | Advanced New Technologies Co., Ltd. | Blockchain-based program review system, method, computing device and storage medium |
US10922652B2 (en) * | 2019-04-16 | 2021-02-16 | Advanced New Technologies Co., Ltd. | Blockchain-based program review system, method, computing device and storage medium |
US11742091B1 (en) | 2019-04-18 | 2023-08-29 | Clarify Health Solutions, Inc. | Computer network architecture with machine learning and artificial intelligence and active updates of outcomes |
US11621085B1 (en) | 2019-04-18 | 2023-04-04 | Clarify Health Solutions, Inc. | Computer network architecture with machine learning and artificial intelligence and active updates of outcomes |
US11636497B1 (en) | 2019-05-06 | 2023-04-25 | Clarify Health Solutions, Inc. | Computer network architecture with machine learning and artificial intelligence and risk adjusted performance ranking of healthcare providers |
US10726359B1 (en) | 2019-08-06 | 2020-07-28 | Clarify Health Solutions, Inc. | Computer network architecture with machine learning and artificial intelligence and automated scalable regularization |
US10990904B1 (en) | 2019-08-06 | 2021-04-27 | Clarify Health Solutions, Inc. | Computer network architecture with machine learning and artificial intelligence and automated scalable regularization |
US11645344B2 (en) | 2019-08-26 | 2023-05-09 | Experian Health, Inc. | Entity mapping based on incongruent entity data |
US11941065B1 (en) | 2019-09-13 | 2024-03-26 | Experian Information Solutions, Inc. | Single identifier platform for storing entity data |
US10910113B1 (en) | 2019-09-26 | 2021-02-02 | Clarify Health Solutions, Inc. | Computer network architecture with benchmark automation, machine learning and artificial intelligence for measurement factors |
US10998104B1 (en) | 2019-09-30 | 2021-05-04 | Clarify Health Solutions, Inc. | Computer network architecture with machine learning and artificial intelligence and automated insight generation |
US11527313B1 (en) | 2019-11-27 | 2022-12-13 | Clarify Health Solutions, Inc. | Computer network architecture with machine learning and artificial intelligence and care groupings |
US20210200896A1 (en) * | 2019-12-30 | 2021-07-01 | Unitedhealth Group Incorporated | Programmatic determinations using decision trees generated from relational database entries |
US11816085B2 (en) * | 2019-12-30 | 2023-11-14 | Unitedhealth Group Incorporated | Programmatic determinations using decision trees generated from relational database entries |
US20220164698A1 (en) * | 2020-11-25 | 2022-05-26 | International Business Machines Corporation | Automated data quality inspection and improvement for automated machine learning |
US11880377B1 (en) | 2021-03-26 | 2024-01-23 | Experian Information Solutions, Inc. | Systems and methods for entity resolution |
US11961611B2 (en) * | 2021-05-03 | 2024-04-16 | Evernorth Strategic Development, Inc. | Automated bias correction for database systems |
US20220351842A1 (en) * | 2021-05-03 | 2022-11-03 | Evernorth Strategic Development, Inc. | Automated bias correction for database systems |
WO2023023342A1 (en) * | 2021-08-19 | 2023-02-23 | Allstate Insurance Company | Automated iterative predictive modeling computing platform |
US12266019B2 (en) * | 2023-09-28 | 2025-04-01 | Healthplan Data Solutions, Inc. | Method and system for monitoring prescription drug data and determining claim data accuracy |
US12079230B1 (en) | 2024-01-31 | 2024-09-03 | Clarify Health Solutions, Inc. | Computer network architecture and method for predictive analysis using lookup tables as prediction models |
Also Published As
Publication number | Publication date |
---|---|
US7392201B1 (en) | 2008-06-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20090048877A1 (en) | | Insurance claim forecasting system |
US8165894B2 (en) | | Fully automated health plan administrator |
US8600769B2 (en) | | Medical bill analysis and review |
US20080120133A1 (en) | | Method for predicting the payment of medical debt |
US20080033750A1 (en) | | Enhanced systems and methods for processing of healthcare information |
US20090204446A1 (en) | | Systems and methods for valuation of life insurance policies |
US20060190300A1 (en) | | Post payment provider agreement process |
Claxton et al. | | Health benefits in 2019: premiums inch higher, employers respond to federal policy |
US20090055227A1 (en) | | Risk Assessment Company |
Wren et al. | | An examination of the potential costs of universal health insurance in Ireland |
Arnold et al. | | Who pays for health care costs? The effects of health care prices on wages |
US10580083B2 (en) | | Recording medium having program for forming a healthcare network |
Walsh et al. | | Projections of expenditure for primary, community and long-term care in Ireland, 2019-2035, based on the Hippocrates model |
US20140350959A1 (en) | | Systems and methods for reducing healthcare transaction costs |
Turner et al. | | The effects of unexpected changes in demand on the performance of emergency departments |
Cid et al. | | Global risk-adjusted payment models |
Perez | | Effect of privatized managed care on public insurance spending and generosity: Evidence from Medicaid |
Farley et al. | | Trends in special medicare payments and service utilization for rural areas in the 1990s |
Duncan | | Mining health claims data for assessing patient risk |
Noe et al. | | Calls for reform to the US hospice system |
Melnick et al. | | An empirical analysis of hospital ED pricing power |
Barker et al. | | The Impact of Hospital Closures and Mergers on Patient Welfare |
Park | | Ensuring Effective Risk Adjustment |
Berman | | Essays in social insurance |
US20230105798A1 (en) | | Recording medium having improved program for forming a healthcare network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |