The first article that was pulled from the reading sessions in NephSAP is this retrospective review of ARF by the PICARD Group.
The introduction begins by setting the scene: Acute renal failure is lethal, with over 50% mortality in most series. Additionally, we haven’t seen any major improvement in mortality for these patients in the last 30 years. Then the authors state that the ANP ARF trials were well designed. This is almost laughable, as the ANP trials are the poster child for how not to do a clinical trial in ARF. After that they go over the fact that ARF research is hard because the patients are sick and we lack a standardized methodology for talking about the severity of illness. APACHE II scores are notoriously bad at predicting mortality in renal failure. And the storied Cleveland Clinic ARF score has not been validated outside that institution. Then the authors lay out the problem they are trying to answer:
To prepare for future clinical trials in ARF, it is essential that valid, generalizable models for risk adjustment be developed, both for stratification in patient selection and for covariate adjustment in the event of imbalanced randomization.
So without the ability to accurately risk stratify patients and successfully balance study groups, we will not be able to properly evaluate new therapies. This study group was created to do an RCT of CVVH vs IHD and hence captured lots of prospective data on patients in ARF. Only 166 of the 851 were enrolled and randomized in the trial (negative, and tragically so: they were unlucky with their randomization and ended up with a significantly higher severity of illness in the CVVH group). This paper uses all 851 patients to look for predictors of mortality in ARF.
The entry criteria were:
- ARF + ICU + Nephrology consult
- No CKD: Cr > 2 or BUN > 40
- CKD: Cr increased by 1.0
- No prior dialysis, kidney transplant, obstructive uropathy or pre-renal azotemia
Severity of illness scores (they calculated 13 different scores besides creating their own) were calculated on the day of nephrology consult. Using the day of consult seems arbitrary and prone to variations in local practice patterns. If the study is done in an area where early consult is the norm, then presumably the patients will have less severe disease and better outcomes. A score derived this way will then underestimate mortality in a center whose culture leans toward later consultation (and presumably sicker patients).
After excluding subjects whose missing data prevented calculation of the scores, they were left with a cohort of 605 patients.
Half of the cohort received dialysis while in the ICU and 51.9% died in the hospital.
They created a risk score based on:
- Respiratory failure
- Liver failure
- Hematologic failure
- Log urine output
- Heart rate
They then showed their new model beating the crap out of the old models, with an area under the ROC curve (AUC) of 0.832. The best of the other scoring systems, SAPS2, managed only 0.766. APACHE II had an AUC of only 0.634. (A perfect AUC is 1.0 and a worthless one is 0.5.)
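For anyone who wants a feel for what those AUC numbers mean, here is a minimal sketch in Python. It treats the AUC as the chance that a randomly chosen patient who died was given a higher risk score than a randomly chosen survivor, with ties counting as half. The function name `auc` and the toy numbers are made up for illustration; none of this is PICARD data or the authors' code.

```python
def auc(scores, died):
    """Area under the ROC curve by pairwise comparison (Mann-Whitney form).

    scores: predicted risk for each patient (higher = sicker)
    died:   1/True if the patient died, 0/False if they survived
    """
    deaths = [s for s, d in zip(scores, died) if d]
    survivors = [s for s, d in zip(scores, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")

    wins = 0.0
    for dead_score in deaths:
        for alive_score in survivors:
            if dead_score > alive_score:
                wins += 1.0   # score ranked this pair correctly
            elif dead_score == alive_score:
                wins += 0.5   # a tie is a coin flip
    return wins / (len(deaths) * len(survivors))


# Hypothetical toy cohort: six patients, three deaths.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_died = [1, 1, 0, 1, 0, 0]
print(auc(toy_scores, toy_died))
```

A score that ranks every death above every survivor gets 1.0, and a score that gives everyone the same number gets 0.5, which is why 0.5 counts as worthless.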
The biggest weakness is that the Mehta score (the eponymous name for the score created with the PICARD data) was shown to predict outcomes well only when tested on the same data used to define the score. This is the ultimate home field advantage. Levey in the MDRD group divided their cohort into a derivation group and a validation group. The current authors did not do that.
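To make the derivation/validation idea concrete, here is a rough sketch of how a cohort could be split before any model building. The helper name `split_cohort`, the one-third validation fraction, and the seed are all assumptions chosen for illustration; they are not the split the MDRD group used, and the patient IDs are just placeholder integers.

```python
import random


def split_cohort(patient_ids, validation_fraction=1 / 3, seed=0):
    """Randomly split patients into (derivation, validation) groups.

    validation_fraction and seed are illustrative defaults, not values
    taken from any published study.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seeded, so the split is reproducible
    n_validation = round(len(ids) * validation_fraction)
    return ids[n_validation:], ids[:n_validation]


# Placeholder IDs standing in for the 605 patients with complete data.
derivation, validation = split_cohort(range(605))
print(len(derivation), len(validation))
```

The score would be built using only the derivation group, and the AUC reported from the validation group, so the model never gets graded on patients it has already seen.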
This will be a moot point if the score is validated in a new independent study. I wonder if the ATN group did that? I bet they did, or will publish on this soon.