Validation Pipeline #1

Open
Wizzzard93 opened this issue Mar 21, 2024 · 3 comments

Comments

@Wizzzard93

Wizzzard93 commented Mar 21, 2024

Hi,

I want to validate your model with in-house data.
I tried to port your model to Python, but I got slightly different risk scores.
Can you provide a validation pipeline in R?

Alternatively, am I missing a preprocessing step?
```python
import xgboost as xgb
import numpy as np

# Example patient values
age = 53
MCV_fL = 88
MCHC_g_L = 330
PT = 50
WBC_G_L = 10
Lymphocytes_G_L = 3
Monocytes_G_L = 6
Platelets_G_L = 6
fibri_gL = 6
LDH_UI_L = 250

# Monocytes as a percentage of the white blood cell count
mono_percent = (Monocytes_G_L * 100) / WBC_G_L

# Sample data with 10 features
sample_data = np.array([[fibri_gL, MCV_fL, mono_percent, LDH_UI_L, PT, MCHC_g_L, Lymphocytes_G_L, age, Monocytes_G_L, Platelets_G_L]])  # Example data

# Convert the sample data to DMatrix
dtest = xgb.DMatrix(sample_data)

# Make the prediction with probability estimates
# (`model` is an xgb.Booster loaded earlier; that step is not shown in this post)
prediction = model.predict(dtest)
```

BR
Merlin
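
One thing that may be worth checking when a port gives slightly different scores is whether the columns reach the model in the order it was trained with. The sketch below is only an illustration: the feature order is copied from the array in the post above (whether it matches the R model is an assumption to verify against the repository), and `FEATURE_ORDER` and `build_feature_row` are hypothetical names, not part of the project.

```python
# Feature order copied from the array in the post above. Whether this is the
# order the R model expects is an ASSUMPTION to check against the repository.
FEATURE_ORDER = [
    "fibri_gL", "MCV_fL", "mono_percent", "LDH_UI_L", "PT",
    "MCHC_g_L", "Lymphocytes_G_L", "age", "Monocytes_G_L", "Platelets_G_L",
]


def build_feature_row(patient):
    """Return the feature values as a list in FEATURE_ORDER (hypothetical helper)."""
    values = dict(patient)
    # Derived feature, computed the same way as in the post above.
    values["mono_percent"] = values["Monocytes_G_L"] * 100 / values["WBC_G_L"]
    missing = [name for name in FEATURE_ORDER if name not in values]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [float(values[name]) for name in FEATURE_ORDER]


# Same example values as in the post above.
patient = {
    "age": 53, "MCV_fL": 88, "MCHC_g_L": 330, "PT": 50, "WBC_G_L": 10,
    "Lymphocytes_G_L": 3, "Monocytes_G_L": 6, "Platelets_G_L": 6,
    "fibri_gL": 6, "LDH_UI_L": 250,
}
row = build_feature_row(patient)
```

The resulting row can be wrapped as `np.array([row])` and passed to `xgb.DMatrix` together with `feature_names=FEATURE_ORDER`, which makes the column order explicit instead of relying on position alone.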

@VincentAlcazer
Owner

VincentAlcazer commented Mar 23, 2024

Hey,

Thank you for your interest in our work.

The model is fully available in R in the repository. In addition to the raw prediction scores, optimal and confident cutoffs were set to guide clinical decisions.

If you want to learn more about how these cutoffs were set, the article is currently in press and should be available soon.

Best,

Vincent Alcazer
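
As a rough illustration of how a raw score might be read against named cutoffs, here is a minimal sketch. Everything in it is an assumption: the threshold values are placeholders, not the ones used in the repository or the article, the rule "score at or above the threshold" is only one possible convention, and `cutoffs_reached` is a hypothetical helper.

```python
# Placeholder thresholds for illustration only. They are NOT the cutoffs
# from the repository or the article.
HYPOTHETICAL_CUTOFFS = {"optimal": 0.5, "confident": 0.8}


def cutoffs_reached(score, cutoffs):
    """Return the names of the cutoffs that the score meets or exceeds.

    Assumes a higher score means higher risk, which is an assumption here.
    """
    return [name for name, threshold in cutoffs.items() if score >= threshold]


reached = cutoffs_reached(0.65, HYPOTHETICAL_CUTOFFS)
```

With these placeholder values, a score of 0.65 reaches the "optimal" threshold but not the "confident" one. The real thresholds and their interpretation should be taken from the repository.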

@Wizzzard93
Author

Hi,

Thanks for the response; I am excited to read the article.
Would you mind sharing your validation pipeline in R?
I have a dataset prepared and I would like to see if I can achieve similar performance :)

BR
Merlin
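
Before comparing performance on a new cohort, it may help to confirm that the Python port reproduces the R scores on the same rows. The sketch below assumes both sets of scores have been exported in the same row order; `compare_scores`, the tolerance, and the example numbers are all made up for illustration.

```python
def compare_scores(reference, ported, abs_tol=1e-6):
    """Compare two equally ordered score lists (hypothetical helper).

    Returns the largest absolute difference and the indices of rows whose
    difference exceeds abs_tol.
    """
    if len(reference) != len(ported):
        raise ValueError("score lists differ in length")
    diffs = [abs(r - p) for r, p in zip(reference, ported)]
    mismatched = [i for i, d in enumerate(diffs) if d > abs_tol]
    return {"max_abs_diff": max(diffs, default=0.0), "mismatched_rows": mismatched}


# Made-up numbers, standing in for scores exported from R and from Python.
r_scores = [0.12, 0.57, 0.91]
py_scores = [0.12, 0.57, 0.88]
report = compare_scores(r_scores, py_scores)
```

Rows listed under `mismatched_rows` are a reasonable place to start looking for a preprocessing or column-order difference.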

@VincentAlcazer
Owner

Dear Merlin,

The paper is now available at https://www.thelancet.com/journals/landig/article/PIIS2589-7500(24)00044-X/fulltext
The full R pipeline, including the cutoffs used, is available in the GitHub repository.

Please let me know if you run into any issues running it. I would be very interested to see the results on your cohort.

Best,

Vincent
