XGBoost (eXtreme Gradient Boosting) is a popular gradient boosting library for structured data. MLflow provides native integration with XGBoost for experiment tracking, model management, and deployment.
This integration supports both the native XGBoost API and the scikit-learn-compatible interface, so you can track experiments and deploy models whichever API you prefer.
import mlflow
import xgboost as xgb
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

# Enable autologging - captures parameters, metrics, and the model automatically
mlflow.xgboost.autolog()

# Load and prepare data
data = load_diabetes()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

# Prepare data in XGBoost format
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

# Train model - MLflow logs the run without any explicit logging calls
with mlflow.start_run():
    model = xgb.train(
        params={
            "objective": "reg:squarederror",
            "max_depth": 6,
            "learning_rate": 0.1,
        },
        dtrain=dtrain,
        num_boost_round=100,
        evals=[(dtrain, "train"), (dtest, "test")],
    )

# Minimal variant: native API with autologging and no evaluation sets.
# Because no evals are passed, there are no per-round metrics to record.
import mlflow
import xgboost as xgb
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

# Load data
data = load_diabetes()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

# Enable autologging
mlflow.xgboost.autolog()

# Train with native API
with mlflow.start_run():
    dtrain = xgb.DMatrix(X_train, label=y_train)
    model = xgb.train(
        params={"objective": "reg:squarederror", "max_depth": 6},
        dtrain=dtrain,
        num_boost_round=100,
    )

What Gets Logged
When autologging is enabled, MLflow automatically captures:
Parameters: All booster parameters and training configuration
Metrics: Training and validation metrics for each boosting round, for every dataset passed in evals
Feature Importance: Multiple importance types (weight, gain, cover) with visualizations
Model: The trained model with proper serialization format
Artifacts: Feature importance plots and JSON data