casm.learn.FittingData
- class casm.learn.FittingData(X, y, cv, sample_weight=[], scoring=None, penalty=0.0, tdata=None)
FittingData holds the feature values, target values, sample weights, and related settings used to solve:
L*X * b = L*y
for the coefficients b. This is a weighted linear model in which the weights are given by W = L * L.transpose().
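The weighted problem above can be sketched with plain NumPy. This is a minimal illustration, not the class's own implementation: it assumes a diagonal W with made-up weight values, and it assumes L is taken as the Cholesky factor of W (the documentation only states W = L * L.transpose()).

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 6, 2

X = rng.normal(size=(n_samples, n_features))  # feature values (correlations)
y = rng.normal(size=(n_samples, 1))           # target values (property values)

# Diagonal, positive sample weights (illustrative values only).
w = np.array([1.0, 2.0, 0.5, 1.5, 1.0, 3.0])
W = np.diag(w)

# Assumption: L is the Cholesky factor, so that W = L @ L.T.
L = np.linalg.cholesky(W)

weighted_X = L @ X
weighted_y = L @ y

# Least-squares solution of L*X * b = L*y.
b, *_ = np.linalg.lstsq(weighted_X, weighted_y, rcond=None)

# For this diagonal W, the same b solves X^T W X b = X^T W y.
b_normal = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
```

With diagonal weights, each row of X and y is simply scaled by the square root of its weight, so heavily weighted samples contribute more to the fit.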
- X
The training input samples (correlations).
- Type:
array-like of shape (n_samples, n_features)
- y
The target values (property values).
- Type:
array-like of shape: (n_samples, 1)
- cv
Provides train/test splits
- Type:
cross-validation generator or an iterable
- n_samples
The number of samples / target values (number of rows in X)
- Type:
int
- n_features
The number of features (number of columns in X)
- Type:
int
- W
Contains sample weights.
- Type:
array-like of shape: (n_samples, n_samples)
- L
Used to generate weighted_X and weighted_y, W = L * L.transpose().
- Type:
array-like of shape: (n_samples, n_samples)
- weighted_X
Weighted training input data, weighted_X = L*X.
- Type:
array-like of shape: (n_samples, n_features)
- weighted_y
Weighted target values, weighted_y = L*y.
- Type:
array-like of shape: (n_samples, 1)
- scoring
A string or a scorer callable object / function with signature scorer(estimator, X, y), used as the scoring parameter for sklearn.model_selection.cross_val_score. If None (the default), estimator.score() is used.
- Type:
string, callable or None, optional, default: None
- penalty
The CV score is increased by ‘penalty*(number of selected basis functions)’.
- Type:
float, optional, default=0.0
- data
Optionally, store TrainingData.data with weighted_X and weighted_y data added. No checks are made for consistency of tdata.X, tdata.y and X and y or other parameters.
- Type:
pandas.DataFrame, optional, default=None
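As an illustration of how cv and penalty fit together, the following sketch builds cv as a plain iterable of (train, test) index pairs and adds the penalty term to a cross-validation score. The data, the selected mask, and the use of root-mean-square error as the score are all assumptions made for the example; the class itself delegates scoring to the scoring attribute described above.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_features = 12, 3
X = rng.normal(size=(n_samples, n_features))
y = X @ np.array([[1.0], [0.0], [-2.0]]) + 0.01 * rng.normal(size=(n_samples, 1))

# cv as an iterable of (train_indices, test_indices) pairs: 3 folds.
indices = np.arange(n_samples)
folds = np.array_split(indices, 3)
cv = [(np.setdiff1d(indices, test), test) for test in folds]

# Hypothetical selection: only basis functions 0 and 2 are used in the fit.
selected = np.array([True, False, True])
penalty = 0.001

fold_rms = []
for train, test in cv:
    # Fit on the training rows, using only the selected columns.
    b, *_ = np.linalg.lstsq(X[train][:, selected], y[train], rcond=None)
    residual = y[test] - X[test][:, selected] @ b
    fold_rms.append(np.sqrt(np.mean(residual ** 2)))

cv_score = float(np.mean(fold_rms))

# The CV score is increased by penalty * (number of selected basis functions).
penalized_score = cv_score + penalty * int(np.sum(selected))
```

A positive penalty makes a fit with more selected basis functions score worse than a smaller fit with the same cross-validation error.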
- Parameters:
X (array-like of shape (n_samples, n_features)) – The training input samples (correlations).
y (array-like of shape: (n_samples, 1)) – The target values (property values).
cv (cross-validation generator or an iterable) – Provides train/test splits
sample_weight (None, 1-d array-like of shape: (n_samples, 1), or 2-d array-like of shape: (n_samples, n_samples)) –
Sample weights.
- if sample_weight is None: (default, unweighted)
W = np.matlib.eye(N)
- if sample_weight is 1-dimensional:
W = np.diag(sample_weight)*Nvalue/np.sum(sample_weight)
- if sample_weight is 2-dimensional (must be Hermitian, positive-definite):
W = sample_weight*Nvalue/np.sum(sample_weight)
scoring (string, callable or None, optional, default=None) – A string or a scorer callable object / function with signature scorer(estimator, X, y), used as the scoring parameter for sklearn.model_selection.cross_val_score. If None (the default), estimator.score() is used.
penalty (float, optional, default=0.0) – The CV score is increased by ‘penalty*(number of selected basis functions)’.
tdata (TrainingData instance, optional, default=None) – Optionally, store TrainingData.data with weighted_X and weighted_y data added. No checks are made for consistency of tdata.X, tdata.y and X and y or other parameters.
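The three sample_weight cases listed above can be sketched as follows. build_W is a hypothetical helper written for this example, not part of casm.learn. It assumes that N and Nvalue both denote the number of samples, uses np.eye in place of np.matlib.eye, and treats a flat sequence as the 1-dimensional case.

```python
import numpy as np

def build_W(sample_weight, n_samples):
    """Hypothetical helper mirroring the three documented cases."""
    if sample_weight is None:
        # Unweighted: identity matrix.
        return np.eye(n_samples)
    sample_weight = np.asarray(sample_weight, dtype=float)
    if sample_weight.ndim == 1:
        # One weight per sample, rescaled so the weights sum to n_samples.
        return np.diag(sample_weight) * n_samples / np.sum(sample_weight)
    if sample_weight.ndim == 2:
        # Full weight matrix (expected Hermitian, positive-definite),
        # rescaled by the sum of all of its entries.
        return sample_weight * n_samples / np.sum(sample_weight)
    raise ValueError("sample_weight must be None, 1-d, or 2-d")

W_none = build_W(None, 3)
W_1d = build_W([1.0, 2.0, 1.0], 3)
W_2d = build_W(np.diag([1.0, 2.0, 1.0]), 3)
```

Under these assumptions, the 1-dimensional weights [1, 2, 1] and the equivalent diagonal matrix produce the same W, with diagonal entries 0.75, 1.5, 0.75.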
- __init__(X, y, cv, sample_weight=[], scoring=None, penalty=0.0, tdata=None)
Methods
__init__(X, y, cv[, sample_weight, scoring, ...])