casm.learn.FittingData

class casm.learn.FittingData(X, y, cv, sample_weight=[], scoring=None, penalty=0.0, tdata=None)
FittingData holds feature values, target values, sample weights, etc. used to solve:
L*X*b = L*y
a weighted linear model where the weights are given by W = L * L.transpose().
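As a minimal sketch of the system above, assuming diagonal weights (so that L is the element-wise square root of W and equals its own transpose), the coefficients b can be obtained with an ordinary least-squares solve on the weighted arrays. The data and variable names are illustrative and are not part of casm.learn.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: 6 samples, 2 features (not taken from casm)
X = rng.normal(size=(6, 2))
y = X @ np.array([[1.5], [-0.5]]) + 0.01 * rng.normal(size=(6, 1))

# Diagonal weight matrix W and a factor L with W = L * L.transpose()
w = np.array([1.0, 2.0, 0.5, 1.0, 3.0, 0.5])
W = np.diag(w)
L = np.diag(np.sqrt(w))

# Solve L*X*b = L*y in the least-squares sense
b, *_ = np.linalg.lstsq(L @ X, L @ y, rcond=None)

# For diagonal W this agrees with the weighted normal equations
# (X^T W X) b = X^T W y
b_normal = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
```

For a diagonal W the two solutions coincide, which is a convenient sanity check when experimenting with different weights.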

X
The training input samples (correlations).
Type: array-like of shape (n_samples, n_features)

y
The target values (property values).
Type: array-like of shape (n_samples, 1)

cv
Provides train/test splits.
Type: cross-validation generator or an iterable

n_samples
The number of samples / target values (number of rows in X).
Type: int

n_features
The number of features (number of columns in X).
Type: int

W
The sample weight matrix.
Type: array-like of shape (n_samples, n_samples)

L
Used to generate weighted_X and weighted_y; W = L * L.transpose().
Type: array-like of shape (n_samples, n_samples)

weighted_X
Weighted training input data, weighted_X = L*X.
Type: array-like of shape (n_samples, n_features)

weighted_y
Weighted target values, weighted_y = L*y.
Type: array-like of shape (n_samples, 1)

scoring
A string or a scorer callable (object or function) with signature scorer(estimator, X, y). This is the scoring parameter for sklearn.model_selection.cross_val_score; the default, None, uses estimator.score().
Type: string, callable or None, optional, default: None

penalty
The CV score is increased by penalty*(number of selected basis functions).
Type: float, optional, default=0.0

data
Optionally, stores TrainingData.data with weighted_X and weighted_y data added. No checks are made for consistency between tdata.X, tdata.y and X, y, or other parameters.
Type: pandas.DataFrame, optional, default=None
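
The relationship between the W, L, weighted_X, and weighted_y attributes can be illustrated as follows. This sketch assumes L is taken as the lower-triangular Cholesky factor of W, which is one factorization satisfying W = L * L.transpose(); whether FittingData uses this particular factorization internally is an assumption, and the numeric values are made up.

```python
import numpy as np

n_samples, n_features = 4, 3
X = np.arange(12, dtype=float).reshape(n_samples, n_features)
y = np.array([[1.0], [2.0], [3.0], [4.0]])

# A symmetric positive-definite weight matrix (illustrative values)
W = np.diag([1.0, 2.0, 0.5, 0.5])
W[0, 1] = W[1, 0] = 0.3

# Lower-triangular factor with W = L @ L.T (assumed factorization)
L = np.linalg.cholesky(W)

weighted_X = L @ X  # shape (n_samples, n_features)
weighted_y = L @ y  # shape (n_samples, 1)
```

np.linalg.cholesky raises LinAlgError if W is not positive-definite, which matches the requirement stated for 2-dimensional sample weights.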

Parameters
X (array-like of shape (n_samples, n_features)) – The training input samples (correlations).
y (array-like of shape (n_samples, 1)) – The target values (property values).
cv (cross-validation generator or an iterable) – Provides train/test splits.
sample_weight (None, 1-d array-like of shape (n_samples, 1), or 2-d array-like of shape (n_samples, n_samples)) – Sample weights.
- if sample_weight is None (default, unweighted):
  W = np.matlib.eye(N)
- if sample_weight is 1-dimensional:
  W = np.diag(sample_weight)*Nvalue/np.sum(sample_weight)
- if sample_weight is 2-dimensional (must be Hermitian, positive-definite):
  W = sample_weight*Nvalue/np.sum(sample_weight)
scoring (string, callable or None, optional, default=None) – A string or a scorer callable (object or function) with signature scorer(estimator, X, y). This is the scoring parameter for sklearn.model_selection.cross_val_score; the default, None, uses estimator.score().
penalty (float, optional, default=0.0) – The CV score is increased by penalty*(number of selected basis functions).
tdata (TrainingData instance, optional, default=None) – Optionally, store TrainingData.data with weighted_X and weighted_y data added. No checks are made for consistency between tdata.X, tdata.y and X, y, or other parameters.
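
The three sample_weight cases above can be sketched as a small helper. The function name build_weight_matrix is hypothetical, np.eye is used here in place of np.matlib.eye, and both N and Nvalue are assumed to mean n_samples; none of this is taken from the casm.learn source.

```python
import numpy as np

def build_weight_matrix(sample_weight, n_samples):
    """Hypothetical helper mirroring the three documented cases."""
    if sample_weight is None:
        # Unweighted: identity matrix
        return np.eye(n_samples)
    w = np.asarray(sample_weight, dtype=float)
    if w.ndim == 1:
        # One weight per sample on the diagonal, normalized so sum(W) = n_samples
        return np.diag(w) * n_samples / np.sum(w)
    if w.ndim == 2:
        # Full weight matrix, normalized by the sum of all its entries
        return w * n_samples / np.sum(w)
    raise ValueError("sample_weight must be None, 1-d, or 2-d")

W_none = build_weight_matrix(None, 3)
W_1d = build_weight_matrix([1.0, 2.0, 1.0], 3)
W_2d = build_weight_matrix([[2.0, 0.0], [0.0, 6.0]], 2)
```

In both weighted cases the normalization rescales W so that its entries sum to the number of samples, which keeps the overall scale comparable to the unweighted identity matrix.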

__init__(X, y, cv, sample_weight=[], scoring=None, penalty=0.0, tdata=None)
Takes the same parameters as the class, listed under Parameters above.

Methods
__init__(X, y, cv[, sample_weight, scoring, …])
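
To illustrate how the cv and penalty arguments might interact, the sketch below builds (train, test) index pairs by hand as the "iterable" form of cv, then adds penalty*(number of selected basis functions) to a mean-squared-error CV score. The helper names, the MSE scoring choice, and the boolean selection mask are assumptions for illustration only; they are not casm.learn functions.

```python
import numpy as np

def kfold_splits(n_samples, n_splits):
    """Hypothetical helper: yield (train_indices, test_indices) pairs."""
    indices = np.arange(n_samples)
    for test in np.array_split(indices, n_splits):
        train = np.setdiff1d(indices, test)
        yield train, test

def penalized_cv_score(X, y, cv, selected, penalty=0.0):
    """Mean test-set MSE over the splits, plus penalty * (number selected)."""
    Xs = X[:, selected]
    errors = []
    for train, test in cv:
        # Fit on the training rows, evaluate on the held-out rows
        b, *_ = np.linalg.lstsq(Xs[train], y[train], rcond=None)
        residual = y[test] - Xs[test] @ b
        errors.append(np.mean(residual ** 2))
    return np.mean(errors) + penalty * np.count_nonzero(selected)

rng = np.random.default_rng(1)
X = rng.normal(size=(12, 4))
y = X[:, [0]] * 2.0  # target depends only on the first feature
selected = np.array([True, False, True, False])

# Materialize the splits so the same folds can be reused
cv = list(kfold_splits(n_samples=12, n_splits=3))
score_plain = penalized_cv_score(X, y, cv, selected)
score_penalized = penalized_cv_score(X, y, cv, selected, penalty=0.1)
```

The splits are stored in a list because a generator would be exhausted after the first pass. With two selected basis functions and penalty=0.1, the penalized score exceeds the plain score by 0.2.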