greedytscluster module
GreedyTSCluster class
- class tscluster.greedytscluster.GreedyTSCluster(n_clusters: int, scheme: str = 'z1c0', *, n_allow_assignment_change: None | int = None, random_state: None | int = None, initialization: str = 'kmeans++')
Bases: TSCluster, TSClusterInterface
(Under development) Class for the Maxima Minimation (MM) algorithm (a.k.a. greedy algorithm) for time-series clustering. Throughout this doc and code, ‘z’ refers to cluster centers, while ‘c’ refers to label assignment. This creates a GreedyTSCluster object.
- Parameters:
- n_clusters: int
The number of clusters to generate.
- scheme: {‘z0c0’, ‘z0c1’, ‘z1c0’, ‘z1c1’}, default=‘z1c0’
The scheme to use for time-series clustering. One of:
‘z0c0’ means fixed center, fixed assignment
‘z0c1’ means fixed center, changing assignment
‘z1c0’ means changing center, fixed assignment
‘z1c1’ means changing center, changing assignment
The scheme must be a dynamic label assignment scheme (either ‘z1c1’ or ‘z0c1’) when constraining cluster changes with n_allow_assignment_change.
- n_allow_assignment_change: int or None, default=None
Penalty added to changing assignments over time for ‘c1’ schemes.
- random_state: int or None, default=None
Random seed for reproducibility.
- initialization: str, default=‘kmeans++’
Method to initialize cluster centers. Must be one of {‘kmeans++’, ‘random’}.
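The scheme strings and the constraint rule above can be read as two boolean flags. The following is a minimal sketch of that reading; `SchemeFlags`, `parse_scheme`, and `check_constraint` are hypothetical helpers written for illustration and are not part of tscluster.

```python
from typing import NamedTuple, Optional


class SchemeFlags(NamedTuple):
    """Hypothetical helper type; not part of tscluster."""
    dynamic_centers: bool      # 'z1': cluster centers may change over time
    dynamic_assignment: bool   # 'c1': label assignments may change over time


VALID_SCHEMES = ("z0c0", "z0c1", "z1c0", "z1c1")


def parse_scheme(scheme: str) -> SchemeFlags:
    """Decode a scheme string such as 'z1c0' into two boolean flags."""
    if scheme not in VALID_SCHEMES:
        raise ValueError(f"scheme must be one of {VALID_SCHEMES}, got {scheme!r}")
    # Character 1 follows 'z', character 3 follows 'c'.
    return SchemeFlags(
        dynamic_centers=scheme[1] == "1",
        dynamic_assignment=scheme[3] == "1",
    )


def check_constraint(scheme: str, n_allow_assignment_change: Optional[int]) -> None:
    """Reject a change constraint combined with a fixed-assignment scheme."""
    flags = parse_scheme(scheme)
    if n_allow_assignment_change is not None and not flags.dynamic_assignment:
        raise ValueError(
            "n_allow_assignment_change requires a 'c1' scheme ('z0c1' or 'z1c1')"
        )


print(parse_scheme("z1c0"))
check_constraint("z1c1", n_allow_assignment_change=3)  # accepted
```

Whether the library performs this validation in the constructor or during fitting is not stated here; the sketch only mirrors the documented rule.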
- Attributes:
cluster_centers_: Cluster centers learned by the model.
fitted_data_shape_: Shape of the data the model was fit on.
label_dict_: Returns a dictionary of the labels whose keys are ‘T’, ‘N’, and ‘F’ (corresponding to time steps, entities, and features respectively). Value of each key is a list such that the value of key:
labels_: Cluster labels for each sample at each time step.
n_changes_: Returns the total number of label changes.
Methods
fit(X[, label_dict, verbose, print_to, max_iter]): Fit the temporal clustering model using greedy optimization.
get_dynamic_entities(): Return the dynamic entities and their number of changes.
get_index_of_label(labels[, axis]): Return the integer indexes of the given labelled items in self.label_dict_.
get_label_of_index(indexes[, axis]): Return the labels of the given integer indexes as labelled in self.label_dict_.
get_named_cluster_centers([label_dict]): Return the cluster centers with custom names of time steps and features.
get_named_labels([label_dict]): Return a data frame of the label assignments with custom names of time steps and entities.
set_label_dict(value): Manually set the label_dict_.
- property cluster_centers_
Cluster centers learned by the model.
- Returns:
- np.ndarray of shape (T, K, F)
The cluster centroids for each cluster k and time t.
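To make the (T, K, F) layout concrete, here is a small NumPy sketch that assigns each sample to its nearest center at every time step. It assumes Euclidean distance and uses a hypothetical helper, `assign_to_nearest`; it illustrates the array shapes only and is not the library's fitting routine.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, K, F = 4, 6, 2, 3

X = rng.normal(size=(T, N, F))        # data: time steps x samples x features
centers = rng.normal(size=(T, K, F))  # same layout as cluster_centers_


def assign_to_nearest(X: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Return labels of shape (N, T) by nearest center at each time step."""
    # Broadcast to (T, N, K, F): every sample against every center per time step.
    diffs = X[:, :, None, :] - centers[:, None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)  # (T, N, K)
    return dists.argmin(axis=-1).T          # transpose (T, N) -> (N, T)


labels = assign_to_nearest(X, centers)
print(labels.shape)  # (6, 4)
```

Indexing `centers[t, k]` gives the F-dimensional centroid of cluster k at time step t.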
- fit(X: npt.NDArray[np.float64], label_dict: dict | None = None, verbose: bool = True, print_to: TextIO = sys.stdout, max_iter: int = 1000, **kwargs) -> GreedyTSCluster
Fit the temporal clustering model using greedy optimization.
- Parameters:
- X: np.ndarray of shape (T, N, F)
The input time series data, where T is the number of time steps, N is the number of samples, and F is the number of features.
- label_dict: dict, optional
Optional dictionary of axis labels used for interpretability.
- verbose: bool, default=True
If True, print progress and diagnostic information during fitting.
- print_to: TextIO, default=sys.stdout
File-like stream to which verbose logs are written.
- max_iter: int, default=1000
Maximum number of optimization iterations.
- Returns:
- self: GreedyTSCluster
The fitted model instance.
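Data is often stored entity-first, so a short sketch of preparing the (T, N, F) input may help. The example below uses only NumPy. The `label_dict` layout (one list of names per axis, under the keys ‘T’, ‘N’, and ‘F’) is an assumption based on the attribute description above, and all names in it are made up.

```python
import numpy as np

rng = np.random.default_rng(42)
n_entities, n_steps, n_features = 5, 3, 2

# Hypothetical raw data stored entity-first: (N, T, F).
by_entity = rng.normal(size=(n_entities, n_steps, n_features))

# Reorder the axes to the (T, N, F) layout that fit() expects.
X = np.transpose(by_entity, (1, 0, 2)).astype(np.float64)

# Assumed layout: one list of names per axis, matching the axis lengths.
label_dict = {
    "T": [f"t{t}" for t in range(n_steps)],
    "N": [f"entity_{i}" for i in range(n_entities)],
    "F": [f"feature_{j}" for j in range(n_features)],
}

print(X.shape)  # (3, 5, 2)
```

With tscluster installed, and going only by the signatures documented here, the array could then be passed as `GreedyTSCluster(n_clusters=2, random_state=0).fit(X, label_dict=label_dict, verbose=False)`.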
- property fitted_data_shape_: Tuple[int, int, int]
Shape of the data the model was fit on.
- Returns:
- tuple of int
Tuple (T, N, F) corresponding to time, samples, and features.
- property labels_
Cluster labels for each sample at each time step.
- Returns:
- np.ndarray of shape (N, T)
The cluster assignment for each sample and time.
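Given a label array in this (N, T) layout, label changes can be counted with NumPy. The sketch below assumes that a "change" means a sample's label differs between two consecutive time steps. That is one plausible reading of n_changes_ and get_dynamic_entities(), not a confirmed description of how the library computes them.

```python
import numpy as np

# Hand-written labels for N=3 samples over T=4 time steps.
labels = np.array([
    [0, 0, 0, 0],  # never changes
    [0, 1, 1, 0],  # changes twice
    [1, 1, 0, 0],  # changes once
])

# Compare each time step with the previous one: shape (N, T-1).
changed = labels[:, 1:] != labels[:, :-1]

changes_per_entity = changed.sum(axis=1)               # array([0, 2, 1])
total_changes = int(changes_per_entity.sum())          # 3
dynamic_entities = np.flatnonzero(changes_per_entity)  # array([1, 2])

print(total_changes, dynamic_entities.tolist())
```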