
K-means Clustering

About the Algorithm Recipe 🍰

K-means clustering is a popular unsupervised machine learning algorithm for partitioning a dataset into a predefined number of clusters (K). It works by iteratively assigning each data point to the nearest cluster centroid and then moving each centroid to the mean of the points assigned to it, repeating until the assignments stop changing. K-means is widely used for tasks such as customer segmentation, image compression, anomaly detection, and document clustering. It is fast and easy to implement, though the result depends on the initial centroids and on the value of K you choose in advance.
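
To make the assign-then-update loop concrete, here is a minimal NumPy sketch of it. This is an illustration only, not scikit-learn's implementation: the function names `nearest_centroid` and `kmeans_sketch`, the random choice of starting centroids, and the handling of empty clusters are all assumptions made for the example.

```python
import numpy as np

def nearest_centroid(X, centroids):
    # Euclidean distance from every point to every centroid, shape (n_points, k)
    distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)

def kmeans_sketch(X, k, n_iters=100, seed=0):
    """Illustrative assign/update loop (hypothetical helper, not a library API)."""
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct data points picked at random
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest centroid
        labels = nearest_centroid(X, centroids)
        # Update step: each centroid moves to the mean of its points;
        # a cluster with no points keeps its previous centroid
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving
        centroids = new_centroids
    return centroids, nearest_centroid(X, centroids)

# Two well-separated blobs as toy data
rng = np.random.default_rng(1)
X_demo = np.vstack([rng.normal(0.0, 0.1, (50, 2)), rng.normal(5.0, 0.1, (50, 2))])
demo_centroids, demo_labels = kmeans_sketch(X_demo, k=2)
print(demo_centroids)
```

On data this cleanly separated, the loop should put each blob in its own cluster after a few iterations. Real data is rarely this tidy, which is one reason library implementations run several initializations.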

Cookin' time! 🍳


from sklearn.cluster import KMeans
import numpy as np

# Generate some random data points for demonstration (seeded so runs are repeatable)
rng = np.random.default_rng(0)
X = rng.random((100, 2))

# Define the number of clusters (K)
k = 3

# Initialize KMeans object (n_init and random_state set explicitly for reproducible results)
kmeans = KMeans(n_clusters=k, n_init=10, random_state=0)

# Fit the model to the data
kmeans.fit(X)

# Get the cluster centroids
centroids = kmeans.cluster_centers_

# Get the cluster labels for each data point
labels = kmeans.labels_

# Print the cluster centroids and labels
print("Cluster Centroids:")
print(centroids)
print("\nCluster Labels:")
print(labels)

In this code:

  • We generate some random data points (X) for demonstration purposes.

  • We define the number of clusters (k) to be 3.

  • We initialize a KMeans object with the specified number of clusters.

  • We fit the KMeans model to the data.

  • We retrieve the cluster centroids and cluster labels.

  • Finally, we print the cluster centroids and labels.
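
Once the model is fitted, it can also label points it has not seen before. The sketch below assumes the same kind of random 2-D data as the recipe; the `new_points` array is made up for illustration. It uses `predict` to assign new points to the nearest learned centroid and reads `inertia_`, the sum of squared distances from each training point to its centroid.

```python
import numpy as np
from sklearn.cluster import KMeans

# Same style of demo data as above (seeded for repeatability)
rng = np.random.default_rng(0)
X = rng.random((100, 2))

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Hypothetical new observations that were not part of the training data
new_points = np.array([[0.1, 0.1], [0.9, 0.9]])
new_labels = kmeans.predict(new_points)

print("Labels for new points:", new_labels)
# Lower inertia means points sit closer to their centroids
print("Inertia:", kmeans.inertia_)
```

The label numbers themselves are arbitrary: cluster 0 in one run may be cluster 2 in another unless `random_state` is fixed.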
