K-means and Fuzzy C-means Clustering Using a Naive Algorithm and Particle Swarm Optimization

Features

  • The code has been written and tested in Python 3.8.8.

  • Two clustering methods (K-means and fuzzy C-means) and two solvers (naive algorithm and PSO).

  • For the K-means clustering method:

    • the distance from the cluster centers is used as the clustering error;
    • the function minimized is the sum of squared errors;
    • the silhouette coefficient and the Davies–Bouldin index are available as metrics;
    • the function assign_data can be used to classify new data (see the first sketch after this list).
  • For the fuzzy C-means clustering method:

    • the weighted distance from the cluster centers is used as the clustering error;
    • the function minimized is the sum of weighted squared errors;
    • Dunn's and Kaufman's fuzzy partition coefficients are available as metrics;
    • the function calc_U can be used to classify new data (see the second sketch after this list).
  • Usage: python test.py example.
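
The repository's assign_data routine is not reproduced here; the following is a minimal sketch, assuming NumPy arrays, of the K-means idea described above: assign each sample to its nearest cluster center and score a set of centers by the sum of squared errors. Function and variable names are illustrative, not the repo's actual implementation.

```python
import numpy as np

def assign_to_nearest(X, centers):
    """Return, for each sample in X, the index of its closest cluster center."""
    # dist[i, j] = Euclidean distance between sample i and center j
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.argmin(dist, axis=1)

def sum_squared_errors(X, centers):
    """Sum of squared distances between each sample and its closest center."""
    idx = assign_to_nearest(X, centers)
    return np.sum((X - centers[idx]) ** 2)
```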
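
Likewise, a minimal sketch of the fuzzy membership matrix that calc_U represents (the actual signature in the repository may differ): each sample receives a degree of membership in every cluster, computed with the standard fuzzy C-means update for a fuzziness coefficient m.

```python
import numpy as np

def fuzzy_memberships(X, centers, m=2.0, eps=1e-12):
    """Standard FCM memberships: U[i, j] proportional to d(i, j)^(-2/(m-1)), rows sum to 1."""
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    dist = np.maximum(dist, eps)            # avoid division by zero
    inv = dist ** (-2.0 / (m - 1.0))        # un-normalized memberships
    return inv / inv.sum(axis=1, keepdims=True)
```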

Main Parameters

example Name of the example to run (g2, dim2, unbalance, s3).

nPop, epochs Number of agents (population size) and number of iterations in the PSO.

K, K_list Number of clusters (a single value or a list of values to try).

n_rep Number of repetitions (re-starts) in the naive algorithm.

max_iter Max. number of iterations in the naive algorithm.

func Name of the interface function for the PSO.

m Fuzziness coefficient in the fuzzy C-means method.

tol Convergence tolerance in the fuzzy C-means method.

The other PSO parameters are used with their default values (see pso.py).
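
The exact interface expected by pso.py is not shown here; purely as an illustration, a K-means interface function could reshape the flat position vector handled by the optimizer into K cluster centers and return the quantity to be minimized (names below are hypothetical).

```python
import numpy as np

def kmeans_interface(position, X, K):
    """Map a flat PSO position vector to K cluster centers and return the SSE to minimize."""
    centers = np.asarray(position).reshape(K, X.shape[1])
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.sum(np.min(dist, axis=1) ** 2)
```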

Examples

Example 1: g2

K-means using PSO, 2 clusters, 8 features, 2048 samples.

```python
# Cluster centers:
# [[600, 600, 600, 600, 600, 600, 600, 600],
#  [500, 500, 500, 500, 500, 500, 500, 500]]

# Found solution:
# [[599.06 598.27 599.21 600.61 600.05 598.84 600.48 599.4 ]
#  [499.76 499.45 499.9  500.92 497.64 498.66 499.48 499.39]]

# Max. error [%]: 0.473
```
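
The exact definition of the reported error is not given in the output above; one plausible reading, shown as an assumption below, is the largest relative difference between the found and the true center coordinates.

```python
import numpy as np

# Hypothetical reconstruction of "Max. error [%]" from the printed centers.
true_centers = np.array([[600.0] * 8, [500.0] * 8])
found_centers = np.array([
    [599.06, 598.27, 599.21, 600.61, 600.05, 598.84, 600.48, 599.40],
    [499.76, 499.45, 499.90, 500.92, 497.64, 498.66, 499.48, 499.39]])

max_err = 100.0 * np.max(np.abs(found_centers - true_centers) / true_centers)
print(f"Max. error [%]: {max_err:.3f}")    # about 0.47 with these rounded values
```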

Example 2: dim2

K-means using naive algorithm, 2 to 15 clusters, 2 features, 1351 samples, silhouette coefficient and Davies–Bouldin index as metrics.

(figure: example_2)

Example 3: unbalance

Fuzzy C-means using PSO, 8 clusters (unbalanced), 2 features, 6500 samples.

(figure: example_3)

Example 4: s3

Fuzzy C-means using naive algorithm, 2 to 20 clusters, 2 features, 5000 samples, Dunn's and Kaufman's fuzzy partition coefficients as metrics.

(figure: example_4)

