
Add peak clustering option prior to using ImageD11.indexing.indexer(). #148

Open
AxelHenningsson opened this issue Mar 9, 2022 · 4 comments

Comments

@AxelHenningsson
Contributor

What do you think about adding a clustering feature to ImageD11.indexing.indexer()? Typically, when doing scanning types of measurements, one needs to somehow merge peaks across scans so that the indexer does not freak out.

I recently used sklearn.cluster.AgglomerativeClustering to do this with success. If we are willing to depend on sklearn I think this could be an option.

Thoughts?

Cheers
Axel

@jonwright
Member

Sounds interesting. Are you clustering the data or the resulting ubi matrices?

  • ImageD11/grid_index_parallel.py includes code to try to merge ubis
  • s3DXRD/merge_duplicates had some code for merging gvectors
  • sandbox/newpeaksearch3d.py was calling scipy.sparse.csgraph.connected_components to merge peaks (a minimal sketch of that idea is below)

There are a few IPython notebooks which make a pole figure for each hkl ring (from spot positions) and then run a peak search on the resulting pole figure.

We are missing a nice system for clustering peaks into crystallographic phases in XRDCT data. There should be a lot of information in seeing which peaks show up together, or come from the same place in the sample. Probably sklearn can help here. I guess that kind of approach could group peaks into grains (e.g. because they have consistent sinograms) without knowing the unit cell?
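
This is not the code in sandbox/newpeaksearch3d.py, just a minimal sketch of the connected-components idea; the column names (sc, fc, omega) and the distance tolerance are assumptions:

import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components

def label_peaks(sc, fc, omega, tol=1.0):
    # Group peaks that lie within tol of each other in (sc, fc, omega) space.
    # cKDTree.query_pairs avoids building the dense n^2 distance matrix.
    xyz = np.column_stack((sc, fc, omega))
    pairs = cKDTree(xyz).query_pairs(r=tol, output_type='ndarray')
    n = len(xyz)
    adj = coo_matrix((np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])), shape=(n, n))
    # Each connected component of the adjacency graph is one merged peak.
    ncomp, labels = connected_components(adj, directed=False)
    return ncomp, labels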

@AxelHenningsson
Contributor Author

I have just been clustering the data in dty using something like this:

import numpy as np
from sklearn.cluster import AgglomerativeClustering

from ImageD11.columnfile import colfile_from_dict


def merge_dtys(colf, distance_threshold):

    # Single-linkage clustering on detector position and rotation angle.
    # (Newer scikit-learn versions renamed affinity= to metric=.)
    agg_cluster = AgglomerativeClustering(n_clusters=None,
                                          distance_threshold=distance_threshold,
                                          affinity='euclidean',
                                          linkage='single')
    clusters = agg_cluster.fit_predict(np.array([colf.sc, colf.fc, colf.omega]).T)

    nbrpks = np.max(clusters) + 1  # cluster labels run from 0 to nbrpks-1
    merged_colf = colfile_from_dict({
        'Number_of_pixels': np.zeros((nbrpks,)),
        'sc': np.zeros((nbrpks,)),
        'fc': np.zeros((nbrpks,)),
        'omega': np.zeros((nbrpks,)),
        'sum_intensity': np.zeros((nbrpks,)),
    })

    # Intensity-weighted average position for each merged peak.
    for c in range(nbrpks):
        mask = clusters == c
        Itot = np.sum(colf.sum_intensity[mask])
        merged_colf.Number_of_pixels[c] = np.sum(colf.Number_of_pixels[mask])
        merged_colf.sum_intensity[c] = Itot
        merged_colf.sc[c] = np.sum(colf.sc[mask] * colf.sum_intensity[mask]) / Itot
        merged_colf.fc[c] = np.sum(colf.fc[mask] * colf.sum_intensity[mask]) / Itot
        merged_colf.omega[c] = np.sum(colf.omega[mask] * colf.sum_intensity[mask]) / Itot

    return merged_colf
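
For illustration, a call before indexing might look like the line below; the threshold value is just an example and depends on the units of sc, fc and omega:

merged_colf = merge_dtys(colf, distance_threshold=2.0)  # example threshold, tune per dataset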

What do you think the benefits would be of merging the .ubi matrices instead?

@jonwright
Member

Hi Axel,
I think I missed the idea in this message. There should now be some code in ImageD11.sinograms.properties that merges peaks which overlap. It could be interesting to try this method for comparison on some large datasets. I was always a bit nervous about the scaling of clustering, as you need to avoid the n^2 distance matrix. With ubi's there should be a lot fewer items to cluster.
Thanks!
Jon
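
For what it is worth, a minimal sketch of what clustering ubi matrices could look like; this is not existing ImageD11 code, the Frobenius distance and the threshold are assumptions, and crystal symmetry (equivalent settings of the same lattice) is ignored:

import numpy as np
from sklearn.cluster import AgglomerativeClustering

def cluster_ubis(ubis, distance_threshold=0.05):
    # ubis: list of 3x3 arrays. Euclidean distance on the flattened matrices
    # is the Frobenius distance; symmetry-equivalent settings are NOT reduced here.
    flat = np.array([np.asarray(u).ravel() for u in ubis])
    agg = AgglomerativeClustering(n_clusters=None,
                                  distance_threshold=distance_threshold,
                                  linkage='single')
    return agg.fit_predict(flat)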

@jonwright jonwright changed the title Add clustering option for ImageD11.indexing.indexer(). Add peak clustering option prior to using ImageD11.indexing.indexer(). Feb 26, 2023
@AxelHenningsson
Contributor Author

That looks great Jon! I will try it out the next time I need to patch up an s3dxrd dataset, for sure.

Cheers
Axel
