This repository has been archived by the owner on Jul 16, 2021. It is now read-only.

DBSCAN clusters of size < min_points being returned #196

Open
NatPRoach opened this issue Jul 23, 2018 · 0 comments
NatPRoach commented Jul 23, 2018

Hello, I've been using your implementation of DBSCAN and noticed that it's been outputting clusters smaller than the minimum size I specified at initialization. The code I've been using to inspect the clusters is below:

    // The second argument (5) is the min_points value in question.
    let mut db = DBSCAN::new(1.5, 5);
    db.train(&similarity_matrix).unwrap();
    let cluster_assignments = db.clusters().unwrap();

    // clusters[c] collects the indices of all points assigned to cluster c.
    let mut clusters: Vec<Vec<usize>> = Vec::new();
    for (i, assignment) in cluster_assignments.iter().enumerate() {
        if let Some(val) = *assignment {
            println!("read {} {}: cluster {}", i, read_ids[i], val);
            // Assumes cluster ids first appear in increasing order (0, 1, 2, ...).
            if clusters.len() == val {
                clusters.push(vec![i]);
            } else {
                clusters[val].push(i);
            }
        } else {
            // Unassigned (noise) points are reported as cluster -1.
            println!("read {} {}: cluster {}", i, read_ids[i], -1);
        }
    }

    // Print each cluster's size, followed by its members.
    for (i, cluster) in clusters.iter().enumerate() {
        println!("Cluster {}, size {}:", i, cluster.len());
        for index in cluster.iter() {
            println!(">{}", read_ids[*index]);
            let bytes = seqs[*index].clone();
            println!("{}", String::from_utf8(bytes).unwrap());
        }
    }

Using this code I've been getting clusters with fewer than 5 elements, in some cases as few as 1 or 2.
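
To confirm the undersized clusters independently of the printing code above, the assignments could be tallied directly. The following is only a minimal sketch: it assumes the assignments have been copied into a plain slice of `Option<usize>` (with `None` meaning noise), and the helper names `cluster_sizes` and `undersized_clusters` are hypothetical, not part of the library.

```rust
use std::collections::BTreeMap;

/// Counts the members of each cluster id; `None` entries (noise) are skipped.
/// Hypothetical helper, not part of the DBSCAN implementation.
fn cluster_sizes(assignments: &[Option<usize>]) -> BTreeMap<usize, usize> {
    let mut sizes = BTreeMap::new();
    // Flattening `&Option<usize>` items yields only the `Some` cluster ids.
    for id in assignments.iter().flatten() {
        *sizes.entry(*id).or_insert(0) += 1;
    }
    sizes
}

/// Returns `(cluster id, size)` for every cluster with fewer than
/// `min_points` members, ordered by cluster id.
fn undersized_clusters(
    assignments: &[Option<usize>],
    min_points: usize,
) -> Vec<(usize, usize)> {
    cluster_sizes(assignments)
        .into_iter()
        .filter(|&(_, size)| size < min_points)
        .collect()
}
```

Calling `undersized_clusters(&assignments, 5)` on the assignments from the run above should list exactly the clusters described in this report, which would rule out the bookkeeping in the printing loop as the source of the small clusters.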
