Sambamba slice - 94 Segmentation fault (core dumped) #515

Open
alexvasilikop opened this issue Feb 5, 2024 · 2 comments

@alexvasilikop

alexvasilikop commented Feb 5, 2024

Hello, I am running sambamba v1.0.1 from within a Docker image on Google Cloud, driven by Nextflow. The goal is to slice a BAM file (human genome) by chromosome, so I pass a BED file with the coordinates of a single chromosome at a time (as a channel) to the following Nextflow process:

```
process sambamba_slice_bam {
    container 'gcr.io/diagnostics-uz/sambamba_v1.0.1@sha256:f6947d458d2a225580976b1ce8e238a07098073307700fd41bb0cda910956b28'
    label 'lotsOfWork'
    machineType 'e2-highmem-16'
    memory '16 GB'
    maxForks 8
    disk { 20.GB + ( 3.B * bam.size() ) }

    input:
    tuple val(sample_id), path(bam), path(bai)
    path chromosome_bed
    val num_threads

    output:
    tuple val(sample_id), path("results/*.bam"), path("results/*.bai"), emit: indexed_sliced_bam

    shell:
    '''
    mkdir -p result
    #get list of chromosomes to slice
    CHROMOSOMES_TO_SLICE=$(cat !{chromosome_bed} | while read chr start end; do echo "$chr";done | sort | uniq | xargs)

    #perform slicing
    SAMBAMBA_EXEC=/work/apps/sambamba/sambamba

    for chrom in ${CHROMOSOMES_TO_SLICE}; do
      echo -e "Working on chromosome ${chrom} ...  \\n"
      single_chrom_bed="!{sample_id}.${chrom}.sliced.bed"
      echo -e "Constructing ${single_chrom_bed} to slice bam for ${chrom}... \\n"
      OUTBAM=$(basename $single_chrom_bed .bed).bam
      grep -P "^${chrom}\\s" "!{chromosome_bed}" > "${single_chrom_bed}"

      #perform slicing
      $SAMBAMBA_EXEC slice -o "results/${OUTBAM}" -L "${single_chrom_bed}" "!{bam}"
      #index sliced BAM
      $SAMBAMBA_EXEC index --nthreads="!{num_threads}" "results/${OUTBAM}"
    done
    echo -e "ALL DONE\\n"
    '''
}
```

I am getting the following error:
```
sambamba 1.0.1
by Artem Tarasov and Pjotr Prins (C) 2012-2023
LDC 1.32.0 / DMD v2.102.2 / LLVM14.0.6 / bootstrap LDC - the LLVM D compiler (1.32.0)
/mnt/disks/gcap-nf-scratch/f1/c1747bb64e922dbfeabe384eee928d/.command.sh: line 9: 94 Segmentation fault (core dumped) ${SAMBAMBA_EXEC} slice -o "results/${OUTBAM}" -L "${single_chrom_bed}" "277469.recalibrated.sorted.bam"
```

Any idea what the problem is?
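
A note for reading the log: in bash's message `line 9: 94 Segmentation fault`, the `94` is the process ID of the command that was killed, not an error code. A command killed by SIGSEGV normally exits with status 139 (128 + 11). The following is a minimal sketch of a wrapper that makes this explicit in a job log; the helper name `run_and_report` is hypothetical and not part of sambamba or Nextflow.

```shell
# Hypothetical helper: run a command and report whether a signal killed it.
run_and_report() {
    status=0
    # "|| status=$?" keeps the helper usable under "set -e"
    "$@" || status=$?
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        # kill -l <number> prints the signal name (SEGV for 11)
        name=$(kill -l "$sig" 2>/dev/null) || name=unknown
        echo "command died from signal ${sig} (${name})"
    else
        echo "command exited with status ${status}"
    fi
    return "$status"
}
```

It could wrap the failing call, for example `run_and_report $SAMBAMBA_EXEC slice -o "results/${OUTBAM}" -L "${single_chrom_bed}" "$bam"`, so that the signal is recorded alongside the sambamba banner.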

@AgedMordorBlue

I've had a similar issue on my institution's cluster. There, the cause was that the D runtime underlying Sambamba cannot handle some modern hardware. As far as I understand it, D uses a ubyte when estimating the CPU cache size, which breaks for either certain CPU counts or certain CPU types and causes a division by zero further down the line.

What solved it for us was to request older CPUs for the job.
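
To compare a machine where sambamba crashes with one where it works, it can help to log the cache hierarchy each job actually sees. This is a minimal sketch, assuming a Linux host that exposes the standard sysfs cache directory; the function name `report_cpu_cache` and its optional directory argument are made up for illustration, and whether cache detection is really the trigger here remains the hypothesis described above.

```shell
# Hypothetical helper: print level, type and size of each cache of cpu0.
# The optional argument overrides the sysfs directory (useful for testing).
report_cpu_cache() {
    root="${1:-/sys/devices/system/cpu/cpu0/cache}"
    for idx in "$root"/index*; do
        # Skip silently when the glob matched nothing
        [ -d "$idx" ] || continue
        level=$(cat "$idx/level")
        kind=$(cat "$idx/type")
        size=$(cat "$idx/size")
        echo "L${level} ${kind}: ${size}"
    done
}
```

Calling `report_cpu_cache` together with `nproc` at the top of the job script gives a record that can be compared between failing and working nodes.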

@alexvasilikop

I ended up using `sambamba view` instead.
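
For readers who want to try the same workaround, here is a minimal sketch of per-chromosome extraction with `sambamba view` in place of `sambamba slice`. It assumes the BAM is coordinate-sorted and indexed, and that the region can be given as a plain chromosome name. The function name `extract_chrom` and the `SAMBAMBA_EXEC` fallback are illustrative; the thread does not show the exact command that was used.

```shell
# Hypothetical helper: write one chromosome of an indexed BAM to its own file.
extract_chrom() {
    bam="$1"
    chrom="$2"
    outdir="$3"
    sambamba="${SAMBAMBA_EXEC:-sambamba}"

    # Create the directory that is actually written to
    mkdir -p "$outdir"
    out="${outdir}/$(basename "$bam" .bam).${chrom}.bam"

    # -f bam selects BAM output, -o names the output file;
    # the trailing argument is the region to extract
    "$sambamba" view -f bam -o "$out" "$bam" "$chrom"
    "$sambamba" index "$out"
}
```

A call such as `extract_chrom sample.bam chr1 results` would then replace one iteration of the loop in the original process. The helper creates its output directory itself; in the script from the original post, `mkdir -p result` and the `results/` output path do not match, which may be worth ruling out as well.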
