
db split1 issue #89

Open
kcl58759 opened this issue Mar 4, 2022 · 6 comments

Comments

kcl58759 commented Mar 4, 2022

Describe the bug
Hi, I am trying to run NextPolish on PacBio HiFi sequences, and I keep running into an unspecified error at the db_split step.
My workflow is:

ls /scratch/kcl58759/Eco_pacbio_kendall/pb_css_474/cromwell-executions/pb_ccs/c7a3dc30-7f94-40de-ac16-2445f965bfad/call-export_fasta/execution/m64060_210804_174320.hifi_reads.fasta.gz > lgs.fofn

Make config file:

# create config file run.cfg

job_type = slurm
job_prefix = nextPolish
task = best
rewrite = yes
rerun = 3
parallel_jobs = 6
multithread_jobs = 5
genome = /scratch/kcl58759/Eco_pacbio_kendall/474.Primary.Hifi.asm/474.Primary.HiFi.asm.p_ctg.fa
genome_size = auto
workdir = ./01_rundir
polish_options = -p {multithread_jobs}

[lgs_option]
lgs_fofn = ./lgs.fofn
lgs_options = -min_read_len 1k -max_depth 100
lgs_minimap2_options = -x map-ont

Then I submit with:
#!/bin/bash
#SBATCH --job-name=NextPolish
#SBATCH --partition=batch
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=6
#SBATCH --mem=90gb
#SBATCH --time=99:00:00
#SBATCH --output=nextpolish.out
#SBATCH --error=nextpolish.err
#SBATCH --mail-user=kcl58759@uga.edu
#SBATCH --mail-type=END,FAIL

nextPolish run.cfg

Error message
[60417 INFO] 2022-03-03 13:02:46 NextPolish start...
[60417 INFO] 2022-03-03 13:02:46 version:v1.4.0 logfile:pid60417.log.info
[60417 WARNING] 2022-03-03 13:02:46 Re-write workdir
[60417 WARNING] 2022-03-03 13:03:05 Delete task: 1 due to missing sgs_fofn.
[60417 WARNING] 2022-03-03 13:03:05 Delete task: 1 due to missing sgs_fofn.
[60417 WARNING] 2022-03-03 13:03:05 Delete task: 2 due to missing sgs_fofn.
[60417 WARNING] 2022-03-03 13:03:05 Delete task: 2 due to missing sgs_fofn.
[60417 WARNING] 2022-03-03 13:03:05 Delete task: 6 due to missing hifi_fofn.
[60417 WARNING] 2022-03-03 13:03:05 Delete task: 6 due to missing hifi_fofn.
[60417 INFO] 2022-03-03 13:03:05 scheduled tasks:
[5, 5]
[60417 INFO] 2022-03-03 13:03:05 options:
[60417 INFO] 2022-03-03 13:03:05
rerun: 3
rewrite: 1
kill: None
cleantmp: 0
task: [5, 5]
use_drmaa: 0
submit: None
job_type: sge
sgs_unpaired: 0
sgs_rm_nread: 1
parallel_jobs: 6
align_threads: 5
check_alive: None
job_id_regex: None
sgs_max_depth: 100
lgs_max_depth: 100
lgs_read_type: clr
multithread_jobs: 5
lgs_max_read_len: 0
hifi_max_depth: 100
polish_options: -p 5
lgs_min_read_len: 1k
hifi_max_read_len: 0
genome_size: 36224976
hifi_block_size: 500M
hifi_min_read_len: 1k
job_prefix: nextPolish
sgs_block_size: 500000000
lgs_block_size: 500000000
sgs_use_duplicate_reads: 0
sgs_align_options: bwa mem
hifi_minimap2_options: -x map-pb
lgs_minimap2_options: -x map-pb -t 5
lgs_fofn: /scratch/kcl58759/Eco_pacbio_kendall/./lgs.fofn
workdir: /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir
snp_phase: /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/%02d.snp_phase
snp_valid: /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/%02d.snp_valid
lgs_polish: /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/%02d.lgs_polish
kmer_count: /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/%02d.kmer_count
hifi_polish: /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/%02d.hifi_polish
score_chain: /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/%02d.score_chain
genome: /scratch/kcl58759/Eco_pacbio_kendall/474.Primary.Hifi.asm/474.Primary.HiFi.asm.p_ctg.fa
[60417 INFO] 2022-03-03 13:03:05 step 0 and task 5 start:
[60417 INFO] 2022-03-03 13:03:19 Total jobs: 2
[60417 CRITICAL] 2022-03-03 13:03:19 Command 'qsub -pe smp 5 -l vf=2.5G -o /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.o -e /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.e /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh' returned non-zero exit status 1, error info: .
Traceback (most recent call last):
File "/apps/eb/NextPolish/1.4.0-GCCcore-8.3.0-Python-3.8.2/nextPolish", line 515, in
main(args)
File "/apps/eb/NextPolish/1.4.0-GCCcore-8.3.0-Python-3.8.2/nextPolish", line 369, in main
task.run.start()
File "/apps/eb/NextPolish/1.4.0-GCCcore-8.3.0-Python-3.8.2/lib/python3.8/site-packages/paralleltask/task_control.py", line 344, in start
self._start()
File "/apps/eb/NextPolish/1.4.0-GCCcore-8.3.0-Python-3.8.2/lib/python3.8/site-packages/paralleltask/task_control.py", line 398, in _start
self.submit(job)
File "/apps/eb/NextPolish/1.4.0-GCCcore-8.3.0-Python-3.8.2/lib/python3.8/site-packages/paralleltask/task_control.py", line 252, in submit
_, stdout, _ = self.run(job.cmd)
File "/apps/eb/NextPolish/1.4.0-GCCcore-8.3.0-Python-3.8.2/lib/python3.8/site-packages/paralleltask/task_control.py", line 288, in run
log.critical("Command '%s' returned non-zero exit status %d, error info: %s." % (cmd, p.returncode, stderr))
File "/apps/eb/Python/3.8.2-GCCcore-8.3.0/lib/python3.8/logging/init.py", line 1481, in critical
self._log(CRITICAL, msg, args, **kwargs)
File "/apps/eb/Python/3.8.2-GCCcore-8.3.0/lib/python3.8/logging/init.py", line 1577, in _log
self.handle(record)
File "/apps/eb/Python/3.8.2-GCCcore-8.3.0/lib/python3.8/logging/init.py", line 1587, in handle
self.callHandlers(record)
File "/apps/eb/Python/3.8.2-GCCcore-8.3.0/lib/python3.8/logging/init.py", line 1649, in callHandlers
hdlr.handle(record)
File "/apps/eb/Python/3.8.2-GCCcore-8.3.0/lib/python3.8/logging/init.py", line 950, in handle
self.emit(record)
File "/apps/eb/NextPolish/1.4.0-GCCcore-8.3.0-Python-3.8.2/lib/python3.8/site-packages/paralleltask/kit.py", line 42, in emit
raise Exception(record.msg)
Exception: Command 'qsub -pe smp 5 -l vf=2.5G -o /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.o -e /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.e /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh' returned non-zero exit status 1, error info: .
Operating system
I am running on UGA's cluster system, Sapelo2, using Slurm.

GCC
What version of GCC are you using?
You can use the command gcc -v to get it.

Python
What version of Python are you using?
You can use the command python --version to get it.

NextPolish
NextPolish/1.4.0-GCCcore-8.3.0-Python-3.8.2

moold (Member) commented Mar 5, 2022

Could you try to submit the task manually and see what happens?

qsub -pe smp 5 -l vf=2.5G -o /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.o -e /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.e /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh

kcl58759 (Author) commented Mar 7, 2022

I am trying to rework it for Slurm like this:
sbatch --mem=90 -pe smp 5 -l vf=2.5G -o /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.o -e /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.e /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh

However, it keeps saying it can't open smp. Is this an option?

moold (Member) commented Mar 8, 2022

You say you are using Slurm, but the log shows job_type: sge (SGE), so please check why that is.
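
For example (just a sketch, using the file names already shown in this thread), you could confirm which scheduler the run actually picked up:

# check the scheduler setting in the config and in the run log
# (pid60417.log.info is the logfile name printed at startup; whether the options
# dump above is also written there is an assumption, so adjust to your run)
grep -n "job_type" run.cfg
grep -n "job_type" pid60417.log.info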

kcl58759 (Author) commented Mar 9, 2022

Hmm, I am confused about that.

This is what happened when I tried the qsub command:

The command was:
'/opt/apps/slurm/21.08.5/bin/sbatch -e /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.e -o /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.o /scratch/kcl58759/Eco_pacbio_kendall/Nextpolish_dir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh 2>&1'
and the output was:
'sbatch: error: You must request some amount of memory.
sbatch: error: Batch job submission failed: Job size specification needs to be provided
'

kcl58759 (Author) commented Mar 9, 2022

I believe I got this to work with:
sbatch --partition=batch --ntasks=1 --cpus-per-task=5 --mem-per-cpu=537 --time=99:00:00 -o /scratch/kcl58759/Eco_pacbio_kendall/01_rundir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.o -e /scratch/kcl58759/Eco_pacbio_kendall/01_rundir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh.e /scratch/kcl58759/Eco_pacbio_kendall/01_rundir/00.lgs_polish/01.db_split.sh.work/db_split1/nextPolish.sh

However, I am unsure if there are further steps, since I can't seem to find the output files.

moold (Member) commented Mar 10, 2022

Hi, see here and ParallelTask to change the submit command template.
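
For reference, here is a minimal sketch of what a custom Slurm submit template could look like in run.cfg. It is based on the option names printed in your log above (submit, kill, check_alive, job_id_regex) and on the sbatch flags that worked for you; the {cpu}, {mem}, {out}, {err}, {script} and {job_id} placeholders are assumptions about the ParallelTask template format, so please verify them against the cluster.cfg shipped with your ParallelTask install:

job_type = slurm
# the placeholder names below are assumptions -- check ParallelTask's cluster.cfg for the exact names
submit = sbatch --partition=batch --ntasks=1 --cpus-per-task={cpu} --mem-per-cpu={mem} --time=99:00:00 -o {out} -e {err} {script}
kill = scancel {job_id}
check_alive = squeue -j {job_id}
job_id_regex = Submitted batch job (\d+)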
