
An R package used for chromosome fasta file splitting


msqchr

Some FASTA files contain every chromosome of a genome, and users sometimes need to split these chromosomes into separate files according to their number label. msqchr handles this, so that the FASTA file of a chosen chromosome can be used for downstream analysis.

Installation

# install.packages("devtools")  # run this first if devtools is not installed
devtools::install_github("MSQ-123/msqchr")

Usage

Replace verbose chromosome identifiers with a simple format, so that the extracted IDs are easy to manipulate:

library(msqchr)
data("id")
simpleID <- replaceText(type = "text", input = id)
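To illustrate the kind of simplification this step aims at, here is a minimal base-R sketch. It is not the implementation of `replaceText`: the header strings, the `>chr<label>` target format, and the regular expression are all hypothetical and assume headers that contain the word "chromosome" followed by the label.

```r
# Hypothetical verbose FASTA headers (not taken from the package data).
headers <- c(
  ">NC_000001.11 Homo sapiens chromosome 1, GRCh38.p13 Primary Assembly",
  ">NC_000010.11 Homo sapiens chromosome 10, GRCh38.p13 Primary Assembly",
  ">NC_000023.11 Homo sapiens chromosome X, GRCh38.p13 Primary Assembly"
)

# Capture the chromosome label (digits, X, or Y) and rewrite the
# whole header as ">chr<label>". The target format is an assumption.
simple <- sub("^>.*chromosome ([0-9]+|[XY]).*$", ">chr\\1", headers)
print(simple)
```

Short, uniform identifiers like these are easier to match, sort, and use in file names than the full header text.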

Extract chromosome IDs from a FASTA file:

data("text")
text <- replaceText(type = "text", input = text)
id <- subFasID(text = text)
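Conceptually, the IDs of a FASTA file are its header lines, which start with `>`. The base-R sketch below shows that idea on hypothetical data. It is an illustration only and does not claim to reproduce what `subFasID` does internally.

```r
# Hypothetical FASTA content, one element per line.
fasta <- c(">chr1", "ACGT", "GGCA", ">chr2", "TTAA")

# Header lines start with ">"; every other line is sequence.
ids <- grep("^>", fasta, value = TRUE)
print(ids)
```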

Transform the large character object into a special list:

# Write the text to a temporary file and open a read connection to it.
fil <- tempfile(fileext = ".data")
write(text, file = fil)
con0 <- file(fil, "r")
tex <- readToList(id, text = text, con = con0)
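One way to picture a list-based layout is one list element per chromosome, holding its header and sequence lines. The sketch below builds such a list in base R from hypothetical data. The structure actually returned by `readToList` may differ, so treat the element names and contents here as assumptions.

```r
# Hypothetical FASTA content, one element per line.
fasta <- c(">chr1", "ACGT", "GGCA", ">chr2", "TTAA")

# Flag header lines, then use cumsum() to give every line the
# running number of the record it belongs to.
is_header <- startsWith(fasta, ">")
record <- cumsum(is_header)

# Split the lines by record and name each element after its header.
chr_list <- split(fasta, record)
names(chr_list) <- sub("^>", "", fasta[is_header])
str(chr_list)
```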

Sort the chromosome list by chromosome number. Note: "single" chromosomes (single-character labels such as 1 to 9, X, and Y) and "double" chromosomes (two-character labels such as 10 to 22) must be sorted separately. Note: the example data is already sorted, so this step is shown for illustration only.

tex2 <- sortList(id = id, tex = tex, chrsig = "single")
tex3 <- sortList(id = id, tex = tex, chrsig = "double")
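A likely reason why label length matters is that chromosome labels are text, and text sorting places "10" before "2". The base-R sketch below demonstrates the problem and one numeric-aware ordering. The labels are hypothetical, and this is an assumption about the motivation, not a description of how `sortList` works.

```r
# Hypothetical chromosome labels in arbitrary order.
labels <- c("2", "10", "1", "X", "22", "9")

# Text sorting compares character by character, so "10" precedes "2".
# method = "radix" makes the result independent of the locale.
text_order <- sort(labels, method = "radix")
print(text_order)

# Numeric-aware ordering: sort numeric labels as integers,
# then append the non-numeric labels (X, Y).
is_num <- grepl("^[0-9]+$", labels)
num_sorted <- labels[is_num][order(as.integer(labels[is_num]))]
ordered <- c(num_sorted, sort(labels[!is_num], method = "radix"))
print(ordered)
```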

Now we can split the chromosome FASTA file into separate files according to chromosome number:

outdir <- tempdir()
# Chromosomes X and Y count as "single", so when sex = TRUE
# tex must be the "single" chromosome list.
splitChr(tex = tex2, chr = seq(1, 9), sex = TRUE, outdir = outdir)
# For the "double" chromosome list, sex should be FALSE.
splitChr(tex = tex3, chr = seq(10, 22), sex = FALSE, outdir = outdir)
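For readers who want to see what the final step amounts to, the base-R sketch below writes each element of a per-chromosome list to its own file and then lists the results. The list contents, the `split_demo` directory, and the `<name>.fasta` naming scheme are hypothetical. The file names produced by `splitChr` may be different.

```r
# Hypothetical per-chromosome list: header line plus sequence lines.
chr_list <- list(
  chr1 = c(">chr1", "ACGT"),
  chr2 = c(">chr2", "TTAA")
)

# Write into a dedicated subdirectory of the session temp directory.
out <- file.path(tempdir(), "split_demo")
dir.create(out, showWarnings = FALSE)

# One file per chromosome, named after the list element.
for (nm in names(chr_list)) {
  writeLines(chr_list[[nm]], file.path(out, paste0(nm, ".fasta")))
}

written <- list.files(out)
print(written)
```

After running the package example, calling `list.files(outdir)` in the same way is a quick check that the expected files were created.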
