Control size of the output of seqLogo?

I am using the seqLogo package to draw some sequence logos. I need to make the logos wider; the default drawing makes the logos into square graphs. Is there a way to do this?

How much storage would be required to store a human genome?

I'm looking for the amount of storage in bytes (MB, GB, TB, etc.) required to store a single human genome. I read a few articles on Wikipedia about DNA, chromosomes, base pairs, genes, and have...

How do I decide which way to backtrack in the Smith–Waterman algorithm?

I am trying to implement local sequence alignment in Python using the Smith–Waterman algorithm. Here's what I have so far. It gets as far as building the similarity matrix: import sys,...

How do I convert the three letter amino acid codes to one letter code with python or R?

I have a fasta file as shown below. I would like to convert the three letter codes to one letter code. How can I do this with python or R? >2ppo ARGHISLEULEULYS >3oot METHISARGARGMET desired ...
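One possible approach in Python, as a minimal sketch: split each sequence line into chunks of three letters and look each chunk up in a table of the 20 standard amino acid codes. The function names `three_to_one` and `convert_fasta` are made up for illustration, and the sketch assumes every sequence line contains only standard three-letter codes with no separators.

```python
# Standard three-letter -> one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def three_to_one(seq):
    """Convert a run of three-letter codes (e.g. 'ARGHIS') to one-letter codes."""
    seq = seq.strip().upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3: %r" % seq)
    # Walk the string in steps of three; an unknown code raises KeyError.
    return "".join(THREE_TO_ONE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def convert_fasta(lines):
    """Return the FASTA lines with header lines kept and sequence lines converted."""
    converted = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        converted.append(line if line.startswith(">") else three_to_one(line))
    return converted


print(convert_fasta([">2ppo", "ARGHISLEULEULYS", ">3oot", "METHISARGARGMET"]))
# prints ['>2ppo', 'RHLLK', '>3oot', 'MHRRM']
```

Non-standard residues would need extra entries in the table, or a fallback such as `X`.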

How to build the scoring matrix for global sequence alignment?

I have tried to get the global sequence alignment between two strings, but it gives me the wrong answer. My way of generating the scoring matrix is as below. public void makeScoringMatrix(String...

BLOSUM50 Matrix

I'm trying to import this table into a table in Python. How can I do this? I found a .MATRIX file that has the table, but I don't know how to use that either. I'm new to Python, so any help would be appreciated.

Traceback in DP Needleman-Wunsch/Smith-Waterman

In Needleman-Wunsch and Smith-Waterman, what is the best way to implement traceback? Do we usually keep two matrices, one with each entry's predecessor? That is, each entry would be UP, DIAG, or...
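The two-matrix idea described here can be sketched as follows for Needleman-Wunsch. This is a rough illustration, not a reference implementation: the scoring values (`match=1`, `mismatch=-1`, `gap=-1`) are arbitrary, and the third direction name `LEFT` is an assumption, since the excerpt is cut off after UP and DIAG. Ties are broken in favour of DIAG, then UP, then LEFT.

```python
DIAG, UP, LEFT = "DIAG", "UP", "LEFT"


def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment that stores each cell's predecessor in a second matrix."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    ptr = [[None] * (m + 1) for _ in range(n + 1)]

    # First column and first row can only be reached through gaps.
    for i in range(1, n + 1):
        score[i][0] = i * gap
        ptr[i][0] = UP
    for j in range(1, m + 1):
        score[0][j] = j * gap
        ptr[0][j] = LEFT

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            candidates = [
                (score[i - 1][j - 1] + sub, DIAG),
                (score[i - 1][j] + gap, UP),
                (score[i][j - 1] + gap, LEFT),
            ]
            # max() returns the first best candidate, so ties prefer DIAG.
            score[i][j], ptr[i][j] = max(candidates, key=lambda c: c[0])

    # Traceback: follow the stored pointers from the bottom-right corner.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        move = ptr[i][j]
        if move == DIAG:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif move == UP:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:  # LEFT
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


print(needleman_wunsch("GATTACA", "GATTACA"))
# prints (7, 'GATTACA', 'GATTACA')
```

The pointer matrix costs extra memory, but it makes the traceback a simple pointer walk. The alternative is to recompute at each cell which of the three neighbours produced its score. For Smith-Waterman, the same structure would need a zero floor on every cell and a traceback that starts at the highest-scoring cell and stops at a zero.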

Reverse complement of DNA strand using Python

I have a DNA sequence and would like to get the reverse complement of it using Python. It is in one of the columns of a CSV file and I'd like to write the reverse complement to another column in the...
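A minimal standard-library sketch of one way to do this: translate each base with `str.translate`, reverse the string, and append the result as a new CSV column. The column names `sequence` and `revcomp` and the helper `add_reverse_complement` are hypothetical, since the excerpt does not show the file layout. Only A, C, G and T (upper or lower case) are complemented; any other character passes through unchanged.

```python
import csv
import io

# Translation table mapping each base to its complement, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def add_reverse_complement(in_file, out_file,
                           source_col="sequence", target_col="revcomp"):
    """Copy a CSV, appending a column with the reverse complement of source_col.

    The column names are placeholders; adjust them to the real header.
    """
    reader = csv.DictReader(in_file)
    fieldnames = list(reader.fieldnames) + [target_col]
    writer = csv.DictWriter(out_file, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        row[target_col] = reverse_complement(row[source_col])
        writer.writerow(row)


# Usage with in-memory files; real code would open files with newline="".
source = io.StringIO("id,sequence\n1,ATGC\n")
result = io.StringIO()
add_reverse_complement(source, result)
print(result.getvalue().splitlines())
# prints ['id,sequence,revcomp', '1,ATGC,GCAT']
```

If Biopython is already installed, `Seq(...).reverse_complement()` does the same job and also handles ambiguity codes.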

How to call module written with argparse in iPython notebook

I am trying to pass BioPython sequences to Ilya Stepanov's implementation of Ukkonen's suffix tree algorithm in iPython's notebook environment. I am stumbling on the argparse component. I have...

Algorithm for Deep Hash Invert (should be in ruby)

I have a hash H (see bottom) and need to perform a deep invert operation on it, such that a new hash H2 is returned where each key K is a value inside the original hash. The keys in H2 map to an...

How to cache reads?

I am using python/pysam to analyze sequencing data. In its tutorial (pysam - An interface for reading and writing SAM files) for the command mate it says: 'This method is too slow for...

Good way to graph allele frequency of different SNPs along chromosomes

I have a set of SNPs from different parts of the genome and their allele frequencies in various populations and metapopulations of interest. I want to plot the allele frequencies along the SNPs'...

Backtrace through Needleman Wunsch table

I'm trying to do a backtrace through a completed DP table. Assume that the table is correctly filled in with the proper values. I can post a snippet of the table if needed. But here is...

FastQC fails to process fastq sequence file - java.lang.NullPointerException

I am trying to run fastqc on RNA seq (.fastq) and I get this issue that I haven't managed to fix yet: Approx 5% complete for SRR5280293.fastq Approx 10% complete for SRR5280293.fastq Approx 15%...

biopython no module named Bio

FYI: this is NOT a duplicate! Before running my Python code I installed biopython in the cmd prompt: pip install biopython I then get an error saying 'No module named Bio' when I try to import it...

R run T-test/anova for each row with 2 groups with 3 samples

My dataset looks something like this: df <- data.frame(compound = c("alanine ", "arginine", "asparagine", "aspartate")) df <- matrix(rnorm(12*4), ncol = 12) colnames(df) <- c("AC-1", "AC-2",...

BWA fails to locate index genome

I know this question has been asked before. I have searched through the threads and nothing has made a ton of sense to me. Admittedly, I am a newbie at bioinformatics so maybe the answer is clear...

Menu and submenu python

How can I return from a sub menu to a main menu? Also I want to keep the data generated in the submenu. Main menu: 1. Load data 2. Filter data 3. Display statistics 4. Generate plots 5. Quit On...

How to make a pandas DataFrame from a PISA interface list

I am trying to make a DataFrame in pandas from the interface results page at the PISA server. After clicking the LaunchPDBePisa button, I click on the Interfaces button to get a page with a table...

traceback in global sequence alignment

I am having trouble tracing back the global sequence alignment. My first sequence is ATTGCGCGCAT and second sequence is ATGCTTAACCA. The traceback result should be A T T G C _ _ _ G C G C A T A...

How to deal with gaps during translation with biopython

I need to translate aligned DNA sequences with biopython from Bio.Seq import Seq from Bio.Alphabet import generic_dna seq = Seq("tt-aaaatg") seq.translate() Running this script will get...

How can I write a script to summarise information from a multi-fasta file without using biopython?

I've an assignment for a bioinformatics class which is asking for a python script to do the following for a FASTA file with several protein sequences in it: -Open a .fasta file specified by user...

Enroll face using Zkteco

I cannot enroll a face via a C# application. I can enroll directly on the device and I can access the enrolled face using the function GetUserFaceStr. I could do fingerprint enrollment from C#...

How to merge FASTA files in R

I have four separate FASTA files that I'd like to merge into one large FASTA file. So far I've used the Biostrings package to read each file separately.

Unable to run make command for BWA on Apple M1 (On Mac OS Big Sur)

I am trying to install BWA using the makefile. The GitHub repo is as follows: https://github.com/lh3/bwa The makefile is as follows: CC= gcc #CC= clang --analyze CFLAGS= -g -Wall...

Microbiome analysis using dada2, decipher, phangorn

I'm working on a sequencing dataset produced by the MinION/nanopore method extracted from soil. My data is single-read and in FASTQ format, and I'm currently just working on the 29 sample set...

Enrichment Analysis with GSEAPY

I am trying to run an enrichment analysis with gseapy enrichr on a list of gene names that look like the following: 0 RAB4B 1 TIGAR 2 RNF44 3 DNAH3 4 RPL23A 5 ARL8B 6 ...

No module named 'pyrosetta.rosetta'

I'm trying to install pyrosetta (after I downloaded the tar files and requested a license). I'm running on Ubuntu 20.04 in a virtual environment in Anaconda Navigator. The installation step that...

Dictionary comprehension with multiple values for each key

I'm doing a course in bioinformatics. We were supposed to create a function that takes a list of strings like this: Motifs =[ "AACGTA", "CCCGTT", "CACCTT", "GGATTA", ...

Count the number of times a substring appears in a file and place it in a new column

Question: I have 2 files, file 1 is a TSV (BED) file that has 23 base-pair sequences in column 7, for example: 1 779692 779715 Sample_3 + 1 ATGGTGCTTTGTTATGGCAGCTC 1 783462 783485 Sample_4 - 1 ...