This script is for finding biallelic SNPs in a VCF that are highly correlated to a genomic feature of interest.

It needs two inputs. The first is a VCF of candidate biallelic SNPs in a set of individuals. Multiallelic SNPs will cause problems. The second is a CSV file with a column listing every individual in the VCF (no more and no fewer, otherwise there will be problems), and a column for the genotype of interest of each individual.

Before using this script in R, open your VCF in Excel and remove all the contig information so that the first row of the spreadsheet is "#CHROM, POS, ID, REF, ALT, etc.", then save it as a ...

Quickly set up your data with the following code.

... column_to_rownames("POS") #Turn the location of each SNP into its row name.
#... can identify which loci contain SNPs correlated to your genotype at the end.
... %>% column_to_rownames("sample") #Make sample (e.g., ANN0802, ANN0803, etc.) ...

Next: find the positions with the greatest correlation to the genotype of your samples by using the following code.

Designmat ...
... %>% eBayes()) #Fit a linear model to the SNP matrix against Matrix2
Top <- topTable(fit, coef=1, number = 15) #Make a table of the 15 most correlated SNPs to
#the design matrix using the sample names to tie them together

Increase or decrease the number of hits by changing "number = X" in Top <- topTable(fit, coef=1, number = 15).
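As a rough illustration of the workflow described above, here is a minimal, self-contained sketch in R. It assumes the limma and tibble packages are installed and uses simulated data in place of the edited VCF and the genotype CSV. The names gt, vcf_table, pheno, lookup, dosage and design are hypothetical and do not come from the original script, and coding the genotype calls as 0/1/2 allele counts is an assumption about how the calls are turned into numbers. The design matrix here has an intercept in its first column, so the genotype coefficient is selected by name rather than with coef = 1; how the original script built its design matrix is not shown, so treat this as one possible setup rather than the original code.

```r
# Sketch only: simulated data stands in for the edited VCF and the genotype CSV.
library(limma)   # lmFit(), eBayes(), topTable()
library(tibble)  # column_to_rownames()

set.seed(1)
samples <- sprintf("ANN%04d", 801:812)   # hypothetical sample names
trait   <- rep(c(0, 1), each = 6)        # genotype of interest, coded 0/1 (assumption)

# Simulated genotype calls: one row per SNP, one column per sample.
calls <- c("0/0", "0/1", "1/1")
gt <- matrix(sample(calls, 30 * length(samples), replace = TRUE),
             nrow = 30, dimnames = list(NULL, samples))
gt[5, ] <- ifelse(trait == 1, "1/1", "0/0")  # plant one SNP that tracks the trait
gt[5, 1] <- "0/1"                            # with a single mismatch

vcf_table <- data.frame(POS = seq(1000, by = 250, length.out = 30), gt,
                        check.names = FALSE, stringsAsFactors = FALSE)
pheno <- data.frame(sample = samples, genotype = trait)

# Turn the SNP position and the sample name into row names.
snps  <- column_to_rownames(vcf_table, var = "POS")
pheno <- column_to_rownames(pheno, var = "sample")

# Convert genotype calls to allele counts (assumed coding: 0, 1 or 2 ALT alleles).
lookup  <- c("0/0" = 0, "0/1" = 1, "1/1" = 2)
snp_chr <- as.matrix(snps)
dosage  <- matrix(lookup[as.vector(snp_chr)], nrow = nrow(snp_chr),
                  dimnames = dimnames(snp_chr))

# The two tables must describe exactly the same individuals, in the same order.
stopifnot(setequal(colnames(dosage), rownames(pheno)))
pheno <- pheno[colnames(dosage), , drop = FALSE]

# Fit one linear model per SNP and rank SNPs by their association with the trait.
design <- model.matrix(~ genotype, data = pheno)
fit    <- eBayes(lmFit(dosage, design))
Top    <- topTable(fit, coef = "genotype", number = 5)
print(Top)
```

With real data, the simulated vcf_table and pheno would be replaced by the tables read from your own files. The row names of Top are the SNP positions, and changing number changes how many hits are listed.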