Title: | Relationship Inference for DNA Mixtures |
---|---|
Description: | Analysis of DNA mixtures involving relatives by computation of likelihood ratios that account for dropout and drop-in, mutations, silent alleles and population substructure. This is useful in kinship cases, like non-invasive prenatal paternity testing, where deductions about individuals' relationships rely on DNA mixtures, and in criminal cases where the contributors to a mixed DNA stain may be related. Relationships are represented by pedigrees and can include kinship between more than two individuals. The main function is relMix(), with relMixGUI() as its graphical user interface. The implementation and method are described in Dorum et al. (2017) <doi:10.1007/s00414-016-1526-x>, Hernandis et al. (2019) <doi:10.1016/j.fsigss.2019.09.085> and Kaur et al. (2016) <doi:10.1007/s00414-015-1276-1>. |
Authors: | Guro Dorum [aut, cre], Elias Hernandis [aut], Navreet Kaur [ctb], Thore Egeland [ctb], Magnus Dehli Vigeland [ctb] |
Maintainer: | Guro Dorum <[email protected]> |
License: | GPL (>=2) |
Version: | 1.4.1 |
Built: | 2025-02-20 06:01:42 UTC |
Source: | https://github.com/gdorum/relmix |
Finds all possible genotypes based on input alleles.
allGenos(alleles)
alleles |
Vector of input alleles, numeric or character |
Matrix of all possible genotypes, one row per genotype
Guro Dorum
alleles <- 1:3
allGenos(alleles)
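To make the idea of "all possible genotypes" concrete, the following base R sketch enumerates every unordered pair of alleles, homozygotes included. It is not the relMix implementation: `enumerate_genotypes` is a hypothetical helper, and the row order returned by allGenos() may differ.

```r
# Hypothetical helper (not part of relMix): all unordered allele pairs
enumerate_genotypes <- function(alleles) {
  n <- length(alleles)
  idx <- expand.grid(a = seq_len(n), b = seq_len(n))
  idx <- idx[idx$a <= idx$b, ]          # keep each unordered pair once
  cbind(alleles[idx$a], alleles[idx$b]) # one row per genotype
}

g <- enumerate_genotypes(1:3)
print(g)
print(nrow(g)) # n * (n + 1) / 2 genotypes for n alleles
```

For n alleles this gives n(n+1)/2 rows, so three alleles yield six genotypes.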
Loads a frequency database file and compares it against mixture data to check for common errors.
checkFrequenciesFile(filename, mix)
filename |
Path of the frequency database file |
mix |
Data frame with mixture data. See relMix vignette for description of the format |
The mixture data is used to perform more advanced checks, such as to make sure all alleles present in the mixture file have an entry in the frequency database. If warnings are found, the function attempts to fix them and explains what it has done in the warning messages. If an error is found, checking stops and a NULL data frame is returned. The error is described in the error messages.
A list containing
df
Data frame with frequencies
warning
List of strings describing the errors that occurred but could be fixed or that do not prevent the execution of the program.
error
List of strings describing the errors that occurred that made it impossible to return a valid data frame.
If this list is not empty, then the data frame item will be NULL
Elias Hernandis
See checkMixtureFile for information on how to load a mixture file.
mixfile <- system.file("extdata", "mixture.txt", package = "relMix")
mix <- checkMixtureFile(mixfile)
# Note: the mixture data frame is passed as an argument.
# If the previous check failed, the program should not continue
# with the frequencies file check.
freqfile <- system.file("extdata", "frequencies22Markers.txt", package = "relMix")
freqs <- checkFrequenciesFile(freqfile, mix$df)
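One of the checks described above, that every allele in the mixture has an entry in the frequency database, can be illustrated in base R. This is only a sketch and not the code used by checkFrequenciesFile(): the toy data frames, their column names (Marker, Allele, Frequency) and the tolerance are assumptions made for illustration.

```r
# Hypothetical toy data in long format; column names are assumptions
freqs <- data.frame(
  Marker = c("M1", "M1", "M1"),
  Allele = c(10, 11, 12),
  Frequency = c(0.2, 0.5, 0.3)
)
mix <- data.frame(Marker = c("M1", "M1", "M1"), Allele = c(10, 12, 13))

# Mixture alleles that have no entry in the database, matched per marker
key <- function(df) paste(df$Marker, df$Allele, sep = ":")
missing_alleles <- mix[!(key(mix) %in% key(freqs)), ]

# Frequencies per marker should sum to 1 (within a small tolerance)
sums <- tapply(freqs$Frequency, freqs$Marker, sum)
bad_markers <- names(sums)[abs(sums - 1) > 1e-8]

print(missing_alleles) # allele 13 at marker M1 is not in the database
print(bad_markers)     # empty: the toy frequencies sum to 1
```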
Given a mixture file name, returns the loaded data frame along with any detected errors or warnings.
checkMixtureFile(filename)
filename |
Path of the mixture file |
If warnings are found, the function attempts to fix them and explains what it has done in the warning messages. If an error is found, checking stops and a NULL data frame is returned. The error is described in the error messages.
A list containing
df
The loaded data frame, NULL if errors are present.
warning
A list of strings describing the errors that occurred but could be fixed or that do not prevent the execution of the program.
error
A list of strings describing the errors that occurred that made it impossible to return a valid data frame.
If this list is not empty, then the data frame item will be NULL.
Elias Hernandis
mixfile <- system.file("extdata", "mixture.txt", package = "relMix")
result <- checkMixtureFile(mixfile)
print(result$df)
print(result$warning)
print(result$error)
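Because every check function returns the same list structure (df, warning, error), a calling script may want one place to decide whether to continue. The sketch below shows one possible pattern; `unwrap_check` is a hypothetical helper that is not part of relMix, and the mock results only mimic the documented structure.

```r
# Hypothetical helper: stop on errors, surface warnings, return the data frame
unwrap_check <- function(result) {
  if (length(result$error) > 0) {
    stop(paste(unlist(result$error), collapse = "\n"), call. = FALSE)
  }
  for (w in result$warning) {
    warning(w, call. = FALSE)
  }
  result$df
}

# Mock results mimicking the documented list structure
ok  <- list(df = data.frame(Marker = "M1"), warning = list(), error = list())
bad <- list(df = NULL, warning = list(),
            error = list("Hypothetical error message"))

df <- unwrap_check(ok)
failed <- inherits(try(unwrap_check(bad), silent = TRUE), "try-error")
print(failed)
```

With real data, the same helper could wrap the result of checkMixtureFile() before the data frame is passed to the next check.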
Given a pedigree file path the function attempts to load it and compare it to the reference profiles to detect possible errors.
checkPedigreeFile(filename, df)
filename |
Path of the pedigree file |
df |
Data frame with reference profiles |
The pedigree file must be a text file in ped format (see the relMix vignette for an example). The data frame with reference data is used to compare names of individuals and detect possible misspellings. If warnings are found, the function attempts to fix them and explains what it has done in the warning messages. If an error is found, checking stops. The error is described in the error messages.
A list containing
df
A list of class FamiliasPedigree (see Familias::FamiliasPedigree()), or NULL if errors are present.
warning
A list of strings describing the errors that occurred but could be fixed or that do not prevent the execution of the program.
error
A list of strings describing the errors that occurred that made it impossible to return a valid pedigree.
If this list is not empty, then the df item will be NULL.
Elias Hernandis
# First load mixture file
mixfile <- system.file("extdata", "mixture_silent_ex.txt", package = "relMix")
mix <- checkMixtureFile(mixfile)
# Load reference file
reffile <- system.file("extdata", "references_silent.txt", package = "relMix")
ref <- checkReferenceFile(reffile, mix$df)
# Check pedigree file
pedfile <- system.file("extdata", "custom_pedigree_maternity_duo.ped", package = "relMix")
checkPedigreeFile(pedfile, ref$df)
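The comparison of names between pedigree and reference profiles can be pictured with a small base R sketch. It is an illustration only: the names are invented, and the use of edit distance via `utils::adist()` is an assumption, not necessarily how checkPedigreeFile() detects misspellings.

```r
# Hypothetical names; not taken from the relMix example files
ped_ids   <- c("Mother", "Child", "Father")
ref_names <- c("Mother", "Fathr")

# Reference names with no exact match in the pedigree
unmatched <- setdiff(ref_names, ped_ids)

# Suggest the closest pedigree id by edit distance (assumed heuristic)
d <- utils::adist(unmatched, ped_ids)
suggestion <- ped_ids[apply(d, 1, which.min)]

print(data.frame(unmatched = unmatched, suggestion = suggestion))
```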
Given a reference profile file name the function attempts to load it and compare it to the mixture file to detect possible errors.
checkReferenceFile(filename, mix)
filename |
Path of the reference profiles file |
mix |
Data frame with mixture data |
See the relMix vignette for a description of the format of the reference file. The data frame with mixture data is compared with the reference profiles to detect possible errors. If warnings are found, the function attempts to fix them and explains what it has done in the warning messages. If an error is found, checking stops and a NULL data frame is returned. The error is described in the error messages.
A list containing
df
The loaded data frame, NULL if errors are present
warning
A list of strings describing the errors that occurred but could be fixed or that do not prevent the execution of the program.
error
A list of strings describing the errors that occurred that made it impossible to return a valid data frame.
If this list is not empty, then the data frame item will be NULL.
Elias Hernandis
See checkMixtureFile for information on how to load a mixture file.
# Load a mixture file
mixfile <- system.file("extdata", "mixture.txt", package = "relMix")
mix <- checkMixtureFile(mixfile)
# Note: the mixture data frame is passed as an argument. If the previous check failed,
# the program should not continue with the reference file check
reffile <- system.file("extdata", "references.txt", package = "relMix")
checkReferenceFile(reffile, mix$df)
A data matrix of genotypes for known individuals and all possible genotypes for unknown individuals is created.
createDatamatrix(locus, knownGenos, idsU = NULL)
locus |
A list of class FamiliasLocus |
knownGenos |
List of known genotypes. Each element is a vector with genotype for one individual. The elements must be named |
idsU |
Vector of indices for unknown individuals |
A data matrix of genotypes where each row corresponds to an individual.
Guro Dorum
# Define alleles and frequencies
alleles <- 1:2
afreq <- c(0.5, 0.5)
# Create locus object
locus <- Familias::FamiliasLocus(frequencies = afreq, name = "M1", allelenames = alleles)
# Known genotypes of alleged father and mother; the child's genotype is unknown
gAF <- c(1, 1)
gMO <- c(1, 1)
datamatrix <- createDatamatrix(locus, knownGenos = list(AF = gAF, MO = gMO), idsU = c("CH"))
Norwegian database with 17 EXS17 markers and 6 additional markers.
data(db)
A data frame with 324 observations on the following 3 variables:
Marker
a factor with levels corresponding to name of markers
Allel
a numeric vector denoting allele
Frequency
a numeric vector in (0,1)
Dupuy et al. (2013), unpublished.
data(db)
# Check that frequencies add to 1
lapply(split(db$Frequency, db$Marker), sum)
# Find the number of alleles for all markers
unlist(lapply(split(db$Frequency, db$Marker), length))
# A closer look at the marker SE33
SE33 <- db[db$Marker == "SE33", ]
barplot(SE33$Frequency)
Frequencies for 22 loci from the prototype 24-plex STR panel from Thermo Fisher.
data(db2)
A data frame with 206 observations on the following 3 variables.
Marker
a factor with levels corresponding to name of markers
Allele
a numeric vector denoting allele
Frequency
a numeric vector in (0,1)
The format is convenient for R.
Hill et al. (2013) U.S. population data for 29 autosomal STR loci. Forensic Sci. Int. Genet. 7, e82-e83.
Hill et al. (2006) Allele Frequencies for 26 MiniSTR Loci with U.S. Caucasian, African American, and Hispanic Populations. http://www.cstl.nist.gov/biotech/strbase/NISTpop.htm
data(db2)
Takes genotypes as input and creates a mixture. Alleles drop in and out of the mixture with the specified probabilities.
generateMix(G, alleles, afreq, D, di)
G |
List of genotypes. Each element is a vector with genotype for one individual |
alleles |
Numeric or character vector of allele names for the marker |
afreq |
Numeric vector of allele frequencies for the marker |
D |
List of dropout values (between 0 and 1) per contributor. Each element is a vector containing heterozygous and homozygous dropout probability for the given contributor |
di |
Drop-in value (between 0 and 1) |
A vector of mixture alleles.
Guro Dorum
# Define alleles and frequencies
alleles <- 1:2
afreq <- c(0.5, 0.5)
# Genotypes
gM <- c(1, 1)
gC <- c(1, 2)
# Dropout and drop-in values
d <- 0.1
di <- 0.05
# No dropout for the first contributor
D <- list(c(0, 0), c(d, d^2))
R <- generateMix(G = list(gM, gC), alleles, afreq, D = D, di = di)
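The way alleles drop in and out can be sketched with a simplified simulation in base R. This is not the code behind generateMix(): `simulate_mix` is a hypothetical helper, and it assumes that each allele of a heterozygote drops out independently with the heterozygous probability, that the single allele of a homozygote drops out with the homozygous probability, and that at most one drop-in allele is added, drawn according to the allele frequencies.

```r
# Hypothetical helper; a simplified dropout/drop-in model under stated assumptions
simulate_mix <- function(G, alleles, afreq, D, di) {
  kept <- alleles[0] # empty vector of the same type as the allele names
  for (i in seq_along(G)) {
    g <- G[[i]]
    if (g[1] == g[2]) {
      # Homozygote: one distinct allele, kept unless homozygous dropout occurs
      if (runif(1) >= D[[i]][2]) kept <- c(kept, g[1])
    } else {
      # Heterozygote: each allele kept unless heterozygous dropout occurs
      keep <- runif(2) >= D[[i]][1]
      kept <- c(kept, g[keep])
    }
  }
  # Drop-in: with probability di, add one allele drawn by frequency
  if (runif(1) < di) {
    kept <- c(kept, alleles[sample.int(length(alleles), 1, prob = afreq)])
  }
  sort(unique(kept))
}

set.seed(1)
G <- list(c(1, 1), c(1, 2))
# Without dropout and drop-in the mixture is the union of the contributors' alleles
no_noise <- simulate_mix(G, 1:2, c(0.5, 0.5), D = list(c(0, 0), c(0, 0)), di = 0)
# With certain dropout and no drop-in the mixture is empty
all_lost <- simulate_mix(G, 1:2, c(0.5, 0.5), D = list(c(1, 1), c(1, 1)), di = 0)
print(no_noise)
print(length(all_lost))
```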
Computes the likelihood of a mixture conditioned on a given number of known and unknown contributors, and drop-in and dropout probabilities.
mixLikDrop(R, G, D, di = 0, alleleNames, afreq)
R |
Vector of mixture alleles |
G |
List of genotypes. Each element is a vector with genotype for one individual |
D |
List of dropout values (between 0 and 1) per contributor. Each element is a vector containing heterozygous and homozygous dropout probability for the given contributor |
di |
Drop-in value (between 0 and 1) |
alleleNames |
Vector of allele names for the marker |
afreq |
Vector of allele frequencies for the marker |
The likelihood (a numeric)
Guro Dorum
The model is specified in the appendix of Haned et al. (2012) <doi:10.1016/j.fsigen.2012.08.008>.
# Define alleles and frequencies
alleles <- 1:2
afreq <- c(0.5, 0.5)
# Genotypes
gM <- c(1, 1)
gC <- c(1, 2)
# Mixture alleles
R <- c(1, 2)
# Dropout and drop-in values
d <- 0.1
di <- 0.05
# No dropout for the first contributor
D <- list(c(0, 0), c(d, d^2))
mixLikDrop(R = R, G = list(gM, gC), D = D, di = di, alleleNames = alleles, afreq = afreq)
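To give a feel for how dropout and drop-in enter such a likelihood, here is a simplified base R sketch. It is an assumption-laden illustration and not the mixLikDrop() implementation, so its output is not guaranteed to match that function: an allele carried by contributors is missing from the mixture only if every carrier drops it, an allele not carried by anyone contributes di times its frequency, and a factor (1 - di) applies when no drop-in is needed. `lik_sketch` is a hypothetical name.

```r
# Hypothetical helper; simplified dropout/drop-in likelihood for one marker
lik_sketch <- function(R, G, D, di, alleleNames, afreq) {
  lik <- 1
  carried <- unique(unlist(G))
  for (a in carried) {
    # Probability that every contributor carrying allele a drops it
    p_drop <- 1
    for (i in seq_along(G)) {
      g <- G[[i]]
      if (a %in% g) {
        d_i <- if (g[1] == g[2]) D[[i]][2] else D[[i]][1]
        p_drop <- p_drop * d_i
      }
    }
    term <- if (a %in% R) 1 - p_drop else p_drop
    lik <- lik * term
  }
  # Alleles in the mixture that no contributor carries must be drop-ins
  extra <- setdiff(R, carried)
  if (length(extra) == 0) {
    lik <- lik * (1 - di)
  } else {
    lik <- lik * prod(di * afreq[match(extra, alleleNames)])
  }
  lik
}

d <- 0.1
val <- lik_sketch(R = c(1, 2), G = list(c(1, 1), c(1, 2)),
                  D = list(c(0, 0), c(d, d^2)), di = 0.05,
                  alleleNames = 1:2, afreq = c(0.5, 0.5))
print(val) # 1 * (1 - 0.1) * (1 - 0.05) = 0.855 under this sketch
```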
Calculates likelihoods for relationship inference involving mixtures and missing reference profiles, including drop-in and dropout, mutations, silent alleles and theta correction.
relMix(
  pedigrees,
  locus,
  R,
  datamatrix,
  ids,
  D = rep(list(c(0, 0)), length(ids)),
  di = 0,
  kinship = 0
)
pedigrees |
A list of pedigrees defined using FamiliasPedigree |
locus |
A list of class FamiliasLocus |
R |
A vector of mixture alleles, or a list of such if there are multiple replicates |
datamatrix |
A data frame where each line corresponds to one constellation of genotypes for the involved individuals. Indices of individuals must be given as rownames and must correspond to indices in the pedigree |
ids |
Index vector indicating which individuals are contributors to the mixture. The indices must correspond to indices in the pedigree |
D |
List of numeric dropout values (between 0 and 1) per contributor. Each element is a vector containing heterozygous and homozygous dropout probability for the given contributor |
di |
A numeric drop-in value (between 0 and 1) |
kinship |
A numeric value between 0 and 1 that defines the theta-parameter |
The function requires the package Familias and calls on the function FamiliasPosterior.
A numeric likelihood for each pedigree named according to the pedigrees, and a matrix of likelihoods for each pedigree and each term (genotype constellation) considered in the calculation (one row per term).
Navreet Kaur, Thore Egeland, Guro Dorum
Dorum et al. (2017) <doi:10.1007/s00414-016-1526-x>
Kaur et al. (2016) <doi:10.1007/s00414-015-1276-1>
See relMixGUI for the GUI version of relMix, FamiliasLocus on how to create a FamiliasLocus, and FamiliasPedigree on how to create a FamiliasPedigree.
# Example 1: paternity trio with mixture of mother and child
# Define alleles and frequencies
alleles <- 1:2
afreq <- c(0.4, 0.6)
# Define pedigrees
persons <- c("CH", "MO", "AF")
ped1 <- Familias::FamiliasPedigree(id = persons,
                                   dadid = c("AF", NA, NA),
                                   momid = c("MO", NA, NA),
                                   sex = c("male", "female", "male"))
ped2 <- Familias::FamiliasPedigree(id = c(persons, "TF"),
                                   dadid = c("TF", NA, NA, NA),
                                   momid = c("MO", NA, NA, NA),
                                   sex = c("male", "female", "male", "male"))
pedigrees <- list(isFather = ped1, unrelated = ped2)
# Create locus object
locus <- Familias::FamiliasLocus(frequencies = afreq, name = "M1",
                                 allelenames = alleles)
# Known genotypes of alleged father and mother
gAF <- c(1, 1)
gMO <- c(1, 1)
# Mixture alleles
R <- c(1, 2)
datamatrix <- createDatamatrix(locus, knownGenos = list(AF = gAF, MO = gMO), idsU = c("CH"))
# Define dropout and drop-in values
d <- 0.1
di <- 0.05
res <- relMix(pedigrees, locus, R, datamatrix, ids = c("MO", "CH"),
              D = list(c(0, 0), c(d, d^2)), di = di, kinship = 0)
# LR = 0.054
res$isFather / res$unrelated

# Example 2: exhaustive example with silent allele, mutations, dropout and drop-in
# H1: contributors are mother and child
# H2: contributors are mother and an unrelated individual
# Possible dropout in both contributors
gMO <- c(1, 1) # Mother's genotype
R <- 1         # Mixture alleles
# Mother/child pedigree
persons <- c("CH", "MO")
ped1 <- Familias::FamiliasPedigree(id = persons, dadid = c(NA, NA),
                                   momid = c("MO", NA),
                                   sex = c("male", "female"))
ped2 <- Familias::FamiliasPedigree(id = c(persons), dadid = c(NA, NA),
                                   momid = c(NA, NA),
                                   sex = c("male", "female"))
pedigrees <- list(H1 = ped1, H2 = ped2)
# Alleles and frequencies:
# When silent alleles are involved, a custom mutation matrix is required.
# No mutations are possible to or from silent alleles.
# We create the mutation model with FamiliasLocus and modify it before it is
# passed on to relMix
alleles <- c(1, 2, "silent")
afreq <- c(0.4, 0.5, 0.1)
# Create initial locus object with mutation matrix
locus <- Familias::FamiliasLocus(frequencies = afreq, name = "M1",
                                 allelenames = alleles, MutationModel = "Equal",
                                 femaleMutationRate = 0.1, maleMutationRate = 0.1)
# Modify mutation matrix from Familias:
# Silent allele must be given as 's' (not 'silent' as in Familias)
newAlleles <- c(alleles[-length(alleles)], "s")
mm <- locus$femaleMutationMatrix
colnames(mm) <- rownames(mm) <- newAlleles
# Create new locus object with modified mutation matrix
locus <- Familias::FamiliasLocus(frequencies = afreq, name = "M1",
                                 allelenames = newAlleles,
                                 MutationModel = "Custom", MutationMatrix = mm)
knownGenos <- list(gMO)
names(knownGenos) <- c("MO")
datamatrix <- createDatamatrix(locus, knownGenos, idsU = "CH")
d <- 0.1 # Dropout probability for both contributors
di <- 0.05
res2 <- relMix(pedigrees, locus, R, datamatrix, ids = c("MO", "CH"),
               D = list(c(d, d^2), c(d, d^2)), di = di, kinship = 0)
# LR = 1.68
res2$H1 / res2$H2
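The examples above compute a likelihood ratio for one marker as the ratio of two pedigree likelihoods. The short sketch below shows the same arithmetic for several markers; the likelihood values are invented for illustration, and multiplying per-marker LRs assumes the markers are independent, which is an assumption added here rather than something relMix() does for you in this call.

```r
# Hypothetical per-marker likelihoods under two pedigrees (invented numbers)
lik_H1 <- c(M1 = 0.012, M2 = 0.30)
lik_H2 <- c(M1 = 0.040, M2 = 0.10)

# Per-marker likelihood ratios
LR_marker <- lik_H1 / lik_H2

# Combined LR, assuming independent markers
LR_total <- prod(LR_marker)

print(LR_marker)
print(LR_total)
print(log10(LR_total)) # log10 scale is often easier to read
```

An LR above 1 supports the first hypothesis and an LR below 1 supports the second.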
User-friendly graphical user interface for relMix.
relMixGUI()
Includes error checking for the input files.
No return value, called for side effects.
Guro Dorum, Elias Hernandis, Magnus Dehli Vigeland
See relMix for the main function implemented in relMixGUI.
# Examples can be found in the vignette and example data files can be found
# in the folder "extdata" in the installation folder for relMix