US20130345988A1 - Summarizing an aggregate contribution to a characteristic for an individual - Google Patents
- Publication number
- US20130345988A1 (application Ser. No. 13/932,513)
- Authority
- US
- United States
- Prior art keywords
- individual
- phenotypic characteristic
- markers
- characteristic
- aggregate contribution
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G06F19/18—
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
-
- G06F19/24—
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/40—Population genetics; Linkage disequilibrium
Definitions
- Genetic testing can provide information to help individuals understand areas of potential concern to discuss with their doctors, and together with a doctor, can help individuals make informed decisions about medical management and lifestyle choices.
- Typical genetic testing solutions allow an individual to order a particular genetic test, such as for type 2 diabetes. The individual typically receives a summary report having a limited set of results with limited interpretative information. Improvements in the interpretation and reporting of genetic test results would be useful.
- FIG. 1 is a block diagram illustrating an embodiment of a system for determining an aggregate contribution to a characteristic for an individual with a specific marker measurement, for instance, a genotype.
- FIG. 2 is a diagram illustrating an embodiment of an interface for displaying contributions to a characteristic for an individual.
- FIG. 3 is a flow chart illustrating an embodiment of a process for summarizing an aggregate contribution to a characteristic for an individual.
- FIG. 4 is a flow chart illustrating an embodiment of a process for determining an aggregate genetic contribution.
- FIG. 5 is a flow chart illustrating an embodiment of a process for obtaining a normalized statistical factor.
- FIG. 6 is a diagram illustrating an embodiment of an interface for displaying contributions to a characteristic for multiple individuals.
- the invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor.
- these implementations, or any other form that the invention may take, may be referred to as techniques.
- the order of the steps of disclosed processes may be altered within the scope of the invention.
- a component such as a processor or a memory described as being configured to perform a task may be implemented as a general component that is temporarily configured to perform the task at a given time or a specific component that is manufactured to perform the task.
- the term ‘processor’ refers to one or more devices, circuits, and/or processing cores configured to process data, such as computer program instructions.
- FIG. 1 is a block diagram illustrating an embodiment of a system for determining an aggregate contribution to a characteristic for an individual with a specific marker measurement, for instance, a genotype.
- the aggregate contribution may also allow an individual to add non-genetic factors to the calculation of the aggregate contribution, including other biomarkers, family history and environmental factors.
- a characteristic includes a phenotypic characteristic, trait, or condition of an individual. Examples of characteristics include eye color, ability to taste a bitter flavor in raw broccoli, type 2 diabetes, etc.
- a contribution refers to a measure of the association between a characteristic and an individual based on the individual's marker measurements, including genetics as well as any non-genetic information added by the individual. For example, a contribution may be an odds or probability that an individual has or will develop a characteristic.
- processor 108 interacts with genotype database 102 , marker database 104 , and statistics database 106 . Using data obtained from these databases, processor 108 determines contributions associated with each of a set of one or more markers associated with a characteristic. Markers, as used herein, are measurable factors, categorical or quantitative, that are associated with characteristics of individuals. Markers may include genetic markers (e.g., single nucleotide polymorphisms (SNPs)), other biomarkers (e.g., epigenetic markers or cholesterol level), any other appropriate markers (e.g., family history or weight) or combinations of any of the above (e.g. the joint or conditional risk ratio for a SNP and an environmental condition such as smoking status). Contributions from each marker are aggregated at aggregator 110 into an aggregate contribution. The aggregate contribution and/or contributions from each marker are displayed on display 112 .
- SNPs may be described in some of the examples herein, other markers may be used instead of or in addition to SNPs in various embodiments.
- Genotype database 102 contains genetic information for a plurality of individuals, where genetic information could include genotyping data, such as SNP data for an individual.
- Marker database 104 contains information that measures the association between various markers and characteristics.
- marker database 104 may include an entry that includes:
- the odds ratio indicates the odds that a person having that genotype together with a specified ethnicity and age range will have type 2 diabetes relative to the odds that a person having the AA genotype will have type 2 diabetes (which is why the odds ratio for AA is 1.0).
- This is the definition used in some association studies.
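- As a rough illustration of such an entry, the sketch below stores one record with the example values used in this description (European ancestry, ages 20-39, type 2 diabetes, odds ratios of 1.0, 1.5, and 2.0 for AA, AC, and CC). The class name `MarkerEntry`, its field names, and the helper `reference_genotype` are hypothetical and are not part of the disclosed system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarkerEntry:
    """One hypothetical marker-database record; field names are illustrative."""
    ethnicity: str
    age_range: tuple[int, int]
    characteristic: str
    marker: str
    odds_ratios: dict[str, float]  # genotype -> odds ratio vs. the reference genotype

    def reference_genotype(self) -> str:
        # The reference genotype is the one whose odds ratio is exactly 1.0.
        return next(g for g, ratio in self.odds_ratios.items() if ratio == 1.0)


entry = MarkerEntry(
    ethnicity="European",
    age_range=(20, 39),
    characteristic="type 2 diabetes",
    marker="TCF7L2",
    odds_ratios={"AA": 1.0, "AC": 1.5, "CC": 2.0},
)
```

- Keeping the odds ratios keyed by genotype makes the reference genotype explicit, which matters later when the ratios are re-expressed relative to the average odds instead.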
- Other traits may incorporate additional modifying non-genetic information, such as other biomarkers, family history, or environmental factors.
- other statistical factors besides an odds ratio may be provided (e.g., risk ratio or hazard ratio).
- data in marker database 104 is based on genetic studies, such as association studies or linkage studies. These are research studies that produce as output statistical factors (e.g., odds ratios) or deterministic rules that quantify the association between a marker associated with a characteristic and the characteristic.
- statistical factors are stored for specified combinations of genotypes from two or more SNPs (e.g., the joint or conditional odds ratios for a pair of SNPs (i.e., an epistatic interaction)).
- entries in marker database 104 are generated by users or third parties.
- a third party for example a geneticist, may annotate a putative association between a marker or markers and a characteristic. This annotation may take the form of a mapping from measurements of a marker or a set of markers to textual descriptions of the characteristic in question and/or to statistical factors. All such third party annotations will be noted as such and as not generated by the company.
- Statistics database 106 contains information such as incidence or prevalence data for a characteristic (e.g., 10% of people of European ancestry between the ages of 20-39 are diagnosed with type 2 diabetes) and marker frequency data associated with the incidence data (e.g., of the people in the incidence data study, 20% had genotype AA, 60% had genotype AC, and 20% had genotype CC).
- the marker frequency data may come from the same source as the incidence data, or from a marker frequency database that can be mapped to the same population as the incidence data (e.g., the Hapmap database).
- the statistical information from statistics database 106 can be combined with the information in marker database 104 to obtain information that is more relevant to an individual with a particular set of marker measurements, as more fully described below.
- elements 102 - 110 are associated with a website hosted by a web server that is coupled to a network, such as the Internet.
- Display 112 is displayed by a web browser running on a client device that is coupled to the network.
- FIG. 2 is a diagram illustrating an embodiment of an interface for displaying contributions to a characteristic for an individual.
- the characteristic is X and the individual is Greg Mendel (not his real name, although a real person available as part of our reference database for viewing).
- X may be a trait (e.g., eye color) or condition (e.g., type 2 diabetes).
- interface 200 is displayed by display 112 of FIG. 1 .
- Marker effects box 204 shows the approximate effects of Greg Mendel's genotype at four SNPs that are associated with characteristic X.
- the bar chart represents the change in odds that certain genotypes contribute to the estimated incidence of this characteristic for people with Greg Mendel's measurements at four markers.
- the horizontal line represents an estimate of the average person's probability of having or developing the characteristic in a particular population subset chosen by the user.
- Each bar represents the contribution of a single marker to the odds of having or developing the characteristic.
- the bars are labeled with a SNP identifier (e.g., rs1234567).
- the bars are labeled with the gene the SNP is closest to (e.g., TCF7L2).
- the height of each bar is determined by the log of the odds ratio.
- a bar may represent a combination of genotypes from SNPs (e.g., joint odds of genotypes from a pair of SNPs) or may represent the contribution of one genotype conditioned on another (e.g., the height of the bar for genotype AA of SNP X may vary depending on the genotype of SNP Y).
- slashed bars indicate increased risk from the average
- lower, dotted bars indicate decreased risk from the average.
- an increased risk compared to the average is associated with marker M 2 and a decreased risk compared to the average is associated with markers M 1 , M 3 , and M 4 .
- the white bars show the maximum possible effects for the possible genotypes at the marker or the full range of contributions to risk made by a given SNP.
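- The bar geometry described above can be sketched as follows, assuming each bar's height is the natural log of an odds ratio taken relative to the average (so a height of zero sits on the average line). The function name `bar_for_marker` and the odds ratios 1.05 and 1.31 are hypothetical; 0.92 is the tooltip value quoted for marker M 4.

```python
import math


def bar_for_marker(individual_or: float, possible_ors: list[float]) -> dict:
    """Describe one bar: signed height, direction, and the marker's full range."""
    height = math.log(individual_or)  # 0.0 sits on the average line
    if height > 0:
        direction = "increased"
    elif height < 0:
        direction = "decreased"
    else:
        direction = "average"
    # The full range corresponds to the white bars: every genotype at this marker.
    logs = [math.log(r) for r in possible_ors]
    return {"height": height, "direction": direction,
            "range": (min(logs), max(logs))}


bar = bar_for_marker(0.92, [0.92, 1.05, 1.31])
```

- A log scale is a natural choice here because odds ratios combine by multiplication, so equal bar heights above and below the line correspond to reciprocal effects.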
- hovering a cursor over a bar causes a tooltip (small box) to be displayed that indicates the odds for Greg Mendel compared to the average odds for that marker.
- the cursor is hovering over the bar for marker M 4 , and the tooltip indicates that Greg Mendel's odds of getting or having characteristic X are 0.92 times the average odds. Examples of how these computations are made are more fully described below.
- Aggregate contribution box 202 shows the aggregate contribution of the effects from all the markers (M 1 -M 4 in this example). Incidence is a measure of how often people in a population develop or are diagnosed with a particular condition in a given period of time, and is usually measured in events per person-year. Aggregate contribution box 202 shows an estimate of incidence in two different contexts: average incidence, and a genotype-specific incidence value.
- Incidence may also be thought of as an individual's chance of being diagnosed with a condition during a given period of time (assuming that he or she did not have the condition to begin with). A 100% chance is a sure thing. A 0% chance means that the event will never occur. Saying that “25 people out of 100” will be diagnosed with a condition over a given time period is another way to describe a 25% chance of getting it (or odds of 1-to-3 on getting it).
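- The relationship between a chance and the corresponding odds can be checked with a few lines of exact arithmetic. This is only a sketch; the function names are hypothetical.

```python
from fractions import Fraction


def probability_to_odds(p: Fraction) -> Fraction:
    """Odds in favor of an event with probability p (requires p < 1)."""
    return p / (1 - p)


def odds_to_probability(odds: Fraction) -> Fraction:
    """Inverse conversion: probability of an event with the given odds."""
    return odds / (1 + odds)


odds = probability_to_odds(Fraction(25, 100))  # 1/3, i.e. odds of 1-to-3
```

- Converting back with `odds_to_probability(odds)` returns 1/4, the original 25% chance.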
- the marker-measurement-specific incidence value is an estimate of how many individuals in a population composed of people with a particular set of marker measurements are likely to be diagnosed with a condition over a given age range and with a given ethnicity.
- the estimate is based on the current state of biomedical literature and is related only to a particular genotype, age or age range, and ethnicity, but not to the environment.
- non-genetic factors such as other biomarkers, family history, and environmental markers are also accounted for. Examples of computing a genotype-specific incidence value are discussed more fully below.
- Interface 200 includes a pull-down menu to select markers, including ethnicity and age range.
- the aggregate contributions shown in box 202 and the marker effects shown in box 204 will be updated to reflect the new ethnicity and/or age range.
- the available ethnicities and age ranges depend on the availability of this data from scientific studies.
- the ethnicity may be determined based on a user's designated ethnicity or automatically determined based on a user's genetic data.
- the age of the user may be provided by the user.
- the pull-down menu defaults to the designated or automatically determined ethnicity, and an appropriate age range.
- FIG. 3 is a flow chart illustrating an embodiment of a process for summarizing an aggregate contribution to a characteristic for an individual. In some embodiments, this process is used to determine an aggregate contribution to characteristic X for Greg Mendel to be displayed in box 202 .
- a characteristic to be evaluated is input. For example, to get to interface 200 , a user (e.g., Greg Mendel) may choose from a list of available characteristics, such as type 2 diabetes or Crohn's disease.
- one or more markers associated with the characteristic are identified. For example, in FIG. 2 , for characteristic X, markers M 1 , M 2 , M 3 , and M 4 are identified since they are associated with characteristic X through genetic association studies. In some embodiments, the markers are identified using a marker database, which includes data about which markers are associated with various characteristics.
- the genotype of an individual for each of the one or more SNPs is retrieved from a database of individuals' marker measurements. For example, Greg Mendel's genotypes at the relevant SNPs for characteristic X (i.e., M 1 , M 2 , M 3 , and M 4 ) are retrieved from genotype database 102 .
- a statistical factor that measures the association between the marker and the characteristic is retrieved.
- the statistical factor that is retrieved from marker database 104 is an odds ratio.
- the odds ratio is specific to a selected ethnicity and/or age range of an individual.
- the odds ratio is computed relative to the odds of the lowest risk marker.
- the odds ratio is a normalized odds ratio computed relative to the average odds. The normalized odds ratio accounts for statistical data about disease incidence and genotype frequency, as more fully described below.
- an aggregate genetic contribution is determined based on the retrieved statistical factors.
- the aggregate genetic contribution is determined by multiplying the individual marker genetic contributions (statistical factors) with each other, as more fully described below.
- the aggregate genetic contribution is displayed.
- box 202 displays the aggregate genetic contribution.
- the individual marker contribution is displayed, as shown in box 204 of FIG. 2 .
- steps 304 - 308 are performed by processor 108
- step 310 is performed by aggregator 110
- step 312 is performed by display 112 .
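- The flow of FIG. 3 can be sketched end to end with in-memory stand-ins for the databases. Everything concrete here is an assumption made for illustration: the dictionaries `MARKER_DB` and `GENOTYPE_DB`, the function `summarize`, the individual key "greg", and all genotypes and odds ratios are hypothetical, and the odds ratios are taken to be already normalized relative to the average odds.

```python
from math import prod

# Hypothetical stand-ins for marker database 104 and genotype database 102.
MARKER_DB = {
    "X": {  # characteristic -> marker -> genotype -> normalized odds ratio
        "M1": {"AA": 0.90, "AC": 1.00, "CC": 1.20},
        "M2": {"GG": 0.85, "GT": 1.10, "TT": 1.40},
    }
}
GENOTYPE_DB = {"greg": {"M1": "AA", "M2": "GT"}}


def summarize(characteristic: str, individual: str) -> dict:
    # Steps 304-308 (processor 108): identify the markers, retrieve the
    # individual's genotypes, and look up one statistical factor per marker.
    markers = MARKER_DB[characteristic]
    genotypes = GENOTYPE_DB[individual]
    factors = {m: table[genotypes[m]] for m, table in markers.items()}
    # Step 310 (aggregator 110): multiply the per-marker factors together.
    aggregate = prod(factors.values())
    # Step 312 (display 112) would render this payload.
    return {"per_marker": factors, "aggregate": aggregate}


report = summarize("X", "greg")
```

- Returning both the per-marker factors and the aggregate mirrors the two boxes of the interface, which show the individual marker effects alongside the aggregate contribution.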
- FIG. 4 is a flow chart illustrating an embodiment of a process for determining an aggregate genetic contribution. In some embodiments, this process is used to perform step 310 of FIG. 3 .
- each normalized statistical factor is applied to a baseline risk to obtain a set of marker-measurement-specific statistical factors. In some embodiments, as previously described, the statistical factor is an odds ratio.
- baseline risk is average risk. As used herein, risk can include incidence, prevalence, or any other measure of risk. Prevalence is defined as the total number of cases of a characteristic in the population at a given time, or the total number of cases in the population, divided by the number of individuals in the population.
- each normalized statistical factor is a marker-specific contribution.
- marker-specific contributions are shown in box 204 of FIG. 2 .
- the marker-specific contribution for marker M 4 is 0.92 times the average odds; that is, the odds of someone with Greg Mendel's genotype of getting or having characteristic X is 0.92 times the average.
- a marker-specific contribution is a genotype-specific risk, i.e., the probability that an individual with that genotype is affected given their genotype, Pr(D | G m ).
- FIG. 5 is a flow chart illustrating an embodiment of a process for obtaining a normalized statistical factor. In some embodiments, this process is used to perform step 402 of FIG. 4 .
- G 1 is the genotype with the lowest odds ratio
- G 3 is the genotype with the highest odds ratio
- G 1 , G 2 , and G 3 could be AA, AC, and CC and OR 1 , OR 2 , and OR 3 could be 1.0, 1.5, and 2.0.
- a binary characteristic is a characteristic that has one of two possible values. The individual for whom to estimate the incidence has genotype G m , where m ∈ {1, 2, 3}. The quantity to compute is Pr(D | G m ).
- estimates of incidence of the characteristic are obtained.
- the unconditional risk for the characteristic denoted Pr(D)
- Pr(D) is obtained for a subpopulation of which the individual is a member.
- this may be an estimate of the incidence of type 2 diabetes for Asian subjects between the ages of 20 and 40.
- This can be obtained from public data sources, such as data collected on incidence and/or prevalence from Centers for Disease Control (CDC) data, or published epidemiological data, for example from disease consortia.
- genotype frequencies are obtained. The genotype frequency is the percentage of the studied population having each possible genotype (G 1 , G 2 , and G 3 ). For example, estimates of the three genotype frequencies Pr(G i ) are obtained for the same subpopulation.
- odds ratio estimates are obtained.
- the odds ratio estimates are obtained from marker database 104 .
- normalized odds ratio estimates are computed based on genotype frequencies and estimates of incidence. In some embodiments, this is performed as follows. Pr(D
- Equation 1 follows from basic probability theory, and Equations 2 and 3 follow from the definition of an odds ratio.
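- The equations themselves did not survive in this copy of the text. The block below is a reconstruction, not a quotation: it assumes Equation 1 is the law of total probability over the three genotypes and that Equations 2 and 3 define OR 2 and OR 3 relative to genotype G 1 , which is consistent with the statement that OR 1 is 1.0 for the lowest-risk genotype.

```latex
\begin{align}
\Pr(D) &= \sum_{i=1}^{3} \Pr(D \mid G_i)\,\Pr(G_i) \tag{1}\\
OR_2 &= \frac{\operatorname{odds}(D \mid G_2)}{\operatorname{odds}(D \mid G_1)} \tag{2}\\
OR_3 &= \frac{\operatorname{odds}(D \mid G_3)}{\operatorname{odds}(D \mid G_1)} \tag{3}
\end{align}
% with odds(X) = Pr(X) / (1 - Pr(X))
```

- Under this reading there are three equations in the three unknown genotype-specific risks, so the system is determined once Pr(D), the genotype frequencies, and the odds ratios are supplied.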
- the estimated genotype-specific risk at the single locus is then Pr(D | G m ).
- Pr(D), Pr(G 1 ), Pr(G 2 ), and Pr(G 3 ) are known.
- Pr(D | G 1 ), Pr(D | G 2 ), and Pr(D | G 3 ) are unknown.
- Pr(D | G m ) can be calculated in a number of ways, including with a numerical root solver, such as Newton-Raphson.
- OR* m = odds(D | G m )/odds(D), where, for brevity, we have introduced the function odds(X) = Pr(X)/(1 − Pr(X)).
- the superscript asterisk on OR* m is used to distinguish an odds ratio computed relative to the average odds, rather than relative to the lowest odds ratio, which is the definition used in some association studies.
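- A minimal sketch of this calculation follows, using Newton-Raphson on the single unknown x = odds(D | G 1 ). It assumes the total-probability constraint and the odds-ratio definitions described above, and uses the example figures from this description (10% incidence, genotype frequencies of 20%, 60%, and 20%, odds ratios of 1.0, 1.5, and 2.0). The function names are hypothetical, and this is one possible solver rather than the disclosed implementation.

```python
def odds(p):
    """odds(X) = Pr(X) / (1 - Pr(X))."""
    return p / (1 - p)


def genotype_risks(pr_d, freqs, odds_ratios, tol=1e-12, max_iter=100):
    """Solve for Pr(D | G_i) given Pr(D), Pr(G_i), and OR_i relative to G_1.

    The unknown is x = odds(D | G_1); every other genotype has odds OR_i * x.
    """
    x = 0.0  # start below the root; f is increasing and concave for x >= 0
    for _ in range(max_iter):
        # f(x) = sum_i Pr(G_i) * OR_i*x / (1 + OR_i*x) - Pr(D)
        f = sum(w * r * x / (1 + r * x) for w, r in zip(freqs, odds_ratios)) - pr_d
        df = sum(w * r / (1 + r * x) ** 2 for w, r in zip(freqs, odds_ratios))
        step = f / df
        x -= step
        if abs(step) < tol:
            break
    # Convert each genotype's odds back to a probability.
    return [r * x / (1 + r * x) for r in odds_ratios]


def normalized_odds_ratios(pr_d, risks):
    """OR*_m = odds(D | G_m) / odds(D)."""
    return [odds(p) / odds(pr_d) for p in risks]


risks = genotype_risks(0.10, [0.20, 0.60, 0.20], [1.0, 1.5, 2.0])
or_star = normalized_odds_ratios(0.10, risks)
```

- Starting the iteration at x = 0 keeps every Newton step on the low side of the root for this particular function, so the iterates increase toward the solution without overshooting.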
- an aggregate contribution is a probability Pr(D
- OR C denotes the composite odds ratio.
- the normalized odds ratios are denoted OR* i,k .
- the possible genotypes at the kth locus are G 1,k , G 2,k , G 3,k , and the individual's genotype at the kth locus is G m k ,k .
- Greg Mendel has a 2.9 in 100 probability of getting characteristic X given his genotype for markers M 1 -M 4 .
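- One way to turn per-marker normalized odds ratios into such a probability is sketched below. It assumes that the markers contribute independently, so the composite odds ratio is the product of the normalized odds ratios, and that this product scales the average odds. The function name and all input values are hypothetical and are not Greg Mendel's actual data.

```python
from math import prod


def aggregate_probability(average_risk: float, normalized_ors: list[float]) -> float:
    """Scale the average odds by the composite odds ratio, then convert to a probability."""
    composite_or = prod(normalized_ors)                # OR_C
    average_odds = average_risk / (1 - average_risk)   # odds(D)
    individual_odds = composite_or * average_odds
    return individual_odds / (1 + individual_odds)


# Hypothetical inputs: a 3% average risk and four normalized odds ratios.
p = aggregate_probability(0.03, [0.95, 1.20, 0.90, 0.92])
```

- Working in odds and converting to a probability only at the end keeps the result strictly between 0 and 1, which multiplying probabilities by ratios directly would not guarantee.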
- FIG. 6 is a diagram illustrating an embodiment of an interface for displaying contributions to a characteristic for multiple individuals.
- interface 200 is displayed by display 112 of FIG. 1 .
- the characteristic is X and the individuals are Greg Mendel (the user) and other individuals who have given permission to allow Greg Mendel to view their aggregate contributions to characteristic X.
- Greg Mendel may have family and/or friends who have enabled sharing of their aggregate contributions to characteristic X with Greg.
- Lilly Mendel his mother
- John Lee a friend
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Medical Informatics (AREA)
- Biophysics (AREA)
- Genetics & Genomics (AREA)
- Theoretical Computer Science (AREA)
- Evolutionary Biology (AREA)
- Spectroscopy & Molecular Physics (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biotechnology (AREA)
- Analytical Chemistry (AREA)
- Molecular Biology (AREA)
- Chemical & Material Sciences (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Public Health (AREA)
- Evolutionary Computation (AREA)
- Ecology (AREA)
- Epidemiology (AREA)
- Physiology (AREA)
- Databases & Information Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioethics (AREA)
- Artificial Intelligence (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
Summarizing an aggregate contribution to a phenotypic characteristic for an individual includes: receiving information pertaining to the phenotypic characteristic of an individual; identifying, using one or more computer processors, a set of one or more markers associated with the phenotypic characteristic; obtaining a set of one or more marker measurements of the individual that corresponds to the set of one or more markers; obtaining a set of one or more statistical factors that measure associations between the set of one or more markers and the phenotypic characteristic; determining an aggregate contribution to the phenotypic characteristic of the individual based at least in part on the retrieved set of one or more statistical factors; and outputting a display characteristic to be displayed that is associated with the aggregate contribution to the phenotypic characteristic for the individual.
Description
- This application is a continuation of co-pending U.S. patent application Ser. No. 12/151,977, entitled SUMMARIZING AN AGGREGATE CONTRIBUTION TO A CHARACTERISTIC FOR AN INDIVIDUAL filed May 8, 2008 which is incorporated herein by reference for all purposes, which claims priority to U.S. Provisional Patent Application No. 60/999,175 (Attorney Docket No. 23MEP002+) entitled GENE JOURNAL filed Oct. 15, 2007 which is incorporated herein by reference for all purposes.
- Recently, interest in genetics and genetic testing has risen as increasing amounts of research show how an individual's genetic information can influence aspects of a person's ancestry, appearance, behavior, and physiology. Genetic testing can provide information to help individuals understand areas of potential concern to discuss with their doctors, and together with a doctor, can help individuals make informed decisions about medical management and lifestyle choices. Typical genetic testing solutions allow an individual to order a particular genetic test, such as for type 2 diabetes. The individual typically receives a summary report having a limited set of results with limited interpretative information. Improvements in the interpretation and reporting of genetic test results would be useful.
- Various embodiments of the invention are disclosed in the following detailed description and the accompanying drawings.
- FIG. 1 is a block diagram illustrating an embodiment of a system for determining an aggregate contribution to a characteristic for an individual with a specific marker measurement, for instance, a genotype.
- FIG. 2 is a diagram illustrating an embodiment of an interface for displaying contributions to a characteristic for an individual.
- FIG. 3 is a flow chart illustrating an embodiment of a process for summarizing an aggregate contribution to a characteristic for an individual.
- FIG. 4 is a flow chart illustrating an embodiment of a process for determining an aggregate genetic contribution.
- FIG. 5 is a flow chart illustrating an embodiment of a process for obtaining a normalized statistical factor.
- FIG. 6 is a diagram illustrating an embodiment of an interface for displaying contributions to a characteristic for multiple individuals.
- The invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. In general, the order of the steps of disclosed processes may be altered within the scope of the invention. Unless stated otherwise, a component such as a processor or a memory described as being configured to perform a task may be implemented as a general component that is temporarily configured to perform the task at a given time or a specific component that is manufactured to perform the task. As used herein, the term ‘processor’ refers to one or more devices, circuits, and/or processing cores configured to process data, such as computer program instructions.
- A detailed description of one or more embodiments of the invention is provided below along with accompanying figures that illustrate the principles of the invention. The invention is described in connection with such embodiments, but the invention is not limited to any embodiment. The scope of the invention is limited only by the claims and the invention encompasses numerous alternatives, modifications and equivalents. Numerous specific details are set forth in the following description in order to provide a thorough understanding of the invention. These details are provided for the purpose of example and the invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail so that the invention is not unnecessarily obscured.
- FIG. 1 is a block diagram illustrating an embodiment of a system for determining an aggregate contribution to a characteristic for an individual with a specific marker measurement, for instance, a genotype. The aggregate contribution may also allow an individual to add non-genetic factors to the calculation of the aggregate contribution, including other biomarkers, family history and environmental factors. A characteristic, as used herein, includes a phenotypic characteristic, trait, or condition of an individual. Examples of characteristics include eye color, ability to taste a bitter flavor in raw broccoli, type 2 diabetes, etc. A contribution, as used herein, refers to a measure of the association between a characteristic and an individual based on the individual's marker measurements, including genetics as well as any non-genetic information added by the individual. For example, a contribution may be an odds or probability that an individual has or will develop a characteristic.
- In the example shown, processor 108 interacts with genotype database 102, marker database 104, and statistics database 106. Using data obtained from these databases, processor 108 determines contributions associated with each of a set of one or more markers associated with a characteristic. Markers, as used herein, are measurable factors, categorical or quantitative, that are associated with characteristics of individuals. Markers may include genetic markers (e.g., single nucleotide polymorphisms (SNPs)), other biomarkers (e.g., epigenetic markers or cholesterol level), any other appropriate markers (e.g., family history or weight) or combinations of any of the above (e.g., the joint or conditional risk ratio for a SNP and an environmental condition such as smoking status). Contributions from each marker are aggregated at aggregator 110 into an aggregate contribution. The aggregate contribution and/or contributions from each marker are displayed on display 112.
- Although SNPs may be described in some of the examples herein, other markers may be used instead of or in addition to SNPs in various embodiments.
-
Genotype database 102 contains genetic information for a plurality of individuals, where genetic information could include genotyping data, such as SNP data for an individual. Marker database 104 contains information that measures the association between various markers and characteristics. For example, marker database 104 may include an entry that includes:
- Ethnicity: European
- Ages: 20-39
- Characteristic: type 2 diabetes
- Marker: TCF7L2
- Possible genotypes and odds ratio for each genotype: AA 1.0, AC 1.5, CC 2.0
In this example, the odds ratio indicates the odds that a person having that genotype together with a specified ethnicity and age range will have type 2 diabetes relative to the odds that a person having the AA genotype will have type 2 diabetes (which is why the odds ratio for AA is 1.0). This is the definition used in some association studies. Other traits may incorporate additional modifying non-genetic information, such as other biomarkers, family history, or environmental factors. In other embodiments, other statistical factors besides an odds ratio may be provided (e.g., risk ratio or hazard ratio). In some embodiments, data in marker database 104 is based on genetic studies, such as association studies or linkage studies. These are research studies that produce as output statistical factors (e.g., odds ratios) or deterministic rules that quantify the association between a marker associated with a characteristic and the characteristic. In some embodiments, statistical factors are stored for specified combinations of genotypes from two or more SNPs (e.g., the joint or conditional odds ratios for a pair of SNPs (i.e., an epistatic interaction)).
In some embodiments, entries in marker database 104 are generated by users or third parties. A third party, for example a geneticist, may annotate a putative association between a marker or markers and a characteristic. This annotation may take the form of a mapping from measurements of a marker or a set of markers to textual descriptions of the characteristic in question and/or to statistical factors. All such third party annotations will be noted as such and as not generated by the company.
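As an illustration only, the example entry above could be held in a small record type. The class name `MarkerEntry`, its field names, and the lookup method are assumptions made for this sketch; the text does not specify how marker database 104 stores its entries.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class MarkerEntry:
    """One hypothetical record of a marker database (all names are assumed)."""
    ethnicity: str
    age_range: Tuple[int, int]        # inclusive bounds, in years
    characteristic: str
    marker: str
    odds_ratios: Dict[str, float]     # genotype -> odds ratio vs. reference genotype

    def odds_ratio_for(self, genotype: str) -> float:
        # Raises KeyError for a genotype that the entry does not list.
        return self.odds_ratios[genotype]


# The example entry from the text: AA is the reference genotype (odds ratio 1.0).
entry = MarkerEntry(
    ethnicity="European",
    age_range=(20, 39),
    characteristic="type 2 diabetes",
    marker="TCF7L2",
    odds_ratios={"AA": 1.0, "AC": 1.5, "CC": 2.0},
)

print(entry.odds_ratio_for("AC"))  # 1.5
```

A third-party annotation could reuse the same shape with an extra provenance field, though that extension is likewise an assumption.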
Statistics database 106 contains information such as incidence or prevalence data for a characteristic (e.g., 10% of people of European ancestry between the ages of 20-39 are diagnosed with type 2 diabetes) and marker frequency data associated with the incidence data (e.g., of the people in the incidence data study, 20% had genotype AA, 60% had genotype AC, and 20% had genotype CC). The marker frequency data may come from the same source as the incidence data, or from a marker frequency database that can be mapped to the same population as the incidence data (e.g., the HapMap database). We define populations as granularly as the marker association, incidence, and genotype frequency data permit (e.g., European, Asian, and African).
The statistical information from statistics database 106 can be combined with the information in marker database 104 to obtain information that is more relevant to an individual with a particular set of marker measurements, as more fully described below.
In some embodiments, elements 102-110 are associated with a website hosted by a web server that is coupled to a network, such as the Internet. Display 112 is displayed by a web browser running on a client device that is coupled to the network.
FIG. 2 is a diagram illustrating an embodiment of an interface for displaying contributions to a characteristic for an individual. In the example shown, the characteristic is X and the individual is Greg Mendel (not his real name, although a real person available as part of our reference database for viewing). For example, X may be a trait (e.g., eye color) or condition (e.g., type 2 diabetes). In some embodiments, interface 200 is displayed by display 112 of FIG. 1.
Marker effects box 204 shows the approximate effects of Greg Mendel's genotype at four SNPs that are associated with characteristic X. The bar chart represents the change in odds that certain genotypes contribute to the estimated incidence of this characteristic for people with Greg Mendel's measurements at four markers. The horizontal line represents an estimate of the average person's probability of having or developing the characteristic in a particular population subset chosen by the user. Each bar represents the contribution of a single marker to the odds of having or developing the characteristic. In some embodiments, the bars are labeled with a SNP identifier (e.g., rs1234567). In some embodiments, the bars are labeled with the gene the SNP is closest to (e.g., TCF7L2). In some embodiments, the height of each bar is determined by the log of the odds ratio. In some embodiments, a bar may represent a combination of genotypes from SNPs (e.g., joint odds of genotypes from a pair of SNPs) or may represent the contribution of one genotype conditioned on another (e.g., the height of the bar for genotype AA of SNP X may vary depending on the genotype of SNP Y).
Higher, slashed bars indicate increased risk from the average, while lower, dotted bars indicate decreased risk from the average. In this example, an increased risk compared to the average is associated with marker M2 and a decreased risk compared to the average is associated with markers M1, M3, and M4. The white bars show the maximum possible effects for the possible genotypes at the marker or the full range of contributions to risk made by a given SNP. As shown, hovering a cursor over a bar causes a tooltip (small box) to be displayed that indicates the odds for Greg Mendel compared to the average odds for that marker. In this example, the cursor is hovering over the bar for marker M4, and the tooltip indicates that Greg Mendel's odds of getting or having characteristic X are 0.92 times the average. Examples of how these computations are made are more fully described below.
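The relationship between a marker's odds ratio, its bar height, and its tooltip can be sketched as follows. This is a minimal illustration, not the interface's actual rendering code: only the 0.92 value for M4 comes from the text, the values for M1-M3 are invented placeholders chosen to match the described directions, and the function names are assumptions.

```python
import math

# Odds relative to the average odds, per marker.
# Only M4 = 0.92 appears in the text; M1-M3 are made-up placeholder values.
normalized_or = {"M1": 0.85, "M2": 1.30, "M3": 0.95, "M4": 0.92}


def bar_height(odds_ratio: float) -> float:
    """Signed bar height: the natural log of the odds ratio (0 means average)."""
    return math.log(odds_ratio)


def tooltip(marker: str, odds_ratio: float) -> str:
    """Text for the hover box, stating the direction of the effect."""
    if odds_ratio > 1.0:
        direction = "increased risk"
    elif odds_ratio < 1.0:
        direction = "decreased risk"
    else:
        direction = "average risk"
    return f"{marker}: {odds_ratio:.2f} times the average odds ({direction})"


for marker, ratio in normalized_or.items():
    print(f"{tooltip(marker, ratio)}; bar height {bar_height(ratio):+.3f}")
```

Using the logarithm makes an odds ratio of 2.0 and one of 0.5 produce bars of equal size in opposite directions, which is one plausible reason for the log scale mentioned above.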
Aggregate contribution box 202 shows the aggregate contribution of the effects from all the markers (M1-M4 in this example). Incidence is a measure of how often people in a population develop or are diagnosed with a particular condition in a given period of time, and is usually measured in events per person-year. Aggregate contribution box 202 shows an estimate of incidence in two different contexts: average incidence, and a genotype-specific incidence value.
Incidence may also be thought of as an individual's chance of being diagnosed with a condition during a given period of time (assuming that he or she did not have the condition to begin with). A 100% chance is a sure thing. A 0% chance means that the event will never occur. Saying that “25 people out of 100” will be diagnosed with a condition over a given time period is another way to describe a 25% chance of getting it (or odds of 1-to-3 on getting it).
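The conversion between a chance and odds in the last sentence can be checked with exact arithmetic. The sketch below is illustrative; the function names are assumptions, and `Fraction` is used only so that the 1-to-3 result prints exactly.

```python
from fractions import Fraction


def probability_to_odds(p: Fraction) -> Fraction:
    """Odds in favor of an event with probability p (requires 0 <= p < 1)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


def odds_to_probability(odds: Fraction) -> Fraction:
    """Probability of an event whose odds in favor are `odds` (requires odds >= 0)."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1 + odds)


chance = Fraction(25, 100)            # "25 people out of 100"
odds = probability_to_odds(chance)
print(odds)                           # 1/3, i.e. odds of 1-to-3
print(odds_to_probability(odds))      # 1/4, i.e. back to a 25% chance
```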
The marker-measurement-specific incidence value (Greg Mendel's incidence value) is an estimate of how many individuals in a population composed of people with a particular set of marker measurements are likely to be diagnosed with a condition over a given age range and with a given ethnicity. In some embodiments, the estimate is based on the current state of biomedical literature and is related only to a particular genotype, age or age range, and ethnicity, but not to the environment. In some embodiments, non-genetic factors, such as other biomarkers, family history, and environmental markers are also accounted for. Examples of computing a genotype-specific incidence value are discussed more fully below.
Interface 200 includes a pull-down menu to select markers, including ethnicity and age range. In this example, if a different ethnicity and/or age range is selected, the aggregate contributions shown in box 202 and the marker effects shown in box 204 will be updated to reflect the new ethnicity and/or age range. The available ethnicities and age ranges depend on the availability of this data from scientific studies. In some embodiments, the ethnicity may be determined based on a user's designated ethnicity or automatically determined based on a user's genetic data. Similarly, the age of the user may be provided by the user. In some cases, the pull-down menu defaults to the designated or automatically determined ethnicity, and an appropriate age range.
FIG. 3 is a flow chart illustrating an embodiment of a process for summarizing an aggregate contribution to a characteristic for an individual. In some embodiments, this process is used to determine an aggregate contribution to characteristic X for Greg Mendel to be displayed in box 202.
At 302, a characteristic to be evaluated is input. For example, to get to interface 200, a user (e.g., Greg Mendel) may choose from a list of available characteristics, such as type 2 diabetes or Crohn's disease.
At 304, one or more markers associated with the characteristic are identified. For example, in FIG. 2, for characteristic X, markers M1, M2, M3, and M4 are identified since they are associated with characteristic X through genetic association studies. In some embodiments, the markers are identified using a marker database, which includes data about which markers are associated with various characteristics.
At 306, the genotype of an individual for each of the one or more SNPs is retrieved from a database of individuals' marker measurements. For example, Greg Mendel's genotypes at the relevant SNPs for characteristic X (i.e., M1, M2, M3, and M4) are retrieved from genotype database 102.
At 308, for each marker, a statistical factor that measures the association between the marker and the characteristic is retrieved. For example, the statistical factor that is retrieved from marker database 104 is an odds ratio. In some embodiments, the odds ratio is specific to a selected ethnicity and/or age range of an individual. In some embodiments, the odds ratio is computed relative to the odds of the lowest risk marker. In some embodiments, the odds ratio is a normalized odds ratio computed relative to the average odds. The normalized odds ratio accounts for statistical data about disease incidence and genotype frequency, as more fully described below.
At 310, an aggregate genetic contribution is determined based on the retrieved statistical factors. In some embodiments, the aggregate genetic contribution is determined by multiplying the individual marker genetic contributions (statistical factors) with each other, as more fully described below.
At 312, the aggregate genetic contribution is displayed. For example, in FIG. 2, box 202 displays the aggregate genetic contribution. In some embodiments, the individual marker contribution is displayed, as shown in box 204 of FIG. 2.
In some embodiments, steps 304-308 are performed by processor 108, step 310 is performed by aggregator 110, and step 312 is performed by display 112.
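Steps 302-312 can be sketched end to end with in-memory dictionaries standing in for the databases. Everything named here is hypothetical: the dictionary layouts, the individual key `"greg"`, the genotypes, and all odds-ratio values except 0.92 for M4 are invented for illustration, and the multiplication at step 310 follows the "in some embodiments" description above rather than a confirmed implementation.

```python
from math import prod

# Hypothetical stand-ins for the marker and genotype data stores.
MARKERS_BY_CHARACTERISTIC = {"X": ["M1", "M2", "M3", "M4"]}
GENOTYPES = {"greg": {"M1": "AA", "M2": "CC", "M3": "AC", "M4": "AC"}}
# (marker, genotype) -> normalized odds ratio relative to the average odds.
STATISTICAL_FACTORS = {
    ("M1", "AA"): 0.85,
    ("M2", "CC"): 1.30,
    ("M3", "AC"): 0.95,
    ("M4", "AC"): 0.92,
}


def aggregate_contribution(individual: str, characteristic: str) -> float:
    """Return the product of the individual's per-marker statistical factors."""
    markers = MARKERS_BY_CHARACTERISTIC[characteristic]              # step 304
    genotypes = {m: GENOTYPES[individual][m] for m in markers}       # step 306
    factors = [STATISTICAL_FACTORS[(m, g)]                           # step 308
               for m, g in genotypes.items()]
    return prod(factors)                                             # step 310


result = aggregate_contribution("greg", "X")     # step 302: characteristic X chosen
print(f"Aggregate odds relative to average: {result:.3f}")           # step 312
```

The result here is still an odds ratio relative to the average; turning it into a probability requires a baseline risk, as described for FIG. 4.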
FIG. 4 is a flow chart illustrating an embodiment of a process for determining an aggregate genetic contribution. In some embodiments, this process is used to perform step 310 of FIG. 3. At 402, each normalized statistical factor is applied to a baseline risk to obtain a set of marker-measurement-specific statistical factors. In some embodiments, as previously described, the statistical factor is an odds ratio. In some embodiments, baseline risk is average risk. As used herein, risk can include incidence, prevalence, or any other measure of risk. Prevalence is defined as the total number of cases of a characteristic in the population at a given time, or the total number of cases in the population, divided by the number of individuals in the population.
In some embodiments, each normalized statistical factor is a marker-specific contribution. Examples of marker-specific contributions are shown in box 204 of FIG. 2. For example, the marker-specific contribution for marker M4 is 0.92 times the average odds; that is, the odds of someone with Greg Mendel's genotype getting or having characteristic X are 0.92 times the average.
In some embodiments, a marker-specific contribution is a genotype-specific risk, i.e., the probability that an individual is affected given their genotype, Pr(D|Gm), computed according to FIG. 5.
FIG. 5 is a flow chart illustrating an embodiment of a process for obtaining a normalized statistical factor. In some embodiments, this process is used to perform step 402 of FIG. 4.
In this example, assume a binary characteristic D and a single associated SNP at which there are three possible genotypes G1, G2, and G3, which have odds ratios OR1, OR2, and OR3, respectively. G1 is the genotype with the lowest odds ratio, OR1, and G3 is the genotype with the highest odds ratio, OR3. For example, G1, G2, and G3 could be AA, AC, and CC and OR1, OR2, and OR3 could be 1.0, 1.5, and 2.0. A binary characteristic is a characteristic that has one of two possible values. The individual for whom to estimate the incidence has genotype Gm: m ∈ {1, 2, 3}. The quantity to compute is Pr(D|Gm), or the probability that the individual is affected given their genotype. Although a binary characteristic and three genotypes are shown in this example, in other embodiments, any other type of characteristic and number of genotypes may be used.
At 502, estimates of incidence of the characteristic are obtained. For example, the unconditional risk for the characteristic, denoted Pr(D), is obtained for a subpopulation of which the individual is a member. For instance, this may be an estimate of the incidence of type 2 diabetes for Asian subjects between the ages of 20 and 40. This can be obtained from public data sources, such as data collected on incidence and/or prevalence from Centers for Disease Control (CDC) data, or published epidemiological data, for example from disease consortia.
At 504, genotype frequencies are obtained. The genotype frequency is the percentage of the studied population having each possible genotype (G1, G2, and G3). For example, estimates of the three genotype frequencies Pr(Gi) are obtained for the same subpopulation. This information can be obtained from databases that store genotype frequencies for different ethnicities, such as dbSNP or SNP500Cancer.
At 506, odds ratio estimates are obtained. In some embodiments, the odds ratio estimates are obtained from marker database 104. Odds ratio estimates in marker database 104 may be based on association studies or may be user generated, as previously described. For example, estimates of the three genotype-specific odds ratios OR1, OR2, and OR3 are obtained, where OR1, OR2, and OR3 are odds ratios relative to the lowest odds ratio (OR1), so that OR1=1. In some embodiments, these odds ratios are the statistical factors referred to at 402.
At 508, normalized odds ratio estimates are computed based on genotype frequencies and estimates of incidence. In some embodiments, this is performed as follows. Pr(D|Gm) is computed by solving the following system of equations:
$$\Pr(D) = \sum_{i=1}^{3} \Pr(D \mid G_i)\,\Pr(G_i) \tag{1}$$
$$OR_2 = \frac{\Pr(D \mid G_2)\,/\,\bigl(1 - \Pr(D \mid G_2)\bigr)}{\Pr(D \mid G_1)\,/\,\bigl(1 - \Pr(D \mid G_1)\bigr)} \tag{2}$$
$$OR_3 = \frac{\Pr(D \mid G_3)\,/\,\bigl(1 - \Pr(D \mid G_3)\bigr)}{\Pr(D \mid G_1)\,/\,\bigl(1 - \Pr(D \mid G_1)\bigr)} \tag{3}$$
Equation 1 follows from basic probability theory, and Equations 2 and 3 follow from the definition of an odds ratio. The estimated genotype-specific risk at the single locus is then Pr(D|Gm).
In Equations 1, 2, and 3, the quantities OR2, OR3, Pr(D), Pr(G1), Pr(G2), and Pr(G3) are known. The quantities of interest, Pr(D|G1), Pr(D|G2), and Pr(D|G3), are unknown. Pr(D|Gm) can be calculated in a number of ways, including with a numerical root solver, such as Newton-Raphson.
To obtain the genotype-specific risk relative to the average risk in the population (or normalized odds ratio), compute the quantity OR*m = odds(D|Gm)/odds(D), where, for brevity, we have introduced the function odds(X) = Pr(X)/(1 − Pr(X)). The inverse odds function, which will be used later on, maps an odds value x back to a probability: odds^{-1}(x) = x/(1 + x). The superscript asterisk on OR*m is used to distinguish an odds ratio computed relative to the average odds, rather than relative to the lowest odds ratio, which is the definition used in some association studies.
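One way to solve Equations 1, 2, and 3 numerically is sketched below. It is an assumption-laden illustration, not the patented procedure: it substitutes simple bisection for the Newton-Raphson solver mentioned above, the function names are invented, and the inputs reuse the example figures quoted earlier (10% incidence, genotype frequencies of 20%, 60%, and 20%, and odds ratios of 1.0, 1.5, and 2.0). The idea is to write odds(D|Gi) as ORi times the unknown odds(D|G1) and search for the value that makes the frequency-weighted risks sum to Pr(D).

```python
def odds(p: float) -> float:
    """odds(X) = Pr(X) / (1 - Pr(X))."""
    return p / (1.0 - p)


def inverse_odds(o: float) -> float:
    """Map an odds value back to a probability: o / (1 + o)."""
    return o / (1.0 + o)


def genotype_risks(pr_d, freqs, odds_ratios, tol=1e-12):
    """Return [Pr(D|G_i)] given Pr(D), Pr(G_i), and odds ratios relative to G_1."""
    if not 0.0 < pr_d < 1.0:
        raise ValueError("pr_d must lie strictly between 0 and 1")

    def total_risk(o1):
        # Equation 1 with Pr(D|G_i) expressed through odds(D|G_i) = OR_i * o1.
        return sum(f * inverse_odds(r * o1) for f, r in zip(freqs, odds_ratios))

    lo, hi = 0.0, 1.0
    while total_risk(hi) < pr_d:       # widen the bracket until it holds the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = (lo + hi) / 2.0
        if total_risk(mid) < pr_d:
            lo = mid
        else:
            hi = mid
    o1 = (lo + hi) / 2.0
    return [inverse_odds(r * o1) for r in odds_ratios]


pr_d = 0.10                       # incidence in the subpopulation
freqs = [0.20, 0.60, 0.20]        # Pr(G1), Pr(G2), Pr(G3)
ors = [1.0, 1.5, 2.0]             # OR1, OR2, OR3 relative to G1
risks = genotype_risks(pr_d, freqs, ors)
normalized = [odds(p) / odds(pr_d) for p in risks]   # OR*_m for each genotype
print([round(p, 4) for p in risks])
print([round(r, 3) for r in normalized])
```

Bisection works here because the weighted risk increases monotonically with odds(D|G1), so the root is unique; a Newton-type solver would typically reach it in fewer iterations.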
Returning to FIG. 4, at 404, the normalized statistical factors are combined to obtain an aggregate contribution. In some embodiments, an aggregate contribution is a probability $\Pr(D \mid \cap_{k=1}^{K} G_{m_k,k})$ computed as follows.
Assume that the composite odds ratio, $OR_C$, is given by the product of the individual's odds ratios at each locus. Assume that there are K loci of interest, and denote the odds ratio of the ith genotype at the kth locus $OR_{i,k}$. Similarly, the normalized odds ratios are denoted $OR^*_{i,k}$. Denote the genotypes at the kth locus $G_{1,k}$, $G_{2,k}$, $G_{3,k}$ and denote the individual's genotype at the kth locus $G_{m_k,k}$. Thus
$$OR^*_C = \prod_{k=1}^{K} OR^*_{m_k,k}. \tag{4}$$
The quantity $OR^*_C$ has the interpretation
$$OR^*_C = \frac{\operatorname{odds}\bigl(D \mid \cap_{k=1}^{K} G_{m_k,k}\bigr)}{\operatorname{odds}(D)}, \tag{5}$$
where $\operatorname{odds}(D \mid \cap_{k=1}^{K} G_{m_k,k})$ is the odds for the individual's multilocus genotype. In computing the product, we implicitly assume that the point where log(odds(D)) intersects the respective logistic regression line for each locus is the same point in each of the individual regression calculations, as would be true if a multiple logistic regression had been performed. Then
$$\Pr\bigl(D \mid \cap_{k=1}^{K} G_{m_k,k}\bigr) = \operatorname{odds}^{-1}\bigl[OR^*_C \, \operatorname{odds}(D)\bigr]. \tag{6}$$
An example of an aggregate contribution $\Pr(D \mid \cap_{k=1}^{K} G_{m_k,k})$ is shown in box 202 of FIG. 2. For example, Greg Mendel has a 2.9 in 100 probability of getting characteristic X given his genotype for markers M1-M4.
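The final step, multiplying the normalized odds ratios and converting the result back to a probability as in Equation 6, can be sketched as follows. The function names and the numeric inputs are assumptions chosen for easy hand-checking; they are not the values behind the 2.9 in 100 figure, which the text does not provide.

```python
from math import prod
from typing import Sequence


def odds(p: float) -> float:
    """odds(X) = Pr(X) / (1 - Pr(X))."""
    return p / (1.0 - p)


def inverse_odds(o: float) -> float:
    """Map an odds value back to a probability: o / (1 + o)."""
    return o / (1.0 + o)


def aggregate_probability(baseline_risk: float,
                          normalized_ors: Sequence[float]) -> float:
    """Pr(D | multilocus genotype) from a baseline risk and per-locus OR* values."""
    composite = prod(normalized_ors)                       # composite odds ratio
    return inverse_odds(composite * odds(baseline_risk))   # Equation 6


baseline = 0.10               # hypothetical average risk Pr(D)
example_ors = [1.2, 0.9]      # hypothetical normalized odds ratios at two loci
print(round(aggregate_probability(baseline, example_ors), 4))   # 0.1071
```

By hand: the baseline odds are 0.1/0.9, the composite ratio is 1.08, the individual's odds are 0.12, and 0.12/1.12 is roughly 0.107. With an empty list of loci the function returns the baseline risk unchanged, since the empty product is 1.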
FIG. 6 is a diagram illustrating an embodiment of an interface for displaying contributions to a characteristic for multiple individuals. In some embodiments, interface 200 is displayed by display 112 of FIG. 1. In the example shown, the characteristic is X and the individuals are Greg Mendel (the user) and other individuals who have given permission to allow Greg Mendel to view their aggregate contributions to characteristic X. For example, Greg Mendel may have family and/or friends who have enabled sharing of their aggregate contributions to characteristic X with Greg. In this example, Lilly Mendel (his mother) and John Lee (a friend) have enabled sharing with Greg Mendel.
Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, the invention is not limited to the details provided. There are many alternative ways of implementing the invention. The disclosed embodiments are illustrative and not restrictive.
Claims (22)
1. A method for summarizing an aggregate contribution to a phenotypic characteristic for an individual, comprising:
receiving information pertaining to the phenotypic characteristic of an individual;
identifying, using one or more computer processors, a set of one or more markers associated with the phenotypic characteristic;
obtaining a set of one or more marker measurements of the individual that corresponds to the set of one or more markers;
obtaining a set of one or more statistical factors that measure associations between the set of one or more markers and the phenotypic characteristic;
determining an aggregate contribution to the phenotypic characteristic of the individual based at least in part on the retrieved set of one or more statistical factors; and
outputting a display characteristic to be displayed that is associated with the aggregate contribution to the phenotypic characteristic for the individual.
2. The method of claim 1, wherein at least one marker is a single nucleotide polymorphism (SNP).
3. The method of claim 1, wherein the statistical factor includes an odds ratio or a normalized odds ratio.
4. The method of claim 1, wherein the aggregate contribution includes a probability, an odds ratio, or a risk.
5. The method of claim 1, wherein the one or more statistical factors includes one or more statistical factors based on ancestral group, ethnicity, or age group.
6. The method of claim 1, wherein determining the aggregate contribution includes applying each statistical factor to a baseline risk.
7. The method of claim 1, wherein determining the aggregate contribution includes combining the retrieved statistical factors for the markers.
8. The method of claim 1, wherein the set of one or more markers includes a biomarker or a genetic marker.
9. The method of claim 1, further including displaying the display characteristic.
10. The method of claim 1, further including displaying a baseline risk and a risk specific to the individual.
11. The method of claim 1, further including displaying effects of the set of one or more markers associated with the phenotypic characteristic.
12. A system for summarizing an aggregate contribution to a phenotypic characteristic for an individual, comprising:
one or more processors configured to:
receive information pertaining to the phenotypic characteristic of an individual;
identify a set of one or more markers associated with the phenotypic characteristic;
obtain a set of one or more marker measurements of the individual that corresponds to the set of one or more markers;
determine a set of one or more statistical factors that measure associations between the set of one or more markers and the phenotypic characteristic;
determine an aggregate contribution to the phenotypic characteristic of the individual based at least in part on the retrieved set of one or more statistical factors; and
output a display characteristic to be displayed that is associated with the aggregate contribution to the phenotypic characteristic for the individual; and
one or more memories coupled to the one or more processors and configured to provide the one or more processors with instructions.
13. The system of claim 12, wherein at least one marker is a single nucleotide polymorphism (SNP).
14. The system of claim 12, wherein the statistical factor includes an odds ratio or a normalized odds ratio.
15. The system of claim 12, wherein the aggregate contribution includes a probability, an odds ratio, or a risk.
16. The system of claim 12, wherein the one or more statistical factors includes one or more statistical factors based on ancestral group, ethnicity, or age group.
17. The system of claim 12, wherein to determine the aggregate contribution includes to apply each statistical factor to a baseline risk.
18. The system of claim 12, wherein to determine the aggregate contribution includes to combine the retrieved statistical factors for the markers.
19. The system of claim 12, wherein the set of one or more markers includes a biomarker or a genetic marker.
20. The system of claim 12, wherein the set of one or more processors are further configured to cause to be displayed a baseline risk and a risk specific to the individual.
21. The system of claim 12, wherein the set of one or more processors are further configured to cause to be displayed effects of the set of one or more markers associated with the phenotypic characteristic.
22. A computer program product for summarizing an aggregate contribution to a phenotypic characteristic for an individual, the computer program product being embodied in a computer readable medium and comprising computer instructions for:
receiving information pertaining to the phenotypic characteristic of an individual;
identifying, using one or more computer processors, a set of one or more markers associated with the phenotypic characteristic;
obtaining a set of one or more marker measurements of the individual that corresponds to the set of one or more markers;
determining a set of one or more statistical factors that measure associations between the set of one or more markers and the phenotypic characteristic;
determining an aggregate contribution to the phenotypic characteristic of the individual based at least in part on the retrieved set of one or more statistical factors; and
outputting a display characteristic to be displayed that is associated with the aggregate contribution to the phenotypic characteristic for the individual.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/932,513 US20130345988A1 (en) | 2007-10-15 | 2013-07-01 | Summarizing an aggregate contribution to a characteristic for an individual |
US15/621,985 US20170277828A1 (en) | 2007-10-15 | 2017-06-13 | Summarizing an aggregate contribution to a characteristic for an individual |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US99917507P | 2007-10-15 | 2007-10-15 | |
US12/151,977 US8510057B1 (en) | 2007-10-15 | 2008-05-08 | Summarizing an aggregate contribution to a characteristic for an individual |
US13/932,513 US20130345988A1 (en) | 2007-10-15 | 2013-07-01 | Summarizing an aggregate contribution to a characteristic for an individual |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/151,977 Continuation US8510057B1 (en) | 2007-10-15 | 2008-05-08 | Summarizing an aggregate contribution to a characteristic for an individual |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/621,985 Continuation US20170277828A1 (en) | 2007-10-15 | 2017-06-13 | Summarizing an aggregate contribution to a characteristic for an individual |
Publications (1)
Publication Number | Publication Date |
---|---|
US20130345988A1 (en) | 2013-12-26 |
Family ID: 48916724
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/151,977 Active 2030-03-31 US8510057B1 (en) | 2007-10-15 | 2008-05-08 | Summarizing an aggregate contribution to a characteristic for an individual |
US13/932,513 Abandoned US20130345988A1 (en) | 2007-10-15 | 2013-07-01 | Summarizing an aggregate contribution to a characteristic for an individual |
US15/621,985 Abandoned US20170277828A1 (en) | 2007-10-15 | 2017-06-13 | Summarizing an aggregate contribution to a characteristic for an individual |
Country Status (1)
Country | Link |
---|---|
US (3) | US8510057B1 (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9213947B1 (en) * | 2012-11-08 | 2015-12-15 | 23Andme, Inc. | Scalable pipeline for local ancestry inference |
US9367800B1 (en) * | 2012-11-08 | 2016-06-14 | 23Andme, Inc. | Ancestry painting with local ancestry inference |
US10025877B2 (en) | 2012-06-06 | 2018-07-17 | 23Andme, Inc. | Determining family connections of individuals in a database |
US10162880B1 (en) | 2011-10-11 | 2018-12-25 | 23Andme, Inc. | Cohort selection with privacy protection |
US10275569B2 (en) | 2007-10-15 | 2019-04-30 | 22andMe, Inc. | Family inheritance |
US10432640B1 (en) | 2007-10-15 | 2019-10-01 | 23Andme, Inc. | Genome sharing |
US10437858B2 (en) | 2011-11-23 | 2019-10-08 | 23Andme, Inc. | Database and data processing system for use with a network-based personal genetics services platform |
US10854318B2 (en) | 2008-12-31 | 2020-12-01 | 23Andme, Inc. | Ancestry finder |
US10896742B2 (en) | 2018-10-31 | 2021-01-19 | Ancestry.Com Dna, Llc | Estimation of phenotypes using DNA, pedigree, and historical data |
US11348692B1 (en) | 2007-03-16 | 2022-05-31 | 23Andme, Inc. | Computer implemented identification of modifiable attributes associated with phenotypic predispositions in a genetics platform |
US11514085B2 (en) | 2008-12-30 | 2022-11-29 | 23Andme, Inc. | Learning system for pangenetic-based recommendations |
US11514627B2 (en) | 2019-09-13 | 2022-11-29 | 23Andme, Inc. | Methods and systems for determining and displaying pedigrees |
US11783919B2 (en) | 2020-10-09 | 2023-10-10 | 23Andme, Inc. | Formatting and storage of genetic markers |
US11817176B2 (en) | 2020-08-13 | 2023-11-14 | 23Andme, Inc. | Ancestry composition determination |
US12046327B1 (en) | 2019-07-19 | 2024-07-23 | 23Andme, Inc. | Identity-by-descent relatedness based on focal and reference segments |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8510057B1 (en) * | 2007-10-15 | 2013-08-13 | 23Andme, Inc. | Summarizing an aggregate contribution to a characteristic for an individual |
MX2017004978A (en) | 2014-10-17 | 2017-09-13 | Ancestry Com Dna Llc | Haplotype phasing models. |
US10395759B2 (en) | 2015-05-18 | 2019-08-27 | Regeneron Pharmaceuticals, Inc. | Methods and systems for copy number variant detection |
AU2016272732B2 (en) | 2015-05-30 | 2021-02-25 | Ancestry.Com Dna, Llc | Discovering population structure from patterns of identity-by-descent |
AU2016293485B2 (en) | 2015-07-13 | 2021-05-13 | Ancestry.Com Dna, Llc | Local genetic ethnicity determination system |
KR102341129B1 (en) | 2016-02-12 | 2021-12-21 | 리제너론 파마슈티칼스 인코포레이티드 | Methods and systems for detecting abnormal karyotypes |
US20180190384A1 (en) * | 2017-01-05 | 2018-07-05 | Clear Genetics, Inc. | Automated genetic test counseling |
JP2021521511A (en) | 2018-04-05 | 2021-08-26 | アンセストリー ドットコム ディーエヌエー リミテッド ライアビリティ カンパニー | Identity network by descent and community allocation in gene mutation development |
CN112585688A (en) | 2018-06-19 | 2021-03-30 | Dna家族网有限责任公司 | Filtering genetic networks to discover populations of interest |
US12050629B1 (en) | 2019-08-02 | 2024-07-30 | Ancestry.Com Dna, Llc | Determining data inheritance of data segments |
US11429615B2 (en) | 2019-12-20 | 2022-08-30 | Ancestry.Com Dna, Llc | Linking individual datasets to a database |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8510057B1 (en) * | 2007-10-15 | 2013-08-13 | 23Andme, Inc. | Summarizing an aggregate contribution to a characteristic for an individual |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2363874B (en) | 1999-11-06 | 2004-08-04 | Dennis Sunga Fernandez | Bioinformatic transaction scheme |
US20020133495A1 (en) | 2000-03-16 | 2002-09-19 | Rienhoff Hugh Y. | Database system and method |
US7085834B2 (en) | 2000-12-22 | 2006-08-01 | Oracle International Corporation | Determining a user's groups |
US20020128860A1 (en) | 2001-01-04 | 2002-09-12 | Leveque Joseph A. | Collecting and managing clinical information |
US7062752B2 (en) | 2001-08-08 | 2006-06-13 | Hewlett-Packard Development Company, L.P. | Method, system and program product for multi-profile operations and expansive profile operation |
US20050027560A1 (en) | 2003-07-28 | 2005-02-03 | Deborah Cook | Interactive multi-user medication and medical history management method |
US8554876B2 (en) | 2004-01-23 | 2013-10-08 | Hewlett-Packard Development Company, L.P. | User profile service |
US7984421B2 (en) | 2006-10-03 | 2011-07-19 | Ning, Inc. | Web application cloning |
Non-Patent Citations (1)
Title |
---|
Morrison et al. Am J Epidemiology 2007, 166:1 28-35, online publication April 2007. cited in parent application. * |
Cited By (68)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11515047B2 (en) | 2007-03-16 | 2022-11-29 | 23Andme, Inc. | Computer implemented identification of modifiable attributes associated with phenotypic predispositions in a genetics platform |
US11600393B2 (en) | 2007-03-16 | 2023-03-07 | 23Andme, Inc. | Computer implemented modeling and prediction of phenotypes |
US12106862B2 (en) | 2007-03-16 | 2024-10-01 | 23Andme, Inc. | Determination and display of likelihoods over time of developing age-associated disease |
US11495360B2 (en) | 2007-03-16 | 2022-11-08 | 23Andme, Inc. | Computer implemented identification of treatments for predicted predispositions with clinician assistance |
US11348691B1 (en) | 2007-03-16 | 2022-05-31 | 23Andme, Inc. | Computer implemented predisposition prediction in a genetics platform |
US11348692B1 (en) | 2007-03-16 | 2022-05-31 | 23Andme, Inc. | Computer implemented identification of modifiable attributes associated with phenotypic predispositions in a genetics platform |
US11515046B2 (en) | 2007-03-16 | 2022-11-29 | 23Andme, Inc. | Treatment determination and impact analysis |
US11482340B1 (en) | 2007-03-16 | 2022-10-25 | 23Andme, Inc. | Attribute combination discovery for predisposition determination of health conditions |
US11621089B2 (en) | 2007-03-16 | 2023-04-04 | 23Andme, Inc. | Attribute combination discovery for predisposition determination of health conditions |
US11545269B2 (en) | 2007-03-16 | 2023-01-03 | 23Andme, Inc. | Computer implemented identification of genetic similarity |
US11581098B2 (en) | 2007-03-16 | 2023-02-14 | 23Andme, Inc. | Computer implemented predisposition prediction in a genetics platform |
US11791054B2 (en) | 2007-03-16 | 2023-10-17 | 23Andme, Inc. | Comparison and identification of attribute similarity based on genetic markers |
US11581096B2 (en) | 2007-03-16 | 2023-02-14 | 23Andme, Inc. | Attribute identification based on seeded learning |
US11735323B2 (en) | 2007-03-16 | 2023-08-22 | 23Andme, Inc. | Computer implemented identification of genetic similarity |
US12243654B2 (en) | 2007-03-16 | 2025-03-04 | 23Andme, Inc. | Computer implemented identification of genetic similarity |
US10841312B2 (en) | 2007-10-15 | 2020-11-17 | 23Andme, Inc. | Genome sharing |
US10275569B2 (en) | 2007-10-15 | 2019-04-30 | 23Andme, Inc. | Family inheritance |
US12088592B2 (en) | 2007-10-15 | 2024-09-10 | 23Andme, Inc. | Genome sharing |
US11683315B2 (en) | 2007-10-15 | 2023-06-20 | 23Andme, Inc. | Genome sharing |
US10643740B2 (en) | 2007-10-15 | 2020-05-05 | 23Andme, Inc. | Family inheritance |
US10516670B2 (en) | 2007-10-15 | 2019-12-24 | 23Andme, Inc. | Genome sharing |
US11875879B1 (en) | 2007-10-15 | 2024-01-16 | 23Andme, Inc. | Window-based method for determining inherited segments |
US10999285B2 (en) | 2007-10-15 | 2021-05-04 | 23Andme, Inc. | Genome sharing |
US11170873B2 (en) | 2007-10-15 | 2021-11-09 | 23Andme, Inc. | Genetic comparisons between grandparents and grandchildren |
US10432640B1 (en) | 2007-10-15 | 2019-10-01 | 23Andme, Inc. | Genome sharing |
US11171962B2 (en) | 2007-10-15 | 2021-11-09 | 23Andme, Inc. | Genome sharing |
US10296847B1 (en) | 2008-03-19 | 2019-05-21 | 23Andme, Inc. | Ancestry painting with local ancestry inference |
US11531445B1 (en) | 2008-03-19 | 2022-12-20 | 23Andme, Inc. | Ancestry painting |
US11803777B2 (en) | 2008-03-19 | 2023-10-31 | 23Andme, Inc. | Ancestry painting |
US12033046B2 (en) | 2008-03-19 | 2024-07-09 | 23Andme, Inc. | Ancestry painting |
US11625139B2 (en) | 2008-03-19 | 2023-04-11 | 23Andme, Inc. | Ancestry painting |
US11514085B2 (en) | 2008-12-30 | 2022-11-29 | 23Andme, Inc. | Learning system for pangenetic-based recommendations |
US10854318B2 (en) | 2008-12-31 | 2020-12-01 | 23Andme, Inc. | Ancestry finder |
US11657902B2 (en) | 2008-12-31 | 2023-05-23 | 23Andme, Inc. | Finding relatives in a database |
US11508461B2 (en) | 2008-12-31 | 2022-11-22 | 23Andme, Inc. | Finding relatives in a database |
US11322227B2 (en) | 2008-12-31 | 2022-05-03 | 23Andme, Inc. | Finding relatives in a database |
US12100487B2 (en) | 2008-12-31 | 2024-09-24 | 23Andme, Inc. | Finding relatives in a database |
US11935628B2 (en) | 2008-12-31 | 2024-03-19 | 23Andme, Inc. | Finding relatives in a database |
US11049589B2 (en) | 2008-12-31 | 2021-06-29 | 23Andme, Inc. | Finding relatives in a database |
US11031101B2 (en) | 2008-12-31 | 2021-06-08 | 23Andme, Inc. | Finding relatives in a database |
US11468971B2 (en) | 2008-12-31 | 2022-10-11 | 23Andme, Inc. | Ancestry finder |
US11776662B2 (en) | 2008-12-31 | 2023-10-03 | 23Andme, Inc. | Finding relatives in a database |
US10162880B1 (en) | 2011-10-11 | 2018-12-25 | 23Andme, Inc. | Cohort selection with privacy protection |
US10891317B1 (en) | 2011-10-11 | 2021-01-12 | 23Andme, Inc. | Cohort selection with privacy protection |
US11748383B1 (en) | 2011-10-11 | 2023-09-05 | 23Andme, Inc. | Cohort selection with privacy protection |
US10936626B1 (en) | 2011-11-23 | 2021-03-02 | 23Andme, Inc. | Database and data processing system for use with a network-based personal genetics services platform |
US10437858B2 (en) | 2011-11-23 | 2019-10-08 | 23Andme, Inc. | Database and data processing system for use with a network-based personal genetics services platform |
US10691725B2 (en) | 2011-11-23 | 2020-06-23 | 23Andme, Inc. | Database and data processing system for use with a network-based personal genetics services platform |
US10025877B2 (en) | 2012-06-06 | 2018-07-17 | 23Andme, Inc. | Determining family connections of individuals in a database |
US11170047B2 (en) | 2012-06-06 | 2021-11-09 | 23Andme, Inc. | Determining family connections of individuals in a database |
US10572831B1 (en) | 2012-11-08 | 2020-02-25 | 23Andme, Inc. | Ancestry painting with local ancestry inference |
US9213947B1 (en) * | 2012-11-08 | 2015-12-15 | 23Andme, Inc. | Scalable pipeline for local ancestry inference |
US9836576B1 (en) | 2012-11-08 | 2017-12-05 | 23Andme, Inc. | Phasing of unphased genotype data |
US9367800B1 (en) * | 2012-11-08 | 2016-06-14 | 23Andme, Inc. | Ancestry painting with local ancestry inference |
US11521708B1 (en) | 2012-11-08 | 2022-12-06 | 23Andme, Inc. | Scalable pipeline for local ancestry inference |
US10658071B2 (en) | 2012-11-08 | 2020-05-19 | 23Andme, Inc. | Scalable pipeline for local ancestry inference |
US10755805B1 (en) | 2012-11-08 | 2020-08-25 | 23Andme, Inc. | Ancestry painting with local ancestry inference |
US10699803B1 (en) | 2012-11-08 | 2020-06-30 | 23Andme, Inc. | Ancestry painting with local ancestry inference |
US9977708B1 (en) | 2012-11-08 | 2018-05-22 | 23Andme, Inc. | Error correction in ancestry classification |
US10896742B2 (en) | 2018-10-31 | 2021-01-19 | Ancestry.Com Dna, Llc | Estimation of phenotypes using DNA, pedigree, and historical data |
US11735290B2 (en) | 2018-10-31 | 2023-08-22 | Ancestry.Com Dna, Llc | Estimation of phenotypes using DNA, pedigree, and historical data |
US12260936B2 (en) | 2019-07-19 | 2025-03-25 | 23Andme, Inc. | Identity-by-descent relatedness based on focal and reference segments |
US12046327B1 (en) | 2019-07-19 | 2024-07-23 | 23Andme, Inc. | Identity-by-descent relatedness based on focal and reference segments |
US12073495B2 (en) | 2019-09-13 | 2024-08-27 | 23Andme, Inc. | Methods and systems for determining and displaying pedigrees |
US11514627B2 (en) | 2019-09-13 | 2022-11-29 | 23Andme, Inc. | Methods and systems for determining and displaying pedigrees |
US12159690B2 (en) | 2020-08-13 | 2024-12-03 | 23Andme, Inc. | Ancestry composition determination |
US11817176B2 (en) | 2020-08-13 | 2023-11-14 | 23Andme, Inc. | Ancestry composition determination |
US11783919B2 (en) | 2020-10-09 | 2023-10-10 | 23Andme, Inc. | Formatting and storage of genetic markers |
Also Published As
Publication number | Publication date |
---|---|
US20170277828A1 (en) | 2017-09-28 |
US8510057B1 (en) | 2013-08-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8510057B1 (en) | Summarizing an aggregate contribution to a characteristic for an individual | |
Teumer | Common methods for performing Mendelian randomization | |
Polubriaginof et al. | Disease heritability inferred from familial relationships reported in medical records | |
Ginsburg et al. | Family health history: underused for actionable risk assessment | |
Cassa et al. | Disclosing pathogenic genetic variants to research participants: quantifying an emerging ethical responsibility | |
Kang et al. | Comparing two correlated C indices with right‐censored survival outcome: a one‐shot nonparametric approach | |
Timsit et al. | Calibration and discrimination by daily Logistic Organ Dysfunction scoring comparatively with daily Sequential Organ Failure Assessment scoring for predicting hospital mortality in critically ill patients | |
Chen et al. | Sequence kernel association test for quantitative traits in family samples | |
Levey et al. | A new equation to estimate glomerular filtration rate | |
Govindarajulu et al. | Frailty models: applications to biomedical and genetic studies | |
MacDonald et al. | Selection of family members for communication of cancer risk and barriers to this communication before and after genetic cancer risk assessment | |
Kramer et al. | Comparing observed and predicted mortality among ICUs using different prognostic systems: why do performance assessments differ? | |
Vassy et al. | Genotype prediction of adult type 2 diabetes from adolescence in a multiracial population | |
Ding et al. | Genotype-informed estimation of risk of coronary heart disease based on genome-wide association data linked to the electronic medical record | |
Raghavan et al. | Incident type 2 diabetes risk is influenced by obesity and diabetes in social contacts: a social network analysis | |
Zintzaras | The power of generalized odds ratio in assessing association in genetic studies with known mode of inheritance | |
Wu et al. | GWAS on birth year infant mortality rates provides evidence of recent natural selection | |
Lee et al. | Follow-up of incidental pulmonary nodules and association with mortality in a safety-net cohort | |
Li et al. | Integration of a polygenic score into guideline-recommended prediction of cardiovascular disease | |
Evans et al. | Increased rate of phenocopies in all age groups in BRCA1/BRCA2 mutation kindred, but increased prospective breast cancer risk is confined to BRCA2 mutation carriers | |
Claus | Risk models used to counsel women for breast and ovarian cancer: a guide for clinicians | |
Silverman-Retana et al. | Effect of familial diabetes status and age at diagnosis on type 2 diabetes risk: a nation-wide register-based study from Denmark | |
Cournane et al. | Predicting outcomes in emergency medical admissions using a laboratory only nomogram | |
Liao et al. | Impact of measurement error on testing genetic association with quantitative traits | |
Stack et al. | Genetic risk estimation in the Coriell Personalized Medicine Collaborative | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |