:py:mod:`bhepop2.enrichment`
============================

.. py:module:: bhepop2.enrichment

.. autoapi-nested-parse::

   This package contains classes used to enrich synthetic populations, using various methodologies.


Submodules
----------

.. toctree::
   :titlesonly:
   :maxdepth: 1

   base/index.rst
   bhepop2/index.rst
   uniform/index.rst


Package Contents
----------------

Classes
~~~~~~~

.. autoapisummary::

   bhepop2.enrichment.Bhepop2Enrichment
   bhepop2.enrichment.SimpleUniformEnrichment


.. py:class:: Bhepop2Enrichment(population: pandas.DataFrame, source: bhepop2.sources.marginal_distributions.MarginalDistributions, feature_name=None, seed=None)

   Bases: :py:obj:`bhepop2.enrichment.base.SyntheticPopulationEnrichment`

   Implementation of the Bhepop2 methodology as an enrichment class.

   See :mod:`bhepop2.enrichment.bhepop2` module documentation for details about the algorithm.

   **Expected source types**:

   .. autosummary::
      :nosignatures:

      ~bhepop2.sources.marginal_distributions.QualitativeMarginalDistributions
      ~bhepop2.sources.marginal_distributions.QuantitativeMarginalDistributions

   ----

   This class documentation uses the following notations:

   - :math:`M_{k}` : crossed modality k (combination of attribute modalities)
   - :math:`F_{i}` : feature class i

     - For quantitative features, corresponds to a numeric interval.
     - For qualitative features, corresponds to one of the feature values.

   .. py:property:: modalities

      Dict containing list of modalities for each attribute.

   .. py:method:: _evaluate_feature_on_population()

      Assign feature values to the population individuals using the algorithm results.

      :return: enriched population DataFrame

   .. py:method:: _draw_feature_value(probs)

      Return a feature value using the given probabilities.

      First draw the feature index. Then get a feature value from the distributions.

      :return: feature value to assign to individual

   .. py:method:: _get_feature_probs()

      For each crossed modality, compute the probability of belonging to a feature interval.
      Invert the crossed modality probabilities using Bayes.

      Compute

      .. math::

         P(f \in F_{i} \mid M_{k}) = P(M_{k} \mid f \in F_{i}) \cdot \frac{P(f \in F_{i})}{P(M_{k})}

      :return: DataFrame

   .. py:method:: _optimise()

      Run the optimisation algorithm to find the probability distributions that maximise entropy.

      When done, set the *optim_result* attribute.

   .. py:method:: _run_optimization() -> pandas.DataFrame

      Run optimization model on each feature value.

      The resulting probabilities are the :math:`P(M_{k} \mid f \in F_{i})`.

      :return: DataFrame containing the result probabilities

   .. py:method:: _compute_constraints()

      For each modality of each attribute, compute the probability of belonging to each feature interval.

      .. math::

         P(Modality \mid f \in F_{i}) = P(f \in F_{i} \mid Modality) \cdot \frac{P(Modality)}{P(f \in F_{i})}

   .. py:method:: _compute_crossed_modalities_matrix()

      Compute crossed modalities matrix for the present modalities.

      A reduced sample space is evaluated from the crossed modalities present in the population.

      Functions describing each modality are then applied to elements of this sample space.

      For each modality m and sample c, M(m, c) is 1 if c has modality m, 0 otherwise.

      :return: crossed_modalities_matrix describing crossed modalities


.. py:class:: SimpleUniformEnrichment(population: pandas.DataFrame, source, feature_name: str = None, seed=None)

   Bases: :py:obj:`bhepop2.enrichment.base.SyntheticPopulationEnrichment`

   This class implements a simple enrichment using a global distribution.

   **Expected source types**:

   .. autosummary::
      :nosignatures:

      ~bhepop2.sources.global_distribution.QuantitativeGlobalDistribution

   ------

   The global distribution describes the feature values of the whole population,
   using deciles (see :mod:`~bhepop2.sources.global_distribution`).

   To evaluate a feature value for an individual, we randomly choose one
   of the deciles, and then draw a random value between its two boundaries.
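
   As an illustration of the draw described above, here is a minimal sketch.
   It is not the library's implementation: the function name
   ``draw_from_deciles``, the ``bounds`` list and its values are hypothetical,
   and the sketch assumes the distribution is given as 11 increasing
   boundaries (D0 to D10).

   .. code-block:: python

      import random


      def draw_from_deciles(bounds, rng):
          """Pick one decile interval uniformly, then a uniform value inside it.

          ``bounds`` holds 11 increasing values: D0 (minimum), D1 to D9, D10 (maximum).
          """
          if len(bounds) != 11:
              raise ValueError("expected 11 decile boundaries (D0 to D10)")
          # Each decile interval holds 10% of the population, so all ten
          # intervals are equally likely.
          index = rng.randrange(10)
          lower, upper = bounds[index], bounds[index + 1]
          # Uniform draw between the two boundaries of the chosen interval.
          return rng.uniform(lower, upper)


      # Hypothetical decile boundaries, for illustration only.
      bounds = [0, 10000, 13000, 16000, 19000, 22000, 25000, 29000, 34000, 42000, 80000]
      rng = random.Random(42)  # seeded generator, so that runs are reproducible
      values = [draw_from_deciles(bounds, rng) for _ in range(1000)]

   Because every interval is chosen with the same probability, about one tenth
   of the drawn values is expected to fall in each decile.
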

   This method ensures that the feature values are well distributed over
   the total population, but nothing more: the attributes of each individual
   are not taken into account.

   .. py:method:: _evaluate_feature_on_population()

      Evaluate a list of feature values for each individual.

      :return: iterable with the same size and order as the population

   .. py:method:: _draw_feature_value()