bhepop2.functions

Module Contents

Functions

get_attributes(→ list)

Get attributes list from dictionary of modalities.

filter_distributions_and_infer_modalities(...)

Filter distributions table with attribute selection and infer modalities.

infer_modalities_from_distributions(distributions)

Infer attributes and their modalities from the given distributions.

compute_feature_values(→ list)

Compute the list of feature values that will define the assignment intervals.

get_feature_from_qualitative_distribution(distribution)

Get feature values from the given distributions.

compute_features_prob(feature_values, distribution)

Create a DataFrame containing probabilities for the given feature values.

interpolate_feature_prob(feature_value, distribution)

Linear interpolation of a feature value probability.

compute_crossed_modalities_frequencies(→ pandas.DataFrame)

Compute the frequency of each crossed modality present in the population.

build_cross_table(pop, names_attribute)

Compute the proportion of modalities of attribute 2 given attribute 1.

compute_rq(model, nb_modalities, K)

bhepop2.functions.get_attributes(modalities: dict) → list

Get attributes list from dictionary of modalities.

Parameters:

modalities – dict of attributes and their modalities, { attribute: [modalities] }

Returns:

list of attributes
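As a minimal sketch of what this description implies, assuming the attributes are simply the keys of the modalities dict (the helper name and the sample data below are hypothetical, not taken from bhepop2):

```python
def get_attributes_sketch(modalities: dict) -> list:
    # Assumption: the attribute names are the dict keys.
    # Python dicts preserve insertion order, so the list order is stable.
    return list(modalities.keys())


# Hypothetical modalities dict in the { attribute: [modalities] } shape.
modalities = {
    "age": ["0_29", "30_59", "60_plus"],
    "ownership": ["owner", "tenant"],
}
attributes = get_attributes_sketch(modalities)
```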

bhepop2.functions.filter_distributions_and_infer_modalities(distributions: pandas.DataFrame, attribute_selection)

Filter distributions table with attribute selection and infer modalities.

Parameters:
  • distributions – distribution DataFrame

  • attribute_selection – list of attributes to keep in the distribution, or None

Returns:

filtered distribution DataFrame, { attribute: [modalities] } dict
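The filtering step could look like the sketch below. It assumes the distributions table has "attribute" and "modality" columns, which this page does not confirm. The helper name and sample data are hypothetical.

```python
import pandas as pd


def filter_and_infer_sketch(distributions: pd.DataFrame, attribute_selection=None):
    # Assumption: the table has "attribute" and "modality" columns.
    if attribute_selection is not None:
        # Keep only the rows describing the selected attributes.
        mask = distributions["attribute"].isin(attribute_selection)
        distributions = distributions[mask].copy()
    # Collect the modalities of each remaining attribute, in order of appearance.
    modalities = {
        attribute: distributions.loc[
            distributions["attribute"] == attribute, "modality"
        ].unique().tolist()
        for attribute in distributions["attribute"].unique()
    }
    return distributions, modalities


# Hypothetical distributions table with a single decile column.
distributions = pd.DataFrame(
    {
        "attribute": ["age", "age", "ownership", "ownership"],
        "modality": ["0_29", "30_plus", "owner", "tenant"],
        "D1": [9000, 12000, 15000, 8000],
    }
)
filtered, modalities = filter_and_infer_sketch(distributions, ["age"])
unfiltered, all_modalities = filter_and_infer_sketch(distributions, None)
```

Passing None as the selection leaves the table untouched, which matches the "or None" wording of the attribute_selection parameter.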

bhepop2.functions.infer_modalities_from_distributions(distributions: pandas.DataFrame)

Infer attributes and their modalities from the given distributions.

Parameters:

distributions – distributions DataFrame

Returns:

dict of attributes and their modalities, { attribute: [modalities] }
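A sketch of the inference, again assuming "attribute" and "modality" columns in the distributions table (an assumption, as are the helper name and the data):

```python
import pandas as pd


def infer_modalities_sketch(distributions: pd.DataFrame) -> dict:
    # Assumption: one row per (attribute, modality) pair.
    modalities = {}
    for attribute, modality in zip(distributions["attribute"], distributions["modality"]):
        values = modalities.setdefault(attribute, [])
        if modality not in values:  # skip duplicates, keep first-seen order
            values.append(modality)
    return modalities


# Hypothetical table; the duplicated row is only there to show deduplication.
table = pd.DataFrame(
    {
        "attribute": ["age", "age", "age", "ownership"],
        "modality": ["0_29", "30_plus", "0_29", "owner"],
    }
)
inferred = infer_modalities_sketch(table)
```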

bhepop2.functions.compute_feature_values(distribution: pandas.DataFrame, relative_maximum: float, delta_min=None) → list

Compute the list of feature values that will define the assignment intervals.

The distributions do not provide the minimum and maximum feature values, so they have to be chosen. The minimum is the same for all distributions and is equal to the abs_first_value parameter. The maximum is computed by multiplying the last value of each distribution by the relative_maximum parameter.

Parameters:
  • distribution – distribution DataFrame

  • relative_maximum – factor applied to the last value of each distribution to compute its maximum feature value

  • delta_min – minimum difference between two consecutive feature values, or None to keep all values

Returns:

list of feature values
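One possible reading of this procedure is sketched below. It is not the library's implementation. The decile_columns argument, the default abs_first_value of 0, and the rule for thinning values with delta_min are all assumptions made for illustration.

```python
import pandas as pd


def feature_values_sketch(distribution, decile_columns, relative_maximum,
                          abs_first_value=0.0, delta_min=None):
    # Start from the common minimum (hypothetical abs_first_value argument).
    values = {float(abs_first_value)}
    for _, row in distribution.iterrows():
        deciles = [float(row[column]) for column in decile_columns]
        values.update(deciles)
        # Maximum of this distribution: last value times relative_maximum.
        values.add(deciles[-1] * relative_maximum)
    ordered = sorted(values)
    if delta_min is None:
        return ordered
    # Assumed thinning rule: keep a value only if it is at least
    # delta_min above the previously kept one.
    kept = [ordered[0]]
    for value in ordered[1:]:
        if value - kept[-1] >= delta_min:
            kept.append(value)
    return kept


# Hypothetical table: one row per modality, three quantile columns.
table = pd.DataFrame({"D1": [10, 12], "D2": [20, 25], "D3": [30, 40]})
all_values = feature_values_sketch(table, ["D1", "D2", "D3"], relative_maximum=1.5)
thinned = feature_values_sketch(table, ["D1", "D2", "D3"], relative_maximum=1.5, delta_min=5)
```

In this sketch the value 12 is dropped when delta_min is 5, because it lies only 2 above the previously kept value 10.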

bhepop2.functions.get_feature_from_qualitative_distribution(distribution: pandas.DataFrame)

Get feature values from the given distributions.

Parameters:

distribution – distribution DataFrame

Returns:

list of possible values for the qualitative feature

bhepop2.functions.compute_features_prob(feature_values: list, distribution: list)

Create a DataFrame containing probabilities for the given feature values.

Parameters:
  • feature_values – list of feature values

  • distribution – list of distribution values

Returns:

DataFrame of feature probabilities
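A sketch of how such a table could be built with numpy.interp, assuming the distribution list holds increasing feature values at evenly spaced cumulative probabilities from 0 to 1. The column names "feature" and "prob" are hypothetical, and the real function may use different ones.

```python
import numpy as np
import pandas as pd


def features_prob_sketch(feature_values: list, distribution: list) -> pd.DataFrame:
    # Assumption: distribution[i] is the feature value at cumulative
    # probability i / (len(distribution) - 1), in increasing order.
    cumulative = np.linspace(0.0, 1.0, len(distribution))
    # numpy.interp interpolates linearly and clamps outside the range.
    probs = np.interp(feature_values, distribution, cumulative)
    return pd.DataFrame({"feature": feature_values, "prob": probs})


# Hypothetical decile values from 0 to 100 in steps of 10.
deciles = list(range(0, 101, 10))
prob_table = features_prob_sketch([0, 15, 100], deciles)
```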

bhepop2.functions.interpolate_feature_prob(feature_value: float, distribution: list)

Linear interpolation of a feature value probability.

First and last distribution values represent minimum and maximum values that can be taken.

Parameters:
  • feature_value – value of feature to interpolate

  • distribution – feature values for each decile from 0 to 10

Returns:

probability of being lower than the input feature value
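The interpolation can be illustrated with the following sketch, which assumes the distribution list is sorted in increasing order and that its entries are evenly spaced in cumulative probability (0, 0.1, ..., 1 for deciles). The helper name and sample values are hypothetical.

```python
def interpolate_prob_sketch(feature_value: float, distribution: list) -> float:
    intervals = len(distribution) - 1
    # Below the minimum or above the maximum, the probability is clamped.
    if feature_value <= distribution[0]:
        return 0.0
    if feature_value >= distribution[-1]:
        return 1.0
    for i in range(intervals):
        lower, upper = distribution[i], distribution[i + 1]
        if lower <= feature_value < upper:
            # Position inside the interval, as a fraction between 0 and 1.
            fraction = (feature_value - lower) / (upper - lower)
            return (i + fraction) / intervals
    raise ValueError("distribution must be sorted in increasing order")


# Hypothetical feature values for deciles 0 to 10.
deciles = [0, 8, 12, 15, 18, 21, 25, 30, 37, 50, 75]
prob_mid = interpolate_prob_sketch(10, deciles)
prob_low = interpolate_prob_sketch(-5, deciles)
prob_high = interpolate_prob_sketch(200, deciles)
```

Here the value 10 falls halfway between the first decile (8) and the second (12), so the sketch returns (1 + 0.5) / 10 = 0.15.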

bhepop2.functions.compute_crossed_modalities_frequencies(population: pandas.DataFrame, modalities: dict) → pandas.DataFrame

Compute the frequency of each crossed modality present in the population.

Columns other than attributes are removed from the result DataFrame, and a ‘probability’ column is added.

Parameters:
  • population – population DataFrame

  • modalities – modalities dict

Returns:

DataFrame of crossed modalities frequencies
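A minimal sketch of this computation with a pandas groupby, assuming the attribute columns are named after the keys of the modalities dict (the helper name and the sample population are hypothetical):

```python
import pandas as pd


def crossed_frequencies_sketch(population: pd.DataFrame, modalities: dict) -> pd.DataFrame:
    attributes = list(modalities.keys())
    # Count each combination of modalities present in the population;
    # grouping on the attribute columns drops every other column.
    freq = population.groupby(attributes).size().reset_index(name="probability")
    # Turn the counts into frequencies.
    freq["probability"] = freq["probability"] / len(population)
    return freq


# Hypothetical population with an extra "id" column that is dropped.
population = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "age": ["young", "young", "old", "old"],
        "ownership": ["tenant", "tenant", "tenant", "owner"],
    }
)
modalities = {"age": ["young", "old"], "ownership": ["owner", "tenant"]}
frequencies = crossed_frequencies_sketch(population, modalities)
```

Only combinations that actually occur get a row, so ("young", "owner") is absent from the result in this example.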

bhepop2.functions.build_cross_table(pop: pandas.DataFrame, names_attribute: list)

Compute the proportion of modalities of attribute 2 given attribute 1.

Parameters:
  • pop – synthetic population DataFrame

  • names_attribute – list of two strings: name of attribute 1 and name of attribute 2

Returns:

table_percentage – DataFrame of the proportion of modalities of attribute 2 given attribute 1
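A comparable table can be sketched with pandas.crosstab, normalizing each row so that it gives the proportion of attribute 2 modalities for one modality of attribute 1. This is an illustration, not the library's code: the real function may scale or lay out the result differently (for example as percentages), and the helper name and data are hypothetical.

```python
import pandas as pd


def cross_table_sketch(pop: pd.DataFrame, names_attribute: list) -> pd.DataFrame:
    attribute_1, attribute_2 = names_attribute
    # normalize="index" divides each row by its total, so every row
    # holds the proportions of attribute 2 given one value of attribute 1.
    return pd.crosstab(pop[attribute_1], pop[attribute_2], normalize="index")


# Hypothetical synthetic population.
pop = pd.DataFrame(
    {
        "age": ["young", "young", "old", "old"],
        "ownership": ["tenant", "tenant", "tenant", "owner"],
    }
)
table = cross_table_sketch(pop, ["age", "ownership"])
```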

bhepop2.functions.compute_rq(model, nb_modalities, K)