psychometrics¶
Psychometrics is a Python package designed to help users implement Classical Test Theory (CTT) and Item Response Theory (IRT) models and applications in Python.
Motivation¶
Very few psychometrics packages are written in Python, and I felt it was important to have some available as psychometricians begin using Python more frequently.
Installation¶
This package can be installed via pip:
pip install git+https://github.com/deepdatadive/psychometric.git
CTT example¶
Let's examine how we could analyze a test using classical test theory. First, let's generate some data using the simulation module:
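from psychometrics.simulation import simulate_items, simulate_people, item_vectors
# Simulate item parameters (50 items by default)
items = simulate_items()
# Simulate 100 examinee abilities from a normal distribution with mean 0 and sd 1
people = simulate_people(100, {'mean': 0, 'sd': 1})
# Generate response probabilities and scored (0/1) responses
prob_vector, response_vector = item_vectors(items, people)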
We now have a pandas dataframe named response_vector that contains correct (1) and incorrect (0) responses for 100 people and 50 items. We can apply the CTT functions directly to this dataframe:
from psychometrics.CTT import calculate_alpha, discrimination_index, get_p_values, examinee_score
# Calculate coefficient alpha
alphas = calculate_alpha(response_vector)
# Calculate p-values for each item
p_values = get_p_values(response_vector)
# Calculate point-biserial and biserial values for each item
discrim, discrim2 = discrimination_index(response_vector)
# Calculate raw scores for each examinee
examinee_scores = examinee_score(response_vector)
API¶
-
psychometrics.CTT.calculate_alpha(items)
Calculates coefficient alpha for the given exam.
Parameters: items – a pandas dataframe with columns for each item and rows for each examinee.
Returns: coefficient alpha
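For readers who want to see what this statistic involves, coefficient alpha is commonly defined as k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The following is a minimal sketch of that formula with pandas. The name `coefficient_alpha_sketch` and the small dataset are made up for illustration, and this is not necessarily how `calculate_alpha` is implemented.

```python
import pandas as pd


def coefficient_alpha_sketch(items: pd.DataFrame) -> float:
    """Cronbach's alpha from an examinee-by-item score matrix (hypothetical helper)."""
    k = items.shape[1]                               # number of items
    item_variances = items.var(axis=0, ddof=1)       # sample variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of raw total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)


# Tiny illustrative dataset: 5 examinees (rows) by 3 items (columns)
responses = pd.DataFrame({
    "item1": [1, 1, 0, 1, 0],
    "item2": [1, 0, 0, 1, 0],
    "item3": [1, 1, 0, 1, 1],
})
alpha = coefficient_alpha_sketch(responses)
```

Because the same variance estimator is used in the numerator and the denominator, the result does not depend on whether `ddof` is 0 or 1.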
-
psychometrics.CTT.discrimination_index(items)
Calculates the point-biserial and biserial for each item on the exam. Essentially, these are item-total correlations where the item is excluded from the total score (point-biserial) or included in the total score (biserial).
Parameters: items – a pandas dataframe with columns for each item and rows for each examinee.
Returns: two dataframes, one containing the point-biserials and the other containing the biserials.
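The two item-total correlations described above can be sketched with pandas as follows. This is only an illustration under the description given here: `item_total_correlations_sketch` and the dataset are hypothetical, and the real function may differ in details such as its return type.

```python
import pandas as pd


def item_total_correlations_sketch(items: pd.DataFrame):
    """Return (item excluded, item included) item-total correlations (hypothetical helper)."""
    total = items.sum(axis=1)  # raw score per examinee
    # Correlate each item with the total score that excludes that item
    excluded = pd.Series({col: items[col].corr(total - items[col]) for col in items.columns})
    # Correlate each item with the full total score, item included
    included = pd.Series({col: items[col].corr(total) for col in items.columns})
    return excluded, included


# Tiny illustrative dataset: 5 examinees (rows) by 3 items (columns)
responses = pd.DataFrame({
    "item1": [1, 1, 0, 1, 0],
    "item2": [1, 0, 0, 1, 0],
    "item3": [1, 1, 0, 1, 1],
})
excluded, included = item_total_correlations_sketch(responses)
```

In the terminology of this package, `excluded` corresponds to the point-biserials and `included` to the biserials.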
-
psychometrics.CTT.examinee_score(items)
Returns examinee scores for an exam.
Parameters: items – a pandas dataframe containing the correct (1) and incorrect (0) response patterns for each examinee on every item.
Returns: a vector containing raw scores for the exam.
-
psychometrics.CTT.get_p_values(data)
Returns p-values for every item in the dataframe.
Parameters: data – a pandas dataframe where columns are items and rows are examinees.
Returns: a vector of p-values for each item in the assessment.
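In CTT, an item's p-value is usually the proportion of examinees who answered it correctly, and a raw score is the number of items an examinee answered correctly. Assuming that is what `get_p_values` and `examinee_score` compute, both reduce to one-line pandas operations on a 0/1 response dataframe. The dataset below is made up for illustration.

```python
import pandas as pd

# Tiny illustrative dataset: 5 examinees (rows) by 3 items (columns)
responses = pd.DataFrame({
    "item1": [1, 1, 0, 1, 0],
    "item2": [1, 0, 0, 1, 0],
    "item3": [1, 1, 0, 1, 1],
})

# Proportion correct per item (column means of the 0/1 matrix)
p_values = responses.mean(axis=0)

# Number correct per examinee (row sums of the 0/1 matrix)
raw_scores = responses.sum(axis=1)
```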
-
psychometrics.simulation.get_probabilities(discrimination, ability, difficulty)
Estimates the probability of a correct response under the two-parameter model for an item of a given difficulty and discrimination and an examinee with ability theta.
Parameters: - discrimination – the discrimination parameter for the item
- ability – the examinee's estimated theta
- difficulty – the difficulty parameter for the item
Returns: the probability that an examinee with the given ability answers the item correctly.
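As a point of reference, the two-parameter logistic model is usually written P = 1 / (1 + exp(-a(theta - b))). Here is a minimal sketch of that formula. The name `two_pl_probability` is hypothetical, and the sketch assumes no 1.7 scaling constant, which the package may or may not apply.

```python
import math


def two_pl_probability(discrimination: float, ability: float, difficulty: float) -> float:
    """2PL probability of a correct response (assumes no 1.7 scaling constant)."""
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))


p_matched = two_pl_probability(1.0, 0.0, 0.0)  # ability equals difficulty
p_able = two_pl_probability(1.5, 1.0, 0.0)     # ability one unit above difficulty
```

When ability equals difficulty the probability is exactly 0.5, and a larger discrimination makes the probability change faster as ability moves away from the item's difficulty.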
-
psychometrics.simulation.item_vectors(items, abilities)
Parameters: - items – a dictionary (usually from the simulate_items function) with keys ‘a’, ‘b’, and ‘c’, each holding a vector of item parameters.
- abilities – a list of examinee abilities
Returns: two dataframes, one containing the probability of getting each item correct for each examinee and another containing the correct (1) and incorrect (0) response vectors for each examinee.
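To illustrate how probabilities and scored responses can be produced from item parameters and abilities, here is a rough sketch using the three-parameter logistic form P = c + (1 - c) / (1 + exp(-a(theta - b))). The function name, the `seed` argument, and the assumption that ‘c’ is always a numeric vector are mine, not taken from the package.

```python
import numpy as np
import pandas as pd


def item_vectors_sketch(items: dict, abilities, seed: int = 0):
    """Return (probabilities, responses) dataframes with examinees as rows (hypothetical helper)."""
    a = np.asarray(items["a"], dtype=float)
    b = np.asarray(items["b"], dtype=float)
    c = np.asarray(items["c"], dtype=float)
    theta = np.asarray(abilities, dtype=float)[:, None]  # column vector for broadcasting
    # 3PL response function; with c = 0 it reduces to the 2PL
    probs = c + (1 - c) / (1 + np.exp(-a * (theta - b)))
    rng = np.random.default_rng(seed)
    # An item is scored correct when a uniform draw falls below its probability
    responses = (rng.random(probs.shape) < probs).astype(int)
    return pd.DataFrame(probs), pd.DataFrame(responses)


example_items = {"a": [1.0, 1.2], "b": [0.0, 0.5], "c": [0.0, 0.2]}
prob_df, resp_df = item_vectors_sketch(example_items, [-1.0, 0.0, 1.0])
```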
-
psychometrics.simulation.simulate_items(difficulty={'mean': 0, 'sd': 1}, discrimination={'mean': 1, 'sd': 0}, guessing=None, item_count=50)
Simulates item parameters for the one-, two-, and three-parameter models. For the one-parameter model, keep the discrimination sd equal to 0.
Parameters: - difficulty – a dictionary with keys ‘mean’ and ‘sd’
- discrimination – a dictionary with keys ‘mean’ and ‘sd’
- guessing – a dictionary with keys ‘mean’ and ‘sd’
- item_count – an integer for how many items you wish to simulate
Returns: a dictionary with keys ‘a’, ‘b’, and ‘c’, each containing a vector of length item_count with the corresponding parameters.
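One way to picture this kind of simulation is to draw each parameter from a normal distribution. The sketch below does that with NumPy. Treating `guessing=None` as a vector of zeros, clipping guessing values to [0, 1], and the `seed` argument are all assumptions made for illustration rather than documented behavior.

```python
import numpy as np


def simulate_items_sketch(difficulty=None, discrimination=None, guessing=None,
                          item_count=50, seed=0):
    """Draw item parameters from normal distributions (hypothetical helper)."""
    difficulty = difficulty or {"mean": 0, "sd": 1}
    discrimination = discrimination or {"mean": 1, "sd": 0}
    rng = np.random.default_rng(seed)
    b = rng.normal(difficulty["mean"], difficulty["sd"], item_count)
    # sd = 0 gives every item the same discrimination (one-parameter model)
    a = rng.normal(discrimination["mean"], discrimination["sd"], item_count)
    if guessing is None:
        c = np.zeros(item_count)  # assumption: no guessing spec means c = 0
    else:
        # Clip so the lower asymptote stays a valid probability
        c = np.clip(rng.normal(guessing["mean"], guessing["sd"], item_count), 0, 1)
    return {"a": a, "b": b, "c": c}


one_pl_items = simulate_items_sketch(item_count=10)
three_pl_items = simulate_items_sketch(
    discrimination={"mean": 1, "sd": 0.2},
    guessing={"mean": 0.2, "sd": 0.05},
    item_count=10,
)
```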
-
psychometrics.simulation.simulate_people(examinee_count, information)
Simulates theta parameters for examinees sampled from a normal distribution.
Parameters: - examinee_count – how many examinees you wish to simulate
- information – a dictionary with keys ‘mean’ and ‘sd’ giving the mean and standard deviation of the ability distribution
Returns: a list of examinee abilities
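Sampling abilities from a normal distribution needs only the standard library. The sketch below shows the idea. `simulate_people_sketch` and its `seed` argument are hypothetical and are included only so the draw is reproducible.

```python
import random


def simulate_people_sketch(examinee_count, information, seed=0):
    """Sample examinee abilities (theta) from a normal distribution (hypothetical helper)."""
    rng = random.Random(seed)  # local generator so the global random state is untouched
    return [rng.gauss(information["mean"], information["sd"])
            for _ in range(examinee_count)]


abilities = simulate_people_sketch(100, {"mean": 0, "sd": 1})
```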