psychometrics

Psychometrics is a Python package that helps users implement Classical Test Theory (CTT) and Item Response Theory (IRT) models and applications in Python.

Motivation

Very few psychometrics packages are written in Python, and I felt it was important to have some available as psychometricians begin using Python more frequently.

Installation

This package can be installed via pip:

pip install git+https://github.com/deepdatadive/psychometric.git

CTT example

Let's examine how we could analyze a test using classical test theory. First, let's generate some data using the simulation module:

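from psychometrics.simulation import simulate_items, simulate_people, item_vectors
# Simulate item parameters (50 items by default)
items = simulate_items()
# Simulate 100 examinee abilities from a standard normal distribution
people = simulate_people(100, {'mean': 0, 'sd': 1})
# Build the probability and 0/1 response dataframes
prob_vector, response_vector = item_vectors(items, people)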

We now have a pandas dataframe named response_vector that contains correct (1) and incorrect (0) responses for 100 people and 50 items. We can apply the CTT functions directly to this dataframe:

from psychometrics.CTT import calculate_alpha, discrimination_index, get_p_values, examinee_score
# Calculate coefficient alpha
alphas = calculate_alpha(response_vector)
# Calculate p-values for each item
p_values = get_p_values(response_vector)
# Calculate biserial and point-biserial values for each item
discrim, discrim2 = discrimination_index(response_vector)
# Calculate raw scores for each examinee
examinee_scores = examinee_score(response_vector)

API

psychometrics.CTT.calculate_alpha(items)[source]

Calculates coefficient alpha for the given exam.

Parameters:items – a pandas dataframe with columns for each item and rows for each examinee.
Returns:coefficient alpha
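
As an illustration of the statistic itself, here is a minimal sketch of the standard coefficient alpha formula in plain Python. It is not the package's implementation; the function name coefficient_alpha and the list-of-lists input are assumptions made for this example, whereas the package function takes a pandas dataframe.

```python
from statistics import variance

def coefficient_alpha(responses):
    """Hypothetical helper: responses is a list of examinee rows of 0/1 scores."""
    k = len(responses[0])  # number of items
    # Sample variance of each item (column)
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of the examinees' total scores
    total_var = variance([sum(row) for row in responses])
    # alpha = k / (k - 1) * (1 - sum of item variances / total-score variance)
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Four examinees, three items
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = coefficient_alpha(responses)
print(round(alpha, 2))  # 0.75
```
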
psychometrics.CTT.discrimination_index(items)[source]

Calculates the point-biserial and biserial for each item on the exam. Essentially, these are item-total correlations where the item is excluded from the total score (point-biserial) or included in the total score (biserial).

Parameters:items – a pandas dataframe with columns for each item and rows for each examinee.
Returns:two dataframes, one containing the point-biserials and the other containing the biserials
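
The item-total correlations described here can be sketched in plain Python as follows. This is an illustrative approximation, not the package's code: pearson and item_total_correlations are hypothetical names, and the input is a list of 0/1 rows rather than a dataframe.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def item_total_correlations(responses):
    """Return (excluded, included) item-total correlations per item."""
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    excluded, included = [], []
    for j in range(k):
        item = [row[j] for row in responses]
        # Total score with the current item removed
        rest = [t - i for t, i in zip(totals, item)]
        excluded.append(pearson(item, rest))
        included.append(pearson(item, totals))
    return excluded, included

responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
excluded, included = item_total_correlations(responses)
```

Leaving the item out of the total avoids correlating the item with itself, which is why the excluded values are typically lower than the included ones.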
psychometrics.CTT.examinee_score(items)[source]

Returns the raw score of each examinee on an exam.

Parameters:items – a pandas dataframe containing the correct (1) and incorrect (0) response patterns for each examinee on every item.
Returns:a vector containing raw scores for the exam.
psychometrics.CTT.get_p_values(data)[source]

Returns p-values for every item in the dataframe.

Parameters:data – a pandas dataframe where columns are items and rows are examinees.
Returns:a vector of p-values for each item in the assessment.
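
Raw scores and item p-values are simple summaries of the response matrix: row sums and column means. The sketch below shows both in plain Python, assuming a list of 0/1 rows instead of the dataframe the package functions expect.

```python
# Four examinees (rows), three items (columns)
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]

# Raw score: number of correct responses per examinee
raw_scores = [sum(row) for row in responses]

# p-value: proportion of examinees answering each item correctly
n_examinees = len(responses)
n_items = len(responses[0])
p_values = [sum(row[j] for row in responses) / n_examinees for j in range(n_items)]

print(raw_scores)  # [3, 2, 1, 0]
print(p_values)    # [0.75, 0.5, 0.25]
```
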
psychometrics.simulation.get_probabilities(discrimination, ability, difficulty)[source]

Estimates the probability of a correct response under the two-parameter model for an item of a given difficulty and discrimination and an examinee with ability theta.

Parameters:
  • discrimination – discrimination parameter for the item
  • ability – the examinee's estimated theta
  • difficulty – difficulty parameter for the item
Returns:the probability that an examinee with the given ability answers the item correctly.
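
For reference, the two-parameter logistic model is commonly written as P = 1 / (1 + exp(-a(theta - b))). The sketch below implements that textbook form; whether the package applies a scaling constant (such as 1.7) is not stated here, so this version assumes none, and two_pl_probability is a hypothetical name.

```python
from math import exp

def two_pl_probability(discrimination, ability, difficulty):
    """Textbook 2PL probability of a correct response (no scaling constant assumed)."""
    return 1.0 / (1.0 + exp(-discrimination * (ability - difficulty)))

# An examinee whose ability equals the item difficulty has a 50% chance
p_equal = two_pl_probability(discrimination=1.0, ability=0.0, difficulty=0.0)
# A more able examinee has a higher chance on the same item
p_high = two_pl_probability(discrimination=1.0, ability=2.0, difficulty=0.0)
print(p_equal)  # 0.5
```
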

psychometrics.simulation.item_vectors(items, abilities)[source]
Parameters:
  • items – a dictionary (usually from the simulate items function) which contains keys ‘a’, ‘b’, and ‘c’ which are vectors containing the parameters of items.
  • abilities – a list of examinee abilities
Returns:two dataframes, one containing the probability of getting each item correct for each examinee and the other containing the correct (1) and incorrect (0) response vectors for each examinee
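
One common way to produce such probability and response matrices is to compute each probability from the item parameters and then compare it with a uniform random draw. The sketch below follows that approach with a three-parameter logistic form; it is an assumption about the technique, not the package's code, and simulate_responses and its seed argument are hypothetical.

```python
import random
from math import exp

def simulate_responses(items, abilities, seed=0):
    """Return (probabilities, responses) as lists of rows, one row per examinee."""
    rng = random.Random(seed)
    probs, responses = [], []
    for theta in abilities:
        row_p = []
        for a, b, c in zip(items["a"], items["b"], items["c"]):
            logistic = 1.0 / (1.0 + exp(-a * (theta - b)))
            # Guessing parameter c sets the lower asymptote
            row_p.append(c + (1.0 - c) * logistic)
        probs.append(row_p)
        # A response is correct when a uniform draw falls below the probability
        responses.append([1 if rng.random() < p else 0 for p in row_p])
    return probs, responses

# Two items (one easy, one hard) and two examinees
items = {"a": [1.0, 1.0], "b": [-1.0, 1.0], "c": [0.0, 0.0]}
probs, responses = simulate_responses(items, [0.0, 2.0])
```
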

psychometrics.simulation.simulate_items(difficulty={'mean': 0, 'sd': 1}, discrimination={'mean': 1, 'sd': 0}, guessing=None, item_count=50)[source]

Simulates item parameters for the one-, two-, and three-parameter models. For the one-parameter model, keep the discrimination sd equal to 0.

Parameters:
  • difficulty – a dictionary with keys ‘mean’ and ‘sd’
  • discrimination – a dictionary with keys ‘mean’ and ‘sd’
  • guessing – a dictionary with keys ‘mean’ and ‘sd’
  • item_count – an integer for how many items you wish to simulate
Returns:a dictionary with keys ‘a’, ‘b’, and ‘c’, each containing a vector of length item_count with the corresponding parameters
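
To show why a discrimination sd of 0 yields a one-parameter model, here is a rough sketch that draws parameters from normal distributions using only the standard library. The name draw_item_parameters, the seed argument, the zero guessing values when guessing is None, and the clamping of guessing to [0, 1] are all assumptions for this example rather than documented package behavior.

```python
import random

def draw_item_parameters(difficulty, discrimination, guessing=None, item_count=50, seed=0):
    """Draw item parameters from normal distributions (illustrative only)."""
    rng = random.Random(seed)
    b = [rng.gauss(difficulty["mean"], difficulty["sd"]) for _ in range(item_count)]
    a = [rng.gauss(discrimination["mean"], discrimination["sd"]) for _ in range(item_count)]
    if guessing is None:
        # Assumption: no guessing parameter means a lower asymptote of 0
        c = [0.0] * item_count
    else:
        # Assumption: clamp draws so they remain valid probabilities
        c = [min(1.0, max(0.0, rng.gauss(guessing["mean"], guessing["sd"])))
             for _ in range(item_count)]
    return {"a": a, "b": b, "c": c}

# sd of 0 makes every discrimination equal to the mean: a one-parameter model
one_pl = draw_item_parameters({"mean": 0, "sd": 1}, {"mean": 1, "sd": 0}, item_count=5)
```
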

psychometrics.simulation.simulate_people(examinee_count, information)[source]

Simulates theta parameters for examinees sampled from a normal distribution.

Parameters:
  • examinee_count – how many examinees you wish to simulate
  • information – a dictionary with keys ‘mean’ and ‘sd’ representing the mean and standard deviation
Returns:a list of examinee abilities
