Functions to estimate the multiple choice item parameters by MMLE (Maximum Marginal Likelihood). The model is the multivariate logistic.

#include "libirt.h"
#include <stdio.h>
#include <math.h>
#include <gsl/gsl_errno.h>
#include <gsl/gsl_multiroots.h>
#include <gsl/gsl_linalg.h>
#include <gsl/gsl_wavelet.h>

Data Structures
struct	like_2plm_mc_struct
	Used to pass extra parameters to mple_wave_mc_fdfdf2.

Functions
void	probs_2plm_mc (gsl_vector *slopes, gsl_vector *thresholds, gsl_vector_int *nbr_options, gsl_vector_int *items_pos, gsl_vector *quad_points, gsl_matrix *probs)
	Compute the response functions for a multivariate logistic model.
int	like_2plm_mc_fdfdf2 (const gsl_vector *par, void *params, double *f, gsl_vector *df, gsl_matrix *df2)
	Compute the gradient and Hessian of the likelihood.
int	like_2plm_mc_dfdf2 (const gsl_vector *par, void *params, gsl_vector *df, gsl_matrix *df2)
	Compute the gradient and Hessian of the likelihood.
int	like_2plm_mc_df (const gsl_vector *par, void *params, gsl_vector *df)
	Compute the gradient of the likelihood.
int	like_2plm_mc_df2 (const gsl_vector *par, void *params, gsl_matrix *df2)
	Compute the Hessian of the likelihood.
int	mle_2plm_mc (int max_iter, double prec, like_2plm_mc_struct *params, gsl_vector *thresholds, gsl_vector *thresh_stddev, gsl_vector *slopes, gsl_vector *slopes_stddev, double *mllk)
	Performs the maximization step of the EM algorithm to estimate, by MMLE (Maximum Marginal Likelihood), the response functions of one multiple choice item.
int	mmle_2plm_mc (int max_em_iter, int max_nr_iter, double prec, gsl_matrix_int *patterns, gsl_vector *counts, gsl_vector *quad_points, gsl_vector *quad_weights, gsl_vector_int *items_pos, gsl_vector_int *nbr_options, gsl_vector *thresholds, gsl_vector *thresh_stddev, gsl_vector *slopes, gsl_vector *slopes_stddev, gsl_vector_int *ignore, int *nbr_notconverge, gsl_vector_int *notconverge, int adjust_weights)
	Estimate the options response functions by MMLE (Maximum Marginal Likelihood).

Detailed Description

Functions to estimate the multiple choice item parameters by MMLE (Maximum Marginal Likelihood). The model is the multivariate logistic.

The overall objective is to find the OCC (option characteristic curves) that maximize the ML (marginal likelihood). An iterative EM (expectation-maximization) algorithm is used.

1. A grid of ability levels has to be fixed. Something like 32 values from -4 to 4 will do. These are called the quadrature classes in the code and can be generated with the function "quadrature".

2. A first approximation of the OCC has to be available.

3. For each pattern (a response vector from one subject), the a posteriori probability of being in each quadrature class is computed by the function "posteriors".

4. The expected number of subjects in each quadrature class (quad_sizes) and, for each option, the expected number of subjects in each quadrature class who chose this option (quad_freqs) are computed by the function "frequencies".

5. Once these quantities are assumed to be known, the log-likelihood can be maximized independently for each item. The maximization is done by a root-finding algorithm in the function "mle_2plm_mc". A variant of Newton-Raphson from the GSL library is used. This requires a function giving the first (gradient) and second (Hessian) derivatives of the log-likelihood with respect to the item parameters; these are computed in "like_2plm_mc_fdfdf2".

Steps 3-5 are repeated until convergence is achieved.

Author: Stephane Germain germs.nosp@m.te@g.nosp@m.mail..nosp@m.com

Function Documentation

void probs_2plm_mc	(	gsl_vector *	slopes,
		gsl_vector *	thresholds,
		gsl_vector_int *	nbr_options,
		gsl_vector_int *	items_pos,
		gsl_vector *	quad_points,
		gsl_matrix *	probs
	)

Compute the response functions for a multivariate logistic model.

Parameters

[in]	slopes	A vector(options) with the slope parameters of each option.
[in]	thresholds	A vector(options) with the threshold parameters of each option.
[in]	nbr_options	A vector(items) with the number of options of each item.
[in]	items_pos	A vector(items) with the position of the first option of each item in patterns.
[in]	quad_points	A vector(classes) with the middle points of each quadrature class.
[out]	probs	A matrix(options x classes) with the response functions.

Todo: Stddev of the probs

Warning: The memory for probs must be allocated before the call.

int like_2plm_mc_fdfdf2	(	const gsl_vector *	par,
		void *	params,
		double *	f,
		gsl_vector *	df,
		gsl_matrix *	df2
	)

Compute the gradient and Hessian of the likelihood.

Parameters

[in]	par	The multivariate 2PLM parameters: first the (nbr_option-1) intercepts, then the (nbr_option-1) slopes.
[in]	params	The extra parameters to pass to the function.
[out]	df	The gradient of the log likelihood.
[out]	df2	The Hessian of the log likelihood.

This function is not used directly by the root-finding functions, but by other functions that comply with the GSL interface.

Returns: GSL_SUCCESS for success.
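For orientation, the derivatives of a multinomial logistic log-likelihood can be written out as follows. This is a sketch under assumptions, not a transcription of the libirt source: it uses the intercept/slope form with intercepts $c_k$ and slopes $a_k$, writes $r_{kq}$ for the expected frequency of option $k$ in class $q$ (quad_freqs), $n_q = \sum_k r_{kq}$ for the expected class size (quad_sizes), $\theta_q$ for the quadrature point and $P_k(\theta_q)$ for the response function. Since par holds only (nbr_option-1) intercepts and slopes, one option's parameters are presumably constrained for identifiability.

```latex
\ell = \sum_q \sum_k r_{kq} \log P_k(\theta_q)

\frac{\partial \ell}{\partial c_k} = \sum_q \bigl( r_{kq} - n_q P_k(\theta_q) \bigr),
\qquad
\frac{\partial \ell}{\partial a_k} = \sum_q \theta_q \bigl( r_{kq} - n_q P_k(\theta_q) \bigr)

\frac{\partial^2 \ell}{\partial c_k \, \partial c_l}
  = -\sum_q n_q P_k(\theta_q) \bigl( \delta_{kl} - P_l(\theta_q) \bigr),
\quad
\frac{\partial^2 \ell}{\partial c_k \, \partial a_l}
  = -\sum_q \theta_q \, n_q P_k(\theta_q) \bigl( \delta_{kl} - P_l(\theta_q) \bigr),
\quad
\frac{\partial^2 \ell}{\partial a_k \, \partial a_l}
  = -\sum_q \theta_q^2 \, n_q P_k(\theta_q) \bigl( \delta_{kl} - P_l(\theta_q) \bigr)
```

Here $\delta_{kl}$ is 1 when $k = l$ and 0 otherwise.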

int like_2plm_mc_dfdf2	(	const gsl_vector *	par,
		void *	params,
		gsl_vector *	df,
		gsl_matrix *	df2
	)

Compute the gradient and Hessian of the likelihood.

Parameters

[in]	par	The parameters.
[in]	params	The extra parameters to pass to the function.
[out]	df	The gradient of the log likelihood.
[out]	df2	The Hessian of the log likelihood.

This function is just a wrapper around like_2plm_mc_fdfdf2 to be used by the root-finding functions in the GSL.

Returns: GSL_SUCCESS for success.

int like_2plm_mc_df	(	const gsl_vector *	par,
		void *	params,
		gsl_vector *	df
	)

Compute the gradient of the likelihood.

Parameters

[in]	par	The parameters.
[in]	params	The extra parameters to pass to the function.
[out]	df	The gradient of the log likelihood.

This function is just a wrapper around like_2plm_mc_fdfdf2 to be used by the root-finding functions in the GSL.

Returns: GSL_SUCCESS for success.

int like_2plm_mc_df2	(	const gsl_vector *	par,
		void *	params,
		gsl_matrix *	df2
	)

Compute the Hessian of the likelihood.

Parameters

[in]	par	The parameters.
[in]	params	The extra parameters to pass to the function.
[out]	df2	The Hessian of the log likelihood.

This function is just a wrapper around like_2plm_mc_fdfdf2 to be used by the root-finding functions in the GSL.

Returns: GSL_SUCCESS for success.

int mle_2plm_mc	(	int	max_iter,
		double	prec,
		like_2plm_mc_struct *	params,
		gsl_vector *	thresholds,
		gsl_vector *	thresh_stddev,
		gsl_vector *	slopes,
		gsl_vector *	slopes_stddev,
		double *	mllk
	)

Performs the maximization step of the EM algorithm to estimate, by MMLE (Maximum Marginal Likelihood), the response functions of one multiple choice item.

Parameters

[in]	max_iter	The maximum number of Newton iterations performed for each item.
[in]	prec	The desired precision of each parameter estimate.
[in]	params	The extra parameters to pass to the function.
[in,out]	thresholds	A vector(options) with the estimated thresholds. They must be initialized first.
[out]	thresh_stddev	A vector(options) with the standard deviations of the estimated thresholds.
[in,out]	slopes	A vector(options) with the estimated slopes. They must be initialized first.
[out]	slopes_stddev	A vector(options) with the standard deviations of the estimated slopes.
[out]	mllk	The maximum log likelihood.

Returns: 1 if the item converged, 0 otherwise.

Warning: The memory for the outputs must be allocated before the call.
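The role of max_iter, prec and the stddev outputs can be read off the textbook Newton-Raphson iteration, sketched below. This is an assumption about the general technique, not a description of the exact GSL variant libirt uses: $\beta$ stands for the item's parameter vector, $g$ for the gradient and $H$ for the Hessian of the log likelihood, and the standard deviations are taken from the usual asymptotic formula.

```latex
\beta^{(t+1)} = \beta^{(t)} - H\bigl(\beta^{(t)}\bigr)^{-1} g\bigl(\beta^{(t)}\bigr),
\qquad t = 0, 1, \ldots, \mathrm{max\_iter} - 1

\text{stop when } \max_j \bigl| \beta_j^{(t+1)} - \beta_j^{(t)} \bigr| < \mathrm{prec}

\widehat{\mathrm{sd}}\bigl(\hat\beta_j\bigr) = \sqrt{ \Bigl[ \bigl( -H(\hat\beta) \bigr)^{-1} \Bigr]_{jj} }
```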

int mmle_2plm_mc	(	int	max_em_iter,
		int	max_nr_iter,
		double	prec,
		gsl_matrix_int *	patterns,
		gsl_vector *	counts,
		gsl_vector *	quad_points,
		gsl_vector *	quad_weights,
		gsl_vector_int *	items_pos,
		gsl_vector_int *	nbr_options,
		gsl_vector *	thresholds,
		gsl_vector *	thresh_stddev,
		gsl_vector *	slopes,
		gsl_vector *	slopes_stddev,
		gsl_vector_int *	ignore,
		int *	nbr_notconverge,
		gsl_vector_int *	notconverge,
		int	adjust_weights
	)

Estimate the options response functions by MMLE (Maximum Marginal Likelihood).

Parameters

[in]	max_em_iter	The maximum number of EM iterations. At least 20 iterations are made.
[in]	max_nr_iter	The maximum number of Newton iterations performed for each item at each EM iteration.
[in]	prec	The relative change in the likelihood below which the EM algorithm stops. This value divided by 10 is also the desired precision of each parameter estimate.
[in]	patterns	A matrix(patterns x options) of binary responses.
[in]	counts	A vector(patterns) with the count of each pattern. If NULL, the counts are all assumed to be 1.
[in]	quad_points	A vector(classes) with the middle points of each quadrature class.
[in]	quad_weights	A vector(classes) with the prior weights of each quadrature class.
[in]	items_pos	A vector(items) with the position of the first option of each item in patterns (and probs).
[in]	nbr_options	A vector(items) with the number of options of each item in patterns (and probs).
[in,out]	thresholds	A vector(options) with the estimated thresholds. They must be initialized first.
[out]	thresh_stddev	A vector(options) with the standard deviations of the estimated thresholds.
[in,out]	slopes	A vector(options) with the estimated slopes. They must be initialized first.
[out]	slopes_stddev	A vector(options) with the standard deviations of the estimated slopes.
[in]	ignore	A vector(items) of ignore flags.
[out]	nbr_notconverge	The number of items that did not converge.
[out]	notconverge	A vector(items) of flags set for the items that did not converge.
[in]	adjust_weights	Controls whether to adjust the quadrature weights after each iteration.

Returns: 1 if the relative change in the maximum log likelihood was less than prec, 0 otherwise.

Warning: The memory for the outputs must be allocated before the call.
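The relationship between nbr_options, items_pos and one row of patterns can be illustrated with a minimal plain-C sketch on ordinary int arrays rather than GSL vectors. fill_items_pos and encode_pattern are hypothetical helpers, and the encoding (one flag per option, set to 1 for the chosen option, all 0 for a missing response) is an assumption about how the binary patterns are laid out.

```c
#include <stddef.h>

/* Hypothetical helper: items_pos[i] is the column of the first option of
   item i, i.e. the sum of nbr_options[0..i-1]. Returns the total number of
   options, which is the number of columns of patterns. */
size_t fill_items_pos(size_t n_items, const int *nbr_options, int *items_pos)
{
    size_t pos = 0;
    for (size_t i = 0; i < n_items; i++) {
        items_pos[i] = (int)pos;
        pos += (size_t)nbr_options[i];
    }
    return pos;
}

/* Hypothetical helper: build one binary row from the option chosen for each
   item. choices[i] is the 0-based option index, or -1 for no response. */
void encode_pattern(size_t n_items, const int *nbr_options,
                    const int *items_pos, const int *choices, int *row)
{
    for (size_t i = 0; i < n_items; i++) {
        for (int k = 0; k < nbr_options[i]; k++)
            row[items_pos[i] + k] = (choices[i] == k);
    }
}
```

With three items having 4, 3 and 5 options, items_pos becomes 0, 4, 7 and each row of patterns has 12 columns.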
