Sequential sampling model for multiattribute choice alternatives with random attention time and processing order

Diederich, Adele; Oswald, Peter

doi:10.3389/fnhum.2014.00697

METHODS article

Front. Hum. Neurosci., 09 September 2014

Sec. Cognitive Neuroscience

Volume 8 - 2014 | https://doi.org/10.3389/fnhum.2014.00697

Sequential sampling model for multiattribute choice alternatives with random attention time and processing order

1. Cognitive Psychology, School of Humanities and Social Sciences, Jacobs University Bremen, Germany
2. Mathematics, Modeling, and Computing Center, School of Engineering and Science, Jacobs University Bremen, Germany

Abstract

A sequential sampling model for multiattribute binary choice options, called multiattribute attention switching (MAAS) model, assumes a separate sampling process for each attribute. During the deliberation process attention switches from one attribute consideration to the next. The order in which attributes are considered as well for how long each attribute is considered—the attention time—influences the predicted choice probabilities and choice response times. Several probability distributions for the attention time with different variances are investigated. Depending on the time and order schedule the model predicts a rich choice probability/choice response time pattern including preference reversals and fast errors. Furthermore, the difference between finite and infinite decision horizons for the attribute considered last is investigated. For the former case the model predicts a probability p₀ > 0 of not deciding within the available time. The underlying stochastic process for each attribute is an Ornstein-Uhlenbeck process approximated by a discrete birth-death process. All predictions are also true for the widely applied Wiener process.

1. Introduction

Sequential sampling models are powerful models to account simultaneously for choice probabilities and choice response times. They have become the dominant approach to modeling decision processes in cognitive science. Their application includes a variety of psychological tasks from basic perceptual decision to complex preferential choice tasks. Early on they have been applied to identification and discrimination tasks (e.g., Edwards, 1965; Laming, 1968; Pike, 1973; Link and Heath, 1975; Heath, 1981; Ashby, 1983); memory retrieval (e.g., Stone, 1960; Ratcliff, 1978; Van Zandt et al., 2000); and classification (e.g., general recognition theory, Ashby, 2000; exemplar–based random walk models of classification, Nosofsky and Palmeri, 1997) to account for speed-accuracy data. They have also been used for preferential decision tasks (e.g., decision field theory (DFT), Busemeyer and Townsend, 1993; multiattribute dynamic decision model, Diederich, 1997; Diederich and Busemeyer, 1999) to account for choice response times and choice probabilities interpreted as preference strength; judgment and confidence ratings (Pleskac and Busemeyer, 2010); to account for selling prices, certainty equivalents, and preference reversal phenomena (Busemeyer and Goldstein, 1992; Johnson and Busemeyer, 2005). More recently, they have been applied to combining perceptional decision making and payoffs (Diederich and Busemeyer, 2006; Diederich, 2008; Rorie et al., 2010; Gao et al., 2011). Furthermore, these models have been closely linked to measures from neuroscience like multi-cell electrode recordings (e.g., Ditterich, 2006; Gold and Shadlen, 2007; Churchland et al., 2008).

Sequential sampling models assume that (1) stimulus and choice alternative characteristics can be mapped onto a hypothetical numerical value representing the instantaneous level of evidence (activation, information, or preference—the wording often depends on the context), (2) some random fluctuation of this value over time occurs, (3) this evidence is accumulated over time, and (4) a final choice is made as soon as the evidence reaches a threshold. Therefore, sequential sampling can be described as a stochastic process. Two quantities are of foremost interest: (1) the probability that the process eventually reaches one of the thresholds or boundaries for the first time (the criterion to initiate a response), i.e., first passage probability; (2) the time it takes for the process to reach one of the boundaries for the first time, i.e., first passage time. The former quantity is related to the observed relative frequencies, the latter usually to the observed mean choice response times or the observed choice response time distribution.

Two classes of sequential sampling models have been predominantly used in psychology: Random walk/diffusion models and accumulator/counter models. The former are typically applied to a binary choice task, so that evidence for one choice alternative is at the same time evidence against the other. A decision is made as soon as the process reaches one of two preset criteria. In the latter, an accumulator/counter is established for each choice alternative separately, and evidence is accumulated in parallel. A decision is made as soon as one counter wins the race to reach one preset criterion. The accumulators/counters may or may not be independent. In the following we focus on random walk/diffusion models. For a review of both diffusion models and counter models see (Ratcliff and Smith, 2004).

To be more precise and to introduce notation, let X(t) denote the accumulation process. For a binary choice, say between choice options A and B (Figure 1), the models assume that the decision process begins with an initial state of evidence X(0). This initial state may either favor option A (X(0) > 0) or option B (X(0) < 0) or may be neutral with respect to A or B (X(0) = 0). Upon presentation of the choice options, the decision maker sequentially samples information from the stimulus display over time, retrieves information from memory, or forms preferences, depending on the context. The small increments of evidence sampled at any moment in time are such that they either favor option A (dX(t) > 0) or option B (dX(t) < 0). The evidence is accumulated from one moment in time to the next by summing the current state with the new increment: X(t + h) ≈ X(t) + μ(X(t), t) h + σ (X(t), t) (W(t + h) − W(t)). Here, μ(x, t) is called the drift rate and describes the expected value of increments per unit time. The factor σ(x, t) in front of the increments W(t + h) − W(t) of a standard Wiener process W(t) is called the diffusion rate, and relates to the variance of the increments. This process continues until the magnitude of the cumulative evidence exceeds a threshold criterion, θ. The process stops and option A is chosen as soon as the accumulated evidence reaches a criterion value for choosing A (here, X(t) = θ_A > 0) or it stops and chooses option B as soon as the accumulated evidence reaches a criterion value for choosing B (here X(t) = θ_B < 0). The probability of choosing A over B is determined by the accumulation process reaching the threshold for A before reaching the threshold for B. The criterion is assumed to be set by the decision maker prior to the decision task.

Figure 1

The Wiener process with drift, lately called drift-diffusion model in the psychological literature (Bogacz et al., 2006), is the most widely applied model. Different versions reflect additional assumptions for specific psychological domains. Ratcliff (1978) proposed a diffusion model for memory retrieval that is used for various psychological decision tasks. It is based on the work by Laming (1968) and Link and Heath (1975) and assumes variability in the starting point (i.e., X(0) follows a uniform distribution), and the drift rate μ = μ(t) of the Wiener process is normally distributed (cf. Laming). The residual time, i.e., the time other than the decision time, such as stimulus encoding and motor response, is assumed to be uniformly distributed and added to the decision time, i.e., response time equals the decision time plus a residual (non-decision) time. For a recent overview with applications see Voss et al. (2013). Other approaches include the Ornstein-Uhlenbeck model that linearly accumulates evidence with decay (Busemeyer and Townsend, 1993; Diederich, 1997), and the leaky competing accumulator model (Usher and McClelland, 2001) that non-linearly accumulates evidence with decay.

Common to almost all of these approaches is the assumption that a single integrated source of evidence generates the evidence during the deliberation process leading to a decision. In particular, the integrated source may be based on multiple features or attributes, but all of these features or attributes are assumed to be combined and integrated into a single source of evidence, and this single source is used throughout the decision process until a final decision is reached. Diederich (e.g., Diederich, 1995, 1997, 2003, 2008), however, assumed a separate process for each attribute¹. The decision maker switches attention from one attribute to the next during the time course of one trial. For instance, in a crossmodal task (visual, auditory, tactile), Diederich (1995) assumed a serial processing controlled by stimulus input at given stimulus onset asynchronies (SOA). That is, the order of attributes, here a light, followed by a tone, followed by a tactile vibration, as well as the point in time when a new attribute was added, here the tone presented at t₁ (t₁ ms after the light onset) and the tactile vibration at t₂ (t₂ ms after the light onset) was determined externally by the experimental setup. In the following we will call an attention switch at predetermined, fixed times, and predefined order attributes, a deterministic time and order schedule. Often, however, neither the processing order of attributes nor the point in time when the decision maker switches attention from one attribute to the next are known or can be inferred from the experimental setup. For those cases, Diederich (1997) proposed a specific model in which attention switches from one attribute to the next with some probability. This is an instance of a random time and order schedule which will be investigated more systematically in the present study.

The purpose of this paper is to present a unified treatment of sequential sampling models for both deterministic and random time and order schedules. To do so we start with deriving expressions for mean choice response times and choice probabilities for a deterministic time and order schedule before we show how they extend to random time and order schedules, including Poisson, binomial, geometric, and uniform distributions for the attention time devoted to each attribute in the sequence before attention switches to the next randomly or deterministically chosen attribute. We will provide first numerical evidence on the influence of various properties of a schedule on the predictions for mean choice response times and choice probabilities.

2. Preliminaries

The model applies to any finite number of attributes that the decision maker may consider, i.e., k = 1, …, K. For convenience we first describe the process for one attribute. As underlying information process for each attribute we assume an Ornstein-Uhlenbeck process X(t) defined by

where W(t) is a standard Wiener process. The parameters δ_k, γ_k, and σ_k are characteristics of the k-th attribute. The attribute characteristics may affect the quality of the extracted evidence for choosing A over B and this quality of evidence determines the drift rate δ_k. That is, the better an attribute discriminates between A and B, the larger is δ_k. The parameter γ_k which induces a change of the drift rate depending on the current state in the state space is often connected to memory processes (e.g., primacy and recency effects), conflict situations (e.g., approach-avoidance), or similarities between choice alternatives. Thus, together the effective drift δ_k − γ_kX(t) determines the direction and the velocity of the process when considering the k-th attribute at time t. Note that by setting γ_k to 0 results in a Wiener process with drift. That is, all the analysis we perform in the following is also valid for the Wiener process with drift. The diffusion coefficient σ_k indicates the variance of the increments of the process, for simplicity, we will set σ_k = σ for all k.

2.1. Matrix approach

Stochastic processes such as the above X(t) can be approximated by a discrete time, finite state space Markov chain. We use the matrix approach since it is simple to implement, sufficient in determining the entities of interest, i.e., choice probabilities and choice response times, and flexible to account for non-stationary and non-linear properties one wishes to include for the decision making process in the future. The continuous state space [θ_B, θ_A] of the piecewise Ornstein-Uhlenbeck process X(t) is replaced by a finite state space S = {−m_B, …, m_A} with m = m_A + m_B + 1 states. The diffusion process {X(t), t ≥ 0} is approximated by a discrete random walk {(n), n ≥ 0} with values in S such that X(nτ) ≈ Δ · (n) and θ_A ≈ m_AΔ and θ_B ≈ −m_BΔ, where Δ is the step size of change in evidence. To achieve convergence in the limit, the discretization parameters (Δ for state space, and τ for time) are tied to each other by the relation Δ = σ .

The attribute-related matrices P_k, k = 1, …, K, are given in their canonical form by

where

for i = 2, …, m − 1 (here, the index i corresponds to the state i − 1 − m_B). As Δ → 0 (or, equivalently, τ → 0), the decision probabilities and mean choice response times obtained from the Markov chain model converge to the values obtained from the underlying continuous process X(t). The identity matrix I corresponds to the two absorbing states (−m_B and m_A) associated with the two decision thresholds, one for each choice alternative; the matrix Q_k contains the transient probabilities, corresponding to the updating evidence process, and the matrix R_k contains the one-step transition probabilities from the transient to the absorbing states. In particular, the first column vector of the matrix R_k (denoted by R_B,k) contains the transient probabilities for reaching alternative B, while the second R_A,k contains the ones for alternative A. For details and derivations see Diederich (1997) and Diederich and Busemeyer (2003).

2.2. Time and order schedule

For K attributes, each one to be considered for some specific time in some specific order it is convenient to introduce a formal schedule of both time and order. A finite time and order schedule consists of a set of L consecutive time intervals {[t_{l − 1}, t_l]}_{l= 1, …,L} and the attribute sequence {k_l ∈ {1, …, K}}_{l= 1, …,L} which specifies that during the time interval [t_{l − 1}, t_l] the k_l-th attribute is considered. At switching time t_l, l = 1,…, L − 1, attention switches from attribute k_l to attribute k_{l + 1}. Depending on the situation, the final time t_l may be set finite (then the decision process may also finish without deciding for one of the alternatives) or infinite. Consequently, the process X(t) determined by such a schedule is a piecewise Ornstein-Uhlenbeck process, defined over a finite partition t₀ = 0 < t₁ < … < t_{L − 1} < t_L ≤ + ∞ of the time interval [0, t_L], where for t ∈ [t_{l − 1}, t_l] the process is determined by (1) with k = k_l. Figure 2 shows an example with three different attributes (K = 3) and a deterministic time and order schedule of length L = 4 with switching times t_l independent of the trajectories, and attribute order (1, 2, 1, 3), i.e., k₁ = 1, k₂ = 2, k₃ = 1, k₄ = 3 (note that the first attribute is reconsidered once).

Figure 2

For fixed Δ resp. τ, the m × m transition probability matrix _n containing the transition probabilities _ii′: = P(_{n + 1} = i′|_n = i) for the n-th step of the discrete-time random walk depends on the currently considered attribute defined by the time and order schedule, i.e., we set _n = P_{k_l} if n = n_{l − 1}, …, n_l − 1, where n₀ = 0, τ n_l ≈ t_l for l = 1, …, L (if t_L = ∞, we formally set n_L = ∞).

3. Choice probabilities and mean choice response times

In this section we derive the choice probabilities and mean choice response times for various time and order schedules. For simplicity we assume an unbiased process, i.e., with X(0) = 0 and symmetric decision thresholds, i.e., θ_A = −θ_B. Since the diffusion coefficient is a scaling parameter it will be set to σ = 1 for all attributes throughout. We start with the deterministic time and order schedule.

3.1. Deterministic time and order schedule

The evidence accumulation process for attribute k₁, which is considered first, evolves until time t₁ when the second attribute k₂ comes into consideration, triggering a change in the accumulation process. This attribute in turn is considered until time t₂ when a third attribute k₃ is considered and so forth until a decision is initiated (or t_l is reached). Let the random variables T_A and T_B denote the finite time when the process reaches a decision threshold θ_A or −θ_B, stops, and a decision response for A or B is initiated. With the switching times t_l replaced by integers n_l ≈ t_l/τ, the choice probability Pr[choose A] = Pr(T_A < ∞) is then approximated by the value p_A obtained from the discrete random walk model as

where Z is the probability distribution for the initial state X(0). For instance, for an unbiased process, Z would be a coordinate vector with probability 1 at state 0 halfway between the decision thresholds. The remaining vectors and matrices are those defined in (2). The evidence accumulation process for a successive attribute starts with the final evidence state of the previous attribute. Note that Z′Q^n₁_k₁ to Z′Q^n₁_k₁…Q^{n_{L − 1}−n_{L − 2}}_{k_{L − 1}} are defective distributions, i.e., the entries of these vectors do not sum up to 1, for the states of the random walk at discrete times n₁,…,n_{L − 1}. Further note that the stochastic process is time homogeneous within each time interval [0, t₁) to [t_{L − 1}, t_l] but non-homogeneous across [0, t_L] (see Diederich, 1992, 1995).

Similarly, the mean response time for choosing alternative A is approximated as

The probability and the mean response time for choosing alternative B can be determined similarly. Note that p₀: = 1 − (p_A + p_B), the probability of not making a decision until the final time t_L, is strictly positive if t_L < ∞. As shown in Diederich (1997), these formulas can be further compactified. We will do this below for the general case of deterministic and random schedules by deriving an efficient recursion for their evaluation.

3.2. Random time and order schedule

The above derivation of formulas for choice probabilities and mean response times for a deterministic time and order schedule have counterparts for random schedules which we describe next in three steps.

3.2.1. Random order schedule

For generating the attribute order {k_l}_{l = 1,…,L}, we consider stochastic K × K matrices D^(l) such that d^(l)_k′k ≥ 0 describes the probability with which attention switches from the k′-th attribute to the k-th attribute at switching time t_l ≈ τ n_l, l = 1,…,L − 1. Normally, d^(l)_kk = 0 would be assumed, to avoid a no switching situation. For two attributes K = 2, we must then have d^(l)₁₁ = d^(l)₂₂ = 0, d^(l)₁₂ = d^(l)₂₁ = 1, and the attribute sequence is either (1, 2, 1, 2, …) or (2, 1, 2, 1, …), depending on whether k₁ = 1 or k₁ = 2. For three attributes and L = 3, choosing

would for k₁ = 1 result in order sequences (1, 2, 1), (1, 3, 1), (1, 3, 2) with probability 1/2, 3/8, 1/8, respectively. The above matrix D⁽¹⁾ models the situation when no preference or bias for considering attributes can be asserted.

3.2.2. Random time schedule

We assume that the number of discrete time steps during which attention is paid to the k-th attribute is a discrete random variable denoted by T_at with given distribution. In principle, this distribution may change its type and may have different parameters, such as expected value, depending on the attribute and the attribute order {k_l}_{l = 1, …, L}. This can be used to model time pressure and other temporal effects. However, often we assume one and the same distribution type for attention times across all attributes, and allow for different parameters only.

For instance, the geometric distribution (as implicitly considered in Diederich, 1997) is given by

and characterized by a single parameter r > 0, with expectation 1/r and variance (1 − r)/r², and the uniform distribution is defined as

with parameters N and M = 0, 1, …, N − 1 and expectation N and variance M(M + 1)/3. Details for other tested distributions (Poisson with parameter λ > 0, and binomial distributions with parameters n and p) are omitted. For comparable expectation values E(T_at) (i.e., for parameter choices 1/r ≈ N ≈ λ ≈ np), the geometric distribution has much larger variance than the Poisson, binomial and uniform distribution with M ≈ (the latter are very close to each other). Figure 3 shows the pdf and cdf for different T_at distributions with fixed mean value E(T_at) = 300. The two uniform distributions are with M = 150 = N/2 and M = 299 = N − 1. Varying the parameter M of the uniform distribution allows us to produce intermediate results between the deterministic and geometric distribution cases as shown in the following.

Figure 3

3.2.3. Constructing random time and order schedules

We create a random time and order schedule of length L in two steps: First, given an initial distribution of k₁ ∈ {1, …, K}, we create the attribute sequence {k_l}_{l = 2, …,L} using a non-stationary Markov chain model with transition probability matrices D^(l), l = 1,…, L − 1. In a second step, for each l = 1,…,L, the attention time T^(l)_at = n_l − n_{l − 1} is created by the discrete random variable responsible for the attention time paid to the k_l-th attribute, choices are independent for the different l. Consequently, t_l − t_{l − 1} ≈ τ T^(l)_at is the real attention time paid to the k_l-th attribute. We note that semi-random schedules, where the sequence {k_l} is given deterministically, and only the T^(l)_at are determined as in the second step outlined above, are covered if we choose the D^(l) such that d^(l)_{k_l,k_{l + 1}} = 1.

To understand the recursive computation of choice probabilities and mean response times in this more general case, we first consider the special cases L = 1, 2, and illustrate the derivation on some distribution types of the random variable T_at generating attention times by providing concrete formulas. In general, the distribution for T_at is given by its probability mass distribution (pdf) and cumulative distribution function (cdf)

We start with L = 1, and will drop the index l from the notation introduced in the previous subsection. Since the probability of choosing alternative A at the i-th step is given by Z′Q_kⁱ⁻¹R_A,k, i = 1, …, T_at, and T_at is a random variable distributed according to (5) we get

A similar formula holds for p_B,k. To avoid repetition, introduce the row vector p_AB,k: = [p_B,k, p_A,k], then

The 2 × (m − 2) matrix V_k depends on the attribute and its parameters via Q_k, R_k, and on the chosen attention time distribution and the cdf (f_n,k). For the discussed concrete attention time distributions these matrices may be precomputed, in some cases closed-form expressions can be found, e.g., for the geometric distribution with parameter r = r_k we have

Next we discuss choice probabilities for the case L = 2, assuming for simplicity that the attention time distribution is the same for all attributes. To save on indices, denote k₁ ≡ k′, k₂ ≡ k, and D⁽¹⁾ ≡ D (this matrix is responsible for the random choice of k given any k′). Then the decision probability vector p_{AB,k′, k} for reaching alternatives B or A in with attribute order (k′,k) has two parts: the probabilities of having decided on while still considering the k′-th attribute (i.e., T_A/τ ≤ T′_at, where T′_at is the randomly generated attention time for the first attribute k′) plus the probabilities that τ T′_at < T_A/τ ≤ T′_at + T_at, where T_at is the randomly (and independently) generated attention time for the second attribute k. On top of this, k itself is randomly chosen according to the entries in the k′-th row of D. Thus, for each fixed k₁ = k′ and n₁ = T′_at according to (6) probabilities for reaching a decision after n₁ are given by

Thus, for L = 2, the choice probabilities (under the assumption that k₁ = k′ is fixed) can be obtained as

where

are (m − 2)× (m − 2) matrices depending on the attribute and attention time distribution type. For example, for the geometric distribution this simplifies to B_k = r_kQ_k(I − (1 − r_k)Q_k)⁻¹, closed form expressions are available for Poisson, binomial, and uniform distributions as well.

For arbitrary L, it is more convenient to write the resulting recursion in terms of block-matrix-vector operations. Denote by

Z	the K × 1 array with each entry equal to the initial distribution Z (and think of Z′ as its transpose, a 1× K array with entries Z′),
B	the K × K diagonal array with the B_k on the diagonal (similarly for C defined later),
I	the K × K diagonal array, with identity matrices I of the appropriate size on the diagonal,
V	the K × 1 array with the V_k as entries (similarly for W defined later), and
p_AB	the K × 2 matrix, whose rows are the choice probabilities [p_A, p_B]\|_{k₁ = k} defined before in the case L = 2.

Then the above result for L = 2 can be compactly written as

Note that the product BD of the array B with the matrix D is interpreted as the K × K array with d_k′kB_k′ as entry in row k′ and column k. Moreover, by iterating (8), one arrives at the formula for arbitrary L:

Formulas for mean response times can be derived similarly. Indeed, for L = 1, denote by ET_A,k the mean response time for reaching alternative A when considering the k-th attribute for a random time T_at distributed according to (5). Then ET_A,k ≈ τ et_A,k/P_A,k, where

Similarly for ET_B,k and et_B,k. Thus, similar to (6), we can write

The matrices W_k can be precomputed to any accuracy at essentially the same cost as the V_k. For particular distributions, the formulas can be turned into closed form expressions.

Next, let us look at L = 2. By using similar notation and arguments as for choice probabilities, the quantities et_A,k′,k, et_B,k′,k have a part before and after T′_at. This, together with (10), (11), gives

where

Thus, the counterpart of (8) is

From here, combining with (8), a joint recursion for computing p_AB and et_AB results:

We conclude this section with a few remarks. In Diederich (1997), under the name MADD/pp, a slightly different presentation of random schedules is given for the special case of geometrically distributed attention times. It is not hard to see, that (with the notation r_ij used in the K = 3 example presented in Section 4.2 Diederich, 1997) our model is equivalent to MADD/pp as L → ∞, if we set r_k = 1 − r_kk for the parameters r of the geometrically distributed T_at, k = 1, 2, 3, and d_kk = 0, d_kk′ = r_kk′/(1 − r_kk), k′≠ k, for the entries of the matrix D = D^(l), l ≥ 1. The advantage of the MADD/pp model is that it provides closed form formulas for the case L = ∞, a possibility that we did not pursue here for other types of attention time distributions.

In previous sequential decision models with finite L (Diederich, 1997), the last attribute was always considered infinitely long (infinite decision horizon) to avoid the situation of no decision, i. e., p₀ > 0. This can be incorporated into the current model by modifying the definition of the matrices V_k, W_k corresponding to the last interval [t_{L − 1},∞) to

and modifying the recursion (14) slightly. Alternatively, one can artificially change the parameters of the attention time distribution for l = L such that its expected value is sufficiently large, and make p₀ practically negligible. Since infinite decision horizons do not seem to adequately reflect the situation of a real decision process or laboratory experiment, it might be interesting to work under scenarios where t_l is fixed and finite that we described in this paper.

4. Simulations

We present some simulations that demonstrate the predictive power of the proposed model. We focus on features that have not been considered in Diederich (1997) for the deterministic case. Throughout this section we fix certain parameters, such as σ = 1, θ_A = −θ_B = 10, (this implies a state space size of m = 81), and always start at the neutral position X(0) = 0 between choice alternatives A and B.

4.1. Impact of attention time distributions

First, we show how different assumptions on the randomness of the attention time T_at (i.e., the time spent on considering a certain attribute) influences choice probabilities and mean response times. In the first example, we assume just two attributes with parameters δ₁ = 0.2, γ₁ = 0.03, δ₂ = 0.04, γ₂ = 0.003, both attributes favor alternative A, the first one more strongly than the second one². The attributes are considered only once (L = 2), with order k₁ = 1, k₂ = 2. The first attribute is considered for time t₁ = τ n₁, where n₁ is a random variable T_at described above with given expectation N. For the second attribute we compare two situations: (1) We assume an infinitely long decision horizon t₂ = ∞, and (2) we determine a finite time horizon t₂ = τ n₂ by choosing n₂ = n₁ + T_at which is also T_at distributed with the same expected value N. These two situations are depicted in Figures 4, 5. The graphs show choice probabilities and mean response times as functions of the expectation τ E(T_at) of the real attention times. Lines of different color represent different distributions. Distributions with a small variance, such as the Poisson distribution, the binomial distribution, and the uniform distribution with M ≈ produce results indistinguishable from the deterministic case. This holds for all tested situations shown below. This means, small uncertainties in attention time spans do not influence the observable choice frequencies and mean response times. However, as the variance of the attention times grows, we see quantitative and qualitative changes. Compared to the deterministic attention time situation, the geometric distribution differs most, and the uniform distributions with M = N/2 = 150 (Unif.1) and M = N − 1 = 299 (Unif.2) are intermediate. Moreover, there is expectedly a big difference for small mean attention times between finite and infinite decision horizons. Most importantly, for the former case it predicts a probability p₀ > 0 of not deciding within the available time t₂. We claim that for many situations, where an infinite time horizon does not represent reality well enough, our finite schedule model might be more appealing. This aspect will be pursued in further research.

Figure 4

Figure 5

Figures 6, 7 show similar simulation results for the situation of considering first an attribute favoring B (δ₁ = −0.1, γ₁ = 0) followed by an attribute more strongly favoring A (δ₂ = 0.2, γ₂ = 0.03). As expected, the results look now different, however, the main conclusions from the previous example concerning the influence of the randomness type for attention times and the differences for finite vs. infinite time horizons remain the same. Most importantly here, the model predicts a preference reversal (i.e., choice probabilities from below 0.5 to above 0.5) as a function of attention time when one attribute is in favor of choosing alternative A and the other in favor of choosing alternative B. Parameter studies, as in Diederich (1997), will be pursued further elsewhere.

Figure 6

Figure 7

To complete the picture, we show a three-attribute example (K = 3) in Figure 8. The chosen attribute parameters are now δ₁ = 0.04, γ₁ = 0.003, δ₂ = −0.1, γ₂ = 0, δ₃ = 0.2, γ₃ = 0.03, i.e., a weakly in favor of A, in favor of B, and strongly in favor of A sequence of attributes. Attention times for the first two attributes are chosen independently from each other but with the same distribution with fixed mean value; the last attribute is considered indefinitely.

Figure 8

4.2. Dependence on attribute order

The proposed sequential decision model is sensitive to the order in which the attributes are consider. If we consider in the aforementioned second two-attribute example the attribute in favor of A first, and then the attribute in favor of B we get very different patterns as shown in Figure 9 compared to Figure 6. A similar effect is true for the above K = 3 example. In Figure 10, the attribute in favor of B is now the last one; the graphs need to be compared with Figure 8. One interesting pattern can be observed. If the evidence for choosing one alternative decreases in the sequence of attribute consideration then the model predicts faster choice response times for the more frequently chosen alternative—a typical pattern observed in response time analysis. However, if the evidence increases in the sequence of attribute consideration then the model predicts faster choice response times for the less frequently chosen alternative which has been called fast error, as shown in Figure 11 compared to Figure 4. Simply by changing the order of attribute processing the model predicts a complex pattern of choice response times and choice probabilities.

Figure 9

Figure 10

Figure 11

So far, all examples shown are with a fixed, deterministic attribute order with no repetitions (semi-random schedule, L = K). The evaluation of fully random time and order schedules requires larger L, and will be presented elsewhere.

5. Concluding remarks

The proposed multiattribute attention switching (MAAS) model can predict a very complex choice probability/(mean) choice response time pattern. It may appear too flexible to be testable. However, this is not the case. If two attributes both favor alternative, A say, and the first attribute that is considered provides more evidence for choosing A than the second (δ₁ > δ₂), then the model predicts always shorter response times for the more frequently chosen alternative, here A, regardless of the assumed underlying attention time distribution. If the order of processing these attributes is reversed, i.e., the attribute that favors alternative A less is considered first (δ₂ > δ₁), then the model always predicts faster responses for the less frequently chosen alternative, here B, again regardless of the assumed underlying attention time distribution. A single stage process can only account for this pattern by assuming variability in starting positions and variability in drift rates, i.e., a statistical means where the drift rate itself is a random variable. It is difficult experimentally to disentangle the variability stemming from the stochastic process itself and the variability from the distribution of different drift rates. As Jones and Dzhafarov (2013) pointed out, the predictions of various sequential sampling models rest upon the assumptions made about the assumed probability distributions. This is not the case here. The model is falsifiable without assuming specific distributions. Rather than relying on statistical mechanisms to ensure an observed response patterns we rely on assumptions about cognitive processes such as attention switching and salience. The specific attention time distribution used for an application may be related to the experimental paradigm. For instance, when tracking eye movements, the sequence of attribute consideration and the switching times are directly observable, and a deterministic or a uniform distribution with a small variance is advisable. When all attributes are shown simultaneously, like in complex objects, and attention may shift at any moment in time a geometric distribution or a uniform distribution with a large variance may describe the situation better. Testing the model rigorously will be pursued in the future.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Statements

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Footnotes

1.^The notion of attributes is defined here in a broad sense. For example, it includes dimensions such as color and size of visual target; amplitude and frequency of a tone; different modalities in a crossmodal task; payoff information and perceptual information; attitudinal evidence and perceptual evidence; prize and quality of a consumer product and more.

2.^Note that when looking only at the numerical values of the drift parameter δ₁ = 0.2 and the decision criterion θ_A = 10 and assuming that the attention times t₁ to the first attribute are large enough it would suggest mean response times in the range T_A ≈ 50 (and very small p_B). However, since γ₁ = 0.03 it leads to a negative effective drift δ₁ − γ₁X(t) if X(t) comes close θ_A, and the mean response times become much longer. This also demonstrates the effect of the parameter γ_k, and a difference between Ornstein-Uhlenbeck process and Wiener process based models.

References

1
AshbyF. (1983). A biased random walk model for two choice reaction times. J. Math. Psychol. 27, 277–297. 10.1016/0022-2496(83)90011-1
2
AshbyF. (2000). A stochastic version of general recognition theory. J. Math. Psychol. 44, 310–329. 10.1006/jmps.1998.1249
3
BogaczR.BrownE.MoehlisJ.HolmesP.CohenJ. (2006). The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks. Psychol. Rev. 113, 700–765. 10.1037/0033-295X.113.4.700
- CrossRef
- Google Scholar
4
BusemeyerJ.GoldsteinW. (1992). Linking together different measures of preference: a dynamic model of matching derived from decision field theory. Organ. Behav. Hum. Decis. Process. 52, 370–396. 10.1016/0749-5978(92)90026-4
- CrossRef
- Google Scholar
5
BusemeyerJ.TownsendJ. (1993). Decision field theory: a dynamic-cognitive approach to decision-making in an uncertain environment. Psychol. Rev. 100, 432–459. 10.1037/0033-295X.100.3.432
6
ChurchlandA.KianiR.ShadlenM. (2008). Survey of decision field theory. Nat. Neurosci. 11, 693–702. 10.1038/nn.2123
- CrossRef
- Google Scholar
7
DiederichA. (1992). Intersensory Facilitation: Race, Superposition, and Diffusion Models for Reaction Time to Multiple Stimuli. Frankfurt am Main: Verlag Peter Lang.
- Google Scholar
8
DiederichA. (1995). Intersensory facilitation of reaction time: evaluation of counter and diffusion coactivation models. J. Math. Psychol. 39, 197–215. 10.1006/jmps.1995.1020
- CrossRef
- Google Scholar
9
DiederichA. (1997). Dynamic stochastic models for decision making with time constraints. J. Math. Psychol. 41, 260–274. 10.1006/jmps.1997.1167
10
DiederichA. (2003). Decision making under conflict: decision time as a measure of conflict strength. Psychon. Bull. Rev. 10, 167–176. 10.3758/BF03196481
11
DiederichA. (2008). A further test on sequential sampling models accounting for payoff effects on response bias in perceptual decision tasks. Percept. Psychophys. 70, 229–256. 10.3758/PP.70.2.229
12
DiederichA.BusemeyerJ. (1999). Conflict and the stochastic dominance principle of decision making. Psychol. Sci. 10, 353–359. 10.1111/1467-9280.00167
- CrossRef
- Google Scholar
13
DiederichA.BusemeyerJ. (2003). Simple matrix methods for analyzing diffusion models of choice probability, choice response time and simple response time. J. Math. Psychol. 47, 304–322. 10.1016/S0022-2496(03)00003-8
- CrossRef
- Google Scholar
14
DiederichA.BusemeyerJ. (2006). Modeling the effects of payoff on response bias in a perceptual discrimination task: threshold-bound, drift rate-change, or two-stage-processing hypothesis. Percept. Psychophys. 68, 194–207. 10.3758/BF03193669
15
DitterichJ. (2006). Stochastic models of decisions about motion direction: behavior and physiology. Neural Netw. 19, 981–1012. 10.1016/j.neunet.2006.05.042
16
EdwardsW. (1965). Optimal strategies for seeking information: models for statistics, choice reaction times, and human information processing. J. Math. Psychol. 2, 312–329. 10.1016/0022-2496(65)90007-6
- CrossRef
- Google Scholar
17
GaoJ.TortellR.McClellandJ. L. (2011). Dynamic integration of reward and stimulus information in perceptual decision-making. PLoS ONE6:e16749. 10.1371/journal.pone.0016749
18
GoldJ.ShadlenM. (2007). The neural basis of decision making. Ann. Rev. Neurosci. 30, 535–574. 10.1146/annurev.neuro.29.051605.113038
19
HeathR. (1981). A tandem random walk model for psychological discrimination. Br. J. Math. Stat. Psychol. 34, 76–92. 10.1111/j.2044-8317.1981.tb00619.x
20
JohnsonJ.BusemeyerJ. (2005). A dynamic, stochastic, computational model of preference reversal phenomena. Psychol. Rev. 112, 841–861. 10.1037/0033-295X.112.4.841
21
JonesM.DzhafarovE. N. (2013). Unfalsifiability and mutual translatability of major modeling schemes for choice reaction time. Psychol. Rev. 121, 1–32. 10.1037/a0034190
22
LamingD. (1968). Information Theory of Choice Reaction Times. Oxford: Academic Press.
- Google Scholar
23
LinkS.HeathR. (1975). A sequential theory of psychological discrimination. Psychometrika40, 77–105. 10.1007/BF02291481
- CrossRef
- Google Scholar
24
NosofskyR.PalmeriT. (1997). An exemplar based random walk model of speeded classification. Psychol. Rev. 104, 266–300. 10.1037/0033-295X.104.2.266
25
PikeA. (1973). Response latency models for signal detection. Psychol. Rev. 80, 53–68. 10.1037/h0033871
26
PleskacT.BusemeyerJ. (2010). Two-stage dynamic signal detection: a theory of choice, decision time, and confidence. Acta Neurobiol. Exp. 117, 864–901. 10.1037/a0019737
27
RatcliffR. (1978). A theory of memory retrieval. Psychol. Rev. 85, 59–108. 10.1037/0033-295X.85.2.59
- CrossRef
- Google Scholar
28
RatcliffR.SmithP. (2004). A comparison of sequential sampling models for two-choice reaction time. Psychol. Rev. 111, 333–367. 10.1037/0033-295X.111.2.333
29
RorieA.GaoJ.McClellandJ.NewsomeW. (2010). Integration of sensory and reward information during perceptual decision-making in lateral intraparietal cortex (lip) of the macaque monkey. PLoS ONE5:e9308. 10.1371/journal.pone.0009308
30
StoneM. (1960). Models for choice-reaction time. Psychometrika25, 251–260. 10.1007/BF02289729
- CrossRef
- Google Scholar
31
UsherM.McClellandJ. (2001). The time course of perceptual choice: the leaky, competing accumulator model. Psychol. Rev. 108, 550–592. 10.1037/0033-295X.108.3.550
32
Van ZandtT.ColoniusH.ProctorR. (2000). A comparison of two reaction time models applied to perceptual matching. Psychon. Bull. Rev. 7, 208–256. 10.3758/BF03212980
33
VossA.NaglerM.LercheV. (2013). Diffusion models in experimental psychology. Exp. Psychol. 60, 385–402. 10.1027/1618-3169/a000218
- CrossRef
- Google Scholar

Summary

Keywords

sequential sampling, multiattribute, attention time, time schedule, order schedule, finite time horizon, Ornstein-Uhlenbeck, Wiener

Citation

Diederich A and Oswald P (2014) Sequential sampling model for multiattribute choice alternatives with random attention time and processing order. Front. Hum. Neurosci. 8:697. doi: 10.3389/fnhum.2014.00697

Received

07 April 2014

Accepted

19 August 2014

Published

09 September 2014

Volume

8 - 2014

Edited by

José Antonio Díaz, Universidad de Granada, Spain

Reviewed by

Chris Donkin, University of New South Wales, Australia; José Antonio Díaz, Universidad de Granada, Spain

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Adele Diederich, Cognitive Psychology, School of Humanities and Social Sciences, Jacobs University, Campus Ring 1, Bremen 28759, Germany e-mail: a.diederich@jacobs-university.de

This article was submitted to the journal Frontiers in Human Neuroscience.

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Cognitive Neuroscience

METHODS article

Sequential sampling model for multiattribute choice alternatives with random attention time and processing order

Abstract

1. Introduction