Article

Rational Design of Field-Effect Sensors Using Partial Differential Equations, Bayesian Inversion, and Artificial Neural Networks

1 Institute of Applied Mathematics, Leibniz University Hannover, Welfengarten 1, 30167 Hannover, Germany
2 Cluster of Excellence PhoenixD (Photonics, Optics, and Engineering-Innovation Across Disciplines), Leibniz University Hannover, 30167 Hannover, Germany
3 Faculty of Electrical Engineering, K. N. Toosi University of Technology, Tehran 19697, Iran
4 Institute of Analysis and Scientific Computing, TU Wien, Wiedner Hauptstrasse 8–10, 1040 Vienna, Austria
5 Center for Artificial Intelligence and Machine Learning (CAIML), TU Wien, 1040 Vienna, Austria
* Author to whom correspondence should be addressed.
Sensors 2022, 22(13), 4785; https://doi.org/10.3390/s22134785
Submission received: 31 May 2022 / Revised: 17 June 2022 / Accepted: 21 June 2022 / Published: 24 June 2022
(This article belongs to the Special Issue Field-Effect Sensors: From pH Sensing to Biosensing)

Abstract

Silicon nanowire field-effect transistors are promising devices used to detect minute amounts of different biological species. We introduce the theoretical and computational aspects of forward and backward modeling of biosensitive sensors. Firstly, we introduce a forward system of partial differential equations to model the electrical behavior, and secondly, a backward Bayesian Markov-chain Monte-Carlo method is used to identify unknown parameters such as the concentration of target molecules. Furthermore, we introduce a machine learning algorithm based on multilayer feed-forward neural networks. The trained model makes it possible to predict the sensor behavior based on the given parameters.

1. Introduction

Silicon nanowire (SiNW) field-effect transistors (FETs) are typically used to detect proteins [1], cancer cells [2], DNA and miRNA strands [3,4], enzymes [5], and toxic gases such as carbon monoxide [6,7]. The sensors have several advantages including fast response, very high sensitivity, and low power consumption; they do not need labeling and can be used to detect subpicomolar concentrations of biological species [8,9,10,11,12,13]. The functioning of the sensors is based on the field effect due to the (partial) charges of the target molecules. When they are selectively bound to probe molecules and close enough to the semiconducting transducer, they affect the charge concentration inside the nanowire, which changes the current through the nanowire.
Using mathematical models based on partial differential equations (PDEs) enables us to model physically relevant quantities such as the electrostatic potential, electron and hole current densities, device sensitivity to the target molecule, and signal-to-noise ratio [14,15,16,17,18]. Three-dimensional simulations give rise to more reliable models compared to two-dimensional cross-sections, since all target molecules bound to bio-receptors are included [19,20]. We couple a charge transport model (the drift-diffusion equations) and the nonlinear Poisson–Boltzmann equation (PBE) for fully self-consistent simulations. The system of equations is a comprehensive model to compute the electrical current and study the nonlinear effects of different semiconductor parameters (e.g., doping concentration) and device parameters such as nanowire type (radial, trapezoidal, or rectangular), its dimensions, contact voltages, and insulator thickness on device performance (output and sensitivity).
Having an accurate model enables the rational design of field-effect sensors. However, in the model equations, there are several material parameters that cannot be (easily) measured. The surface charge density of the insulator has an essential effect on the device and also affects the probe and target molecules. The doping concentration has a crucial effect on the device and the model. Due to the nonlinear effect of these parameters, an efficient parameter estimation framework will enhance the accuracy and reliability of the model.
Markov-chain Monte-Carlo (MCMC) techniques are among the most efficient probabilistic methods to extract information by comparing measurements and simulations, updating the available prior knowledge and estimating the posterior densities of unknown quantities of interest. Here, we use a forward model, and a backward, inverse setting is used to determine the unknown parameters from the experiments. The classical algorithm was introduced in 1970 and is called the Metropolis–Hastings (MH) algorithm [21]. There are several improvements of the algorithm, e.g., adaptive-proposal Metropolis [22], delayed-rejection Metropolis [23], and delayed rejection adaptive Metropolis (DRAM) [24], as well as the use of ensemble Kalman filters [25]. In all techniques, different candidates are proposed based on a proposal distribution, and the algorithm decides whether they are rejected or accepted. A review of the MCMC methods is given in [26]. For SiNW-FETs, the DRAM algorithm has been used to identify the doping concentration and the amount of target molecules [14]. Considering the selective functionalization of SiNW, the authors of [1] used the MH algorithm to estimate the probe-target density at the surface.
Neural networks (also known as artificial neural networks (ANNs)), a subset of machine learning, are frameworks to analyze the available data and discover patterns that cannot be observed independently. ANNs have been inspired by the human brain and are suitable for complicated and nonlinear cases. Here, we split the prior data into two categories, namely training and testing data. The training set (between 60% and 80%) is used to extract useful information from the data, and the test set (between 20% and 40%) is employed to monitor the algorithm performance. In SiNW-FETs, there is a large amount of simulation and experimental data concerning different input (physical, chemical, and device) parameters that should be analyzed to ensure their accuracy and reliability. Of course, this process is time consuming and reduces the efficiency. Furthermore, the sensors are developed to detect specific biological species with the highest sensitivity. In the design process, using neural networks enables us to optimize the design parameters to enhance the sensor performance [27,28,29,30,31,32].
This article is structured as follows. In Section 2, we present the model equations and explain how the electrical current is computed. In Section 3, we discuss the parameter estimation methods and explain how MCMC can be used to determine the unknown parameters. In Section 4, we introduce the developed neural networks algorithm for SiNW-FETs. In Section 5, we first verify the model response with the experimental data; then, Bayesian inversion is used to identify the material parameters. Afterward, the developed machine-learning algorithm is employed in training and testing. Finally, the conclusions are summarized in Section 6.

2. The Model Equations

The drift–diffusion–Poisson (DDP) system is used to describe the electrochemical interactions (Poisson–Boltzmann equation) and the charge transport (drift–diffusion equations) in field-effect sensors. The convex and bounded domain $\Omega \subset \mathbb{R}^3$ consists of four subdomains, namely the insulator (SiO$_2$, $\Omega_{\mathrm{ox}}$), the silicon substrate and transducer ($\Omega_{\mathrm{Si}}$), the aqueous solution ($\Omega_{\mathrm{liq}}$), and the charged molecules ($\Omega_{\mathrm{mol}}$, written $\Omega_{M}$ in the equations). To model the potential interactions, we use the Poisson–Boltzmann equation
$$
-\nabla \cdot \big( A(x) \nabla V(x) \big) =
\begin{cases}
q \big( C_{\mathrm{dop}}(x) + p(x) - n(x) \big) & \text{in } \Omega_{\mathrm{Si}}, \\
0 & \text{in } \Omega_{\mathrm{ox}}, \\
\rho(x) & \text{in } \Omega_{M}, \\
-2 \varphi(x) \sinh\big( \beta ( V(x) - \Phi_F ) \big) & \text{in } \Omega_{\mathrm{liq}},
\end{cases}
$$
where $A$ indicates the dielectric constant, which is a function of the material, $V$ is the electrostatic potential, $C_{\mathrm{dop}}$ is the doping concentration, $\rho$ is the surface charge of the molecules, $\Phi_F$ denotes the Fermi level, and $\varphi$ is the ionic concentration. Regarding the dielectric constants, we use the relative values $A_{\mathrm{Si}} = 11.7$, $A_{\mathrm{ox}} = 3.9$, $A_{M} = 3.7$, and $A_{\mathrm{liq}} = 78.4$. Considering the Boltzmann constant $k_B$, the temperature $T$, and the elementary charge $q$, we define $\beta = q / (k_B T)$. In the simulations, a thermal voltage of 0.021 V will be used.
A two-dimensional cross-section of the device is given in Figure 1.
At the interface between the insulator and the liquid (i.e., $\Gamma := \Omega_{\mathrm{ox}} \cap \Omega_{\mathrm{liq}}$), we impose the interface conditions
$$
A(0^+)\, V(0^+, y, z) - V(0^-, y, z) = \alpha(y, z) \quad \text{on } \Gamma,
$$
$$
A(0^+)\, \partial_x V(0^+, y, z) - A(0^-)\, \partial_x V(0^-, y, z) = \gamma(y, z) \quad \text{on } \Gamma
$$
for $V_I$. Here, $0^+$ and $0^-$ denote the limits at the interface on the liquid side and on the insulator side, respectively. Furthermore, $\alpha$ is the macroscopic dipole moment density, and $\gamma$ is the macroscopic surface-charge density.
In $\Omega_{\mathrm{Si}}$, we solve the drift–diffusion system
$$
\begin{aligned}
-\nabla \cdot ( A \nabla V ) &= q \big( p(x) - n(x) + C_{\mathrm{dop}}(x) \big), \\
\nabla \cdot J_n &= q R(n, p), \\
\nabla \cdot J_p &= -q R(n, p), \\
J_n &= q ( D_n \nabla n - \mu_n n \nabla V ), \\
J_p &= q ( -D_p \nabla p - \mu_p p \nabla V )
\end{aligned}
$$
to model the charges in the transistor, where $D_n$ and $D_p$ are the electron and hole diffusion coefficients and $\mu_n$ and $\mu_p$ are the corresponding mobilities. The concentrations of electrons and holes are given by
$$
p =: n_i \exp\left( \frac{q}{k_B T} ( \Phi_1 - V ) \right), \qquad n =: n_i \exp\left( -\frac{q}{k_B T} ( \Phi_2 - V ) \right),
$$
where $n_i$ is the intrinsic carrier density and $\Phi_1$ and $\Phi_2$ are the Fermi levels. In order to compute the electron and hole current densities, we use the Shockley–Read–Hall recombination rate, i.e.,
$$
R(n, p) := \frac{n p - n_i^2}{\tau_n ( p + n_i ) + \tau_p ( n + n_i )},
$$
where $\tau_n$ and $\tau_p$ denote the lifetimes of the electrons and holes.
For solving the nonlinear system of equations, we use the Scharfetter–Gummel iteration. For this, we write the concentrations $n$ and $p$ in terms of the two Slotboom variables $u$ and $v$ as
$$
n(x, \omega) =: n_i\, e^{V(x, \omega) / U_T}\, u(x, \omega),
$$
$$
p(x, \omega) =: n_i\, e^{-V(x, \omega) / U_T}\, v(x, \omega).
$$
Therefore, the drift–diffusion system (3) can be rewritten as
$$
\begin{aligned}
-\nabla \cdot \big( A(x) \nabla V(x) \big) &= q \Big( C_{\mathrm{dop}}(x) - n_i \big( e^{V(x)/U_T} u(x) - e^{-V(x)/U_T} v(x) \big) \Big), \\
U_T\, n_i\, \nabla \cdot \big( \mu_n e^{V/U_T} \nabla u(x) \big) &= R(x), \\
U_T\, n_i\, \nabla \cdot \big( \mu_p e^{-V/U_T} \nabla v(x) \big) &= R(x),
\end{aligned}
$$
where $U_T$ is the thermal voltage and the Shockley–Read–Hall recombination rate takes the form
$$
R_{\mathrm{SRH}}(x) = n_i\, \frac{u(x) v(x) - 1}{\tau_p \big( e^{V/U_T} u(x) + 1 \big) + \tau_n \big( e^{-V/U_T} v(x) + 1 \big)}.
$$
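The change of variables can be checked numerically. The following is a minimal sketch, not the authors' implementation: it evaluates the Shockley–Read–Hall rate once from the densities $n$ and $p$ and once directly from the Slotboom variables $u$ and $v$, and the two values should agree. All function names and the numerical values (intrinsic density, lifetimes, potential) are illustrative assumptions.

```python
import math


def srh_rate(n, p, n_i, tau_n, tau_p):
    """Shockley-Read-Hall rate R(n, p) written in terms of the carrier densities."""
    return (n * p - n_i**2) / (tau_n * (p + n_i) + tau_p * (n + n_i))


def slotboom_to_densities(V, u, v, n_i, U_T):
    """Map the potential V and the Slotboom variables (u, v) to the densities (n, p)."""
    n = n_i * math.exp(V / U_T) * u
    p = n_i * math.exp(-V / U_T) * v
    return n, p


def srh_rate_slotboom(V, u, v, n_i, tau_n, tau_p, U_T):
    """The same rate written directly in the Slotboom variables."""
    denominator = (tau_p * (math.exp(V / U_T) * u + 1.0)
                   + tau_n * (math.exp(-V / U_T) * v + 1.0))
    return n_i * (u * v - 1.0) / denominator


# Illustrative values only (assumptions, not taken from the article).
n_i, U_T = 1.0e10, 0.021
tau_n = tau_p = 1.0e-6
V, u, v = 0.1, 0.8, 1.5

n, p = slotboom_to_densities(V, u, v, n_i, U_T)
r_direct = srh_rate(n, p, n_i, tau_n, tau_p)
r_slotboom = srh_rate_slotboom(V, u, v, n_i, tau_n, tau_p, U_T)
```

In thermal equilibrium ($u v = 1$, i.e., $n p = n_i^2$) both expressions vanish, which is a quick sanity check for an implementation of the recombination term.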
At the ohmic contacts (backgate, source, and drain) and the solution gate, we have a Dirichlet boundary condition $V|_{\partial\Omega} = V_D$ consisting of
$$
V|_{\partial\Omega_G} = V_g, \qquad V|_{\partial\Omega_S} = V_S, \qquad V|_{\partial\Omega_D} = V_D, \qquad V|_{\partial\Omega_{\mathrm{sol}}} = V_{\mathrm{solution}}.
$$
At the source and drain contacts (on $\partial\Omega_{\mathrm{Si}}$), we apply
$$
u(x) = u_D(x), \qquad v(x) = v_D(x).
$$
For the remaining part of the domain, we impose a zero Neumann boundary condition to guarantee the self-isolation. We refer the interested reader to [15,19,33,34] for theoretical discussions about the model including the Slotboom variables. The existence and uniqueness of the solutions for deterministic and stochastic model problems are given in [15,35]. Finally, the computation of $J_n$ and $J_p$ enables us to calculate the electrical current as
$$
I := \int ( J_n + J_p ) \, \mathrm{d}x,
$$
where we take the integral on a cross-section of the transducing part.
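In this work, we use the finite element method (FEM) to solve the coupled system of equations. We define the spaces
$$
\begin{aligned}
X_1 &= \{ V \in H^1(\Omega) \;|\; V|_{\partial\Omega} = V_D,\ V|_{\Gamma} = V_I \}, \\
X_2 &= \{ u \in H^1(\Omega_{\mathrm{Si}}) \;|\; u|_{\partial\Omega_{\mathrm{Si}}} = u_D \}, \\
X_3 &= \{ v \in H^1(\Omega_{\mathrm{Si}}) \;|\; v|_{\partial\Omega_{\mathrm{Si}}} = v_D \}.
\end{aligned}
$$
Therefore, we define the continuous solution space $X := X_1 \times X_2 \times X_3$ for the DDP system. Regarding the space discretization, we assume that $\mathcal{T}_h = \{ T_1, T_2, \ldots, T_n \}$ denotes a quasi-uniform mesh defined on the discretized domain $\Omega_h$ of $\Omega$ with mesh width $h := \max_{T_j \in \mathcal{T}_h} \operatorname{diam}(T_j)$. We define
$$
\begin{aligned}
S_V^1(\mathcal{T}_h) &:= \{ V \in H^1(\Omega) \;|\; V|_T \in P^1(T)\ \ \forall\, T \in \mathcal{T}_h \}, \\
S_u^1(\mathcal{T}_h) &:= \{ u \in H^1(\Omega) \;|\; u|_T \in P^1(T)\ \ \forall\, T \in \mathcal{T}_h \}, \\
S_v^1(\mathcal{T}_h) &:= \{ v \in H^1(\Omega) \;|\; v|_T \in P^1(T)\ \ \forall\, T \in \mathcal{T}_h \},
\end{aligned}
$$
where $P^1$ is the space of first-order polynomials. Then, we have
$$
\begin{aligned}
X_{h,1} &:= \{ V_h \in S_V^1(\mathcal{T}_h) \;|\; V_h|_{\partial\Omega} = V_D,\ V_h|_{\Gamma} = V_I \}, \\
X_{h,2} &:= \{ u_h \in S_u^1(\mathcal{T}_h) \;|\; u_h|_{\partial\Omega_{\mathrm{Si}}} = u_D \}, \\
X_{h,3} &:= \{ v_h \in S_v^1(\mathcal{T}_h) \;|\; v_h|_{\partial\Omega_{\mathrm{Si}}} = v_D \}.
\end{aligned}
$$
The discrete solution space is defined as $X_h := X_{h,1} \times X_{h,2} \times X_{h,3}$, which is a subset of $X$. The weak form of the model equations can be found in [15,33]. The a priori and a posteriori error estimates are proved in [33]. More theoretical works regarding the finite element analysis are given in [36,37,38].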
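To illustrate what a $P^1$ discretization involves, the following sketch solves a one-dimensional analogue, $-(a u')' = f$ on $(0, 1)$ with Dirichlet data, using piecewise linear elements. It is a deliberately simplified stand-in and not the three-dimensional tetrahedral solver used in this work; the function name `solve_poisson_p1`, the midpoint evaluation of the coefficients, and the test problem are assumptions made for the example.

```python
import numpy as np


def solve_poisson_p1(n_elements, a, f, u_left=0.0, u_right=0.0):
    """Solve -(a u')' = f on (0, 1) with Dirichlet data using P1 elements.

    The coefficient a and the source f are treated as piecewise constant,
    evaluated at the element midpoints.
    """
    nodes = np.linspace(0.0, 1.0, n_elements + 1)
    h = np.diff(nodes)
    mid = 0.5 * (nodes[:-1] + nodes[1:])

    K = np.zeros((n_elements + 1, n_elements + 1))  # global stiffness matrix
    b = np.zeros(n_elements + 1)                    # global load vector
    for e in range(n_elements):
        idx = [e, e + 1]
        k_loc = a(mid[e]) / h[e] * np.array([[1.0, -1.0], [-1.0, 1.0]])
        b_loc = f(mid[e]) * h[e] / 2.0 * np.ones(2)
        K[np.ix_(idx, idx)] += k_loc
        b[idx] += b_loc

    # Impose the Dirichlet values and solve for the interior nodes only.
    u = np.zeros(n_elements + 1)
    u[0], u[-1] = u_left, u_right
    interior = np.arange(1, n_elements)
    boundary = [0, n_elements]
    rhs = b[interior] - K[np.ix_(interior, boundary)] @ u[boundary]
    u[interior] = np.linalg.solve(K[np.ix_(interior, interior)], rhs)
    return nodes, u


# Test problem with a known solution: -u'' = 1, u(0) = u(1) = 0, u = x (1 - x) / 2.
nodes, u_h = solve_poisson_p1(16, a=lambda x: 1.0, f=lambda x: 1.0)
u_exact = 0.5 * nodes * (1.0 - nodes)
max_nodal_error = float(np.max(np.abs(u_h - u_exact)))
```

A material-dependent coefficient, such as the piecewise constant dielectric constant $A$, would enter through the callable `a`; the nonlinear and coupled structure of the full system is not covered by this sketch.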

3. Parameter Estimation Based on Bayesian Inference

In different experimental situations, accurate values of the effective parameters and constants cannot be easily obtained. Bayesian inversion techniques based on Markov chain Monte Carlo methods are efficient and straightforward probabilistic techniques to estimate these unknowns. We initiate the algorithm using the available information, named prior knowledge (which may not be sufficiently accurate), and during several iterations, we update the information and provide more reliable data (i.e., the posterior density). Then, we can extract valuable information from the posterior density, and its mean/median can be used as the solution of the inference. A very strong agreement between the experimental values and the model response can be achieved. We start with the statistical model
$$
M = P(x, \chi) + \varepsilon,
$$
where $M$ is the experimental observation (normally $n$-dimensional), while $P$ is the solution of the model problem, which depends on the set of parameters $\chi$ (i.e., $\chi = (\chi_1, \chi_2, \ldots, \chi_k)$) and the Cartesian coordinates $x$. Here, $\varepsilon$ is the measurement error, and we assume that it is normally distributed, i.e., $\varepsilon \sim \mathcal{N}(0, \sigma^2 I)$, including the parameter $\sigma^2$. Having an experimental observation, for instance the electrical current (i.e., $M = \mathrm{obs}$), we define the probability function
$$
\pi(\mathrm{obs}) = \int_{\mathbb{R}^n} \pi(\mathrm{obs} \mid \chi)\, \pi_0(\chi)\, \mathrm{d}\chi.
$$
Our aim is to estimate the posterior density $\pi(\chi \mid m)$, considering the measured observation $m$ and the available prior information. For this, we compute the likelihood function
$$
\pi(M \mid \chi) = L(\chi, \sigma^2 \mid M) = \frac{1}{(2 \pi \sigma^2)^{n/2}} \exp\left( -\frac{\| M - P \|^2}{2 \sigma^2} \right),
$$
where
$$
\| M - P \|^2 = \sum_{j=1}^{n} \big[ M_j - P_j(x, \chi) \big]^2
$$
is the sum of square errors. Obviously, if the model response with respect to the (set of) parameters $\chi$ is close to the measured value, the sum of square errors (15) converges to zero, and its relative probability (computed by the likelihood function) converges to 1. An inaccurate estimation of $\chi$ increases the error term, and the probability converges to zero.
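In practice the likelihood is usually evaluated in logarithmic form to avoid numerical underflow for large $n$ or small $\sigma^2$. The following minimal sketch computes the sum of square errors and the Gaussian log-likelihood for given measured and simulated values; the function names and the sample numbers are illustrative assumptions.

```python
import math


def sum_of_squares(measured, predicted):
    """Sum of square errors between the observations M_j and the model responses P_j."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted))


def log_likelihood(measured, predicted, sigma2):
    """Logarithm of the Gaussian likelihood with noise variance sigma2."""
    n = len(measured)
    ss = sum_of_squares(measured, predicted)
    return -0.5 * n * math.log(2.0 * math.pi * sigma2) - ss / (2.0 * sigma2)


# Illustrative data: a model response that matches the observation and one that is offset.
measured = [1.0, 2.0, 3.0]
good = log_likelihood(measured, [1.0, 2.0, 3.0], sigma2=0.01)
poor = log_likelihood(measured, [1.5, 2.5, 3.5], sigma2=0.01)
```

The parameter set whose model response is closer to the observation receives the larger log-likelihood, which is the property the accept/reject step of the MCMC methods relies on.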
In the Metropolis algorithm, we initiate the process using an initial guess $\chi_0$ based on the prior density. According to the proposal distribution, a new candidate $\chi^*$ is proposed. We compute the acceptance rate by
$$
\lambda(\chi_{j-1}, \chi^*) = \min\left( 1, \frac{\pi(\chi^*)}{\pi(\chi_{j-1})} \right).
$$
If the new candidate $\chi^*$ is accepted, we continue the MCMC chain with it; otherwise ($\chi_{j-1}$ has a higher probability than $\chi^*$), we continue the chain with the previous candidate. Using a non-symmetric proposal density is a generalization of the Metropolis algorithm, introduced by Hastings [21], where the probability of the forward jump is not equal to that of the backward one. A summary of the algorithm is given in Algorithm 1.
Algorithm 1 The Metropolis–Hastings algorithm.
Initialization: Start the process with the initial guess $\chi_0$ and number of samples $N$.
while $j < N$
   1. Propose a new sample according to the proposal density $\chi^* \sim T(\chi^* \mid \chi_{j-1})$.
   2. Compute the acceptance/rejection ratio
      $\zeta(\chi^* \mid \chi_{j-1}) = \min\left( 1, \dfrac{\pi(\chi^* \mid m)}{\pi(\chi_{j-1} \mid m)} \dfrac{T(\chi_{j-1} \mid \chi^*)}{T(\chi^* \mid \chi_{j-1})} \right)$.
   3. Sample $R \sim \mathrm{Uniform}(0, 1)$.
   4. if $R < \zeta$ then
         accept $\chi^*$ and set $\chi_j := \chi^*$
      else
         reject $\chi^*$ and set $\chi_j := \chi_{j-1}$
      end if
   5. Set $j = j + 1$.
end while
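The steps of Algorithm 1 can be sketched in a few lines of Python. The example below is an illustration under stated assumptions, not the implementation used for the sensor: the forward model `toy_forward_model` is a hypothetical linear stand-in for the PDE solve, the prior is uniform on $[0, 5]$, and the proposal is a symmetric Gaussian random walk, so the ratio of proposal densities $T$ equals one and drops out.

```python
import math
import random


def toy_forward_model(chi, inputs):
    """Hypothetical stand-in for the PDE solve: the response is proportional to chi."""
    return [chi * x for x in inputs]


def log_posterior(chi, inputs, measured, sigma2, lower=0.0, upper=5.0):
    """Uniform prior on [lower, upper] plus the Gaussian log-likelihood (up to a constant)."""
    if not lower <= chi <= upper:
        return -math.inf
    predicted = toy_forward_model(chi, inputs)
    ss = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return -ss / (2.0 * sigma2)


def metropolis_hastings(log_post, chi0, n_samples, step, seed=0):
    """Random-walk Metropolis sampler for a scalar parameter."""
    rng = random.Random(seed)
    chain = [chi0]
    lp_current = log_post(chi0)
    accepted = 0
    for _ in range(n_samples):
        candidate = chain[-1] + rng.gauss(0.0, step)   # step 1: propose
        lp_candidate = log_post(candidate)
        # Step 2: acceptance ratio; the symmetric proposal makes the T-ratio equal to 1.
        zeta = math.exp(min(0.0, lp_candidate - lp_current))
        if rng.random() < zeta:                        # steps 3 and 4: accept or reject
            chain.append(candidate)
            lp_current = lp_candidate
            accepted += 1
        else:
            chain.append(chain[-1])
    return chain, accepted / n_samples


# Synthetic observations generated with chi = 2 and Gaussian noise (illustrative only).
data_rng = random.Random(1)
inputs = [0.5 * i for i in range(1, 11)]
sigma = 0.1
measured = [2.0 * x + data_rng.gauss(0.0, sigma) for x in inputs]

chain, acceptance = metropolis_hastings(
    lambda chi: log_posterior(chi, inputs, measured, sigma ** 2),
    chi0=1.0, n_samples=5000, step=0.05)
burn_in = 1000
posterior_mean = sum(chain[burn_in:]) / len(chain[burn_in:])
```

Working with the logarithm of the posterior keeps the ratio in step 2 numerically stable. For the sensor model, each call to `log_post` would require a full solve of the coupled system, which is why the number of rejected candidates matters for the overall cost.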
The Metropolis–Hastings algorithm is a simple and versatile technique and has been widely used for several problems in applied science. However, for high-dimensional cases (where several parameters should be inferred simultaneously), the algorithm does not work appropriately, since the rejection rate increases significantly. To overcome these computational drawbacks, different improvements have been proposed, such as the adaptive Metropolis algorithm [22], delayed rejection Metropolis [23], and their combination, namely delayed rejection adaptive Metropolis (DRAM) [24]. We refer the interested reader to [26] for a review of the methods.

MCMC with Ensemble-Kalman Filter (EnKF-MCMC)

In EnKF-MCMC [25], we use a Kalman gain employing the mean and the covariance of the prior distribution and the cross-covariance between parameters and observations. It is used to compute the proposal distribution and to make the convergence to the target density faster. Here, the new candidate is obtained from the previous one by the Kalman-inspired jump $\Delta\chi$ as
$$
\chi^* = \chi_{j-1} + \Delta\chi.
$$
In order to update the candidates, we compute $\Delta\chi$ by
$$
\Delta\chi = K \big( y_{j-1} + s_{j-1} \big),
$$
where $K$ denotes the so-called Kalman gain,
$$
K = C_{\chi M} \big( C_{MM} + R \big)^{-1}.
$$
Here, $C_{\chi M}$ indicates the covariance matrix between the identified unknowns and the model response, $C_{MM}$ is the covariance matrix of the model response, and $R$ denotes the measurement noise covariance matrix [39]. In addition, $y_{j-1}$ is the residual of the proposed values concerning the model, and $s_{j-1} \sim \mathcal{N}(0, R)$ relates to the density of the measurement. A summary of the corresponding algorithm is given in Algorithm 2. Finally, Figure 2 shows the implementation of EnKF-MCMC and the Scharfetter–Gummel iteration for parameter estimation and solving the model equations.
Algorithm 2 Bayesian inference using EnKF-MCMC
Initialization ($j = 0$): Start the process with the initial guess $\chi_0$ and number of samples $N$.
while $j < N$
      1. Estimate the model response with respect to $\chi_{j-1}$.
      2. Compute the Kalman gain $K = C_{\chi M} \big( C_{MM} + R \big)^{-1}$.
      3. Produce the new proposal using the shift $\chi^* = \chi_{j-1} + K \big( y_{j-1} + s_{j-1} \big)$.
      4. Accept or reject $\chi^*$.
      5. Set $j = j + 1$.
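Steps 2 and 3 of Algorithm 2 can be sketched with NumPy as follows. This is an illustration under assumptions rather than the implementation used here: the covariances are estimated from an ensemble of parameter samples and their model responses, the residual $y_{j-1}$ is taken to be the difference between the observation and the model response at $\chi_{j-1}$, and the forward map `G` is a hypothetical linear stand-in for the PDE model. The accept/reject step is omitted.

```python
import numpy as np


def kalman_proposal(chi_ensemble, response_ensemble, chi_current,
                    response_current, observation, R, rng):
    """One Kalman-inspired proposal chi* = chi_{j-1} + K (y + s).

    chi_ensemble has shape (N, k), response_ensemble has shape (N, m).
    """
    n_members = chi_ensemble.shape[0]
    chi_dev = chi_ensemble - chi_ensemble.mean(axis=0)
    resp_dev = response_ensemble - response_ensemble.mean(axis=0)
    C_chiM = chi_dev.T @ resp_dev / (n_members - 1)   # cross-covariance, shape (k, m)
    C_MM = resp_dev.T @ resp_dev / (n_members - 1)    # response covariance, shape (m, m)
    # K = C_chiM (C_MM + R)^{-1}; C_MM + R is symmetric, so a linear solve suffices.
    K = np.linalg.solve(C_MM + R, C_chiM.T).T
    y = observation - response_current                # residual (assumed definition)
    s = rng.multivariate_normal(np.zeros(len(observation)), R)
    return chi_current + K @ (y + s), K


# Hypothetical linear forward map with k = 2 parameters and m = 3 observations.
rng = np.random.default_rng(0)
G = np.array([[1.0, 0.5], [0.0, 2.0], [1.5, -1.0]])
chi_ensemble = rng.normal(loc=[1.0, -0.5], scale=0.3, size=(200, 2))
response_ensemble = chi_ensemble @ G.T
R = 0.05 ** 2 * np.eye(3)
observation = G @ np.array([1.2, -0.4])
chi_current = np.array([1.0, -0.5])

chi_star, K = kalman_proposal(chi_ensemble, response_ensemble, chi_current,
                              G @ chi_current, observation, R, rng)
```

Because the gain uses the cross-covariance between parameters and responses, the proposal is shifted in the direction that reduces the residual, which is the mechanism expected to speed up convergence compared with a blind random walk.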

4. Multilayer Feed-Forward Neural Networks

Neural networks are efficient, flexible, and robust simulation tools specifically for nonlinear and complicated problems. They consist of three effective components, including neurons, structures, and weights, which all affect the response and behavior of the network. Artificial neural networks (ANNs) are supervised machine learning algorithms consisting of neurons and hidden layers. The input data are processed into the hidden layers, the output is compared with the target trajectory, and the relative error is computed. The neural networks strive to minimize this error.
Typically, there are two common classes of neural networks, namely feed-forward neural networks (single or multilayer) and recurrent dynamic neural networks. Single-layer neural networks [40] have less complexity; however, they are more suitable for linear problems. In multilayer feed-forward neural networks (MFNNs) [41,42], more than one layer of artificial neurons is used to enhance the capability to learn nonlinear patterns, which is more appropriate for BIO-FETs. In MFNNs, the neurons are organized in different non-recurrent layers, where in the first layer, we have the input vector (here, the parameters of the sensor), and the output is given to the first hidden layer. After the data processing, the data are transferred to the next layers using the weights; the procedure is followed until the last MFNN layer. These networks are also named multilayer perceptrons, and their structure is shown in Figure 3.
Let us assume $d$ denotes the desired trajectory (i.e., the device output); for $M$-layer neural networks, we have
$$
\begin{aligned}
\big[ \Delta w_j^s(k) \big]_{n_{s-1} \times 1} &= -\eta_s \frac{\partial E}{\partial w_j^s(k)} = \eta_s\, \delta_j^s(k)\, \left[ \frac{\partial\, \mathrm{net}_j^s(k)}{\partial w_j^s(k)} \right]_{n_{s-1} \times 1} \\
&= \eta_s\, e_j^s(k)\, {f_j^s}'\big( \mathrm{net}_j^s(k) \big)\, \big[ x^{s-1} \big]_{n_{s-1} \times 1}, \qquad s = 1, 2, \ldots, M, \quad j = 1, 2, \ldots, n_s,
\end{aligned}
$$
$$
\delta_j^s(k) := -\frac{\partial E}{\partial\, \mathrm{net}_j^s(k)} = e_j^s(k)\, {f_j^s}'\big( \mathrm{net}_j^s(k) \big),
$$
$$
e_j^s(k) = \sum_{l=1}^{n_{s+1}} \delta_l^{s+1}(k)\, w_{lj}^{s+1}(k),
$$
where $w$ denotes the weights, $\eta$ is the training rate, $E$ is the network mean square error (MSE), $f$ is the activation function, $\delta$ is the sensitivity function (here, $\delta^s$ indicates the network error in the $s$th layer), $\mathrm{net}^s$ is the weighted input, $n_s$ is the number of neurons in the $s$th layer, $x^0$ is the network input, and $x^{s-1}$ is the output of the $(s-1)$th layer, which is also the input of the $s$th layer. We also have the following initial conditions for the recurrent process
$$
\delta_j^M(k) = e_j^M(k)\, {f_j^M}'\big( \mathrm{net}_j^M(k) \big),
$$
$$
e_j^M(k) = d_j(k) - O_j^M(k).
$$
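Figure 4 shows the $j$th neuron in the $i$th layer in the learning algorithm. In the recurrent process, in order to adjust the weights from the first layer, we proceed as follows:
$$
\delta_{l_i}^s(k) = -\frac{\partial E(k)}{\partial\, \mathrm{net}_{l_i}^s} = -\sum_{l_M=1}^{n_M} \sum_{l_{M-1}=1}^{n_{M-1}} \cdots \sum_{l_{i+2}=1}^{n_{i+2}} \sum_{l_{i+1}=1}^{n_{i+1}} \frac{\partial E}{\partial\, \mathrm{net}_{l_M}^{M}}\, \frac{\partial\, \mathrm{net}_{l_M}^{M}}{\partial\, \mathrm{net}_{l_{M-1}}^{M-1}} \cdots \frac{\partial\, \mathrm{net}_{l_{i+2}}^{s+2}}{\partial\, \mathrm{net}_{l_{i+1}}^{s+1}}\, \frac{\partial\, \mathrm{net}_{l_{i+1}}^{s+1}}{\partial\, \mathrm{net}_{l_i}^{s}(k)}.
$$
For $i = 1, 2, \ldots, M-1$ and $s = 1, 2, \ldots, M$, the relation between $\mathrm{net}_{l_i}^s$ and $\mathrm{net}_{l_{i+1}}^{s+1}$ is
$$
\mathrm{net}_{l_{i+1}}^{s+1}(k) = \sum_{p=1}^{n_i} w_{l_{i+1} p}^{s+1}(k)\, f_p^s\big( \mathrm{net}_p^s(k) \big),
$$
therefore
$$
\frac{\partial\, \mathrm{net}_{l_{i+1}}^{s+1}}{\partial\, \mathrm{net}_{l_i}^{s}(k)} = w_{l_{i+1} l_i}^{s+1}(k)\, {f_{l_i}^s}'\big( \mathrm{net}_{l_i}^s(k) \big).
$$
So, we can write $\delta_{l_i}^s$ as
$$
\delta_{l_i}^s(k) = \sum_{l=1}^{n_{i+1}} \delta_l^{s+1}(k)\, w_{l\, l_i}^{s+1}(k)\, {f_{l_i}^s}'\big( \mathrm{net}_{l_i}^s(k) \big) = e_{l_i}^s(k)\, {f_{l_i}^s}'\big( \mathrm{net}_{l_i}^s(k) \big),
$$
where
$$
e_{l_i}^s(k) = \sum_{l=1}^{n_{i+1}} \delta_l^{s+1}(k)\, w_{l\, l_i}^{s+1}(k).
$$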
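The gradient of $E$ (the difference between the desired trajectory and the neural network's output) with respect to the weight vector is given by
$$
\frac{\partial E}{\partial w_{l_i}^s(k)} = \sum_{l=1}^{n_s} \frac{\partial E}{\partial\, \mathrm{net}_{l}^s(k)}\, \frac{\partial\, \mathrm{net}_{l}^s}{\partial w_{l_i}^s(k)},
$$
where the second term depends only on the neuron features and takes the form
$$
\frac{\partial\, \mathrm{net}_{l}^s}{\partial w_{l_i}^s(k)} =
\begin{cases}
x^{s-1}(k) & l = l_i, \\
0 & \text{otherwise},
\end{cases}
$$
$$
\frac{\partial E}{\partial w_{l_i}^s(k)} = -\delta_{l_i}^s(k)\, x^{s-1}(k).
$$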
Using the back-propagation error algorithm enables us to adjust the weight functions in order to minimize the network error. This training process is also named the supervised learning algorithm.
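The delta recursion above translates directly into code. The following NumPy sketch implements it for a small network with two hidden layers and sigmoid activations. It is an illustration under assumptions: the error is taken as $E = \tfrac{1}{2} \| d - O \|^2$ for a single sample, no bias terms are used (the equations above do not show any), and the layer sizes, the input, and the target are arbitrary.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(weights, x0):
    """Return the outputs x^0, ..., x^M of all layers; x^s = f(net^s), net^s = W^s x^{s-1}."""
    outputs = [x0]
    for W in weights:
        outputs.append(sigmoid(W @ outputs[-1]))
    return outputs


def loss(weights, x0, d):
    """Assumed single-sample error E = 0.5 * ||d - O||^2."""
    return 0.5 * float(np.sum((d - forward(weights, x0)[-1]) ** 2))


def backprop_gradients(weights, x0, d):
    """Gradients dE/dW^s obtained from the delta recursion."""
    outputs = forward(weights, x0)
    grads = [None] * len(weights)
    e = d - outputs[-1]                            # e^M = d - O^M
    for s in reversed(range(len(weights))):
        out = outputs[s + 1]
        delta = e * out * (1.0 - out)              # delta^s = e^s f'(net^s); sigmoid' = f (1 - f)
        grads[s] = -np.outer(delta, outputs[s])    # dE/dW^s = -delta^s (x^{s-1})^T
        e = weights[s].T @ delta                   # error passed back to the previous layer
    return grads


def train_step(weights, x0, d, eta):
    """One gradient-descent update, Delta W^s = -eta * dE/dW^s."""
    grads = backprop_gradients(weights, x0, d)
    return [W - eta * g for W, g in zip(weights, grads)]


# Five inputs, two hidden layers (4 and 3 neurons), one output; sizes are illustrative.
rng = np.random.default_rng(0)
sizes = [5, 4, 3, 1]
weights = [rng.normal(scale=0.5, size=(sizes[i + 1], sizes[i])) for i in range(3)]
x0 = rng.uniform(size=5)
d = np.array([0.3])

grads = backprop_gradients(weights, x0, d)
new_weights = train_step(weights, x0, d, eta=0.1)
```

A finite-difference comparison of `loss` against `backprop_gradients` is a simple way to validate such an implementation before training on the sensor data.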

5. Numerical Experiments

As we already mentioned, the DDP system is a robust and reliable system of equations to model the electrical behavior of the FET devices. We use a prostate-specific antigen (PSA) sensitive sensor, which is used to diagnose prostate cancer. For the simulations, we use a sensor device with a nanowire length of 1000 nm, width of 100 nm, and height of 50 nm, which is coated with SiO$_2$ of 8 nm thickness. We use $P^1$ finite elements to solve the model problem, and tetrahedral meshes are employed to discretize the domain. A schematic of the bio-FET including dimensions, using 6622 nodes and 45,735 tetrahedra, is shown in Figure 5. The sensor is developed for the detection of 2ZCH (https://www.rcsb.org/structure/2ZCH). The PROPKA algorithm predicts the p$K_a$ values of ionizable groups in proteins and protein–ligand complexes based on the 3D structure. The values are the basis for understanding the pH-dependent characteristics of proteins and catalytic mechanisms of many enzymes [43]. We applied the PROPKA algorithm [44,45,46] to determine the net charge for different pH values. The simulations are completed using a pH value of 9, giving rise to a net charge of −15 q [14]. In field-effect sensors, surface reactions at the oxide surface depending on the pH value and the binding of charged target molecules result in changes in the charge concentration at and near the surface, and subsequently in changes in the electrostatic potential, which then modulates the current through the transducer. Since the molecules are negatively charged, the binding of the target molecules to the bio-receptors will enhance the charge conductance and increase the response of the sensor (i.e., the electrical current).
The system of equations is capable of modeling the surface charges at the surface. In a previous work, we developed a Monte-Carlo approach to simulate the charges around a charged biomolecule at a charged surface [47]. Furthermore, in [48], a nonlinear Poisson model was used to calculate the free energies of various molecule orientations in dependence of the surface charge. Based on the free energies, the probabilities of the orientations were calculated, and hence, the biological noise was simulated.

5.1. Model Verification

As the first step, we verify the model accuracy with the experiments. We compute the electrical current $I$ with respect to different gate voltages $V_G$, where the source-to-drain voltage is $V_{SD} = 0.2$ V, the doping concentration is $C_{\mathrm{dop}} = 1 \times 10^{16}$ cm$^{-3}$, and the thermal voltage is $U_T = 0.021$ V. The experimental data are taken from [20]. In order to solve the nonlinear coupled system of equations, a Scharfetter–Gummel-type iteration is used. Figure 6 shows the current as a function of the gate voltage varying between $V_G = 1$ V and $V_G = 3.5$ V for experimental and simulation values. These results indicate that the DDP system is reliable, and it will be used for the next simulations.

5.2. Bayesian Inversion

The molecules are negatively charged (here, −15 q is used); however, an accurate estimation of the molecule charge density will be necessary. In semiconductor devices, in order to enhance the conductivity, impurity atoms are added to the silicon lattice, namely the doping process. Higher doping concentration will improve the transistor conductivity; however, the device will be less sensitive to the charged molecules. Physically, doping concentration (as a macroscopic quantity) denotes the average amount of the dopants. We implemented a delayed rejection adaptive Metropolis (DRAM) [14] and the Metropolis–Hastings algorithm [1] to infer doping concentration, molecule charge density, and probe–target density. The efficiency of the EnKF-MCMC compared to these algorithms is studied in [26]. Therefore, we employ the Kalman filter for the proposal adaptation. We performed the MCMC algorithm with N = 10,000 iterations, and a uniform prior density is used. The computational aspects are summarized in Table 1.
The back-propagation error is an efficient algorithm for the training of neural networks where we compute the gradient of the loss function with respect to the weights of the network.
Employing a footprint of 10 nm for the molecules [20,49] gives rise to a surface charge of −1.5 q/nm$^2$. In the experiments, a doping concentration of $1 \times 10^{16}$ cm$^{-3}$ is used in the transducer (both values are selected as the true values). The posterior densities are shown in Figure 7. As expected, the posterior densities are around the true values. Regarding the surface charge, we have a normal distribution, and the charge cannot be positive (which is reasonable due to using a P-type FET). For the doping concentration, the distribution points out that for $C_{\mathrm{dop}}$ of more than $2 \times 10^{16}$, the sensitivity will reduce significantly, and almost all of the candidates are rejected.

5.3. Machine Learning Based on MFNNs

In this section, we employ MFNNs to train the machine according to available information from the sensors. The effective physical/geometrical parameters will have a nonlinear effect on the device output. For instance, for a doping concentration of more than $C_{\mathrm{dop}} = 2 \times 10^{16}$, the current will increase sharply, which is compatible with the results of the Bayesian inversion (Figure 7). Due to this nonlinear behavior, the MFNNs algorithm is chosen to monitor the data accuracy and reliability and predict the sensor behavior.
More hidden layers will facilitate the convergence to the desired trajectory; however, they will dramatically increase the computational costs (e.g., computational time). In this work, we use two hidden layers for the MFNNs algorithm to strike a balance between complexity and efficiency. The procedure is shown in Figure 8. We define five specific scenarios according to the number of inputs. In Case 1, we only have one input ($V_g$) varying between 1 V and 5 V, where other parameters including insulator thickness, nanowire width ($N_W$), doping concentration, and nanowire height ($N_H$) are constant. In Case 5, we have five inputs, and the output is the calculated electrical current. Table 2 shows the range of the parameters used for different cases.
The MFNNs algorithm is trained with two learning rates (i.e., η = 0.1 and η = 0.2 ) and different numbers of epochs. Here, we use 75% of the samples for data training and 25% of the samples for data testing. The numbers of epochs and neurons in the 1st and 2nd hidden layers are given in Table 3. The sigmoid function is used as an activation function in hidden and output layers. In order to verify the efficiency/accuracy of the MFNNs structure algorithm, for different cases, we compare the output of the machine learning algorithm with the desired trajectories (computed currents). We have the relative MSE for the test and training process and performed a linear regression test to explain the relation between the targets and MFNNs output. Figure 9 and Figure 10 show the results for Cases 1–5, where in all cases, there is a good agreement between the machine learning output and the sensor data.
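The evaluation procedure described above (a 75%/25% split, the MSE, and a linear regression between targets and network outputs) can be sketched as follows. This is only an illustration: the arrays are placeholders for the simulated currents and the MFNN outputs, the function names are hypothetical, and the random shuffling before the split is an assumption.

```python
import numpy as np


def train_test_split(inputs, targets, train_fraction=0.75, seed=0):
    """Shuffle the samples and split them into training and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(inputs))
    n_train = int(round(train_fraction * len(inputs)))
    train, test = order[:n_train], order[n_train:]
    return inputs[train], targets[train], inputs[test], targets[test]


def mse(targets, outputs):
    """Mean square error between the desired trajectory and the network output."""
    return float(np.mean((targets - outputs) ** 2))


def regression_fit(targets, outputs):
    """Least-squares line outputs = slope * targets + intercept and correlation coefficient."""
    slope, intercept = np.polyfit(targets, outputs, 1)
    r = float(np.corrcoef(targets, outputs)[0, 1])
    return float(slope), float(intercept), r


# Placeholder data in the spirit of Case 1: one input (gate voltage between 1 V and 5 V).
inputs = np.linspace(1.0, 5.0, 40)
targets = np.tanh(inputs)                      # stand-in for the computed currents
x_train, y_train, x_test, y_test = train_test_split(inputs, targets)

# Stand-in for the trained network output on the test set (small artificial deviation).
outputs_test = y_test + 0.01 * np.sin(x_test)
test_mse = mse(y_test, outputs_test)
slope, intercept, r = regression_fit(y_test, outputs_test)
```

A slope close to one, an intercept close to zero, and a correlation coefficient close to one indicate good agreement between the network output and the desired trajectory.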

6. Conclusions

In this work, we introduced a computational framework for modeling charge transport and electrostatic potential distribution in SiNW-FETs in order to enable the rational design of this sensor technology. The PDE-based model has been verified with the experimental data and showed its accuracy. Bayesian inversion can be used to determine quantities of interest such as molecule concentrations, surface charges, and doping concentrations.
Our approach and results can be extended to different types of sensors including plasma resonance-based biosensors, fluorescence-based sensors, and electrochemiluminescence-based biosensors that are used to detect biomarkers.
Finally, machine learning algorithms based on MFNNs have been developed for SiNW-FETs. Here, we use two hidden layers to deal with the nonlinear behavior of the current (with respect to the input parameters), where the method shows its computational efficiency. We used 75% of the data to train the machine and the remaining 25% for testing. In both cases, the obtained MSE shows the convergence to the desired trajectory. The results indicate that MFNNs are a suitable machine learning algorithm for SiNW-FETs and can be used to predict the sensor output behavior as a compact model.

Author Contributions

Writing—review & editing, A.K., M.P., M.T. and C.H. All authors have read and agreed to the published version of the manuscript.

Funding

C. Heitzinger and A. Khodadadian acknowledge support by FWF START project no. Y660 PDE Models for Nanotechnology. M. Parvizi acknowledges the financial support of the Alexander von Humboldt Foundation project named $\mathcal{H}$-matrix approximability of the inverses for FEM, BEM and FEM–BEM coupling of the electromagnetic problems. She is affiliated to the Cluster of Excellence PhoenixD (EXC 2122, Project ID 390833453).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Mirsian, S.; Khodadadian, A.; Hedayati, M.; Manzour-ol Ajdad, A.; Kalantarinejad, R.; Heitzinger, C. A new method for selective functionalization of silicon nanowire sensors and Bayesian inversion for its parameters. Biosens. Bioelectron. 2019, 142, 111527.
  2. Kuang, T.; Chang, L.; Peng, X.; Hu, X.; Gallego-Perez, D. Molecular beacon nano-sensors for probing living cancer cells. Trends Biotechnol. 2017, 35, 347–359.
  3. Hahm, J.i.; Lieber, C.M. Direct ultrasensitive electrical detection of DNA and DNA sequence variations using nanowire nanosensors. Nano Lett. 2004, 4, 51–54.
  4. Zhang, G.J.; Chua, J.H.; Chee, R.E.; Agarwal, A.; Wong, S.M. Label-free direct detection of MiRNAs with silicon nanowire biosensors. Biosens. Bioelectron. 2009, 24, 2504–2508.
  5. Choi, J.H.; Kim, H.; Kim, H.S.; Um, S.H.; Choi, J.W.; Oh, B.K. MMP-2 detective silicon nanowire biosensor using enzymatic cleavage reaction. J. Biomed. Nanotechnol. 2013, 9, 732–735.
  6. De Santiago, F.; Trejo, A.; Miranda, A.; Salazar, F.; Carvajal, E.; Pérez, L.; Cruz-Irisson, M. Carbon monoxide sensing properties of B-, Al- and Ga-doped Si nanowires. Nanotechnology 2018, 29, 204001.
  7. Song, X.; Hu, R.; Xu, S.; Liu, Z.; Wang, J.; Shi, Y.; Xu, J.; Chen, K.; Yu, L. Highly sensitive ammonia gas detection at room temperature by integratable silicon nanowire field-effect sensors. ACS Appl. Mater. Interfaces 2021, 13, 14377–14384.
  8. Duan, X.; Li, Y.; Rajan, N.K.; Routenberg, D.A.; Modis, Y.; Reed, M.A. Quantification of the affinities and kinetics of protein interactions using silicon nanowire biosensors. Nat. Nanotechnol. 2012, 7, 401–407.
  9. Patolsky, F.; Zheng, G.; Lieber, C.M. Fabrication of silicon nanowire devices for ultrasensitive, label-free, real-time detection of biological and chemical species. Nat. Protoc. 2006, 1, 1711–1724.
  10. Stern, E.; Vacic, A.; Rajan, N.K.; Criscione, J.M.; Park, J.; Ilic, B.R.; Mooney, D.J.; Reed, M.A.; Fahmy, T.M. Label-free biomarker detection from whole blood. Nat. Nanotechnol. 2010, 5, 138–142.
  11. Chua, J.H.; Chee, R.E.; Agarwal, A.; Wong, S.M.; Zhang, G.J. Label-free electrical detection of cardiac biomarker with complementary metal-oxide semiconductor-compatible silicon nanowire sensor arrays. Anal. Chem. 2009, 81, 6266–6271.
  12. Chen, K.I.; Li, B.R.; Chen, Y.T. Silicon nanowire field-effect transistor-based biosensors for biomedical diagnosis and cellular recording investigation. Nano Today 2011, 6, 131–154.
  13. Gao, A.; Lu, N.; Wang, Y.; Dai, P.; Li, T.; Gao, X.; Wang, Y.; Fan, C. Enhanced sensing of nucleic acids with silicon nanowire field effect transistor biosensors. Nano Lett. 2012, 12, 5262–5268.
  14. Khodadadian, A.; Stadlbauer, B.; Heitzinger, C. Bayesian inversion for nanowire field-effect sensors. J. Comput. Electron. 2020, 19, 147–159.
  15. Taghizadeh, L.; Khodadadian, A.; Heitzinger, C. The optimal multilevel Monte-Carlo approximation of the stochastic drift–diffusion-Poisson system. Comput. Methods Appl. Mech. Eng. 2017, 318, 739–761.
  16. Khodadadian, A.; Hosseini, K.; Manzour-ol Ajdad, A.; Hedayati, M.; Kalantarinejad, R.; Heitzinger, C. Optimal design of nanowire field-effect troponin sensors. Comput. Biol. Med. 2017, 87, 46–56.
  17. Pittino, F.; Selmi, L. Use and comparative assessment of the CVFEM method for Poisson–Boltzmann and Poisson–Nernst–Planck three dimensional simulations of impedimetric nano-biosensors operated in the DC and AC small signal regimes. Comput. Methods Appl. Mech. Eng. 2014, 278, 902–923.
  18. Khodadadian, A.; Heitzinger, C. Basis adaptation for the stochastic nonlinear Poisson–Boltzmann equation. J. Comput. Electron. 2016, 15, 1393–1406.
  19. Khodadadian, A.; Taghizadeh, L.; Heitzinger, C. Three-dimensional optimal multi-level Monte–Carlo approximation of the stochastic drift–diffusion–Poisson system in nanoscale devices. J. Comput. Electron. 2018, 17, 76–89.
  20. Baumgartner, S.; Heitzinger, C.; Vacic, A.; Reed, M.A. Predictive simulations and optimization of nanowire field-effect PSA sensors including screening. Nanotechnology 2013, 24, 225503.
  21. Hastings, W.K. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 1970, 57, 97–109.
  22. Haario, H.; Saksman, E.; Tamminen, J. Adaptive proposal distribution for random walk Metropolis algorithm. Comput. Stat. 1999, 14, 375–395.
  23. Green, P.J.; Mira, A. Delayed rejection in reversible jump Metropolis–Hastings. Biometrika 2001, 88, 1035–1053.
  24. Haario, H.; Laine, M.; Mira, A.; Saksman, E. DRAM: Efficient adaptive MCMC. Stat. Comput. 2006, 16, 339–354.
  25. Evensen, G. The ensemble Kalman filter for combined state and parameter estimation. IEEE Control Syst. Mag. 2009, 29, 83–104.
  26. Noii, N.; Khodadadian, A.; Ulloa, J.; Aldakheel, F.; Wick, T.; François, S.; Wriggers, P. Bayesian Inversion with Open-Source Codes for Various One-Dimensional Model Problems in Computational Mechanics. Arch. Comput. Methods Eng. 2022, 1–34.
  27. Schackart, K.E.; Yoon, J.Y. Machine learning enhances the performance of bioreceptor-free biosensors. Sensors 2021, 21, 5519.
  28. Cui, F.; Yue, Y.; Zhang, Y.; Zhang, Z.; Zhou, H.S. Advancing biosensors with machine learning. ACS Sens. 2020, 5, 3346–3364.
  29. Albrecht, T.; Slabaugh, G.; Alonso, E.; Al-Arif, S.M.R. Deep learning for single-molecule science. Nanotechnology 2017, 28, 423001.
  30. Jin, X.; Liu, C.; Xu, T.; Su, L.; Zhang, X. Artificial intelligence biosensors: Challenges and prospects. Biosens. Bioelectron. 2020, 165, 112412.
  31. Raji, H.; Tayyab, M.; Sui, J.; Mahmoodi, S.R.; Javanmard, M. Biosensors and machine learning for enhanced detection, stratification, and classification of cells: A review. arXiv 2021, arXiv:2101.01866.
  32. Rivera, E.C.; Swerdlow, J.J.; Summerscales, R.L.; Uppala, P.P.T.; Maciel Filho, R.; Neto, M.R.; Kwon, H.J. Data-driven modeling of smartphone-based electrochemiluminescence sensor data using artificial intelligence. Sensors 2020, 20, 625.
  33. Khodadadian, A.; Parvizi, M.; Heitzinger, C. An adaptive multilevel Monte Carlo algorithm for the stochastic drift–diffusion–Poisson system. Comput. Methods Appl. Mech. Eng. 2020, 368, 113163.
  34. Khodadadian, A.; Taghizadeh, L.; Heitzinger, C. Optimal multilevel randomized quasi-Monte-Carlo method for the stochastic drift–diffusion-Poisson system. Comput. Methods Appl. Mech. Eng. 2018, 329, 480–497.
  35. Baumgartner, S.; Heitzinger, C. Existence and local uniqueness for 3d self-consistent multiscale models of field-effect sensors. Commun. Math. Sci. 2012, 10, 693–716.
  36. Cockburn, B.; Triandaf, I. Convergence of a finite element method for the drift-diffusion semiconductor device equations: The zero diffusion case. Math. Comput. 1992, 59, 383–401.
  37. Chen, Z.; Cockburn, B. Analysis of a finite element method for the drift-diffusion semiconductor device equations: The multidimensional case. Numer. Math. 1995, 71, 1–28.
  38. Zlámal, M. Finite element solution of the fundamental equations of semiconductor devices. I. Math. Comput. 1986, 46, 27–43.
  39. Zhang, J.; Vrugt, J.A.; Shi, X.; Lin, G.; Wu, L.; Zeng, L. Improving Simulation Efficiency of MCMC for Inverse Modeling of Hydrologic Systems with a Kalman-Inspired Proposal Distribution. Water Resour. Res. 2020, 56, e2019WR025474.
  40. Bebis, G.; Georgiopoulos, M. Feed-forward neural networks. IEEE Potentials 1994, 13, 27–31.
  41. Svozil, D.; Kvasnicka, V.; Pospichal, J. Introduction to multi-layer feed-forward neural networks. Chemom. Intell. Lab. Syst. 1997, 39, 43–62.
  42. Sazli, M.H. A brief review of feed-forward neural networks. Commun. Fac. Sci. Univ. Ank. Ser. A2-A3 Phys. Sci. Eng. 2006, 50.
  43. Dolinsky, T.J.; Czodrowski, P.; Li, H.; Nielsen, J.E.; Jensen, J.H.; Klebe, G.; Baker, N.A. PDB2PQR: Expanding and upgrading automated preparation of biomolecular structures for molecular simulations. Nucleic Acids Res. 2007, 35, W522–W525.
  44. Li, H.; Robertson, A.D.; Jensen, J.H. Very fast empirical prediction and rationalization of protein pKa values. Proteins Struct. Funct. Bioinform. 2005, 61, 704–721.
  45. Søndergaard, C.R.; Olsson, M.H.; Rostkowski, M.; Jensen, J.H. Improved treatment of ligands and coupling effects in empirical calculation and rationalization of pKa values. J. Chem. Theory Comput. 2011, 7, 2284–2295.
  46. Olsson, M.H.; Søndergaard, C.R.; Rostkowski, M.; Jensen, J.H. PROPKA3: Consistent treatment of internal and surface residues in empirical pKa predictions. J. Chem. Theory Comput. 2011, 7, 525–537.
  47. Bulyha, A.; Heitzinger, C. An algorithm for three-dimensional Monte-Carlo simulation of charge distribution at biofunctionalized surfaces. Nanoscale 2011, 3, 1608–1617.
  48. Heitzinger, C.; Liu, Y.; Mauser, N.J.; Ringhofer, C.; Dutton, R.W. Calculation of fluctuations in boundary layers of nanowire field-effect biosensors. J. Comput. Theor. Nanosci. 2010, 7, 2574–2580.
  49. Punzet, M.; Baurecht, D.; Varga, F.; Karlic, H.; Heitzinger, C. Determination of surface concentrations of individual molecule-layers used in nanoscale biosensors by in situ ATR-FTIR spectroscopy. Nanoscale 2012, 4, 2431–2438.
Figure 1. A schematic cross-section of a SiNW-FET depicting the subdomains, i.e., the transducer $\Omega_{\mathrm{Si}}$, the SiO$_2$ insulator ($\Omega_{\mathrm{ox}}$), the aqueous solution $\Omega_{\mathrm{liq}}$, the binding of the target molecules to the immobilized receptor molecules ($\Omega_{\mathrm{mol}}$), and the boundary conditions.
Figure 2. Bayesian inversion using EnKF-MCMC to identify the unknown material parameters, where the Scharfetter–Gummel iteration is used to solve the coupled system of equations.
Figure 3. The structure of multilayer feed-forward neural networks (MFNNs).
Figure 4. The back-propagation algorithm for the adjustment of neuron weights.
Figure 5. A 3D schematic of the sensor device including the dimensions and tetrahedral meshes for the discretization. All values are in nanometers.
Figure 6. A comparison between the experimental [20] and simulation current.
Figure 7. The posterior density of doping concentration (left) and surface charge density (right) using EnKF-MCMC. The units are $C_{\mathrm{dop}}$ (cm$^{-3}$) and $\rho$ (q/nm$^2$).
Figure 8. The structure of the MFNNs algorithm.
Figure 9. The performance of the MFNNs algorithm for Case 1 (a), Case 2 (b), and Case 3 (c). In the first column, the desired trajectories (shown in blue) are compared with the MFNN output (shown in red). In the second column, we have the relative MSE, and the regression test is given in the third column.
Figure 10. The performance of the MFNNs algorithm for Case 4 (a) and Case 5 (b). In the first column, the desired trajectories (shown in blue) are compared with the MFNN output (shown in red). In the second column, we have the relative MSE, and the regression test is given in the third column.
Table 1. The computational features and the results of the Bayesian inversion.

| Parameter | Min | Max | EnKF (Median) | True Values | Acceptance Rate |
|---|---|---|---|---|---|
| $C_{\mathrm{dop}}$ (cm$^{-3}$) | $1 \times 10^{15}$ | $5 \times 10^{16}$ | $9.4 \times 10^{15}$ | $1 \times 10^{16}$ | 91% |
| $\rho$ (q/nm$^2$) | −5 | 1 | −1.55 | −1.5 | 86% |
Table 2. The range of parameters used to compute the electrical current in different cases.

| Cases | Inputs | $V_g$ [V] | SiO$_2$ [nm] | $N_W$ [nm] | $C_{\mathrm{dop}}$ [cm$^{-3}$] | $N_H$ [nm] |
|---|---|---|---|---|---|---|
| Case 1 | 1 | $U(1, 5)$ | 8 | 100 | $1 \times 10^{16}$ | 50 |
| Case 2 | 2 | $U(1, 5)$ | $U(5, 15)$ | 100 | $1 \times 10^{16}$ | 50 |
| Case 3 | 3 | $U(1, 5)$ | $U(5, 15)$ | $U(80, 120)$ | $1 \times 10^{16}$ | 50 |
| Case 4 | 4 | $U(1, 5)$ | $U(5, 15)$ | $U(80, 120)$ | $U(1 \times 10^{15}, 5 \times 10^{16})$ | 50 |
| Case 5 | 5 | $U(1, 5)$ | $U(5, 15)$ | $U(80, 120)$ | $U(1 \times 10^{15}, 5 \times 10^{16})$ | $U(40, 60)$ |
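
As an illustration of how the input ranges of Table 2 could be turned into network inputs, the following sketch draws independent uniform samples for Case 5. The dictionary `case5_ranges`, the helper `sample_inputs`, the sample count of 1000, and the rescaling to $[0, 1]$ are hypothetical choices made for this example and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

# (low, high) bounds of the uniform distributions listed for Case 5.
case5_ranges = {
    "V_g [V]": (1.0, 5.0),
    "SiO2 [nm]": (5.0, 15.0),
    "N_W [nm]": (80.0, 120.0),
    "C_dop [cm^-3]": (1e15, 5e16),
    "N_H [nm]": (40.0, 60.0),
}

def sample_inputs(ranges, n_samples, rng):
    """Draw n_samples rows, one column per parameter, each from U(low, high)."""
    lows = np.array([low for low, _ in ranges.values()])
    highs = np.array([high for _, high in ranges.values()])
    return rng.uniform(lows, highs, size=(n_samples, len(ranges)))

samples = sample_inputs(case5_ranges, 1000, rng)  # sample count is assumed

# Rescale every column to [0, 1] so that parameters of very different
# magnitude (volts vs. cm^-3) enter the network on a comparable scale.
lows = np.array([low for low, _ in case5_ranges.values()])
highs = np.array([high for _, high in case5_ranges.values()])
scaled = (samples - lows) / (highs - lows)

print(samples.shape)
```

Cases 1 to 4 follow from the same helper by replacing a range with a fixed value, for example by setting both bounds of `"N_H [nm]"` to 50 for Case 4 before sampling and skipping the rescaling of that column.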
Table 3. The features of the MFNNs algorithm including the MSE of training and test processes.

| Case | No. Neurons in 1st Hidden Layer | No. Neurons in 2nd Hidden Layer | MSE-Train | MSE-Test | No. Epochs | $\eta$ |
|---|---|---|---|---|---|---|
| 1 | 10 | 4 | 0.00057 | 0.00061 | 1000 | 0.1 |
| 2 | 20 | 7 | 0.00147 | 0.00184 | 2000 | 0.2 |
| 3 | 20 | 7 | 0.00181 | 0.000836 | 4000 | 0.2 |
| 4 | 20 | 7 | 0.000842 | 0.000517 | 8000 | 0.2 |
| 5 | 20 | 7 | 0.0011 | 0.000058 | 10,000 | 0.2 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Khodadadian, A.; Parvizi, M.; Teshnehlab, M.; Heitzinger, C. Rational Design of Field-Effect Sensors Using Partial Differential Equations, Bayesian Inversion, and Artificial Neural Networks. Sensors 2022, 22, 4785. https://doi.org/10.3390/s22134785
