
1 Introduction

The need to search for specific information on the ever expanding Internet has led to the development of Web search engines. While their benefit is a direct connection between users and the information or products sought, any search outcome is influenced by commercial interests as well as by the users’ own ambiguity in formulating their requests or queries. Travel services are one example. The Internet has made the travel industry’s information and services accessible in real time; customers can purchase flight tickets, hotels and holiday packages online. Distribution costs have been reduced by a shorter value chain; however, businesses that do not appear in the top positions of the search results may lose potential customers. A similar scenario occurs in academic search; the Internet has democratized academic publication. Authors can upload their work to their personal Web pages, bypassing the traditional model of journal peer review, and they have a vested interest in placing their publications in top search positions in order to reach a larger audience and be cited more. In both examples ranking algorithms are essential because they decide relevance: they make information visible or hidden to customers and users. Under this model, Web search engines or recommender systems may be tempted to artificially promote the results of specific businesses for a fee, while authors or businesses may be tempted to manipulate ranking algorithms by “optimizing” the presentation of their work or products. The main consequence is that irrelevant results may be shown in top positions while relevant ones are “hidden” at the very bottom of the search list.

In order to address these search issues, this paper proposes an Intelligent Internet Search Assistant (ISA) that acts as an interface between an individual user’s query and the different search engines. Our ISA acquires a query from the user and retrieves results from one or several search engines, assigning one neuron to each Web result dimension. Result relevance is calculated by applying our cost function, which divides a query into a multidimensional vector and weights its dimension terms with different relevance parameters. Our ISA adapts to and learns the perceived user’s interest and reorders the retrieved snippets based on our dimension relevant centre point; it learns result relevance through an iterative process in which the user directly evaluates the listed results. We evaluate and compare its performance against other search engines with a newly proposed quality definition that combines both relevance and rank. We have also included two learning algorithms: Gradient Descent, which learns the centre of relevant dimensions, and Reinforcement Learning, which updates the network weights by rewarding relevant dimensions and punishing irrelevant ones. We have validated our ISA against other Web search engines and metasearch engines using travel services and open user queries, and we have analysed the Gradient Descent and Reinforcement Learning algorithms in terms of result relevance and learning speed.

We describe the application of neural networks to Web search in Sect. 2. We define our Intelligent Internet Search Assistant’s mathematical model in Sect. 3 and validate it against other Web search engines in Sect. 4. Finally, we present our conclusions in Sect. 5.

2 Related Work

The ability of neural networks to learn iteratively from different inputs to acquire the desired outputs, as a mechanism of adaptation to users’ interests in order to provide relevant answers, has already been applied to the World Wide Web and recommender systems.

F. Scarselli et al. [1] and M. Chau et al. [2] use a neural network by assigning a neuron to each Web page; they create a graph where the neural links are the equivalent of the hyperlinks. S. Bermejo et al. [3] use an approach similar to ours, the allocation of one neuron per Web search result; the main difference is that their network is trained to cluster results by meaning. C. Burges et al. [4] define RankNet, which uses neural networks to evaluate Web sites by training the network on query-document pairs. B. Shu et al. [5] retrieve results from different Web search engines and train the network under the assumption that a result in a top position is relevant. J. Boyan et al. [6] use reinforcement learning to rank Web pages using their HTML properties and the hyperlink connections between them. X. Wang et al. [7] use a back propagation neural network whose input nodes correspond to a specific quantified user profile and whose single output node is the probability that the user would consider the Web page relevant.

3 The Intelligent Internet Search Assistant Model

The search assistant we design is based on the Random Neural Network (RNN) [8–10]. This is a biologically inspired spiking recurrent stochastic model for neural networks. Its main analytical properties are its “product form” and the existence of a unique network steady-state solution. The RNN represents more closely how signals are transmitted in many biological neural networks, where they actually travel as spikes or impulses rather than as analogue signal levels. It has been used in different applications, including network routing with cognitive packet networks, using reinforcement learning, which requires the search for paths that meet certain pre-specified quality of service requirements [11, 17], the search for exit routes for evacuees in emergency situations [12, 13], pattern-based search for specific objects [14], video compression [15], and image texture learning and generation [16].

3.1 Search Model

In our own application of the RNN, the search for information or for some meaning requires us to specify certain elements: an M-dimensional universe of X entities or ideas to be searched, a high level query that specifies the N properties or concepts requested by a user, and a method that searches and selects Y entities from the universe, showing the first Z results to the user according to an algorithm or rule. Each entity or concept in the universe is distinct from the others in some recognizable way; for instance, two entities may differ only in the date or time-stamp that records when they were last stored, or in their ownership or origin. On the other hand, we consider concepts to be distinct if they carry any different meaning, even if they are identical with respect to a user’s query.

We consider the universe we are searching within as a relation U that consists of a set of X M-tuples, U = {v1, v2 … vX}, where vi = (li1, li2 … liM) and li are the M different attributes for i = 1, 2 … X. The relation U is a very large relation consisting of M >> N attributes. The key concept in this paper is that a query can be defined as Rt(n(t)) = (Rt(1), Rt(2), …, Rt(n(t))), where n(t) is a variable N-dimensional attribute vector with 1 < N < M and t is the search iteration, t > 0; n(t) is variable so that attributes can be added or removed based on their relevance as the search progresses, i.e. as t increases. Each Rt(n(t)) takes its values from the attributes within the domain D(n(t)), where D is the corresponding domain that forms the universe U. Thus D(n(t)) is a set of properties or meanings based on words or integers, but possibly also words in another language, or a set of icons, images or sounds.

The answer A to the query Rt(n(t)) is a set of Y M-tuples A = {v1, v2 … vY}, where vo = (lo1, lo2 … loM) and lo are the M different attributes for o = 1, 2 … Y. Our Intelligent Internet Search Assistant shows the user only the first set of Z tuples that have the highest neuron potentials among the set of Y tuples. The neuron potential that represents the relevance of each M-tuple vo is calculated at each iteration t. The user, or the high level query itself, is limited mainly by two factors: the user’s lack of information about all the attributes that form the universe U of entities and ideas, and the user’s lack of precise knowledge about what he is looking for.
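
To make these definitions concrete, the following minimal Python sketch models the answer set and the top-Z selection rule; the class and function names are illustrative and not part of the paper’s implementation.

```python
# A minimal model of the search elements (illustrative names, not the
# paper's implementation).
from dataclasses import dataclass
from typing import List

@dataclass
class Result:
    """One M-tuple v_o of the answer set A."""
    attributes: List[float]   # (l_o1, l_o2, ..., l_oM)
    potential: float = 0.0    # neuron potential representing relevance

def top_z(answer: List[Result], z: int) -> List[Result]:
    """Show the user only the Z tuples with the highest neuron potentials."""
    return sorted(answer, key=lambda r: r.potential, reverse=True)[:z]
```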

3.2 Result Cost Function

We consider the universe U to be formed of all the results that can be searched. We assign each result provided by a search engine to an M-tuple vo of the answer set A and calculate the result relevance with the cost function described in this section. The query Rt(n(t)) is a variable N-dimensional vector that specifies the attributes the user considers relevant; the number of dimensions of the attribute vector n(t) varies as the iteration t increases. Our Intelligent Internet Search Assistant associates an M-tuple vo with each result provided by the search engine, creating an answer set A of Y M-tuples; search engines select their results from the universe U. We apply our cost function to each result or M-tuple vo of the answer set A, treating each vo as an M-dimensional vector. The cost function is first calculated on the relevant N attributes the user introduced in the high level query, R1(n(1)), within the domain D(n(1)); as the search progresses to Rt(n(t)), attributes may be added or removed based on the perceived relevance within the domain D’(n(t)). We calculate the overall Result Score, RS, by measuring the relationship between the values of its different attributes:

$$ \text{RS} = \text{RV} \cdot \text{HW} $$
(1)

where RV is the Result Value, which measures the result relevance, and HW is the Homogeneity Weight. The Homogeneity Weight rewards results whose relevance or scores are dispersed across their attributes. This parameter is also based on the idea that the first dimensions or attributes of the user query Rt(n(t)) are more important than the last ones:

$$ \text{HW} = \frac{\sum_{n=1}^{N} \text{HF}[n]}{N} $$
(2)

where HF[n], the homogeneity factor, is an N-dimensional vector associated with the result and n is the attribute index from the query Rt(n(t)):

$$ \text{HF}[n] = \begin{cases} \dfrac{N - n}{N} & \text{if } \text{SD}[n] > 0 \\[2ex] 0 & \text{if } \text{SD}[n] = 0 \end{cases} $$
(3)

We define the Score Dimension SD[n] as an N-dimensional vector that represents the attribute values of each result or M-tuple vo in relation to the query Rt(n(t)). The Result Value (RV) is the sum of each dimension’s individual score:

$$ \text{RV} = \sum_{n=1}^{N} \text{SD}[n] $$
(4)

where n is the attribute index from the query Rt(n(t)). Each dimension of the Score Dimension vector SD[n] is calculated independently for each n-attribute value that forms the query Rt(n(t)):

$$ \text{SD}[n] = S \cdot \text{PPW} \cdot \text{RPW} \cdot \text{DPW} $$
(5)

We consider only three types of domains of interest: words, numbers (as for dates and times) and prices. S is the score calculated depending on whether the domain of the attribute is a word (WS), a number (NS) or a price (PS). If the domain D(n) is a word, our ISA calculates the Word Score (WS) following the formula:

$$ S = \frac{\text{WR}}{\text{NW}} $$
(6)

where WR is 1 if the word of the n-attribute of the query Rt(n(t)) is contained in the search result and 0 otherwise, and NW is the number of words in the search result. If the domain D(n) is a number, our ISA selects the best Number Score (NS) among the numbers contained in the search result, the one that maximizes the cost function:

$$ S = \frac{1 - \dfrac{\left| \text{DV} - \text{RV} \right|}{\left| \text{DV} \right| + \left| \text{RV} \right|}}{\text{NN}} $$
(7)

where DV is the value of the n-attribute of the query Rt(n(t)), RV is the value of a number in the result and NN is the total number of numbers in the result. If the domain D(n) is a price, our ISA chooses the best Price Score (PS) among the prices in the result, the one that maximizes the cost function:

$$ S = \frac{\text{DV} / \text{RV}}{\text{NP}} $$
(8)

where DV is the value of the n-attribute of the query Rt(n(t)), RV is the value of a price in the result and NP is the total number of prices in the result. We penalize search results that provide unnecessary information by dividing the score by the total number of elements in the Web result. The Score Dimension vector SD[n] is weighted according to different relevance factors:

$$ \text{SD}[n] = S \cdot \text{PPW} \cdot \text{RPW} \cdot \text{DPW} $$
(9)

The Position Parameter Weight (PPW) is based on the idea that an attribute value shown within the first positions of the search result is more relevant than one shown at the end:

$$ \text{PPW} = \frac{\text{NC} - \text{DVP}}{\text{NC}} $$
(10)

where NC is the number of characters in the result and DVP is the position within the result where the value of the dimension appears. The Relevance Parameter Weight (RPW) incorporates the user’s perception of relevance by rewarding the first attributes of the query Rt(n(t)) as highly desirable and penalising the last ones:

$$ \text{RPW} = 1 - \frac{\text{PD}}{N} $$
(11)

where PD is the position of the n-attribute in the query Rt(n(t)) and N is the total number of dimensions of the query vector Rt(n(t)). The Dimension Parameter Weight (DPW) incorporates the observed user relevance of the domain values D(n(t)) by giving a higher score to the domain type the user has filled in most in the query:

$$ \text{DPW} = \frac{\text{NDT}}{N} $$
(12)

where NDT is the number of dimensions with the same domain (word, number or price) in the query Rt(n(t)) and N is the total number of dimensions of the query vector Rt(n(t)). We assign this final Result Score (RS) value to each M-tuple vo of the answer set A. Our ISA uses this value to reorder the answer set A of Y M-tuples, showing the user the first set of Z results with the highest potential values.
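
As an illustration, the following Python sketch computes the cost function of Eqs. (1)-(12) for one result; the tokenisation and the handling of empty results are our own assumptions, since the paper does not specify them.

```python
# A hedged sketch of the cost function, Eqs. (1)-(12). Tokenisation and
# empty-result handling are our assumptions; the paper does not fix them.
from typing import List

def word_score(term: str, words: List[str]) -> float:
    """Eq. (6): S = WR / NW, WR = 1 if the query word is in the result."""
    wr = 1.0 if term in words else 0.0
    return wr / len(words)

def number_score(dv: float, numbers: List[float]) -> float:
    """Eq. (7): best-matching number in the result, penalised by NN."""
    nn = len(numbers)
    return max((1 - abs(dv - rv) / (abs(dv) + abs(rv))) / nn
               for rv in numbers) if nn else 0.0

def price_score(dv: float, prices: List[float]) -> float:
    """Eq. (8): best price ratio in the result, penalised by NP."""
    np_ = len(prices)
    return max((dv / rv) / np_ for rv in prices) if np_ else 0.0

def weighted_sd(s: float, dvp: int, nc: int, pd: int, n_dim: int,
                ndt: int) -> float:
    """Eqs. (9)-(12): SD[n] = S * PPW * RPW * DPW."""
    ppw = (nc - dvp) / nc   # Eq. (10): earlier in the snippet scores higher
    rpw = 1 - pd / n_dim    # Eq. (11): earlier query attributes weigh more
    dpw = ndt / n_dim       # Eq. (12): dominant domain types weigh more
    return s * ppw * rpw * dpw

def result_score(sd: List[float]) -> float:
    """Eqs. (1)-(4): RS = RV * HW from the attribute scores SD[1..N]."""
    n_dim = len(sd)
    rv = sum(sd)                                           # Eq. (4)
    hf = [(n_dim - (i + 1)) / n_dim if sd[i] > 0 else 0.0  # Eq. (3),
          for i in range(n_dim)]                           # 1-based n
    hw = sum(hf) / n_dim                                   # Eq. (2)
    return rv * hw                                         # Eq. (1)
```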

3.3 User Iteration

Based on the answer set A, the user can now act as an intelligent critic and select a subset of P relevant results, CP, of A. CP is a set that consists of P M-tuples, CP = {v1, v2 … vP}, where vp is a vector of M dimensions, vp = (lp1, lp2 … lpM), and lp are the M different attributes for p = 1, 2 … P. Similarly, the user can also select a subset of Q irrelevant results, CQ of A, CQ = {v1, v2 … vQ}, where vq is a vector of M dimensions, vq = (lq1, lq2 … lqM), and lq are the M different attributes for q = 1, 2 … Q. Based on this user iteration, our Intelligent Internet Search Assistant provides the user with a different answer set A of Z M-tuples, reordered by MD, the minimum distance to the Relevant Centre Point for the selected results, following the formula:

$$ \text{RCP}[n] = \frac{\sum_{p=1}^{P} \text{SD}_{p}[n]}{P} = \frac{\sum_{p=1}^{P} l_{pn}}{P} $$
(13)

where P is the number of relevant results selected, n is the attribute index from the query Rt(n(t)) and SDp[n] is the Score Dimension vector associated with the result or M-tuple vp formed of lpn attributes. An equivalent equation applies to the calculation of the Irrelevant Centre Point. Our Intelligent Internet Search Assistant reorders the retrieved set of Y M-tuples, showing the user only the first Z M-tuples, based on the lowest value of MD, the difference between their distances to the Relevant Centre Point (RD) and the Irrelevant Centre Point (ID) respectively:

$$ \text{MD} = \text{RD} - \text{ID} $$
(14)

where MD is the result distance, RD is the Relevant Distance and ID is the Irrelevant Distance. The Relevant Distance (RD) of each result or M-tuple vo is formulated as:

$$ \text{RD} = \sqrt{\sum_{n=1}^{N} \left( \text{SD}[n] - \text{RCP}[n] \right)^{2}} $$
(15)

where SD[n] is the Score Dimension vector of the result or M-tuple vo and RCP[n] is the coordinate of the Relevant Centre Point. An equivalent equation applies to the calculation of the Irrelevant Distance. We are therefore presenting an iterative search process that learns and adapts to the perceived user relevance based on the dimensions or attributes the user introduced in the initial query.
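
A minimal sketch of this relevance-feedback step, Eqs. (13)-(15), assuming the Score Dimension vectors are held as plain lists of floats:

```python
# A minimal sketch of the relevance-feedback reordering, Eqs. (13)-(15).
import math
from typing import List

def centre_point(selected: List[List[float]]) -> List[float]:
    """Eq. (13): per-attribute mean over the user-selected results."""
    p = len(selected)
    return [sum(sd[n] for sd in selected) / p
            for n in range(len(selected[0]))]

def distance(sd: List[float], centre: List[float]) -> float:
    """Eq. (15): Euclidean distance to a centre point."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sd, centre)))

def reorder(results: List[List[float]], relevant: List[List[float]],
            irrelevant: List[List[float]], z: int) -> List[List[float]]:
    """Eq. (14): sort ascending by MD = RD - ID and show the first Z."""
    rcp = centre_point(relevant)    # Relevant Centre Point
    icp = centre_point(irrelevant)  # Irrelevant Centre Point
    return sorted(results,
                  key=lambda sd: distance(sd, rcp) - distance(sd, icp))[:z]
```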

3.4 Dimension Learning

The answer set A to the query R1(n(1)) is based on the N-dimensional query introduced by the user; however, results are formed of M dimensions, so the subset of results the user has considered relevant may contain other hidden relevant concepts that the user did not consider in the original query. We take the domain D(m), the M attributes from which our universe U is formed, to be the different independent words that form the set of Y results retrieved from the search engines. Our cost function is expanded from the N attributes defined in the query R1(n(1)) to the M attributes that form the retrieved results. Our Score Dimension vector, SD[m], is now based on M dimensions, and an analogous attribute expansion is applied to the Relevant Centre Point calculation, RCP[m]. The query R1(n(1)) is based on the N-dimensional vector introduced by the user, but the answer set A consists of Y M-tuples. The user, based on the presented set A, selects a subset of P relevant results, CP, and a subset of Q irrelevant results, CQ.

Let us consider CP as a set that consists of P M-tuples, CP = {v1, v2 … vP}, where vp is a vector of M dimensions, vp = (lp1, lp2 … lpM), and lp are the M different attributes for p = 1, 2 … P. The M-dimensional vector Dimension Average, DA[m], is the average value of the m-attributes over the selected P relevant results:

$$ \text{DA}[m] = \frac{\sum_{p=1}^{P} \text{SD}_{p}[m]}{P} = \frac{\sum_{p=1}^{P} l_{pm}}{P} $$
(16)

where P is the number of relevant results selected, m is the attribute index of the relation U and SDp[m] is the Score Dimension vector associated with the result or M-tuple vp formed of lpm attributes. We define ADV as the Average Dimension Value of the M-dimensional vector DA[m]:

$$ \text{ADV} = \frac{\sum_{m=1}^{M} \text{DA}[m]}{M} $$
(17)

where M is the total number of attributes that form the relation U. The correlation vector σ[m] is the average difference between each result’s dimension values and the average vector:

$$ \sigma[m] = \frac{\sum_{p=1}^{P} \left( \text{SD}_{p}[m] - \text{DA}[m] \right)}{P} = \frac{\sum_{p=1}^{P} \left( l_{pm} - \text{DA}[m] \right)}{P} $$
(18)

where P is the number of relevant results selected, m is the attribute index of the relation U and SDp[m] is the Score Dimension vector associated with the result or M-tuple vp formed of lpm attributes. We define C as the average correlation value over the M dimensions of the vector σ[m]:

$$ C = \frac{\sum_{m=1}^{M} \sigma[m]}{M} $$
(19)

where M is the total number of attributes that form the relation U. We consider an m-attribute relevant if its associated Dimension Average value DA[m] is larger than the average dimension value ADV and its correlation value σ[m] is smaller than the average correlation C. We have therefore changed the relevant attributes of the searched entities or ideas by correlating the error value of their concepts or properties, represented as attributes or dimensions. In the next iteration, the query R2(n(2)) is formed by the attributes our ISA has considered relevant. The answer to the query R2(n(2)) is a different set A of Y M-tuples. This process iterates until there are no new relevant results to show the user.
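
The dimension-learning rule of Eqs. (16)-(19) can be sketched as follows. Note that Eq. (18), as printed, averages signed deviations, which sum to zero by construction; the sketch assumes absolute deviations were intended.

```python
# A sketch of the dimension-learning rule, Eqs. (16)-(19). Eq. (18) as
# printed averages signed deviations (identically zero), so absolute
# deviations are assumed here.
from typing import List

def relevant_dimensions(selected: List[List[float]]) -> List[int]:
    """Return the m-attributes kept for the next query iteration."""
    p, m_dim = len(selected), len(selected[0])
    # Eq. (16): Dimension Average DA[m] over the P relevant results
    da = [sum(sd[m] for sd in selected) / p for m in range(m_dim)]
    # Eq. (17): Average Dimension Value ADV
    adv = sum(da) / m_dim
    # Eq. (18): deviation of each dimension from its average
    # (assumption: absolute value)
    sigma = [sum(abs(sd[m] - da[m]) for sd in selected) / p
             for m in range(m_dim)]
    # Eq. (19): average correlation value C
    c = sum(sigma) / m_dim
    # Keep attributes that score above average and deviate below average
    return [m for m in range(m_dim) if da[m] > adv and sigma[m] < c]
```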

3.5 Gradient Descent Learning

Gradient Descent learning is based on adaptation to the perceived user interests or understanding of meaning by correlating the attribute values of each result to extract similar meanings and cancel superfluous ones. The ISA Gradient Descent learning algorithm is based on a recurrent model. The inputs i = {i1, …, iP} are the M-tuples vp corresponding to the selected relevant result subset CP, and the desired outputs y = {y1, …, yP} are the same values as the inputs. Our ISA then obtains the learned Random Neural Network weights, calculates the relevant dimensions and finally reorders the results according to the minimum distance to the new Relevant Centre Point focused on the relevant dimensions.
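
Since the RNN training equations are not detailed here, the following sketch illustrates only the autoassociative set-up with a plain linear model trained by gradient descent; it is a stand-in under that assumption, not the RNN’s product-form dynamics.

```python
# A stand-in sketch of the autoassociative training step: a plain linear
# model trained by gradient descent so that output ~= input. This is an
# assumption for illustration, not the RNN's product-form dynamics.
import numpy as np

def train_autoassociator(x: np.ndarray, lr: float = 0.01,
                         epochs: int = 500) -> np.ndarray:
    """x holds the P selected results as rows of M attribute values."""
    p, m = x.shape
    w = np.zeros((m, m))            # network weights to be learned
    for _ in range(epochs):
        y = x @ w                   # network output for the inputs
        grad = x.T @ (y - x) / p    # gradient of the mean squared error
        w -= lr * grad              # gradient-descent step
    return w
```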

3.6 Reinforcement Learning

The external interaction with the environment is provided when the user selects the relevant result set CP. Reinforcement Learning adapts to the perceived user relevance by incrementing the value of relevant dimensions and reducing it for irrelevant ones; it modifies the values of the m attributes of the results, accentuating hidden relevant meanings and attenuating irrelevant properties. We associate the Random Neural Network weights with the answer set A: W = A. Our ISA updates the network weights W, rewarding the result’s relevant attributes by:

$$ w(p, m) = l_{pm}^{s-1} + l_{pm}^{s-1} \cdot \left( \frac{l_{pm}^{s-1}}{\sum_{m=1}^{M} l_{pm}^{s-1}} \right) $$
(20)

where p is the result or M-tuple vp formed of lpm attributes, m is the result attribute index, M is the total number of attributes and s is the iteration number. Our ISA also updates the network weights, punishing the result’s irrelevant attributes by:

$$ w(p, m) = l_{pm}^{s-1} - l_{pm}^{s-1} \cdot \left( \frac{l_{pm}^{s-1}}{\sum_{m=1}^{M} l_{pm}^{s-1}} \right) $$
(21)

where p is the result or M-tuple vp formed of lpm attributes, m is the result attribute index, M is the total number of attributes and s is the iteration number. Our ISA then recalculates the potential of each result based on the updated network weights and reorders them, showing the user the results with the highest potential or score.
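
A sketch of the reward and punishment updates of Eqs. (20) and (21), applied to the attribute values of a single result:

```python
# A sketch of the reinforcement updates, Eqs. (20) and (21): each
# attribute of a result is scaled up (reward) or down (punishment) in
# proportion to its share of the result's total attribute value.
from typing import List

def update_attributes(l_p: List[float], reward: bool) -> List[float]:
    """l_p holds the attributes l_pm of result p at iteration s - 1."""
    total = sum(l_p)
    if total == 0:
        return list(l_p)             # nothing to redistribute
    sign = 1.0 if reward else -1.0
    return [v + sign * v * (v / total) for v in l_p]
```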

4 Validation

The Intelligent Internet Search Assistant we have proposed emulates how Web search engines work by using a very similar interface to introduce and display information. We validate our ISA algorithm with a set of three different experiments. Users in the experiments can choose between the different Web search engines and the number N of results they would like to retrieve from each one. We propose the following formula to measure Web search quality; it is based on the idea that a better search engine provides a list with more relevant results in the top positions. In a list of N results, we assign a score of N to the first result and 1 to the last one; the proposed quality value is then the sum of the position scores of the selected results. Our definition of Quality, Q, is:

$$ Q = \sum_{i=1}^{Y} \text{RSE}_{i} $$
(22)

where RSEi is the rank of the result i in a particular search engine, with a value of N if the result is in the first position and 1 if it is in the last one, and Y is the total number of results selected by the user. The best Web search engine is the one with the largest Quality value. We define the normalized quality, \( \overline{Q} \), as the quality Q divided by its optimum value, which is reached when the user considers relevant all the results provided by the Web search engine. In this situation Y and N have the same value:

$$ \overline{Q} = \frac{Q}{N(N + 1)/2} $$
(23)

We define I as the quality improvement between a Web search engine and a reference:

$$ I = \frac{\text{QW} - \text{QR}}{\text{QR}} $$
(24)

where I is the Improvement, QW is the quality of the Web search engine and QR is the quality of the reference; we use the Quality of Google as QR in our validation exercise.
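
The three validation metrics, Eqs. (22)-(24), can be sketched as follows; selected_ranks is an illustrative name for the position scores RSEi of the user-selected results.

```python
# A sketch of the validation metrics, Eqs. (22)-(24); selected_ranks
# holds the position scores RSE_i of the selected results (N for first
# position, 1 for last).
from typing import List

def quality(selected_ranks: List[int]) -> int:
    """Eq. (22): Q is the sum of the selected results' position scores."""
    return sum(selected_ranks)

def normalised_quality(q: float, n: int) -> float:
    """Eq. (23): divide Q by the optimum N(N + 1)/2."""
    return q / (n * (n + 1) / 2)

def improvement(q_w: float, q_r: float) -> float:
    """Eq. (24): improvement of a search engine over the reference."""
    return (q_w - q_r) / q_r
```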

In our first experiment we asked our validators to search for different queries using only Google; ISA provides a set of reordered results from which the user selects the relevant ones. We show the average values over the 20 different queries: the average number of results retrieved by Google and the average number of results selected by the user. We report the normalized quality of Google and ISA together with the improvement of our algorithm over Google. In our second experiment, ISA provides a reordered list from which the user selects the relevant results; our ISA then reorders the results using the dimension relevant centre point, providing the user with another reordered result list from which the user again selects the relevant ones. We show the average values over the 16 different queries: the average number of results retrieved and the average number of results selected by the user. We also report the normalized quality of Google, ISA and ISA with the relevant centre iteration, including the improvement over Google in both scenarios. In our third experiment, validators can select the Web search engines from which they would like their results to be retrieved; as in our first experiment, the users select the relevant results, and our ISA combines the results retrieved from the different Web search engines selected. We present the average values over the 18 different queries and show the normalized quality of each selected Web search engine including our ISA; because users can choose any Web search engine, we do not report the improvement value, as there is no unique reference Web search engine (Table 1).

Table 1. Web search engine validation

4.1 ISA Learning

Users in the experiments can choose between Google and Bing and between the Gradient Descent and Reinforcement Learning types. Our ISA then collects the first 50 results from the selected Web search engine, reorders them according to its cost function and finally shows the user the first 20 results. We consider 50 results a good approximation of search depth, as more results would add clutter and irrelevance; 20 results is the average number of results a user reads before launching another search if no relevant one is found. ISA reorders results while learning in the two-step iterative process, showing only the best 20 results to the user. We present the average Quality values of the Web search engine and ISA for the 29 different queries searched by different users, together with the learning type and the Web search engine used. The first I represents the improvement of ISA over the Web search engine; the second I is between ISA iterations 2 and 1, and the third I is between ISA iterations 3 and 2 (Table 2).

Table 2. ISA learning validation

5 Conclusions

We have proposed a novel approach to Web search in which the user iteratively trains a neural network while looking for relevant results. We have also defined a different process: the application of the Random Neural Network as a biologically inspired algorithm to measure both user relevance and result ranking based on a predetermined cost function. Our Intelligent Internet Search Assistant generally performs slightly better than Google and other Web search engines; however, this evaluation may be biased because users tend to concentrate on the first results provided, which are the ones our algorithm shows. Our ISA adapts to and learns from the user’s previous relevance measurements, increasing its quality and improvement significantly within the first iteration. The Reinforcement Learning algorithm performs better than Gradient Descent: although Gradient Descent provides better quality on the first iteration, Reinforcement Learning outperforms it on the second one due to its higher learning rate, and both show only residual learning on their third iteration. Gradient Descent would be the preferred learning algorithm if only one iteration were required; Reinforcement Learning would be a better option in the case of two iterations. Three iterations are not recommended because the learning gain is only residual.