It integrates multiple sources of prior medical knowledge, including:
+
+- Clinical trial information.
+- Co-occurrence networks from literature citations.
+- Drug-target interactions.
+- Drug-indication importance.
+- Disease-drug distance.
+
+`labyrinth` emphasizes the importance of aligning computational models with intuitive human reasoning. Thus, it employs a human-like knowledge retrieval methodology to identify potential drug candidates for various diseases. `labyrinth` aims to strike a balance between predictive accuracy and model interpretability, as demonstrated by its robust performance across diverse diseases, evaluated using ROC-AUC metrics.
+
+![A simple schema of the labyrinth](man/figures/intro.jpg)
+
+
+## A notice on operating system compatibility
+
+We developed and tested `labyrinth` on Fedora Linux versions 38 and 39. While this package does not contain any operating system-specific code, it has not been tested on other operating systems. In theory, `labyrinth` should work on other Unix-like operating systems as long as the required dependencies are installed.
+
+We recommend installing these dependencies:
+
+- **R (≥ 4.3.0)**: We developed this R package using R version 4.3.3.
+- **Python**: Python is required for drawing plots in demos. It is recommended to have Python and `seaborn` installed, as the `reticulate` package will use the system's Python installation.
+- **OpenMP**: This package uses OpenMP for parallelization and multithreading if it is available. Having OpenMP installed can significantly improve performance.
+- **Intel oneAPI Math Kernel Library (oneMKL)**: This library can further enhance mathematical performance, especially on Intel processors. oneMKL is not required but highly recommended.
+
+If you encounter any issues while running this package on another operating system, please open an [issue](https://github.com/randef1ned/labyrinth/issues).
+
+
+## Before installation
+
+Before installation, we recommend installing Intel oneAPI Math Kernel Library (oneMKL) to optimize the computational performance of linear algebra.
+
+Windows users can download oneMKL from [Intel's website](https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl-download.html) and install it in the default directory. The default directory is: `C:\Program Files (x86)\Intel\oneAPI`.
+
+Debian and Ubuntu users can install oneMKL using apt from the non-free repo:
+
+``` bash
+# Install oneMKL version 2020.4.304-4
+sudo apt install intel-mkl-full
+```
+
+Or using the Intel repo:
+
+``` bash
+# Set up the repository and sign the entry
+wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB \
+| gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
+echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
+# Update the package list
+sudo apt update
+# Install the latest oneMKL (version 2024.1)
+sudo apt install intel-oneapi-mkl
+```
+
+Fedora users can install oneMKL using dnf:
+
+``` bash
+# Create dnf repository file
+tee > /tmp/oneAPI.repo << EOF
+[oneAPI]
+name=Intel® oneAPI repository
+baseurl=https://yum.repos.intel.com/oneapi
+enabled=1
+gpgcheck=1
+repo_gpgcheck=1
+gpgkey=https://yum.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
+EOF
+sudo mv /tmp/oneAPI.repo /etc/yum.repos.d
+# Install the latest oneMKL (version 2024.0)
+sudo dnf install intel-oneapi-mkl
+```
+
+
+## Installation
+
+Install `labyrinth` using:
+
+``` r
+install.packages(c('devtools', 'BiocManager'))
+remotes::install_github("randef1ned/labyrinth")
+```
+
+Or you can download the pre-built binary packages from [Releases](https://github.com/randef1ned/labyrinth/releases).
+
+
+## Usage
+
+Load the package using `library(labyrinth)`.
We provide a vignette for the package that can be called using: `vignette("labyrinth")`. Alternatively, you can view the online version on the [website](https://labyrinth.yinchun.su/articles/labyrinth) or on [GitHub](doc/labyrinth_knit.md). That is basically all you need to know.
+
+[This documentation](doc/training_knit.md) describes the contents of the project and the information needed to train the model used in it. The `tools/` folder contains all the code and scripts required for constructing your own model, so that you can understand the technical details. In addition, you can refer to [this documentation](doc/preface_knit.md) for the background and inspirations behind the overall workflow of `labyrinth`.
+
+
+## Changelog
+
+Changelog: [see this](NEWS.md)
+
+
diff --git a/_pkgdown.yml b/_pkgdown.yml
new file mode 100644
index 0000000..29b2682
--- /dev/null
+++ b/_pkgdown.yml
@@ -0,0 +1,3 @@
+url: https://labyrinth.yinchun.su
+template:
+  bootstrap: 5
diff --git a/doc/labyrinth_knit.md b/doc/labyrinth_knit.md
new file mode 100644
index 0000000..1553dd5
--- /dev/null
+++ b/doc/labyrinth_knit.md
@@ -0,0 +1,5 @@
+- [Introduction](#introduction)
+
+## Introduction
+
+ab
diff --git a/doc/preface_knit.md b/doc/preface_knit.md
new file mode 100644
index 0000000..190c5d1
--- /dev/null
+++ b/doc/preface_knit.md
@@ -0,0 +1,1529 @@
+- [1. From human memory to activation diffusion
+  network](#from-human-memory-to-activation-diffusion-network)
+- [2. Unique advantages of human cognitive
+  abilities](#unique-advantages-of-human-cognitive-abilities)
+- [3. Simulating Human Cognitive Abilities: The Way
+  Forward](#simulating-human-cognitive-abilities-the-way-forward)
+- [4. Simulating human knowledge representation using large-scale
+  medical knowledge
+  networks](#simulating-human-knowledge-representation-using-large-scale-medical-knowledge-networks)
+- [5. Conclusion](#conclusion)
+- [References](#references)
+
+## 1. From human memory to activation diffusion network
+
+Memory is a fundamental cognitive process that allows the brain to
+acquire, store, and recall information. It serves as a temporary storage
+system when sensory cues disappear ([Benjamin
+2007](#ref-benjamin_memory_2007)). Memory plays a crucial role in
+encoding, storing, retaining, and recalling everything from simple
+sensory data to complex knowledge and experiences. Additionally, memory
+is the basis for learning, planning, and decision-making ([Benjamin
+2007](#ref-benjamin_memory_2007); [Nussenbaum, Prentis, and Hartley
+2020](#ref-nussenbaum_memorys_2020)). Specifically, it enables us to
+learn from past experiences and simulate potential future outcomes,
+thereby influencing current behavior and future actions ([Schacter et
+al. 2012](#ref-schacter_future_2012)).
+
+The formation of memories, their recall, and reasoning based on them
+involve a combination of systems and physiological processes that allow
+humans to adapt well to their environment ([Schacter et al.
+2012](#ref-schacter_future_2012); [Camina and Güell
+2017](#ref-camina_neuroanatomical_2017); [Nairne and Pandeirada
+2016](#ref-nairne_adaptive_2016)). Memory formation comprises three
+stages: information perception, encoding, and storage ([Atkinson and
+Shiffrin 1968](#ref-atkinson_human_1968)). These stages correspond to
+three types of memory: (1) sensory memory ([Atkinson and Shiffrin
+1968](#ref-atkinson_human_1968)) briefly stores raw physical stimuli
+from primary receptors such as vision and hearing; (2) short-term memory
+(STM) ([Baddeley 2000](#ref-baddeley_episodic_2000)) involves the
+transient storage and manipulation of information, allowing individuals
+to temporarily memorize small amounts of data to perform current and
+future tasks; (3) long-term memory (LTM) ([Camina and Güell
+2017](#ref-camina_neuroanatomical_2017)) is the long-term storage of
+information, divided into episodic and implicit memory.
Episodic memory
+consists of knowledge processed and recalled at a conscious level, such
+as personal experiences and specialized knowledge, while implicit memory
+encompasses skills and habits expressed without conscious awareness,
+such as fear, riding a bicycle, heart rate regulation, and other
+conditioned reflexes ([Smith and Grossman
+2008](#ref-smith_multiple_2008)). In contrast to the limited-capacity
+sensory memory and STM, LTM is a more complex cognitive system with an
+unlimited capacity for long-term storage and retrieval of a wide range
+of information, including factual knowledge and personal experiences.
+
+Memory can also be categorized into contextual memory, referring to an
+individual’s personal experiences, and semantic memory, referring to
+textual knowledge about concepts ([Renoult et al.
+2019](#ref-renoult_knowing_2019)). Storing contextual and semantic
+knowledge allows individuals to construct new knowledge based on past
+experiences, facilitating their survival ([Kazanas and Altarriba
+2015](#ref-kazanas_survival_2015)). In addition to storing information
+and memories, LTM plays an important role in learning and reasoning. It
+can automatically relate relationships and attributes between objects
+([Nairne and Pandeirada 2016](#ref-nairne_adaptive_2016)), giving
+individuals the ability to use stored skills and concepts to make
+rational decisions by evaluating different choices in various
+environments and predicting possible outcomes ([Camina and Güell
+2017](#ref-camina_neuroanatomical_2017)).
+
+The formation and consolidation of LTM involve several brain regions,
+including the prefrontal lobe, associated with working memory and
+situational memory in LTM ([Blumenfeld and Ranganath
+2019](#ref-blumenfeld_lateral_2019)), and the temporal lobe, associated
+with semantic memory in LTM ([Simmons and Martin
+2009](#ref-simmons_anterior_2009)).
The hippocampus acts as a relay
+station for information ([Squire and Zola-Morgan
+1991](#ref-squire_medial_1991)) and can integrate situational memory
+into the semantic memory knowledge network stored in LTM ([Renoult et
+al. 2019](#ref-renoult_knowing_2019)). Consequently, even for the same
+concept or knowledge, the knowledge network formed by different
+individuals can vary.
+
+Each individual has unique experiences and backgrounds, leading to
+different understandings and reactions when interpreting the same
+information. LTM is stored as a vast and complex semantic network, which
+includes various types of interconnected nodes, such as concepts,
+memories, and experiences ([Collins and Loftus
+1975](#ref-collins_spreading_activation_1975)). Other kinds of memories
+or experiences are also integrated into this network; for example, an
+individual’s representation of abstract concepts (e.g., time) may be
+based on physical sensations ([Casasanto and Boroditsky
+2008](#ref-casasanto_time_2008)). In such cases, individuals associate
+time with their experiences, forming their knowledge network. This form
+of organization is called a semantic network or knowledge network,
+emphasizing how information is interconnected and organized according to
+meaning ([Lehmann 1992](#ref-lehmann_semantic_1992)).
+
+In this article, we use the term **semantic network** to represent the
+form of memory storage and organization in LTM, while **knowledge
+network** refers to an artificially built knowledge network. In a
+semantic network, concepts are represented as nodes, and
+concept-to-concept relationships are represented as edges between nodes,
+with edge weights indicating the strength of the association. A higher
+edge weight implies a closer relationship between two nodes, typically
+resulting in a higher recall rate after receiving a stimulus ([Anderson
+1983](#ref-anderson_spreading_1983)).
Learning and memorizing new
+knowledge and experiences involve building new edges or reinforcing
+existing ones. This organization facilitates information retrieval by
+enabling individuals to jump from one concept to another in the network
+and simultaneously activate neighboring nodes to form knowledge
+connections, even if there is no direct correlation between them
+([Lehmann 1992](#ref-lehmann_semantic_1992)). An interesting example is
+that in the semantic network of some police officers, black people
+are strongly associated with weapons, and this association is even
+stronger if the police officer is sleep-deprived ([James
+2018](#ref-james_stability_2018)). Another example is that an
+individual’s preference for Coca-Cola or McDonald’s is determined by
+their attitude and reflected in their semantic network ([Lee and Kim
+2013](#ref-lee_comparison_2013); [Karpinski, Steinman, and Hilton
+2005](#ref-karpinski_attitude_2005)).
+
+Homogeneous networks consist of nodes with similar properties or
+characteristics ([Mhatre and Rosenberg
+2004](#ref-mhatre_homogeneous_2004)). Nodes represent the same kind of
+element, and edges connect highly correlated nodes; together they form
+a homogeneous network. The two examples mentioned above
+illustrate that different individuals form different LTMs, and the
+memory contents stored in their LTMs do not satisfy homogeneity.
+
+Moving from one concept to an unrelated concept is impossible in a
+homogeneous network. The process by which individuals store their
+memories in the same semantic network and retrieve information from LTM
+is often described as spreading activation, and this network is also
+called the spreading activation network ([Sharifian and Samani
+1997](#ref-sharifian_hierarchical_1997)). In this network model, if an
+initial node is activated, this activation state spreads to other nodes
+along connected edges.
This diffusion process can quickly span multiple
+network layers, extensively activating the concepts and memories
+associated with the initial node. When a node receives activation above
+a certain threshold, it is fully activated, much like a neuron. Otherwise, it
+will not be activated. This may lead to the recall of specific memories,
+the formation of decisions, or the generation of problem-solving
+strategies.
+
+As mentioned earlier, some unrelated concepts in a semantic network may
+have relatively strong associations. The implicit association test (IAT)
+paradigm proposed by Greenwald can effectively test the edge connections
+between nodes in an individual’s semantic network ([Greenwald,
+McGhee, and Schwartz 1998](#ref-greenwald_measuring_1998); [Greenwald et
+al. 2009](#ref-greenwald_understanding_2009)). This paradigm tests the
+strength of association in the human brain between two nodes, i.e., the
+edge weights. The mechanisms of association and activation in activation
+diffusion networks depend on the strength of association between nodes.
+If the strength is high, the probability of activation is relatively
+high; if the strength is low, there is a higher probability of
+non-activation. This theory partly explains the forgetting phenomenon
+that occurs in human memory. Additionally, activation diffusion networks
+enable individuals to retrieve necessary information, reorganize their
+memories, and apply knowledge to the same or different situations. In
+summary, activation diffusion networks effectively account for the
+dynamic nature of memory retrieval and use.
+
+## 2. Unique advantages of human cognitive abilities
+
+Compared to computer programs, humans possess an ability to think about
+problems from different perspectives and exhibit greater flexibility in
+knowledge association ([Lehmann 1992](#ref-lehmann_semantic_1992)).
+Therefore, humans have the advantage of applying knowledge from one
+domain to another seemingly unrelated domain.
For example, concepts from
+biology can be transferred to economics ([Lawlor et al.
+2008](#ref-lawlor_mendelian_2008)), economic models to the field of
+electronic information ([Han et al. 2019](#ref-han_rational_2019)), and
+linguistic concepts to neuroscience ([Mayberry et al.
+2018](#ref-mayberry_neurolinguistic_2018)) and computer science ([H.
+Zhang et al. 2023](#ref-zhang_algorithm_2023)). This characteristic has
+led humans to create many cross-disciplinary fields, such as artificial
+intelligence, computational biology, neuropolitics, and bioinformatics.
+Humans can use intuition and creative thinking to solve problems, and
+this ability to think across domains allows them to make new connections
+between different areas, thereby building new disciplinary knowledge.
+
+The human brain contains approximately 100 billion neurons and roughly
+the same number of glial cells ([Bartheld, Bahney, and Herculano-Houzel
+2016](#ref-von_bartheld_search_2016)), with each neuron connected to
+thousands of others via synapses ([Herculano-Houzel
+2009](#ref-herculano_houzel_human_2009)). Neurons and glial cells form
+extremely complex networks. Neurons communicate via the all-or-none
+principle ([Pareti 2007](#ref-pareti_all_or_none_2007)), and glial cells
+play crucial roles in central nervous system formation, neuronal
+differentiation, synapse formation ([Allen and Lyons
+2018](#ref-allen_glia_2018)), regulation of neuroinflammatory immunity
+([Yang and Zhou 2019](#ref-yang_neuroinflammation_2019)), and
+neurological diseases like dementia ([Kim, Choi, and Yoon
+2020](#ref-kim_neuron_glia_2020)), in addition to their traditional
+supportive functions ([Wolosker et al.
+2008](#ref-wolosker_d_amino_2008)). Such complexity lays the foundation
+for an individual’s ability to process information, experience emotions,
+maintain awareness, and exhibit creativity.
+
+Drawing on the fundamentals of human cognition, artificial neural
+networks have been simulated using computers to mimic the brain’s
+information processing. They emulate human cognitive abilities to some
+extent, excelling in tasks like learning, decision-making, and pattern
+recognition that humans are naturally proficient at ([Agatonovic-Kustrin
+and Beresford 2000](#ref-agatonovic_kustrin_basic_2000); [Parisi
+1997](#ref-parisi_artificial_1997)). The simulation of human cognitive
+abilities has shown great potential ([Parisi
+1997](#ref-parisi_artificial_1997); [Zahedi
+1991](#ref-zahedi_introduction_1991)). However, the neurons used in deep
+learning and artificial neural networks are highly abstract, and the
+architecture is unable to account for the neurons ([Cichy and Kaiser
+2019](#ref-cichy_deep_2019)). Therefore, this field has focused more
+attention on fitting data rather than interpreting it ([Pichler and
+Hartig 2023](#ref-pichler_machine_2023)).
+
+Currently, the field of deep learning is more concerned with fitting
+data, using goodness of fit as its guiding criterion, than with
+integrating cognitive mechanisms discovered by
+neuroscience ([Chavlis and Poirazi 2021](#ref-chavlis_drawing_2021)).
+Much of the progress in deep learning over recent decades can be
+attributed to the application of backpropagation, often used with
+optimization methods to update weights and minimize the loss function.
+However, while neural networks and deep learning are biologically
+inspired approaches, the biological plausibility of backpropagation
+remains questionable, as activated neurons do not acquire features
+through backpropagation ([Whittington and Bogacz
+2019](#ref-whittington_theories_2019); [Lillicrap et al.
+2020](#ref-lillicrap_backpropagation_2020); [Aru, Suzuki, and Larkum
+2020](#ref-aru_cellular_2020)).
Currently, two mainstream learning
+mechanisms have been identified in the human brain using
+electrophysiological methods: Hebbian learning ([Munakata and Pfaffly
+2004](#ref-munakata_hebbian_2004)) and reinforcement learning
+([Botvinick et al. 2019](#ref-botvinick_reinforcement_2019)).
+Additionally, synaptic pruning may be related to learning ([Halassa and
+Haydon 2010](#ref-halassa_integrated_2010)), and epigenetic mechanisms
+also play an important role ([Changeux, Courrège, and Danchin
+1973](#ref-changeux_theory_1973)). Although Hebbian learning,
+reinforcement learning, and attempts to migrate human cognitive
+mechanisms have been applied in deep learning for years, they still
+cannot perfectly reproduce human learning features ([Volzhenin,
+Changeux, and Dumas 2022](#ref-volzhenin_multilevel_2022)).
+Comparatively, the energy consumption when using neural networks for
+reasoning is huge ([Desislavov, Martínez-Plumed, and Hernández-Orallo
+2023](#ref-desislavov_trends_2023)), in contrast to the human brain’s
+lower energy usage for training and reasoning ([Attwell and Laughlin
+2001](#ref-attwell_energy_2001)).
+
+Another example is the attention mechanism in neural networks, inspired
+by human attention ([Vaswani et al. 2023](#ref-vaswani_attention_2023)).
+Attention is a cognitive ability that selectively receives information
+with limited resources ([Nelson Cowan et al.
+2005](#ref-cowan_capacity_2005)). It’s a complex biological process
+involving multiple brain regions, encompassing not only selective
+attention but also coordinated consciousness, memory, and cognition.
+Selective attention mechanisms are associated with short-term memory,
+where only 3-5 chunks of original stimuli can enter during a single
+session ([N. Cowan 2001](#ref-cowan_magical_2001)), with attention
+lasting just a few seconds to minutes ([Polti, Martin, and Van
+Wassenhove 2018](#ref-polti_effect_2018)).
This selective mechanism
+allows humans to focus on targets with limited resources, reducing
+cognitive resource consumption ([Buschman and Kastner
+2015](#ref-buschman_behavior_2015)), refining elements deposited into
+memory ([Chun and Turk-Browne 2007](#ref-chun_interactions_2007)),
+delimiting the problem space, and narrowing memory retrieval in
+problem-solving situations ([Wiley and Jarosz
+2012](#ref-wiley_working_2012)). Human intuition about numbers may also
+relate to attention ([Kutter et al. 2023](#ref-kutter_distinct_2023)).
+Thus, selective attention is crucial for cognitive activities like
+perception, memory, and decision-making.
+
+Attention mechanisms in deep learning, inspired by human selective
+attention, have been successfully integrated into various
+frameworks ([Niu, Zhong, and Yu 2021](#ref-niu_review_2021)), greatly
+improving performance in tasks like natural language processing (NLP),
+computer vision, and speech recognition ([B. Zhang, Xiong, and Su
+2020](#ref-zhang_neural_2020); [Guo et al.
+2022](#ref-guo_attention_2022); [Ding et al.
+2021](#ref-ding_deep_2021)). In recent years, the Transformer model,
+relying on self-attention mechanisms to process data, has demonstrated
+superior performance across various tasks ([Vaswani et al.
+2023](#ref-vaswani_attention_2023); [Khan et al.
+2022](#ref-khan_transformers_2022)). Its multi-head attention mechanism
+performs multiple parallel self-attention computations with different
+parameters, allowing the model to capture information from different
+subspaces and improving fitting efficiency and accuracy ([Liu, Liu, and
+Han 2021](#ref-liu_multi_head_2021)). Practically, with the Transformer,
+neural networks have made significant progress in areas like NLP and
+vision tasks.
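As a rough illustration of the self-attention computation discussed in this section, the following is a minimal single-head sketch in plain Python. It is not taken from `labyrinth` or from any Transformer library: the function names `softmax` and `attention` and the toy `tokens` input are assumptions made for this example, and the learned projection matrices and the multiple heads of a real Transformer are deliberately left out.

```python
import math

def softmax(scores):
    """Turn raw similarity scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query with every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs

# Hypothetical toy input: three tokens with two-dimensional embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Self-attention: the same vectors act as queries, keys, and values
# (a real Transformer first applies learned projections to each role).
context = attention(tokens, tokens, tokens)
```

Each row of `context` is a weighted average of all token vectors, so every token's new representation mixes in information from the rest of the sequence. A multi-head variant would run several such computations with different projections and concatenate the results.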
+
+Attention mechanisms in deep learning are implemented through
+mathematical functions that assign weights to different elements of the
+input data ([Niu, Zhong, and Yu 2021](#ref-niu_review_2021); [De Santana
+Correia and Colombini 2022](#ref-de_santana_correia_attention_2022)).
+However, a subset of studies has found that the attention mechanism in
+deep learning cannot fully simulate human attention and lacks the
+cognitive and emotional context that human attention encompasses ([Lai
+et al. 2021](#ref-lai_understanding_2021)). Despite these differences,
+artificial neural networks have been successfully applied in several
+fields, including image and speech recognition, natural language
+processing, robot control, gaming, and decision support systems. These
+applications demonstrate the power of artificial neural networks in
+dealing with complex problems and simulating certain human cognitive
+processes while highlighting the unique advantages of models that
+simulate human cognitive abilities.
+
+## 3. Simulating Human Cognitive Abilities: The Way Forward
+
+In recent years, the Transformer model, which relies on self-attention
+mechanisms for data processing, has excelled in various tasks ([Vaswani et
+al. 2023](#ref-vaswani_attention_2023); [Khan et al.
+2022](#ref-khan_transformers_2022)). It departs from traditional
+recurrent neural networks (RNNs) and convolutional neural networks
+(CNNs), favoring a comprehensive utilization of attention mechanisms
+to process sequential data. The Transformer’s attention model is
+primarily applied through self-attention and multi-head attention
+mechanisms. The self-attention mechanism considers all other elements in
+the sequence when processing each input element, enabling the model to
+capture long-range dependencies within the sequence ([Vig and Belinkov
+2019](#ref-vig_analyzing_2019)).
Each element is transformed into query
+(*q*), key (*k*), and value (*v*) vectors, representing the current
+lexical element, other lexical elements, and the information contained
+in the lexical element, respectively. The attention output is computed by
+calculating the similarity scores of *q* and *k* and using them to take
+a weighted sum over *v*. Recently, LLMs have employed the Transformer’s framework,
+demonstrating an improved simulation of human cognition.
+
+LLMs are large-scale simulations of human cognitive functions ([Binz and
+Schulz 2023](#ref-binz_turning_2023)), and their emergence marks a
+significant advancement in computers’ ability to simulate human
+cognition. LLMs possess enhanced reasoning capabilities, and Claude 3,
+released this month by Anthropic, exhibits self-awareness through
+contextual understanding in a needle-in-a-haystack task ([Anthropic
+2024](#ref-anthropic_claude_2024); [Kuratov et al.
+2024](#ref-kuratov_search_2024)). In zero-shot problem scenarios, LLMs’
+reasoning abilities without prior knowledge surpass those of humans, who
+rely on analogies for reasoning ([Webb, Holyoak, and Lu
+2023](#ref-webb_emergent_2023)). Furthermore, LLMs can comprehend
+others’ beliefs, goals, and mental states with an accuracy of up to 80%.
+Notably, GPT-4, considered the most advanced LLM, can achieve 100%
+accuracy in theory of mind (ToM) tasks after suitable prompting,
+indicating a human-like level of ToM ([Thaler
+1988](#ref-thaler_anomalies_1988)).
+
+LLMs can also simulate human behavior observed in experiments, such as
+the ultimatum game ([Thaler 1988](#ref-thaler_anomalies_1988)),
+garden-path sentences ([Ferreira, Christianson, and Hollingworth
+2001](#ref-ferreira_misinterpretations_2001)), loss aversion ([Kimball
+1993](#ref-kimball_standard_1993)), and reactions to the Milgram
+electric shock experiment ([Blass 1999](#ref-blass_milgram_1999); [Aher,
+Arriaga, and Kalai 2022](#ref-aher_using_2022)).
Additionally, LLMs
+exhibit cognitive biases or errors that humans typically demonstrate,
+such as additive bias ([Winter et al. 2023](#ref-winter_more_2023)),
+where individuals default to adding or modifying existing content rather
+than deleting or pushing back when problem-solving ([Adams et al.
+2021](#ref-adams_people_2021)). LLMs produce various human cognitive
+effects, including priming effects and biases ([Koo et al.
+2023](#ref-koo_benchmarking_2023); [Shaki, Kraus, and Wooldridge
+2023](#ref-shaki_cognitive_2023)), suggesting that LLMs mimicking human
+cognitive processes may possess cognitive abilities approaching the
+human level.
+
+In specific domains, LLMs closely mimic human-specific abilities. For
+instance, ChatGPT’s accuracy in medical diagnosis and providing feasible
+medical advice in complex situations is comparable to that of human
+physicians ([Hopkins et al. 2023](#ref-hopkins_artificial_2023)). The
+performance metrics show that it diagnoses up to 93.3% of common
+clinical cases correctly ([Hirosawa et al.
+2023](#ref-hirosawa_diagnostic_2023)). Furthermore, in standardized
+clinical decision-making tasks, ChatGPT achieves an accuracy rate close
+to 70% ([Rao et al. 2023](#ref-rao_assessing_2023)), similar to the
+expected level of third-year medical students in the United States
+([Gilson et al. 2023](#ref-gilson_how_2023)). Due to GPT-4’s superior
+ToM, it correctly answered 90% of soft skill questions ([Brin et al.
+2023](#ref-brin_comparing_2023)), demonstrating excellent clinical
+skills.
+
+However, ChatGPT’s ability to handle complex questions remains
+unsatisfactory compared to widely used technologies like Google search
+([Hopkins et al. 2023](#ref-hopkins_artificial_2023)). It cannot fully
+replicate professional clinicians’ decision-making abilities when faced
+with complex problems, primarily due to its text-based training data,
+resulting in less satisfactory performance in non-text-based tasks ([Y.
+Zhang et al.
2024](#ref-zhang_unexpectedly_2024)). Furthermore, in
+patient care and other medically related domains, it sometimes generates
+false or misleading information, potentially causing doctors, nurses, or
+caregivers to make erroneous decisions, endangering patients’ lives ([Z.
+Ji et al. 2023](#ref-ji_survey_2023)).
+
+LLMs often contain billions to hundreds of billions of parameters,
+making it difficult to debug them and understand their
+decision-making processes ([Z. Ji et al. 2023](#ref-ji_survey_2023);
+[Khullar, Wang, and Wang 2024](#ref-khullar_large_2024)). Therefore,
+developing relatively interpretable models is a viable alternative at
+the moment. These models are trained in specific areas of expertise,
+possessing prior knowledge and learning not exclusively from samples.
+Recently, the life2vec model successfully predicted the relationship
+between early mortality and aspects of an individual’s personality
+traits, demonstrating relatively good predictive efficacy ([Savcisens et
+al. 2023](#ref-savcisens_using_2023)). The model provides clinicians and
+family physicians with insights and assistance that can help patients
+better manage their lifespan, showcasing the potential of specialized
+models.
+
+## 4. Simulating human knowledge representation using large-scale medical knowledge networks
+
+In summary, we have found that computer models simulating human
+cognitive abilities tend to achieve very good model fitting results,
+such as Transformer-based neural network models like LLMs. While LLMs
+perform satisfactorily in a wide range of contexts, there are multiple
+aspects that have not been addressed adequately in addition to the
+aforementioned medical issues.
+
+LLMs require an enormous number of parameters and a vast amount of
+training data, consuming substantial computational resources during the
+training process ([S. Zhang et al. 2022](#ref-zhang_opt_2022)).
Even
+after training, reasoning with LLMs consumes significant computational
+resources ([Samsi et al. 2023](#ref-samsi_words_2023)). Furthermore,
+LLMs produce a large carbon footprint ([S. Zhang et al.
+2022](#ref-zhang_opt_2022); [Faiz et al.
+2023](#ref-faiz_llmcarbon_2023)) and require considerable water
+consumption for cooling ([George, A.S.Hovan George, and A.S.Gabrio
+Martin 2023](#ref-george_environmental_2023)), exacerbating
+environmental concerns and extreme climate events. As computational
+power is concentrated in a few labs, LLMs also exacerbate inequality
+issues, and most labs cannot obtain LLMs of their own ([S. Zhang et al.
+2022](#ref-zhang_opt_2022)).
+
+LLMs are often considered black boxes, making it difficult to understand
+and explain their operating mechanisms. Recently, OpenAI has
+demonstrated early forms of artificial intelligence in LLMs by
+increasing their parameters and training sample size ([OpenAI et al.
+2023](#ref-openai_gpt_4_2023); [Bubeck et al.
+2023](#ref-bubeck_sparks_2023); [Schaeffer, Miranda, and Koyejo
+2023](#ref-schaeffer_are_2023); [Wei et al.
+2022](#ref-wei_emergent_2022)), challenging many scholars’ perceptions.
+Some have argued that it resembles the Chinese room problem, where LLMs
+do not develop intelligence but rather acquire deeper features of
+language, as consciousness may be a special form of language ([Hamid
+2023](#ref-hamid_chatgpt_2023)). Others contend that the emergent
+intelligence of LLMs is merely wishful thinking by researchers
+([Schaeffer, Miranda, and Koyejo 2023](#ref-schaeffer_are_2023)).
+Alternatively, it has been proposed that LLMs resemble human societies,
+where a large number of individuals collectively exhibit abilities that
+individuals do not possess, with emergent capabilities resulting from
+complex relationships between numerous data points, akin to ant colony
+algorithms.
+
+We can only speculate about the possible reasons for the emergence of
+the capabilities demonstrated by LLMs.
As noted earlier, developing +relatively interpretable specialized models would maximize usability and +transparency, making them safer for clinical applications. Over the past +decades, humans have accumulated substantial historical experience in +fighting diseases, as well as a large number of papers and +monographs of low practical value ([Hanson et al. 2023](#ref-hanson_strain_2023)). Translating +this experience into clinical resources at scale has become an important +issue in modern medical research. + +We observed that clinicians tend to treat patients automatically based +on their diseases while considering comorbidities the patients may have, +e.g., heart disease, high cholesterol, and bacterial infections. We +aimed to develop a model that could simulate this ability while +maintaining model interpretability. Therefore, we adapted the original +spreading activation model by replacing the LTM with a knowledge network +and replacing the memory search and inference with a random walk +approach to simulate human abilities. + +LLMs are often trained using knowledge from publications, and the +promising life2vec model uses medical information from Danish citizens. +Here, we use medical texts to build knowledge networks to train our +models. A knowledge network is a large-scale, graph-structured database +that abstracts core concepts and relationships in reality, allowing AI +systems to understand complex relationships and reason about them. It +can integrate various data sources and types to represent relationships +between elements and their properties. Knowledge networks abstract the +real world for AI systems ([Martin and Baggio +2020](#ref-martin_modelling_2020)), enabling them to solve complex tasks +and reason about the world ([S. Ji et al. 2022](#ref-ji_survey_2022)). 
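
The idea of a random walk over a knowledge network can be illustrated with a small sketch. The example below is a toy under stated assumptions, not the algorithm that `labyrinth` implements: it uses random walk with restart, one common variant of the random walk, and every node name, edge, and the helper `random_walk_restart()` with its `restart` parameter are hypothetical and chosen only for illustration.

```r
# Toy knowledge network. All node names and edges are hypothetical.
nodes <- c("disease_A", "gene_1", "gene_2", "drug_X", "drug_Y")
edges <- rbind(
  c("disease_A", "gene_1"),
  c("disease_A", "gene_2"),
  c("gene_1", "drug_X"),
  c("gene_2", "drug_X"),
  c("gene_2", "drug_Y")
)

# Build a symmetric (undirected) adjacency matrix from the edge list.
adj <- matrix(0, length(nodes), length(nodes), dimnames = list(nodes, nodes))
adj[edges] <- 1
adj[edges[, 2:1]] <- 1

# Column-normalise so that each column sums to 1:
# trans[i, j] is the probability of stepping from node j to node i.
trans <- sweep(adj, 2, colSums(adj), "/")

# Random walk with restart (hypothetical helper, not part of labyrinth).
# At each step the walker follows an edge with probability 1 - restart,
# or jumps back to the seed node(s) with probability restart.
random_walk_restart <- function(trans, seed, restart = 0.3,
                                tol = 1e-10, max_iter = 1000) {
  p0 <- setNames(numeric(nrow(trans)), rownames(trans))
  p0[seed] <- 1 / length(seed)
  p <- p0
  for (i in seq_len(max_iter)) {
    p_next <- (1 - restart) * drop(trans %*% p) + restart * p0
    converged <- sum(abs(p_next - p)) < tol
    p <- p_next
    if (converged) break
  }
  p
}

# Start the walk from the disease and rank the drug nodes by visit probability.
scores <- random_walk_restart(trans, seed = "disease_A")
ranked <- sort(scores[c("drug_X", "drug_Y")], decreasing = TRUE)
print(ranked)
```

In this toy network, `drug_X` is reachable from the disease through both genes while `drug_Y` is reachable through only one, so the walk visits `drug_X` more often and ranks it first. Because each score is traceable to paths in the network, the ranking stays interpretable, which is the property the text argues for.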
+ +Biomedical knowledge is characterized using formalism, an abstraction +process of the human brain to model systems formally and mathematically +([Phillips 2020](#ref-phillips_sheavinguniversal_2020)). Although +biomedical knowledge does not use formulas to describe biological +processes as mathematics, physics, and chemistry do, knowledge networks +can establish the mechanisms involved in biological processes ([Martin +and Baggio 2020](#ref-martin_modelling_2020)). For example, biologists +usually use nodes to represent genes and edges to represent regulatory +relationships between genes. + +Once the knowledge network has been constructed, we can simulate how +humans utilize LTM by choosing the random walk approach. Numerous +studies have shown that random walk can effectively simulate human +semantic cognition ([S. Ji et al. 2022](#ref-ji_survey_2022); [Kumar, +Steyvers, and Balota 2021](#ref-kumar_semantic_2021)) and is consistent +with the human memory retrieval process. The outputs of +computer-simulated random walks have been shown to correlate +highly with the results of the spreading activation model +([Abbott, Austerweil, and Griffiths 2015](#ref-abbott_random_2015); +[Zemla and Austerweil 2018](#ref-zemla_estimating_2018); [Siew +2019](#ref-siew_spreadr_2019)). Furthermore, brain scientists have used +random walk algorithms to explore theoretical concepts ([Abbott, +Austerweil, and Griffiths 2015](#ref-abbott_random_2015); [Siew +2019](#ref-siew_spreadr_2019)) or simulate specific human cognitive +behaviors to reduce experimental errors introduced by external +environments ([Abbott, Austerweil, and Griffiths +2015](#ref-abbott_random_2015); [Zemla and Austerweil +2018](#ref-zemla_estimating_2018)). + +Similar to the repeated, random selection of various possible solutions +in the human brain, the random walk simulates the random events that +exist in individual problem-solving and decision-making processes. 
As a +diffusion model, it is applicable to a wide range of situations, even +computer-simulated human societies ([Park et al. +2023](#ref-park_generative_2023)), demonstrating the broad applicability +of such computer models to many different biological scenarios. + +## 5. Conclusion + +Humans excel at employing existing problem-solving strategies ([Kaula +1995](#ref-kaula_problem_1995)). With the rapid advancement of computer +technology, there has been a surge in research articles on drug +repositioning aided by computational biology and bioinformatics. The figure +shows that the relative number of articles on drug repositioning +indexed in PubMed has increased over the years, with a more +pronounced rise in recent years. The calculation method also exhibits +the same increasing trend. The *banana* metric, defined as the number of +articles retrieved per year using *banana* as a keyword, has proven +effective in quantifying and analyzing research interest trends across +various fields ([Dalmaijer et al. +2021](#ref-dalmaijer_banana_2021)). + +