The Canadian Journal of Chemical Engineering
Early View
SPECIAL ISSUE ARTICLE
Open Access



ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING AT VARIOUS STAGES AND SCALES OF PROCESS SYSTEMS ENGINEERING


Karthik Srinivasan
Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta, Canada
Contribution: Conceptualization, Investigation, Writing - original draft

Anjana Puliyanda
Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta, Canada
Contribution: Writing - original draft, Conceptualization, Investigation

Devavrat Thosar
Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta, Canada
Contribution: Conceptualization, Investigation, Writing - original draft

Abhijit Bhakte
Department of Chemical Engineering, Indian Institute of Technology Madras, Chennai, India
Contribution: Conceptualization, Investigation, Writing - original draft

Kuldeep Singh
Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta, Canada
Contribution: Conceptualization, Investigation, Writing - original draft

Prince Addo
Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta, Canada
Contribution: Conceptualization, Investigation, Writing - original draft

Rajagopalan Srinivasan (orcid.org/0000-0002-8790-4349)
Department of Chemical Engineering, Indian Institute of Technology Madras, Chennai, India
American Express Lab for Data Analytics, Risk & Technology, Indian Institute of Technology Madras, Chennai, India
Contribution: Conceptualization, Investigation, Writing - original draft, Writing - review & editing, Funding acquisition

Vinay Prasad (Corresponding Author)
Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta, Canada
Correspondence: Vinay Prasad, Department of Chemical and Materials Engineering, University of Alberta, Edmonton, AB, T6G 1H9, Canada. Email: vprasad@ualberta.ca
Contribution: Conceptualization, Investigation, Writing - original draft, Writing - review & editing, Funding acquisition
First published: 06 November 2024
https://doi.org/10.1002/cjce.25525
Sections
 * Abstract
 * 1 INTRODUCTION
 * 2 REPRESENTATIONS
 * 3 HYBRID AI
 * 4 HUMAN AND AI
 * 5 GENERATIVE AI
 * 6 TO AI OR NOT TO AI?
 * AUTHOR CONTRIBUTIONS
 * Open Research
 * REFERENCES

ABSTRACT

In this work, we review the utility and application of artificial intelligence (AI) and machine learning (ML) at various process scales, from molecules and reactions to materials to processes, plants, and supply chains; furthermore, we highlight whether each application arises at the design or the operational stage of the process. In particular, we focus on the distinct representational frameworks employed at the various scales and the physics (equivariance, additivity, injectivity, connectivity, hierarchy, and heterogeneity) they capture. We also review AI techniques and frameworks important in process systems, including hybrid AI modelling, human-AI collaboration, and generative AI techniques. In hybrid AI models, we emphasize the importance of hyperparameter tuning, especially in the case of physics-informed regularization. We highlight the importance of studying human-AI interactions, especially in the context of automation, and distinguish the features of human-complements-AI systems from those of AI-complements-human systems. Of particular importance in the AI-complements-human framework are model explanations, including rule-based explanation, explanation-by-example, explanation-by-simplification, visualization, and feature relevance. Generative AI methods are becoming increasingly relevant in process systems engineering, especially in contexts that lack 'big data', primarily because high-quality labelled data are scarce. We highlight the use of generative AI methods, including generative adversarial networks, graph neural networks, and large language models/transformers, along with non-traditional process data (images, audio, and text).




1 INTRODUCTION

Artificial intelligence (AI) and machine learning (ML) have found increasing use in many domains in the recent past, including a wide range of industrial practice, and process systems are no exception. As with other reviews,[1] we adopt the definition of Rich[2] for AI: 'Artificial Intelligence is the study of how to make computers do things at which, at the moment, people are better'. The popularity of AI lies not just in its current successes, but in the promise that it can improve substantially and surpass humans at practically any endeavour. In this work, we review the use and promise of AI and ML in relation to process systems engineering. Reviews in the recent past have addressed the state of the art in great detail; particular attention is due to Venkatasubramanian,[1] who provides a detailed history of the evolution of AI in the context of process systems engineering and a perspective on future directions. He makes the point that there is both a 'technology push' and a 'market pull' behind the current interest in AI, with greater computing power pushing and advanced algorithms (including model predictive control and reinforcement learning [RL]) pulling. Thon et al.[3] briefly discuss applications at various process scales, but their focus is primarily on particle technology. Both of these reviews organize their presentation of AI techniques around methods (such as supervised and unsupervised learning, with the methods in each area described in greater detail) and applications (such as modelling, fault detection, optimization, and control). Dutta and Upreti[4] provide an overview of the application of AI and ML to process control in particular, and their presentation is likewise organized around techniques (expert systems, fuzzy logic, and ANN-based controllers) and applications in various process contexts.

What distinguishes our review from other notable ones in the recent past[1, 3,
4] is our focus on the various scales and stages of process engineering.
Specifically, as shown in Figure 1, we analyze the application of AI at scales
ranging from molecules and reactions to materials, and then to processes,
plants, and supply chains. The other aspect we highlight is related to the stage
of deployment. We broadly consider two stages (design and operation), with
design also encompassing aspects of modelling and discovery (such as materials
discovery) and operation including modelling, monitoring, control, and
optimization. Both product and process design are considered in this
categorization. In the figure, the solid lines with arrows indicate the scales
of current application of the areas of generative AI, humans and AI, and hybrid
AI, while the dotted lines represent opportunities to extend the application of
these areas. Representations, of course, are required at all scales. We identify
the most common sources of data at the various scales and in the various
techniques: operational data includes all experimental investigations at the
laboratory scale and process data, while databases (of publicly available data
on molecules and reaction templates, for instance) encode some form of prior
knowledge in their data. We note that hybrid AI includes not just data but also
knowledge, typically encoding the known physics in the form of differential
equations. Also, we provide a comprehensive look at representations, which is
arguably the most important aspect of AI and ML in process systems engineering:
a majority of the tasks involve regression and classification tasks, for which a
variety of algorithms are available and can be tested, but as we shall see, the
choice of data representation has a significant impact on the performance.
Finally, we cover aspects of AI such as hybrid and generative AI, where a distinctive contribution is our evaluation of human-AI interactions in automation and other process engineering contexts. What is somewhat underappreciated in current practice is the importance of the human in the loop with AI. For instance, while AI algorithms may be deployed most of the time, there are times when they are offline and humans need to exercise judgement and make decisions. In instances where AI is used to complement humans, it is very important that the AI's results be explainable. In this context, we also present some thoughts on when not to use AI, emphasizing that not all process applications need AI, even if AI algorithms could be developed for those cases, and that a cost-benefit analysis should guide AI's deployment.

FIGURE 1
An overview of artificial intelligence (AI) across the scales (molecules to supply chains) and stages (design and operation) of process engineering.

In the rest of this paper, Section 2 focuses on representations, Section 3
focuses on hybrid AI modelling, Section 4 focuses on human-AI interactions, and
Section 5 describes generative AI methods. We finish with Section 6, which
describes the conditions that are favourable (or unfavourable) for the
deployment of AI.


2 REPRESENTATIONS

ML is adopted in domain-specific areas primarily to develop a suitable model of a process, either (i) to achieve a computational advantage over traditionally used first-principles models when they are available, or (ii) to act as a surrogate when first-principles models cannot be identified. Chemical engineering spans the spectrum from molecular-level engineering to global supply chain optimization. Incorporating ML strategies at any scale requires an effective machine-readable representation of the entities at that scale, and the class of ML models to be utilized is intricately connected with the representation chosen. In this section, we discuss representational schemes that are widely employed at different scales of engineering, in accordance with the taxonomy presented in Figure 2A. Note that the choice of representation falls into what is conventionally regarded as data preparation but also affects feature engineering. Figure 2A provides a bird's-eye view of the different representations detailed in this section. The 'physics' at the top of the figure describes the latent properties of each scale that the representations are to capture. Molecular representations, especially topological descriptors, are expected to be equivariant to translation and rotation; in general, molecular representations capture the permutation-invariant nature of the ordering of atoms in molecules. In the case of reactions, capturing the additive nature of the reaction mixture is an important characteristic of the representation. Material representations produced through injective mappings are favoured in design and discovery because of their ease of invertibility. At the process level, string- or graph-based representations aim to capture the connectivity between the unit operations of the process and the inherent mass and energy flows. At the larger plant scale, representations that incorporate the hierarchy of, and relationships between, different processing streams become more informative. Supply chain representations are typically constructed to encode the heterogeneity of the source and destination nodes and transportation routes. Note that the qualities ascribed to representations are not limited to a specific scale but can be features of representations at other scales as well; for example, connectivity is still an important property of molecular representations, as is heterogeneity. The most distinctive feature of each scale is denoted in Figure 2A. Figure 2B classifies the embeddings mentioned in Section 2 by scale and physics, and Figure 3 lists applications of the representations across scales.

FIGURE 2
An overview of representations across different scales. (A) Schematic of representations across scales. (B) Classification of representations based on physics at different scales. DEXPI, Data Exchange in Process Industry; IUPAC, International Union of Pure and Applied Chemistry; HAZOP, hazard and operability; SFILES, simplified flowsheet-input line-entry system; SMILES, simplified molecular-input line-entry system; SMARTS, SMILES arbitrary target specification; SMIRKS, SMILES reaction specification.
FIGURE 3
Application of representations across scales. HAZOP, hazard and operability; QSAR, quantitative structure–activity relationship; QSPR, quantitative structure–property relationship; LLM, large language model.

Enforcing domain physics in ML models mitigates the need to train on large datasets and, most importantly, enhances the generalizability, transferability, and interpretability of these models, which is crucial for their adoption in mission-critical domains such as chemical engineering.[5] Domain physics can be encoded in ML either by regularizing the structure of the models by way of model architectures and loss functions, or by suitably representing digitized data, which is typically heterogeneous, multi-scale, unstructured, and otherwise inaccessible for training ML models[6] (also see Section 3). Suitable data representations are sets of features/descriptors, enriched with metadata, from which the system physics emerges, and are task- and scale-specific. Metadata is an avenue to incorporate knowledge representations at varying levels of system abstraction, ranging from differential equations, theoretical simulations, logic rules, and neural symbolic expressions to knowledge graphs, text embeddings, and ontologies for semantic relationships.[7, 8] In the absence of prior knowledge, an operator-theoretic approach can be used to obtain global data representations. For instance, the Koopman operator uses infinite, intrinsic linear coordinates to decouple the underlying nonlinear dynamics by spectral decomposition of time series sensor data; a finite approximation of the operator is obtained by dynamic mode decomposition.[9] Partial knowledge about a process, such as the conservation of physical quantities, can instead be incorporated by affine neural operators that impose spatial and temporal invariances via convolution and time recurrence, respectively.[10] This section is intended as a primer to familiarize readers with the more widely used representations at the different levels of chemical engineering. We also identify representation schemes that incorporate metadata and operational information to inject domain knowledge of the process.
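As a concrete illustration of the operator-theoretic idea, the following sketch (our own minimal example, assuming NumPy; the dynamics and all names are hypothetical, not from the review) estimates a finite linear approximation of the dynamics from snapshot data, which is the core step of dynamic mode decomposition:

```python
import numpy as np

# Hypothetical linear dynamics x_{k+1} = A_true @ x_k standing in for
# time-series sensor data; in practice the snapshots come from a plant.
A_true = np.array([[0.9, 0.1],
                   [0.0, 0.8]])
x = np.array([1.0, 1.0])
snapshots = [x]
for _ in range(10):
    x = A_true @ x
    snapshots.append(x)

# Arrange snapshots into "before" (X) and "after" (Y) matrices.
X = np.column_stack(snapshots[:-1])
Y = np.column_stack(snapshots[1:])

# Least-squares fit of the linear operator: A_est = Y X^+ (pseudoinverse).
A_est = Y @ np.linalg.pinv(X)

# The eigenvalues of A_est characterize the growth/decay of the dynamic modes.
eigenvalues = np.sort(np.linalg.eigvals(A_est).real)
```

For these noise-free linear data the estimated operator recovers `A_true`; practical DMD pipelines additionally truncate an SVD of `X` for rank reduction and noise robustness.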

Representations of molecules aim to capture the intrinsic topology and the properties of the constituent atoms to enable computer-aided modelling and exploration of the chemical space. While we acknowledge that other representations of molecules exist (such as Molecular Design Limited (MDL) Molfiles and Wiswesser line notation), we focus this section on the representations more commonly used in ML applications.[11-13]

String-based representations depict molecular structure as a combination of numerals, letters, and certain special characters, following a set of semantics typically unique to the type of representation. The simplified molecular-input line-entry system (SMILES)[14] is a widely used string-based notation that encodes molecular connectivity. Each atom is represented by a unique set of letters (the same as its symbol in the periodic table); numbers and parentheses indicate rings and branches; and special characters (=, #, @, +, etc.) indicate bond type, stereochemistry, and charge. The SMILES representation of a molecule is non-unique, and the enumerative expanse of equivalent SMILES strings for the same molecule complicates its application; attempts to define canonical variants of SMILES have been noted in the literature. Several modifications of the base SMILES notation, such as DeepSMILES[15] and SELFIES,[16] have been proposed to overcome specific drawbacks. International Union of Pure and Applied Chemistry (IUPAC) nomenclature describes the functional groups in a molecule in words rather than symbols. As with SMILES, the mapping from a molecule to its IUPAC name is non-unique (due to the use of retained names), though the reverse mapping is unique. The International Chemical Identifier (InChI) is a canonical, open-source molecular representation developed by IUPAC that represents a molecule in a hierarchical text notation.
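To make these semantics concrete, here is a minimal sketch (our own illustration, not tooling from the review) that splits a SMILES string into its atom, bond, ring-closure, and branch tokens; a real parser, such as the one in RDKit, handles far more of the grammar:

```python
import re

# Ordered token pattern: two-letter halogens first, then bracket atoms,
# organic-subset atoms (aromatic atoms in lower case), ring-closure digits,
# branch parentheses, and bond/charge/stereo symbols.
SMILES_TOKEN = re.compile(
    r"Cl|Br|\[[^\]]+\]|[BCNOPSFI]|[bcnops]|%\d{2}|\d|[()=#@+\-/\\.]"
)

def tokenize(smiles):
    """Split a SMILES string into lexical tokens (no validity checking)."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in {smiles!r}")
    return tokens

# Acetic acid: two carbons, a double-bonded oxygen in a branch, a hydroxyl.
print(tokenize("CC(=O)O"))  # ['C', 'C', '(', '=', 'O', ')', 'O']
# Non-uniqueness: both of these strings denote ethanol.
print(tokenize("CCO"), tokenize("OCC"))
```

Token streams like these are what sequence models (RNNs, transformers) consume when trained on SMILES.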

The inherently sequential nature of string representations lends itself to ML models built on recurrence and convolution. Reaction and product distribution predictions using recurrent neural networks (RNNs) and transformers have been carried out on SMILES and InChI strings.[17-21] Neural machine translation tasks have traditionally used SMILES as the input to be translated into other molecular representations.[22, 23] IUPAC names and InChI identifiers have been used in machine translation as well,[24-28] but with a lower degree of success due to the large token space and rigorous grammatical rules. Generative algorithms such as variational autoencoders (VAEs), generative adversarial networks (GANs), and transformers employ SMILES (and its variants) as the decoding target from a latent vector space.[29-32] SMILES also forms the input representation for many molecular property prediction techniques.[33, 34] The strength of InChI strings comes to the forefront in library management, where the uniqueness of the identifier makes them an effective representation scheme.[35-37]

Molecules are typically depicted as molecular graphs with atoms forming the
nodes and bonds forming the edges. Graphical representations of molecules
highlight their topological connectivity. A featurized version of these graphs,
that is, a graph where each node and edge has an associated feature vector, is
used in performing computations on molecules.

A straightforward use of the molecular graph is the application of rule-based
reaction templates to molecules. The templates describe the edges to
be formed or removed between the molecular graphs of the substrates in a
reaction to form the graph of the product molecule. Graphical representations
also allow for substructure matching used by reaction templates to identify
reaction centers. The change in connectivity of the molecular graph is achieved
by performing modifications to the adjacency matrix.[36, 38, 39]
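A graph edit of this kind can be sketched in a few lines. The example below is a simplification of real template engines: a "template" is reduced to explicit lists of bonds to break and to form, applied directly to an adjacency matrix whose entries mark bonds between atom indices.

```python
# Sketch of a rule-based "graph edit" on an adjacency matrix.
# Atoms are node indices; entry adj[i][j] = 1 means a bond between
# atoms i and j. A reaction template is expressed here simply as
# bonds to break and bonds to form.

def apply_template(adj, break_bonds, form_bonds):
    """Return a new adjacency matrix with the template's edits applied."""
    new_adj = [row[:] for row in adj]          # leave the substrate intact
    for i, j in break_bonds:
        new_adj[i][j] = new_adj[j][i] = 0
    for i, j in form_bonds:
        new_adj[i][j] = new_adj[j][i] = 1
    return new_adj

# 3-atom substrate with bonds 0-1 and 1-2
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
product = apply_template(A, break_bonds=[(1, 2)], form_bonds=[(0, 2)])
print(product)  # bond 1-2 removed, new bond 0-2 formed
```

Substructure matching against the reactant graph would normally select which atom indices the template binds to; here they are given by hand.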

Molecular fingerprints such as the Extended Connectivity Fingerprint (ECFP)
take advantage of the connectivity information encoded in molecular graphs.
Similarly, the graphical information is taken
into account for molecular property prediction (quantitative structure–property
relationship [QSPR]/quantitative structure–activity relationship [QSAR] studies)
through the repeated application of convolution operation to the molecular
graph. Convolution-based approaches have also been employed in automated
reaction prediction to identify possible products from a set of
reactants.[40-42]

Generative models have been developed recently that aim to generate molecules
as molecular graphs using GANs.[43, 44] The order-invariant nature of the
adjacency matrix leads to computational complexity in one-shot generation approaches.
Nonetheless, the methodology has been found to be effective for smaller
molecules. RNN and VAE-based methodologies have also been used for sequential
graph generation in node-wise and fragment-wise fashions.[29-32, 45, 46]

As an extension of molecular graphs, reactions have also been encoded in
graphical format, thus allowing access to graph theory-based approaches in
reaction outcome prediction and identification of pathways for reactions.
Reaction hypergraphs have been used in reaction classification.[47]

Molecular fingerprints form a class of representations where information about a
molecule is encoded in a vector representation. These may include bit vectors
such as Molecular ACCess System (MACCS) keys, which indicate the presence or
absence of a particular sub-structure or fingerprints based on connectivity,
such as Morgan fingerprints or topological fingerprints.

First generated to aid isomeric structure screening, fingerprints have found a
variety of applications. Their primary use remains database screening and
substructure matching. Fingerprints are used in tandem with a distance metric
such as the Tanimoto similarity or Dice coefficient to identify the degree of
closeness between the query and the database.[48-50] The compactness of
fingerprints allows for faster screening of databases. As an extension, reaction
fingerprints have also been developed to find reactions similar to each other in
the chemical space.[51, 52]
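Both similarity measures reduce to simple set arithmetic on the "on" bits of two fingerprints. A minimal sketch, with arbitrary toy bit positions rather than real MACCS or Morgan keys:

```python
# Fingerprints modelled as sets of "on" bit positions, compared with
# the Tanimoto and Dice measures used for database screening.

def tanimoto(a: set, b: set) -> float:
    """Shared bits over all bits set in either fingerprint."""
    return len(a & b) / len(a | b)

def dice(a: set, b: set) -> float:
    """Twice the shared bits over the total bits set in each."""
    return 2 * len(a & b) / (len(a) + len(b))

query = {1, 4, 7, 9}       # bits set for the query molecule (toy values)
hit   = {1, 4, 7, 12}      # bits set for a database molecule
print(tanimoto(query, hit))   # 3 shared / 5 in union = 0.6
print(dice(query, hit))       # 2*3 / (4 + 4) = 0.75
```

Because the comparison is bitwise, screening a large database reduces to fast set or bit-vector operations, which is the compactness advantage noted above.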

The ability to generate fingerprints focused on certain aspects of the molecule
allows for contextual extraction of information for specific tasks. This has led
to widespread usage of molecular fingerprints as inputs for molecular property
prediction, especially for drug-like molecules.[53-55] In a similar vein,
localized reactivity information of molecules can be extracted through
fingerprints to aid in product prediction.[52]

Computer-aided representations capture latent abstractions of the molecule
through nonlinear transformations. These embeddings represent the molecule in a
continuous real-valued space and are usually extracted for a specific
purpose.[23, 28] Invertible mappings of molecules allow for complex mathematical
operations to be performed on molecules in an efficient manner. For example,
convolution operations on molecular graphs followed by pooling of the node
features generate a vector representation of the molecule, which is further
used downstream for the prediction of partition coefficients.[23] Generative models
such as GANs and VAEs take advantage of this representation to generate
embeddings of new molecules, which are then decoded to generate a SMILES
representation.[29-32] Typically extracted using auto-encoders, the embeddings
learnt depend on the type of information present in the input.[56-58] The
dimensionality reduction achievable through the use of latent space embeddings
is exploited in reconstruction of potential energy surfaces and to develop
functionals for DFT calculations.[12, 59, 60]

A wide variety of molecular representations have been discussed to encode
spatial geometry and connectivity among atoms. However, descriptors that capture
topological equivariance are most sought after to avoid confounding among
molecular conformers.[61, 62] Yet graph-based geometric and topological
descriptors of molecules are ill-posed to describe multi-component systems and
may even misrepresent emergent properties of component interactions by merely
concatenating descriptors of individual molecules. Hence, molecular images of
chemical structures are widely used in training ML models to predict
solubility[63] and kinetics[64] of molecular mixtures.

Turning to reactions, graphical and string-based representations stem from
their applicability to molecules. Reactions are encoded as modifications to
molecular SMILES through atom–atom mapping or as graph edits to the molecular
graphs. The bond-based representations of several molecules are widely
supplemented with metadata about molecular charge densities and reaction
energies from quantum calculations, in order to uniquely capture pairwise
interactions and the additive nature of reaction properties.[65] The
aforementioned representation would suffice if the task at hand was to train
generative ML models that enumerate candidate products from a set of reactants,
but it would be context-limited in tasks requiring translation from
computational retrosynthesis to an experimental protocol. Context-free
grammar-based ontological representations of molecules have been used in a
neural seq-to-seq machine translation framework to capture hierarchical
information of reactant molecules in predicting the products from a reaction,
although the context of experimental conditions is still absent.[66] However,
text-based reaction representations of a sequence of experimental steps have
been mined from patents to train context-enriched seq-to-seq transformer
language model (Smiles2Actions) to predict the experimental protocol for batch
organic synthesis.[67]

Representing materials by descriptors of molecular structure and composition
that respect symmetry, similarity, density, continuity, locality, and
additivity, so that distinct materials do not have identical descriptors, is
vital to developing ML surrogates linking molecular-level material descriptors
to macroscopic properties by way of capturing QSARs/QSPRs.[68] Kernels capture
symmetry and continuity; distance metrics capture similarity; summations of
atom-centred symmetry functions (ACSFs) capture density; connectivity matrices
capture interactions (e.g., Coulomb) or compositional variations; spectral
decompositions of connectivity matrices capture invariance; and persistence of
topological homologues over radius captures locality. These form the
phylogenetic tree of material descriptors that successfully embed the additive
decomposition of material properties implicit in the functional forms of
inter-atomic potentials from theoretical calculations.[69] Applying these
meaningful transformations to metadata from high throughput theoretical
calculations helps develop reliable, faster, and generalizable ML surrogates
(QSAR/QSPR models) that have ushered in a new era of discovery in computational
materials science via robotic self-driving labs (SDLs) for material
synthesis.[70] The development of over 40 novel featurization approaches from
topological representations of crystalline materials as molecular building
blocks has advanced the field of digital reticular chemistry for materials
discovery, with the promise of being transferable to other classes of
materials.[71] The multi-scale material descriptors for QSAR/QSPR models are
prediction task-specific and range from atom-centred descriptors for local
properties, building unit-centred descriptors for shape and flexibility
properties, and finally, more coarse-grained volume elements in the material
space, like voxels or point cloud representations for global properties. Imaging
data from scanning tunnelling and transmission electron microscopy, from which
periodic features are extracted using VAE, is found to relate to material
properties like polarizability and strain gradients,[72] as do neural
network-extracted features from hyperspectral images of functional materials
relate to their macroscopic thermochemical properties.[73] One may also take
the operator-theoretic approach of using a hierarchy of Hamiltonian matrices to
devise global material descriptors that are robust to structural confounders
and guarantee the injectivity of ML-based QSAR/QSPR models to encourage
transferable learning.[74] The Hamiltonian is the energy operator for the wave
function in the Schrödinger equation, whose eigenvalues are representative of
the total energy of a quantum mechanical system. The use of context-aware text-embeddings
like BERT and ELMo, as opposed to Word2vec and GloVe to capture collective
associations of materials and molecules to target properties from unstructured
knowledge in literature, enables the development of trustworthy QSAR/QSPR models
for materials discovery.[75] Note that only a small portion of such
associations are digitized into structured property databases, which also
raises the risk that future 'discoveries' merely reproduce results already
reported in past publications.
Very recently, the claims made by a self-driven autonomous lab (A-lab) of
having discovered 41 novel compounds in 17 days were challenged by researchers
who contested that some of the materials were misidentified because
compositional order-invariant descriptors for inorganic materials were not
considered, while some of the other discovered materials had already been
reported in the literature.[76]

Processes generally comprise several unit operations connected by material and
energy flows and are adequately represented by flowsheets. The topology of
flowsheets, in turn, has been represented in varied forms to aid exploration and
generation. Representing chemical process flowsheets as strings, for example,
in the simplified flowsheet input-line entry system (SFILES) notation,
facilitates the use of machine-learning algorithms for sequences, such as RNNs
and transformers.[77] Generative models employing SFILES notation have been
developed using large language models (LLMs). Pattern recognition and feature
extraction tasks have been carried out using the string
notation.[78-81] Using images of process flow diagrams (PFDs) opens up the usage
of convolutional operations for feature extraction. Pattern recognition in
images of PFDs has been performed through incorporation of spatial relationships
between process units.[82, 83] The sequential nature of PFDs lends itself to a
graphical representation such as the P-graph, S-graph, SSR (state space
representation), and so on. Generative models can be used for auto-completion of
PFDs by addition of operational modules as nodes to the graph of the process.
Canonical descriptions of piping and instrumentation diagrams (P&IDs), such as
the Data Exchange in the Process Industry (DEXPI) standard, allow for
vendor-independent exchange of information. The DEXPI standard has been used as
an input to generate graphical representations of P&IDs to aid sequential
node/edge prediction using RNNs and GNNs.[84] Graphical representations also
lend themselves to the superstructure optimization problem: attributes
corresponding to the dynamics of material and energy flows are attached to each
node and edge, and a combination of these units is optimized subject to
production and physical constraints provided as algebraic and differential
equations.[85, 86]
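The graph view of a flowsheet can be sketched with the standard library alone. In the toy example below the unit names are illustrative; each unit is mapped to its upstream units (the predecessor convention Python's `graphlib` expects), and a topological sort recovers the upstream-to-downstream sequence that string notations such as SFILES serialize. A recycle stream back to the reactor is deliberately omitted, since topological sorting requires an acyclic graph.

```python
# A flowsheet as a directed graph: unit operations are nodes, streams
# are edges. graphlib.TopologicalSorter takes node -> predecessors,
# so each entry reads "this unit receives material from those units".
from graphlib import TopologicalSorter

upstream = {
    "heater":    ["feed"],
    "reactor":   ["heater"],
    "separator": ["reactor"],
    "product":   ["separator"],
}
order = list(TopologicalSorter(upstream).static_order())
print(order)  # upstream-to-downstream processing sequence
```

For superstructure optimization, each node and edge would additionally carry attributes (flows, costs, model equations) over which the optimizer selects a subgraph.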

Time series records of process variables from measurement sensors have long been
used to construct representations of process history,[87] and even to infer
trends of the unmeasured process variables from hierarchical representations via
more sophisticated soft-sensor modelling.[88] The modularity of chemical
processes supports their digitization using knowledge graphs to represent
semantic relations among heterogeneous entities: inputs, outputs, and unit
operations are designated as the nodes, while the material and energy flows
among them are designated as the edges, so that production lines can be
improved by efficiently assessing downstream environmental and economic
impacts.[89]
Graphical representation of ontologies results in knowledge graphs. Ontologies
are a way of representing hierarchical knowledge by using logical theory to
formalize conceptual semantics among entities and comprise: (i) a set of
classes/unique concepts/entities, (ii) attributes of these classes, (iii)
instances of classes, (iv) relations among classes, and (v) axioms or logic
rules that are restrictions defined on the classes that regulate their
properties.[90] Resource description framework (RDF) triples of the
subject-predicate-object format are used for knowledge graphs parsed from
ontologies.[91] Building ontologies, and thereafter knowledge graphs, is a
time-consuming and non-standardized practice, but it encourages
interoperability among disparate process entities and decentralized databases
by facilitating recommender systems, inductive relation identification, and
entity recognition.[92] To encourage the transferability and reusability of ontologies
across domains,[93] only generic modelling knowledge is incorporated and very
few relational restrictions are imposed to create a meta-ontology, with a
provision of adding domain-specific knowledge separately, as with OntoCAPE, a
large-scale ontology for computer-aided process engineering.[94] Querying
knowledge graphs of sensor data represented as class instances has been used for
process anomaly detection.[95] Process safety by risk analysis is achieved by
querying knowledge graphs where equipment, chemicals, and flows are treated as
entities whose attributes and relations are ascribed from text mining hazard and
operability (HAZOP) reports,[96] while hazardous chemical management is achieved
by named entity recognition using knowledge graphs of incident records.[97]
Knowledge graphs constructed from databases or text mining efforts have been
used to mitigate risk, enhance process safety, and even recommend green
manufacturing alternatives across a variety of industries from petrochemical to
pharmaceutical.[98-100]
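The RDF triple structure underlying such knowledge graphs is easy to sketch. The snippet below is a minimal illustration with invented entity and relation names, not drawn from any standard process ontology; a triple store plus a wildcard pattern match is enough to answer simple queries of the kind described above.

```python
# A tiny process knowledge graph as subject-predicate-object (RDF-style)
# triples, with a wildcard pattern query. All names are illustrative.

triples = [
    ("Reactor-101",  "hasInput",  "EthyleneFeed"),
    ("Reactor-101",  "hasOutput", "CrudeProduct"),
    ("CrudeProduct", "feedsInto", "Column-201"),
    ("Column-201",   "hasOutput", "Polymer"),
]

def query(s=None, p=None, o=None):
    """Return all triples matching the pattern; None is a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

print(query(s="Reactor-101"))   # everything asserted about the reactor
print(query(p="hasOutput"))     # all output streams in the graph
```

Real systems express the same idea with RDF stores and SPARQL queries, with ontology axioms constraining which triples are admissible.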

Plants constitute an aggregate of processes, but it is not trivial to represent
them by linking process knowledge graphs at common pinch points, owing to the
siloed and static nature of knowledge graphs, unless the processes are
represented by meta-ontologies that can easily be reused. A distributed plant
ontology, OntoSafe, adds to the capabilities of OntoCAPE and benefits from using
semantics for plant safety supervision, based on process changes.[101]
Task-specific representations at the plant level primarily cater to safety and
decision-making involving several stakeholders. Unstructured data from incident
records and HAZOP reports are mined to extract text-embeddings to classify
hazards and thereby calibrate consequence severity.[102] Industrial safety
knowledge (ISK) from HAZOP reports are represented as knowledge graphs,[103]
followed by using description logic to build heavy-weight plant ontologies from
knowledge-bases to diagnose faults and characterize hazards.[104] Once the
elements of an ontology have been defined/standardized, the process of
constructing knowledge graphs has been automated by mining unstructured text
using language transformer models for entity recognition and relation
discovery,[105] so that the entire product and process development lifecycles
can be informed by considering not just material safety datasheets, mass and
energy balances, and various process technologies, but also manufacturing
guidelines set by stakeholders.[106]

Supply chains encompass resources (goods, money) that move across enterprises,
whose semantic knowledge is represented using modular supply chain ontologies
(SCONTO) for interoperability in decision-making.[107] Several other ontologies
have been used to describe the relations and restrictions on resource movement
among entities that include suppliers, manufacturers, distributors, logistics,
and customers, but these are still far from enabling interoperability because
they do not adequately represent reality,[108] are highly non-standardized, and
do not integrate information from ever-evolving supply chains.[109] This is
proposed to be tackled by the industrial ontology foundry (IOF) that provides an
open-source, standardized, collaborative, and modular approach to build
enterprise ontologies.[110] Further, enterprises across several geographical
locations are best represented as geospatial knowledge graphs that are
constructed from embeddings of heterogeneous geographic information systems
(GIS) in the form of images (satellite images, street views, aerial photos),
text (tags, reviews, social media posts), and numeric data (weather,
traffic).[111] Geospatial knowledge graphs enable decision-making using supply
chain ontologies to be resilient to entity disruptions.[112]


3 HYBRID AI

Building predictive models for physical processes is important for achieving the
objectives of process systems engineering (PSE), which involves improving
decision-making in the production of chemical products.[113] The development of
digital models that represent a physical process is the key component in
PSE.[114] Therefore, most efforts in chemical engineering have been to develop
‘digital twins’, which are virtual representations of the processes. Digital
twins include a predictive component that is accurate, robust, and fast enough
to be used for optimization, control, fault detection, and diagnosis.[115]
Predictive models can be (1) mechanistic, derived using physics-based knowledge
about the process in the form of transport equations (mass, momentum, and
energy balances), physical phenomena (absorption, adsorption, crystallization,
etc.), chemical kinetics, and initial and boundary conditions, or (2)
data-driven, developed by fitting the process data to one or more mathematical
functions. Developing
mechanistic models requires a thorough understanding of physical principles and
mechanisms involved in the process and a computational capability for
simulations. The parameters used in mechanistic models usually have a physical
meaning. On the other hand, data-driven models are developed purely based on
correlations between input–output data, and the parameters typically have no
physical meaning, making the models less transparent in predicting the behaviour
of the process variables. These models, however, are easier to develop than
physics-based mechanistic models. Examples of data-driven models used in PSE are
the traditional time series models such as ARX, NARX, and so forth, and the more
recent ML models (Gaussian processes, neural networks, etc.).

Recently, the abundance of data and enhanced computational capabilities have led
to an increase in the use of ML. Such approaches have been used extensively for
various PSE applications such as process monitoring, fault detection, control,
optimization, and so forth.[1, 116] Usually, these are black-box models, having
no knowledge about the underlying physics of the process. Although such models
are easy to develop, they are highly system-specific, lack extrapolation
capability and interpretability, and may produce physically inconsistent
results. Hybrid models, also called grey-box models, offer a promising solution
to some of these problems, with the further advantages of robustness and
efficient model development for complex processes.[5] Hybrid models in PSE
first appeared in works of Psichogios and Ungar,[117] Kramer et al.,[118]
Johansen and Foss,[119] Mavrovouniotis and Chang,[120] and Su et al.,[121] and
have seen increasing use since then. A variety of methods for incorporating
mechanistic knowledge into an ML framework can be observed right from the start
of the hybrid modelling paradigm in chemical engineering, which has led
multiple researchers to propose taxonomical guidelines for the use of hybrid AI
in PSE.
Note that while taxonomical guidelines exist for the use and implementation of
hybrid AI in PSE, the applications have focused almost exclusively on the
process scale. This reflects the fact that the modelling focus and first
principles knowledge of many process systems engineers is at the process scale,
but it also speaks to this being an opportunity to extend the scope of hybrid AI
in PSE.

Sansana et al.[122] provide a review of hybrid modelling and classify
approaches into series, parallel, and surrogate models. However, it can be
argued that surrogate models may not necessarily belong to the category of
hybrid models. Bradley et al.[123] reviewed approaches for hybrid modelling for
cases where mechanistic models are used in some way during the training process
of the data-driven model. Sharma and Liu[124] describe a comprehensive
classification of hybrid modelling techniques, dividing the approaches into ML
assisting science and science assisting ML. Rajulapati et al.[125] give a
perspective on integrating physics with ML and classify hybrid models into
residual models, first-principles constrained models, and first-principles
initialization models. Gallup et al.[126] demonstrated the benefits of three
different classes of hybrid models using a CSTR case study. Physics-guided
architecture, physics-guided loss, and physics-guided initialization are used as
classes for distinguishing different hybrid methods. Narayanan et al.[127]
outlined a step-wise procedure to obtain different types of hybrid models in
bio-pharmaceutical processes. The most relevant classes of hybrid models for
chemical engineering are the semi-parametric approach (series–parallel
architectures), physics-informed regularization, and physics-guided
architectures. Most works on hybrid AI can be classified into these three types.


3.1 SEMI-PARAMETRIC APPROACH

This form of hybrid modelling is the oldest and most popular approach.
Semi-parametric or conjunction approaches can be further divided into series,
parallel, and combined architectures. The term ‘semi-parametric’ derives from
the fact that a part of the model is described using physics-derived equations
having physically meaningful parameters, while the data-driven part is
considered as non-parametric. Agarwal[128] described a framework for combining
prior process knowledge with data-driven models in the context of state
estimation and control using a semi-parametric approach. von Stosch et al.[129]
reviewed applications of series–parallel hybrid model structures in PSE. McBride
et al.[130] reviewed applications of semi-parametric hybrid modelling to
separation processes, while Zendehboudi et al.[131] reviewed applications of
semi-parametric strategies in chemical, petrochemical, and energy industries.
Also, Schweidtmann et al.[132] reviewed the current state, challenges, and
future directions of semi-parametric hybrid models.

The serial architecture can be further divided into two types. For the first
approach, outputs from one model are used as inputs to the subsequent model.
Usually (see Figure 4A), a black box model is used to estimate the part of the
physics-based model that is too complex to model. The output of this model is
used as input to the mechanistic model. For example, in the work of Psichogios
and Ungar,[117] the authors proposed a serial architecture where the kinetic
parameters of a fed-batch bioreactor were predicted using a neural network and
fed to a first principles model for simulation. Such models are useful when rich
process data is available, but interpretation of mechanistic parameters is
difficult. In the work of von Stosch et al.,[133] the authors use a nonlinear
partial least squares method to approximate cellular system dynamics in a mass
balance equation, thereby avoiding the unrealistic estimation of metabolic
fluxes from concentration measurements. Recently, Nielsen et al.[134] used such
a model in particle analysis in crystallization and flocculation applications.
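A toy sketch of this serial architecture (data-driven sub-model feeding a mechanistic model) can be written in plain Python. Here a least-squares linear fit of a rate constant versus temperature stands in for the neural network, and the mechanistic part is an Euler-integrated batch balance dC/dt = -k*C; all numbers are illustrative, not from any cited work.

```python
# Serial hybrid (Figure 4A style): a data-driven sub-model estimates a
# kinetic parameter, which the mechanistic balance then consumes.

def fit_k_model(temps, ks):
    """Least-squares line k = a*T + b, a stand-in for an ML regressor."""
    n = len(temps)
    mt, mk = sum(temps) / n, sum(ks) / n
    a = (sum((t - mt) * (k - mk) for t, k in zip(temps, ks))
         / sum((t - mt) ** 2 for t in temps))
    return a, mk - a * mt

def simulate(c0, k, t_end, dt=1e-3):
    """Mechanistic part: explicit Euler on dC/dt = -k*C."""
    c, t = c0, 0.0
    while t < t_end:
        c += dt * (-k * c)
        t += dt
    return c

# "Training data": measured rate constants at three temperatures (toy values)
a, b = fit_k_model([300.0, 320.0, 340.0], [0.10, 0.20, 0.30])
k_pred = a * 330.0 + b                 # data-driven estimate at a new T
c_final = simulate(1.0, k_pred, t_end=2.0)
print(k_pred, c_final)
```

The division of labour mirrors the bioreactor example above: the hard-to-model kinetics come from data, while the well-understood material balance stays mechanistic.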

FIGURE 4
(A) Serial architecture: Output from data-driven model used as inputs for
mechanistic model. (B) Serial architecture: Output from mechanistic model used
as inputs for data-driven model. (C) Parallel architecture.

As seen in Figure 4B, outputs from the first-principles model are used as
inputs to the data-driven model. This type of model employs feature
engineering, where some knowledge about the process is used to create custom
inputs to be fed to the data-driven model. Although this type of model is less
common, there have been some reports in the literature.[135-137] Recently, Yan
et al.[138] applied a serial architecture to predict gasification products
using a neural network, in a process optimization framework during the design
stage. Knowledge about the
thermodynamic equilibrium is used to generate feasible gas temperature inputs to
a neural network that predicts product composition.

The parallel architecture is given in Figure 4C. Such types of models aim to
capture the unmodelled part of the physical system or the model-plant-mismatch
by modelling the residuals using a data-driven approach. The outputs from
mechanistic and data-driven models are combined to get a pooled prediction of
outputs, resembling an ensemble-based ML algorithm. The first works in
application of the parallel approach can be found in Su et al.[121] and Kramer
et al.[118] Ghosh et al.[139] demonstrated a parallel structure for
modelling residuals of system identification models in a control framework.
Bikmukhametov and Jaschke[140] developed different types of parallel models
using measurement and simulation data for virtual flow metering. There have been
multiple works using a parallel architecture, especially in control and state
estimation, as reviewed in the literature.[129, 132]
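The parallel residual idea can be sketched in a few lines of plain Python. The "plant", the simplified mechanistic model, and the assumed quadratic residual basis below are all illustrative; in practice the residual model would be a trained regressor such as a neural network or Gaussian process.

```python
# Parallel hybrid (Figure 4C style): the mechanistic model captures the
# bulk trend, a data-driven model is fit to its residuals, and the two
# predictions are summed into a pooled output.

def mechanistic(x):
    return 2.0 * x                        # simplified first-principles model

def plant(x):
    return 2.0 * x + 0.5 * x ** 2         # "true" process with an unmodelled term

xs = [0.0, 1.0, 2.0, 3.0]
residuals = [plant(x) - mechanistic(x) for x in xs]   # model-plant mismatch

def fit_residual(xs, rs):
    """Least-squares coefficient for an assumed x**2 residual basis,
    standing in for training a regression model on the residuals."""
    return sum(r * x ** 2 for x, r in zip(xs, rs)) / sum(x ** 4 for x in xs)

c = fit_residual(xs, residuals)

def hybrid(x):
    """Pooled prediction: mechanistic output plus modelled residual."""
    return mechanistic(x) + c * x ** 2

print(hybrid(4.0))   # extrapolation point outside the training x values
```

Because the data-driven part only has to learn the mismatch, it needs far less data than a black-box model of the full input-output map.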

Most works on semi-parametric approaches report increased efficiency in training
the models and improved accuracy. The data requirement is also reduced for
achieving an equivalent performance level. One major advantage of developing
hybrid model architectures is an increasing capability to extrapolate in the
unexplored input space. For instance, van Can et al.[141] demonstrate the
extrapolation capability of series hybrid models for prediction of the pH effect
on the conversion of Penicillin G using enzymes. Fairly accurate predictions
were obtained for input values outside the range of experimental training data.
Also, in the work of Narayanan et al.,[142] an improvement in extrapolation
capability with increasing knowledge about the process is demonstrated. This
shows that mechanistic and data-driven approaches complement each other and
that combining them enhances model performance.
Depending on the application, availability of physics-based knowledge, and
amount and quality of data, different semi-parametric architectures can be
created to increase model performance. Though most reported works use either
serial or parallel architectures, a combined approach can also be used to model
a complex system.


3.2 PHYSICS-INFORMED REGULARIZATION

The idea of incorporating physics-based terms in the overall loss function of a
neural network was proposed by Raissi et al.[143] and was demonstrated using
partial differential equations (PDEs) derived from physics. It was subsequently
successfully adapted into
several domains, as reviewed by Cuomo et al.,[144] by crafting the loss
functions according to the application. Naturally, it was adapted to chemical
engineering applications to model complex processes described using PDEs that
capture spatiotemporal variation.

In any black-box model, the parameters are obtained by minimizing the errors
between predictions and true values of the labelled data. In physics-informed
regularization, an additional loss function is included that incorporates the
available knowledge about the process, for example, partial differential
algebraic equations (PDAEs), thereby introducing a ‘learning bias’ during
training of the data-driven model.[145] This nudges the ML model in the
direction to learn the underlying physics of the problem, thereby increasing its
extrapolation or generalization capability. As shown in Figure 5, the prediction
loss is first generated using labelled data, and the predictions from the
data-driven model are used to generate the regularization loss using
physics-based model equations or constraints. One key difference from
semi-parametric approaches is that the inputs for enforcing physics-based losses
can be sampled independently of labelled data. The model can therefore be
trained to extrapolate beyond the range of the input space.
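A deliberately tiny sketch of this combined loss, with all numbers and the model form chosen for illustration: the governing "physics" is the ODE dy/dt = -k*y with k = 2, the "network" is a two-parameter model y(t) = A*exp(w*t), and a coarse grid search stands in for gradient-based training. Note that the collocation points carry no labels; only the physics residual is evaluated there.

```python
# Toy physics-informed regularization for dy/dt = -k*y, k = 2.
import math

k = 2.0
t_data = [0.0, 0.1, 0.2]
y_data = [math.exp(-k * t) for t in t_data]   # labelled "measurements"
t_coll = [0.05 * i for i in range(11)]        # unlabelled collocation points

def loss(A, w, lam=1.0):
    """Prediction loss on labelled data plus physics residual loss."""
    data = sum((A * math.exp(w * t) - y) ** 2
               for t, y in zip(t_data, y_data))
    # for y = A*exp(w*t), dy/dt = w*y, so the residual dy/dt + k*y = (w + k)*y
    phys = sum(((w + k) * A * math.exp(w * t)) ** 2 for t in t_coll)
    return data + lam * phys

# Coarse grid search in place of gradient descent on network weights
best = min(((A, w) for A in [0.8 + 0.05 * i for i in range(9)]
                   for w in [-3.0 + 0.1 * j for j in range(21)]),
           key=lambda p: loss(*p))
print(best)   # parameters recovering the true decay rate
```

The physics term drives the search to w = -k even where no labelled data exist, which is exactly the learning bias that improves extrapolation.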

FIGURE 5
Physics-informed regularization. PDAE, partial differential algebraic equations.

Physics-informed neural networks (PINNs) are applied across a wide range of
engineering applications. After their introduction, there has been a notable
increase in their applications to chemical engineering. Because of the extensive
use of PDEs in the form of Navier–Stokes equations, a wide range of articles can
be found in fluid mechanics and heat transfer that use PINNs. Multiple reports
about both physics-guided architectures (which is covered subsequently) and
physics-informed regularization in the field of fluid mechanics have been
reviewed by Sharma et al.[146] Wang and Ren[147] use a PINN to predict
temperatures in a heat conduction process. Patel et al.[148] use a PINN to
predict the temperature profile in a PFR using limited measurements. PINNs have also
been used to model adsorption-based processes such as chromatography[149, 150]
and pressure swing adsorption.[151] These approaches are used in several other
chemical engineering applications to predict process states. Chen et al.[152]
develop a PINN for the initial estimation of phase-equilibrium calculations in
shale reservoir models. In this work, a PINN is used to reduce the time required
to predict equilibrium ratios. Chen et al.[153] provide a comparative evaluation
of PINN models developed for various 1D, 2D, and 3D simulations in voltammetry
and suggest best practices to be followed for developing PINN models in the
field of electrochemical analysis. Takehara et al.[154] used a PINN as a
surrogate model for predicting fluid and thermal fields in the growth of single
bulk crystals and reported a significant reduction in computation time for
predictions. Merdasi et al.[155] used PINNs to predict zeta potential in a
mixing process and demonstrated the efficacy of PINN in comparison to finite
volume method (FVM) simulations. Bibeau et al.[156] trained and used PINNs to
predict the kinetics of biodiesel to show the regularization obtained by PINNs.
Liu et al.[157] used PINNs to predict turbulent combustion fields using sparse
data. Zhang and Li[158] developed a PINN to predict haemoglobin response in an
Erythropoietin treatment. Ren et al.[159] developed PINNs for predicting the
product composition of a biomass gasification process. In the work of Ryu et
al.,[160] a PINN was developed to predict variables in a polymer reactor. Asrav
et al.[161] apply a PINN for modelling an industrial wastewater treatment unit.
Most of the models developed based on PINNs are used to predict process
variables, which describe the state of the system, and are often modelled using
PDEs. This enables researchers to develop digital twins for processes by
creating spatiotemporal profiles using limited measurements.
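The common thread in these applications is a composite loss that combines a data-misfit term with a physics residual evaluated at collocation points. The sketch below is an illustrative assumption, not any cited author's code: a polynomial stands in for the neural network so the example stays self-contained, and the physics term enforces the first-order decay ODE dy/dx = −ky.

```python
import numpy as np

def pinn_style_loss(params, x_data, y_data, x_colloc, k=1.0, w_phys=1.0):
    """Composite PINN-style loss for the ODE dy/dx = -k*y.

    A polynomial y(x) = sum_i params[i] * x**i stands in for the neural
    network; real PINNs differentiate the network itself via automatic
    differentiation rather than by hand.
    """
    def model(x):
        return sum(c * x**i for i, c in enumerate(params))

    def dmodel(x):
        return sum(i * c * x**(i - 1) for i, c in enumerate(params) if i > 0)

    data_loss = np.mean((model(x_data) - y_data) ** 2)    # fit to labels
    residual = dmodel(x_colloc) + k * model(x_colloc)     # ODE residual
    physics_loss = np.mean(residual ** 2)                 # soft constraint
    return data_loss + w_phys * physics_loss

# Taylor coefficients of exp(-x) give a low composite loss on decay data
x_obs = np.linspace(0.0, 1.0, 11)
x_colloc = np.linspace(0.0, 1.0, 50)
loss_good = pinn_style_loss([1.0, -1.0, 0.5, -1.0 / 6.0],
                            x_obs, np.exp(-x_obs), x_colloc)
loss_bad = pinn_style_loss([0.0, 0.0, 0.0, 0.0],
                           x_obs, np.exp(-x_obs), x_colloc)
```

Note that the collocation points need no labels: only the residual of the governing equation is evaluated there, which is why unlabelled regions of the input space can be covered cheaply.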

The advantages of PINNs are similar to those of semi-parametric approaches.
Since labelled data are not required to enforce physics-based soft
constraints, the desired input space can be easily explored and incorporated
into the model. A physics-informed neural ODE is introduced in Sorourifar et
al.[162] to model multiphase chemical reaction systems. The extrapolation
capability of the model after adding information about the kinetics is
demonstrated on a real-world experimental data set. The extrapolation capability
and robustness of a PINN can then be utilized in process monitoring and control.
For instance, Zheng and Wu[163] introduce a physics-informed recurrent neural
network (PIRNN) in a model predictive control (MPC) framework to control
nonlinear processes and emphasize the utility of physics-based regularization
on a process with uncertain parameters. Franklin et al.[164] use a PINN as a
soft sensor to monitor flow rates in an oil well system. Wu et al.[165] develop
a PIRNN-based MPC for controlling crystal growth and nucleation in a batch
crystallization process.

PINNs are also used in inverse problems or parameter estimation. Rogers et
al.[166] use PINNs for identifying a model with time-varying parameters. Lu et
al.[167] use PINNs to identify model parameters for an optimal finned heat sink
system. Selvarajan et al.[168] introduce differential flatness in neural ODEs
for parameter identification. Tappe et al.[169] use PINN-based design of
experiments for parameter identification.
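A minimal flavour of such an inverse problem can be sketched as follows (an illustrative assumption, not the cited authors' method): an unknown rate constant is recovered by minimizing the ODE residual against measured data, here over a simple grid in place of a trained network.

```python
import numpy as np

def estimate_rate_constant(t, y, k_grid):
    """Physics-informed parameter estimation (sketch): for dy/dt = -k*y,
    pick the k whose residual dy/dt + k*y, with dy/dt taken from finite
    differences of the data, has the smallest mean square."""
    dydt = np.gradient(y, t)                       # numerical derivative
    losses = [np.mean((dydt + k * y) ** 2) for k in k_grid]
    return k_grid[int(np.argmin(losses))]

# Recover k = 0.5 from a clean exponential decay
t = np.linspace(0.0, 2.0, 201)
k_hat = estimate_rate_constant(t, np.exp(-0.5 * t),
                               np.linspace(0.1, 1.0, 91))
```

In a full PINN the same residual would be a loss term and k a trainable parameter updated alongside the network weights.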

Other forms of regularization can also be used when training an ML model. For
instance, Muralidhar et al.[170] used prior knowledge about the process to
impose monotonicity constraints while training a neural network to predict
oxygen solubility in water. The improvement in
performance over conventional neural network models for training with limited
data and noisy measurements is demonstrated. However, at this time, PINN-style
approaches dominate the literature.
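A monotonicity constraint of this kind can be written as an extra penalty term; the sketch below is our illustrative formulation, not the cited implementation. It penalizes any increase in a prediction that prior knowledge says should decrease, such as oxygen solubility with rising temperature.

```python
import numpy as np

def monotonicity_penalty(x, y_pred):
    """Penalty on violations of a known decreasing trend: after sorting the
    predictions by x, every positive step (an increase) is squared and
    summed, so a perfectly decreasing prediction incurs zero penalty.
    During training this term is added to the usual data loss."""
    steps = np.diff(np.asarray(y_pred)[np.argsort(x)])
    return float(np.sum(np.clip(steps, 0.0, None) ** 2))

temps = np.array([1.0, 2.0, 3.0])
ok = monotonicity_penalty(temps, np.array([3.0, 2.5, 1.0]))   # decreasing
bad = monotonicity_penalty(temps, np.array([3.0, 3.5, 1.0]))  # local bump
```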

PINNs have shown promise in increasing the interpretability and generalizability
of conventional feed-forward neural networks. Since the information about the
physics of the process is added as a soft constraint in loss formulation, an
issue of convergence to a sub-optimal solution may arise. Cuomo et al.[144]
provide a comprehensive review of the different types of PINNs developed in
the literature and discuss challenges related to convergence and error
analysis, as well as future research directions for PINNs. The weighting parameter on
residuals is a critical factor in determining convergence. Currently, most
methods use a trial and error approach to obtain an appropriate value of this
hyperparameter. There is a need to develop a generalized framework to address
this issue. One way to fix this is by adding a hard constraint in the NN
formulation. Asrav and Aydin[171] used a physics-informed recurrent neural
network (PIRNN), trained using a genetic algorithm (GA) with the
regularization loss imposed as a hard constraint, to model a dynamic process;
they report that the testing accuracy of the PIRNN increased, despite a
reduction in training accuracy, compared to conventional neural networks.
GAs, however, impose a huge
computational burden for optimizing model parameters. Nonlinear programming
(NLP) algorithms such as interior point methods may be used as an alternative to
GA. However, there is no guarantee of convergence of NLP algorithms.


3.3 PHYSICS-GUIDED ARCHITECTURE

This type of model makes use of physics-based knowledge about the process in
designing a neural network architecture. The neurons and hidden layers are
arranged such that the parameters and connections represent physical knowledge
about the process. A physics-guided hierarchical neural network architecture was
proposed by Mavrovouniotis and Chang[120] for monitoring a complex distillation
column. Instead of using a fully connected architecture, different sub-nets were
defined at each hierarchical level to monitor specific aspects of the process.
This modular design results in a network that is sparser than a fully
connected one and in more interpretable model outputs. Russell and
Baker[172, 173] applied the subnet neural network modelling strategy, using
prior knowledge, to model a falling-film evaporator. Each sub-network
represented a specific sub-system with a localized set of inputs, resulting
in a sparse and relatively interpretable neural network. Muñoz-Ibañez et
al.[174] developed a hierarchical neural network, whose design was informed
by the relevant material properties, to model material properties in a die
casting process of aluminium alloys.
Reductions in computational times have been achieved using a modular design of a
low-fidelity neural network, representing a simplified physics-based model,
followed by a high-fidelity neural network model, for predicting magnetic
interactions between particles in close proximity to each other in
colloids.[175]

Prior knowledge about material composition and properties has also been used
to design the graph structure of a neural network. Another notable work in
chemical engineering from this category would be the chemical reaction neural
network (CRNN) by Ji and Deng.[176] The architecture of the neural network is
designed in a way that the weights and biases of the model represent kinetic
parameters, and inputs and outputs are the concentrations of reactants and
products. This enables the model to learn the reaction parameters inherently
using measurement data. Bangi et al.[177] develop a hybrid model, called a
universal differential equation, by modelling unknown process dynamics using a
neural ODE[178] and combining it with known differential equations for a
lab-scale batch fermentation process. Puliyanda et al.[179] use a neural ODE to
obtain reaction mechanism-constrained kinetic models from spectroscopic data.
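The CRNN idea can be sketched in a few lines (our simplified rendition, not the authors' code): the weights of the input layer act as reaction orders, its biases as log rate constants, and the output layer holds stoichiometric coefficients, so a forward pass evaluates mass-action kinetics.

```python
import numpy as np

def crnn_rhs(conc, orders, ln_k, stoich):
    """CRNN-style forward pass (simplified): mass-action rates are recovered
    as r_j = exp(sum_i order_ij * ln(c_i) + ln k_j), i.e., a linear layer on
    log-concentrations followed by an exponential activation; the species
    balance dC/dt = S^T r is the output layer. Rows of `orders` and `stoich`
    index reactions; columns index species."""
    rates = np.exp(orders @ np.log(conc) + ln_k)  # one rate per reaction
    return stoich.T @ rates                       # one balance per species

# Single reaction A -> B with rate 2*c_A (first order in A, k = 2)
dc = crnn_rhs(conc=np.array([3.0, 1.0]),
              orders=np.array([[1.0, 0.0]]),
              ln_k=np.log(np.array([2.0])),
              stoich=np.array([[-1.0, 1.0]]))
```

Because the learned weights are the orders, rate constants, and stoichiometry themselves, the trained model can be read off directly as a kinetic mechanism.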

Additionally, Muralidhar et al.[180, 181] used a physics-guided architecture to
predict drag force on a particle. Machalek et al.[182] used knowledge about
energy balance to isolate and predict individual phenomena occurring within each
boiler of heat absorption units in a thermal power plant. Gallup et al.[126]
developed a physics-guided neural network design such that each layer represents
a component of the CSTR system.

Physics-based feature engineering could also be considered a sub-category of
this class. For example, Yang et al.[183] used physical information to create
features that improved the extrapolation capabilities of the model in predicting
wall shear stress in large eddy simulations. Physics-based prior knowledge
about a CO2 injection process is used to craft features for a risk assessment
in the work of Yamada et al.[184] Most works treat physics-guided
initialization as a separate class, but we argue that it can be considered a
subset of physics-guided architecture. Carranza-Abaid and
Jakobsen[185] have developed a neural network programming paradigm to build
hybrid neural networks. Alhajeri et al.[186] develop an RNN structure
inspired by physics to control a noisy process.


3.4 PERSPECTIVE

Most of the effort in modelling a chemical process aims to simulate the
process accurately and quickly and to predict the values of state variables
in a continuum of time and space; in other words, to create a digital twin of
the process that can virtually represent an actual physical process. The
predictors in a digital twin completely lack interpretability, especially if
neural networks are used for modelling. Hybrid models may overcome this problem
using physics-based knowledge along with experimental data. Sitapure and
Kwon[187] introduced a transformer-based hybrid model of time-series data,
built with semi-parametric approaches, to improve the interpretability of
neural network-based digital twins. Uncertainty in measurements and processes
can be handled effectively using hybrid models.

Although any data-driven model may be used for developing hybrid models,
artificial neural networks (ANNs) have gained the most popularity due to their
universal approximation ability. The issues arising in developing ANNs are
consequently inherited by hybrid models. Hyperparameter tuning is a major hurdle
in developing better models and is an active area of research. It is mostly
carried out using trial-and-error methods; a more elaborate optimization
scheme could be devised, but it would be computationally heavy. In a PINN, the weights
of the loss function are the most critical parameters that require tuning. Since
the regularization acts as a soft constraint, the value determines the local
minima to which the solution converges. If an incorrect value is chosen, the
accuracy reduces drastically, and adding physics information becomes
counterproductive. Some works adaptively generate optimal values of the
weights on the loss-function components. For instance, McClenny
and Braga-Neto[188] developed a self-adaptive PINN by setting the weights as
trainable parameters for every collocation point. These methods, effective for
simpler equations in the physics domain as demonstrated in the paper, may be far
more complex for chemical processes with a high number of states and parameters.
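The self-adaptive mechanism can be caricatured in a few lines (our sketch of the idea, not the authors' implementation): per-collocation-point weights are updated by gradient ascent on the weighted residual loss, so points that stay hard to fit gain influence over training.

```python
import numpy as np

def self_adaptive_step(residuals, lam, lr=0.1):
    """One update of per-collocation-point weights: the weighted loss is
    mean(lam * r^2), and ascending its gradient with respect to lam
    (which is r^2 / N) grows the weights on points with large residuals,
    while the network weights would descend the same loss."""
    loss = float(np.mean(lam * residuals ** 2))
    lam_new = lam + lr * residuals ** 2 / residuals.size
    return loss, lam_new

# The stubborn point (residual 2.0) gains weight faster than the easy one
loss, lam_new = self_adaptive_step(np.array([2.0, 0.1]), np.ones(2))
```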

One reason why PINNs have become popular is the use of automatic differentiation
(AD) in Python packages for developing neural network models. AD records the
operations of the forward pass during training and reuses them to compute
exact gradients, including those of the physics-based constraint terms,
thereby reducing computational effort; gradients of the predicted variables
thus become much easier to calculate.

Hybrid models are currently focused almost exclusively on the process scale
and on operational data. There is an opportunity to extend them to other
scales; however, this will require capturing/encoding knowledge or data in
other forms
(such as from databases) into the hybrid framework, and this may involve looking
at different formats for PINNs other than including differential equations into
the loss function.


4 HUMAN AND AI

The integration of AI alongside human knowledge can enhance (and has enhanced)
the efficiency and effectiveness of chemical engineering processes. Human-AI
collaboration is a transformative partnership where each entity complements the
strengths of the other, leading to enhanced decision-making, problem-solving,
and innovation. In the context of chemical engineering, the collaboration
between humans and AI can be categorized into two main types, depending on who
ultimately benefits (see Figure 6): (1) Human complements AI and (2) AI
complements human. In these frameworks, the human-AI interactions are mostly
focused at the process and plant scales in the context of automation and
automated systems, but they also span the scales of molecules, reactions, and
materials in the context of active learning systems.

FIGURE 6
Overview of human-artificial intelligence (AI) collaborations.


4.1 HUMAN COMPLEMENTS AI

The human complements AI framework involves integrating human knowledge into AI
systems to enhance their capabilities, particularly in real-world scenarios
where data may be limited, expensive, or of poor quality.[170] It is important
to highlight that when referring to ‘human’, we are specifically addressing
domain experts such as plant operators, plant managers, or R&D scientists,
depending on the nature of the problem. The aim is to build a robust AI model
by incorporating human knowledge throughout the workflow, which includes
feature engineering, model design, model training, and model validation.
While this workflow is familiar to data scientists, augmenting it with human
expert knowledge can enhance model performance beyond purely data-driven
approaches. The human
knowledge may take various forms, including relationships between features,
physical models, constraints, or scientific laws. This approach not only
addresses challenges related to data availability but also promotes adaptability
and trustworthiness in AI systems by end-users, ultimately leading to more
robust and effective solutions. Next, we delve into various techniques through
which humans complement AI.

4.1.1 FEATURE ENGINEERING

The performance of the AI model largely depends on the representation of the
feature vector, necessitating significant effort in designing preprocessing
pipelines and data transformations during algorithm deployment.[189] Human
experts play a crucial role in selecting relevant features for AI models,
identifying key variables that contribute to model performance and enhancing
efficiency. Feature engineering is the process of creating new features that
capture the most important information from existing data for AI modelling.
These new features might be ratios, differences, or other mathematical
transformations of existing features. Domain experts can help to design and
extract features that are relevant, meaningful, and useful for the specific
problem or goal. Their profound understanding of the domain enables them to
identify key variables, relationships, and patterns that matter most in the
specific context. Examples include using the log mean temperature difference in
modelling heat exchangers and the use of dimensionless numbers such as the
Reynolds number in flow systems. Feature engineering is, however, an
important yet labour-intensive aspect of ML applications.[189] Moreover,
different types of engineered features can yield varied performance across
models: count features proved effective for DNNs and SVMs but not for
tree-based methods such as random forests in the work of Heaton.[190]
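The two chemical engineering examples above can be computed directly from raw measurements; the sketch below (with illustrative variable names and operating values of our choosing) derives a Reynolds number for pipe flow and a log-mean temperature difference as engineered features.

```python
import numpy as np

def pipe_reynolds(m_dot, d, mu):
    """Reynolds number for flow in a circular pipe, Re = 4*m_dot/(pi*d*mu),
    from mass flow rate (kg/s), pipe diameter (m), and viscosity (Pa*s)."""
    return 4.0 * m_dot / (np.pi * d * mu)

def lmtd(dT1, dT2):
    """Log-mean temperature difference between the two ends of a heat
    exchanger, the standard driving-force feature for duty models."""
    return (dT1 - dT2) / np.log(dT1 / dT2)

re = pipe_reynolds(m_dot=0.1, d=0.05, mu=1.0e-3)  # water-like conditions
dt = lmtd(40.0, 20.0)                             # end temperature differences
```

Feeding such dimensionless or physically meaningful quantities to a model, instead of the raw flow rate and temperatures, encodes the expert's knowledge of which combinations of variables actually govern the physics.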

4.1.2 MODEL DESIGN

Model developers play a crucial role in designing AI models by determining
architectures, selecting hyper-parameters, and designing loss functions to
enhance performance and address specific tasks. Domain experts are vital in this
step as well, by leveraging their expertise to select the most suitable model,
considering domain knowledge, assumptions, and constraints. For instance,
consider the scenario of a pressurized reactor vessel where the reaction rate
increases with increasing pressure. An AI model controls the reactor pressure
with the goal of maximizing production. However, if the model continuously
raises the pressure without considering safety limits, it risks surpassing the
design pressure, potentially leading to a catastrophic accident. This reflects
the importance of domain knowledge during model design. Incorporating domain
knowledge into AI models can be achieved through various means, including
adjusting model initialization, modifying architecture, or modifying the loss
function.[126]

Model initialization refers to the process of setting initial values for the
parameters (such as weights and biases of a neural network) before training
begins. This initialization step is crucial as it can significantly impact the
convergence and performance of the model during training. One option is a
model migration strategy, wherein a new model is initialized by copying
weights and biases from a model pre-trained on similar systems. This approach
leverages existing models and data from the new process to build the model when
there is limited data availability or to reduce training time. Model migration
can be seen in the work of Lu and Gao[191] in building a soft-sensor model for
online prediction of melt-flow length in injection moulding. To achieve this,
existing mould geometries were used, reducing the need for extensive
experimentation. The models corresponding to each mould geometry were aggregated
and trained with new data, resulting in an effective model compared to the base
case. Alternatively, in situations where similar systems are not available,
the model can instead be pre-trained using simulated data generated by a
rough first-principles model. A notable example of this is the work of Brand Rihm et
al.,[192] wherein the lack of real plant batch distillation data is addressed by
leveraging expert knowledge to create a preliminary simulated dataset.
Subsequently, refining the model through training with actual plant data
significantly enhances its performance.
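A migration-style initialization can be sketched as below (a generic illustration, not the cited authors' code): parameters whose names and shapes match a model pre-trained on a similar process are copied over, and the remaining layers start fresh.

```python
import numpy as np

def migrate_parameters(pretrained, new_shapes):
    """Initialize a new model's parameter dictionary: reuse a pre-trained
    array when the layer name exists and the shape matches; otherwise fall
    back to zeros (a stand-in for any fresh initializer)."""
    init = {}
    for name, shape in new_shapes.items():
        src = pretrained.get(name)
        if src is not None and src.shape == tuple(shape):
            init[name] = src.copy()          # migrated from the old process
        else:
            init[name] = np.zeros(shape)     # new layer, default init
    return init

params = migrate_parameters({"w1": np.ones((2, 2))},
                            {"w1": (2, 2), "w2": (3,)})
```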

Modification of the architecture refers to building a model structure that
incorporates principles and insights from physics to enhance its performance,
interpretability, and generalization capabilities. Such architectures typically
involve various adjustments to the network's structure, connections, neuron
functionalities, or combinations of first principle models to better align with
the underlying physical processes governing the system being modelled. Daw et
al.[193] present a physics-guided architecture employing LSTM models to simulate
lake temperature dynamics. Leveraging the known monotonic relationship between
water density and depth, the authors introduce physical intermediate variables
into the model architecture, thereby ensuring the preservation of monotonicity
within the LSTM framework. Alternatively, one can adopt a hybrid approach by
combining AI with first-principles models. This can be achieved through a
series combination, where the outputs of one model feed into the other, or
through a parallel combination to enhance overall performance. For more
details, refer to Section 3 on
hybrid AI models.

As mentioned in Section 3, modification of the loss function is a technique used
to enforce faithfulness to physical laws by penalizing model outputs that
violate these laws. It is typically implemented by augmenting the standard loss
function with an additional term based on the deviation from physical
constraints. The purpose is to guide the model towards solutions that not only
fit the data but also conform to the underlying physical principles, such as in
PINNs. A notable example is PANACHE,[151] which utilizes PINNs to accurately
simulate cyclic adsorption processes. It achieves this by integrating a
physics-constrained loss function, enabling the model to learn underlying
partial differential equations without the need for system-specific inputs like
isotherm parameters. This approach enhances computational efficiency and
reliability while ensuring accurate representation of adsorption phenomena.

4.1.3 MODEL TRAINING

Human expertise can complement AI during model training by providing domain
knowledge, context, and nuanced understanding that may not be readily captured
by algorithms alone. Through a human-in-the-loop mechanism, experts can curate
and label data, offer insights on ambiguous cases, and guide the training
process towards more relevant and meaningful outcomes. One such approach is
active learning. Active learning is a type of semi-supervised learning that
uses a query strategy to select the specific instances from which it learns.
This method, when coupled with human expertise within a human-in-the-loop
framework for labelling selected instances, enhances the refinement process,
leading to more meaningful outcomes in a much faster timeline. Moreover, active
learning operates efficiently with smaller labelled datasets, thereby reducing
manual annotation expenses while maintaining elevated accuracy levels. This
symbiosis of human insight and ML prowess sets the stage for advancements in the
realm of high throughput experimentation (HTE).
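A minimal query strategy of this kind (margin-based uncertainty sampling, shown here as an illustrative sketch rather than a specific tool) selects the unlabelled instances the current model is least sure about and routes them to the expert.

```python
import numpy as np

def query_most_uncertain(proba, n_queries=1):
    """Margin-based active learning query: for each unlabelled instance,
    compute the gap between the top two class probabilities; the smallest
    gaps mark the instances worth sending to a human for labelling."""
    proba = np.asarray(proba)
    top2 = np.partition(proba, -2, axis=1)[:, -2:]  # two largest per row
    margin = top2[:, 1] - top2[:, 0]
    return np.argsort(margin)[:n_queries]           # smallest margin first

# The middle instance (0.55 vs. 0.45) is the most ambiguous
picked = query_most_uncertain([[0.9, 0.1], [0.55, 0.45], [0.7, 0.3]])
```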

HTE is a technique that enables large numbers of experiments to be conducted
in parallel, offering increased efficiency compared to traditional
experimental approaches.[194] In the R&D labs of the chemical
industry, HTE is primarily utilized for the rapid determination of solvents,
reagents, and catalysts by examining a wide array of reactions. Notably, the
study of heterogeneous catalyst performance involves intricate interactions
among various catalyst attributes and operational conditions, resulting in a
multidimensional catalyst discovery and optimization space. In general, HTE
studies are capable of screening large numbers of catalysts efficiently.
However, their efficacy in discovery depends on encountering a high-performing
catalyst within the screened parameter space. Therefore, the integration of HTE
techniques with predictive algorithms is advantageous to guide experiments
effectively.[195] Combining ML and HTE into integrated workflows has given
rise to self-driving laboratories (SDLs), leveraging the complementary
strengths of both approaches.
The ML component of the workflow aids in predictions and experiment design,
while the HTE component executes the suggested experiments, allowing the results
to inform ML model updates.[196] The synergy between ML and HTE becomes much
more powerful when combined with a fully automated system in a closed-loop
fashion to autonomously perform the experiments selected based on the ML model
and decision-making algorithm.[197] In the past decade, SDL applications have
shown promise in diverse areas such as the discovery/development of complex
organic compounds,[198-200] nanomaterials,[201-206] thin-film materials,[207,
208] and carbon nanotubes.[209] Despite these achievements, the full potential
of SDLs in chemical and materials sciences has been hindered by challenges such
as the lack of standardized hardware, accessible software, and user-friendly
operational guidelines, as well as the inability to incorporate physics-based
models easily. Several open-access SDL software packages have emerged to
facilitate autonomous experimentation in chemical and materials sciences,
including ChemOS[210] and ARES OS.[211] These software platforms incorporate
distinct experiment planning algorithms such as Phoenics,[212] Gryffin,[213] and
Golem.[214] Phoenics, a Bayesian global optimization algorithm based on kernel
density estimation, proposes new experimental conditions by leveraging prior
results to minimize redundant evaluations.[212] Gryffin, a general-purpose
Bayesian optimization framework for categorical variables, relies on kernel
density estimation and utilizes descriptors to approximate categorical variables
in a continuous space.[213] Golem is an algorithm applicable to any experimental
planning strategy, accounting for input uncertainties through probability
distributions to locate optima robust to variations arising from uncertainties
in experimental conditions or instrument imprecision.[214] A comprehensive
description of the ML algorithms utilized in SDLs can be found in the cited
literature.[215-218]
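The closed loop described above can be reduced to a skeleton (a deliberately naive planner of our own devising, for illustration only; real SDLs use Bayesian optimizers such as Phoenics or Gryffin): each round, a surrogate ranks the untested candidates, the top one is "executed," and its result feeds back into the model.

```python
import numpy as np

def closed_loop_campaign(run_experiment, candidates, n_rounds):
    """Skeleton of a self-driving-lab loop: each round, a crude surrogate
    (nearest tested neighbour's result plus a distance-based exploration
    bonus) scores the untested candidates, the best is run, and its result
    updates the record for the next round."""
    tested, results = [], []
    for _ in range(n_rounds):
        remaining = [c for c in candidates if c not in tested]
        if not remaining:
            break
        if results:
            def score(c):
                dists = [abs(c - t) for t in tested]
                nearest = int(np.argmin(dists))
                return results[nearest] + 0.5 * min(dists)
            choice = max(remaining, key=score)
        else:
            choice = remaining[0]            # no data yet: pick arbitrarily
        tested.append(choice)
        results.append(run_experiment(choice))
    return tested[int(np.argmax(results))], tested

# Toy objective peaked at condition 3; enough rounds to cover all candidates
best, order = closed_loop_campaign(lambda x: -(x - 3.0) ** 2,
                                   candidates=list(range(7)), n_rounds=7)
```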

While SDLs hold tremendous promise, several challenges persist, including the
automatic dispensing of solids and handling heterogeneous mixtures. Dispensing
solids, especially those with varying properties and in small quantities,
remains a significant challenge. Additionally, handling heterogeneous mixtures
poses difficulties in systems designed for liquid transfer due to the risk of
damage or malfunction. Addressing these challenges is crucial for unlocking the
full potential of SDLs in chemical and materials sciences.[196]

4.1.4 MODEL VALIDATION

In model validation, human expertise is a crucial asset. As AI algorithms
process vast amounts of data and identify patterns, human experts bring
contextual understanding and detailed judgement to the validation process. Human
experts possess the ability to interpret results in the broader context of their
field, discerning between genuine insights and spurious correlations that might
mislead an AI system. Moreover, they can identify biases or limitations within
the data or the model itself, ensuring that the validation process is
comprehensive and rigorous. One such approach is adversarial AI (discussed
next).

Since the 1960s, cybersecurity-relevant systems in the process industry,
particularly SCADA, have evolved from mainframe to distributed architectures,
with local area networks (LANs) introduced in the 1980s.[219] Presently,
these systems are complex and highly
interconnected, increasing vulnerability to cyber-threats.[220] For example, in
2010, attackers used the Stuxnet worm to infect the Natanz nuclear plant's
network via USB. It manipulated centrifuge speeds, causing damage and halting
uranium enrichment for a week, leading to major economic losses.[221]
Cyber-attacks can have significant consequences even without targeting process
equipment. A cyber-attack on an operator's computer leaked confidential data
from multiple Japanese nuclear plants onto the internet, including inspection
forms and reports from 2003 to 2005.[221]

The past two decades have seen a surge of AI and ML techniques for different
applications, such as process monitoring and predictive maintenance in chemical
plants. However, recent research has uncovered a series of security
vulnerabilities within ML models.[222] These vulnerabilities not only pose
economic and reputational risks but also have the potential to trigger
catastrophic events such as hazardous material releases, fires, and explosions,
with severe consequences for workers, the population, and the environment, just
as with cyber-attacks on conventional systems. The necessity for attention in
this direction is underscored by a survey conducted by Microsoft.[223] In this
survey, the authors interviewed 28 ‘security-sensitive’ organizations in fields
such as finance, cybersecurity, and food processing, aiming to comprehensively
elucidate the tactical and strategic methodologies employed to safeguard ML
systems against potential attacks. The study's findings highlight a prevalent
deficiency: a significant portion of ML engineers and incident responders lack
the necessary proficiency to fortify industry-grade ML systems against
adversarial threats. Additionally, the research revealed that 25 out of the 28
organizations surveyed acknowledged the absence of appropriate tools essential
for effectively securing their ML systems.

To address these vulnerabilities, there has been significant interest in
adversarial ML (AML). AML, a blend of cybersecurity and ML, is most commonly
defined as the design of ML algorithms that can resist sophisticated attacks,
together with the study of the capabilities and limitations of
attackers.[224] Attacks on ML
models can target various components, including training data, model parameters,
and desired outputs. AML attacks are classified based on the attacker's
knowledge of the ML model's inner workings into three categories: white-box,
grey-box, and black-box.[225] White-box attacks occur when attackers possess
complete knowledge of the ML model, similar to the model's operator or
developer. For instance, the fast gradient sign method (FGSM),[226]
C&W,[227] JSMA,[228] and DeepFool[229] are examples of white-box attacks.
Grey-box attacks happen when attackers have
partial knowledge of the model, potentially causing model failure. Black-box
attacks involve attackers operating blindly without any knowledge of the model.
In a black-box attack scenario, various methods like decision-based
attacks,[230] alternative model attacks,[231] one-pixel attacks,[232] and others
are employed.
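The FGSM attack mentioned above admits a one-line core (a sketch; in practice the gradient comes from backpropagating a loss through the victim model, which is why it is a white-box attack).

```python
import numpy as np

def fgsm_perturb(x, grad_loss_wrt_x, eps=0.1):
    """Fast gradient sign method: shift every input feature by eps in the
    sign of the loss gradient, the direction that locally increases the
    loss the most under an L-infinity budget. For chemical process data,
    x could be a window of multivariate sensor readings."""
    return x + eps * np.sign(grad_loss_wrt_x)

x_adv = fgsm_perturb(np.array([0.0, 0.0]), np.array([1.5, -2.0]))
```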

Another way to categorize AML involves assessing the attack's capability and its
impact on the ML model. This classification encompasses poisoning, evasion, and
oracle attacks.[233] Poisoning attacks disrupt the model training process by
corrupting either the training data or the model's logic itself, sometimes
targeting the input parameters of ML models. Evasion attacks manipulate the
ML model's decisions to induce misclassifications but do not affect the
training process. Oracle attacks occur when an attacker crafts a malicious substitute for
the original ML model by accessing its application programming interface (API),
a trend amplified by the proliferation of commercial cloud-based computing
services. Oracle attacks can be further subdivided into three categories: (i)
Extraction attacks, which glean architectural details by observing the model's
predictions and probabilities; (ii) Inversion attacks, which involve
reconstructing the training data; and (iii) Inference attacks, facilitating the
identification of specific data points within the distribution of the training
dataset.

In response to such attacks, researchers have also proposed and adapted numerous
defence methods such as GAN-based defence,[234] adversarial training,[235]
defence distillation,[236] and adversarial example detection.[237] Early
approaches, such as information hiding and randomization, aimed to increase
model robustness.[238] More recent studies have categorized defensive techniques
into reactive and proactive approaches.[239] Reactive approaches detect
adversarial attacks post-model deployment, while proactive approaches focus on
designing models inherently resistant to attacks. Although significant
advancements have been achieved in AML, challenges persist within the chemical
process industry. Adversarial examples are predominantly utilized in image
classification, whereas chemical process data is often presented in multivariate
time-series format. Consequently, there is a pressing need for dedicated
research in AML, particularly in domains like chemical engineering where safety
is of paramount importance. Many defensive techniques are specialized to counter
specific attacks and may be susceptible to other forms of attacks. Additionally,
these defence mechanisms are closely tied to specific ML models and network
architectures. Thus, there is a demand for more generalized defences and ML
designs that inherently possess resilience against diverse attacks. Predicting
the exact attacks to train against, as well as those to disregard, presents
further challenges.[240] Ultimately, attempting to train models against every
existing ML attack poses a significant challenge within the chemical process
industry.

In conclusion, while AML has made significant progress in addressing security
concerns in other domains, ongoing research and development are crucial in the
chemical industry to enable the widespread deployment of AI in process
industries. To our knowledge, there has been limited work done in this area,
indicating future opportunities for the development of robust AI and ML systems
resilient to adversarial attacks.


4.2 AI COMPLEMENTS HUMAN

The AI complements human framework involves leveraging AI to enhance human
capabilities, particularly in tasks where human expertise may be limited,
time-consuming to deploy, or prone to error. In contrast to the human
complements AI framework, where human knowledge is integrated into AI
systems, AI complements human focuses on how AI technologies can support and
empower human experts. This
entails utilizing AI algorithms for tasks such as data analysis, pattern
recognition, and decision support. Rather than replacing human expertise, AI
serves as a tool to augment and amplify human capabilities. By harnessing the
strengths of both human intelligence (HI) and AI, organizations can achieve
greater productivity, make better-informed decisions, and solve complex problems
more effectively. Moreover, AI complementing humans fosters collaboration and
mutual learning, leading to continuous improvement and adaptation in dynamic
environments. The various techniques through which AI complements humans are
presented next.

4.2.1 MODEL EXPLANATIONS

In the chemical industry, processes are developed by process engineers and
executed by process operators based on their knowledge in chemical engineering
and practical field experience. Therefore, domain experts in chemical
engineering may find it hard to trust the decisions and recommendations made by
AI-based models, limiting their practical adoption. This challenge is especially
prominent in cases that demand a high degree of reliability, a common requisite
in the chemical industry. To overcome this challenge, recent years have
witnessed the development of distinct Explainable Artificial Intelligence (XAI)
techniques to enhance the interpretability of AI models. The most common ways
humans interpret any system or process are explanation-by-example, feature
relevance explanation, visual explanation, rule-based explanation, and
explanation-by-simplification,[241] as shown in Figure 7.

FIGURE 7
Overview of explanation types.

Explanation-by-example considers the extraction of data examples that relate to
the result generated by a certain model, enabling a better understanding of the
model itself, similar to how humans behave when attempting to explain a given
process. These techniques include: (1) Prototype explanations, which entail the
identification and presentation of typical examples that best represent a
specific prediction. Such explanations play a crucial role in identifying
typical process conditions that lead to desired outcomes, aiding in the
optimization of chemical processes. (2) Adversarial explanations, in contrast,
serve to reveal potential weaknesses and vulnerabilities within
the model's decision-making process. These adversarial examples are crucial for
identifying potential vulnerabilities in safety-critical systems, ensuring the
reliability and security of chemical processes. (3) Counterfactual explanations
involve the generation of examples or scenarios that are similar to the input
but would result in a distinct prediction from the model. These counterfactual
examples assist operators in understanding how minor adjustments to operational
parameters can rectify the abnormality, thereby facilitating process control and
troubleshooting. Manca et al.[242] present the XAI dashboard for the process
industry, which includes a statistical module for data visualization and an XAI
module for exploring counterfactual explanations at varying levels of
abstraction. An illustrative application in batch process control showcases the
utilization of counterfactual variable values that must be altered to attain a
target outcome. Concurrently, Harinarayan and Shalinie[243] established a
process monitoring framework with dual objectives: the provision of explanations
for AI predictions and recommendations to restore processes from abnormal to
normal states. Employing the TreeSHAP method, the study achieves explanatory
insights into fault identification in the Tennessee Eastman process.
Furthermore, it employs the diverse counterfactual explanations method to
recommend corrective actions in response to identified faults.
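The idea behind counterfactual explanations can be sketched with a toy search: randomly perturb the input of a black-box classifier until the prediction flips, keeping the smallest such change. The alarm model, variable names, and random-search strategy below are illustrative assumptions, not any of the cited methods:

```python
import numpy as np

def counterfactual_search(predict, x, target, step=0.1, iters=2000, seed=0):
    """Random local search for a counterfactual: perturb x until the
    model's prediction flips to `target`, keeping the smallest change."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(iters):
        cand = x + rng.normal(0.0, step, size=x.shape)
        if predict(cand) == target:
            if best is None or np.linalg.norm(cand - x) < np.linalg.norm(best - x):
                best = cand
    return best

# Hypothetical alarm model: "fault" if scaled temperature + pressure exceed 1.0
predict = lambda v: int(v[0] + v[1] > 1.0)
x = np.array([0.7, 0.45])              # currently classified as faulty
cf = counterfactual_search(predict, x, target=0)
# cf shows how little the operating point must move to return to "normal"
```

The returned point tells an operator which minimal adjustment would restore a normal classification, mirroring the troubleshooting use case described above.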

Feature relevance explanations aim to describe the functioning of complex models
by measuring the importance of individual features in the model prediction. In
the process industry, these methods play a crucial role in identifying the key
factors, such as temperature, pressure, and reactant concentrations, that most
significantly influence the outcomes of a chemical process, thus aiding in
optimizing and controlling processes efficiently. In chemical engineering, the
Shapley additive explanation (SHAP) is a widely used feature relevance
explanation method for interpreting FDI models in process systems.[244, 245]
However, it is important to note that the computational complexity of SHAP
increases with the number of features. To mitigate this challenge,
Bhakte et al. have proposed the integrated gradient (IG) method, which
effectively utilizes gradient calculations to operationalize SHAP.[246]
Additionally, Agarwal et al. introduced the layer-wise relevance propagation
method to explain FDI results and employed it to prune irrelevant input
variables, thereby enhancing the accuracy of test classifications.[247] Sivaram
and Venkatasubramanian pioneered an XAI framework that generates textual
explanations, producing mechanistic and causal explanations by integrating
symbolic AI with numeric AI.[248] In contrast, Gandhi and White employed the
LIME methodology to
acquire explanations based on molecular descriptors.[249] These explanations are
subsequently converted into textual form by leveraging the GPT-3
text-davinci-001 model. These explanations are further used to interpret the
molecular structure property predictions.
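To see why SHAP's cost grows with the number of features, consider the exact Shapley value, which averages a feature's marginal contribution over every coalition of the remaining features, an O(2^n) enumeration that approximations like TreeSHAP and integrated gradients avoid. The linear "process model" below is a hypothetical illustration:

```python
import itertools
import math

def shapley_values(f, x, baseline):
    """Exact Shapley values by enumerating all feature coalitions.
    Absent features are set to their baseline value. Cost is O(2^n),
    which motivates the approximations discussed in the text."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            for S in itertools.combinations(others, k):
                # standard Shapley coalition weight |S|! (n-|S|-1)! / n!
                w = math.factorial(len(S)) * math.factorial(n - len(S) - 1) / math.factorial(n)
                with_i = [x[j] if (j in S or j == i) else baseline[j] for j in range(n)]
                without = [x[j] if j in S else baseline[j] for j in range(n)]
                phi[i] += w * (f(with_i) - f(without))
    return phi

# Toy linear model: y = 2*T + 3*P + C (coefficients are illustrative)
f = lambda v: 2 * v[0] + 3 * v[1] + v[2]
phi = shapley_values(f, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0])
# For a linear model the Shapley values recover each term: [2.0, 3.0, 1.0]
```

The values also sum to the difference between the prediction at `x` and at the baseline, the additivity property that SHAP exploits.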

Visual explanation refers to techniques and tools that use visualizations to
make AI model predictions or decisions more understandable and interpretable for
humans. Visual explanations often include visual representations such as
heatmaps, feature importance plots, variable attentions, and other graphical
elements that highlight the most relevant input features, regions, or patterns
contributing to a model's output. In chemical engineering, visualization is
important in tasks like quality assurance and process monitoring. For example,
Sun et al.[250] used a class activation map (CAM) to generate heatmaps that
characterize the machine's status using real-time vibration footage, depicting
the normal or fault state of the cantilever beam and water pump. A similar
application was also seen in process monitoring, where Bhakte et al.[251]
employed the Grad-CAM technique to identify the input data feature responsible
for the occurrence of a fault in a chemical process. Danesh et al.[252] used
partial dependence plots, individual conditional expectation, and accumulated
local effects plots to understand the AI model used in a case study of a
combined cycle power plant. Wu et al.[253] leveraged self-attention weights to
interpret FDI results, while Aouichaoui et al.[254] utilized attention weights
as explanations, representing the molecular components contributing to property
predictions. Additionally, Schwaller et al.[255] applied attention mechanisms to
interpret the atom-mapping information between products and reactants.
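The core of CAM-style visual explanations can be sketched in a few lines: each spatial feature map is weighted by its channel's class weight (or pooled gradient, in Grad-CAM), summed, rectified, and normalized into a heatmap. The array shapes and weights below are illustrative assumptions:

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """CAM-style heatmap: weight each spatial feature map by the class
    weight of its channel, sum over channels, keep positive evidence,
    and rescale to [0, 1] for display.
    feature_maps: (channels, H, W); class_weights: (channels,)."""
    cam = np.tensordot(class_weights, feature_maps, axes=1)  # -> (H, W)
    cam = np.maximum(cam, 0.0)           # ReLU, as in Grad-CAM
    if cam.max() > 0:
        cam = cam / cam.max()            # normalize for visualization
    return cam

# Toy example: 2 channels of a 4x4 activation grid with made-up weights
rng = np.random.default_rng(0)
fmaps = rng.random((2, 4, 4))
cam = class_activation_map(fmaps, np.array([0.8, 0.2]))
```

Overlaid on the input (a vibration image or process trend plot), the bright regions of `cam` indicate which inputs drove the model's class decision.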

Explanation-by-simplification encompasses techniques wherein an entirely new
system is constructed based on a complex model to be explained. This simplified
model typically attempts to optimize its likeness to the complex model while
keeping a similar performance score. The simplification of the complex model
makes it more interpretable, leading to more transparent process control
strategies. One such method is limit-based explanations for monitoring (LEMON),
which builds a local linear model in the vicinity of input samples to explain
the FDI results generated by AI models.[256] LEMON uses alarm limits, which
makes the explanations more operator-friendly and easier to understand due to
model simplification.
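The surrogate idea behind LIME- and LEMON-style methods can be sketched as follows: sample points around the input, query the black-box model, and fit a linear model by least squares. This unweighted sketch omits the kernel weighting and alarm limits of the actual methods:

```python
import numpy as np

def local_linear_surrogate(predict, x, scale=0.1, n_samples=500, seed=0):
    """Fit a local linear model around x by sampling nearby points and
    solving ordinary least squares -- a simplified stand-in for the
    local surrogates used by LIME/LEMON."""
    rng = np.random.default_rng(seed)
    X = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    y = np.array([predict(xi) for xi in X])
    A = np.hstack([X, np.ones((n_samples, 1))])   # add intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]                    # slopes, intercept

# Hypothetical black-box model, locally approximable: y = x0^2 + 3*x1
predict = lambda v: v[0] ** 2 + 3.0 * v[1]
slopes, intercept = local_linear_surrogate(predict, np.array([1.0, 2.0]))
# Near x0 = 1 the local gradient of x0^2 is ~2, so slopes come out near [2, 3]
```

The fitted slopes are what an operator reads: how strongly each variable drives the model's output in the neighbourhood of the current operating point.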

Rule-based explanations elucidate the outcomes
of AI models utilized in various processes. This method entails the generation
of interpretable rules tailored to explain decisions, aiding users in
comprehending the intricacies of decision-making processes inherent in AI
systems applied to chemical engineering. In chemical engineering applications,
the rule-based approach allows engineers and stakeholders to not only trust the
decisions made by AI models but also gain insights into the governing principles
behind those decisions. For instance, in the optimization of chemical processes
or the design of new materials, rule-based explanations enable professionals to
comprehend the variables, conditions, and parameters that significantly
influence AI-driven decisions. This transparency becomes crucial for ensuring
the safety, efficiency, and reliability of chemical engineering systems,
aligning technology with human expertise in a synergistic manner.
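A minimal way to obtain such rules is to fit a shallow decision tree and print its decision paths. The operating data and thresholds below are fabricated for illustration, and scikit-learn availability is assumed:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical process records: [temperature, pressure] -> off-spec (1) / on-spec (0)
X = [[350, 1.0], [355, 1.1], [360, 1.2], [390, 1.0], [395, 1.3], [400, 1.4]]
y = [0, 0, 0, 1, 1, 1]

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
rules = export_text(tree, feature_names=["temperature", "pressure"])
print(rules)
# Prints if/else branches (e.g., a split like "temperature <= 375.00")
# that read as operating guidelines an engineer can audit directly
```

Because each prediction follows one explicit path through these thresholds, stakeholders can verify the governing conditions rather than trusting an opaque score.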

The selection of the aforementioned techniques allows end-users to choose the
most suitable approach that aligns with their domain knowledge and serves
specific purposes, providing diverse insights into black-box models. In the
above classification of XAI methods, we acknowledge the potential for overlap
among the categories. Notably, some methods may exhibit dual characteristics,
such as feature relevance and visual explanation.

An important consideration arises regarding when to leverage XAI and when not
to. XAI proves particularly valuable during high-stakes decision-making and
model development. In the chemical industry, where decisions impact operational
safety and environmental concerns, XAI provides transparency, enabling engineers
to grasp the rationale behind decisions and adhere strictly to safety protocols
to prevent accidents and mitigate risks to human health and the environment.
Similarly, during model development, XAI aids in understanding factors
influencing predictions and identifying and rectifying biases, thus facilitating
model debugging. Conversely, it is essential to outline situations where XAI may
not be necessary: first, in low-stakes applications where errors have minimal
consequences, such as in niche product R&D or experimental trials with
negligible economic impact. The second instance is in situations with minimal
interpretability needs, where decisions are straightforward and well-understood,
and the demand for interpretability may be marginal. Third, in cases where there
is a trade-off with performance; incorporating XAI may lead to compromises in
performance, particularly when performance is paramount. Finally, a
self-interpretable model, such as a decision tree, can be preferred, especially
when the accuracy difference between white-box and black-box models is
negligible.

4.2.2 OPERATOR MONITORING

In modern process industries, despite advanced automation and safety protocols,
over 70% of accidents stem from human errors.[257] Addressing this challenge
necessitates methodologies for assessing operators' competence. A notable
contribution in this domain is the AI-powered eye-tracking system developed by
Shajahan et al.[258] This system, known as Dhrushti-AI, establishes a robust
framework for capturing operator cognitive behaviour within process plant
environments through multi-screen-multi-user eye tracking capabilities. By
integrating facial recognition algorithms and image processing techniques, such
systems offer valuable insights into operators' cognitive capabilities, thereby
supplementing human judgement and bolstering safety across industrial settings.
Within this framework, various image processing methods, such as thresholding,
closing, and contour extraction, are employed to detect pupils in captured
images. The eye tracker accurately estimates gaze direction by leveraging
information such as head orientation, pupil centre locations, and personalized
calibrated gaze models. It is widely acknowledged that understanding cognitive
behaviour is pivotal in addressing the root causes of abnormal situations and
accidents.
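The thresholding-closing-contour pipeline described above can be sketched with SciPy's morphology routines on a synthetic eye image; the grey-level cutoff and disc geometry are illustrative assumptions, not parameters of Dhrushti-AI:

```python
import numpy as np
from scipy import ndimage

def locate_pupil(gray, thresh=60):
    """Threshold the dark pupil region, close small gaps, label
    connected components, and return the centroid of the largest blob.
    `thresh` is an assumed grey-level cutoff."""
    mask = gray < thresh                                   # pupil is dark
    mask = ndimage.binary_closing(mask, structure=np.ones((3, 3)))
    labels, n = ndimage.label(mask)
    if n == 0:
        return None
    sizes = ndimage.sum(mask, labels, index=range(1, n + 1))
    largest = int(np.argmax(sizes)) + 1
    return ndimage.center_of_mass(mask, labels, largest)   # (row, col)

# Synthetic eye image: bright background with a dark disc centred at (20, 30)
img = np.full((40, 60), 200, dtype=np.uint8)
rr, cc = np.ogrid[:40, :60]
img[(rr - 20) ** 2 + (cc - 30) ** 2 < 25] = 10
centre = locate_pupil(img)
```

In a full gaze tracker, the recovered pupil centre would then be combined with head orientation and a calibrated gaze model, as described above.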


4.3 HUMAN-AI CONFLICT

Incorporating AI-based decision-making in process industries is quickly gaining
traction, and it is natural that dissonance can occur between the operator's
decision and the AI's. Arunthavanathan et al.[259] classify these conflicts into
(i) Observation conflicts, (ii) Interpretation conflicts, and (iii) Action
conflicts. The situational awareness demonstrated by human operators makes human
decision-making an emotional and intellectual process, allowing for creative
problem solving, which is found to be lacking in AI-based decision-making.
Furthermore, AI-based decision-making is closely linked to its ability to
contextually learn decisions from available data. With projected
trends in AI's learning capacity, solely AI-based decision-making in process
industries can be a realizable future, but at present, the general consensus is
that AI needs to be overseen by HI owing to safety concerns.[260]


5 GENERATIVE AI

Process engineering concepts enable chemical transformation at the highest scale
determined by macroeconomic trends. The inherent challenges involve new product
discovery, design and upscaling, unit operations development, and integration of
multifunctional processes while meeting quality and safety standards and
reducing costs and complexity with increased business flexibility. This
underscores the significant potential for generative AI applications in various
aspects of process engineering because of its multi-model and multi-scale big
data processing and generative capabilities, along with natural language
augmentation. In this section, we highlight the key generative AI work done
across process system engineering, that is, in process design and development,
process modelling and control, and process diagnosis and prognosis.


5.1 PROCESS DESIGN

A significant fraction of generative AI applications are focused on process
design, including molecular and materials design and process development.

5.1.1 PRODUCT DEVELOPMENT

The process development cycle involves novel compound generation, property
prediction, and optimization. Properties like synthesizability, solubility,
affinity, and drug-like characteristics such as ADMET (adsorption, distribution,
metabolism, elimination, and toxicity), and so forth are specifically targeted
in the pharmaceutical and chemical industry. AI has been used for targeted
product development without synthesizing materials with significantly lower
resources. In the molecular discovery space, RNNs, graph neural networks (GNNs),
autoencoders, and specifically VAEs, GANs, and most recently, transformer-based
architectures have shown potential in generating novel products with desired
properties.

RNNs were the earliest architecture that utilized generative capabilities in
molecular design applications, property optimization, and inverse design
settings.[17, 261-264] Notably, Segler et al. applied long short-term memory
(LSTM)-based models along with RL to fine-tune molecular properties for de novo
drug design.[262]
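The character-by-character generation these RNNs perform can be illustrated, at a much cruder level, with a first-order Markov model over SMILES characters; real systems replace the transition table with an LSTM or transformer trained on large SMILES corpora, and the tiny training set below is purely illustrative:

```python
import random
from collections import defaultdict

# Tiny illustrative SMILES set (stand-in for a large training corpus)
train = ["CCO", "CCN", "CCCO", "CCCN", "CCOC"]

# Count character-to-character transitions, with start (^) and end ($) tokens
counts = defaultdict(lambda: defaultdict(int))
for smi in train:
    s = "^" + smi + "$"
    for a, b in zip(s, s[1:]):
        counts[a][b] += 1

def sample(max_len=10, seed=0):
    """Sample one string character by character from the transition counts."""
    rng = random.Random(seed)
    out, ch = [], "^"
    for _ in range(max_len):
        nxt = counts[ch]
        ch = rng.choices(list(nxt), weights=list(nxt.values()))[0]
        if ch == "$":
            break
        out.append(ch)
    return "".join(out)

mol = sample()   # a string over the characters seen in training
```

An RNN generalizes this by conditioning each next-character distribution on the whole prefix rather than only the previous character, which is what allows it to produce syntactically valid, novel molecules.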

GNNs are especially well-suited for molecular design due to their ability to
model non-Euclidean data and representations similar to the chemical
structures.[265-267] Various authors used variants like the message passing
neural network (MPNN),[268-270] SchNet,[271] and graph convolutional networks
(GCNs)[272] to predict properties, obtaining better performance than earlier
models. Although GNNs show good accuracy, they require large amounts of data to
learn the mappings.

Autoencoders (AEs) were employed next, but had limited success due to the sparse
representation by simple AEs. However, the probabilistic version of AEs, known
as VAEs, proved effective in modelling the latent distributions of molecular
structure to map and extrapolate the properties. In 2018, Gomez-Bombarelli et
al. first mapped molecular properties and designed new molecules using VAE.[23]
Further, Flam-Shepherd et al. added an MPNN to the VAE architecture to enhance
the performance of the model.[273] Moreover, variants like junction tree VAE
(JTVAE) have shown immense potential in molecule generation.[274] The
conditional version of VAEs usually constrains the representation towards a
smoother distribution surface that is optimized for desired goals such as
novelty, validity, and functionality.[275-278]

The generative nature of adversarial networks has also been utilized for
molecular design to map and generate the molecular structure distribution. The
competing nature of the generator and discriminator were utilized to generate
novel molecules with desired characteristics. Cao et al. and others applied GANs
with RL-based optimization to achieve the required property for chemical
molecule generation.[43, 279, 280]

The adoption of transformers for de-novo design of molecules is favoured by the
shared similarity of the synergistic combination of building blocks between
natural language and molecular representations.[281, 282] These models have
shown state-of-the-art performance compared to their counterparts.[283, 284]
Transformer models utilize various architectures depending on the end
application, such as encoder-decoder architectures (mapping and translation),
encoder-only models (representation learning like property prediction[285-287]),
and decoder-only architectures (primarily generative applications like
generating novel valid molecules[288]). Honda et al. reported the use of
transformers for the prediction of molecular properties in 2019.[289] The
conditional variants of transformers such as MolGPT[288] constrain the input
with desired objectives such as properties or scaffold information to guide
molecule generation. The notable applications of the transformer model for
molecular representation and property prediction include MolBERT[290] and
ChemBERTA.[291] The outcome of these models can be further optimized towards
certain objectives using techniques like Bayesian optimization[23, 292] and RL.

Although the majority of generative AI efforts have focused on the design of
drug-like molecules, an emerging trend is observed across other chemical
sectors, such as VAE-based solid-state materials design,[293] porous crystalline
material discovery,[294, 295] and GAN-based design of metal organic
frameworks.[296]

5.1.2 PROCESS DEVELOPMENT

Conventional methods like full factorial designs,[297] fractional factorial
designs,[298] central composite designs,[299] designs for mixtures,[300] and
optimal design methods such as definitive screening designs[301] have
traditionally been used for process development.[302-304] However, the
uncertainty in design space, intractability, laborious rule-based compilation,
and scalability issues limit their application for optimal process
development,[305] and the recent use cases of generative AI in forward
synthetic, retrosynthetic, and condition recommendation are promising in this
direction.

The molecular transformer model of Schwaller et al.[306] can predict reaction
outcomes based on available reactants and reagents and is considered a landmark
in this direction. It utilizes the transformer architecture to achieve over 90%
top-1 accuracy. Inspired by this, some retrosynthetic models based on the same
architecture have been proposed.[20, 307-309] Further, Wang et al. proposed a
single-step template-free transformer-based method to improve the chemical
validity and diversity of the previous work.[310] Lin et al. improved the model
accuracy further in their work.[19] Other notable variants include the
graph-enhanced transformer and hybrid models.[311, 312] Moreover, generative AI
has also shown the potential to infer appropriate reaction conditions, albeit
limited to a single reaction class.[313, 314] The lack of high-quality data
makes it difficult to develop a model to extract reaction kinetics. However,
Angello et al. reported a closed-loop workflow to discover general reaction
conditions.[315] Schwaller et al., on the other hand, applied the transformer
model to predict reaction yields.[17, 287, 316] Similarly, Sato et al. utilized
a MPNN model for the prediction of yield.[317] Likewise, Sandfort et al. have
utilized multiple representations to improve yield prediction.[318] Thus,
generative AI has shown promising results, but the lack of multi-step synthetic
capabilities and of process-condition augmentation, such as catalyst and solvent
conditions, requires further improvement. Another noteworthy trend
is the introduction of automated flowsheet generation, where generative
transformer models have taken centre stage. Examples such as ‘Learning from
Flowsheets’[80] and ‘Flowsheet Synthesis through Hierarchical RL and Graph
Neural Networks’[319] highlight advancements in implementing the representation
and enhancement of flowsheets. These methods include the representation of
process topology using simplified flowsheet-input line-entry system (SFILES)[77]
and mapping of PFDs and P&IDs for generative purposes[78, 80, 81, 84] as also
highlighted in Section 2. Approaches like RL using deep Q-networks[320] and
hierarchical RL[319] introduce a structured and automated approach to flowsheet
synthesis. The hierarchical framework systematically breaks down the generation
process, improving efficiency and the quality of flowsheet representations.
Further, Oeing et al. utilized the RNN to generate the subsequent processing
stage[84] and Vogel et al. utilized the transformer model for flowsheet
completion.[80] Further, they utilized the transformer model for decentralized
control generation for the given PFDs with top-5 accuracy above 85%. Along with
control structure design, the method showed promise in piping and
instrumentation diagrams (P&ID) generation as well.[81]


5.2 PREDICTIVE MODELLING AND CONTROL

The next aspect of process engineering includes modelling process behaviour for
process monitoring, control, and optimization.[321-323] Although the literature
highlights the success of generative AI in multivariate time series forecasting,
the full process scale application for the chemical industry is yet to be
seen.[324, 325]

In this direction, fully automated chemical synthesis using robots is a future
trend. The advantage lies not only in the faster and more efficient synthesis of
even hazardous materials but also in achieving expert-level yields and purity.
The conceptual and physical realization of this concept, using AI robots that
perform experimentation, synthesis, and analysis at laboratory scale, has been
demonstrated in the cited literature.[198, 208, 326] Moreover, the emergence of visual
synthesis through generative models has made considerable progress in capturing
and reproducing complex visual patterns within process systems, leading to
better monitoring and control. Specifically, works like ‘Bubble Generative
Adversarial Networks for Synthesizing Realistic Bubbly Flow Images’[327] and
‘Generative Principal Component Thermography for Enhanced Defect Detection and
Analysis’[328] highlight the versatility of GANs in depicting various processes,
whether capturing bubbly flow dynamics or improving defect detection in
manufacturing. These models demonstrate a robust capability to improve
decision-making in complex and dynamic systems.


5.3 PROCESS DIAGNOSIS AND PROGNOSIS

In process engineering, process diagnosis and prognosis involve identification
of the root cause of underlying faults and, subsequently, time-to-failure (TTF)
estimation by extrapolating the underlying root cause. These tools provide a
foundation for condition-based predictive maintenance while mitigating the risk
of unplanned downtime. Traditionally, the method involves techniques like
PCA-based process diagnosis, transfer entropy,[329] time delay analysis,[330]
Granger causality,[331] and process causal maps[332, 333] to identify the root
cause of the anomaly. However, these methods have been shown to struggle for
processes with a very large number of variables that are highly correlated,
leading to a smearing effect.[334] Therefore, generative AI is particularly
suited to model the vast topological space of multivariate time series. Some
work for anomaly detection has been reported in the literature,[311, 335, 336]
and has the potential to be applied directly to process level data.
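The PCA-based diagnosis mentioned above typically monitors the Hotelling T² statistic of new samples against a model fitted on normal operating data; a minimal sketch, using synthetic, illustrative data, is:

```python
import numpy as np

def fit_pca_monitor(X_normal, n_pc=2):
    """Fit a PCA model on normal operating data and return a function
    computing the Hotelling T^2 statistic of a new sample."""
    mu = X_normal.mean(axis=0)
    Xc = X_normal - mu
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_pc].T                                # loading vectors
    lam = (s[:n_pc] ** 2) / (len(X_normal) - 1)    # variance per PC
    def t2(x):
        t = (x - mu) @ P                           # project onto PCs
        return float(np.sum(t ** 2 / lam))         # Hotelling T^2
    return t2

# Synthetic "normal" data: two correlated process variables
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 2))
X = z @ np.array([[1.0, 0.8], [0.0, 0.6]])         # induce correlation
t2 = fit_pca_monitor(X, n_pc=2)
# A point far outside the normal cloud yields a much larger T^2 than
# typical training samples, flagging an anomaly
```

Once T² exceeds a control limit, contribution analysis over the loadings is used to point at candidate root-cause variables, which is where the smearing effect noted above arises for highly correlated variables.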

Figure 8 provides an overview of generative AI methods in process systems
engineering with the type of data and end-use application.

FIGURE 8
The application of generative artificial intelligence (AI) methods in process
systems engineering. GAN, generative adversarial networks. VAE, variational
autoencoders. FBM, fractional Brownian motion. LLM, large language model. EBM,
energy-based model.


5.4 FUTURE

The innovations in generative AI have already transformed its capabilities from
unimodal applications, such as molecule-to-molecule translation,[306, 308, 337]
to multimodal applications, such as spectra-to-compound mapping,[338]
natural-language query-based molecule generation,[339] and inferring
experimental procedures.[67, 340] The continually expanding range of
capabilities exhibited by large
language models like GPT and BERT[341] serve as motivation for researchers to
leverage these capabilities to effectively address unique challenges faced by
generative AI in the process industry, such as (1) scarcity of process level
quality data and experimental uncertainty[342]; (2) suboptimal representation
learning leading to generative gaps[343, 344]; (3) lack of process-specific
benchmarks[342, 345]; (4) process level integration issues; and (5)
interpretability and ethical concerns associated with the black-box nature of
generative AI, which hinder its ability to overcome the confidence barrier. Use
cases like LigGPT[346] that incorporate GPT-style architectures for generative
applications are promising in this direction.

The low-data issue necessitates conducting high-quality, high-throughput
experimentation or meta-learning coupled with data mining to collect the
reported data. Moreover, active learning,[347, 348] transfer learning,[349]
GANs, and hybrid models[131] have been shown to be robust in such scenarios.
RL[350-353] or Bayesian goal-directed generative AI[255] has shown superior
performance in limited-data regimes. Further, benchmarking platforms like
GuacaMol,[354] which provide different performance criteria to evaluate models,
are available but limited to product development; similar platforms are required
for process-scale benchmarking.

Recent innovations in generative AI have shown emerging capabilities such as
chain-of-thought (CoT) reasoning,[355] instruction following,[356] and
few-shot[357] or zero-shot[358] generalization as model sizes grow. Further,
innovation in quantum computing has the potential to reduce the computational
load and improve the model performance and size of generative AI.[359, 360]
These new capabilities have the potential to benefit process development,
monitoring, and control, along with better interpretability and interactivity.
The development of an AI-based centralized process operator to monitor every
process operation, control in real-time, and optimize conflicting objectives
such as reaction yield, cost, purity, emission, and so forth, is an aspirational
target. In this direction, agents powered by large language models[361] along
with chain-of-thought capabilities represent a promising application in
generative AI with the capacity for reasoning, executing tasks in an active
learning style, and validating results through collaboration with other
computational tools. Similarly, highly automated robotic factories have the
potential to transform process engineering towards miniature factories capable
of being installed in laboratories, houses, and so forth for better efficiency
and resource management.[199, 362]


6 TO AI OR NOT TO AI?

The promise of AI, as reviewed in the preceding sections, is now being tested in
many real-life applications. While there are many success stories, there have
also been a large number of cases where they have been found wanting. One survey
indicated that up to 70% of AI initiatives may yield no or minimal impact.[363]
Another survey by SAS, Accenture Applied Intelligence, Intel, and Forbes
Insights revealed the need for human oversight, especially when AI is used for
automating critical operational decisions. Specifically, it found that one in
four enterprises had to reassess, redesign, or override AI-based systems due to
unsatisfactory results. Reasons for AI system failures include deviations from
intended use (48%), inconsistent outputs (38%), and ethical concerns (34%).[363]
These highlight the need for an understanding of the factors that contribute to
the success or failure of AI projects. Five key factors can be distinguished as
elaborated next.

Data quality is a multifaceted concept encompassing attributes such as accuracy,
completeness, consistency, relevance, and timeliness.[364, 365] Inaccurate data
(for instance, arising from faulty sensor readings) can introduce biases or
errors. Incomplete data (such as missing entries or fragmented records) can
hinder the model's ability to determine meaningful patterns. Inconsistencies in
data (different sampling frequencies or data integration from different sources)
can confound AI algorithms, leading to inaccuracies. Irrelevant data (e.g., for
a fault diagnosis application, extensive data from normal operations in lieu of
data from abnormal conditions) may introduce biases. Additionally, data that
does not account for system changes (such as equipment degradation or catalyst
deactivation) or changes in environmental conditions can compromise long-term
performance in the dynamic settings that are common in the chemical industry.

Data quantity: The development of AI models relies on adequate data. Data
adequacy must be evaluated with respect to the complexity of the AI model (both
structural and parametric). Limited data (faulty data for fault diagnosis or
reaction data in retrosynthesis), in general, leads to poorer model performance.
To mitigate data limitations, various strategies can be adopted, such as
augmenting the data using process simulations, leveraging existing datasets from
similar processes, or incorporating insights from domain experts, as discussed
in Section 4.1. If, even with such augmentation, the quantity of data is
inadequate, then alternate non-AI approaches should be considered.

Usability refers to the ease with which end-users can operate the AI system's
functionalities and interfaces without errors. The complexity of AI models,
along with their black-box nature, can result in (often unstated) expectations
being placed on the end-user regarding proper usage. If an end-user, such as a
control room operator, is not well-versed with the intricacies of the AI model
but is responsible for the outcome arising from its usage (overall
responsibility for ensuring stable operations), there is a potential for
mal-usage. The extent of usage of AI models and the model's complexity may need
to be constrained in order to ensure compatibility with the user's capabilities.
Also related is the need for regulatory compliance, such as in pharmaceutical
manufacturing, of the data used for developing and maintaining AI models.[366]
Model explanations discussed in Section 4.2.1 offer one means to address or
ameliorate such limitations.

The paradox of automation, especially the long-term effects of replacing human
decision-making with an AI system, must also be considered when evaluating the
benefits. As the AI model gets better, the frequency at which the human end-user
would be called upon will decrease. Excessive reliance on AI would, over time,
diminish human end-users' skills.[367] However, the out-of-practice human
end-user would still be required to step in during the rarest but most complex
situations, where the AI itself is likelier to fail (owing to the sparsity of
rare-event data in training). A careful evaluation of these considerations is
crucial when deciding whether high-level AI should be employed, especially in
human-in-the-loop systems.

AI economics: Any large-scale project would require economic justification and a
positive return on investment (ROI). The development, deployment, and long-term
maintenance of AI systems often involve unforeseen expenses associated with
additional data collection (see data quality and quantity issues above) and
human resources. Training end-users to become savvy users of AI could be
essential for project success but can be complicated due to the usability
challenges. Further, new personnel responsible for model maintenance may be
needed since automated model updates are still in their infancy. These
requirements may not be fully evident at the AI project's inception or evolve
over its lifecycle and significantly affect the AI project's ROI.[368] Hence,
lifecycle cost analysis, accounting for uncertainty and risks, should be used to
decide on the attractiveness of AI projects.


AUTHOR CONTRIBUTIONS

Karthik Srinivasan: Conceptualization; investigation; writing – original draft.
Anjana Puliyanda: Writing – original draft; conceptualization; investigation.
Devavrat Thosar: Conceptualization; investigation; writing – original draft.
Abhijit Bhakte: Conceptualization; investigation; writing – original draft.
Kuldeep Singh: Conceptualization; investigation; writing – original draft.
Prince Addo: Conceptualization; investigation; writing – original draft.
Rajagopalan Srinivasan: Conceptualization; investigation; writing – original
draft; writing – review and editing; funding acquisition. Vinay Prasad:
Conceptualization; investigation; writing – original draft; writing – review and
editing; funding acquisition.


OPEN RESEARCH


DATA AVAILABILITY STATEMENT

Data sharing is not applicable to this article as no datasets were generated or
analyzed during the current study.

REFERENCES

 * 1. V. Venkatasubramanian, AIChE J. 2019, 65, 466. 10.1002/aic.16489
 * 2. E. Rich, Artificial Intelligence, McGraw-Hill, New York 1983.
 * 3. C. Thon, B. Finke, A. Kwade, C. Schilde, Advanced Intelligent Systems 2021, 3, 2000261. 10.1002/aisy.202000261
 * 4. D. Dutta, S. R. Upreti, Can. J. Chem. Eng. 2021, 99, 2467. 10.1002/cjce.24246
 * 5. A. M. Schweidtmann, E. Esche, A. Fischer, M. Kloft, J. U. Repke, S. Sager, A. Mitsos, Chem. Ing. Tech. 2021, 93, 2029. 10.1002/cite.202100083
 * 6. J. M. Weber, Z. Guo, C. Zhang, A. M. Schweidtmann, A. A. Lapkin, Chem. Soc. Rev. 2021, 50, 12013. 10.1039/D1CS00477H
 * 7. L. von Rueden, S. Mayer, K. Beckh, B. Georgiev, S. Giesselbach, R. Heese, B. Kirsch, J. Pfrommer, A. Pick, R. Ramamurthy, M. Walczak, J. Garcke, C. Bauckhage, J. Schuecker, IEEE Transactions on Knowledge and Data Engineering 2023, 35, 614.
 * 8. L. S. Keren, A. Liberzon, T. Lazebnik, Sci. Rep. 2023, 13, 1249. 10.1038/s41598-023-28328-2
 * 9. S. L. Brunton, B. W. Brunton, J. L. Proctor, J. N. Kutz, PLoS One 2016, 11, e0150171. 10.1371/journal.pone.0150171
 * 10. J. Willard, X. Jia, S. Xu, M. Steinbach, V. Kumar, ACM Computing Surveys 2022, 55, 1. 10.1145/3514228
 * 11. D. S. Wigh, J. M. Goodman, A. A. Lapkin, Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2022, 12, e1603. 10.1002/wcms.1603
 * 12. L. David, A. Thakkar, R. Mercado, O. Engkvist, J. Cheminf. 2020, 12, 1. 10.1186/s13321-020-00460-5
 * 13. K. C. Leonard, F. Hasan, H. F. Sneddon, F. You, ACS Sustainable Chem. Eng. 2021, 9, 6126. 10.1021/acssuschemeng.1c02741
 * 14. D. Weininger, J. Chem. Inf. Comput. Sci. 1988, 28, 31. 10.1021/ci00057a005
 * 15. N. M. O'Boyle, A. Dalke, 2018. 10.26434/chemrxiv.7097960.v1
 * 16. M. Krenn, F. Häse, A. K. Nigam, P. Friederich, A. Aspuru-Guzik, Machine Learning: Science and Technology 2020, 1, 045024. 10.1088/2632-2153/aba947
 * 17. P. Schwaller, T. Gaudin, D. Lányi, C. Bekas, T. Laino, Chem. Sci. 2018, 9, 6091. 10.1039/C8SC02339E
 * 18. W. Bort, I. I. Baskin, T. Gimadiev, A. Mukanov, R. Nugmanov, P. Sidorov, G. Marcou, D. Horvath, O. Klimchuk, T. Madzhidov, A. Varnek, Sci. Rep. 2021, 11, 3178. 10.1038/s41598-021-81889-y
 * 19. K. Lin, Y. Xu, J. Pei, L. Lai, Chem. Sci. 2020, 11, 3355. 10.1039/C9SC03666K
 * 20. A. A. Lee, Q. Yang, V. Sresht, P. Bolgar, X. Hou, J. L. Klug-McLeod, C. R. Butler, Chem. Commun. 2019, 55, 12152. 10.1039/C9CC05122H
 * 21. N. Pillai, A. Dasgupta, S. Sudsakorn, J. Fretland, P. D. Mavroudis, Drug Discovery Today 2022, 27, 2209. 10.1016/j.drudis.2022.03.017
 * 22. R. Winter, F. Montanari, F. Noé, D. A. Clevert, Chem. Sci. 2019, 10, 1692. 10.1039/C8SC04175J
 * 23. R. Gómez-Bombarelli, J. N. Wei, D. Duvenaud, J. M. Hernández-Lobato, B. Sánchez-Lengeling, D. Sheberla, J. Aguilera-Iparraguirre, T. D. Hirzel, R. P. Adams, A. Aspuru-Guzik, ACS Cent. Sci. 2018, 4, 268. 10.1021/acscentsci.7b00572
 * 24. L. Krasnov, I. Khokhlov, M. V. Fedorov, S. Sosnin, Sci. Rep. 2021, 11, 1. 10.1038/s41598-021-94082-y
 * 25. K. Rajan, A. Zielesny, C. Steinbeck, J. Cheminf. 2021, 13, 1. 10.1186/s13321-020-00477-w
 * 26. J. M. Goodman, I. Pletnev, P. Thiessen, E. Bolton, S. R. Heller, J. Cheminf. 2021, 13, 1. 10.1186/s13321-021-00517-z
 * 27. H. Kim, J. Na, W. B. Lee, J. Chem. Inf. Model. 2021, 61, 5804. 10.1021/acs.jcim.1c01289
 * 28. J. Handsel, B. Matthews, N. J. Knight, S. J. Coles, J. Cheminf. 2021, 13, 1. 10.1186/s13321-021-00535-x
 * 29. L. Yao, M. Yang, J. Song, Z. Yang, H. Sun, H. Shi, X. Liu, X. Ji, Y. Deng, X. Wang, Anal. Chem. 2023, 95, 5393. 10.1021/acs.analchem.2c05817
 * 30. O. Méndez-Lucio, B. Baillif, D. A. Clevert, D. Rouquié, J. Wichard, Nat. Commun. 2020, 11, 1. 10.1038/s41467-019-13993-7
 * 31. M. Simonovsky, N. Komodakis, in Artificial Neural Networks and Machine Learning – ICANN 2018: 27th International Conference on Artificial Neural Networks, Proc., Part I, Springer, Berlin 2018, pp. 412–422.
 * 32. O. Prykhodko, S. V. Johansson, P. C. Kotsias, J. Arús-Pous, E. J. Bjerrum, O. Engkvist, H. Chen, J. Cheminf. 2019, 11, 1. 10.1186/s13321-019-0397-9
 * 33. S. Wang, Y. Guo, Y. Wang, H. Sun, J. Huang, in Proc. of the 10th ACM Int. Conf. on Bioinformatics, Computational Biology and Health Informatics, Association for Computing Machinery, Niagara Falls, NY 2019, pp. 429–436.
 * 34. G. A. Pinheiro, J. Mucelini, M. D. Soares, R. C. Prati, J. L. D. Silva, M. G. Quiles, J. Phys. Chem. A 2020, 124, 9854. 10.1021/acs.jpca.0c05969
 * 35. G. Grethe, G. Blanke, H. Kraut, J. M. Goodman, J. Cheminf. 2018, 10, 1. 10.1186/s13321-018-0277-8
 * 36. C. W. Gao, J. W. Allen, W. H. Green, R. H. West, Comput. Phys. Commun. 2016, 203, 212. 10.1016/j.cpc.2016.02.013
 * 37. S. N. Elliott, K. B. M. III, A. V. Copan, M. Keçeli, C. Cavallotti, Y. Georgievskii, H. F. S. III, S. J. Klippenstein, Proc. Combust. Inst. 2020, 000, 1.
 * 38. S. Rangarajan, A. Bhan, P. Daoutidis, Comput. Chem. Eng. 2012, 45, 114. 10.1016/j.compchemeng.2012.06.008
 * 39. P. P. Plehiers, G. B. Marin, C. V. Stevens, K. M. V. Geem, J. Cheminf. 2018, 10, 1. 10.1186/s13321-018-0269-8
 * 40. C. W. Coley, W. Jin, L. Rogers, T. F. Jamison, T. S. Jaakkola, W. H. Green, R. Barzilay, K. F. Jensen, Chem. Sci. 2019, 10, 370. 10.1039/C8SC04228D
 * 41. M. H. Segler, M. P. Waller, Chem. – Eur. J. 2017, 23, 5966. 10.1002/chem.201605499
 * 42. M. Sacha, M. Błaż, P. Byrski, P. Dąbrowski-Tumański, M. Chromiński, R. Loska, P. Włodarczyk-Pruszyński, S. Jastrzębski, J. Chem. Inf. Model. 2021, 61, 3273. 10.1021/acs.jcim.1c00537
 * 43. N. De Cao, T. Kipf, ArXiv preprint 2022, arXiv:1805.11973, http://arxiv.org/abs/1805.11973 (accessed: November 2023).
 * 44. A. E. Blanchard, C. Stanley, D. Bhowmik, J. Cheminf. 2021, 13, 1. 10.1186/s13321-021-00494-3
 * 45. S. H. Hong, S. Ryu, J. Lim, W. Y. Kim, J. Chem. Inf. Model. 2020, 60, 29. 10.1021/acs.jcim.9b00694
 * 46. D. A. Clevert, T. Le, R. Winter, F. Montanari, Chem. Sci. 2021, 12, 14174. 10.1039/D1SC01839F
 * 47. V. Mann, V. Venkatasubramanian, React. Chem. Eng. 2023, 8, 619. 10.1039/D2RE00309K
 * 48. L. Xue, J. Bajorath, Comb. Chem. High Throughput Screening 2012, 3, 363. 10.2174/1386207003331454
 * 49. A. Cereto-Massagué, M. J. Ojeda, C. Valls, M. Mulero, S. Garcia-Vallvé, G. Pujadas, Methods 2015, 71, 58. 10.1016/j.ymeth.2014.08.005
 * 50. E. F.-D. Gortari, C. R. García-Jacas, K. Martinez-Mayorga, J. L. Medina-Franco, J. Cheminf. 2017, 9, 1. 10.1186/s13321-016-0187-6
 * 51. N. Schneider, D. M. Lowe, R. A. Sayle, G. A. Landrum, J. Chem. Inf. Model. 2015, 55, 39. 10.1021/ci5006614
 * 52. J. N. Wei, D. Duvenaud, A. Aspuru-Guzik, ACS Cent. Sci. 2016, 2, 725. 10.1021/acscentsci.6b00219
 * 53. K. Z. Myint, L. Wang, Q. Tong, X. Q. Xie, Mol. Pharmaceutics 2012, 9, 2912. 10.1021/mp300237z
 * 54. A. U. Danishuddin, Drug Discovery Today 2016, 21, 1291. 10.1016/j.drudis.2016.06.013
 * 55. Q. Zang, K. Mansouri, A. J. Williams, R. S. Judson, D. G. Allen, W. M. Casey, N. C. Kleinstreuer, J. Chem. Inf. Model. 2017, 57, 36. 10.1021/acs.jcim.6b00625
 * 56. A. Ilnicka, G. Schneider, Mol. Inf. 2023, 42, 2300059. 10.1002/minf.202300059
 * 57. M. Nasser, N. Salim, F. Saeed, S. Basurra, I. Rabiu, H. Hamza, M. A. Alsoufi, Biomolecules 2022, 12, 508. 10.3390/biom12040508
 * 58. E. J. Bjerrum, B. Sattarov, Biomolecules 2018, 8, 131. 10.3390/biom8040131
 * 59. K. T. Schütt, M. Gastegger, A. Tkatchenko, K. R. Müller, R. J. Maurer, Nat. Commun. 2019, 10, 5024. 10.1038/s41467-019-12875-2
 * 60. J. Behler, M. Parrinello, Phys. Rev. Lett. 2007, 98, 146401. 10.1103/PhysRevLett.98.146401
 * 61. S. Liu, W. Du, Y. Li, Z. Li, Z. Zheng, C. Duan, Z. Ma, O. Yaghi, A. Anandkumar, C. Borgs, J. Chayes, H. Guo, J. Tang, ArXiv preprint 2023, arXiv:2306.09375, https://arxiv.org/abs/2306.09375 (accessed: May 2024).
 * 62. A. D. Smith, P. Dłotko, V. M. Zavala, Comput. Chem. Eng. 2021, 146, 107202. 10.1016/j.compchemeng.2020.107202
 * 63. M. R. Wilkinson, U. Martinez-Hernandez, C. C. Wilson, B. Castro-Dominguez, J. Mater. Res. 2022, 37, 2293. 10.1557/s43578-022-00628-9
 * 64. S. Zhong, J. Hu, X. Yu, H. Zhang, Chem. Eng. J. 2021, 408, 127998. 10.1016/j.cej.2020.127998
 * 65. P. van Gerwen, A. Fabrizio, M. D. Wodrich, C. Corminboeuf, Machine Learning: Science and Technology 2022, 3, 045005. 10.1088/2632-2153/ac8f1a
 * 66. V. Mann, V. Venkatasubramanian, AIChE J. 2021, 67, e17190. 10.1002/aic.17190
 * 67. A. C. Vaucher, P. Schwaller, J. Geluykens, V. H. Nair, A. Iuliano, T. Laino, Nat. Commun. 2021, 12, 2573. 10.1038/s41467-021-22951-1
 * 68. M. Haghighatlari, J. Li, F. Heidar-Zadeh, Y. Liu, X. Guan, T. Head-Gordon, Chem 2020, 6, 1527. 10.1016/j.chempr.2020.05.014
 * 69. F. Musil, A. Grisafi, A. P. Bartók, C. Ortner, G. Csányi, M. Ceriotti, Chem. Rev. 2021, 121, 9759. 10.1021/acs.chemrev.1c00021
 * 70. J. Damewood, J. Karaguesian, J. R. Lunger, A. R. Tan, M. Xie, J. Peng, R. Gómez-Bombarelli, Annu. Rev. Mater. Res. 2023, 53, 399. 10.1146/annurev-matsci-080921-085947
 * 71. K. M. Jablonka, A. S. Rosen, A. S. Krishnapriyan, B. Smit, ACS Cent. Sci. 2023, 9, 563. 10.1021/acscentsci.2c01177
 * 72. M. Ziatdinov, C. Y. T. Wong, S. V. Kalinin, Machine Learning: Science and Technology 2023, 4, 045033. 10.1088/2632-2153/ad073b
 * 73. J. Rickman, T. Lookman, S. Kalinin, Acta Mater. 2019, 168, 473. 10.1016/j.actamat.2019.01.051
 * 74. A. Fabrizio, K. R. Briling, C. Corminboeuf, Discovery 2022, 1, 286.
 * 75. V. Tshitoyan, J. Dagdelen, L. Weston, A. Dunn, Z. Rong, O. Kononova, K. A. Persson, G. Ceder, A. Jain, Nature 2019, 571, 95. 10.1038/s41586-019-1335-8
 * 76. N. J. Szymanski, B. Rendy, Y. Fei, R. E. Kumar, T. He, D. Milsted, M. J. McDermott, M. Gallant, E. D. Cubuk, A. Merchant, H. Kim, A. Jain, C. J. Bartel, K. Persson, Y. Zeng, G. Ceder, Nature 2023, 624, 86. 10.1038/s41586-023-06734-w
 * 77. G. Vogel, E. Hirtreiter, L. S. Balhorn, A. M. Schweidtmann, Optimization and Engineering 2023, 24, 2911. 10.1007/s11081-023-09798-9
 * 78. C. Zheng, X. Chen, T. Zhang, N. V. Sahinidis, J. J. Siirola, Comput. Chem. Eng. 2022, 159, 107676. 10.1016/j.compchemeng.2022.107676
 * 79. T. Zhang, N. V. Sahinidis, J. J. Siirola, AIChE J. 2019, 65, 592. 10.1002/aic.16443
 * 80. G. Vogel, L. S. Balhorn, A. M. Schweidtmann, Comput. Chem. Eng. 2023, 171, 108162. 10.1016/j.compchemeng.2023.108162
 * 81. E. Hirtreiter, L. S. Balhorn, A. M. Schweidtmann, AIChE J. 2023, 70, e18259. 10.1002/aic.18259
 * 82. M. F. Theisen, K. N. Flores, L. S. Balhorn, A. M. Schweidtmann, Digital Chemical Engineering 2023, 6, 100072. 10.1016/j.dche.2022.100072
 * 83. E. S. Yu, J. M. Cha, T. Lee, J. Kim, D. Mun, Energies 2019, 12, 4425. 10.3390/en12234425
 * 84. J. Oeing, W. Welscher, N. Krink, L. Jansen, F. Henke, N. Kockmann, Digital Chemical Engineering 2022, 4, 100038. 10.1016/j.dche.2022.100038
 * 85. L. Mencarelli, Q. Chen, A. Pagot, I. E. Grossmann, Comput. Chem. Eng. 2020, 136, 106808. 10.1016/j.compchemeng.2020.106808
 * 86. P. O. Ludl, R. Heese, J. Höller, N. Asprion, M. Bortz, Front. Chem. Sci. Eng. 2022, 16, 183. 10.1007/s11705-021-2073-7
 * 87. J.-Y. Cheung, G. Stephanopoulos, Comput. Chem. Eng. 1990, 14, 495. 10.1016/0098-1354(90)87023-I
 * 88. X. Yuan, J. Zhou, B. Huang, Y. Wang, C. Yang, W. Gui, IEEE Transactions on Industrial Informatics 2020, 16, 3721. 10.1109/TII.2019.2938890
 * 89. E. Martinez-Hernandez, Digital Chemical Engineering 2023, 6, 100075. 10.1016/j.dche.2022.100075
 * 90. S. C. Brandt, J. Morbach, M. Miatidis, M. Theißen, M. Jarke, W. Marquardt, Comput. Chem. Eng. 2008, 32, 320. 10.1016/j.compchemeng.2007.04.013
 * 91. G. Buchgeher, D. Gabauer, J. Martinez-Gil, L. Ehrlinger, IEEE Access 2021, 9, 55537. 10.1109/ACCESS.2021.3070395
 * 92. A. Yang, W. Marquardt, Comput. Chem. Eng. 2009, 33, 822. 10.1016/j.compchemeng.2008.11.015
 * 93. N. Trokanas, L. Koo, F. Cecelja, in 28th European Symposium on Computer Aided Process Engineering, Vol. 43 (Eds: A. Friedl, J. J. Klemeš, S. Radl, P. S. Varbanov, T. Wallek), Elsevier, Amsterdam, The Netherlands 2018, p. 471. 10.1016/B978-0-444-64235-6.50084-X
 * 94. J. Morbach, A. Yang, W. Marquardt, Engineering Applications of Artificial Intelligence 2007, 20, 147. 10.1016/j.engappai.2006.06.010
 * 95. N. Hamedi, I. Mutlu, F. Rani, L. Urbas, in 33rd European Symposium on Computer Aided Process Engineering, Vol. 52 (Eds: A. C. Kokossis, M. C. Georgiadis, E. Pistikopoulos), Elsevier, Amsterdam, The Netherlands 2023, p. 1687. 10.1016/B978-0-443-15274-0.50268-7
 * 96. S. Mao, Y. Zhao, J. Chen, B. Wang, Y. Tang, Comput. Chem. Eng. 2020, 143, 107094. 10.1016/j.compchemeng.2020.107094
 * 97. X. Zheng, B. Wang, Y. Zhao, S. Mao, Y. Tang, Neurocomputing 2021, 430, 104. 10.1016/j.neucom.2020.10.095
 * 98. Y. Zhao, B. Zhang, D. Gao, J. Loss Prev. Process Ind. 2022, 76, 104736. 10.1016/j.jlp.2022.104736
 * 99. A. Menon, N. B. Krdzavac, M. Kraft, Curr. Opin. Chem. Eng. 2019, 26, 33. 10.1016/j.coche.2019.08.004
 * 100. A. Pavel, L. A. Saarimäki, L. Möbus, A. Federico, A. Serra, D. Greco, Comput. Struct. Biotechnol. J. 2022, 20, 4837.
 * 101. S. Natarajan, K. Ghosh, R. Srinivasan, Comput. Chem. Eng. 2012, 46, 124. 10.1016/j.compchemeng.2012.06.009
 * 102. X. Feng, Y. Dai, X. Ji, L. Zhou, Y. Dang, Process Saf. Environ. Prot. 2021, 155, 41. 10.1016/j.psep.2021.09.001
 * 103. Z. Wang, B. Zhang, D. Gao, Computers in Industry 2022, 139, 103647. 10.1016/j.compind.2022.103647
 * 104. E. Musulin, F. Roda, M. Basualdo, Comput. Chem. Eng. 2013, 59, 164. 10.1016/j.compchemeng.2013.06.009
 * 105. V. Mann, S. Viswanath, S. Vaidyaraman, J. Balakrishnan, V. Venkatasubramanian, Comput. Chem. Eng. 2023, 179, 108446. 10.1016/j.compchemeng.2023.108446
 * 106. V. Venkatasubramanian, C. Zhao, G. Joglekar, A. Jain, L. Hailemariam, P. Suresh, P. Akkisetty, K. Morris, G. Reklaitis, Comput. Chem. Eng. 2006, 30, 1482. 10.1016/j.compchemeng.2006.05.036
 * 107. M. M. Vegetti, A. Böhm, H. P. Leone, G. P. Henning, Domain Ontologies for Research Data Management in Industry Commons of Materials and Manufacturing, https://epubs.stfc.ac.uk/manifestation/53374896/DL-CONF-2021-002.pdf#page=47 (accessed: January 2024).
 * 108. T. Grubic, I.-S. Fan, Computers in Industry 2010, 61, 776. 10.1016/j.compind.2010.05.006
 * 109. N. E. Samaridi, N. N. Karanikolas, M. Papoutsidakis, E. C. Papakitsos, C. E. Papakitsos, International Journal of Production Management and Engineering 2023, 11, 89. 10.4995/ijpme.2023.18702
 * 110. F. Ameri, B. Kulvatunyou, in International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Vol. 59179, American Society of Mechanical Engineers, Anaheim, CA 2019, V001T02A052.
 * 111. S. Dopler, J. Scholz, in Short Paper Proc. of the Spatial Data Science Symposium 2021, pp. 1–7.
 * 112. J. Rao, S. Gao, M. Miller, A. Morales, in Proc. of the 1st ACM SIGSPATIAL International Workshop on Geospatial Knowledge Graphs (GeoKG '22), Association for Computing Machinery, New York, NY 2022, pp. 17–25.
 * 113. I. E. Grossmann, A. W. Westerberg, AIChE J. 2000, 46, 1700. 10.1002/aic.690460902
 * 114. E. Örs, R. Schmidt, M. Mighani, M. Shalaby, in 2020 IEEE Int. Conf. on Engineering, Technology and Innovation (ICE/ITMC) 2020, pp. 1–8.
 * 115. M. Sokolov, M. von Stosch, H. Narayanan, F. Feidl, A. Butté, Curr. Opin. Chem. Eng. 2021, 34, 100715. 10.1016/j.coche.2021.100715
 * 116. J. H. Lee, J. Shin, M. J. Realff, Comput. Chem. Eng. 2018, 114, 111. 10.1016/j.compchemeng.2017.10.008
 * 117. D. C. Psichogios, L. H. Ungar, AIChE J. 1992, 38, 1499. 10.1002/aic.690381003
 * 118. M. A. Kramer, M. L. Thompson, P. M. Bhagat, in 1992 American Control Conference, IEEE 1992, pp. 475–479.
 * 119. T. A. Johansen, B. A. Foss, in 1992 American Control Conference, IEEE 1992, pp. 3037–3043.
 * 120. M. L. Mavrovouniotis, S. Chang, Comput. Chem. Eng. 1992, 16, 347. 10.1016/0098-1354(92)80053-C
 * 121. H.-T. Su, N. Bhat, P. Minderman, T. McAvoy, Distillation Columns and Batch Processes 1992, 25, 327.
 * 122. J. Sansana, M. N. Joswiak, I. Castillo, Z. Wang, R. Rendall, L. H. Chiang, M. S. Reis, Comput. Chem. Eng. 2021, 151, 1073654.
 * 123. W. Bradley, J. Kim, Z. Kilwein, L. Blakely, M. Eydenberg, J. Jalvin, C. Laird, F. Boukouvala, Comput. Chem. Eng. 2022, 166, 107898. 10.1016/j.compchemeng.2022.107898
 * 124. N. Sharma, Y. A. Liu, AIChE J. 2022, 68, e17609. 10.1002/aic.17609
 * 125. L. Rajulapati, S. Chinta, B. Shyamala, R. Rengaswamy, AIChE J. 2022, 68, e17715. 10.1002/aic.17715
 * 126. E. Gallup, T. Gallup, K. Powell, Comput. Chem. Eng. 2023, 170, 108111. 10.1016/j.compchemeng.2022.108111
 * 127. H. Narayanan, M. von Stosch, F. Feidl, M. Sokolov, M. Morbidelli, A. Butté, Front. Chem. React. Eng. 2023, 5, 1157889. 10.3389/fceng.2023.1157889
 * 128. M. Agarwal, International Journal of Systems Science 1997, 28, 65. 10.1080/00207729708929364
 * 129. M. von Stosch, R. Oliveira, J. Peres, S. F. de Azevedo, Comput. Chem. Eng. 2014, 60, 86. 10.1016/j.compchemeng.2013.08.008
 * 130. K. McBride, E. I. S. Medina, K. Sundmacher, Chem. Ing. Tech. 2020, 92, 842. 10.1002/cite.202000025
 * 131. S. Zendehboudi, N. Rezaei, A. Lohi, Appl. Energy 2018, 228, 2539. 10.1016/j.apenergy.2018.06.051
 * 132. A. M. Schweidtmann, D. Zhang, M. von Stosch, Digital Chemical Engineering 2024, 10, 100136. 10.1016/j.dche.2023.100136
 * 133. M. von Stosch, R. Oliveira, J. Peres, S. Feyo de Azevedo, Expert Systems with Applications 2011, 38, 10862. 10.1016/j.eswa.2011.02.117
 * 134. R. F. Nielsen, N. Nazemzadeh, L. W. Sillesen, M. P. Andersson, K. V. Gernaey, S. S. Mansouri, Comput. Chem. Eng. 2020, 140, 106916. 10.1016/j.compchemeng.2020.106916
 * 135. L. Zhang, M. Pan, S. Quan, Q. Chen, Y. Shi, in 2006 6th World Congress on Intelligent Control and Automation, IEEE 2006, pp. 8319–8323.
 * 136. T.-M. Hwang, H. Oh, Y.-J. Choi, S.-H. Nam, S. Lee, Y.-K. Choung, Desalination 2009, 247, 210. 10.1016/j.desal.2008.12.025
 * 137. V. Mahalec, Y. Sanchez, Comput. Chem. Eng. 2012, 45, 15. 10.1016/j.compchemeng.2012.05.012
 * 138. B. Yan, S. Zhao, J. Li, G. Chen, J. Tao, Chem. Eng. J. 2022, 427, 130881. 10.1016/j.cej.2021.130881
 * 139. D. Ghosh, E. Hermonat, P. Mhaskar, S. Snowling, R. Goel, Ind. Eng. Chem. Res. 2019, 58, 13533. 10.1021/acs.iecr.9b00900
 * 140. T. Bikmukhametov, J. Jäschke, Comput. Chem. Eng. 2020, 138, 106834. 10.1016/j.compchemeng.2020.106834
 * 141. H. J. L. van Can, H. te Braake, S. Dubbelman, C. Hellinga, K. Luyben, J. J. Heijnen, AIChE J. 1998, 44, 1071. 10.1002/aic.690440507
 * 142. H. Narayanan, M. Luna, M. Sokolov, A. Butté, M. Morbidelli, Ind. Eng. Chem. Res. 2022, 61, 8658. 10.1021/acs.iecr.1c04507
 * 143. M. Raissi, P. Perdikaris, G. E. Karniadakis, J. Comput. Phys. 2019, 378, 686. 10.1016/j.jcp.2018.10.045
 * 144. S. Cuomo, V. S. di Cola, F. Giampaolo, G. Rozza, M. Raissi, F. Piccialli, ArXiv preprint 2022, arXiv:2201.05624v4, http://arxiv.org/abs/2201.05624 (accessed: December 2023).
 * 145. G. E. Karniadakis, I. G. Kevrekidis, L. Lu, P. Perdikaris, S. Wang, L. Yang, Nat. Rev. Phys. 2021, 3, 422. 10.1038/s42254-021-00314-5
 * 146. P. Sharma, W. T. Chung, B. Akoush, M. Ihme, Energies 2023, 16, 2343. 10.3390/en16052343
 * 147. Y. Wang, Q. Ren, Int. J. Heat Mass Transfer 2022, 186, 4.
 * 148. R. Patel, S. Bhartiya, R. Gudi, J. Process Control 2023, 128, 103003. 10.1016/j.jprocont.2023.103003
 * 149. V. V. Santana, M. S. Gama, J. M. Loureiro, A. E. Rodrigues, A. M. Ribeiro, F. W. Tavares, A. G. Barreto, I. B. Nogueira, ChemEngineering 2022, 6, 6020021.
 * 150. S. G. Subraveti, Z. Li, V. Prasad, A. Rajendran, Ind. Eng. Chem. Res. 2023, 62, 5929. 10.1021/acs.iecr.2c04355
 * 151. S. G. Subraveti, Z. Li, V. Prasad, A. Rajendran, Ind. Eng. Chem. Res. 2022, 61, 4095. 10.1021/acs.iecr.1c04731
 * 152. F. Chen, S. Luo, S. Wang, H. Nasrabadi, Fluid Phase Equilib. 2022, 558, 113423. 10.1016/j.fluid.2022.113423
 * 153. H. Chen, C. Batchelor-McAuley, E. Kätelhön, J. Elliott, R. G. Compton, J. Electroanal. Chem. 2022, 925, 116918. 10.1016/j.jelechem.2022.116918
 * 154. Y. Takehara, Y. Okano, S. Dost, J. Chem. Eng. Jpn. 2023, 56, 2236656. 10.1080/00219592.2023.2236656
 * 155. A. Merdasi, S. Ebrahimi, X. Yang, R. Kunz, Chem. Eng. Process. 2023, 193, 109540. 10.1016/j.cep.2023.109540
 * 156. V. Bibeau, D. C. Boffito, B. Blais, Chem. Eng. Process. 2024, 196, 109652. 10.1016/j.cep.2023.109652
 * 157. S. Liu, H. Wang, J. H. Chen, K. Luo, J. Fan, Combust. Flame 2024, 260, 113275. 10.1016/j.combustflame.2023.113275
 * 158Z. Zhang, Z. Li, Can. J. Chem. Eng. 2023, 101, 4307.
   10.1002/cjce.24922
   
   CASWeb of Science®Google Scholar
 * 159S. Ren, S. Wu, Q. Weng, Bioresour. Technol. 2023, 369, 128472.
   10.1016/j.biortech.2022.128472
   
   CASPubMedGoogle Scholar
 * 160Y. Ryu, S. Shin, J. J. Liu, W. Lee, J. Na, Computer Aided Chemical
   Engineering, Vol. 52, Elsevier, Athens, Greece 2023, p. 493.
   
   Google Scholar
 * 161T. Asrav, E. S. Koksal, E. E. Esenboga, A. Cosgun, G. Kusoglu, D. Aydin,
   E. Aydin, Computer Aided Chemical Engineering, Vol. 52, Elsevier, Athens,
   Greece, 2023, p. 227.
   
   Google Scholar
 * 162F. Sorourifar, Y. Peng, I. Castillo, L. Bui, J. Venegas, J. A. Paulson,
   Ind. Eng. Chem. Res. 2023, 62, 15563.
   10.1021/acs.iecr.3c01471
   
   CASGoogle Scholar
 * 163Y. Zheng, Z. Wu, Ind. Eng. Chem. Res. 2023, 62, 2804.
   10.1021/acs.iecr.2c03691
   
   CASWeb of Science®Google Scholar
 * 164T. S. Franklin, L. S. Souza, R. M. Fontes, M. A. Martins, Digital Chemical
   Engineering, 2022, 5, 100056.
   10.1016/j.dche.2022.100056
   
   Google Scholar
 * 165G. Wu, W. T. G. Yion, K. L. N. Q. Dang, Z. Wu, Chem. Eng. Res. Des. 2023,
   192, 556.
   10.1016/j.cherd.2023.02.048
   
   CASWeb of Science®Google Scholar
 * 166A. W. Rogers, I. O. S. Cardenas, E. A. D. Rio-Chanona, D. Zhang, Computer
   Aided Chemical Engineering, Vol. 52, Elsevier B.V, Athens, Greece 2023, p.
   83.
   
   Google Scholar
 * 167Z. Lu, Y. Li, C. He, J. Ren, H. Yu, B. Zhang, Q. Chen, Comput. Chem. Eng.
   2024, 180, 108500.
   10.1016/j.compchemeng.2023.108500
   
   CASWeb of Science®Google Scholar
 * 168S. Selvarajan, A. A. Tappe, C. Heiduk, S. Scholl, R. Schenkendorf,
   Processes 2022, 10, 1764.
   10.3390/pr10091764
   
   Google Scholar
 * 169A. A. Tappe, S. Selvarajan, C. Heiduk, S. Scholl, R. Schenkendorf,
   Computer Aided Chemical Engineering, Vol. 52, Elsevier, Athens, Greece 2023,
   p. 837.
   
   Google Scholar
 * 170N. Muralidhar, M. R. Islam, M. Marwah, A. Karpatne, N. Ramakrishnan,
   in 2018 IEEE International Conference on Big Data (Big Data), IEEE, Seattle,
   WA 2018, p. 36.
   10.1109/BigData.2018.8621955
   
   Google Scholar
 * 171T. Asrav, E. Aydin, Comput. Chem. Eng. 2023, 173, 108195.
   10.1016/j.compchemeng.2023.108195
   
   Google Scholar
 * 172N. T. Russell, H. H. C. Bakker, Artificial Intelligence in Engineering
   1997, 11, 347.
   10.1016/S0954-1810(96)00053-2
   
   Google Scholar
 * 173N. T. Russell, H. H. C. Bakker, R. I. Chaplin, Control Engineering
   Practice 2000, 8, 49.
   10.1016/S0967-0661(99)00123-9
   
   Web of Science®Google Scholar
 * 174C. Muñz-Ibañez, M. Alfaro-Ponce, I. Chairez, International Journal of
   Advanced Manufacturing Technology 2019, 104, 1541.
   10.1007/s00170-019-04019-z
   
   Google Scholar
 * 175C. Pan, M. Mahmoudabadbozchelou, X. Duan, J. C. Benneyan, S. Jamali, R. M.
   Erb, J. Colloid Interface Sci. 2022, 611, 29.
   10.1016/j.jcis.2021.11.195
   
   CASPubMedGoogle Scholar
 * 176W. Ji, S. Deng, J. Phys. Chem. A 2021, 125, 1082.
   10.1021/acs.jpca.0c09316
   
   CASPubMedWeb of Science®Google Scholar
 * 177M. S. F. Bangi, K. Kao, J. S. I. Kwon, Chem. Eng. Res. Des. 2022, 179,
   415.
   10.1016/j.cherd.2022.01.041
   
   CASWeb of Science®Google Scholar
 * 178. R. T. Q. Chen, Y. Rubanova, J. Bettencourt, D. Duvenaud, ArXiv preprint 2018, arXiv:1806.07366v5, http://arxiv.org/abs/1806.07366 (accessed: March 2024).
 * 179. A. Puliyanda, K. Srinivasan, Z. Li, V. Prasad, Engineering Applications of Artificial Intelligence 2023, 125, 106690. https://doi.org/10.1016/j.engappai.2023.106690
 * 180. N. Muralidhar, J. Bu, Z. Cao, L. He, N. Ramakrishnan, D. Tafti, A. Karpatne, ArXiv preprint 2019, arXiv:1911.04240v1, http://arxiv.org/abs/1911.04240 (accessed: March 2024).
 * 181. N. Muralidhar, J. Bu, Z. Cao, L. He, N. Ramakrishnan, D. Tafti, A. Karpatne, Big Data 2020, 8, 431. https://doi.org/10.1089/big.2020.0071
 * 182. D. Machalek, J. Tuttle, K. Andersson, K. M. Powell, Energy and AI 2022, 9, 100172. https://doi.org/10.1016/j.egyai.2022.100172
 * 183. X. I. Yang, S. Zafar, J. X. Wang, H. Xiao, Physical Review Fluids 2019, 4, 034602. https://doi.org/10.1103/PhysRevFluids.4.034602
 * 184. K. Yamada, B. R. B. Fernandes, A. Kalamkar, J. Jeon, M. Delshad, R. Farajzadeh, K. Sepehrnoori, Fuel 2024, 357, 129670. https://doi.org/10.1016/j.fuel.2023.129670
 * 185. A. Carranza-Abaid, J. P. Jakobsen, Comput. Chem. Eng. 2022, 163, 107858. https://doi.org/10.1016/j.compchemeng.2022.107858
 * 186. M. S. Alhajeri, F. Abdullah, Z. Wu, P. D. Christofides, Chem. Eng. Res. Des. 2022, 186, 34. https://doi.org/10.1016/j.cherd.2022.07.035
 * 187. N. Sitapure, J. S.-I. Kwon, Ind. Eng. Chem. Res. 2023, 62, 21278. https://doi.org/10.1021/acs.iecr.3c02624
 * 188. L. D. McClenny, U. M. Braga-Neto, J. Comput. Phys. 2023, 474, 111722. https://doi.org/10.1016/j.jcp.2022.111722
 * 189. Y. Bengio, A. Courville, P. Vincent, IEEE Transactions on Pattern Analysis and Machine Intelligence 2013, 35, 1798. https://doi.org/10.1109/TPAMI.2013.50
 * 190. J. Heaton, in SoutheastCon 2016, IEEE, 2016, pp. 1–6.
 * 191. J. Lu, F. Gao, Ind. Eng. Chem. Res. 2008, 47, 9508. https://doi.org/10.1021/ie800595a
 * 192. G. Brand Rihm, E. Esche, J.-U. Repke, M. Schueler, C. Nentwich, Chem. Ing. Tech. 2023, 95, 1125. https://doi.org/10.1002/cite.202200228
 * 193. A. Daw, R. Q. Thomas, C. C. Carey, J. S. Read, A. P. Appling, A. Karpatne, in Proc. of the 2020 SIAM Int. Conf. on Data Mining (SDM), 2020, pp. 532–540.
 * 194. M. Shevlin, ACS Med. Chem. Lett. 2017, 8, 601. https://doi.org/10.1021/acsmedchemlett.7b00165
 * 195. T. Williams, K. McCullough, J. A. Lauterbach, Chem. Mater. 2020, 32, 157. https://doi.org/10.1021/acs.chemmater.9b03043
 * 196. M. Abolhasani, E. Kumacheva, Nature Synthesis 2023, 2, 483. https://doi.org/10.1038/s44160-022-00231-0
 * 197. N. S. Eyke, B. A. Koscher, K. F. Jensen, Trends Chem. 2021, 3, 120. https://doi.org/10.1016/j.trechm.2020.12.001
 * 198. B. Burger, P. M. Maffettone, V. V. Gusev, C. M. Aitchison, Y. Bai, X. Wang, X. Li, B. M. Alston, B. Li, R. Clowes, N. Rankin, B. Harris, R. S. Sprick, A. I. Cooper, Nature 2020, 583, 237. https://doi.org/10.1038/s41586-020-2442-2
 * 199. C. W. Coley, D. A. Thomas, J. A. M. Lummiss, J. N. Jaworski, C. P. Breen, V. Schultz, T. Hart, J. S. Fishman, L. Rogers, H. Gao, R. W. Hicklin, P. P. Plehiers, J. Byington, J. S. Piotti, W. H. Green, A. J. Hart, T. F. Jamison, K. F. Jensen, Science 2019, 365, eaax1566. https://doi.org/10.1126/science.aax1566
 * 200. J. M. Granda, L. Donina, V. Dragone, D.-L. Long, L. Cronin, Nature 2018, 559, 377. https://doi.org/10.1038/s41586-018-0307-8
 * 201. K. Abdel-Latif, R. Epps, F. Bateni, S. Han, K. Reyes, M. Abolhasani, Advanced Intelligent Systems 2021, 3, 2170022. https://doi.org/10.1002/aisy.202170022
 * 202. R. Epps, M. Bowen, A. Volk, K. Abdel-Latif, S. Han, K. Reyes, A. Amassian, M. Abolhasani, Adv. Mater. 2020, 32, 2070222.
 * 203. J. Li, J. Li, R. Liu, Y. Tu, Y. Li, J. Cheng, T. He, X. Zhu, Nat. Commun. 2020, 11, 2046. https://doi.org/10.1038/s41467-020-15728-5
 * 204. J. Li, Y. Tu, R. Liu, Y. Lu, X. Zhu, Adv. Sci. 2020, 7, 1901957. https://doi.org/10.1002/advs.201901957
 * 205. D. Salley, G. Keenan, J. Grizou, A. Sharma, S. Martín, L. Cronin, Nat. Commun. 2020, 11, 2771. https://doi.org/10.1038/s41467-020-16501-4
 * 206. H. Tao, T. Wu, S. Kheiri, M. Aldeghi, A. Aspuru-Guzik, E. Kumacheva, Adv. Funct. Mater. 2021, 31, 2106725.
 * 207. B. MacLeod, F. Parlane, C. Rupnow, K. Dettelbach, M. Elliott, T. Morrissey, T. Haley, O. Proskurin, M. Rooney, N. Taherimakhsousi, D. Dvorak, H. Chiu, C. Waizenegger, K. Ocean, M. Mokhtari, C. Berlinguette, Nat. Commun. 2022, 13, 995. https://doi.org/10.1038/s41467-022-28580-6
 * 208. B. P. MacLeod, F. G. L. Parlane, T. D. Morrissey, F. Häse, L. M. Roch, K. E. Dettelbach, R. Moreira, L. P. E. Yunker, M. B. Rooney, J. R. Deeth, V. Lai, G. J. Ng, H. Situ, R. H. Zhang, M. S. Elliott, T. H. Haley, D. J. Dvorak, A. Aspuru-Guzik, J. E. Hein, C. P. Berlinguette, Sci. Adv. 2020, 6, eaaz8867. https://doi.org/10.1126/sciadv.aaz8867
 * 209. P. Nikolaev, D. Hooper, N. Perea-López, M. Terrones, B. Maruyama, ACS Nano 2014, 8, 10214. https://doi.org/10.1021/nn503347a
 * 210. L. Roch, F. Häse, C. Kreisbeck, T. Tamayo-Mendoza, L. Yunker, J. Hein, A. Aspuru-Guzik, PLoS One 2020, 15, e0229862. https://doi.org/10.1371/journal.pone.0229862
 * 211. J. Deneault, J. Chang, J. Myung, D. Hooper, A. Armstrong, M. Pitt, B. Maruyama, MRS Bull. 2021, 46, 566. https://doi.org/10.1557/s43577-021-00051-1
 * 212. F. Häse, L. M. Roch, C. Kreisbeck, A. Aspuru-Guzik, ACS Cent. Sci. 2018, 4, 1134. https://doi.org/10.1021/acscentsci.8b00307
 * 213. F. Häse, L. M. Roch, A. Aspuru-Guzik, ArXiv preprint 2020, arXiv:2003.12127, https://api.semanticscholar.org/CorpusID:214693268 (accessed: May 2024).
 * 214. M. Aldeghi, F. Häse, R. J. Hickman, I. Tamblyn, A. Aspuru-Guzik, Chem. Sci. 2021, 12, 14792. https://doi.org/10.1039/D1SC01545A
 * 215. R. Epps, M. Abolhasani, Applied Physics Reviews 2021, 8, 041316. https://doi.org/10.1063/5.0061799
 * 216. P. S. Gromski, A. Henson, J. M. Granda, L. Cronin, Nat. Rev. Chem. 2019, 3, 119.
 * 217. R. Pollice, G. dos Passos Gomes, M. Aldeghi, R. Hickman, M. Krenn, C. Lavigne, M. Lindner-D'Addario, A. Nigam, C. T. Ser, Z. Yao, A. Aspuru-Guzik, Accounts of Chemical Research 2021, 54, 849. https://doi.org/10.1021/acs.accounts.0c00785
 * 218. H. Tao, T. Wu, M. Aldeghi, A. Aspuru-Guzik, E. Kumacheva, Nat. Rev. Mater. 2021, 6, 1. https://doi.org/10.1038/s41578-021-00337-5
 * 219. A. Nicholson, S. Webber, S. Dyer, T. Patel, H. Janicke, Computers & Security 2012, 31, 418. https://doi.org/10.1016/j.cose.2012.02.009
 * 220. M. Iaiani, A. Tugnoli, S. Bonvicini, V. Cozzani, Reliability Engineering & System Safety 2021, 209, 107485. https://doi.org/10.1016/j.ress.2021.107485
 * 221. RISI, The Repository of Industrial Security Incidents, https://www.risidata.com/Database/event_date/desc/P30 (accessed: May 2024).
 * 222. A. Nguyen, J. Yosinski, J. Clune, in Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, IEEE, 2015, pp. 427–436.
 * 223. R. S. Siva Kumar, M. Nyström, J. Lambert, A. Marshall, M. Goertzel, A. Comissoneru, M. Swann, S. Xia, in 2020 IEEE Security and Privacy Workshops (SPW), IEEE, 2020, pp. 69–75.
 * 224. L. Huang, A. D. Joseph, B. Nelson, B. I. Rubinstein, J. D. Tygar, in Proc. of the 4th ACM Workshop on Security and Artificial Intelligence (AISec '11), Association for Computing Machinery, ACM Press, New York, NY, 2011, pp. 43–58.
 * 225. N. Pitropakis, E. Panaousis, T. Giannetsos, E. Anastasiadis, G. Loukas, Computer Science Review 2019, 34, 100199. https://doi.org/10.1016/j.cosrev.2019.100199
 * 226. I. Goodfellow, J. Shlens, C. Szegedy, ArXiv preprint 2014, arXiv:1412.6572, https://api.semanticscholar.org/CorpusID:6706414 (accessed: May 2024).
 * 227. N. Carlini, D. A. Wagner, in IEEE Symposium on Security and Privacy (SP), IEEE, 2017, pp. 39–57.
 * 228. N. Papernot, P. McDaniel, S. Jha, M. Fredrikson, Z. B. Celik, A. Swami, in 2016 IEEE European Symposium on Security and Privacy (EuroS&P), IEEE, 2016, pp. 372–387.
 * 229. S.-M. Moosavi-Dezfooli, A. Fawzi, P. Frossard, in Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition, IEEE, 2016, pp. 2574–2582.
 * 230. W. Brendel, J. Rauber, M. Bethge, ArXiv preprint 2017, arXiv:1712.04248, https://api.semanticscholar.org/CorpusID:2410333 (accessed: May 2024).
 * 231. N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, A. Swami, in Proc. of the 2017 ACM on Asia Conf. on Computer and Communications Security, 2017, pp. 506–519.
 * 232. J. Su, D. V. Vargas, K. Sakurai, IEEE Transactions on Evolutionary Computation 2019, 23, 828. https://doi.org/10.1109/TEVC.2019.2890858
 * 233. O. Ibitoye, R. Abou-Khamis, ArXiv preprint 2019, arXiv:1911.02621, https://api.semanticscholar.org/CorpusID:207848033 (accessed: May 2024).
 * 234. H. Lee, S. Han, J. Lee, ArXiv preprint 2017, arXiv:1705.03387, https://api.semanticscholar.org/CorpusID:6222110 (accessed: May 2024).
 * 235. F. Tramèr, A. Kurakin, N. Papernot, D. Boneh, P. McDaniel, ArXiv preprint 2017, arXiv:1705.07204, https://api.semanticscholar.org/CorpusID:21946795 (accessed: May 2024).
 * 236. B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, A. Torralba, ArXiv preprint 2014, arXiv:1412.6856, https://api.semanticscholar.org/CorpusID:8217340 (accessed: May 2024).
 * 237. W. Xu, D. Evans, Y. Qi, ArXiv preprint 2017, arXiv:1704.01155, https://api.semanticscholar.org/CorpusID:3851184 (accessed: May 2024).
 * 238. M. Barreno, B. Nelson, A. D. Joseph, J. D. Tygar, Machine Learning 2010, 81, 121. https://doi.org/10.1007/s10994-010-5188-5
 * 239. X. Yuan, P. He, Q. Zhu, X. Li, IEEE Transactions on Neural Networks and Learning Systems 2019, 30, 2805. https://doi.org/10.1109/TNNLS.2018.2886017
 * 240. A. Rawal, D. B. Rawat, B. M. Sadler, Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications III 2021, 11746, 701.
 * 241. A. B. Arrieta, N. Díaz-Rodríguez, J. D. Ser, A. Bennetot, S. Tabik, A. Barbado, S. Garcia, S. Gil-Lopez, D. Molina, R. Benjamins, R. Chatila, F. Herrera, Information Fusion 2020, 58, 82. https://doi.org/10.1016/j.inffus.2019.12.012
 * 242. G. Manca, N. Bhattacharya, S. Maczey, D. Ziobro, E. Brorsson, M. Bäng, in Frontiers in Artificial Intelligence and Applications (Eds: P. Lukowicz, S. Mayer, J. Koch, J. Shawe-Taylor, I. Tiddi), IOS Press, Munich, Germany 2023, p. 401. https://doi.org/10.3233/FAIA230110
 * 243. R. R. A. Harinarayan, S. M. Shalinie, Process Saf. Environ. Prot. 2022, 165, 463. https://doi.org/10.1016/j.psep.2022.07.019
 * 244. L. C. Brito, G. A. Susto, J. N. Brito, M. A. V. Duarte, Mechanical Systems and Signal Processing 2022, 163, 108105. https://doi.org/10.1016/j.ymssp.2021.108105
 * 245. K. Jang, K. E. S. Pilario, N. Lee, I. Moon, J. Na, IEEE Transactions on Industrial Informatics 2023. https://doi.org/10.1109/TII.2023.3240601
 * 246. A. Bhakte, V. Pakkiriswamy, R. Srinivasan, Chem. Eng. Sci. 2022, 250, 117373. https://doi.org/10.1016/j.ces.2021.117373
 * 247. P. Agarwal, M. Tamer, H. Budman, Comput. Chem. Eng. 2021, 154, 107467. https://doi.org/10.1016/j.compchemeng.2021.107467
 * 248. A. Sivaram, V. Venkatasubramanian, AIChE J. 2022, 68, e17687. https://doi.org/10.1002/aic.17687
 * 249. H. Gandhi, A. White, Explaining Molecular Properties with Natural Language, https://chemrxiv.org/engage/chemrxiv/article-details/633731d1f764e6e535093041 (accessed: May 2024).
 * 250. K. H. Sun, H. Huh, B. A. Tama, S. Y. Lee, J. H. Jung, S. Lee, IEEE Access 2020, 8, 129169. https://doi.org/10.1109/ACCESS.2020.3009852
 * 251. A. Bhakte, S. Bairi, R. Srinivasan, in 2021 AIChE Annual Meeting, AIChE, 2021.
 * 252. T. Danesh, R. Ouaret, P. Floquet, S. Negny, Comput. Chem. Eng. 2023, 176, 108306. https://doi.org/10.1016/j.compchemeng.2023.108306
 * 253. D. Wu, X. Bi, J. Zhao, Ind. Eng. Chem. Res. 2023, 62, 8350. https://doi.org/10.1021/acs.iecr.3c00206
 * 254. A. R. N. Aouichaoui, F. Fan, J. Abildskov, G. Sin, Comput. Chem. Eng. 2023, 176, 108291. https://doi.org/10.1016/j.compchemeng.2023.108291
 * 255. P. Schwaller, B. Hoover, J.-L. Reymond, H. Strobelt, T. Laino, Sci. Adv. 2021, 7, eabe4166. https://doi.org/10.1126/sciadv.abe4166
 * 256. A. Bhakte, M. Chakane, R. Srinivasan, Comput. Chem. Eng. 2023, 179, 108442. https://doi.org/10.1016/j.compchemeng.2023.108442
 * 257. N. Leveson, Safety Science 2004, 42, 237. https://doi.org/10.1016/S0925-7535(03)00047-X
 * 258. T. V. Shajahan, R. Madbhavi, M. A. Shahab, B. Srinivasan, R. Srinivasan, in Computer Aided Chemical Engineering, Vol. 52 (Eds: A. C. Kokossis, M. C. Georgiadis, E. Pistikopoulos), Elsevier, 2023, p. 2043.
 * 259. R. Arunthavanathan, Z. Sajid, F. Khan, E. Pistikopoulos, Digital Chemical Engineering 2024, 11, 100151. https://doi.org/10.1016/j.dche.2024.100151
 * 260. N. Jones, Nature 2023, 623, 229. https://doi.org/10.1038/d41586-023-03472-x
 * 261. E. J. Bjerrum, ArXiv preprint 2017, arXiv:1703.07076, http://arxiv.org/abs/1703.07076 (accessed: May 2024).
 * 262. M. H. S. Segler, T. Kogej, C. Tyrchan, M. P. Waller, ACS Cent. Sci. 2018, 4, 120. https://doi.org/10.1021/acscentsci.7b00512
 * 263. K. Kim, S. Kang, J. Yoo, Y. Kwon, Y. Nam, D. Lee, I. Kim, Y.-S. Choi, Y. Jung, S. Kim, npj Comput. Mater. 2018, 4, 67. https://doi.org/10.1038/s41524-018-0128-1
 * 264. M. H. Segler, M. Preuss, M. P. Waller, Nature 2018, 555, 604. https://doi.org/10.1038/nature25978
 * 265. Z. Hao, C. Lu, Z. Huang, H. Wang, Z. Hu, Q. Liu, E. Chen, C. Lee, in Proc. of the 26th ACM SIGKDD Int. Conf. on Knowledge Discovery & Data Mining, ACM, 2020, pp. 731–752.
 * 266. G. Lamb, B. Paige, ArXiv preprint 2020, arXiv:2012.02089, http://arxiv.org/abs/2012.02089 (accessed: May 2024).
 * 267. Y. Pathak, S. Laghuvarapu, S. Mehta, U. D. Priyakumar, in Proc. of the AAAI Conf. on Artificial Intelligence, 2020, pp. 873–880.
 * 268. M. Blomberg, T. Borowski, F. Himo, R.-Z. Liao, P. Siegbahn, Chem. Rev. 2014, 114, 3601. https://doi.org/10.1021/cr400388t
 * 269. J. Gilmer, S. Schoenholz, P. Riley, O. Vinyals, G. Dahl, in Proc. of the 34th Int. Conf. on Machine Learning, PMLR, Sydney, NSW 2017, p. 1263.
 * 270. K. Yang, K. Swanson, W. Jin, C. Coley, P. Eiden, H. Gao, A. Guzman-Perez, T. Hopper, B. Kelley, M. Mathea, J. Chem. Inf. Model. 2019, 59, 3370. https://doi.org/10.1021/acs.jcim.9b00237
 * 271. K. Schütt, P.-J. Kindermans, H. E. Sauceda Felix, S. Chmiela, A. Tkatchenko, K.-R. Müller, Advances in Neural Information Processing Systems 2017, 30.
 * 272. C. Lu, Q. Liu, C. Wang, Z. Huang, P. Lin, L. He, in Proc. of the AAAI Conf. on Artificial Intelligence, Vol. 33, 2019, pp. 1052–1060.
 * 273. D. Flam-Shepherd, T. Wu, A. Aspuru-Guzik, ArXiv preprint 2020, arXiv:2002.07087, https://arxiv.org/abs/2002.07087 (accessed: May 2024).
 * 274. W. Jin, R. Barzilay, T. Jaakkola, in Int. Conf. on Machine Learning, PMLR, Stockholm 2018, p. 2323.
 * 275. J. Born, T. Huynh, A. Stroobants, W. D. Cornell, M. Manica, J. Chem. Inf. Model. 2022, 62, 240. https://doi.org/10.1021/acs.jcim.1c00889
 * 276. J. Born, M. Manica, J. Cadow, G. Markert, N. A. Mill, M. Filipavicius, N. Janakarajan, A. Cardinale, T. Laino, M. R. Martínez, Machine Learning: Science and Technology 2021, 2, 025024. https://doi.org/10.1088/2632-2153/abe808
 * 277. J. Born, M. Manica, A. Oskooei, J. Cadow, G. Markert, M. R. Martínez, iScience 2021, 24, 102269. https://doi.org/10.1016/j.isci.2021.102269
 * 278. Y. Pathak, K. S. Juneja, G. Varma, M. Ehara, U. D. Priyakumar, Phys. Chem. Chem. Phys. 2020, 22, 26935. https://doi.org/10.1039/D0CP03508D
 * 279. A. Kadurin, S. Nikolenko, K. Khrabrov, A. Aliper, A. Zhavoronkov, Mol. Pharmaceutics 2017, 14, 3098. https://doi.org/10.1021/acs.molpharmaceut.7b00346
 * 280. E. Putin, A. Asadulaev, Y. Ivanenkov, V. Aladinskiy, B. Sanchez-Lengeling, A. Aspuru-Guzik, A. Zhavoronkov, J. Chem. Inf. Model. 2018, 58, 1194. https://doi.org/10.1021/acs.jcim.7b00690
 * 281. A. Zhavoronkov, Y. A. Ivanenkov, A. Aliper, M. S. Veselov, V. A. Aladinskiy, A. V. Aladinskaya, V. A. Terentiev, D. A. Polykovskiy, M. D. Kuznetsov, A. Asadulaev, Nat. Biotechnol. 2019, 37, 1038. https://doi.org/10.1038/s41587-019-0224-x
 * 282. N. H. Park, M. Manica, J. Born, J. L. Hedrick, T. Erdmann, D. Y. Zubarev, N. Adell-Mill, P. L. Arrechea, Nat. Commun. 2023, 14, 3686.
 * 283. A. Izdebski, E. Weglarz-Tomczak, E. Szczurek, J. M. Tomczak, ArXiv preprint 2023, arXiv:2310.02066, http://arxiv.org/abs/2310.02066 (accessed: May 2024).
 * 284. D. Kong, Y. Huang, J. Xie, Y. N. Wu, ArXiv preprint 2023, arXiv:2310.03253, http://arxiv.org/abs/2310.03253 (accessed: May 2024).
 * 285. J. Ross, B. Belgodere, V. Chenthamarakshan, I. Padhi, Y. Mroueh, P. Das, Nature Machine Intelligence 2022, 4, 1256. https://doi.org/10.1038/s42256-022-00580-7
 * 286. P. Neves, K. McClure, J. Verhoeven, N. Dyubankova, R. Nugmanov, A. Gedich, S. Menon, Z. Shi, J. K. Wegner, J. Cheminf. 2023, 15, 20. https://doi.org/10.1186/s13321-023-00685-0
 * 287. P. Schwaller, A. C. Vaucher, T. Laino, J.-L. Reymond, Machine Learning: Science and Technology 2021, 2, 015016. https://doi.org/10.1088/2632-2153/abc81d
 * 288. V. Bagal, R. Aggarwal, P. K. Vinod, U. D. Priyakumar, J. Chem. Inf. Model. 2022, 62, 2064. https://doi.org/10.1021/acs.jcim.1c00600
 * 289. S. Honda, S. Shi, R. Hiroki, ArXiv preprint 2019, arXiv:1911.04738, http://arxiv.org/abs/1911.04738 (accessed: May 2024).
 * 290. B. Fabian, T. Edlich, H. Gaspar, M. Segler, J. Meyers, M. Fiscato, M. Ahmed, ArXiv preprint 2020, arXiv:2011.13230, http://arxiv.org/abs/2011.13230 (accessed: May 2024).
 * 291. S. Chithrananda, G. Grand, B. Ramsundar, ArXiv preprint 2020, arXiv:2010.09885, http://arxiv.org/abs/2010.09885 (accessed: May 2024).
 * 292. A. Tripp, E. Daxberger, J. M. Hernández-Lobato, Advances in Neural Information Processing Systems 2020, 33, 11259.
 * 293. J. Noh, J. Kim, H. S. Stein, B. Sanchez-Lengeling, J. M. Gregoire, A. Aspuru-Guzik, Y. Jung, Matter 2019, 1, 1370. https://doi.org/10.1016/j.matt.2019.08.017
 * 294. B. Kim, S. Lee, J. Kim, Sci. Adv. 2020, 6, eaax9324. https://doi.org/10.1126/sciadv.aax9324
 * 295. Z. Yao, B. Sánchez-Lengeling, N. S. Bobbitt, B. J. Bucior, S. G. H. Kumar, S. P. Collins, T. Burns, T. K. Woo, O. K. Farha, R. Q. Snurr, Nature Machine Intelligence 2021, 3, 76. https://doi.org/10.1038/s42256-020-00271-1
 * 296. S. Dieb, Z. Song, W.-J. Yin, M. Ishii, J. Appl. Phys. 2020, 128, 074901. https://doi.org/10.1063/5.0012351
 * 297. R. Fisher, The Design of Experiments, 3rd ed., Oliver and Boyd, Edinburgh 1942.
 * 298. D. Finney, Annals of Eugenics 1943, 12, 291. https://doi.org/10.1111/j.1469-1809.1943.tb02333.x
 * 299. G. Box, J. Hunter, W. Hunter, Statistics for Experimenters: Design, Innovation, and Discovery, 2nd ed., Wiley, Hoboken, NJ 2005.
 * 300. J. Cornell, Experiments with Mixtures: Designs, Models, and the Analysis of Mixture Data, 3rd ed., Wiley, Hoboken, NJ 2002. https://doi.org/10.1002/9781118204221
 * 301. B. Jones, C. Nachtsheim, Journal of Quality Technology 2013, 45, 121. https://doi.org/10.1080/00224065.2013.11917921
 * 302. K. Jørgensen, T. Naes, J. Chemom. 2004, 18, 45. https://doi.org/10.1002/cem.835
 * 303. L. Pérez-Mosqueda, L. Trujillo-Cayado, F. Carrillo, P. Ramírez, J. Muñoz, Colloids Surf., B 2015, 128, 127. https://doi.org/10.1016/j.colsurfb.2015.02.030
 * 304. M. Reis, P. Saraiva, in Systems Engineering in the Fourth Industrial Revolution (Eds: R. Kenett, R. Swarz, A. Zonnenshain), Wiley, Hoboken, NJ 2019, p. 137. https://doi.org/10.1002/9781119513957.ch6
 * 305. E. Castillo, M. Reis, Chemom. Intell. Lab. Syst. 2020, 206, 104121. https://doi.org/10.1016/j.chemolab.2020.104121
 * 306. P. Schwaller, T. Laino, T. Gaudin, P. Bolgar, C. A. Hunter, C. Bekas, A. A. Lee, ACS Cent. Sci. 2019, 5, 1572. https://doi.org/10.1021/acscentsci.9b00576
 * 307. X. Liu, P. Li, S. Song, Decomposing Retrosynthesis into Reactive Center Prediction and Molecule Generation, bioRxiv preprint 677849, https://www.biorxiv.org/content/10.1101/677849v2.abstract (accessed: May 2024).
 * 308. P. Schwaller, R. Petraglia, V. Zullo, V. H. Nair, R. A. Haeuselmann, R. Pisoni, C. Bekas, A. Iuliano, T. Laino, Chem. Sci. 2020, 11, 3316. https://doi.org/10.1039/C9SC05704H
 * 309. S. Zheng, J. Rao, Z. Zhang, J. Xu, Y. Yang, J. Chem. Inf. Model. 2020, 60, 47. https://doi.org/10.1021/acs.jcim.9b00949
 * 310. X. Wang, Y. Li, J. Qiu, G. Chen, H. Liu, B. Liao, C.-Y. Hsieh, X. Yao, Chem. Eng. J. 2021, 420, 129845. https://doi.org/10.1016/j.cej.2021.129845
 * 311. Z. Chen, D. Chen, X. Zhang, Z. Yuan, X. Cheng, IEEE Internet of Things Journal 2021, 9, 9179. https://doi.org/10.1109/JIOT.2021.3100509
 * 312. K. Mao, X. Xiao, T. Xu, Y. Rong, J. Huang, P. Zhao, Neurocomputing 2021, 457, 193. https://doi.org/10.1016/j.neucom.2021.06.037
 * 313. J. Li, M. Eastgate, React. Chem. Eng. 2019, 4, 1595.
 * 314. M. Nielsen, D. Ahneman, O. Riera, A. Doyle, J. Am. Chem. Soc. 2018, 140, 5004. https://doi.org/10.1021/jacs.8b01523
 * 315. N. Angello, V. Rathore, W. Beker, A. Wołos, E. Jira, R. Roszak, T. Wu, C. Schroeder, A. Aspuru-Guzik, B. Grzybowski, Science 2022, 378, 399. https://doi.org/10.1126/science.adc8743
 * 316. A. Cadeddu, E. Wylie, J. Jurczak, M. Wampler-Doty, B. Grzybowski, Angew. Chem., Int. Ed. 2014, 53, 8108. https://doi.org/10.1002/anie.201403708
 * 317. A. Sato, T. Miyao, K. Funatsu, Mol. Inf. 2021, 41, 2100156. https://doi.org/10.1002/minf.202100156
 * 318. F. Sandfort, F. Strieth-Kalthoff, M. Kühnemund, C. Beecks, F. Glorius, Chem 2020, 6, 1379. https://doi.org/10.1016/j.chempr.2020.02.017
 * 319. L. Stops, R. Leenhouts, Q. Gao, A. M. Schweidtmann, ArXiv preprint 2022, arXiv:2207.12051, http://arxiv.org/abs/2207.12051 (accessed: May 2024).
 * 320. Q. Gao, A. M. Schweidtmann, ArXiv preprint 2023, arXiv:2308.07822, http://arxiv.org/abs/2308.07822 (accessed: May 2024).
 * 321. K. Fujiwara, M. Kano, S. Hasebe, A. Takinami, AIChE J. 2009, 55, 1754. https://doi.org/10.1002/aic.11791
 * 322. C. Shang, X. Huang, J. Suykens, D. Huang, J. Process Control 2015, 28, 17. https://doi.org/10.1016/j.jprocont.2015.02.006
 * 323. F. Souza, R. Araújo, J. Mendes, Chemom. Intell. Lab. Syst. 2016, 152, 69. https://doi.org/10.1016/j.chemolab.2015.12.011
 * 324. L. Huang, F. Mao, K. Zhang, Z. Li, Sensors 2022, 22, 841. https://doi.org/10.3390/s22030841
 * 325. S. Wu, X. Xiao, Q. Ding, P. Zhao, Y. Wei, J. Huang, Advances in Neural Information Processing Systems 2020, 33, 17105.
 * 326. A. Nambiar, C. Breen, T. Hart, T. Kulesza, T. Jamison, K. Jensen, ACS Cent. Sci. 2022, 8, 825. https://doi.org/10.1021/acscentsci.2c00207
 * 327. Z. Xiang, B. Xie, R. Fu, M. Qian, Ind. Eng. Chem. Res. 2022, 61, 1531. https://doi.org/10.1021/acs.iecr.1c03883
 * 328. K. Liu, Y. Li, J. Yang, Y. Liu, Y. Yao, IEEE Transactions on Instrumentation and Measurement 2020, 69, 8261. https://doi.org/10.1109/TIM.2020.2983595
 * 329. M. Bauer, J. Cox, M. Caveness, J. Downs, N. Thornhill, IEEE Transactions on Control Systems Technology 2007, 15, 12. https://doi.org/10.1109/TCST.2006.883234
 * 330. M. Bauer, N. Thornhill, J. Process Control 2008, 18, 707. https://doi.org/10.1016/j.jprocont.2007.11.007
 * 331. S. Qin, Annual Reviews in Control 2012, 36, 220. https://doi.org/10.1016/j.arcontrol.2012.09.004
 * 332. L. Chiang, R. Braatz, Chemom. Intell. Lab. Syst. 2003, 65, 159. https://doi.org/10.1016/S0169-7439(02)00140-5
 * 333. J. Thambirajah, L. Benabbas, M. Bauer, N. Thornhill, Comput. Chem. Eng. 2009, 33, 503. https://doi.org/10.1016/j.compchemeng.2008.10.002
 * 334. P. Kerkhof, J. Vanlaer, G. Gins, J. Impe, Chem. Eng. Sci. 2013, 104, 285. https://doi.org/10.1016/j.ces.2013.08.007
 * 335. D. Li, ArXiv preprint 2018, arXiv:1809.04758, http://arxiv.org/abs/1809.04758 (accessed: May 2024).
 * 336. A. Antoniou, ArXiv preprint 2017, arXiv:1711.04340, http://arxiv.org/abs/1711.04340 (accessed: May 2024).
 * 337. G. Pesciullesi, P. Schwaller, T. Laino, J.-L. Reymond, Nat. Commun. 2020, 11, 4874. https://doi.org/10.1038/s41467-020-18671-7
 * 338. M. Alberts, T. Laino, A. C. Vaucher, Leveraging Infrared Spectroscopy for Automated Structure Elucidation, https://chemrxiv.org/engage/chemrxiv/article-details/645df5cbf2112b41e96da616 (accessed: May 2024).
 * 339. C. Edwards, T. Lai, K. Ros, G. Honke, K. Cho, H. Ji, ArXiv preprint 2022, arXiv:2204.11817, http://arxiv.org/abs/2204.11817 (accessed: May 2024).
 * 340. A. C. Vaucher, F. Zipoli, J. Geluykens, V. H. Nair, P. Schwaller, T. Laino, Nat. Commun. 2020, 11, 3601. https://doi.org/10.1038/s41467-020-17266-6
 * 341. OpenAI, ArXiv preprint 2023, arXiv:2303.08774, http://arxiv.org/abs/2303.08774 (accessed: May 2024).
 * 342. A. Bender, N. Schneider, M. Segler, W. Patrick Walters, O. Engkvist, T. Rodrigues, Nat. Rev. Chem. 2022, 6, 428. https://doi.org/10.1038/s41570-022-00391-9
 * 343. B. Sanchez-Lengeling, A. Aspuru-Guzik, Science 2018, 361, 360. https://doi.org/10.1126/science.aat2663
 * 344. M. Saebi, B. Nan, J. E. Herr, J. Wahlers, Z. Guo, A. M. Zurański, T. Kogej, P.-O. Norrby, A. G. Doyle, N. V. Chawla, Chem. Sci. 2023, 14, 4997. https://doi.org/10.1039/D2SC06041H
 * 345. T. Rodrigues, Drug Discovery Today: Technol. 2019, 32, 3. https://doi.org/10.1016/j.ddtec.2020.07.001
 * 346. V. Bagal, R. Aggarwal, P. K. Vinod, U. D. Priyakumar, LigGPT: Molecular Generation Using a Transformer-Decoder Model, https://chemrxiv.org/engage/chemrxiv/article-details/60c7588e469df48597f456ae (accessed: May 2024).
 * 347. Y. Tian, R. Yuan, D. Xue, Y. Zhou, X. Ding, J. Sun, T. Lookman, J. Appl. Phys. 2020, 128, 014103.
 * 348. R. Vasudevan, G. Pilania, P. V. Balachandran, J. Appl. Phys. 2021, 129, 070401. https://doi.org/10.1063/5.0043300
 * 349. K. M. Jablonka, P. Schwaller, A. Ortega-Guerrero, B. Smit, Is GPT-3 All You Need for Low-Data Discovery in Chemistry?, https://chemrxiv.org/engage/chemrxiv/article-details/63eb5a669da0bc6b33e97a35 (accessed: May 2024).
 * 350. K. Ahuja, W. H. Green, Y.-P. Li, J. Chem. Theory Comput. 2021, 17, 818. https://doi.org/10.1021/acs.jctc.0c00971
 * 351. M. Olivecrona, T. Blaschke, O. Engkvist, H. Chen, J. Cheminf. 2017, 9, 48. https://doi.org/10.1186/s13321-017-0235-x
 * 352. G. Simm, R. Pinsler, J. M. Hernández-Lobato, in Int. Conf. on Machine Learning, PMLR, 2020, p. 8959.
 * 353. Z. Zhou, S. Kearnes, L. Li, R. N. Zare, P. Riley, Sci. Rep. 2019, 9, 10752. https://doi.org/10.1038/s41598-019-47148-x
 * 354. N. Brown, M. Fiscato, M. Segler, A. Vaucher, J. Chem. Inf. Model. 2019, 59, 1096. https://doi.org/10.1021/acs.jcim.8b00839
 * 355. J. Wei, X. Wang, D. Schuurmans, M. Bosma, F. Xia, E. Chi, Q. V. Le, D. Zhou, Advances in Neural Information Processing Systems 2022, 35, 24824.
 * 356. L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. Wainwright, P. Mishkin, C. Zhang, S. Agarwal, K. Slama, A. Ray, Advances in Neural Information Processing Systems 2022, 35, 27730.
 * 357. T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, Advances in Neural Information Processing Systems 2020, 33, 1877.
 * 358. V. Sanh, A. Webson, C. Raffel, S. H. Bach, L. Sutawika, Z. Alyafeai, A. Chaffin, A. Stiegler, T. L. Scao, A. Raja, M. Dey, M. S. Bari, C. Xu, U. Thakker, S. S. Sharma, E. Szczechla, T. Kim, G. Chhablani, N. Nayak, D. Datta, J. Chang, M. T.-J. Jiang, H. Wang, M. Manica, S. Shen, Z. X. Yong, H. Pandey, R. Bawden, T. Wang, T. Neeraj, J. Rozen, A. Sharma, A. Santilli, T. Fevry, J. A. Fries, R. Teehan, T. Bers, S. Biderman, L. Gao, T. Wolf, A. M. Rush, ArXiv preprint 2022, arXiv:2110.08207, http://arxiv.org/abs/2110.08207 (accessed: May 2024).
 * 359. J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, S. Lloyd, Nature 2017, 549, 195. https://doi.org/10.1038/nature23474
 * 360. V. Dunjko, J. M. Taylor, H. J. Briegel, Phys. Rev. Lett. 2016, 117, 130501. https://doi.org/10.1103/PhysRevLett.117.130501
 * 361. T. Schick, J. Dwivedi-Yu, R. Dessì, R. Raileanu, M. Lomeli, L. Zettlemoyer, N. Cancedda, T. Scialom, ArXiv preprint 2023, arXiv:2302.04761, http://arxiv.org/abs/2302.04761 (accessed: May 2024).
 * 362. L. Crowell, A. Lu, K. Love, A. Stockdale, S. Timmick, D. Wu, Y. Wang, W. Doherty, A. Bonnyman, N. Vecchiarello, C. Goodwine, L. Bradbury, J. Brady, J. Clark, N. Colant, A. Cvetkovic, N. Dalvie, D. Liu, Y. Liu, C. Mascarenhas, C. Matthews, N. Mozdzierz, K. Shah, S.-L. Wu, W. Hancock, R. Braatz, S. Cramer, J. Love, Nat. Biotechnol. 2018, 36, 988. https://doi.org/10.1038/nbt.4262
 * 363. J. Weiner, Why AI/Data Science Projects Fail: How to Avoid Project Pitfalls, Springer Nature, Cham, Switzerland 2022.
 * 364. L. L. Pipino, Y. W. Lee, R. Y. Wang, Commun. ACM 2002, 45, 211. https://doi.org/10.1145/505248.506010
 * 365. J. Liu, R. Srinivasan, P. SelvaGuru, in Computer Aided Chemical Engineering, Vol. 25, Elsevier, Lyon, France 2008, p. 961.
 * 366. US FDA, Discussion Paper: Artificial Intelligence in Drug Manufacturing; Notice; Request for Information and Comments, https://www.federalregister.gov/documents/2023/03/01/2023-04206/discussion-paper-artificial-intelligence-in-drug-manufacturing-notice-request-for-information-and (accessed: April 2024).
 * 367. A. Adhitya, S. F. Cheng, Z. Lee, R. Srinivasan, Comput. Chem. Eng. 2014, 67, 1. https://doi.org/10.1016/j.compchemeng.2014.03.013
 * 368. S. Pandey, S. Gupta, S. Chhajed, Int. J. Eng. Res. Sci. Technol. 2021, 10, 749.




Early View

Online Version of Record before inclusion in an issue





© 2024 The Author(s). The Canadian Journal of Chemical Engineering published by
Wiley Periodicals LLC on behalf of Canadian Society for Chemical Engineering.



This is an open access article under the terms of the Creative Commons
Attribution-NonCommercial License, which permits use, distribution and
reproduction in any medium, provided the original work is properly cited and is
not used for commercial purposes.





RESEARCH FUNDING

 * Jaffer Professorship in Process Systems and Control Engineering
 * Natural Sciences and Engineering Research Council of Canada. Grant Number:
   RGPIN-2019-04600


KEYWORDS

 * artificial intelligence
 * human-AI interaction
 * machine learning
 * process systems engineering


PUBLICATION HISTORY

 * Version of Record online: 06 November 2024
 * Manuscript accepted: 10 September 2024
 * Manuscript revised: 04 August 2024
 * Manuscript received: 15 April 2024



