
Steve Chan

Artificial Intelligence,
Machine Learning,
Numerical Algorithms,
Data Analytics,
Decision Science


Projects

A sampling of my AI-centric projects is below. I have provided one sample project per year, in descending order, for the period 2017 through 2021: (1) Numerical Stability Adaptive Inertial Weighting for Particle Swarm Optimization (PSO) Implementation on a Deep Convolutional Generative Adversarial Network (DCGAN), (2) Auto-tuning of an Artificial Intelligence (AI)-centric Steady State Genetic Algorithm (SSGA) Compression Factor on a Modified Numerical Computing Platform, (3) Stochastic Gradient Descent (SGD) Algorithm for Ascertaining Apropos Weights in the Fast Training of Support Vector Machines, (4) Bi-Normal Separation (BNS) and a Modified Association Matrix (MAM) for an Accelerated Inference Engine, and (5) Higher Tolerance for Uncertainty amidst Compressed Decision Cycles on a Stacked Generative Adversarial Network (SGAN).

2021

Numerical Stability Adaptive Inertial Weighting for Particle Swarm Optimization (PSO) Implementation on a Deep Convolutional Generative Adversarial Network (DCGAN)

Excerpt from my work-in-progress paper: AI-based Robust Convex Relaxations for Supporting Diverse QoS in Next-Generation Wireless Systems

This experiment explored a particular class of Convolutional Neural Networks (CNNs), namely Deep Convolutional Generative Adversarial Networks (DCGANs), not only to solve certain convex optimization problems but also to leverage the same mechanism for tuning their own hyperparameters. This gives rise to other interesting technical challenges. For example, Particle Swarm Optimization (PSO) is one approach to hyperparameter reduction and tuning, but the algorithmic challenge of implementing PSO on a DCGAN centers on converting continuous or discontinuous hyperparameters to discrete values, which may cause particles to stagnate prematurely at local optima. The implementation mechanics, such as increasing the inertial weighting to help mitigate the stagnation issue, may in turn spawn other convex optimization problems. The experiments capitalized on the feed-forward structure of a “You Only Look Once” (YOLO)-based DCGAN. Specifically, a squeezed Deep Convolutional-YOLO-Generative Adversarial Network (DC-YOLO-GAN), referred to as a Modified Squeezed YOLO v3 Implementation (MSY3I), was combined with convex relaxation adversarial training to improve the bound tightening for each successive neural network layer and to better facilitate global optimization via a specific numerical stability implementation within the MSY3I.
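The stagnation issue and the inertia adjustment can be illustrated with a minimal sketch. It is not the MSY3I implementation: the function `pso_discrete`, the toy objective `toy_loss`, the grid `GRID`, and the rule "raise the inertia weight by 0.1 per stalled iteration, reset on improvement" are all hypothetical choices made for illustration. Particles move in a continuous index space and are rounded to the discrete grid only when evaluated, which is the conversion step where particles can get stuck.

```python
import math
import random


def pso_discrete(objective, choices, n_particles=12, n_iters=40,
                 w_min=0.4, w_max=0.9, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` over discrete hyperparameter choices (illustrative only).

    choices[d] lists the allowed values for dimension d.
    """
    rng = random.Random(seed)
    dims = len(choices)
    upper = [len(c) - 1 for c in choices]

    def decode(pos):
        # Continuous-to-discrete conversion: round each coordinate to a grid index.
        return [choices[d][min(max(int(round(pos[d])), 0), upper[d])]
                for d in range(dims)]

    pos = [[rng.uniform(0, upper[d]) for d in range(dims)]
           for _ in range(n_particles)]
    vel = [[0.0] * dims for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(decode(p)) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    stall, w = 0, w_min

    for _ in range(n_iters):
        # Hypothetical adaptive rule: grow the inertia weight while the global
        # best is stalled, and reset it once the swarm improves again.
        w = min(w_max, w + 0.1) if stall > 0 else w_min
        improved = False
        for i in range(n_particles):
            for d in range(dims):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], 0.0), float(upper[d]))
            val = objective(decode(pos[i]))
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
                    improved = True
        stall = 0 if improved else stall + 1
    return decode(gbest), gbest_val


# Toy stand-in for a validation loss over (learning rate, batch size).
GRID = [[1e-4, 1e-3, 1e-2, 1e-1], [16, 32, 64, 128]]


def toy_loss(h):
    lr, batch = h
    return (math.log10(lr) + 2.0) ** 2 + ((batch - 64) / 64.0) ** 2


best, loss = pso_discrete(toy_loss, GRID)
print(best, loss)
```

In a real setting `toy_loss` would be replaced by a (much more expensive) training-and-validation run, and the inertia schedule would itself need tuning.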

2020

Auto-tuning of an Artificial Intelligence (AI)-centric Steady State Genetic Algorithm (SSGA) Compression Factor on a Modified Numerical Computing Platform

Excerpt from my publication: [Mitigation Factors for Multi-domain Resilient Networked Distributed Tessellation Communications], which received a Best Paper Award.

Simulations run atop a Modified GNU Octave (M-GNU-O) platform have indicated that statistical consistency tests are not reliable for discerning an optimal filter (which still necessitates parameter tuning). Rather, the tests yield an infinite set of consistent filters, of which the optimal filter is a unique member. Preliminary experimental results indicate promise for auto-tuning the Steady State Genetic Algorithm (SSGA) compression factor ζ to achieve more optimal convergence toward an optimally tuned filter (or a set of near-optimally tuned filters). Indeed, auto-tuning is central to this capability, and the compression factor ζ is instrumental in dictating the rate at which the steady state approaches convergence. Large ζ values may be indicative of earlier (i.e., premature) convergence, leading to specious solutions that have keyed in on local minima and/or noise and thereby precluding a more optimal convergence. Accordingly, one observation is that the ability to re-tune the compression factor ζ to a lower value (i.e., <1) seems to be critical. Another observation concerns the Principal Tuning Result (PTR) for an exponentially bounded fitness: given the characteristic time λ for an overall time-dependent population fitness F, the PTR satisfies the convergence condition PTR = F_(t+1) - F_t < F_t (e^(-λt) - 1). In essence, the PTR allows for an SSGA optimization estimate of the convergent approach of the time-dependent population fitness F in a quasi-analytical fashion prior to a given numerical iteration, and this finding seems to be consistent with other research in the field.
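As a small worked illustration, the convergence condition can be evaluated directly for a pair of successive fitness values. The helper names `ptr`, `ptr_bound`, and `satisfies_ptr` and the sample numbers are hypothetical; the sketch only evaluates the inequality as stated above and says nothing about how the SSGA itself is run.

```python
import math


def ptr(f_t, f_next):
    """Left-hand side of the condition: the fitness change F_(t+1) - F_t."""
    return f_next - f_t


def ptr_bound(f_t, lam, t):
    """Right-hand side of the condition: F_t * (e^(-lam * t) - 1)."""
    return f_t * (math.exp(-lam * t) - 1.0)


def satisfies_ptr(f_t, f_next, lam, t):
    """True when F_(t+1) - F_t < F_t * (e^(-lam * t) - 1)."""
    return ptr(f_t, f_next) < ptr_bound(f_t, lam, t)


# Sample values chosen purely for illustration.
print(ptr_bound(10.0, 0.1, 1))           # the bound for F_t = 10, lam = 0.1, t = 1
print(satisfies_ptr(10.0, 8.0, 0.1, 1))  # change of -2.0 against that bound
print(satisfies_ptr(10.0, 9.5, 0.1, 1))  # change of -0.5 against that bound
```

For positive F_t, λ, and t the right-hand side is negative, so the condition holds only when the fitness value drops by more than that amount between iterations.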

2019

Stochastic Gradient Descent (SGD) Algorithm for Ascertaining Apropos Weights in the Fast Training of Support Vector Machines

Excerpt from my publication: [Fast Training of Support Vector Machine for Forest Fire Prediction]

The Support Vector Machine (SVM) is a binary classification model that aims to find the optimal separating hyperplane with the maximum margin so as to classify the data. The maximum-margin SVM is obtained by solving a convex Quadratic Programming Problem (QPP) and is termed the hard-margin linear SVM; its training process is often time-consuming. Several decomposition methods, which split the problem into a sequence of smaller sub-problems, have been experimented with. The Sequential Minimal Optimization (SMO) algorithm is a widely utilized decomposition method for SVMs and can lead to faster training, because the problem is decomposed more quickly into sub-problems. SMO avoids solving numerical Quadratic Programming (QP) problems and instead solves the smallest possible optimization problem at each iteration (by repeatedly selecting a subset of the free variables and optimizing over these variables). Another method for solving optimization problems that has been widely utilized for machine learning is Stochastic Gradient Descent (SGD), which is an iterative method. An SGD algorithm was utilized on the discussed experimental testbed to ascertain apropos weights (w_0, w) by iteratively updating the values of w_0 and w using the value of the gradient V. The value of the gradient V depends upon the inputs (S), the current values of the model parameters (λ, η, σ), and the cost function f. Here η is the learning rate, which determines the size of the steps taken to reach a minimum, λ is the regularization parameter, which reduces overfitting, and σ is the standard deviation; the loss function l(ŷ(x|Θ), y) measures the cost of predicting ŷ when the actual answer is y.
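The iterative update of (w_0, w) can be sketched as follows. This is a minimal illustration, not the testbed code: it assumes an L2-regularized hinge loss as the cost function, omits σ entirely, and the function names `sgd_linear_svm` and `predict` as well as the toy data are hypothetical.

```python
import random


def sgd_linear_svm(samples, labels, eta=0.05, lam=0.01, epochs=50, seed=0):
    """Fit a linear SVM with plain SGD; labels must be +1 or -1.

    eta is the learning rate and lam the regularization parameter.
    Assumed per-sample cost: lam/2 * ||w||^2 + max(0, 1 - y * (w0 + w.x)).
    """
    rng = random.Random(seed)
    w0, w = 0.0, [0.0] * len(samples[0])
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a fresh random order
        for i in order:
            x, y = samples[i], labels[i]
            margin = y * (w0 + sum(wj * xj for wj, xj in zip(w, x)))
            if margin < 1:
                # Inside the margin: hinge term and regularizer both contribute.
                w = [wj - eta * (lam * wj - y * xj) for wj, xj in zip(w, x)]
                w0 += eta * y
            else:
                # Outside the margin: only the regularizer shrinks w.
                w = [wj - eta * lam * wj for wj in w]
    return w0, w


def predict(w0, w, x):
    """Classify x by the sign of the decision function."""
    return 1 if w0 + sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


# Toy, linearly separable data for illustration only.
X = [[2, 2], [3, 3], [2, 3], [3, 2], [-2, -2], [-3, -3], [-2, -3], [-3, -2]]
y = [1, 1, 1, 1, -1, -1, -1, -1]
w0, w = sgd_linear_svm(X, y)
print(w0, w, [predict(w0, w, x) for x in X])
```

Each step touches a single sample, which is what makes SGD attractive when the full QP is too slow to solve; the price is that η and λ must be chosen with some care.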

2018

Bi-Normal Separation (BNS) and a Modified Association Matrix (MAM) for an Accelerated Inference Engine

Excerpt from my publication: [Countering an Anti-Natural Language Processing Mechanism in the Computer-Mediated Communication of “Trusted” Cyberspace Operations: Bi-Normal Separation Feature Scaling for Informing a Modified Association Matrix]

The exemplar layers of a prototypical Deep Learning Engine (Training Engine and Inference Engine) specified for the discussed Training Engine experiment are as follows: N-Grams (NG) (i.e., recurrent word combinations), Part-of-Speech (POS) N-Grams (POSNG) (i.e., recurrent POS combinations), words with semantic characteristics of relationships (using values from WordNet), Positive and Negative Values (PNV) of words (using values from the Macquarie Semantic Orientation Lexicon [MSOL]), Pleasantness Value (PV) of words (using values from Whissell’s Dictionary of Affect in Language [WDAL]), and Affective words Demonstrating Subjectivity (ADS). These same exemplar layers are utilized for both the Forward Propagation “Rough-Tuning” of the Training Model and the Continuous Back Propagation “Fine-Tuning” of the Pre-Trained Model. Bi-Normal Separation (BNS) and a Modified Association Matrix (MAM) were leveraged as accelerants for the inference engine. When combined with specifically chosen datasets to assist in the pre-training, the Transfer Learning was enhanced. By way of explanation, the Untrained Model eventually becomes a “Rough-Tuned” Trained Model (upon ingestion of the initial Training Dataset and Forward Propagation). Further “Rough-Tuning” can be achieved by training specific layers, such as PNV and PV (e.g., via MSOL and WDAL, respectively). Eventually, the Trained Model becomes a Pre-Trained Model, and “Fine-Tuning” can be achieved by Continuous Back Propagation and by optimizing at certain training layers, such as PNV (e.g., via the Yelp Restaurant Sentiment Lexicon [YRSL] and Amazon Laptop Sentiment Lexicon [ALSL]) and PV (e.g., via the Canadian National Research Council (NRC) Hashtag Emotion Lexicon [HEL] and NRC Word-Emotion Association Lexicon [WEAL]). The Pre-Trained Model is then further optimized when a new dataset is ingested.
To avoid over-fitting, the Pre-Trained Model of the CNN also served as a feature extractor whose features can be fed into a Support Vector Machine (SVM). Collectively, the described constituent components comprised an experimental framework for an enhanced inference system.
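For readers unfamiliar with Bi-Normal Separation, the standard score for a feature is the absolute difference between the inverse normal CDF of its true-positive rate and that of its false-positive rate. The sketch below shows only that generic scoring step, not the Modified Association Matrix or the engine described above; the function names, the clipping constant `eps`, and the toy documents are assumptions made for illustration.

```python
from collections import Counter
from statistics import NormalDist

_inv_cdf = NormalDist().inv_cdf  # inverse CDF of the standard normal


def bns_score(tp, fp, pos, neg, eps=0.0005):
    """BNS = |F^-1(tpr) - F^-1(fpr)| for one feature.

    tp / fp: positive / negative documents containing the feature.
    pos / neg: total positive / negative documents (both assumed non-zero).
    Rates are clipped to [eps, 1 - eps] because F^-1 is unbounded at 0 and 1.
    """
    tpr = min(max(tp / pos, eps), 1.0 - eps)
    fpr = min(max(fp / neg, eps), 1.0 - eps)
    return abs(_inv_cdf(tpr) - _inv_cdf(fpr))


def bns_weights(docs, labels):
    """Score every token in tokenized `docs`; labels are 1 (positive) or 0."""
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    tp, fp = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        # Count document frequency, so each token counts once per document.
        (tp if y == 1 else fp).update(set(tokens))
    return {t: bns_score(tp[t], fp[t], pos, neg) for t in set(tp) | set(fp)}


# Toy corpus for illustration only.
docs = [["good", "food"], ["good", "service"], ["bad", "food"], ["bad", "service"]]
labels = [1, 1, 0, 0]
weights = bns_weights(docs, labels)
print(sorted(weights.items(), key=lambda kv: -kv[1]))
```

Tokens that appear equally often in both classes score zero, while tokens concentrated in one class score high, which is what makes the score usable for feature scaling.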

2017

Higher Tolerance for Uncertainty amidst Compressed Decision Cycles on a Stacked Generative Adversarial Network (SGAN)

Excerpt from my publication: [Prototype Orchestration Framework as a High Exposure Dimension Cyber Defense Accelerant Amidst Ever-Increasing Cycles of Adaptation by Attackers: A Modified Deep Belief Network Accelerated by a Stacked Generative Adversarial Network for Enhanced Event Correlation], which received a Best Paper Award.

The prototype orchestration framework involved a modified Stacked Generative Adversarial Network (SGAN) for Uncompressed Decision Cycles (UDC) and a modified Deep Belief Network (DBN) for Compressed Decision Cycles (CDC). Particular focus was given to the Artificial Intelligence (AI) accelerant methodology utilized to compress the involved decision-making cycles. Data was ingested via two disparate pathways: (1) UDC and (2) CDC. For UDC, the data was passed along for Deep Learning (DL) as well as to a “higher ambiguity and lower uncertainty” (HALU) paradigm module (i.e., more data is desired). In contrast, for CDC, the data was passed along to a DBN as well as to a “lower ambiguity and higher uncertainty” (LAHU) paradigm module. For the UDC pathway, DL and HALU passed their votes to a modified N-Input Voting Algorithm (NIVA) 1 module, whose output was then passed along to a Voting Algorithm for Fault Tolerant Systems (VAFTs) variant for further processing before a decision was reached. For the CDC pathway, DBN and LAHU passed their votes down a fast-track pathway that had its own NIVA 2 module, an additional “Lower Ambiguity Accelerant” (LAA), and a resultant decision output. In essence, the prototype orchestration framework was predicated upon the hybridization of a modified DBN conjoined with a particular cognitive computing precept (the acceptance of higher uncertainty amidst lower ambiguity for CDC); for UDC, it utilized a modified SGAN, which served as a feeder to an LAA.