Early identification of cognitive impairment in older adults could reduce the burden of age-related disabilities. Gait parameters are associated with, and predictive of, cognitive decline. Although a variety of sensors and machine learning analysis methods have been used in cognitive studies, an optimized deep machine vision-based method for analyzing gait to identify cognitive decline is still needed.
This study used the West China Hospital Elderly Gait dataset, comprising walking footage of 158 adults labelled by performance on the Short Portable Mental Status Questionnaire. We proposed a novel recognition network, Deep Optimized GaitPart (DO-GaitPart), based on silhouette and skeleton gait images. Three improvements were applied: a short-term temporal template generator (STTG) in the template generation stage to decrease computational cost and minimize the loss of temporal information; a depth-wise spatial feature extractor (DSFE) to extract both global and local fine-grained spatial features from gait images; and multi-scale temporal aggregation (MTA), a temporal modeling method based on an attention mechanism, to improve the distinguishability of gait patterns.
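To make the roles of the three components concrete, the following is a minimal PyTorch sketch of how an STTG, DSFE, and MTA stage could be chained. All module structures, channel sizes, part counts, and window lengths are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the three DO-GaitPart components (assumed structure, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class STTG(nn.Module):
    """Short-term temporal template generator: averages silhouettes over short,
    non-overlapping windows to shrink the sequence while keeping local motion cues."""
    def __init__(self, window: int = 3):
        super().__init__()
        self.window = window

    def forward(self, x):                        # x: (N, T, 1, H, W)
        n, t, c, h, w = x.shape
        t_trim = (t // self.window) * self.window
        x = x[:, :t_trim].reshape(n, t_trim // self.window, self.window, c, h, w)
        return x.mean(dim=2)                     # (N, T', 1, H, W) short-term templates


class DSFE(nn.Module):
    """Depth-wise spatial feature extractor: depth-wise separable convolutions applied
    frame-wise, followed by a global descriptor and horizontal part (strip) descriptors."""
    def __init__(self, out_channels: int = 64, parts: int = 4):
        super().__init__()
        self.parts = parts
        self.stem = nn.Conv2d(1, out_channels, 3, padding=1)
        self.depthwise = nn.Conv2d(out_channels, out_channels, 3, padding=1,
                                   groups=out_channels)
        self.pointwise = nn.Conv2d(out_channels, out_channels, 1)

    def forward(self, x):                        # x: (N, T', 1, H, W)
        n, t, c, h, w = x.shape
        x = x.reshape(n * t, c, h, w)
        x = F.relu(self.stem(x))
        x = F.relu(self.pointwise(self.depthwise(x)))
        glob = x.mean(dim=(2, 3))                # global spatial feature per frame
        strips = x.chunk(self.parts, dim=2)      # horizontal strips for local features
        local = torch.stack([s.mean(dim=(2, 3)) for s in strips], dim=1)
        feats = torch.cat([glob.unsqueeze(1), local], dim=1)   # (N*T', parts+1, C)
        return feats.reshape(n, t, self.parts + 1, -1)


class MTA(nn.Module):
    """Multi-scale temporal aggregation: attention-weighted pooling over frames,
    fused across two temporal scales (full sequence and a strided subsequence)."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.score = nn.Linear(channels, 1)

    def _attend(self, x):                        # x: (N, T', P, C)
        w = torch.softmax(self.score(x), dim=1)  # attention weights over time
        return (w * x).sum(dim=1)                # (N, P, C)

    def forward(self, x):
        fine = self._attend(x)                   # every short-term template
        coarse = self._attend(x[:, ::2])         # coarser temporal scale
        return fine + coarse                     # sequence-level gait descriptor


if __name__ == "__main__":
    clip = torch.rand(2, 30, 1, 64, 44)          # batch of silhouette sequences
    feats = MTA()(DSFE()(STTG()(clip)))
    print(feats.shape)                           # torch.Size([2, 5, 64])
```

The resulting part-wise descriptors would then feed the downstream cognitive state classifier; the exact classification head is not detailed here.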
An ablation test showed that each component of DO-GaitPart was essential. DO-GaitPart outperformed the comparison methods GaitSet, GaitPart, MT3D, 3D Local, TransGait, CSTL, GLN, and GaitGL in the bag-carrying walking condition on the CASIA-B dataset, and SMPLGait on the Gait3D dataset. The proposed machine vision gait feature identification method achieved an area under the receiver operating characteristic curve (ROC-AUC) of 0.876 (0.852–0.900) on the cognitive state classification task.
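For reference, the reported metric can be reproduced from per-subject classifier scores as in the short sketch below. The labels, scores, and bootstrap interval procedure shown are placeholders for illustration and are not the study's exact evaluation protocol.

```python
# Illustrative ROC-AUC computation with a bootstrap interval (assumed procedure).
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=158)            # 1 = cognitive decline (placeholder labels)
y_score = y_true * 0.6 + rng.random(158) * 0.5   # placeholder classifier scores

auc = roc_auc_score(y_true, y_score)
boot = [roc_auc_score(y_true[idx], y_score[idx])
        for idx in (rng.integers(0, 158, 158) for _ in range(1000))
        if len(np.unique(y_true[idx])) == 2]      # skip resamples with a single class
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"ROC-AUC {auc:.3f} ({lo:.3f}-{hi:.3f})")
```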
The proposed method performed well in identifying cognitive decline from the gait video datasets, making it a promising prototype tool for cognitive assessment.