Clinical feature-related single-base substitution sequence signatures identified with an unsupervised machine learning approach