I have extensive experience with Python and its scientific libraries, as well as a strong background in machine learning and deep learning. I usually use TensorFlow/Keras for neural models.
I have a few questions to clarify things, because I can't quite tell what you want to do. Does your dataset consist of sequences of shapes, or is it just a set of shapes? If you want to predict the next element in a sequence, you need to train your model on sequences too. It is similar to text models: you can't simply feed the net randomly chosen words and expect it to predict the next word in a sentence :)
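To show what I mean by training on sequences, here is a minimal sketch. I am assuming each shape is described by three numbers (a, b, c) and that the shapes come in a meaningful order; the function name `make_windows` and the toy data are mine, for illustration only.

```python
def make_windows(sequence, window):
    """Split one ordered sequence of shapes into (input window, next shape) pairs."""
    if len(sequence) <= window:
        raise ValueError("sequence must be longer than the window")
    n_pairs = len(sequence) - window
    # inputs[i] holds `window` consecutive shapes; targets[i] is the shape that follows them.
    inputs = [sequence[i:i + window] for i in range(n_pairs)]
    targets = [sequence[i + window] for i in range(n_pairs)]
    return inputs, targets


# Toy ordered sequence of 5 shapes, each given by assumed parameters (a, b, c).
shapes = [(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 7)]
inputs, targets = make_windows(shapes, window=2)

for x, y in zip(inputs, targets):
    print(x, "->", y)
```

Pairs like these are what a sequence model would be trained on. If the data is only an unordered set of shapes, no such pairs exist and "next element" has no meaning.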
The second question is about "how often they appear inside the data". This is a bit confusing; what do you mean by it? In particular, which shapes should count as equal when we compare two shapes? Ones that are within eps-distance in a, b, c, or similar ones (with the same proportions a/b, a/c), or something else?
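The two definitions of equality give different answers, so the counts of "how often a shape appears" will differ too. A rough sketch of both, assuming a shape is a tuple (a, b, c) with non-zero b and c; the function names and the tolerance value are my own placeholders:

```python
import math


def equal_within_eps(s1, s2, eps=1e-3):
    """Option 1: each of a, b, c differs by at most eps."""
    return all(abs(p - q) <= eps for p, q in zip(s1, s2))


def equal_by_proportions(s1, s2, eps=1e-3):
    """Option 2: same ratios a/b and a/c, i.e. the same shape up to scale."""
    a1, b1, c1 = s1
    a2, b2, c2 = s2
    return (math.isclose(a1 / b1, a2 / b2, abs_tol=eps)
            and math.isclose(a1 / c1, a2 / c2, abs_tol=eps))


small = (1.0, 2.0, 4.0)
big = (2.0, 4.0, 8.0)  # the same shape scaled by 2

print(equal_within_eps(small, big))      # False: the parameters differ
print(equal_by_proportions(small, big))  # True: the proportions match
```

Under the first definition `small` and `big` are two different shapes; under the second they are one shape seen twice.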
Contact me if you are interested, and we can discuss it in detail.