Answer to question 1: Yes, you can. However, you need to provide a sufficient number of images, preferably at least a few thousand for each animal type. Otherwise, the model is unlikely to train well.
Answer to question 2: One possible reason is that you are trying to feed all the images to the model at once; another is that you are training on a CPU and your machine is not powerful enough. The former can be addressed easily: train in batch mode, which is the recommended practice in deep learning.
The latter can be addressed by moving your training from CPU to GPU. If your machine does not have a GPU, you can use an Amazon EC2 GPU instance, which offers a single GPU (p2.xlarge) or multiple GPUs (for example, p2.8xlarge).
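To make the batch-mode idea concrete, here is a minimal sketch in plain Python of splitting a dataset into fixed-size batches instead of loading it all at once. The helper name iter_batches, the file names, and the batch size of 4 are assumptions chosen for illustration; they are not taken from the application discussed in the chapter, and a real training loop would load and feed each batch to the model where the comment indicates.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of at most batch_size items (hypothetical helper)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Hypothetical file names standing in for a real image dataset.
image_paths = [f"img_{i:04d}.jpg" for i in range(10)]

batch_sizes = []
for batch in iter_batches(image_paths, batch_size=4):
    # A real loop would load only these images and run one training step here.
    batch_sizes.append(len(batch))

print(batch_sizes)  # [4, 4, 2]
```

Because only one batch is held in memory at a time, peak memory use depends on the batch size rather than on the size of the whole dataset.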
Answer to question 3: The application provided should be enough to understand...