References
- López-Muñoz, F., Boya, J., and Alamo, C. (2006). Neuron theory, the cornerstone of neuroscience, on the centenary of the Nobel Prize award to Santiago Ramón y Cajal. Brain Research Bulletin 70(4–6): 391–405. https://pubmed.ncbi.nlm.nih.gov/17027775/
- Ramón y Cajal, S. (1888). Estructura de los centros nerviosos de las aves [Structure of the nerve centres of birds]. Revista Trimestral de Histología Normal y Patológica 1: 1–10.
- McCulloch, W.S. and Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5, 115–133. https://doi.org/10.1007/BF02478259
- Rashwan, M., Ez, R., and Abd El Reheem, G. (2017). Computational Intelligent Algorithms for Arabic Speech Recognition. Journal of Al-Azhar University Engineering Sector 12: 886–893. https://doi.org/10.21608/auej.2017.19198. https://jaes.journals.ekb.eg/article_19198.html
- Artificial neuron. Wikipedia. Retrieved April 26, 2021, from https://en.wikipedia.org/wiki/Artificial_neuron
- Shackleton-Jones, N. (2019, May 3). How People Learn: Designing Education and Training that Works to Improve Performance. Kogan Page. London, United Kingdom
- Hebb, D. O. (1949). The Organization of Behavior: A Neuropsychological Theory. New York: Wiley and Sons
- Rosenblatt, F. (1957). The Perceptron—a perceiving and recognizing automaton. Report 85-460-1. Cornell Aeronautical Laboratory.
- Minsky, M. and Papert, S. (1972). Perceptrons: An Introduction to Computational Geometry (second edition with corrections; first edition 1969). The MIT Press, Cambridge, MA.
- Hassan, H., Negm, A., Zahran, M., and Saavedra, O. (2015). Assessment of Artificial Neural Network for Bathymetry Estimation Using High Resolution Satellite Imagery in Shallow Lakes: Case Study El Burullus Lake. International Water Technology Journal 5.
- Pollack, J. B. (1989). No Harm Intended: A Review of the Perceptrons Expanded Edition. Journal of Mathematical Psychology 33(3): 358–365.
- Crevier, D. (1993). AI: The Tumultuous History of the Search for Artificial Intelligence. New York, NY: Basic Books.
- Cybenko, G. (1989). Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals, and Systems 2: 303–314. https://doi.org/10.1007/BF02551274
- Goodfellow, I., Bengio, Y., and Courville, A. (2016). 6.5 Back-Propagation and Other Differentiation Algorithms. Deep Learning. MIT Press. pp. 200–220
- Rumelhart, D., Hinton, G., and Williams, R. (1986) Learning representations by back-propagating errors. Nature 323, 533–536. https://doi.org/10.1038/323533a0
- Overview of PyTorch Autograd Engine. PyTorch Blog. Available from https://pytorch.org/blog/overview-of-pytorch-autograd-engine/
- Berland (2007). ReverseaccumulationAD.png. Wikimedia Commons. Available from https://commons.wikimedia.org/wiki/File:ReverseaccumulationAD.png
- Automatic differentiation. Wikipedia. https://en.wikipedia.org/wiki/Automatic_differentiation
- Wengert, R. E. (1964). A simple automatic derivative evaluation program. Communications of the ACM 7(8): 463–464.
- Bartholomew-Biggs, M., Brown, S., Christianson, B., and Dixon, L. (2000). Automatic differentiation of algorithms. Journal of Computational and Applied Mathematics 124(1–2): 171–190.
- The TensorFlow authors (2018). automatic_differentiation.ipynb. Available from https://colab.research.google.com/github/tensorflow/tensorflow/blob/r1.9/tensorflow/contrib/eager/python/examples/notebooks/automatic_differentiation.ipynb#scrollTo=t09eeeR5prIJ
- The TensorFlow authors. Introduction to gradients and automatic differentiation. TensorFlow. Available from https://www.tensorflow.org/guide/autodiff
- Thomas (2018). The vanishing gradient problem and ReLUs – a TensorFlow investigation. Adventures in Machine Learning. Available from https://adventuresinmachinelearning.com/vanishing-gradient-problem-tensorflow/
- Hinton, G. E., Osindero, S., and Teh, Y.-W. (2006). A Fast Learning Algorithm for Deep Belief Nets. Neural Computation 18(7): 1527–1554. Available from http://www.cs.toronto.edu/~fritz/absps/ncfast.pdf
- Cortes, C. and Vapnik, V. (1995). Support-vector networks. Machine Learning 20: 273–297. https://doi.org/10.1007/BF00994018
- Friedman, J. H. (February 1999). Greedy Function Approximation: A Gradient Boosting Machine.
- Breiman, L. (2001). Random Forests. Machine Learning 45: 5–32. https://doi.org/10.1023/A:1010933404324
- Tibshirani, R. (1996). Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society, Series B (Methodological) 58(1): 267–288.
- Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society, Series B 67(2): 301–320.
- Hubel, D. H. and Wiesel, T. N. (1962). Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. Journal of Physiology 160: 106–154. https://doi.org/10.1113/jphysiol.1962.sp006837
- Frye, C. corticalLayers.gif (diagram of cortical layers). Foundational Neuroscience. Available from http://charlesfrye.github.io/FoundationalNeuroscience/img/corticalLayers.gif
- Wolfe, J., Kluender, K., and Levi, D. (2009). Sensation and Perception. Sunderland, MA: Sinauer Associates, Inc.
- LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., and Jackel, L. D. (1989). Backpropagation Applied to Handwritten Zip Code Recognition. Neural Computation 1(4): 541–551.
- Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. Available from https://www.nvidia.cn/content/tesla/pdf/machine-learning/imagenet-classification-with-deep-convolutional-nn.pdf
- Nair, V. and Hinton, G. E. (2010). Rectified Linear Units Improve Restricted Boltzmann Machines. Proceedings of the 27th International Conference on Machine Learning, Haifa, Israel.
- Agarap, A. F. (2019). Avoiding the vanishing gradients problem using gradient noise addition. Medium. https://medium.com/data-science/avoiding-the-vanishing-gradients-problem-96183fd03343
- Maas, A. L., Hannun, A. Y., and Ng, A. Y. (2013). Rectifier Nonlinearities Improve Neural Network Acoustic Models. Proceedings of the 30th International Conference on Machine Learning, Atlanta, Georgia, USA.
- He, K., Zhang, X., Ren, S., and Sun, J. (2015). Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. arXiv:1502.01852. https://arxiv.org/abs/1502.01852
- Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. arXiv:1207.0580. https://arxiv.org/abs/1207.0580
- Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. Part of Advances in Neural Information Processing Systems 25 (NIPS 2012). https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
- Ioffe, S. and Szegedy, C. (2015). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv:1502.03167. https://arxiv.org/abs/1502.03167
- Santurkar, S., Tsipras, D., Ilyas, A., and Madry, A. (2019). How Does Batch Normalization Help Optimization? arXiv:1805.11604. https://arxiv.org/abs/1805.11604
- Dean, J. and Ng, A. Y. (2012). Using large-scale brain simulations for machine learning and A.I. The Keyword | Google. https://blog.google/technology/ai/using-large-scale-brain-simulations-for/
- LeCun, Y., Bengio, Y., and Hinton, G. (2015) Deep learning. Nature 521, 436–444. https://www.nature.com/articles/nature14539.epdf
- Olah, C. (2015). Understanding LSTM Networks. colah’s blog. Available from https://colah.github.io/posts/2015-08-Understanding-LSTMs/
- Mozer, M. C. (1995). A Focused Backpropagation Algorithm for Temporal Pattern Recognition. In Chauvin, Y. and Rumelhart, D. (eds.), Backpropagation: Theory, Architectures, and Applications. Hillsdale, NJ: Lawrence Erlbaum Associates. pp. 137–169.
- Greff, K., Srivastava, R. K., Koutník, J., Steunebrink, B. R., and Schmidhuber, J. (2017). LSTM: A Search Space Odyssey. arXiv:1503.04069v2. https://arxiv.org/abs/1503.04069v2
- Gers, F. A. and Schmidhuber, J. (2001). LSTM recurrent networks learn simple context-free and context-sensitive languages. IEEE Transactions on Neural Networks 12(6): 1333–1340. https://doi.org/10.1109/72.963769. PMID: 18249962.
- Cho, K., van Merrienboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., and Bengio, Y. (2014). Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. arXiv:1406.1078. https://arxiv.org/abs/1406.1078
- Sutskever, I., Martens, J., Dahl, G., and Hinton, G. (2013). On the importance of initialization and momentum in deep learning. Proceedings of the 30th International Conference on Machine Learning, in PMLR 28(3):1139-1147.
- Duchi, J., Hazan, E., and Singer, Y. (2011). Adaptive Subgradient Methods for Online Learning and Stochastic Optimization. Journal of Machine Learning Research 12: 2121–2159.
- Hinton, G., Srivastava, N., and Swersky, K. Neural Networks for Machine Learning, Lecture 6a. Available from http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf
- Zeiler, M. D. (2012). ADADELTA: An Adaptive Learning Rate Method. arXiv:1212.5701. https://arxiv.org/abs/1212.5701
- Kingma, D. P. and Ba, J. (2017). Adam: A Method for Stochastic Optimization. arXiv:1412.6980. https://arxiv.org/abs/1412.6980
- Martens, J. (2010). Deep Learning via Hessian-free Optimization. Proceedings of the 27th International Conference on Machine Learning (ICML 2010).
- Glorot, X. and Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. Proceedings of the 13th International Conference on Artificial Intelligence and Statistics.
- Kagan, B. J., et al. (2022). In vitro neurons learn and exhibit sentience when embodied in a simulated game-world. Neuron 110(23): 3952–3969.e8.