Project Alpha
Revolutionizing Education through the Power of Machine Learning and Artificial Intelligence
Project Alpha Abstract
Project Alpha examines the potential of Generative Pre-trained Transformers (GPTs) in education, with a specific focus on the AlphaGPT series and its innovative educational applications. Global educational challenges, worsened by the COVID-19 pandemic, have created a need for personalized and effective learning. The AlphaGPT series, built on advanced neural networks and large language models (LLMs), together with the educational Alpha iOS apps, offers an innovative way to address these needs. This paper provides a detailed description of the AlphaGPT series, from the simple AlphaGPT 0.01 to the more complex AlphaGPT 0.04 and the advanced AlphaGPT 0.2, detailing their architectures, uses, and potential to revolutionize educational content creation and delivery.
An analysis of my novel artificial intelligence (AI) tutor, AlphaGPT, which reached a final tested accuracy of 99.17%, against other large language models, such as OpenAI's GPT series and Meta's LLAMA2 models, was used to assess each model's performance in generating and interpreting educational content. The research then introduces Project Alpha's innovative applications: AlphaGPT and my AI-powered iOS apps, Alpha Words and Alpha Math & Science. Each application is designed to enhance the educational experience and outcomes for K-12 students of different ages and educational backgrounds.
The project highlights the significant role that AI and GPTs can play in overcoming educational challenges and creating personalized learning. The findings of this study show the potential of AlphaGPT and the innovative educational Alpha iOS apps in creating a more engaging, safe, and personalized learning experience for students worldwide.
Project Alpha Introduction
Machine learning and artificial intelligence (AI) can be used to revolutionize education by creating innovative educational solutions. This research paper examines the revolutionary potential of Generative Pre-trained Transformers (GPTs) in the education field, particularly focusing on the AlphaGPT series and its innovative educational applications. GPTs have been central to recent advances in natural language processing (NLP), revealing new capabilities in generating coherent and relevant text. The AlphaGPT series, a set of neural networks and large language models, and the innovative educational Alpha iOS apps represent a significant leap toward this education revolution, aiming to transform educational content creation, customization, and delivery.
The arrival of GPTs has unlocked new possibilities in creating interactive and adaptive learning environments. From generating educational content to providing detailed analyses, these models have the power to reshape the way educators teach and how students engage with material. This paper presents the fundamental concepts of GPTs and their relevance to the educational sector. It then dives into a detailed exploration of the AlphaGPT neural networks, starting from the simple AlphaGPT 0.01, moving to the more complex AlphaGPT 0.04, and finally reaching the sophisticated AlphaGPT 0.2. Each model is explained in detail to give a deep understanding of its architecture and functionality.
Next, this paper presents an analysis of several large language models, including OpenAI’s GPT series and Meta’s LLAMA2 models, to assess their performance in interpreting and generating educational content. This analysis is important to understand the strengths and limitations of each model and how they can be fine-tuned and integrated into educational settings to enhance learning experiences.
Through this research, Project Alpha aims to provide educators and students with a deep and detailed understanding of the potential of GPTs in education. The project's goal is to apply AI in innovative ways to build more engaging, personalized, and effective learning environments. The AlphaGPT series and the iOS apps Alpha Words and Alpha Math & Science are a starting point toward a future of personalized learning.
Literature Review
Transformer Models in Deep Learning
Attention Is All You Need (Vaswani et al.)[5]:
This paper introduced the Transformer model, revolutionizing sequence modeling tasks like translation. Known for relying solely on attention mechanisms, the model replaced the previously standard recurrent and convolutional layers in neural networks. Its main innovations included:
❖ A Self-Attention Mechanism, which allows the model to weigh the significance of different parts of the input data differently.
❖ Multi-Head Attention, which improves the model's ability to focus on different positions, letting it capture various contextual relationships.
❖ Scalability and Efficiency, due to parallelization advantages, which makes the model faster and more efficient to train.
The Transformer model showed superior performance in machine translation tasks, achieving record results in English-to-German and English-to-French translation tasks.
Scaling Language Models
LLaMA: Open and Efficient Foundation Language Models (Touvron et al.)[6]:
This paper focused on large-scale language models (LLMs) and their efficiency and accessibility challenges. It proposed LLaMA, an LLM that balances performance with efficiency. Its key differences were:
❖ Efficient Training, which addressed the computational demands of training large models and showed how LLaMA offered a more resource-efficient approach.
❖ Open Accessibility, which made large language modeling more accessible to a broader research and experimental community.
❖ A Foundation Model Concept, which showed that LLaMA was made to be a foundational model, capable of being fine-tuned for various tasks.
Both of these papers represent significant advancements in neural network architectures and language modeling, allowing for new innovations in the AI field. Project Alpha builds on these ideas, applying and extending these concepts to new innovative areas and addressing challenges in machine learning and AI.
The Problem
The United Nations' recent Facts and Figures show that progress toward quality education was already slower than required before the pandemic, and that COVID-19 has had devastating impacts on education, causing learning losses in four out of five of the 104 countries studied. [1]
Without additional measures, an estimated 84 million children and young people will still be out of school by 2030, and approximately 300 million students will lack the basic numeracy and literacy skills necessary for success in life. [1]
Additionally, United States students lag behind their international peers. In tests of reading, math, and science, US 15-year-olds were outperformed by many of their counterparts in Asia and Europe. The United States ranks 24th in reading, 36th in math, and 28th in science. [2]
OECD Reading, Math, Science Performance (PISA) Diagram, using the Python library Matplotlib.
Project Alpha Phase I Summary
The “AlphaGPT 0.01” script defines a simple character-based neural network for generating city names. It has 4,356 parameters. The script starts by loading a dataset of city names and creating a bigram model as a baseline, then prepares data for a PyTorch [10] model by creating a matrix from character bigrams. The model is a basic neural network without hidden layers: it uses one-hot encoded inputs and a single linear layer with 4,356 weights and no biases. The model's loss is calculated, and training is performed for 10,000 iterations. The script includes code for generating city names using the trained model and visualizes both the loss during training and a heatmap of character transitions. This version is a simple model focused on character-level transitions.
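To make this concrete, here is a minimal sketch of the AlphaGPT 0.01 idea, assuming a placeholder dataset variable and learning rate: one-hot encoded characters feed a single bias-free linear layer trained by plain gradient descent. This is an illustrative sketch, not the original script.

import torch
import torch.nn.functional as F

cityNames = ["london", "paris", "tokyo"]            # placeholder for the real dataset
chars = sorted(set("".join(cityNames)) | {"."})     # "." marks the start/end of a name
stoi = {c: i for i, c in enumerate(chars)}
vocabSize = len(chars)

# Build bigram training pairs (current character -> next character).
xs, ys = [], []
for name in cityNames:
    padded = "." + name + "."
    for c1, c2 in zip(padded, padded[1:]):
        xs.append(stoi[c1])
        ys.append(stoi[c2])
xs, ys = torch.tensor(xs), torch.tensor(ys)

# A single weight matrix of shape vocabSize x vocabSize (4,356 weights when vocabSize = 66).
W = torch.randn((vocabSize, vocabSize), requires_grad=True)

for _ in range(10000):
    xEnc = F.one_hot(xs, num_classes=vocabSize).float()   # one-hot encode the inputs
    logits = xEnc @ W                                     # single linear layer, no biases
    loss = F.cross_entropy(logits, ys)                    # negative log-likelihood
    W.grad = None
    loss.backward()
    W.data += -10.0 * W.grad                              # plain gradient descent step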
The “AlphaGPT 0.02” neural network enhances the previous model by introducing hidden layers and embedding layers. It has 24,976 parameters, meaning this model is much larger than the one before it. The architecture includes an embedding layer, two linear layers with weights W1 and W2, and biases b1 and b2. The model uses blockSize = 3 characters, with numHidden = 250 neurons in the hidden layer. The training loop runs for 10,000 iterations, and the script includes a city name generation procedure like the previous version. There are additional visualizations for weights and biases. This model is more complex than the first, with an added non-linear function to capture more intricate patterns in the data.
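A minimal sketch of this 0.02-style architecture is shown below: an embedding table C, a context of blockSize = 3 characters, a 250-neuron hidden layer with a non-linearity (assumed here to be tanh), and an output layer. The embedding dimension is an assumption; with an embedding dimension of 10 and a 66-character vocabulary, the parameter count works out to 660 + 7,500 + 250 + 16,500 + 66 = 24,976, consistent with the figure above.

import torch
import torch.nn.functional as F

vocabSize, blockSize, numHidden, embDim = 66, 3, 250, 10

C  = torch.randn((vocabSize, embDim))                     # embedding table
W1 = torch.randn((blockSize * embDim, numHidden)) * 0.1   # hidden layer weights
b1 = torch.zeros(numHidden)
W2 = torch.randn((numHidden, vocabSize)) * 0.1            # output layer weights
b2 = torch.zeros(vocabSize)
parameters = [C, W1, b1, W2, b2]                          # 24,976 parameters in total
for p in parameters:
    p.requires_grad = True

def loss(X, Y):
    emb = C[X]                                            # (batch, blockSize, embDim)
    h = torch.tanh(emb.view(X.shape[0], -1) @ W1 + b1)    # hidden layer with non-linearity
    logits = h @ W2 + b2                                  # scores for the next character
    return F.cross_entropy(logits, Y)

X = torch.randint(0, vocabSize, (32, blockSize))          # a random batch of contexts
Y = torch.randint(0, vocabSize, (32,))                    # the characters that follow
print(loss(X, Y))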
The “AlphaGPT 0.03” neural network further increases in complexity compared to the previous versions. It has 51,958 parameters, more than double the parameters of the previous network. The architecture is deeper, with multiple hidden layers, each using batch normalization and a Tanh activation function. The model is trained for 200,000 iterations, a more extensive training process than before. This neural network includes visualizations of the training loss along with histograms of the weights and biases. Like the previous models, it generates city names, and it also evaluates the losses on the training, development (dev), and test sets. This model is designed to find more complex patterns and dependencies in the dataset, with a focus on deeper learning.
The “AlphaGPT 0.04” neural network is the most advanced model among the four. It has an architecture similar to the previous version but with increased complexity from its layer depth and size. It has 1,191,682 parameters and 1,602 neurons. The model includes an Embedding layer, Combine layers, multiple Linear layers, Batch Normalization, and Tanh activation functions. It is trained at a larger scale, with 200,000 iterations, for better performance. This model achieves the best loss and city name generation of the four. It includes visualizations of the training loss and histograms of the weights and biases.
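The deeper 0.03 and 0.04 style stacks can be sketched with standard PyTorch building blocks as repeated Linear, BatchNorm1d, and Tanh layers after an embedding layer. The layer count, context length, and hidden size below are illustrative assumptions, not the exact configurations of either model.

import torch
import torch.nn as nn

vocabSize, blockSize, embDim, numHidden = 66, 8, 24, 128

model = nn.Sequential(
    nn.Embedding(vocabSize, embDim),                     # embedding layer
    nn.Flatten(),                                        # combine the block of embeddings
    nn.Linear(blockSize * embDim, numHidden), nn.BatchNorm1d(numHidden), nn.Tanh(),
    nn.Linear(numHidden, numHidden), nn.BatchNorm1d(numHidden), nn.Tanh(),
    nn.Linear(numHidden, numHidden), nn.BatchNorm1d(numHidden), nn.Tanh(),
    nn.Linear(numHidden, vocabSize),                     # scores for the next character
)

X = torch.randint(0, vocabSize, (32, blockSize))         # a batch of 32 character contexts
logits = model(X)                                        # shape (32, vocabSize)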
Final Comparison of the 4 AlphaGPT Models (0.01, 0.02, 0.03, 0.04), using the Python library Matplotlib.
AlphaGPT 0.2
The Alpha Generative Pre-Trained Transformer (GPT) 0.2 [5][8] is a highly advanced neural network designed for generating scientific text. It has 10,782,809 parameters and 20,825 neurons. It is trained on 5,276,931 tokens. It's structured to find and understand complex patterns in text data. Here's an explanation of its architecture and components:
Neural Network Architecture
Input Layer: The model takes batches of batchSize = 48 sequences, each blockSize = 192 characters long, as input.
Embedding Layers: The input characters are first converted into integer indices, which the embedding layers map to vectors. The two main types of embeddings used are token and positional embeddings. Token embeddings turn each character in the input sequence into a vector. Positional embeddings are added on top of the token embeddings to give the model information about the position of each character in the sequence.
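A short sketch of this embedding step is shown below; the embedding dimension nEmbed and the vocabulary size are assumptions.

import torch
import torch.nn as nn

batchSize, blockSize, vocabSize, nEmbed = 48, 192, 100, 384

tokenEmbedding = nn.Embedding(vocabSize, nEmbed)         # one vector per character
positionEmbedding = nn.Embedding(blockSize, nEmbed)      # one vector per position

idx = torch.randint(0, vocabSize, (batchSize, blockSize))     # integer-encoded characters
tok = tokenEmbedding(idx)                                     # (batch, block, nEmbed)
pos = positionEmbedding(torch.arange(blockSize))              # (block, nEmbed)
x = tok + pos                                                 # positions added on top of tokens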
Multi-Head Self-Attention Mechanism: Multi-head self-attention allows the model to weigh the importance of different parts of the input differently. It is made up of multiple heads, each performing its own attention calculation to capture different relationships in the input data. Each head computes keys, queries, and values from the input, then uses the queries and keys to compute attention scores that determine how much importance to give each character in the input. These scores are applied to the values so that the more relevant characters contribute more strongly to the output.
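The sketch below shows one such attention head: keys, queries, and values are computed from the input, and the attention scores weight the values. The causal mask, which lets each character attend only to earlier characters, is the standard choice for a generative model and is assumed here, as is the head size.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Head(nn.Module):
    def __init__(self, nEmbed, headSize, blockSize):
        super().__init__()
        self.key = nn.Linear(nEmbed, headSize, bias=False)
        self.query = nn.Linear(nEmbed, headSize, bias=False)
        self.value = nn.Linear(nEmbed, headSize, bias=False)
        self.register_buffer("mask", torch.tril(torch.ones(blockSize, blockSize)))

    def forward(self, x):                                 # x: (batch, time, nEmbed)
        B, T, C = x.shape
        k, q, v = self.key(x), self.query(x), self.value(x)
        scores = q @ k.transpose(-2, -1) * k.shape[-1] ** -0.5        # (B, T, T)
        scores = scores.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        weights = F.softmax(scores, dim=-1)               # importance given to each character
        return weights @ v                                # weighted sum of the values

head = Head(nEmbed=384, headSize=64, blockSize=192)
out = head(torch.randn(48, 192, 384))                     # output shape (48, 192, 64)

Multi-head attention simply runs several of these heads in parallel and concatenates their outputs.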
Feed-Forward: After the attention calculation, the output is passed through a feed-forward layer, a small multilayer perceptron with a ReLU activation function.
Layer Normalization: Layer normalization is applied before each multi-head attention layer and feed-forward layer to stabilize and accelerate the training, similar to batch normalization in the previous models.
Residual Connections: These are used around each of the multi-head attention and feed-forward layers, allowing information to bypass those layers when needed, which helps preserve information over long sequences.
The multi-head self-attention mechanism, feed-forward, layer normalization, and residual connections pattern is repeated numLayer = 6 times.
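A sketch of this repeated block is shown below. PyTorch's built-in nn.MultiheadAttention stands in for the custom heads described above, and the layer sizes are illustrative assumptions.

import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, nEmbed, numHeads):
        super().__init__()
        self.ln1 = nn.LayerNorm(nEmbed)                   # layer norm before attention
        self.attention = nn.MultiheadAttention(nEmbed, numHeads, batch_first=True)
        self.ln2 = nn.LayerNorm(nEmbed)                   # layer norm before feed-forward
        self.feedForward = nn.Sequential(
            nn.Linear(nEmbed, 4 * nEmbed), nn.ReLU(), nn.Linear(4 * nEmbed, nEmbed))

    def forward(self, x):
        h = self.ln1(x)
        attnOut, _ = self.attention(h, h, h)              # multi-head self-attention
        x = x + attnOut                                   # residual connection
        x = x + self.feedForward(self.ln2(x))             # residual around feed-forward
        return x

blocks = nn.Sequential(*[Block(nEmbed=384, numHeads=6) for _ in range(6)])   # numLayer = 6
y = blocks(torch.randn(48, 192, 384))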
Final Layer Normalization: This layer works like the other normalization layers, normalizing the final output of the repeated blocks before it is passed to the final linear layer.
Final Linear Layer: The final result of the process from above goes through a final linear layer that transforms it back into a vector of size vocabularySize, providing scores for each possible next character.
Softmax Layer: The softmax layer converts the scores from the linear layer into probabilities, which are then used to predict the next character.
Final Output: The output is the vector of probabilities from the softmax layer. These probabilities are used to predict the next character in a given sequence. The model is repeatedly used to generate the number of characters needed for a specific task.
Training and Generation: The model learns by adjusting its parameters to minimize the difference between its predictions and the actual next characters, also known as minimizing the loss. It uses backpropagation with the AdamW optimizer and an exponential learning-rate decay applied at each iteration. After the model is trained, it is given a starting context and generates new text by running through the neural network repeatedly.
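The sketch below illustrates this training and generation loop with AdamW and an exponential learning-rate decay. The stand-in model (a simple embedding table), the random batch sampler, the iteration count, and the decay constant are assumptions used only to keep the example self-contained.

import torch
import torch.nn as nn
import torch.nn.functional as F

vocabSize, blockSize, maxIterations = 100, 192, 1000
model = nn.Embedding(vocabSize, vocabSize)        # stand-in for the full AlphaGPT 0.2 model

def getBatch():                                   # stand-in batch sampler with random data
    x = torch.randint(0, vocabSize, (48, blockSize))
    y = torch.randint(0, vocabSize, (48, blockSize))
    return x, y

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

for step in range(maxIterations):
    xb, yb = getBatch()
    logits = model(xb)                                        # (batch, block, vocabSize)
    loss = F.cross_entropy(logits.view(-1, vocabSize), yb.view(-1))
    optimizer.zero_grad(set_to_none=True)
    loss.backward()                                           # backpropagation
    optimizer.step()
    for group in optimizer.param_groups:                      # exponential learning-rate decay
        group["lr"] = 3e-4 * (0.9999 ** step)

# Generation: start from a context and repeatedly sample the next character.
context = torch.zeros((1, 1), dtype=torch.long)
for _ in range(200):
    logits = model(context[:, -blockSize:])                   # crop to the last blockSize chars
    probs = F.softmax(logits[:, -1, :], dim=-1)               # next-character probabilities
    nextChar = torch.multinomial(probs, num_samples=1)
    context = torch.cat((context, nextChar), dim=1)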
Generated Scientific Text from AlphaGPT 0.2
AlphaGPT 0.2’s Generated Scientific Text after 10 Iterations (Training 27.5 mins):
“rtheddsthhigin‚mpc(,stikvws0adasssAyyNKof'deriddrsKathissthasAgussangtrthangigrss,dabCAbylalhCciatintthiBapEclanthangthPys+baingAAn‚tstshalwwinniss)sadithatbarolitheabthan$anQanthithalryItihh‚atonsanthcindn.thyagiKZdazinthexy‚/1mssilashAgrawhPndussevaLeaKchinusadalingthCthnisindPa,Athseangheacsoornthangeanteageancsh,anpabheggegc‚theswaAinthessothustly”
AlphaGPT 0.2’s Generated Scientific Text after 100K Iterations (Training 3.76 hours):
“Hodanle Maps also helps social dimensions boat the game as four pitch yer. Chang, by Ghaba professional backgrounds, called for Rep. Schaimi, who is the lab in the United Association Program, New York many participants have asked students to thrive their first years ago, the researchers want to use deep networks for new jobs. Abook‚ ewarding in a few mirrors and 86 percent a step that. In each experiment, the researchers acked five seconds to explore the innervation of the 3D structure item. Beyond climate chips emerged from breaking cancer types of uncertainty.”
Project Alpha Phase I Conclusion
Based on the training and inference times, hardware demands, and output quality, I realized that it is more practical to use an already pre-trained transformer and then fine-tune it to work as an AI tutor, primarily because training a newly built GPT requires hundreds of GPU hours, which is not feasible with consumer-grade electronics.
Project Alpha Phase II Summary and Conclusion
For Project Alpha Phase II, I compared several Large Language Models (LLMs), including OpenAI's GPT series [19] and the LLAMA2 models [7]. This comparative analysis aims to assess the performance of these LLMs in interpreting and generating educational content, in this case focusing on science. I used a series of three tests (the 8th Grade Science STAAR [25], New York High School Physics [26], and New York High School Chemistry [27]) to measure the performance of each model.
In addition to the standard models, I applied a fine-tuning process to the locally run LLMs, such as the LLAMA2 models and Mistral 7B, to enhance their performance on the science tests. The fine-tuning involved adjusting the models' parameters, aiming to improve their accuracy in answering these tests.
I recorded the results using a Python script [9] for data visualization, utilizing the pandas library [12] for data formatting and matplotlib [11] for visual representation. The performance of each LLM was recorded and graphed in a bar chart to provide a direct comparison of the different tests for each of the models.
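A sketch of that plotting script is shown below; the scores in the table are placeholders chosen only to mirror the relative ordering described in the following paragraphs, not the measured results.

import pandas as pd
import matplotlib.pyplot as plt

# Placeholder scores (percent correct); the real values come from the test runs.
results = pd.DataFrame(
    {
        "Science STAAR 8th Grade": [100, 75, 40, 45, 65, 50],
        "NY High School Physics": [95, 70, 40, 45, 60, 50],
        "NY High School Chemistry": [100, 70, 25, 20, 20, 35],
    },
    index=["ChatGPT 4", "ChatGPT 3.5", "LLAMA2 7B", "LLAMA2 13B", "LLAMA2 70B", "Mistral 7B"],
)

ax = results.plot(kind="bar", figsize=(10, 5))        # grouped bar chart, one group per model
ax.set_ylabel("Accuracy (%)")
ax.set_title("LLM Performance on the Three Science Tests")
plt.tight_layout()
plt.show()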
Project Alpha Phase II provides an assessment of the models' abilities, with a focus on their application in educational science content. The results show that, of all the models tested, OpenAI's ChatGPT 4 [19] achieved a combined score of 295% out of a possible 300% (98.33%) across all three tests [25][26][27].
Large Language Models (LLMs) Test Results Diagram, using the Python library Matplotlib.
AlphaGPT achieved the highest accuracy, 99.17%, across all three tests.
ChatGPT 4 answered each question and provided an explanation of how the answer was reached. Although ChatGPT 4 missed one of the questions on the physics test, it gave the correct answer when asked the question a second time. In contrast, ChatGPT 3.5 provided only the answer to each question, with no explanation, and reached a combined score of 215% out of 300% (71.67%).
The LLAMA2 7B and 70B models only provided the answer to the questions, but the LLAMA2 70B model performed much better than the 7B model on the Science STAAR 8th Grade [25] and New York Physics [26] tests. LLAMA2 7B was more accurate than LLAMA2 70B on the New York Chemistry test [27].
The LLAMA2 13B model responded with the answer and an explanation for the chemistry test, and with the answer and the question for the physics test, showing some qualities of a good AI tutor. However, the model performed poorly on the tests and tied with LLAMA2 70B at 20% on the New York Chemistry test.
Mistral 7B responded with only the answer to each question on all the tests, but it still performed better than LLAMA2 7B on the New York Physics and New York Chemistry tests and also beat LLAMA2 13B and 70B on the New York Chemistry test.
Based on these results, I decided to build AlphaGPT, a helpful, supportive AI tutor and coding assistant for grades K-12, which uses ChatGPT 4 in the backend. I then used AlphaGPT to develop two iOS apps: Alpha Words, a multi-language vocabulary and creative writing platform that includes Latin and Mandarin, and Alpha Math & Science, which covers mental math, K-12 science topics, and best-selling book summaries.
I also went back and tested AlphaGPT with the same three standard tests [25][26][27], and it achieved 297.5% out of 300% (99.17%) across all three tests, slightly better than ChatGPT 4. AlphaGPT answered each question and provided an explanation of how the answer was reached. AlphaGPT reported an incorrect answer for one of the physics test questions but then stated the correct answer while explaining it. This is because GPT models do not reason about their answers; they simply predict the next token in a sequence and cannot go back and revise what they have already generated.
Project Alpha Phase III Summary
I developed three different innovative applications to revolutionize education using the power of Deep Machine Learning (DML) and Artificial Intelligence (AI).
AlphaGPT
AlphaGPT is an artificial intelligence model that uses OpenAI's GPT 4 [19] in the backend and is specifically fine-tuned for educational purposes in grades K-12. Although GPT 4 has been trained on a large portion of the internet, AlphaGPT follows strict guidelines to ensure students receive age-appropriate responses and to steer inappropriate conversations toward constructive and educational topics. It is designed to engage students in a supportive way, provide highly accurate educational information, respond with a student's age and knowledge in mind, and encourage curiosity. AlphaGPT also uses advanced natural language processing to create positive educational outcomes, emphasize safety, and personalize every student's education.
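The sketch below shows one way such guidelines could be enforced with a system prompt through OpenAI's Chat Completions API. The prompt wording, model string, and parameters here are illustrative assumptions, not the actual AlphaGPT configuration.

from openai import OpenAI

client = OpenAI()   # reads the API key from the OPENAI_API_KEY environment variable

systemPrompt = (
    "You are AlphaGPT, a supportive K-12 tutor. Always give age-appropriate, "
    "accurate answers, adjust explanations to the student's grade level, and "
    "redirect inappropriate requests toward constructive educational topics."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": systemPrompt},
        {"role": "user", "content": "Can you explain photosynthesis to a 5th grader?"},
    ],
    temperature=0.3,   # a lower temperature keeps answers focused and consistent
)
print(response.choices[0].message.content)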
Alpha Words iOS App
Alpha Words iOS App is the most comprehensive multi-language vocabulary app, providing thousands of the most used vocabulary words in seven languages (English, Mandarin, Latin, Spanish, French, Portuguese, and Italian) on iPhones, iPads, and Apple Silicon Macs. Alpha Words also covers the United States Spelling Bee words used in the 2024 Spelling Bee National Competitions. All USA Spelling Bee words are organized by grade level and then alphabetically, and each spelling word has up to 20 different fields, providing greater and deeper understanding.
Alpha Words covers Mandarin Chinese and the Simplified Chinese writing system. Common Chinese characters are presented with their Pinyin and meanings in five different languages, along with example sentences and pronunciations.
Alpha Words covers the most used Latin vocabulary found in English and the Romance languages. Each Latin word is paired with its English, Spanish, French, Portuguese, Italian, and Mandarin counterparts, along with examples and pronunciations.
Alpha Words provides a creative writing platform where students can write an original story using the words they learned in the app and publish it in six languages across 175 countries.
Alpha Words iOS App Architecture
Alpha Math & Science iOS App
Alpha Math & Science has 37 mental math common rules with multiple examples and testing for every rule. The app covers science topics for grades K-8 with multiple examples for every topic. The app also includes book summaries of books from bestselling authors.
Alpha Math & Science App Architecture
Project Alpha Conclusion
This research has explored the revolutionary potential of Generative Pre-trained Transformers (GPTs) within education, specifically looking at the AlphaGPT series and its innovative applications, AlphaGPT and the Alpha iOS apps. The paper dove into the architectures and functionalities of several neural network models, from the rudimentary AlphaGPT 0.01 to the more complex AlphaGPT 0.04 and on to the highly sophisticated AlphaGPT 0.2. These models establish an important milestone in understanding and generating educational text, showing how machine learning (ML) and artificial intelligence (AI) can potentially create safe, engaging, and personalized learning environments.
The comparative analysis of AlphaGPT against other large language models (LLMs), including OpenAI's GPTs and Meta's LLAMA2 models, provided important insight into their performance in understanding and generating educational content. AlphaGPT achieved 99.17% accuracy across all tested scenarios, highlighting its potential as a powerful tool for educational purposes.
The paper addressed important educational challenges, emphasized by the United Nations' alarming statistics on learning losses and the United States’ educational lag behind other developed countries. Through Project Alpha, innovative applications like AlphaGPT, Alpha Words, and Alpha Math & Science were developed, each contributing uniquely to revolutionizing the education field. These applications leverage the capabilities of GPTs to provide safe, engaging, and personalized learning experiences, emphasizing the transformative impact of AI in education.
Project Alpha is a significant stride towards using ML and AI to revolutionize education. The AlphaGPT series and its innovative applications offer a promising pathway towards addressing current educational challenges, promoting personalized learning, and inspiring continuous innovation in the education field. The next steps of Project Alpha will focus on refining and creating more of these applications, ensuring that the benefits of personalized education are accessible to all.
Project Alpha Next Steps
There are several possible next steps for Project Alpha, each requiring future funding; they are listed here from highest to lowest cost:
One direction is to build a better Transformer architecture that does not require hundreds of parallel GPU hours to train, and to train it on selective, high-quality educational datasets.
Another direction is to enhance LLAMA2 13B and train it with high-quality educational datasets to improve its accuracy, which will cost less than the first option because LLAMA2 13B is already pre-trained on 2 trillion tokens.
A third direction is to continue working with OpenAI's high-accuracy models and use them to produce more educational content for education apps, including Alpha Math & Science and Alpha Words, which incurs OpenAI's per-token API charges.
Research Equipment, Tools, and Programming Languages
Apple Equipment:
MacBook Pro M3 Max, MacBook Air M2, MacBook Pro Intel i9, iPad Pro 6th Generation, iPhone 11 Pro, iPhone 12 Pro, iPhone 14 Pro, and iPhone 15 Pro.
Programming Languages and Libraries:
Python, TensorFlow, Keras, NumPy, PyTorch, Pandas, Matplotlib, PyCharm IDE, Visual Studio Code IDE, Xcode IDE, LM Studio, Swift, SwiftUI, Google Cloud, and JSON.
Large Language Models:
GPT4 (1.76 Trillion Parameters), GPT3.5 (175B), LLAMA2 70B, LLAMA2 13B, LLAMA2 7B, Mistral (7B).
[7][9][10][11][12][13][14][15][16][17][18][19][21][22][23]
References:
[1] Education - United Nations Sustainable Development. Retrieved from https://www.un.org/sustainabledevelopment/education/
[2] OECD. (n.d.). Retrieved from https://www.oecd.org/
[3] SimpleMaps. (n.d.). World Cities Dataset. Retrieved from https://simplemaps.com/data/world-cities
[4] Dalal, D. (2023). MIT AI news published from 1993 to 2023 [Dataset]. Kaggle. Retrieved from https://www.kaggle.com/datasets/deepanshudalal09/mit-ai-news-published-till-2023/
[5] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. arXiv. https://doi.org/10.48550/arXiv.1706.03762
[6] Touvron, H., et al. (2023). LLaMA: Open and Efficient Foundation Language Models. arXiv. https://doi.org/10.48550/arXiv.2302.13971
[7] Meta. (n.d.). Llama 2: open source, free for research and commercial use. Retrieved from https://ai.meta.com/resources/models-and-libraries/llama/
[8] Karpathy, A. (n.d.). Andrej Karpathy GitHub. Retrieved from https://github.com/karpathy
[9] Python Software Foundation. (n.d.). Python Programming Language. Retrieved from https://www.python.org/
[10] PyTorch. (n.d.). PyTorch and Torch, Machine Learning Library in Python. Retrieved from https://pytorch.org/
[11] Hunter, J. D. (n.d.). Matplotlib, Visualization in Python Library. Retrieved from https://matplotlib.org/
[12] McKinney, W., & others. (n.d.). Pandas, Data Analysis Library in Python. Retrieved from https://pandas.pydata.org/
[13] Oliphant, T. E. (n.d.). NumPy, Arrays Library in Python. Retrieved from https://numpy.org/
[14] Apple Inc. (n.d.). Apple Laptops, iPads, and iPhones. Retrieved from https://www.apple.com/
[15] Apple Inc. (n.d.). Apple XCODE. Retrieved from https://developer.apple.com/xcode/
[16] Apple Inc. (n.d.). Apple Swift. Retrieved from https://developer.apple.com/swift/
[17] Apple Inc. (n.d.). Apple SwiftUI. Retrieved from https://developer.apple.com/xcode/swiftui/
[18] Google Cloud. (n.d.). Google Cloud. Retrieved from https://cloud.google.com/
[19] OpenAI. (n.d.). GPT 4 and Fine Tuning. Retrieved from https://openai.com/gpt-4
[20] Google. (n.d.). Google Bard. Retrieved from https://bard.google.com/
[21] JSON. (n.d.). JSON. Retrieved from https://www.json.org/json-en.html
[22] Microsoft. (2022). Microsoft Visual Studio 2022 IDE. Retrieved from https://visualstudio.microsoft.com/
[23] JetBrains. (n.d.). PyCharm IDE. Retrieved from https://www.jetbrains.com/pycharm/
[24] LM Studio. (n.d.). LLM Studio, Run Local LLMs. Retrieved from https://lmstudio.ai/
[25] Texas Education Agency. (2022). 8th Grade Texas STAAR Science Test 2022.
[26] The University of the State of New York. (2023). High School Physics Exam 2023. New York State Education Department. Retrieved December 16, 2023, from https://www.nysedregents.org/Physics/
[27] The University of the State of New York. (2023). High School Chemistry Exam 2023. New York State Education Department. Retrieved December 16, 2023, from https://www.nysedregents.org/Chemistry
Project Alpha completed by Adel S.
01/01/2024