AI Demystified #3 - Neural Networks Explained
Have you heard someone say "neural networks" and thought "What?" You'll find the answer here.
Welcome back to Tech Break by Friday. I’m Paraskevi Kivroglou, and this is episode 3 in our AI Demystified series. Today, we’re talking about a powerful term in the AI world: neural networks. Most people hear the term and wonder what it actually means. So let’s dive into the episode and figure it out.
Part 1: What is a Neural Network?
A neural network is a computational model inspired by the human brain—but let's be crystal clear from the start: it's inspired by biology, not a digital copy of it. This difference matters because one of the biggest misunderstandings is that these systems think like humans do.
Picture a network of nodes—we'll call them neurons—connected in layers. Each connection has a weight: a number that determines how much influence one neuron's signal has on the next. When you feed data into the network, it passes through these neurons layer by layer, getting transformed and refined along the way.
Without special mathematical functions called activation functions, neural networks would be limited to simple, straight-line relationships, like saying house prices only go up with size. Activation functions introduce complexity, allowing networks to learn curved, intricate patterns, like understanding that house prices depend on size, location, neighbourhood quality, and dozens of other factors in complex ways.
Think of activation functions as decision-makers in each neuron. They ask: "Is this information important enough to pass along?" This simple question, repeated millions of times across the network, enables the system to learn incredibly sophisticated patterns.
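To make that concrete, here's a minimal sketch of a single artificial neuron in Python with NumPy: a weighted sum of inputs followed by an activation function. The specific inputs, weights, and bias are made-up numbers, chosen only for illustration.

import numpy as np

def relu(x):
    # ReLU activation: passes positive values through, zeroes out the rest.
    # This is the "is this important enough to pass along?" gate.
    return np.maximum(0, x)

# Hypothetical inputs and weights, for illustration only.
inputs = np.array([0.5, -1.2, 3.0])   # data flowing into the neuron
weights = np.array([0.8, 0.1, -0.4])  # how much each input matters
bias = 0.2

# The neuron: weighted sum of inputs, then the activation decides what passes.
weighted_sum = np.dot(inputs, weights) + bias
output = relu(weighted_sum)
print(output)  # 0.0 here, because the weighted sum comes out negative

Swap relu for a straight pass-through and the network can only ever learn those straight-line relationships; the nonlinearity is what buys the curved, intricate patterns.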
Part 2: Anatomy of a Neural Network
Most neural networks have three types of layers:
Input Layer: where data enters.
Hidden Layers: where processing happens.
Output Layer: where the result comes out.
Think of it like making coffee:
The beans (raw data) go in.
They’re ground and brewed (hidden layers).
Then you get your espresso (output).
Each layer refines the input further, extracting more sophisticated features. Early layers might detect basic patterns, like edges in images or simple word combinations in text. Deeper layers recognise complex structures, like faces, objects, or nuanced language meaning.
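If you want to see the coffee-machine picture in code, here's a minimal sketch of data flowing through an input, one hidden, and an output layer. The layer sizes and random weights are arbitrary assumptions, just to show the shape of the computation.

import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0, x)

# Arbitrary sizes for illustration: 4 inputs -> 3 hidden neurons -> 1 output.
W1 = rng.normal(size=(4, 3))  # weights from input layer to hidden layer
b1 = np.zeros(3)
W2 = rng.normal(size=(3, 1))  # weights from hidden layer to output layer
b2 = np.zeros(1)

x = np.array([1.0, 0.5, -0.3, 2.0])  # the "beans": raw input data

hidden = relu(x @ W1 + b1)  # grinding and brewing: the hidden layer transforms the data
output = hidden @ W2 + b2   # the espresso: the final prediction
print(output)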
Part 3: Learning Through Experience
Neural networks learn through a process called training. First, they make a guess—say, recognizing a cat in an image. If they get it wrong, they use a method called backpropagation to correct themselves.
Step 1: Forward Pass
The network makes a prediction—say, identifying a cat in an image.
Step 2: Error Calculation
A loss function measures how wrong the prediction was. Think of this as a very precise grading system.
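As one concrete example of such a grading system, here's mean squared error, a common loss function (the right loss depends on the task; the numbers below are hypothetical):

import numpy as np

def mse_loss(predictions, targets):
    # Mean squared error: the average of the squared differences.
    # Bigger mistakes are punished much more than small ones.
    return np.mean((predictions - targets) ** 2)

# Made-up predictions vs. true answers, for illustration only.
predictions = np.array([0.9, 0.2, 0.8])
targets = np.array([1.0, 0.0, 1.0])
print(mse_loss(predictions, targets))  # 0.03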
Step 3: Backpropagation
Here's where the magic happens. The network traces back through every connection, asking: "How much did each connection contribute to this error?" It's like a detective investigating which decisions led to a mistake.
Step 4: Weight Adjustment
Using algorithms like gradient descent, the network adjusts millions of connection weights simultaneously—like fine-tuning a massive orchestra where every musician adjusts their performance based on the overall sound.
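Putting all four steps together, here's a minimal, hand-rolled training loop for a single neuron learning a made-up linear relationship. Real networks rely on frameworks like PyTorch to automate backpropagation across millions of weights; this sketch just spells out the gradients for one weight and one bias so the steps are visible.

import numpy as np

# Made-up training data following y = 2x + 1.
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([1.0, 3.0, 5.0, 7.0])

w, b = 0.0, 0.0          # start from arbitrary parameters
learning_rate = 0.05

for epoch in range(500):
    # Step 1: forward pass - make predictions.
    preds = w * xs + b
    # Step 2: error calculation - mean squared error.
    errors = preds - ys
    loss = np.mean(errors ** 2)
    # Step 3: backpropagation - how much did w and b contribute to the error?
    grad_w = np.mean(2 * errors * xs)  # derivative of the loss with respect to w
    grad_b = np.mean(2 * errors)       # derivative of the loss with respect to b
    # Step 4: weight adjustment - nudge each parameter downhill (gradient descent).
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b)  # ends up close to 2 and 1

This toy version has exactly two parameters; a real network repeats the same recipe across millions of weights at once, which is why the orchestra analogy fits.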
The Training Challenge: Networks need massive amounts of data. GPT-4, for instance, was reportedly trained on approximately 13 trillion tokens of text, roughly equivalent to millions of books. But here's the critical issue: if that training data contains human biases about gender, race, or culture, the AI learns and can amplify those biases. Researchers need to keep this in mind at every stage, because these biases have real ethical consequences.
Part 4: Common Misconceptions
Let’s bust a few myths:
Myth 1: "Neural networks work like human brains"
Reality: They're mathematical models inspired by brain concepts, but operate very differently. Your brain uses electrical signals, hormones, and incredibly complex biological processes that we barely understand.
Myth 2: "Bigger networks are always better"
Reality: Bigger isn't always better. Larger models can suffer from overfitting—imagine studying by memorizing every practice test question instead of understanding underlying principles. The model performs perfectly on training data but fails on new problems. Balance is key.
Myth 3: "AI understands meaning like humans do"
Reality: Neural networks excel at pattern recognition and statistical relationships, but they don't "understand" in the human sense. When ChatGPT discusses emotions, it's processing patterns in text about emotions, not experiencing feelings.
Myth 4: "Neural networks are infallible"
Reality: They inherit biases from training data and can make confident-sounding but completely wrong predictions. This is why human oversight remains crucial, especially in critical applications.
That wraps up today's deeper dive into neural networks on Tech Break by Friday. I hope you now see these systems not as mysterious black boxes, but as powerful, evolving tools with both incredible capabilities and significant limitations.
Don't forget to subscribe and share this episode with someone curious about the AI revolution happening around us.
Until next Friday, keep learning, stay curious, and keep building.
Thanks for reading Tech Break by Friday! Subscribe for free to receive new posts and support my work.
Sources:
https://www.datacamp.com/tutorial/introduction-to-activation-functions-in-neural-networks
https://encord.com/blog/activation-functions-neural-networks/
https://internationalpubls.com/index.php/anvi/article/download/2537/1671/5042
https://www.numberanalytics.com/blog/neural-networks-educational-innovation
https://www.cs.cmu.edu/~bhiksha/courses/deeplearning/Fall.2016/pdfs/Simard.pdf
https://www.ironhack.com/gb/blog/artificial-intelligence-breakthroughs-a-look-ahead-to-2024
https://www.linkedin.com/advice/3/what-best-practices-designing-neural-networks
https://www.assemblyai.com/blog/ai-trends-graph-neural-networks
https://the-decoder.com/gpt-4-architecture-datasets-costs-and-more-leaked/
https://www.linkedin.com/pulse/unlocking-power-gpt-dive-gpt-4-architecture-pinil-dissanayaka-cw2nc
https://towardsdatascience.com/convolutional-neural-networks-explained-9cc5188c4939/
https://hackernoon.com/the-next-era-of-ai-inside-the-breakthrough-gpt-4-model
https://egusphere.copernicus.org/preprints/2024/egusphere-2024-1823/egusphere-2024-1823.pdf
https://www.cow-shed.com/blog/a-neural-network-and-overfitting-analogy
https://stats.stackexchange.com/questions/452842/analogy-for-the-process-of-neural-networks
https://web.stanford.edu/~jlmcc/papers/LampinenHsuMcC17AnalogiesCogSciProc.pdf
https://www.linkedin.com/pulse/neural-network-factory-analogy-asif-shah-jstkf
https://www.ucl.ac.uk/news/2024/dec/bias-ai-amplifies-our-own-biases
https://developers.google.com/machine-learning/crash-course/neural-networks/activation-functions
https://www.linkedin.com/pulse/neural-network-best-practices-leaders-guide-ratnesh-pandey-lxwnc
https://developers.google.com/machine-learning/crash-course/neural-networks/nodes-hidden-layers