AI Scope

Why Neural Networks?

🔬 Modern AI is powered by neural networks. But before they became the engines behind LLMs and vision models, they started from something much simpler: a mathematical imitation of the brain. To appreciate the cutting edge, we need to grasp the basics.

🦴 @scopeofai | #concepts


What Does a Neuron Actually Do?

In biology, a neuron receives signals through its dendrites, processes them, and fires an output if the combined signal is strong enough.
An artificial neuron mimics this:

1️⃣ Inputs come in as numbers (features).

2️⃣ Each input is multiplied by a weight (its importance).

3️⃣ All weighted inputs are added together.

4️⃣ A bias is added (it shifts the point at which the neuron responds).

5️⃣ Finally, an activation function decides the output.

This simple unit is the building block of everything from perceptrons to GPT-4.
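The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the sigmoid activation and the specific input, weight, and bias values are assumptions chosen only for the example.

```python
import math

def sigmoid(z: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Steps 1-5: weight each input, sum, add the bias, apply the activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# Hypothetical values, chosen only for illustration.
output = neuron(inputs=[1.0, 0.5], weights=[0.8, -0.4], bias=0.1)
print(round(output, 3))  # 0.668
```

Swapping `sigmoid` for another activation (for example a step function or ReLU) changes step 5 only; the weighted sum and bias stay the same.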

🦴 @scopeofai | #concepts


How Does a Neural Network Learn?

Learning means adjusting weights and biases so that predictions match reality.

1. The input goes through the network.

2. The network produces an output.

3. The output is compared with the correct answer (the label).

4. The error (loss) is calculated.

5. The weights are updated to reduce that error.

🌀 Repeat this thousands (or millions) of times, and the network gradually internalizes the patterns in the data. This is called training.
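As a rough sketch of that cycle, here is a single linear neuron learning a line from four points. The data, the squared-error loss, the learning rate, and the number of passes are all assumptions made for this toy example.

```python
# Toy data following y = 2x + 1 (hypothetical, for illustration only).
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

w, b = 0.0, 0.0          # start with no knowledge
learning_rate = 0.05     # assumed step size

for epoch in range(2000):
    for x, target in data:
        prediction = w * x + b        # steps 1-2: input in, output out
        error = prediction - target   # step 3: compare with the label
        loss = error ** 2             # step 4: squared-error loss
        # Step 5: move each parameter against its gradient.
        # d(loss)/dw = 2 * error * x, d(loss)/db = 2 * error
        w -= learning_rate * 2 * error * x
        b -= learning_rate * 2 * error

print(round(w, 3), round(b, 3))  # approaches 2.0 and 1.0
```

Real networks follow the same loop, only with many more parameters and with gradients computed automatically.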

🦴 @scopeofai | #concepts


Weights and Biases: The Hidden Levers

🔹 Weight: determines how much each input matters.

🔹 Bias: a constant that shifts the entire function.

Without bias, networks would be far less flexible.
Think of weight as the volume knob, and bias as shifting the baseline level.

🔈 Imagine a neuron that takes inputs, multiplies each by its weight, and sums them up. Without bias, that weighted sum is always exactly zero when all inputs are zero, no matter what the weights are. That limits the network and prevents it from learning certain patterns.

Weight = strength of influence of an input

Bias = the baseline or starting point of the neuron

Without bias, a neuron's decision boundary is forced to pass through the origin (0,0). With bias, it can shift that boundary and make decisions wherever needed.
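A one-input example makes the shift concrete. The sketch below assumes a step activation (fire when the weighted sum is at least zero) and hypothetical weight and bias values.

```python
def fires(x, weight, bias=0.0):
    """Step-activation neuron: outputs 1 when weight * x + bias >= 0."""
    return 1 if weight * x + bias >= 0 else 0

# Without bias, the switch point is stuck at x = 0 for any nonzero weight.
print([fires(x, weight=1.0) for x in [-1, 1, 3]])             # [0, 1, 1]

# With bias = -2, the switch point moves to x = 2.
print([fires(x, weight=1.0, bias=-2.0) for x in [-1, 1, 3]])  # [0, 0, 1]
```

Changing the weight alone can flip or rescale the response, but only the bias can move where the neuron switches on.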

🦴 @scopeofai | #concepts


Backpropagation: The Real Breakthrough

Training requires knowing which weights to tweak and by how much. That's where backpropagation enters.

The network makes a prediction, and we compute the error (the gap between the correct answer and the network's answer). From there, it's essentially calculus at scale:

Compute the gradient of the loss function with respect to every weight.

🖇 Use the chain rule to propagate errors backward from the output layer, through the hidden layers, toward the input.

Update each parameter in proportion to its contribution to the error.

📍 Picture the network as a team that finished a project together and got the wrong result. Everyone looks back, works out how much they contributed to the mistake, and adjusts their part accordingly.

Backpropagation, popularized in the 1980s, is what made training multi-layer networks practical. Without it, we wouldn't have today's deep learning.
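To see the chain rule at work, here is a deliberately tiny network: one input, one hidden sigmoid unit, one linear output. The architecture, the squared-error loss, and every number are assumptions for illustration. The hand-derived gradients are checked against a numerical estimate.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, w2):
    h = sigmoid(w1 * x)   # hidden activation
    y = w2 * h            # network output
    return h, y

def loss_fn(x, target, w1, w2):
    _, y = forward(x, w1, w2)
    return (y - target) ** 2

def backward(x, target, w1, w2):
    """Apply the chain rule from the loss back to each weight."""
    h, y = forward(x, w1, w2)
    dL_dy = 2 * (y - target)           # loss -> output
    dL_dw2 = dL_dy * h                 # output -> w2
    dL_dh = dL_dy * w2                 # output -> hidden activation
    dL_dw1 = dL_dh * h * (1 - h) * x   # hidden activation -> w1
    return dL_dw1, dL_dw2

# Hypothetical values, chosen only for illustration.
x, target, w1, w2 = 1.5, 0.2, 0.4, -0.7
g1, g2 = backward(x, target, w1, w2)

# Numerical check: nudge each weight slightly and measure the change in loss.
eps = 1e-6
n1 = (loss_fn(x, target, w1 + eps, w2) - loss_fn(x, target, w1 - eps, w2)) / (2 * eps)
n2 = (loss_fn(x, target, w1, w2 + eps) - loss_fn(x, target, w1, w2 - eps)) / (2 * eps)
print(abs(g1 - n1) < 1e-6, abs(g2 - n2) < 1e-6)  # True True
```

Deep learning frameworks automate exactly this bookkeeping across millions of weights.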

🦴 @scopeofai | #concepts


The Learning Rate: The Dial of Progress

When updating weights, we don't apply the raw gradient. We multiply it by a small constant: the learning rate (η).

⚫️ Too high → the network overshoots, oscillates, or fails to converge.

⚪️ Too low → training crawls and may never reach a good solution.

Tuning the learning rate is both an art and a science. Modern optimizers (Adam, RMSProp, etc.) adapt the effective step size during training, aiming for both speed and stability.
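The effect is easy to reproduce on a toy problem. The sketch below runs plain gradient descent on f(w) = w², whose gradient is 2w and whose minimum is at w = 0. The function, the starting point, and the three η values are assumptions picked to show each regime.

```python
def descend(learning_rate, start=1.0, steps=20):
    """Run plain gradient descent on f(w) = w**2 and return the final w."""
    w = start
    for _ in range(steps):
        gradient = 2 * w
        w -= learning_rate * gradient
    return w

too_high = descend(1.1)     # each step overshoots further: diverges
reasonable = descend(0.1)   # shrinks steadily toward the minimum at 0
too_low = descend(0.001)    # has barely moved from the start after 20 steps

print(too_high, reasonable, too_low)
```

Adaptive optimizers such as Adam do not remove the need for a learning rate; they rescale the step for each parameter using running statistics of the gradients.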

🦴 @scopeofai | #concepts


Building Our First Neural Network (Perceptron)

The simplest neural network is the perceptron. Imagine two inputs feeding into one output neuron.

Training it on logic gates (like OR/AND) is the classic exercise. You feed in all possible inputs, compare the output to the truth table, and adjust the weights until the perceptron reproduces the table exactly.

This shows the full learning cycle in miniature:

Inputs → weighted sum → activation → output → error → weight update.

From here, stacking perceptron-like units into layers gives multi-layer networks (MLPs), which can approximate almost any function.
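Here is one way that exercise might look in code, using the classic perceptron update rule. The zero initialization, the learning rate of 0.5, and the 20 passes over the data are assumptions; other choices work as well.

```python
def train_perceptron(samples, learning_rate=0.5, epochs=20):
    """Classic perceptron rule: nudge the weights only when the output is wrong."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            output = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = target - output            # -1, 0, or +1
            w[0] += learning_rate * error * x1
            w[1] += learning_rate * error * x2
            b += learning_rate * error
    return w, b

def predict(w, b, x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
OR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

for name, gate in (("AND", AND), ("OR", OR)):
    w, b = train_perceptron(gate)
    print(name, [predict(w, b, x1, x2) for (x1, x2), _ in gate])
# AND [0, 0, 0, 1]
# OR [0, 1, 1, 1]
```

A single perceptron can only learn gates whose outputs can be separated by a straight line, which is why XOR needs more than one layer.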

🦴 @scopeofai | #concepts


https://medium.com/data-science/first-neural-network-for-beginners-explained-with-code-4cfd37e06eaf

Medium

First neural network for beginners explained (with code)

Understand and create a Perceptron


why-language-models-hallucinate.pdf


🧩 Let's analyze OpenAI's latest paper together: an interesting read that tackles hallucination in large language models and explains why it happens at all...


Abstract: The Big Mystery

🔬 Large language models are impressive: they write essays, code, even poetry.
But there's a catch: they hallucinate. They make things up with full confidence.

This paper asks a hard question: why does this happen, even in the biggest and most advanced models?

The promise: by the end, we'll see hallucination not as a random glitch, but as something deeply tied to how these models are trained.

🔰 @scopeofai | #papers


Introduction

🔹 Hallucination isn't rare.

🔹 It's not just random noise.

It comes from the training recipe itself: models are taught to predict the next word, not to tell the truth.

So the puzzle: how can a system so good at language fail at facts?

🔰 @scopeofai | #papers


Related Work: Previous Clues

Before this paper, researchers gave several explanations:

▫️ Maybe the model just doesn't have the right knowledge.

▫️ Maybe it's too "overconfident" in its outputs.

▫️ Maybe the training data was too limited.

This paper says those are partial answers. The deeper reason is structural. Hallucinations aren't only knowledge gaps; they're baked into the way we train LLMs.

🔰 @scopeofai | #papers


Methodology: Into the Lab

To go beyond speculation, the authors set up controlled experiments.

♦️ They feed models factual questions with known, verifiable answers.

⬅️ They log what the model generates.

🔁 They compare each response to the truth.

They dig into the token probabilities during generation to see why the wrong choice was made.

Think of it like opening up the model's brain and watching its thought process in slow motion.

🔰 @scopeofai | #papers


Results: The Strange Discovery

Here's the twist:

🔸 Models hallucinate even when they've seen the correct fact before.

Why? Because when generating text, fluency beats factuality.

The model often assigns a higher probability to a smooth-sounding wrong answer than to a clunky correct one.

Scaling up (making the model bigger) doesn't solve it. In some cases, bigger models hallucinate more.

That's the uncomfortable truth: hallucinations are not a sign of ignorance, they're a side effect of the training objective.

🔰 @scopeofai | #papers


Discussion: The Heart of the Matter

🪝 So what's really going on?

The model's only goal is to predict the next word.

"Truth" isn't part of the equation.

If a wrong word fits better into a sentence, probability pushes the model there.

This reframes hallucination as a trade-off:

If you want smooth, human-like text, hallucination risk increases.

If you want pure factuality, you'd need a different training paradigm.
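A toy illustration of "probability pushes the model there": the scores below are invented for this example and do not come from the paper or from any real model. The point is only that picking the most probable next word never consults the truth.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores.values())
    exps = {token: math.exp(s - top) for token, s in scores.items()}
    total = sum(exps.values())
    return {token: e / total for token, e in exps.items()}

# Hypothetical scores for the word after "The capital of Australia is".
# Invented numbers: the familiar-sounding wrong answer scores highest.
scores = {"Sydney": 2.1, "Canberra": 1.6, "Melbourne": 0.9}

probs = softmax(scores)
chosen = max(probs, key=probs.get)   # greedy decoding: take the top word
print(chosen)  # Sydney
```

Nothing in this selection step knows that Canberra is the correct answer; it only knows which word scored highest.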

🔰 @scopeofai | #papers


Conclusion: What Next?

The paper closes with a sober message: hallucinations won't magically vanish.

Possible fixes:

❇️ Retrieval-Augmented Generation (RAG): let the model consult an external source while writing.

✳️ Factual calibration: tune outputs to favor truth over style.

❎ Hybrid training: reward not just language fluency, but factual grounding.

Until then, using raw LLM output without external checks means living with hallucinations.

🔰 @scopeofai | #papers


Final Takeaway

🔅 Hallucinations are not a bug. They're the shadow side of next-word prediction.
As long as models are optimized for fluency, not truth, they will invent.

The paper changes how we see the problem: fixing hallucination isn't about "more data" or "bigger models"; it's about rethinking the very goal of training.

🔰 @scopeofai | #papers


What Is a Context Window?

🪟 The context window (also called "context length") is how much text a large language model (LLM) can "see" or "remember" at once, measured in units called tokens.

It's like human working memory: it lets the model use prior parts of a conversation or document when generating output.
IBM

If you give a prompt + conversation that exceed the context window, the extra parts have to be truncated (cut off) or summarized.
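Truncation can be sketched as keeping only the most recent tokens. This is a simplified illustration: it assumes whitespace-separated words stand in for tokens (real tokenizers split text into subword pieces), and `fit_to_window` is a hypothetical helper, not part of any library.

```python
def fit_to_window(tokens, window_size):
    """Keep only the most recent tokens that fit in the context window."""
    if window_size <= 0:
        return []
    if len(tokens) <= window_size:
        return list(tokens)
    return tokens[-window_size:]   # drop the oldest tokens first

# Whitespace splitting is a stand-in for a real tokenizer.
history = "the quick brown fox jumps over the lazy dog".split()
print(fit_to_window(history, window_size=4))  # ['over', 'the', 'lazy', 'dog']
```

Dropping the oldest text is only one strategy; summarizing earlier turns, as mentioned above, trades some detail for keeping the gist.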

🦴 @scopeofai | #concepts


♨️ Bigger context windows let LLMs handle longer inputs (long documents, large codebases, chat histories) without forgetting early details.

Models with larger context windows tend to be more coherent, hallucinate less, and give more accurate responses when prompts are long.

But a larger context window has trade-offs: more computation, higher memory use, higher cost, and potentially slower responses. Security risks also grow, since adversarial prompt injections can hide more easily inside long inputs.
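One reason the computation grows quickly: in standard (dense) self-attention, every token is compared with every other token, so the number of attention scores per layer and head grows with the square of the context length. The sketch below only counts those scores; it assumes plain dense attention and ignores the optimizations many real systems use.

```python
def attention_score_entries(num_tokens):
    """Pairwise scores in one dense self-attention layer (per head)."""
    return num_tokens * num_tokens

for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} tokens -> {attention_score_entries(n):,} scores")
```

Under this assumption, making the context 10 times longer means roughly 100 times as many scores to compute.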

🦴 @scopeofai | #concepts
