#01 Review.
"An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale"
Tags: #ML #CV
🧐 At a glance:
The authors show that a pure transformer encoder can be applied to image classification. The intuition is the following: describe an image as a sequence of vectors, feed in a lot of data, and see what happens. Spoiler: they beat previous SOTA results with a model pretrained on large-scale image datasets.
github - https://github.com/google-research/vision_transformer
paper - https://arxiv.org/abs/2010.11929
🤿 Motivation:
- Transformers dominate in solving NLP tasks.
- Can we adapt this approach to images and replace standard CNN architectures?
- Transformers have a global receptive field, in contrast to the local receptive fields of CNNs. This might be helpful for many kinds of tasks.
The authors tested these hypotheses on commonly used large image datasets.
🍋 Main Ideas
[For visualization purposes see figure 1 in attached poster]
1) Patch extraction
Split the image into a grid of 16x16 patches. Each patch is flattened into a 16*16*3 = 768-dimensional vector, giving a sequence of N patch vectors.
A linear embedding then maps each flattened patch to a token of fixed dimension.
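As an illustration only (the official repo is in JAX/Flax; tensor names and sizes here are my own), a minimal PyTorch sketch of patch extraction and the linear patch embedding:

```python
import torch
import torch.nn as nn

# Hypothetical example: one 224x224 RGB image, 16x16 patches, 768-dim tokens.
img = torch.randn(1, 3, 224, 224)                 # (batch, channels, height, width)
patch_size, embed_dim = 16, 768

# Cut the image into a 14x14 grid of non-overlapping 16x16 patches
# and flatten each patch into a 16*16*3 = 768-dimensional vector.
patches = img.unfold(2, patch_size, patch_size).unfold(3, patch_size, patch_size)
patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(1, -1, 3 * patch_size * patch_size)

# Linear embedding: map every flattened patch to a token of size embed_dim.
patch_embed = nn.Linear(3 * patch_size * patch_size, embed_dim)
tokens = patch_embed(patches)                     # (1, 196, 768)
```

In practice this operation is often implemented as a single Conv2d with kernel size and stride equal to the patch size, which is equivalent but faster.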
2) Learnable position embedding:
It is necessary to retain spatial information. The authors use learnable position embeddings instead of fixed ones (e.g. sinusoidal), and show that learnable embeddings work better than fixed ones.
❕ Positional embeddings are vectors added to the sequence of tokens to encode position information.
3) Learnable classification token [CLS].
An additional learnable token is prepended to the image tokens. The motivation is that this token can aggregate global information from the other tokens and serve as the input to the classification head.
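Continuing the sketch above (same `tokens`, `embed_dim` and imports; again just an illustration, not the authors' code), the [CLS] token and the learnable position embeddings are added before the transformer encoder:

```python
# One learnable [CLS] token and one learnable position embedding per token
# (patch tokens plus the CLS token itself).
cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
pos_embed = nn.Parameter(torch.zeros(1, tokens.shape[1] + 1, embed_dim))

cls = cls_token.expand(tokens.shape[0], -1, -1)   # broadcast over the batch
x = torch.cat([cls, tokens], dim=1)               # (1, 197, 768)
x = x + pos_embed                                 # inject position information
# x goes into a standard transformer encoder; the output at position 0
# (the CLS token) is fed to the classification head.
```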
📈 Experiment insights / Key takeaways:
- First, Vision Transformers dominate ResNets on the performance/computation trade-off: ViT uses approximately 2-4x less compute to attain the same performance (averaged over 5 datasets).
- Second, hybrids slightly outperform ViT at small computational budgets, but the difference vanishes for larger models. This result is somewhat surprising, since one might expect convolutional local feature processing to assist ViT at any size. Third, Vision Transformers appear not to saturate within the range tried, motivating future scaling efforts.
- Position embeddings are tied to a fixed input size. To fine-tune at a different resolution, the authors 2D-interpolate the pretrained positional embeddings to the new patch grid (see the sketch after this list).
- It is possible to analyze the attention weights of the [CLS] token and obtain semantically consistent attention areas.
- They tested self-supervised pretraining → it works worse than supervised pretraining (but better than no pretraining at all).
- Training parameters:
- Linear learning rate warmup and decay
- Adam with batch size 4096 and a high weight decay of 0.1 → improves transferability.
- Dataset descriptions:
- ImageNet - 1k classes, 1.3M images
- ImageNet-21k - 21k classes, 14M images
- JFT - 18k classes, 303M images (high resolution)
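A hedged sketch of the interpolation trick mentioned above (function and variable names are mine, not the paper's): the pretrained per-patch position embeddings are laid back on their 2D grid, resized with bicubic interpolation, and flattened again.

```python
import torch
import torch.nn.functional as F

def resize_pos_embed(pos_embed, old_grid, new_grid):
    """Interpolate ViT position embeddings to a new patch-grid size.
    pos_embed: (1, 1 + old_grid*old_grid, dim); index 0 is the [CLS] embedding."""
    cls_pe, patch_pe = pos_embed[:, :1], pos_embed[:, 1:]
    dim = patch_pe.shape[-1]
    # Put the per-patch embeddings back on their 2D grid: (1, dim, old, old).
    patch_pe = patch_pe.reshape(1, old_grid, old_grid, dim).permute(0, 3, 1, 2)
    # Bicubic 2D interpolation to the new grid size.
    patch_pe = F.interpolate(patch_pe, size=(new_grid, new_grid),
                             mode="bicubic", align_corners=False)
    # Flatten back to a sequence and re-attach the CLS embedding.
    patch_pe = patch_pe.permute(0, 2, 3, 1).reshape(1, new_grid * new_grid, dim)
    return torch.cat([cls_pe, patch_pe], dim=1)

# Example: pretrained at 224px (14x14 patches), fine-tuned at 384px (24x24 patches).
new_pe = resize_pos_embed(torch.randn(1, 1 + 14 * 14, 768), old_grid=14, new_grid=24)
print(new_pe.shape)  # torch.Size([1, 577, 768])
```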
✏️ My Notes:
- The application of transformers to image classification shows their potential for other fields as well. They could be especially useful in time-series analysis, since self-attention captures the interactions between input components. The same goes for the [CLS] token, which should capture features describing global class attributes.
- The capacity of transformer models is so large that they can approximate much more complicated functions than CNNs. All we need to do is feed in enough data! As a result, everything should work well.
- We can try to adopt this model for brain signals analysis:
- We can represent brain activity as a sequence of tokens (an electrode feature vector at each time step)
- The problem is how to get enough data to pretrain the model. Maybe it is not essential, as the functions approximated in BCI research may be simpler and require fewer parameters.
🖼 Paper Poster: see next post
"An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale"
Tags: #ML#CV
🧐 At a glance:
The authors provide the first evidence of transformer encoder application for image classification. The intuitive is the following: describe image as a sequence of vectors, feed a lot of data and see what happens. Spoiler: they beat previous SOTA results with model pretrained on largescale image datasets.
github - https://github.com/google-research/vision_transformer
paper - https://arxiv.org/abs/2010.11929]
🤿 Motivation:
- Transformers dominate in solving NLP tasks.
- Can we adopt this approach to image and replace standard CNN approaches?
- Transformer has more global receptive field in comparison with CNN models. It might be helpful for any kind of tasks.
The authors tested these hypotheses on commonly used large image datasets
🍋 Main Ideas
[For visualization purposes see figure 1 in attached poster]
1) Patch extraction
Apply sliding window for 16x16 extraction windows. Then we flat into N=16*16*3D vectors.
Then use linear embeddings for mapping into certain size of tokens.
2) Learnable position embedding:
It is necessary to retain 2D space information. Authors use learnable position embeddings instead of using fixed position embeddings (e.g. sin, cos). Also they show that learnable embeddings work better than fixed.
❕ Positional embeddings - vectors which we add to sequence of tokens to code the position information
3) Learnable classification token [CLS].
Concatenate additional learnable token to image tokens. Motivation is that token can capture global information aggregated from the other tokens.
📈 Experiment insights / Key takeaways:
- First, Vision Transformers dominate ResNets on the performance/computation trade-off. ViT uses approximately 2 - 4 less computation to attain the same performance (average over 5 datasets).
- Second, hybrids slightly outperform ViT at small computational budgets, but the difference vanishes for larger models. This result is somewhat surprising, since one might expect convolutional local feature processing to assist ViT at any size. Third, Vision Transformers appear not to saturate within the range tried, motivating future scaling efforts
- Position embedding can not work with variable input size. They do 2D-interpolation of positional embeddings for fine tuning on smaller or bigger data sets.
- It is possible to analyze weight in CLS token and obtain semantically consistent areas.
- They tested self supervised → do not work well (but better than w/o)
- Training parameters:
- Linear learning rate warmup and decay
- Adam with batch = 4096, high weight_decay = 0.1 → improves transferability.
- Dataset description.
- ImageNet - 1k classes 1.3m images
- ImageNet21 - 21k calsses with 14M images
- JFT - 18k classes with 303M images( high resollution)
✏️ My Notes:
- The application of transformers to image classification shows their potential to be applied to other fields as well. Especially it can be used in time-series analysis as self-attention captures the interaction between input components. As well as the CLF token - which should capture features describing global class attributes.
- Capacity of transformer models is so large that we can approximate much more complicated functions than CNN. All we should do is just feed enough data! As a result everything should work well.
- We can try to adopt this model for brain signals analysis:
- We can represent brain activity as a sequence of tokens (electrode feature vector on each time step)
- The problem is how do we get so much data to pretrain the model? Maybe it is not essential as the complexity of approximated functions in BCI research requires less parameters.
🖼 Paper Poster: see next post
Forwarded from Interesting papers and links 🧐
Nature Electronics: "Neural recording and stimulation using wireless networks of microimplants" - wirelessly powered microchips, with an ~1 GHz electromagnetic transcutaneous link to an external telecom hub, can be used for multichannel in vivo neural sensing...
#02 Review.
Brain stimulation and imaging methods. Part 1.
#neuroscience #neurostimulation
"Transcranial Focused Ultrasound (tFUS) and Transcranial Unfocused Ultrasound (tUS) Neuromodulation: From Theoretical Principles to Stimulation Practices"
With this review I start a series of articles on state-of-the-art brain stimulation and imaging methods that show promise for application in next-generation brain-computer interface devices 😇
🧐 At a glance:
Neuroscience research and applications are based on physical methods to record and manipulate brain activity. This review covers the mechanisms, problems, and applications of a recently proposed non-invasive, spatially and temporally selective brain modulation method: transcranial focused ultrasound stimulation (tFUS).
paper - https://www.frontiersin.org/articles/10.3389/fneur.2019.00549/full
⌛️Prerequisites:
I assume readers are familiar with basic notions of neuroscience and neuroimaging:
- Neural cells, neuronal membrane
- Neuroscience methods: fMRI, EEG, transcranial magnetic stimulation (TMS)
---
🚀 Motivation:
We can now record brain activity, interpret it and extract useful information fairly well. If we further wish to engineer a bidirectional brain communication device, we need a means of focused brain stimulation. Ideally, it should convey information directly into the brain or modulate its activity in a predictable way. Non-invasive methods, such as transcranial electrical/magnetic stimulation, stimulate superficial layers in a non-specific way, while optogenetics is invasive, requires genome editing and is not portable. In this review, a recently developed method of non-invasive focused ultrasound stimulation is broken down into pieces.
🔍 Main Ideas:
1) How tFUS works
Multiple ultrasound sources emit signals that interfere at the point of focus, selectively affecting a small brain region. Modern algorithms make precise calculation of the focal point possible by taking the skull's acoustic properties into account.
Main mechanisms by which sound modulates the electrical activity of neural cells:
- Mechanical stretching of membrane causing capacitive current
- Activation of mechanoreceptors changing membrane permeability for different ions
Thus, mechanical forces deform the neuronal membrane and cause electrical changes, which constitute brain activity.
2) Potential problems
- Heating
- Formation of cavities (mechanical pressure)
These can lead to local tissue damage. The good news is that the stimulation parameters can be adjusted to avoid these issues.
3) tFUS features:
😃 Non-invasive - does not require surgery
😃 Allows targeting deeper structures
🙁 Non-cell type specific - affects all cells in a region of focus
🙄 Modulatory - changes population activity: it can cause excitation or inhibition and thereby affects information flow, but does not convey information itself
🙂 Spatial resolution <1cm
🙂 Time resolution <1 sec, effects can last up to 2 hours
😃 Portable and affordable device
4) Experimental insights:
Focused ultrasound can facilitate targeted drug delivery! "FUS in combination with microbubbles administered intravenously can open the blood-brain barrier, in a targeted, non-invasive, safe, and reversible manner"
tFUS modulates EEG oscillatory dynamics and affects individual components of somatosensory evoked potentials - this suggests it alters sensory stimulus processing
tFUS in somatosensory cortex caused sensations and in visual cortex - phosphenes (similar to TMS)
tFUS affects BOLD (fMRI) responses both in cortical and deeper structures
Applying tFUS to the right prefrontal cortex altered mood and influenced functional connectivity (study)
📈 Possibility for development:
In combination with approaches from optogenetics, it has been shown possible to develop proteins that are sensitive to ultrasound, making cell-specific manipulation of brain activity possible (sonogenetics).
#02 Review.
Brain stimulation and imaging methods. Part 2.
📝 My Notes:
- I'm particularly excited about the possible sonogenetic approach: it overcomes the pitfalls of both optogenetics (invasive) and tFUS (non-specific), allowing for precise information transfer in the brain
- tFUS resembles TMS a lot (modulatory, non-cell-type-specific), although it has two advantages: stimulation of deeper structures and better spatial specificity. I'd say it's the next step in non-invasive neuromodulation!
🐎 Further reading:
Physics underlying tFUS
Original ultrasound focusing method article (2001)
----------------
Author: @Altime
🔥 Next review is gonna be about a very cool method of manipulation of cell activity by light - optogenetics. Stay tuned!
#03 Review. Part 1.
Imagined speech can be decoded from low- and cross-frequency intracranial EEG features.
🧐 At a glance:
Brain-computer interfaces (BCIs) let you use your brain activity to control devices: a computer, a prosthesis, or even a speech vocoder. Researchers are actively investigating the capabilities of such interfaces. In this paper, the researchers investigated the possibility of developing "language prostheses".
paper: link
code: link
data: not available, but you can try asking the authors → link
🤿 Motivation:
This article is about decoding imagined speech. Speech retrieval is one of the most significant and challenging tasks in BCI. We can observe significant progress in explicit (overt) speech decoding. In other words, a person says the words out loud and we can tell from the brain activity what they said.
However, it's more difficult to decode imagined speech. The problem is that it's not clear where to measure brain activity and how to process it (which features to use). These are the questions the authors try to answer:
What brain regions have the best decoding potential?
What's the most informative neural feature?
🍋 Main Ideas
Neuroscience views:
Actually, it's not easy to say how speech is made in our brains. Several theories exist.
Motor hypothesis. Imagined speech and overt speech share a similar articulatory plan in the brain.
Abstraction hypothesis. We can produce imagined speech without an explicit motor plan.
Flexible abstraction hypothesis. Imagined speech is phoneme-based (the sounds of language). In this case, the neural activity depends on how each person imagines speech: through subarticulation or perceptually.
Experiment description. What did they do?
Electrocorticography (ECoG) - electrodes placed directly on the brain (like EEG, but without the annoying scalp in between). It is an invasive procedure.
Patients with ECoG perform language tasks. There are 3 studies with different experimental protocols and different ECoG positions. In these tasks, participants imagine/speak/listen to certain words after a cue.
Feature extraction.
Frequency decomposition. Extract 4 frequency bands.
Cross-frequency coupling (CFC) links activity occurring at different rates (frequencies). The authors use the coupling between the phase of one band and the amplitude of another.
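A common way to quantify such phase-amplitude coupling is the mean-vector-length modulation index (whether this is the exact estimator used in the paper is my assumption; it is shown here only to make the idea concrete):

$$\mathrm{MI} = \left| \frac{1}{T} \sum_{t=1}^{T} a_{\mathrm{high}}(t)\, e^{i\,\phi_{\mathrm{low}}(t)} \right|$$

where $a_{\mathrm{high}}(t)$ is the amplitude envelope of the higher-frequency band and $\phi_{\mathrm{low}}(t)$ is the instantaneous phase of the lower-frequency band. If the high-frequency amplitude is consistently large at a particular low-frequency phase, the complex vectors align and MI grows; if amplitude is unrelated to phase, they cancel out and MI stays near zero.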
Analysis.
Compare brain activity during different tasks (listening vs speaking vs imagining).
Determine the influence of each region and feature on word decoding accuracy.
📈 Experiment insights / Key takeaways:
Researchers found that overt and imagined speech production have different dynamics and neural organization. The biggest difference between them was that broadband high-frequency activity (BHA) in the superior temporal cortex increased during overt speech but decreased during imagined speech. The high-frequency band is the best for telling the difference between overt and imagined speech.
This means that transferring language decoding from overt to imagined speech might be tricky (we cannot simply train a model on overt speech and use it for imagined speech decoding).
Differences and similarities.
- Superior temporal lobe: BHA increased during overt speech but decreased during imagined speech.
- Sensorimotor region: BHA increased in both cases.
- Left inferior and right anterior temporal lobe: strong CFC involving the theta phase in both cases.
#03 Review. Part 2.
Imagined speech can be decoded from low- and cross-frequency intracranial EEG features.
Decoding features part.
They showed that the high-frequency band (BHA) provides the best performance for overt speech decoding.
Low-frequency bands decode imagined and overt speech at approximately the same level.
The beta band is a good feature for decoding imagined speech, both in terms of power and CFC. Decoding imagined speech is largely possible thanks to CFC.
Decoding worked better when using not only the articulatory (sensorimotor) cortex: imagined speech appears to be defined at the phonemic rather than purely motor level.
ECoG signal analysis.
Signal processing
- Remove DC shifts with a high-pass filter at 0.5 Hz + notch filters at 60 Hz, 120 Hz, and so on.
- Common average re-referencing + downsampling to 400 Hz (with anti-aliasing).
- Morlet wavelet transform to extract 4 bands: theta (4–8 Hz), low-beta (12–18 Hz), low-gamma (25–35 Hz), and broadband high-frequency activity (BHA, 80–150 Hz).
- Cross-frequency coupling. They use phase-amplitude coupling, computed between the phase of one band and the amplitude of a higher-frequency band. It measures the interaction between frequency bands - in this case, how strongly the higher-frequency oscillation is locked to the lower-frequency phase. A minimal processing sketch follows below.
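A minimal sketch of this chain for a single ECoG channel using numpy/scipy (the sampling rate, filter orders, wavelet width and the mean-vector-length PAC estimator are my illustrative choices, not necessarily the authors' exact parameters):

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch, decimate

fs = 2000.0                                    # assumed raw sampling rate, Hz
x = np.random.randn(int(60 * fs))              # placeholder for one 60-s ECoG channel

# 1) Remove DC drift (high-pass at 0.5 Hz) and line noise (notch at 60 Hz + harmonics).
b, a = butter(4, 0.5, btype="highpass", fs=fs)
x = filtfilt(b, a, x)
for f0 in (60, 120, 180):
    bn, an = iirnotch(f0, Q=30, fs=fs)
    x = filtfilt(bn, an, x)

# 2) Downsample to 400 Hz; scipy's decimate applies an anti-aliasing filter internally.
x = decimate(x, int(fs // 400))
fs = 400.0

# 3) Complex Morlet "filtering": convolving with a complex wavelet at a band's centre
#    frequency gives an analytic signal whose magnitude is the amplitude envelope
#    and whose angle is the instantaneous phase.
def morlet_analytic(sig, fs, freq, n_cycles=7):
    sd = n_cycles / (2 * np.pi * freq)                     # wavelet width in seconds
    t = np.arange(-4 * sd, 4 * sd, 1 / fs)
    wavelet = np.exp(2j * np.pi * freq * t) * np.exp(-t**2 / (2 * sd**2))
    wavelet /= np.abs(wavelet).sum()
    return np.convolve(sig, wavelet, mode="same")

theta = morlet_analytic(x, fs, freq=6)         # theta band (4-8 Hz), centre ~6 Hz
bha = morlet_analytic(x, fs, freq=115)         # BHA (80-150 Hz), centre ~115 Hz

# 4) Phase-amplitude coupling: mean vector length between theta phase and BHA amplitude.
pac = np.abs(np.mean(np.abs(bha) * np.exp(1j * np.angle(theta))))
print(f"theta-BHA PAC (mean vector length): {pac:.4f}")
```

On real data the bands would be covered with several wavelet centre frequencies and the PAC value compared against surrogate (time-shifted) data, but the skeleton stays the same.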
✏️ My Notes:
Firstly, I think it is important that the authors studied the neuroscience side of language decoding and did not focus on algorithm development. However, decoding accuracy is not very high, and it would be interesting to apply advanced ML algorithms to their datasets.
Improvements and next steps:
- It is essential to develop adaptive algorithms, since ECoG electrode positions differ significantly across patients.
- It would be interesting to explore how neural networks could learn such coupling automatically; transformers should be investigated for that purpose.
- For CFC calculation, it might be useful to use transfer entropy (or other causality metrics) between phases and amplitudes, since it is a time-resolved measure.
- We could also use a dynamic, time-resolved PAC algorithm for online CFC extraction. site
Author: @koval_alvi
Medium : link
#03 Review. Part 2.
Imagined speech can be decoded from low- and cross-frequency intracranial EEG features.
🔥Visual part
#04 Review.
Neurons learn by predicting future activity
📑 paper (Nature Machine Intelligence, 2022)
💡 "Neurons have intrinsic predictive learning rule, which is updating synaptic weights (strength of connections) based on minimizing “surprise”: difference between actual and predicted activity. This rule optimizes energy balance of a neuron"
What did the authors do to show that this could indeed be the case?
🔥 Read the full review using free Medium link:
https://medium.com/@timchenko.alexey/bdb51a7a00cf?source=friends_link&sk=97920dd2d602e9187bd8fabeb1b39a0b
Feel free to comment on anything that caught your attention or you didn't quite understand. I want these reviews to be concise and clear, so your feedback is highly appreciated ☺️
#05 Review.
#bci #deeplearning
Neuroprosthesis for Decoding Speech in a Paralyzed Person with Anarthria → paper
Cool video about it → video
🧐 At a glance:
Anarthria (the inability to articulate speech) makes it hard for paralyzed people to interact with the world. The opportunity to decode words and sentences directly from cerebral activity (ECoG) could give such patients a way to communicate.
The authors build an AI model to predict words from neural activity. They achieve 98% accuracy for speech detection and 47% for word classification across 50 classes.
🔥 Read the full review using free Medium link → medium
That is a very fascinating topic.
Here is a link to the research. Check it out if you are interested ☺️
Link https://www.sciencealert.com/completely-locked-in-patient-with-als-communicates-again-with-a-brain-transplant
Hi everyone, we're really glad you're reading us. There are already 120 🔥🔥
We have great news. We have created a blog on Medium where all our reviews are collected in one place and categorized.
Articles will also continue to be published on Telegram.
If you have a Medium subscription, follow this link and subscribe.
Any feedback is welcome!
https://medium.com/the-last-neural-cell