Scientific Programming
Tutorials and applications from scientific programming

https://github.com/Ziaeemehr
Applications for Students & Teaching Assistants are Open!

1️⃣ 3-week Courses (July 8 - 26, 2024):
- Computational Neuroscience: Explore the intricacies of the brain's computational processes and join an engaging community of learners.
- Deep Learning: Delve into the world of machine learning, uncovering the principles and applications of deep learning.

2️⃣ 2-week Courses (July 15 - 26, 2024):
- Computational Tools for Climate Science: Uncover the tools and techniques driving climate science research in this dynamic two-week course.
- NeuroAI - Inaugural Year!: Be part of history as we launch our first-ever NeuroAI course, designed to explore the intersection of neuroscience and artificial intelligence.

https://neuromatch.io/courses/
Incremental principal component analysis (IPCA) is typically used as a replacement for principal component analysis (PCA) when the dataset to be decomposed is too large to fit in memory.

IPCA builds a low-rank approximation for the input data using an amount of memory which is independent of the number of input data samples. It is still dependent on the input data features, but changing the batch size allows for control of memory usage.

I have made some changes to the example from the sklearn documentation so that one does not need to load the whole dataset into memory.

import numpy as np
from sklearn.decomposition import IncrementalPCA

ipca = IncrementalPCA(n_components=n_components)
X_ipca = np.zeros((X.shape[0], n_components))

# First pass: fit incrementally on batches of 50 samples
for i in range(3):
    ipca.partial_fit(X[i*50:(i+1)*50])

# Second pass: transform each batch with the fitted model
for i in range(3):
    X_ipca[i*50:(i+1)*50] = ipca.transform(X[i*50:(i+1)*50])




GitHub
How can you create an audiobook with a natural human voice and a customized accent? Let's say you have an EPUB file and you're tired of the robotic voice generated by common text-to-speech (TTS) systems. One of the most advanced TTS technologies available today is provided by OpenVoice. You can find more information about it here.

It performs optimally with a GPU, but it also runs on a CPU. To use it on your own machine, simply set up a virtual environment and install the package. You'll also need to download a few additional files. I'm currently using the basic setup with the default voice, but the ability to clone any voice is an incredibly exciting feature.

Follow the demo1 notebook: extract the text from your EPUB and replace the sample text with your favourite book (a sketch of the extraction step follows below).

You may need to split the book into several chapters so it fits into GPU memory and the job doesn't get killed.
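
A minimal sketch of the extraction step, assuming the ebooklib and beautifulsoup4 packages are installed (the file names and chapter-splitting logic here are illustrative, not part of OpenVoice):

import ebooklib
from ebooklib import epub
from bs4 import BeautifulSoup

def epub_chapters(path):
    # Each document item in the EPUB is, roughly, one chapter
    book = epub.read_epub(path)
    for item in book.get_items_of_type(ebooklib.ITEM_DOCUMENT):
        soup = BeautifulSoup(item.get_content(), "html.parser")
        text = soup.get_text(separator=" ", strip=True)
        if text:
            yield text

# Write each chapter to its own file, to feed to the TTS one at a time
for i, chapter in enumerate(epub_chapters("book.epub")):
    with open(f"chapter_{i:03d}.txt", "w") as f:
        f.write(chapter)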

It took me about 10 minutes to make an audiobook from Shogun, a novel of about 500 pages.
How to Use ZSH Auto-suggestions?

ZSH is a popular Unix shell that extends the Bourne shell. It comes packed with features and improvements over Bash.
If you already use zsh as your default shell, just run:



# Linux
git clone https://github.com/zsh-users/zsh-autosuggestions ~/.zsh/zsh-autosuggestions
# add to .zshrc
source ~/.zsh/zsh-autosuggestions/zsh-autosuggestions.zsh
# Mac
brew install zsh-autosuggestions
# add to .zshrc
source $(brew --prefix)/share/zsh-autosuggestions/zsh-autosuggestions.zsh


Read more here.
Also for #JAX 😢
Credit: Geek_code
JAX is an open-source Python library developed by Google for high-performance numerical computing, especially suited for machine learning and scientific computing. It provides a combination of automatic differentiation, just-in-time compilation, and support for GPU/TPU acceleration, making it particularly well-suited for scalable and efficient computation on large datasets. JAX is built on top of the XLA (Accelerated Linear Algebra) compiler and is heavily inspired by NumPy, making it easy 🤨 for users familiar with NumPy to transition to JAX.
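
A minimal taste of those three pieces (autodiff, JIT, and the NumPy-style API); the function here is just an illustrative example:

import jax
import jax.numpy as jnp

# A NumPy-style function: sum of squares
def f(x):
    return jnp.sum(x ** 2)

# Differentiate it automatically, then JIT-compile the gradient
grad_f = jax.jit(jax.grad(f))

print(grad_f(jnp.arange(3.0)))  # [0. 2. 4.], i.e. 2*x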

Let's practice some JAX:
I recommend starting with this repo and the accompanying series of videos.

Videos
GitHub

Then you can move on to the deep learning book.

Deep Learning with JAX
Workshop JAX
I found this to be quite useful, and it might be beneficial for you as well. There's a YouTube video course available, along with a GitHub page focusing on R and Python.
The PDF can be found here: @reza_jafari_ai
PyGWalker: A Python Library for Exploratory Data Analysis with Visualization
docs
colab
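
A minimal usage sketch, assuming you have a pandas DataFrame (the CSV file name is just a placeholder):

import pandas as pd
import pygwalker as pyg

df = pd.read_csv("data.csv")

# Opens an interactive, drag-and-drop exploration UI (e.g., in a notebook)
walker = pyg.walk(df)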
Vectorizing in JAX

import jax

def dot(v1, v2):
    return jax.numpy.dot(v1, v2)

1️⃣ Naively vectorizing

dot_naive = [dot(v1, v2) for v1, v2 in zip(v1s, v2s)]

2️⃣ Manual vectorizing

import jax.numpy as jnp

def dot_vectorized(v1s, v2s):
    return jnp.einsum("ij,ij->i", v1s, v2s)

3️⃣ Automatic vectorizing


dot_vmapped = jax.vmap(dot)
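
For the timings below, v1s and v2s are batches of vectors; a minimal setup sketch (the shapes are assumptions, not necessarily the ones used in the notebook):

import jax

key1, key2 = jax.random.split(jax.random.PRNGKey(0))
v1s = jax.random.normal(key1, (1000, 100))  # 1000 vectors of length 100
v2s = jax.random.normal(key2, (1000, 100))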

Timing

%timeit [dot(v1, v2) for v1, v2 in zip(v1s, v2s)]
%timeit dot_vectorized(v1s, v2s).block_until_ready()
%timeit dot_vmapped(v1s, v2s).block_until_ready()
# 5.15 ms ± 54.3 µs per loop
# 135 µs ± 171 ns per loop
# 543 µs ± 1.38 µs per loop

Adding JIT
dot_vectorized_jitted = jax.jit(dot_vectorized)
dot_vmapped_jitted = jax.jit(dot_vmapped)

Timing

%timeit dot_vectorized_jitted(v1s, v2s).block_until_ready()
%timeit dot_vmapped_jitted(v1s, v2s).block_until_ready()
# 6.5 µs ± 12.9 ns per loop
# 6.39 µs ± 13.4 ns per loop



Notebook
Virgool
Here are some of the most important and frequently used commands in scikit-learn (sklearn):

1. Model Selection:
- train_test_split(): Split arrays or matrices into random train and test subsets.
- cross_val_score(): Evaluate a score by cross-validation.
- GridSearchCV(): Exhaustive search over specified parameter values for an estimator.
- StratifiedKFold(): Provides train/test indices to split data into train/test sets while maintaining class distribution.

2. Preprocessing:
- StandardScaler(): Standardize features by removing the mean and scaling to unit variance.
- MinMaxScaler(): Transform features by scaling each feature to a given range.
- OneHotEncoder(): Encode categorical integer features as one-hot numeric arrays.

3. Model Building:
- LinearRegression(): Ordinary least squares Linear Regression.
- LogisticRegression(): Logistic Regression (for classification tasks).
- RandomForestClassifier(): Random Forest Classifier.
- RandomForestRegressor(): Random Forest Regressor.
- GradientBoostingClassifier(): Gradient Boosting Classifier.
- GradientBoostingRegressor(): Gradient Boosting Regressor.
- DecisionTreeClassifier(): Decision Tree Classifier.

4. Model Evaluation:
- accuracy_score(): Accuracy classification score.
- precision_score(), recall_score(), f1_score(): Compute precision, recall, F-measure, and support for classification.
- mean_squared_error(): Mean squared error regression loss.
- r2_score(): R^2 (coefficient of determination) regression score function.

5. Pipeline and Feature Union:
- Pipeline(): Chain multiple estimators into one.
- FeatureUnion(): Combine several transformer objects into a new transformer.

6. Dimensionality Reduction:
- PCA(): Principal Component Analysis.
- TruncatedSVD(): Dimensionality reduction using truncated singular value decomposition.

7. Clustering:
- KMeans(): K-Means clustering.
- AgglomerativeClustering(): Agglomerative hierarchical clustering.

These are just a few of the many functionalities provided by scikit-learn for machine learning tasks.
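
A minimal sketch tying a few of these together, assuming a feature matrix X and labels y are already defined:

from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Hold out a test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Chain scaling and a classifier into a single estimator
model = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression()),
])
model.fit(X_train, y_train)

print(accuracy_score(y_test, model.predict(X_test)))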
Seaborn is a popular Python visualization library built on top of Matplotlib. Here are some of the most frequently used functions in Seaborn:

1. Data Visualization:
- sns.scatterplot(): Scatter plot.
- sns.lineplot(): Line plot.
- sns.barplot(): Bar plot.
- sns.countplot(): Count plot.
- sns.boxplot(): Box plot.
- sns.violinplot(): Violin plot.
- sns.heatmap(): Heatmap.
- sns.pairplot(): Pairwise plot.
- sns.jointplot(): Joint plot.
- sns.distplot(): Distribution plot (deprecated in recent versions; use sns.displot() or sns.histplot()).
- sns.regplot(): Regression plot.

2. Styling and Aesthetics:
- sns.set_style(): Set aesthetic style of plots.
- sns.set_context(): Set the context for plot elements.
- sns.set_palette(): Set the color palette for the plot.

3. Categorical Data Visualization:
- sns.catplot(): Figure-level interface for drawing categorical plots.
- sns.factorplot(): Draw categorical plots onto a FacetGrid (older name for sns.catplot()).

4. Matrix Plots:
- sns.clustermap(): Plot a matrix dataset as a hierarchically-clustered heatmap.
- sns.heatmap(): Plot rectangular data as a color-encoded matrix.

5. Time Series Visualization:
- sns.tsplot(): Time series plot (removed in modern Seaborn; use sns.lineplot()).

6. Faceting:
- sns.FacetGrid(): Multi-plot grid for plotting conditional relationships.

7. Regression Plots:
- sns.lmplot(): Plot data and regression model fits across a FacetGrid.
- sns.regplot(): Plot data and a linear regression model fit.

8. Distribution Plots:
- sns.distplot(): Flexibly plot a univariate distribution of observations (deprecated; see above).
- sns.kdeplot(): Fit and plot a univariate or bivariate kernel density estimate.
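
A minimal sketch combining a few of these, using the tips example dataset that ships with seaborn:

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_style("whitegrid")
tips = sns.load_dataset("tips")

# Scatter plot colored by a categorical column
sns.scatterplot(data=tips, x="total_bill", y="tip", hue="time")
plt.show()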
*️⃣ Data Science Dojo has added more than 43 data sets to this repository.
1️⃣ The repository covers a diverse range of themes, difficulty levels, sizes, and attributes.
2️⃣ They offer hands-on practice to boost your skills in exploratory data analysis, data visualization, data wrangling, and machine learning.
3️⃣ The data sets have been sorted by increasing level of difficulty for convenience (Beginner, Intermediate, Advanced).
https://code.datasciencedojo.com/datasciencedojo/datasets
Python examples for beginners.📌📌.pdf
Python 100 programs.
Partial functions allow us to fix a certain number of arguments of a function and generate a new function.

from functools import partial

def add(a, b, c):
    return 100 * a + 10 * b + c

# A partial function with b = 1 and c = 2
add_part = partial(add, c=2, b=1)

# Calling the partial function; only a remains to be supplied
print(add_part(3))  # 100*3 + 10*1 + 2 = 312
Post-doctoral position in Marseille.

Project Title: Higher-order interactions in human brain networks supporting causal learning
I have a Python package, let's call it my_package. How can I implement a function so that calling my_package.tests() runs all of its tests?


## Create a tests Module

1. In your my_package directory, create a new directory called tests.
2. Inside the tests directory, create an empty __init__.py file to make it a Python package.
3. Create a new Python file, e.g., test_suite.py, where you will define your test suite.

## Define the tests() Function

1. In the test_suite.py file, import the necessary testing framework (e.g., unittest or pytest).
2. Define a function called tests() that will run all the tests in your package.

Here's an example using the unittest framework:


import unittest
from . import test_module1, test_module2

def tests():
    # Collect the tests from each test module into one suite
    suite = unittest.TestSuite()
    suite.addTests(unittest.TestLoader().loadTestsFromModule(test_module1))
    suite.addTests(unittest.TestLoader().loadTestsFromModule(test_module2))

    # Run the suite with verbose output
    runner = unittest.TextTestRunner(verbosity=2)
    runner.run(suite)


test_module1.py is something like this:


import unittest
import numpy as np

class test_module_add(unittest.TestCase):
    def test_add(self):
        self.assertEqual(np.add(1, 2), 3)


In this example, the tests() function creates a TestSuite object, adds tests from the test_module1 and test_module2 modules, and then runs the test suite using a TextTestRunner.

## Import the tests() Function

1. In your my_package/__init__.py file, import the tests() function from the test_suite.py file:


from .tests.test_suite import tests


Now, you can call the tests() function from your package like this:


import my_package
my_package.tests()


This will run all the tests in your my_package package.
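
If you prefer pytest (mentioned above), a minimal alternative sketch for the tests() function; it assumes pytest is installed and the tests live in my_package/tests:

import os

def tests():
    # Run everything under the package's tests directory with pytest
    import pytest
    tests_dir = os.path.join(os.path.dirname(__file__), "tests")
    return pytest.main([tests_dir, "-v"])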