Data Science | Machine Learning with Python for Researchers
32.7K subscribers
3.33K photos
126 videos
23 files
3.55K links
ads: @HusseinSheikho

The Data Science and Python channel is for researchers and advanced programmers

Buy ads: https://telega.io/c/dataScienceT
OrientedFormer: An End-to-End Transformer-Based Oriented Object Detector in Remote Sensing Images


Publication: IEEE Transactions on Geoscience and Remote Sensing, 2024

Topic: Object detection

Paper: https://arxiv.org/pdf/2409.19648v1.pdf

GitHub: https://github.com/wokaikaixinxin/OrientedFormer

Description:

In this paper, we propose an end-to-end transformer-based oriented object detector consisting of three dedicated modules. First, Gaussian positional encoding encodes the angle, position, and size of oriented boxes using Gaussian distributions. Second, Wasserstein self-attention introduces geometric relations and facilitates interaction between content and positional queries by using Gaussian Wasserstein distance scores. Third, oriented cross-attention aligns values and positional queries by rotating the sampling points around each positional query according to its angle.
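
To make the Gaussian Wasserstein distance mentioned above concrete, here is a minimal sketch. It assumes the common convention of modelling an oriented box (cx, cy, w, h, theta) as a 2-D Gaussian whose mean is the box center and whose covariance is R diag(w²/4, h²/4) Rᵀ, and it uses the closed form of the squared 2-Wasserstein distance between two Gaussians. The function names are illustrative, and how the paper turns this distance into attention scores is not shown here.

```python
import math


def box_to_gaussian(cx, cy, w, h, theta):
    """Map an oriented box to (mean, covariance) of a 2-D Gaussian.

    The covariance R diag(w^2/4, h^2/4) R^T is returned as (sxx, sxy, syy).
    This box-to-Gaussian convention is an assumption of the sketch.
    """
    c, s = math.cos(theta), math.sin(theta)
    a, b = w * w / 4.0, h * h / 4.0
    sxx = a * c * c + b * s * s
    syy = a * s * s + b * c * c
    sxy = (a - b) * c * s
    return (cx, cy), (sxx, sxy, syy)


def gwd_squared(box_a, box_b):
    """Squared 2-Wasserstein distance between the Gaussians of two boxes."""
    (mx1, my1), (a1, b1, d1) = box_to_gaussian(*box_a)
    (mx2, my2), (a2, b2, d2) = box_to_gaussian(*box_b)
    center_term = (mx1 - mx2) ** 2 + (my1 - my2) ** 2
    trace_sum = a1 + d1 + a2 + d2
    # For 2x2 covariances, Tr((S1^1/2 S2 S1^1/2)^1/2) equals
    # sqrt(Tr(S1 S2) + 2 * sqrt(det(S1) * det(S2))).
    trace_prod = a1 * a2 + 2.0 * b1 * b2 + d1 * d2
    det_prod = (a1 * d1 - b1 * b1) * (a2 * d2 - b2 * b2)
    cross = math.sqrt(max(trace_prod + 2.0 * math.sqrt(max(det_prod, 0.0)), 0.0))
    return center_term + trace_sum - 2.0 * cross


same = gwd_squared((0, 0, 4, 2, 0.3), (0, 0, 4, 2, 0.3))
shifted = gwd_squared((0, 0, 4, 2, 0.3), (3, 4, 4, 2, 0.3))
print(f"identical boxes: {same:.6f}, shifted box: {shifted:.6f}")
```

Because the distance is computed on Gaussians, a box rotated by 180 degrees maps to the same distribution, which avoids the angle-periodicity problem of comparing raw box parameters.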

https://t.iss.one/DataScienceT ✅
🌟 INTELLECT-1: Release of the first decentralized learning model.

PRIME Intellect has published INTELLECT-1 ( Instruct + Base ), the first 10 billion parameter language model collaboratively trained in 50 days by 30 participants worldwide.

PRIME Intellect used its own PRIME platform, designed to address the main problems of decentralized learning: network unreliability and dynamic management of computing nodes.

The platform utilized a network of 112 H100 GPUs across 3 continents and achieved a compute utilization rate of 96% under optimal conditions.

The training corpus consisted of 1 trillion tokens from public datasets, in the following proportions: 55% fineweb-edu, 10% fineweb, 20% Stack V1, 10% dclm-baseline, 5% open-web-math.
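
As a quick check on the mixture above, the sketch below converts the quoted percentages into per-source token budgets. The total and the percentages come from the post; the function and variable names are illustrative only.

```python
# Total and percentages as quoted in the post.
TOTAL_TOKENS = 1_000_000_000_000
MIXTURE_PERCENT = {
    "fineweb-edu": 55,
    "fineweb": 10,
    "Stack V1": 20,
    "dclm-baseline": 10,
    "open-web-math": 5,
}


def tokens_per_source(total_tokens, mixture_percent):
    """Split a token budget across sources according to integer percentages."""
    if sum(mixture_percent.values()) != 100:
        raise ValueError("mixture percentages must sum to 100")
    # Integer arithmetic avoids floating-point rounding on large counts.
    return {name: total_tokens * pct // 100 for name, pct in mixture_percent.items()}


budget = tokens_per_source(TOTAL_TOKENS, MIXTURE_PERCENT)
for name, count in budget.items():
    print(f"{name}: {count // 1_000_000_000}B tokens")
```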

▢️ Technical specifications:

🟒 Parameters: 10B;
🟒 Layers: 42;
🟒 Attention Heads: 32;
🟒 Hidden Size: 4096;
🟒 Context Length: 8192;
🟒 Vocabulary Size: 128256.

INTELLECT-1 achieved 37.5% accuracy on the MMLU test and 72.26% on HellaSwag, and outperformed several other open-source models on WinoGrande with a score of 65.82%.

While these figures lag slightly behind today's popular models, the results of the experiment are a critical step toward democratizing AI development and preventing the consolidation of AI capabilities within a few organizations.

▢️ GGUF quantized versions of INTELLECT-1_Instruct in 3-bit (5.46 GB) to 8-bit (10.9 GB) bit depths from the LM Studio community.

▢️ Example of inference on Transformers:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

torch.set_default_device("cuda")
model = AutoModelForCausalLM.from_pretrained("PrimeIntellect/INTELLECT-1")
tokenizer = AutoTokenizer.from_pretrained("PrimeIntellect/INTELLECT-1")

input_text = "%prompt%"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output_ids = model.generate(input_ids, max_length=50, num_return_sequences=1)
output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(output_text)


📌 Licensing: Apache 2.0 License.


🟡 Article
🟡 HF Model Kit
🟡 Set of GGUF versions
🟡 Technical report
🟡 Demo
🖥 GitHub

https://t.iss.one/DataScienceT ✅
❇️ AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction πŸ”₯


πŸ”— Discover More:
  *  Github Link
  *  Project Page: AniGS
  *  Paper: Read the paper

https://t.iss.one/DataScienceT βœ…
This channel is for programmers, coders, and software engineers.

0️⃣ Python
1️⃣ Data Science
2️⃣ Machine Learning
3️⃣ Data Visualization
4️⃣ Artificial Intelligence
5️⃣ Data Analysis
6️⃣ Statistics
7️⃣ Deep Learning
8️⃣ Programming Languages

✅ https://t.iss.one/addlist/8_rRW2scgfRhOTc0

✅ https://t.iss.one/Python53
🌟 BioNeMo: A Framework for Developing AI Models for Drug Design.

NVIDIA BioNeMo2 Framework is a set of tools, libraries, and models for computational drug discovery and design.

It accelerates the most time-consuming and expensive steps in building and adapting biomolecular AI models by providing optimized models and tools that are easily integrated into GPU-based computing resources.

The framework enables the creation, training and tuning of models, and its capabilities span a variety of workloads and therapeutic mechanisms: molecule generation, protein structure prediction, protein-ligand prediction and representation learning.

In addition to pipeline code, scripts and utilities, BioNeMo2 Framework contains:

▢️ Pre-trained models:

🟒 ESM-2 is a pre-trained bidirectional encoder (BERT-like) for amino acid sequences. BioNeMo2 includes checkpoints with parameters 650M and 3B;

🟒 Geneformer is a tabular scoring model that generates a dense representation of a cell's scRNA by examining co-expression patterns in individual cells.


▢️ Datasets:

🟠 CELLxGENE is a collection of publicly available single-cell datasets collected by the CZI (Chan Zuckerberg Initiative) with a total volume of 24 million cells;


🟠 UniProt is a database of clustered sets of protein sequences from UniProtKB, created on the basis of translated genomic data.


πŸ“Œ Licensing: Apache 2.0 License.


🟑 Project page
🟑 Documentation
πŸ–₯ GitHub

#AI #ML #Framework #NVIDIA
2DMatGMM: An open-source robust machine learning platform for real-time detection and classification of 2D material flakes

🖥 GitHub: https://github.com/jaluus/2dmatgmm

📕 Paper: https://arxiv.org/abs/2412.09333v1

⭐️ Dataset: https://paperswithcode.com/task/instance-segmentation

https://t.iss.one/DataScienceT 🏳
OASIS Alzheimer's Detection

Large-scale brain MRI dataset for deep neural network analysis

About Dataset
The dataset used is the OASIS MRI dataset (https://sites.wustl.edu/oasisbrains/), which consists of 80,000 brain MRI images. The images have been divided into four classes based on Alzheimer's progression. The dataset aims to provide a valuable resource for analyzing and detecting early signs of Alzheimer's disease.

To make the dataset accessible, the original .img and .hdr files were converted into NIfTI format (.nii) using FSL (FMRIB Software Library). The converted MRI images of 461 patients have been uploaded to a GitHub repository, where they are available in multiple parts.
For the neural network training, 2D images were used as input. The brain images were sliced along the z-axis into 256 pieces, and slices ranging from 100 to 160 were selected from each patient. This approach resulted in a comprehensive dataset for analysis.

Patient classification was performed based on the provided metadata and Clinical Dementia Rating (CDR) values, resulting in four classes: demented, very mild demented, mild demented, and non-demented. These classes enable the detection and study of different stages of Alzheimer's disease progression.
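
The slice selection and labelling steps above can be sketched as follows. Two things are assumptions rather than statements from the post: that the slice range 100 to 160 is inclusive at both ends, and the CDR-to-class mapping, which follows the usual OASIS convention (0, 0.5, 1, 2). The volume here is a stand-in list indexed by z, not real MRI data.

```python
# Assumed CDR-to-class mapping; the post names the classes but not the values.
CDR_TO_CLASS = {
    0.0: "non-demented",
    0.5: "very mild demented",
    1.0: "mild demented",
    2.0: "demented",
}


def label_from_cdr(cdr):
    """Return the class name for a Clinical Dementia Rating value."""
    try:
        return CDR_TO_CLASS[float(cdr)]
    except KeyError:
        raise ValueError(f"unexpected CDR value: {cdr!r}") from None


def select_axial_slices(volume, first=100, last=160):
    """Keep slices first..last (inclusive) from a volume indexed by z."""
    return volume[first:last + 1]


# Stand-in volume: 256 placeholders, one per z position.
volume = [f"slice_{z:03d}" for z in range(256)]
kept = select_axial_slices(volume)
print(len(kept), kept[0], kept[-1])
print(label_from_cdr(0.5))
```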

During the dataset preparation, the .nii MRI scans were converted to .jpg files. Although this conversion presented some challenges, the files were successfully processed using appropriate tools. The resulting dataset size is 1.3 GB.

https://t.iss.one/datasets1 🌟
⚑️ Byte Latent Transformer: Patches Scale Better Than Tokens

Byte Latent Transformer (BLT) is a new byte-level LLM architecture that, for the first time, matches tokenization-based LLM performance at scale, with significant improvements in inference efficiency and robustness.
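
To show what "byte-level" input looks like, the sketch below encodes a string as raw UTF-8 bytes and groups them into patches. The post does not describe how BLT forms its patches, so the fixed-size grouping here is a deliberately simple stand-in chosen for illustration, not the paper's method. The function names and the patch size of 4 are also assumptions.

```python
def to_bytes(text):
    """Encode text as a list of UTF-8 byte values (0-255)."""
    return list(text.encode("utf-8"))


def fixed_size_patches(byte_values, patch_size=4):
    """Group consecutive bytes into patches of at most patch_size bytes."""
    if patch_size < 1:
        raise ValueError("patch_size must be at least 1")
    return [byte_values[i:i + patch_size] for i in range(0, len(byte_values), patch_size)]


text = "na\u00efve caf\u00e9"  # accented letters take two bytes each in UTF-8
byte_values = to_bytes(text)
patches = fixed_size_patches(byte_values)
print(len(text), "characters ->", len(byte_values), "bytes ->", len(patches), "patches")
```

No vocabulary or tokenizer is involved: any string maps to values in the range 0 to 255, and the original text can always be recovered by decoding the bytes.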

🖥 GitHub: https://github.com/facebookresearch/blt

📕 Paper: https://arxiv.org/abs/2412.09871v1

🌟 Dataset: https://paperswithcode.com/dataset/mmlu

https://t.iss.one/DataScienceT ✅
🀄 GuoFeng Webnovel: A Discourse-Level and Multilingual Corpus of Web Fiction

🖥 GitHub: https://github.com/longyuewangdcu/guofeng-webnovel

📕 Paper: https://arxiv.org/abs/2412.11732v1

🌟 Dataset: www2.statmt.org/wmt24/literary-trans

https://t.iss.one/DataScienceT 🏳
Large Language Models Course: Learn by Doing LLM Projects

🖥 GitHub: https://github.com/peremartra/Large-Language-Model-Notebooks-Course

📕 Paper: https://doi.org/10.31219/osf.io/qgxea

https://t.iss.one/DataScienceT ✅