Data Science | Machine Learning with Python for Researchers
32.7K subscribers
3.33K photos
126 videos
23 files
3.55K links
ads: @HusseinSheikho

The Data Science and Python channel is for researchers and advanced programmers

Buy ads: https://telega.io/c/dataScienceT
Hey guys,

As you all know, the purpose of this community is to share notes and grow together. Hence, today I am sharing with you an app called DevBytes. It keeps you updated about dev and tech news.

This brilliant app provides curated, bite-sized updates on the latest tech news/dev content. Whether it's new frameworks, AI breakthroughs, or cloud services, DevBytes brings the essentials straight to you.

If you're tired of information overload and want a smarter way to stay informed, give DevBytes a try.

Download here: https://play.google.com/store/apps/details?id=com.candelalabs.devbytes&hl=en-IN
It's time to read less and know more!
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?

🖥 GitHub: https://github.com/gair-nlp/o1-journey

📕 Paper: https://arxiv.org/abs/2411.16489v1

🌟 Dataset: https://paperswithcode.com/dataset/lima

https://t.iss.one/DataScienceT ✅
⭐️ Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement

RAG-Diffusion now supports FLUX.1 Redux!

πŸ”₯ Ready to take control? Customize your region-based images with our training-free solution and achieve powerful, precise results!

πŸ”— Code: https://github.com/NJU-PCALab/RAG-Diffusion

https://t.iss.one/DataScienceT βœ…
OrientedFormer: An End-to-End Transformer-Based Oriented Object Detector in Remote Sensing Images


Publication: IEEE Transactions on Geoscience and Remote Sensing, 2024

Topic: Object detection

Paper: https://arxiv.org/pdf/2409.19648v1.pdf

GitHub: https://github.com/wokaikaixinxin/OrientedFormer

Description:

In this paper, we propose an end-to-end transformer-based oriented object detector consisting of three dedicated modules. First, Gaussian positional encoding is proposed to encode the angle, position, and size of oriented boxes using Gaussian distributions. Second, Wasserstein self-attention is proposed to introduce geometric relations and facilitate interaction between content and positional queries by utilizing Gaussian Wasserstein distance scores. Third, oriented cross-attention is proposed to align values and positional queries by rotating sampling points around the positional query according to their angles.
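
To make the Gaussian Wasserstein distance mentioned in the description more concrete, here is a minimal NumPy sketch. It is not the OrientedFormer implementation. It assumes a box is given as (cx, cy, w, h, angle in radians) and that half the width and height serve as standard deviations along the box axes; the function names are hypothetical. How the paper turns such distances into attention scores is not covered here.

```python
import numpy as np


def box_to_gaussian(cx, cy, w, h, angle):
    """Map an oriented box to a 2-D Gaussian (mean, covariance)."""
    rot = np.array([[np.cos(angle), -np.sin(angle)],
                    [np.sin(angle), np.cos(angle)]])
    # ASSUMPTION: half-extents act as standard deviations along the box axes.
    scale = np.diag([w / 2.0, h / 2.0])
    mean = np.array([cx, cy], dtype=float)
    cov = rot @ scale @ scale @ rot.T
    return mean, cov


def sqrtm_psd(mat):
    """Square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return vecs @ np.diag(np.sqrt(vals)) @ vecs.T


def gaussian_wasserstein_sq(box_a, box_b):
    """Squared 2-Wasserstein distance between the Gaussians of two boxes."""
    mu_a, cov_a = box_to_gaussian(*box_a)
    mu_b, cov_b = box_to_gaussian(*box_b)
    root_a = sqrtm_psd(cov_a)
    cross = sqrtm_psd(root_a @ cov_b @ root_a)
    center_term = np.sum((mu_a - mu_b) ** 2)
    shape_term = np.trace(cov_a + cov_b - 2.0 * cross)
    return float(center_term + shape_term)


if __name__ == "__main__":
    a = (0.0, 0.0, 4.0, 2.0, 0.0)
    b = (3.0, 4.0, 4.0, 2.0, 0.0)
    print(gaussian_wasserstein_sq(a, b))
```

One property worth noting: a box rotated by 180 degrees maps to the same Gaussian, so its distance to the original is zero. This sidesteps the angle-periodicity problem of comparing raw angles.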

https://t.iss.one/DataScienceT ✅
🌟 INTELLECT-1: Release of the first decentralized learning model.

PRIME Intellect has published INTELLECT-1 ( Instruct + Base ), the first 10 billion parameter language model collaboratively trained in 50 days by 30 participants worldwide.

PRIME Intellect used its own PRIME platform, designed to address the main problems of decentralized learning: network unreliability and dynamic management of computing nodes.

The platform utilized a network of 112 H100 GPUs across 3 continents and achieved a compute utilization rate of 96% under optimal conditions.

The training corpus consisted of 1 trillion public dataset tokens with the following percentage distribution: 55% fineweb-edu, 10% fineweb, 20% Stack V1, 10% dclm-baseline, 5% open-web-math.
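
As a quick sanity check on the mix above, the percentages can be turned into approximate token counts per source. This is plain arithmetic on the figures quoted in the post; the variable names are illustrative only.

```python
TOTAL_TOKENS = 1_000_000_000_000  # 1 trillion, as stated in the post

# Percentage share of each dataset, copied from the post.
mix_percent = {
    "fineweb-edu": 55,
    "fineweb": 10,
    "Stack V1": 20,
    "dclm-baseline": 10,
    "open-web-math": 5,
}

# The shares should cover the whole corpus.
assert sum(mix_percent.values()) == 100

# Integer arithmetic keeps the counts exact.
tokens_per_source = {
    name: TOTAL_TOKENS * pct // 100 for name, pct in mix_percent.items()
}

for name, count in tokens_per_source.items():
    print(f"{name:>14}: {count / 1e9:.0f}B tokens")
```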

▢️ Technical specifications:

🟒 Parameters: 10B;
🟒 Layers: 42;
🟒 Attention Heads: 32;
🟒 Hidden Size: 4096;
🟒 Context Length: 8192;
🟒 Vocabulary Size: 128256.

INTELLECT-1 achieved 37.5% accuracy on the MMLU test and 72.26% on HellaSwag, and outperformed several other open-source models on WinoGrande with a score of 65.82%.

While these figures lag slightly behind today's popular models, the results of the experiment are a critical step toward democratizing AI development and preventing the consolidation of AI capabilities within a few organizations.

▢️ GGUF quantized versions of INTELLECT-1_Instruct in 3-bit (5.46 GB) to 8-bit (10.9 GB) bit depths from the LM Studio community.

▢️ Example of inference on Transformers:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Create new tensors on the GPU by default (requires a CUDA device).
torch.set_default_device("cuda")
model = AutoModelForCausalLM.from_pretrained("PrimeIntellect/INTELLECT-1")
tokenizer = AutoTokenizer.from_pretrained("PrimeIntellect/INTELLECT-1")

# Replace the placeholder with your own prompt.
input_text = "%prompt%"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
# max_length counts the prompt tokens plus the generated tokens.
output_ids = model.generate(input_ids, max_length=50, num_return_sequences=1)
output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(output_text)


📌 Licensing: Apache 2.0 License.

🟡 Article
🟡 HF Model Kit
🟡 Set of GGUF versions
🟡 Technical report
🟡 Demo
🖥 GitHub

https://t.iss.one/DataScienceT ✅
❇️ AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction πŸ”₯


πŸ”— Discover More:
  *  Github Link
  *  Project Page: AniGS
  *  Paper: Read the paper

https://t.iss.one/DataScienceT βœ…
This channel is for programmers, coders, and software engineers.

0️⃣ Python
1️⃣ Data Science
2️⃣ Machine Learning
3️⃣ Data Visualization
4️⃣ Artificial Intelligence
5️⃣ Data Analysis
6️⃣ Statistics
7️⃣ Deep Learning
8️⃣ Programming Languages

✅ https://t.iss.one/addlist/8_rRW2scgfRhOTc0

✅ https://t.iss.one/Python53
🌟 BioNeMo: A Framework for Developing AI Models for Drug Design.

NVIDIA BioNeMo2 Framework is a set of tools, libraries, and models for computational drug discovery and design.

It accelerates the most time-consuming and expensive steps in building and adapting biomolecular AI models by providing optimized models and tools that are easily integrated into GPU-based computing resources.

The framework enables the creation, training and tuning of models, and its capabilities span a variety of workloads and therapeutic mechanisms: molecule generation, protein structure prediction, protein-ligand prediction and representation learning.

In addition to pipeline code, scripts and utilities, BioNeMo2 Framework contains:

▢️ Pre-trained models:

🟒 ESM-2 is a pre-trained bidirectional encoder (BERT-like) for amino acid sequences. BioNeMo2 includes checkpoints with parameters 650M and 3B;

🟒 Geneformer is a tabular scoring model that generates a dense representation of a cell's scRNA by examining co-expression patterns in individual cells.


▢️ Datasets:

🟠 CELLxGENE is a collection of publicly available single-cell datasets collected by the CZI (Chan Zuckerberg Initiative) with a total volume of 24 million cells;


🟠 UniProt is a database of clustered sets of protein sequences from UniProtKB, created on the basis of translated genomic data.


📌 Licensing: Apache 2.0 License.

🟡 Project page
🟡 Documentation
🖥 GitHub

#AI #ML #Framework #NVIDIA