Data Science by ODS.ai
Interactive and explorable explanations
A collection of links to different explanations of how things work.
Link: https://explorabl.es
How network effects (ideas, diseases) work: https://meltingasphalt.com/interactive/going-critical/
How trust works: h…
Complexity Explorables
Another collection of interactive, explorable explanations of complex systems in biology, physics, mathematics, social sciences, epidemiology, and ecology.
Link: https://www.complexity-explorables.org
The emergence of communities in weighted networks: https://www.complexity-explorables.org/explorables/jujujajaki-networks/
#interactive #demo #systems #explanations
Reliable ML track at Data Fest Online 2023
Call for Papers
Friends, we are glad to announce that Data Fest, the largest Russian-language Data Science conference from the Open Data Science community, will take place at the end of May 2023.
It will again feature a track from the Reliable ML community. We are waiting for your talk proposals: write directly to me or to Dmitry.
Track Info
The Reliable ML concept is about making the results of data teams' work, first, applicable within the business processes of the customer company and, second, genuinely beneficial to that company.
To achieve this, you need to be able to:
- correctly build a portfolio of projects (#business)
- think over the system design of each project (#ml_system_design)
- overcome various difficulties when developing a prototype (#tech #causal_inference #metrics)
- explain to the business that your MVP deserves a pilot (#interpretable_ml)
- conduct a pilot (#causal_inference #ab_testing)
- implement your solution in business processes (#tech #mlops #business)
- set up solution monitoring in the production environment (#tech #mlops)
If you have something to say on the topics above, write to us! If in doubt, write anyway: many of the best talks of previous Reliable ML tracks came about through discussion and collaboration on the topic.
If you are not ready to give a talk but want to hear something interesting, you can still help! Reposting to a relevant community or forwarding to a friend means taking part in the creation of good content.
Registration and full information about Data Fest 2023 are here.
@Reliable ML
BloombergGPT: A Large Language Model for Finance
The realm of financial technology involves a wide range of NLP applications, such as sentiment analysis, named entity recognition, and question answering. Although Large Language Models (LLMs) have demonstrated effectiveness in various tasks, no LLM specialized for the financial domain has been reported so far. This work introduces BloombergGPT, a 50-billion-parameter language model trained on an extensive range of financial data. The researchers have created a massive 363-billion-token dataset using Bloomberg's data sources, supplemented with 345 billion tokens from general-purpose datasets, potentially creating the largest domain-specific dataset to date.
BloombergGPT has been validated on standard LLM benchmarks, open financial benchmarks, and a suite of internal benchmarks that accurately reflect its intended usage. The mixed dataset training results in a model that significantly outperforms existing models on financial tasks without sacrificing performance on general LLM benchmarks. The paper also discusses modeling choices, training processes, and evaluation methodology. As a next step, the researchers plan to release training logs (Chronicles) detailing their experience in training BloombergGPT.
Paper: https://arxiv.org/abs/2303.17564
A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-bloomberggpt
#deeplearning #nlp #transformer #sota #languagemodel #finance
Forwarded from gonzo-обзоры ML статей
The Stanford 2023 AI Index Report has been published!
The section on machine translation is based on Intento data as usual :)
https://aiindex.stanford.edu/report/
Pandas v2.0.0
The main enhancements:
- installing optional dependencies with pip extras
- Index can now hold numpy numeric dtypes
- new argument dtype_backend to return pyarrow-backed or numpy-backed nullable dtypes
- copy-on-write improvements
- ...
+ other notable bug fixes
Full list of changes: https://pandas.pydata.org/docs/whatsnew/v2.0.0.html
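A quick sketch of these enhancements in action (a minimal example assuming pandas >= 2.0 with pyarrow installed; the CSV path is a placeholder):

```python
# Optional dependencies can now be installed via pip extras, e.g.:
#   pip install "pandas[performance]"
import pandas as pd

# Copy-on-Write is opt-in in 2.0 and can be enabled explicitly
pd.set_option("mode.copy_on_write", True)

# Index can now hold numpy numeric dtypes beyond int64/float64
idx = pd.Index([1, 2, 3], dtype="int32")

# dtype_backend="pyarrow" returns pyarrow-backed nullable dtypes
df = pd.read_csv("data.csv", dtype_backend="pyarrow")  # placeholder file
```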
Kandinsky 2.1
by Sber & AIRI
The main features:
- 3.3B parameters
- generation resolution - 768x768
- image prior transformer
- new MoVQ image autoencoder
- trained on a cleaner set of 172M text-image pairs
- work modes: generation from text, image blending, generation of images from a reference image, text-guided image editing, inpainting/outpainting
The FID on the COCO_30k dataset reaches 8.21
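A minimal text-to-image sketch following the usage shown in the GitHub README (argument names are taken from the README and may change between versions):

```python
from kandinsky2 import get_kandinsky2  # package from the ai-forever/Kandinsky-2 repo

# Downloads the weights and builds the 2.1 text2img pipeline
model = get_kandinsky2('cuda', task_type='text2img', model_version='2.1')

images = model.generate_text2img(
    'red cat, 4k photo',   # text prompt
    num_steps=100,
    batch_size=1,
    guidance_scale=4,
    h=768, w=768,          # the native 768x768 generation resolution
)
```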
A few posts comparing Kandinsky 2.1 with other similar models:
- https://t.iss.one/dushapitona/643
- https://t.iss.one/antidigital/6153
Habr: https://habr.com/ru/companies/sberbank/articles/725282/
Telegram-bot: https://t.iss.one/kandinsky21_bot
ruDALL-E: https://rudalle.ru/
MLSpace: https://sbercloud.ru/ru/datahub/rugpt3family/kandinsky-2-1
GH: https://github.com/ai-forever/Kandinsky-2
HF model: https://huggingface.co/ai-forever/Kandinsky_2.1
HF space: https://huggingface.co/spaces/ai-forever/Kandinsky2.1
FusionBrain: https://fusionbrain.ai/diffusion
Forwarded from Kier from TOP
Rask: a service for AI-supported video localization
TL;DR: A service that translates videos end-to-end between languages.
Rask AI offers voice cloning to make your voice part of your brand, and it also has a library of natural, human-like voices to choose from. They currently support output videos in the following languages: German, French, Spanish, Chinese, English, and Portuguese, regardless of the source language.
In the near future, the team plans to offer additional services such as captions and subtitles and to increase the number of supported languages to 60.
They haven't raised any funding for the current setup and have just launched on Product Hunt. You are welcome to support them via the link below (we all know how important that is for founders, right?).
Website: https://www.rask.ai/
ProductHunt: https://www.producthunt.com/posts/rask-ai-video-localization-dubbing-app
#producthunt #aiproduct #localization
Hey, let's see how many of us have Data Science-related vacancies to share. Please submit them through the Google Form.
The best vacancies may be published in this channel.
Google Form: link.
#ds_jobs
Forwarded from ml4se
Tabby: Self-hosted AI coding assistant
An open-source, on-prem alternative to GitHub Copilot.
- Self-contained, with no need for a DBMS or cloud service
- Web UI for visualizing and configuring models and MLOps
- OpenAPI interface, easy to integrate with existing infrastructure
- Consumer-grade GPU support (FP16 weight loading with various optimizations)
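As a rough illustration of the OpenAPI interface, a completion request could look like the sketch below (the endpoint path and payload shape here are assumptions; check the repo's OpenAPI spec for the exact schema):

```python
import requests

# Assumes a local Tabby server listening on port 8080
resp = requests.post(
    "http://localhost:8080/v1/completions",  # hypothetical endpoint path
    json={
        "language": "python",
        "segments": {"prefix": "def fib(n):\n    ", "suffix": "\n"},
    },
)
print(resp.json())  # completion choices returned by the model
```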
Segment Anything
The Segment Anything project aims to democratize image segmentation in computer vision, a core task used across various applications such as scientific imagery analysis and photo editing. Traditionally, accurate segmentation models require specialized expertise, AI training infrastructure, and large amounts of annotated data. This project introduces a new task, dataset, and model for image segmentation to overcome these challenges and make segmentation more accessible.
The researchers are releasing the Segment Anything Model (SAM) and the Segment Anything 1-Billion mask dataset (SA-1B), the largest segmentation dataset to date. These resources will enable a wide range of applications and further research into foundational models for computer vision. The SA-1B dataset is available for research purposes, while the SAM is provided under the permissive Apache 2.0 open license. Users can explore the demo to try SAM with their own images.
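A minimal prompt-based usage sketch based on the example notebooks in the repository (the checkpoint filename and input image are placeholders):

```python
import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

# Load a pre-trained SAM checkpoint (downloaded separately from the repo)
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("image.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)  # compute the image embedding once

# Prompt with a single foreground point (x, y); label 1 = foreground
masks, scores, logits = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
    multimask_output=True,  # return several candidate masks with scores
)
```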
Paper link: https://arxiv.org/abs/2304.02643
Code link: https://github.com/facebookresearch/segment-anything
Demo link: https://segment-anything.com/demo
Blogpost link: https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/
Dataset link: https://ai.facebook.com/datasets/segment-anything/
A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-sam
#deeplearning #cv #pytorch #imagesegmentation #dataset
Forwarded from Spark in me (Alexander)
Paper Review: Segment Anything
- 99% of masks are automatic, i.e. w/o labels;
- Main image encoder model is huge;
- To produce masks you need a prompt or a somewhat accurate bbox (partial bbox fails miserably);
- Trained on 128 / 256 GPUs;
- Most likely useful as a large-scale data annotation tool;
- Not sure it can be used in production as-is; also, the dataset license is research-only, while the model is Apache 2.0
https://andlukyane.com//blog/paper-review-sam
Unless you have a very specific project (e.g. segmenting just one object type where you have some priors), this can serve as a decent pre-annotation tool.
This is nice, but it can probably offset only 10-20% of CV annotation costs.
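For the pre-annotation use case, the repository also ships an automatic mask generator that needs no prompts; a minimal sketch (checkpoint filename and image are placeholders):

```python
import cv2
from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

image = cv2.cvtColor(cv2.imread("image.jpg"), cv2.COLOR_BGR2RGB)
# Returns a list of dicts with keys like 'segmentation', 'area', 'bbox'
masks = mask_generator.generate(image)
```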
InceptionNeXt: When Inception Meets ConvNeXt
Large-kernel convolutions, such as those employed in ConvNeXt, can improve model performance but often come at the cost of efficiency due to high memory access costs. Although reducing kernel size may increase speed, it often leads to significant performance degradation.
To address this issue, the authors propose InceptionNeXt, which decomposes large-kernel depthwise convolution into four parallel branches along the channel dimension. This new Inception depthwise convolution results in networks with high throughput and competitive performance. For example, InceptionNeXt-T achieves 1.6x higher training throughput than ConvNeXt-T and a 0.2% top-1 accuracy improvement on ImageNet-1K. InceptionNeXt has the potential to serve as an economical baseline for future architecture design, helping to reduce the carbon footprint.
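A sketch of the proposed Inception depthwise convolution following the paper's description (the default kernel sizes and the 1/8 branch ratio are taken from the paper; the official code may differ in details):

```python
import torch
import torch.nn as nn

class InceptionDWConv2d(nn.Module):
    # Decomposes a large-kernel depthwise conv into four parallel branches
    # along the channel dimension: identity, small square kernel,
    # 1xk band kernel, and kx1 band kernel.
    def __init__(self, dim, square_kernel=3, band_kernel=11, branch_ratio=0.125):
        super().__init__()
        g = int(dim * branch_ratio)  # channels per convolutional branch
        self.dwconv_hw = nn.Conv2d(g, g, square_kernel,
                                   padding=square_kernel // 2, groups=g)
        self.dwconv_w = nn.Conv2d(g, g, (1, band_kernel),
                                  padding=(0, band_kernel // 2), groups=g)
        self.dwconv_h = nn.Conv2d(g, g, (band_kernel, 1),
                                  padding=(band_kernel // 2, 0), groups=g)
        self.split_sizes = (dim - 3 * g, g, g, g)  # identity branch keeps the rest

    def forward(self, x):
        x_id, x_hw, x_w, x_h = torch.split(x, self.split_sizes, dim=1)
        return torch.cat(
            (x_id, self.dwconv_hw(x_hw), self.dwconv_w(x_w), self.dwconv_h(x_h)),
            dim=1,
        )
```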
A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-inceptionnext
Paper link: https://arxiv.org/abs/2303.16900
Code link: https://github.com/sail-sg/inceptionnext
#cnn #deeplearning #computervision
Forwarded from ml4se
AI for IT Operations (AIOps) on Cloud Platforms: Reviews, Opportunities and Challenges (Salesforce AI)
A review of the AIOps vision, trends, challenges, and opportunities, focusing specifically on the underlying AI techniques.
1. INTRODUCTION
2. CONTRIBUTION OF THIS SURVEY
3. DATA FOR AIOPS
A. Metrics
B. Logs
C. Traces
D. Other data
4. INCIDENT DETECTION
A. Metrics based Incident Detection
B. Logs based Incident Detection
C. Traces and Multimodal Incident Detection
5. FAILURE PREDICTION
A. Metrics based Failure Prediction
B. Logs based Failure Prediction
6. ROOT CAUSE ANALYSIS
A. Metric-based RCA
B. Log-based RCA
C. Trace-based and Multimodal RCA
7. AUTOMATED ACTIONS
A. Automated Remediation
B. Auto-scaling
C. Resource Management
8. FUTURE OF AIOPS
A. Common AI Challenges for AIOps
B. Opportunities and Future Trends
9. CONCLUSION
Forwarded from ml4se
AI / ML / LLM / Transformer Models Timeline
This is a collection of important papers in the area of LLMs and Transformer models.
PDF file.
Forwarded from gonzo-обзоры ML статей
Stability AI just released an initial set of StableLM-Alpha models, with 3B and 7B parameters. 15B and 30B models are on the way.
Base models are released under CC BY-SA-4.0.
StableLM-Alpha models are trained on a new dataset that builds on The Pile and contains 1.5 trillion tokens, roughly 3x the size of The Pile. These models will be trained on up to 1.5 trillion tokens. The context length for these models is 4096 tokens.
As a proof-of-concept, we also fine-tuned the model with Stanford Alpaca's procedure using a combination of five recent datasets for conversational agents: Stanford's Alpaca, Nomic-AI's gpt4all, RyokoAI's ShareGPT52K datasets, Databricks labs' Dolly, and Anthropic's HH. We will be releasing these models as StableLM-Tuned-Alpha.
https://github.com/Stability-AI/StableLM
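A minimal generation sketch with HuggingFace Transformers (the model id follows the repo's README and is assumed to be available on the Hub):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "stabilityai/stablelm-base-alpha-7b"  # model id from the README
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16).cuda()

inputs = tokenizer("What follows is a story about", return_tensors="pt").to("cuda")
tokens = model.generate(**inputs, max_new_tokens=64, temperature=0.7, do_sample=True)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```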
DINOv2: Learning Robust Visual Features without Supervision
Get ready for a game-changer in computer vision! Building on the groundbreaking achievements in natural language processing, foundation models are revolutionizing the way we use images in various systems. By generating all-purpose visual features that excel across diverse image distributions and tasks without finetuning, these models are set to redefine the field.
The researchers behind this work have combined cutting-edge techniques to scale pretraining in terms of data and model size, turbocharging the training process like never before. They've devised an ingenious automatic pipeline to create a rich, diverse, and curated image dataset, setting a new standard in the self-supervised literature. To top it off, they've trained a colossal ViT model with a staggering 1 billion parameters and distilled it into a series of smaller, ultra-efficient models. These models outshine the best available all-purpose features, OpenCLIP, on most benchmarks at both image and pixel levels.
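The distilled models are easy to try via torch.hub; a minimal feature-extraction sketch (model names follow the project's GitHub repo; the input image is a placeholder):

```python
import torch
from PIL import Image
from torchvision import transforms

# Load one of the distilled backbones (ViT-S/14) from the official repo
model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),  # 224 is divisible by the 14-pixel patch size
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

img = preprocess(Image.open("image.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    features = model(img)  # (1, 384) all-purpose embedding for ViT-S/14
```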
A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-dinov2
Project link: https://dinov2.metademolab.com/
#deeplearning #cv #pytorch #imagesegmentation #sota #pretraining