Forwarded from Spark in me (Alexander)
DALL-E Mini Explained with Demo
Tech report:
- Financed by Google Cloud and HF, essentially an advertising campaign for JAX; 8-person team
- 27x smaller than the original, trained on a single TPU v3-8 for only 3 days + ~3 weeks for experiments, 400M params
- 30m image-text pairs, only 2m used to fine-tune the VQGAN encoder
- Could use preemptible TPU instances
- Pre-trained BART Encoder
- Pre-trained VQGAN encoder
- Pre-trained CLIP is used to select the best generated images (a sketch of this re-ranking step follows the list)
- (so the actual cost is probably ~1-2 orders of magnitude higher)
- (compare with the 20k GPU-days reported by Sber)
- The report is expertly written and easy to read
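The CLIP re-ranking step mentioned in the list is easy to sketch. Below is a minimal illustration, not DALL·E Mini's actual code: it assumes the public openai/clip-vit-base-patch32 checkpoint via Hugging Face transformers, and that candidates are PIL images.

```python
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def rerank(prompt, images):
    """Score candidate PIL images against a prompt; return them best-first."""
    inputs = processor(text=[prompt], images=images,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    scores = out.logits_per_text[0]          # one similarity logit per image
    order = scores.argsort(descending=True)  # highest CLIP score first
    return [images[i] for i in order]
```

Generating many candidates per prompt and showing only the top-scoring ones is why the pre-trained CLIP matters so much for the perceived quality of the results.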
W&B
DALL-E Mini Explained
Generate images from a text prompt in this interactive report: DALL-E Mini Explained, a reproduction of OpenAI DALL·E. Made by Boris Dayma using W&B
👍13❤1
Forwarded from Machinelearning
💬 Yandex: YaLM 100B, an open-source "Yet another Language Model"
YaLM 100B was trained on 2 terabytes of text: the Pile dataset plus web pages, including not only Wikipedia, news articles, and books, but also GitHub and arxiv.org. Yandex applied the YaLM generative neural networks in the recent Y1 search update; they already help answer queries in Yandex Search and Alice.
Github: https://github.com/yandex/YaLM-100B
@ai_machinelearning_big_data
👍18🤔17👎6🔥6
The last section of DataFestOnline 2022 starts in 15 minutes.
[language of presentations - Russian]
Mikhail Neretin - Topic management in Kafka
Evgeniy Sorokin - Human and ML => collaboration
Olga Filippova - Data drift & Concept drift: How to monitor ML models in production
Dmitry Beloborodov - Matrix factorizations with embeddings of variable dimension
Stanislav Yarkin - Summarization Is All You Need: summarization in RecSys
Artem Trofimov - MLOps by DS forces
Possible: Secret Speaker - MLEM: Version and deploy your ML models following GitOps principles
Start at 11:00 MSK (UTC+3)
There is an opportunity to ask a question to the speaker
Where?
Come to the session -> https://live.ods.ai/
Room: Last chance
Password: followthepinkparrot
👍21
Forwarded from Binary Tree
Today mimesis has been designated as a critical project on PyPI.
It ain't much, but I feel warm when I think about how many people use the thing I built.
Thank you everyone!
P.S. If you don't know what the hell mimesis is, go and check it out. Maybe you'll find it useful.
#mimesis #pypi #python
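If you want a quick taste before clicking through, here's a minimal sketch of what mimesis does (locale-aware fake data generation; assumes a recent mimesis release with the Locale enum):

```python
from mimesis import Address, Person
from mimesis.locales import Locale

person = Person(Locale.EN)
address = Address(Locale.EN)

print(person.full_name())  # locale-aware fake name
print(person.email())      # plausible-looking fake e-mail
print(address.address())   # fake street address
```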
👍32❤4😁1
No Language Left Behind
Scaling Human-Centered Machine Translation
No Language Left Behind (NLLB) is a first-of-its-kind AI breakthrough project that open-sources models capable of delivering high-quality translations directly between any pair of 200+ languages, including low-resource languages like Asturian, Luganda, Urdu and more. It aims to help people communicate with anyone, anywhere, regardless of their language preferences.
To enable the community to leverage and build on top of NLLB, the lab open-sourced all their evaluation benchmarks (FLORES-200, NLLB-MD, Toxicity-200), LID models and training code, LASER3 encoders, data-mining code, MMT training and inference code, and the final NLLB-200 models with their smaller distilled versions, for easier use and adoption by the research community.
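As an illustration (not from the post itself): the distilled NLLB-200 checkpoints are also published on the Hugging Face Hub, so a translation runs in a few lines. A minimal sketch, assuming the facebook/nllb-200-distilled-600M checkpoint:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(name, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(name)

inputs = tokenizer("No language left behind.", return_tensors="pt")
# NLLB encodes the target language as a special token that must start the output.
out = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    max_length=40,
)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```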
Paper: https://research.facebook.com/publications/no-language-left-behind/
Blog: https://ai.facebook.com/blog/nllb-200-high-quality-machine-translation/
GitHub: https://github.com/facebookresearch/fairseq/tree/26d62ae8fbf3deccf01a138d704be1e5c346ca9a
#nlp #translations #dl #datasets
👍39👎1
Data Science by ODS.ai 🦜
Big step after first DALL·E — DALL·E 2 In January 2021, OpenAI introduced DALL·E. One year later, their newest system, DALL·E 2, generates more realistic and accurate images with 4x greater resolution. The first DALL·E is a transformer model. It receives…
DALL·E Now Available in Beta
#Dalle by #openai was released for public (though obviously moderated) access. Join the waitlist to play around.
Images are also available for commercial use.
By default, a user can generate 460 images; further generations (or variations on generated images) will be available on a paid plan.
Link: https://openai.com/blog/dall-e-now-available-in-beta/
#image #CV #GAN #generation #generativeart
OpenAI
DALL·E now available in beta
We’ll invite 1 million people from our waitlist over the coming weeks. Users can create with DALL·E using free credits that refill every month, and buy additional credits in 115-generation increments for $15.
👍14❤9🤮2
Data Science by ODS.ai 🦜
DALL·E Now Available in Beta #Dalle by #openai was released for public (though obviously moderated) access. Join waitlist to play around. Images are also available for commercial use. By default user can generate 460 images, further generations (or variations…
Images generated by #dalle from the prompt «Illustration for the DALL·E 2 beta release post», courtesy of @stas_kulesh.
#openai #generativeart #generation
👍9
Data Science by ODS.ai 🦜
Experimenting with CLIP+VQGAN to Create AI Generated Art Tips and tricks on prompts to #vqclip. TLDR: * Adding rendered in unreal engine, trending on artstation, top of /r/art improves image quality significally. * Using the pipe to split a prompt into…
Some stats to put the development of #dalle in perspective:
«Used 1000 prompts in Dalle over the last 2 days, about 9 hours each day. Of those, saved ~300. 50 I like enough to share w/ socials. 12 enough to rework for future projects. 3 were perfect, may mint someday. Curation is *1* step of AI art. I used to finish physicals in less time.»
Source: https://twitter.com/clairesilver12/status/1550709299797577729
#visualization #gan #generation #generativeart #aiart #artgentips
👍6👏2
Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios
While vision transformers demonstrate high performance, they cannot be deployed as efficiently as CNNs in realistic industrial deployment scenarios, e.g. TensorRT or CoreML.
The authors propose Next-ViT, which offers a better latency/accuracy trade-off than existing CNN and ViT models. They develop two new architecture blocks and a new paradigm to stack them. As a result, on TensorRT, Next-ViT surpasses ResNet by 5.4 mAP (from 40.4 to 45.8) on COCO detection and 8.2% mIoU (from 38.8% to 47.0%) on ADE20K segmentation. It also achieves performance comparable to CSWin while running 3.6× faster at inference. On CoreML, Next-ViT surpasses EfficientFormer by 4.6 mAP (from 42.6 to 47.2) on COCO detection and 3.5% mIoU (from 45.2% to 48.7%) on ADE20K segmentation under similar latency.
Paper: https://arxiv.org/abs/2207.05501
A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-next-vit
#deeplearning #cv #transformer #computervision
👍24❤2😁2🔥1👏1
Forwarded from gonzo-обзоры ML статей
Now it is official.
I've started writing a book on JAX. This seems to be the first book ever on this topic.
For those who don't know, JAX is an exceptionally cool numeric computations library from Google, a kind of NumPy on steroids, with autodiff, XLA compilation, and hardware acceleration on TPU/GPU. JAX also brings the functional programming paradigm to deep learning.
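To make the "NumPy on steroids" point concrete, here is a minimal sketch using the standard public JAX API (the loss function itself is just an illustration):

```python
import jax
import jax.numpy as jnp

def loss(w, x, y):
    # An illustrative least-squares loss; any differentiable function works.
    return jnp.mean((jnp.dot(x, w) - y) ** 2)

grad_loss = jax.grad(loss)      # autodiff: gradient w.r.t. the first argument
fast_grad = jax.jit(grad_loss)  # XLA-compiled for CPU/GPU/TPU

w = jnp.zeros(3)
x = jnp.ones((8, 3))
y = jnp.ones(8)
print(fast_grad(w, x, y))       # pure functions in, transformed functions out
```

The composable transformations (grad, jit, plus vmap and pmap) are exactly the functional-programming flavor mentioned above.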
JAX is heavily used for deep learning and is already contending to be the #3 deep learning framework. Some companies, like DeepMind, have already switched to JAX internally. There are rumors that Google is also switching to JAX.
The JAX ecosystem is constantly growing. There are a lot of high-quality deep learning-related modules. But JAX is not limited to deep learning: there are many exciting applications and libraries on top of JAX for physics, including molecular dynamics, fluid dynamics, rigid-body simulation, quantum computing, astrophysics, ocean modeling, and so on. There are also libraries for distributed matrix factorization, streaming data processing, protein folding, and chemical modeling, with new applications emerging constantly.
Anyway, it's a perfect time to start learning JAX!
The book is available today as part of the Manning Early Access Program (MEAP), so you can read the book as I write it 🙂 This is a very smart way of learning something new: you do not have to wait until the complete book is ready. You can start learning right away, and by the time the book is published, you will already know everything. Your feedback will also be very valuable, and you can influence how the book is made.
Here's a link to the book: https://mng.bz/QvAG
If you want a decent discount, use the discount code mlsapunov. It will provide you with 40% off, and it's valid through August 11th.
🔥34👍27❤4👎1
Frankfurt Data Science welcomes you back to an onsite event in Frankfurt am Main! As usual, something special: a demo of the new #dalle2 model by the statworx team (Sebastian Heinz and Fabian Müller), socialising time, and a cocktail bar with free cocktails. The Frankfurt DS community thanks SAS for supporting and TechQuartier for hosting! They are looking for more supporters.
Event details:
Date: 24th August 2022
Time: 6PM - 11PM
Location: TechQuartier, Frankfurt
Register here
👍27❤17🔥1
Forwarded from Kali Novskaya (Tatiana Shavrina)
#nlp #про_nlp
Friends, since many of my papers and my colleagues' papers come out in English, I've decided to start a separate group for discussing multilinguality in NLP: join us!
The group will feature posts and discussions of important new work on the topic; it seems to be the first Telegram chat of its kind.
https://t.iss.one/multilingual_nlp
Multilingual NLP discussion group
I've decided to create a group on #multilinguality so we could have a place in telegram to discuss important new papers and share cool results during conferences, meetups and public discussions!
Group is public, feel free to share the link!
https://t.iss.one/multilingual_nlp
👍29🥰1
RuLeanALBERT: https://github.com/yandex-research/RuLeanALBERT
RuLeanALBERT is a pretrained masked language model for the Russian language using a memory-efficient architecture.
GitHub
GitHub - yandex-research/RuLeanALBERT: RuLeanALBERT is a pretrained masked language model for the Russian language that uses a…
RuLeanALBERT is a pretrained masked language model for the Russian language that uses a memory-efficient architecture. - yandex-research/RuLeanALBERT
👍18
Forwarded from DataGym Channel [Power of data]
#opensource: RuLeanALBERT from Yandex Research
A 2.9B-parameter transformer for Russian that fits into a researcher's home PC.
Not only is it the largest BERT-like model for the Russian language, with strong benchmark results, it also ships with fine-tuning code.
GitHub
The article explains how the model was trained (collaborative deep learning) on Hivemind, a framework for decentralized training.
GitHub
GitHub - yandex-research/RuLeanALBERT: RuLeanALBERT is a pretrained masked language model for the Russian language that uses a…
RuLeanALBERT is a pretrained masked language model for the Russian language that uses a memory-efficient architecture. - yandex-research/RuLeanALBERT
👍28❤3
Kolesa Conf 2022 — Kolesa Group's conference for IT community
Kolesa Group will host its traditional conference on 8 October. Registration is open. Kolesa Conf is an annual Kazakhstani IT conference where leading experts from the best IT companies of the CIS take part.
One of the tracks is devoted to Data. There will be 12 speakers. Some of the featured talks:
• «Methodology of building a cloud CDW: the Air Astana case»
• How to correctly estimate the sample size and duration of a test, and what non-obvious difficulties you can encounter along the way
• «Creating a synthetic control group with the help of Propensity Score Matching»
• How the dynamic minimum fare (surge pricing), hated by users, was developed and implemented
• «Analytics at the launch of a new product: a spare-parts marketplace example»
Conference website: https://bit.ly/3UmCoWC
#conference #it #data #analysis
kolesa-conf.kz
Kolesa Conf`24
The largest IT conference in Kazakhstan
👍18🔥5🍌4👎2😁2
Want to have an impromptu Data Science breakfast tomorrow in the center of Helsinki? Please DM @malev if you are interested.
🤔11🤡4