Easy to miss in the PyTorch 2.0 release notes: a small but useful feature. torch.device, which previously just returned a device object, can now be used as a context manager.
Code speaks louder than a thousand words: (1st pic)
At first it doesn't look that useful, because you could just call .to() on the tensor.
But when you create many large tensors, it can take a while to 1) allocate and fill the memory on the CPU and 2) transfer it to the GPU.
With the context manager, you can tell PyTorch to create the tensor directly on the device instead of allocating memory on the CPU first.
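The picture is not reproduced here, so here is a minimal sketch of the idea, assuming PyTorch 2.0 or later. The tensor shape and the CPU fallback are assumptions added so the snippet runs on machines without a GPU; they are not from the original post.

```python
import torch

# Use a GPU when one is present; fall back to CPU so the sketch runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Older pattern: allocate on the CPU first, then copy to the target device.
x_old = torch.zeros(1024, 1024).to(device)

# PyTorch 2.0 pattern: factory calls inside the block allocate on `device` directly.
with device:
    x_new = torch.zeros(1024, 1024)

print(x_old.device, x_new.device)
```

Both tensors end up on the same device; the difference is that the second one never needs a CPU-side copy when `device` is a GPU.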
This makes even more sense when you wrap the creation of an nn.Module in the context manager: (second pic)
This is nice because the entire module and all of its submodules are initialized directly on the device.
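A rough sketch of the module case, again assuming PyTorch 2.0 or later. The small `nn.Sequential` model and its layer sizes are hypothetical stand-ins, not the model from the original picture.

```python
import torch
from torch import nn

# Use a GPU when one is present; fall back to CPU so the sketch runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Every parameter created inside the block is allocated on `device`,
# so no separate model.to(device) call is needed afterwards.
with device:
    model = nn.Sequential(
        nn.Linear(128, 256),
        nn.ReLU(),
        nn.Linear(256, 10),
    )

# Collect the device types of all parameters to confirm where they live.
param_devices = {p.device.type for p in model.parameters()}
print(param_devices)
```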
https://twitter.com/adrianwaelchli/status/1636161187632107521?s=19
👍1
Multimodal Machine Learning - Carnegie Mellon, 2022
A great series of lectures on multimodal machine learning (MML). The course covers the fundamental concepts of MML and recent state-of-the-art MML systems.
Lectures: https://www.youtube.com/playlist?list=PL-Fhd_vrvisNM7pbbevXKAbT_Xmub37fA
Webpage: https://cmu-multicomp-lab.github.io/mmml-course/fall2022/
Multimodal machine learning is a hot area in AI research. Unimodal learning has advanced massively over the last 5 years. The challenge now is how to fuse different modalities (vision, audio, text, robot actions) into a single agent. GPT-4 and similar models are just the beginning.
So good to see courses that are dedicated to this new and vibrant area of AI research.
🔥1
Forwarded from Silicon Brain | AI Community
Now that ChatGPT is trending so hard, don't fall behind on new technology in computer vision!
A diffusion model for object detection
Have you ever considered an object detection method that finds the objects in an image without needing initial labeled data?
This #diffusion model, #DiffusionDet, takes a distinctive approach to object detection. It first noises the current image with random boxes and then performs detection by denoising those boxes!
Papers with Code | GitHub | Paper
#denoising
@silicon_brain
👍2🔥2
In a new model released by a Stanford research team, they built a chatbot called Alpaca for under $600 by fine-tuning Meta's lightweight LLaMA model (the 7B version) with the self-instruct methodology and the APIs of the GPT davinci model. The key points about this chatbot are the time needed for fine-tuning (about 3 hours) and the fact that no people were needed to label and rank the instructions and the chatbot's responses (thanks to the self-instruct method).
P.S.: Never mind that they did the fine-tuning on eight GPUs with 80 GB of memory 🥲
https://youtu.be/xslW5sQOkC8
🤯2👏1
The Annotated Transformer
An annotated version of the paper "Attention is All You Need" with a line-by-line implementation in PyTorch
https://nlp.seas.harvard.edu/annotated-transformer/
👌5
MIT Researchers Introduce LiGO: A New Technique that Accelerates Training of Large Machine-Learning Models, Reducing the Monetary and Environmental Cost of Developing AI Applications
The transformer architecture has become a go-to choice for representing various domain structures. The empirical inductive biases of the transformer make it a good candidate for scaling, which paves the way for periodically training and releasing expanded versions of existing, smaller models. Although these are often just scaled-up versions of their smaller counterparts, new instances are normally trained from scratch. Since even the smallest models need significant computational resources to train, it makes sense to reuse the parameters of smaller pretrained models to speed up the training of larger ones.
From the perspective of model growth, one strategy is to use the pretrained parameters of a smaller model to initialize some of the parameters of the larger model. Recent research has shown that training can be accelerated by copying a subset of the pretrained parameters to initialize the new parameters and then fine-tuning the entire network. This contrasts with earlier work, which generally froze the parameters initialized from the pretrained model and trained only the new (randomly initialized) parameters.
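To make the basic growth idea concrete, here is a minimal PyTorch sketch of initializing part of a larger layer from a smaller pretrained one. This is the simple copy-based baseline described above, not LiGO itself, which learns the growth mapping. The helper name `grow_linear` and the layer sizes are hypothetical, and the sketch assumes both layers have a bias.

```python
import torch
from torch import nn


def grow_linear(small: nn.Linear, in_features: int, out_features: int) -> nn.Linear:
    """Hypothetical helper: build a larger Linear whose top-left block is copied from `small`."""
    large = nn.Linear(in_features, out_features)  # starts out randomly initialized
    with torch.no_grad():  # a plain copy, kept out of the autograd graph
        large.weight[: small.out_features, : small.in_features].copy_(small.weight)
        large.bias[: small.out_features].copy_(small.bias)
    return large


small = nn.Linear(4, 3)  # stands in for a layer of the pretrained small model
large = grow_linear(small, in_features=8, out_features=6)

# The copied block matches the small layer; the remaining entries keep their random init.
print(torch.equal(large.weight[:3, :4], small.weight))
```

After this kind of initialization, the whole larger network would be fine-tuned, rather than freezing the copied block.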
Researchers at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) propose using smaller pretrained language models to make these training approaches more effective at lower cost and in less time. Their approach uses machine learning to “grow” a larger, more complex model from a smaller one so that it encodes the smaller model’s prior knowledge, which lets the larger model train more quickly. Rather than throwing old models away, the team takes their best parts and uses them to build something new.
Project: https://vita-group.github.io/LiGO/
Blog: https://www.marktechpost.com/2023/03/24/mit-researchers-introduce-ligo-a-new-technique-that-accelerates-training-of-large-machine-learning-models-reducing-the-monetary-and-environmental-cost-of-developing-ai-applications/
vita-group.github.io
Learning to Grow Pretrained Models for Efficient Transformer Training
Peihao Wang, Rameswar Panda, Lucas Torroba Hennigen, Philip Greengard, Leonid Karlinsky, Rogerio Feris, David Cox, Atlas Wang, Yoon Kim. Learning to Grow Pretrained Models for Efficient Transformer Training. In ICLR, 2023.
Sparks of Artificial General Intelligence: Early experiments with GPT-4
https://arxiv.org/abs/2303.12712
🔥3😱1
Forwarded from Meysam
Well, well, well,
If you really want to learn NLP and understand what is going on and how ChatGPT and the like actually work, there are two options:
1. Forget about it.
2. Read this list, in order. It will take at least two to three months:
Jurafsky's Speech and Language Processing:
https://web.stanford.edu/~jurafsky/slp3/
Deep Learning:
https://www.deeplearningbook.org/
Important and influential NLP papers:
LSTM:
https://arxiv.org/abs/1512.08849
Attention in NLP:
https://arxiv.org/abs/1409.0473
Word2vec, Fasttext
Transformer:
https://arxiv.org/abs/1706.03762
T5, BERT, Longformer
Instruction fine-tuning:
https://arxiv.org/abs/2204.07705
Bloom:
https://arxiv.org/abs/2211.05100
RLHF:
https://arxiv.org/abs/2009.01325
Prerequisites:
Math, especially derivatives and the like
Python programming
Critical thinking
Machine learning
🔥4👍3❤1
Dedicated to the PyTorch fans of the channel
Details on the new PyTorch update
What's New in PyTorch 2.0? torch.compile - PyImageSearch
https://pyimagesearch.com/2023/03/27/whats-new-in-pytorch-2-0-torch-compile/
PyImageSearch
What's New in PyTorch 2.0? torch.compile - PyImageSearch
Learn and implement what is new in PyTorch 2.0.
🔥2👎1
Forwarded from 10th WSS ☃️
☃️ Introducing the speakers
👤 Dr. Iman Hajirasouliha
👤 Assistant Professor at Joan & Sanford I. Weill Medical College of Cornell University
📁 Academic background:
🔵 Postdoctoral fellow in computational biology and cancer genomics, Brown University
🔵 Postdoctoral researcher in computational biology and cancer genomics, Stanford University
🎓 Academic education
🔵 B.Sc. in Computer Engineering (Software), Sharif University of Technology
🔵 M.Sc. in Computer Science, Simon Fraser University
🔵 Ph.D. in Computer Science, Simon Fraser University
🎖 Honors and achievements:
🔵 Simons-Berkeley Research Fellowship, 2016
🔴 NSERC Postdoctoral Fellowship
🔵 Alexander Graham Bell Scholarship
🔴 Best Paper, ISMB-HiTSeq, 2011
🔗 Speaker's pages:
🌐 HomePage |🌐 Linkedin |💬 Google Scholar
👥 Talk title:
Weakly-supervised tumor purity prediction from frozen H&E stained slides
💬 Talk abstract
📌 Talk language: English
💼 #computational_genomics_and_technology #ai_in_medicine #cancer_genomics_and_pathology #algorithms #deel_learning
#8th_WSS
#Speakers
💻 More information and registration:
🌎 https://wss.ce.sharif.edu
────────────────────
⛓ Connect with us
💬 Telegram | 📷 Instagram
💬 Twitter | 🌐 LinkedIn
💬 Facebook | 🌐 YouTube
☃ @WSS_SUT
Forwarded from نوشتههای ترمینالی
A range of ideas and approaches for loading a config file
With a focus on Python, though
https://towardsdatascience.com/from-novice-to-expert-how-to-write-a-configuration-file-in-python-273e171a8eb3
Medium
From Novice to Expert: How to Write a Configuration file in Python
Treat config file like your production code
👍1👏1
They asked GPT-4 to come up with a business, on a $100 budget, that would make as much money as possible. Look what it did:
https://twitter.com/jacksonfall/status/1636107218859745286?s=20
🤯5
Forwarded from نوشتههای ترمینالی
Comparing large language models to lossy compression, like a JPEG image
https://www.newyorker.com/tech/annals-of-technology/chatgpt-is-a-blurry-jpeg-of-the-web?utm_source=pocket-newtab-android
The New Yorker
ChatGPT Is a Blurry JPEG of the Web
OpenAI’s chatbot offers paraphrases, whereas Google offers quotes. Which do we prefer?
🤔1